Stream to channel functions

Philippe Veber

Nov 1, 2013, 5:49:18 AM
to Biocaml
Hi everyone,

I've been missing functions that would be the counterpart of the in_channel_to_*_items functions, that is, simple sequential wrappers around [Transform]s to save a fasta/fastq/bed/gff file. I wrote one for fasta files, as follows:

let char_seq_items_to_out_channel ?tags items oc =
  let open Fasta in
  let t =
    Biocaml_transform.compose
      (Transform.char_seq_item_to_raw_item ?tags ())
      (Transform.char_seq_raw_item_to_string ?tags ())
  in
  Stream.iter
    (Biocaml_transform.to_stream_fun t items)
    ~f:(output_string oc)

Is this a correct use of [Transform]s?

More generally, how about adding similar functions to all modules that provide transforms? Just as the in_channel_to_*_items functions give a concise incantation to read values, I think we should have the same for writing to files. I would start by adding a general function in Biocaml_transform:

val stream_to_out_channel : ('a, string) Biocaml_transform.t -> 'a Stream.t -> out_channel -> unit
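
A minimal, self-contained sketch of what a function with this shape could look like. It does not use Biocaml itself: the `transform` record, the `drain` helper and `fasta_printer` are hypothetical stand-ins, and the standard library's `Seq.t` is assumed in place of Biocaml's `Stream.t`.

```ocaml
(* Toy stand-in for ('a, string) Biocaml_transform.t. This record is an
   assumption made for illustration, not the library's actual type. *)
type ('a, 'b) transform = {
  feed : 'a -> unit;        (* push one input item into the transform *)
  next : unit -> 'b option; (* pull one output, if any is ready *)
}

(* Write every output the transform currently has buffered. *)
let rec drain t oc =
  match t.next () with
  | Some s -> output_string oc s; drain t oc
  | None -> ()

(* Feed each item in turn, writing whatever output becomes available. *)
let stream_to_out_channel t (items : 'a Seq.t) oc =
  Seq.iter (fun item -> t.feed item; drain t oc) items

(* Hypothetical FASTA-like printer: a (header, sequence) pair becomes
   two lines of text. *)
let fasta_printer () =
  let q = Queue.create () in
  { feed = (fun (header, sequence) ->
      Queue.add (Printf.sprintf ">%s\n" header) q;
      Queue.add (sequence ^ "\n") q);
    next = (fun () -> Queue.take_opt q) }
```

With these definitions, `stream_to_out_channel (fasta_printer ()) (List.to_seq [("a", "ACGT")]) stdout` would write the record to standard output. The caller never sees the transform's internals, only items going in and text coming out.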

Please tell me how it sounds, and if relevant I'll submit a patch.

Philippe.

Ashish Agarwal

Nov 1, 2013, 10:07:24 AM
to Biocaml
This seems reasonable to me. The only reason such functions are missing is that we parse more often than print, but certainly both directions should be supported in general.

However, note that I've been meaning to evaluate the performance of the transform composition operators. I'm concerned they encourage an inefficient style that moves data in and out of too many intermediate data structures. Thus, I wouldn't go too far on adding your functions across all modules. Just add the ones you need since we might find this isn't the most efficient implementation.
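
To illustrate the concern, here is a small sketch contrasting a two-step printer that goes through an intermediate value with one that writes directly. The types `item` and `raw_item` and all function names are simplified, hypothetical stand-ins rather than Biocaml's actual definitions, and the sketch makes no claim about measured performance.

```ocaml
(* Simplified, hypothetical types; Biocaml's Fasta types differ. *)
type item = { header : string; sequence : string }
type raw_item = Header of string | Partial_sequence of string

(* Two-step style: build an intermediate raw_item list, then strings. *)
let item_to_raw_items it =
  [ Header it.header; Partial_sequence it.sequence ]

let raw_item_to_string = function
  | Header h -> ">" ^ h ^ "\n"
  | Partial_sequence s -> s ^ "\n"

let print_composed oc it =
  List.iter
    (fun r -> output_string oc (raw_item_to_string r))
    (item_to_raw_items it)

(* Direct style: write the fields straight to the channel, with no
   intermediate list or concatenated strings. *)
let print_direct oc it =
  output_char oc '>';
  output_string oc it.header;
  output_char oc '\n';
  output_string oc it.sequence;
  output_char oc '\n'
```

Both functions produce the same bytes; the difference is only in how many temporary values are allocated along the way, which is the kind of cost that would need benchmarking.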

-Ashish




Philippe Veber

Nov 1, 2013, 11:58:52 AM
to Biocaml
Hi Ashish!


2013/11/1 Ashish Agarwal <agarw...@gmail.com>

> This seems reasonable to me. The only reason such functions are missing is that we parse more often than print, but certainly both directions should be supported in general.

Good.

> However, note that I've been meaning to evaluate the performance of the transform composition operators.

Are you referring to the use of [Transform.compose]? If so, then it is really not the central point of my suggestion. I use it because there is no function to transform a [Fasta.item] to [string]s directly.

> I'm concerned they encourage an inefficient style that moves data in and out of too many intermediate data structures. Thus, I wouldn't go too far on adding your functions across all modules. Just add the ones you need since we might find this isn't the most efficient implementation.

Sorry I don't understand. My suggestion is really only about providing functions that can directly save a stream of items to a channel. It happens that under the hood they will rely on Transform, but this is not visible in the signature. Did I miss your point?

cheers,
 ph.
 

Ashish Agarwal

Nov 1, 2013, 12:36:30 PM
to Biocaml
On Fri, Nov 1, 2013 at 11:58 AM, Philippe Veber <philipp...@gmail.com> wrote:

> Sorry I don't understand. My suggestion is really only about providing functions that can directly save a stream of items to a channel. It happens that under the hood they will rely on Transform, but this is not visible in the signature.

Exactly, so that's why I think it's fine to add the functions. I was just suggesting you not waste too much time on functions you don't immediately need because we might rethink the implementations.

Philippe Veber

Nov 2, 2013, 4:35:48 AM
to Biocaml
Makes perfect sense, thanks!



