Dear Paolo,
I have a question on the behavior of combine. Combine appears to automatically flatten tuples. Is this intentional?
For example:
num = Channel.from(1,2,3)
tuple_ch = Channel.from( ['a','b'] , ['c','d'] )
num.combine(tuple_ch).println()
will output
[1,'a','b']
[2,'a','b']
[3,'a','b']
[1,'c','d']
[2,'c','d']
[3,'c','d']
Instead of
[1 , ['a','b']]
[2 , ['a','b']]
[3 , ['a','b']]
[1 , ['c','d']]
[2 , ['c','d']]
[3 , ['c','d']]
The behavior is a bit surprising, since I could not find any reference to this automatic "flattening" in the documentation. (Please let me know if it is documented somewhere!)
I also think the behavior is undesirable when you want to combine multiple output files from two processes (e.g. BWA index output files and paired-end reads).
For example, I would expect to be able to write a "map" process like this:
samples = Channel.from( ['s1_R1.fq', 's1_R2.fq'] , ['s2_R1.fq', 's2_R2.fq'] )
ref_indexes = Channel.from( ['bacteria1.aln', 'bacteria1.bwt'] , ['bacteria2.aln', 'bacteria2.bwt'] )
mapping_pairs = samples.combine(ref_indexes)
process map{
input:
set file(sample),file(idx) from mapping_pairs
...
}
so that sample refers to ['s1_R1.fq','s1_R2.fq'] and idx refers to ['bacteria1.aln','bacteria1.bwt'].
But with the current behavior, sample is assigned s1_R1.fq, idx is assigned s1_R2.fq, and the index files are ignored and not staged.
I did find a relatively simple workaround, which is to embed each tuple in another tuple with map. Defining my channel mapping_pairs as
mapping_pairs = samples.map( { [it] } ).combine( ref_indexes.map( { [it] } ) )
works, but it caught me a bit off guard.
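To make the workaround concrete, here is a minimal sketch that applies the same wrapping to the first example above. It assumes only the tuple channel needs wrapping, since the numbers are not lists and so have nothing to flatten. The result should be the nested form listed earlier, though the emission order may differ.

```groovy
// Sketch only: same channels as in the first example.
num = Channel.from(1, 2, 3)
tuple_ch = Channel.from( ['a','b'] , ['c','d'] )

// Wrap each tuple in a one-element list, so that combine's flattening
// removes only the outer wrapper and leaves the inner tuple intact.
wrapped_ch = tuple_ch.map { [it] }

// Each emitted item should now be a pair of the form [number, [letter, letter]].
num.combine(wrapped_ch).println()
```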
It would be nice to know your thoughts about this! I apologize if my text is a bit confusing; I am very new to Nextflow. I have also attached a simple example.nf that illustrates what I mean. Please let me know which bits are confusing, or if I can help out in any way.
Best,
Mauricio