combine two channels by sample string


bio informatics

Dec 19, 2022, 2:23:57 AM
to Nextflow
Hi 

I have two channels and would like to combine them based on their sample names. However, my attempts keep failing. Here are the details:

Input Channels

ch_alignment
demux.Clontech_5p--bc1003_3p.flnc_clustered.sorted.sam
demux.Clontech_5p--bc1001_3p.flnc_clustered.sorted.sam
demux.Clontech_5p--bc1002_3p.flnc_clustered.sorted.sam


ch_clustered
demux.Clontech_5p--bc1001_3p.flnc_clustered.fasta
demux.Clontech_5p--bc1002_3p.flnc_clustered.fasta
demux.Clontech_5p--bc1003_3p.flnc_clustered.fasta


Desired output

[demux.Clontech_5p--bc1003_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1003_3p.flnc_clustered.fasta]
[demux.Clontech_5p--bc1001_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1001_3p.flnc_clustered.fasta]
[demux.Clontech_5p--bc1002_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1002_3p.flnc_clustered.fasta]


I tried "combine" and "filter", but it's not working for me:

ch_alignment.flatten()
    .combine(ch_clustered)
    .filter {it[0].fileName.toString().split(".")[:-1].contains(it[1].fileName.toString().split(".")[:-2]) }



Any help is much appreciated. 

Cheers,
Mark


Matteo Schiavinato

Feb 2, 2023, 8:40:03 AM
to Nextflow
You can't assume that the two channels emit their items in the same order, so you need to combine them using some sort of matching key.

It isn't entirely straightforward in this case, but you can tokenize the file names to build such a key. I'd do it like this:

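// Assumes the channel items are file paths; .name drops the directory part
// Key = first two dot-separated tokens, e.g. demux.Clontech_5p--bc1003_3p
ch_alignment
.map{ it -> [ it.name.tokenize("."), it ] }
.map{ it -> [ [ it[0][0], it[0][1] ].join(".") , it[1] ] }
.set{ ch_alignment_token }

ch_clustered
.map{ it -> [ it.name.tokenize("."), it ] }
.map{ it -> [ [ it[0][0], it[0][1] ].join(".") , it[1] ] }
.set{ ch_clustered_token }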

// Join the keyed channels (not the original ones) on their first element
ch_alignment_token
.join(ch_clustered_token)
.map{ it -> it.flatten() }
.set{ ch_final }

Output (one tuple per line, with the matching key as the first element):
ch_final.view()

[demux.Clontech_5p--bc1003_3p, demux.Clontech_5p--bc1003_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1003_3p.flnc_clustered.fasta]
[demux.Clontech_5p--bc1001_3p, demux.Clontech_5p--bc1001_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1001_3p.flnc_clustered.fasta]
[demux.Clontech_5p--bc1002_3p, demux.Clontech_5p--bc1002_3p.flnc_clustered.sorted.sam, demux.Clontech_5p--bc1002_3p.flnc_clustered.fasta]
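
If you want exactly the pairs from the original question, without the key in front, one more map after the join should do it. Below is a minimal, self-contained sketch of the whole idea in Nextflow DSL2. The file names are taken from the question; the sampleKey helper, the use of Channel.of with file(), and the relative paths are assumptions made only for illustration, so adapt them to however your channels are really created.

```groovy
// Hypothetical helper (not from the thread): builds the join key from the
// first two dot-separated tokens of the file name,
// e.g. "demux.Clontech_5p--bc1003_3p".
def sampleKey(f) {
    f.name.tokenize('.')[0..1].join('.')
}

workflow {
    // Stand-ins for the real channels; file() only builds path objects here,
    // the files do not need to exist for this sketch.
    ch_alignment = Channel.of(
        file('demux.Clontech_5p--bc1003_3p.flnc_clustered.sorted.sam'),
        file('demux.Clontech_5p--bc1001_3p.flnc_clustered.sorted.sam'),
        file('demux.Clontech_5p--bc1002_3p.flnc_clustered.sorted.sam')
    )
    ch_clustered = Channel.of(
        file('demux.Clontech_5p--bc1001_3p.flnc_clustered.fasta'),
        file('demux.Clontech_5p--bc1002_3p.flnc_clustered.fasta'),
        file('demux.Clontech_5p--bc1003_3p.flnc_clustered.fasta')
    )

    ch_alignment
        .map { f -> [ sampleKey(f), f ] }                       // [key, sam]
        .join( ch_clustered.map { f -> [ sampleKey(f), f ] } )  // [key, sam, fasta]
        .map { key, sam, fasta -> [ sam, fasta ] }              // drop the key
        .view()
}
```

Note that join matches on the first element of each tuple and, by default, silently drops items that have no partner in the other channel, so a sample missing from one side simply won't appear in the result.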