DSL2: Optional process execution based on file presence


MB

Jan 6, 2022, 1:53:12 PM
to Nextflow
Hello all,

I want to do something which I thought would be easy to implement, but for the life of me I can't get it to work. Some advice would be very welcome.

I want to make a workflow for Bowtie2 with an optional indexing process. The reference file will be provided as a parameter, and within the workflow I want to check whether the index files (*.bt2) are present. If they are not, run the build process first; if they are, continue straight to alignment.

At the moment, my solution fails when checking for a file. Is there a callable method for a file channel that returns true/false based on file presence? I already tried .exists() but got an error that this isn't a valid method.

workflow BOWTIE2 {
    take:
    reference // Channel: [reference]
    reads     // Channel: [samplename, reads]

    main:
    index = file("${reference.getParent() }/${reference.baseName }*.bt2")
    (index_ch, no_index_ch) = ( index.isEmpty()
        ? [ Channel.empty(), tuple(reference, reference.getParent()) ]
        : [ reference.combine(Channel.fromPath("${reference.getParent() }/${reference.baseName }*.bt2")) , Channel.empty() ] )
    build(no_index_ch)
    align(reads.combine(build.out.mix(index_ch)))
}
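
A note on the `.exists()` error: in the snippet above `reference` is a channel, and a channel is a queue of values rather than a file, so path methods such as `.exists()` or `.getParent()` are not available on it. A minimal sketch of the check on a plain value instead, assuming the reference arrives as a parameter (`params.reference` and its placeholder path are hypothetical names, not from the post):

```groovy
// Hypothetical parameter; the post only says the reference is "provided as a parameter".
params.reference = '/path/to/genome.fasta'

// file() on a plain string returns a single Path object.
def ref = file(params.reference)

// With a wildcard, file() returns a list of matching paths (empty if nothing matches).
def index = file("${ref.parent}/${ref.baseName}*.bt2")

if( index.isEmpty() )
    println "No Bowtie2 index found next to ${ref.name}"
else
    println "Found ${index.size()} index file(s) next to ${ref.name}"
```

Because the check runs on ordinary values while the script is evaluated, its result can drive a regular `if` statement in the workflow body.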

Paolo Di Tommaso

Jan 6, 2022, 9:16:09 PM
to nextflow
I think this can be simplified a lot: with DSL2 you can just use an `if` statement.

```
if( index.isEmpty() ) {
    // no index files found: build first, then align
    build( .. )
    align( .. )
}
else {
    // index already present: align directly
    align( .. )
}
```

Take it as pseudocode; I've omitted the combination of the corresponding channels.

p
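
For readers who want the channel wiring spelled out, here is one possible sketch of the `if` approach. It is an illustration under assumptions, not the code from the thread: the parameter names, the placeholder paths, the single-end `bowtie2 -U` call and the bodies of `build` and `align` are all hypothetical.

```groovy
nextflow.enable.dsl = 2

// Hypothetical parameters and placeholder paths.
params.reference = '/path/to/genome.fasta'
params.reads     = '/path/to/reads/*.fastq.gz'

process build {
    input:
    path reference

    output:
    path "*.bt2"

    script:
    """
    bowtie2-build ${reference} ${reference.baseName}
    """
}

process align {
    input:
    tuple val(sample), path(reads), val(prefix), path(index)

    output:
    tuple val(sample), path("${sample}.sam")

    script:
    """
    bowtie2 -x ${prefix} -U ${reads} -S ${sample}.sam
    """
}

workflow {
    // Plain values, evaluated before any process runs.
    ref   = file(params.reference)
    index = file("${ref.parent}/${ref.baseName}*.bt2")   // list, empty if no match

    reads_ch = Channel.fromPath(params.reads).map { f -> tuple(f.simpleName, f) }

    if( index.isEmpty() ) {
        // No index next to the reference: build it and pair it with its prefix.
        index_ch = build(ref).map { files -> tuple(ref.baseName, files) }
    }
    else {
        // Reuse the existing index files.
        index_ch = Channel.value( tuple(ref.baseName, index) )
    }

    // Each item becomes [sample, reads, prefix, index files].
    align( reads_ch.combine(index_ch) )
}
```

The `if` works here because `ref` and `index` are ordinary values rather than channels, so the branch is decided once when the workflow is constructed.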



Mark Bessem

Jan 7, 2022, 5:40:53 AM
to next...@googlegroups.com
Hey Paolo,

Thanks, that already makes my code a lot cleaner. I'm still stuck on checking for files at the original location of the reference file, though. Is there a Nextflow method for getting the parent directory of the original file in a channel? Otherwise I'd have to do it in a map function, which complicates things quite a bit, since I'd have two possible output channels defined within a map function. Calling .getParent() in the workflow doesn't seem to work for this purpose.

take:
    reference // Channel: [reference.fasta]
    reads     // Channel: [samplename, reads]

main:
    index = file("${ reference.getParent() }/${ reference.baseName }*bt2")
    if( index.isEmpty() ) {
        build( reference )
        align( reads.combine(build.out) )
    }
    else {
        align( reads.combine( tuple( reference, index ) ) )
    }
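
If the reference has to stay in a channel, one way to avoid juggling two outputs inside a single `map` is to attach the index lookup in `map` and then split the channel with the `branch` operator. The following is only a sketch of that routing: `params.reference`, its placeholder path and the names `missing`, `present` and `routed` are hypothetical, and the `view` calls stand in for the real `build` and `align` processes.

```groovy
nextflow.enable.dsl = 2

// Hypothetical parameter and placeholder path.
params.reference = '/path/to/genome.fasta'

workflow {
    Channel.fromPath(params.reference)
        // Inside map each item is a Path, so .parent and .baseName are available.
        .map { ref -> tuple(ref, file("${ref.parent}/${ref.baseName}*.bt2")) }
        // Route each [reference, index files] pair by whether the list is empty.
        .branch { ref, idx ->
            missing: idx.isEmpty()
            present: true
        }
        .set { routed }

    // References without an index would be passed to the build process.
    routed.missing
        .map { ref, idx -> ref }
        .view { ref -> "needs build: ${ref}" }

    // References with an index can go to alignment together with their files.
    routed.present
        .view { item -> "index found for ${item[0].name}: ${item[1].size()} file(s)" }
}
```

With this layout the output of the build step and `routed.present` can be merged with `mix` before alignment, as in the first version of the workflow.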



drhp...@gmail.com

Jan 7, 2022, 5:58:32 AM
to Nextflow

Hi Mark,

In terms of workflow design, assuming that the reference and the index live in the same directory, as above, may be a problem. Another solution is to add a "params.bowtie2_index" parameter and have a pre-processing step that generates the index only when that parameter is not set, followed by alignment. An example of this is below:
https://github.com/nf-core/viralrecon/blob/2ebae61442598302c64916bd5127cf23c8ab5611/subworkflows/local/prepare_genome_illumina.nf#L115-L129

The code above doesn't explicitly check that the ".bt2" files exist, but the pipeline would fail at the alignment step if the index isn't provided correctly.

Cheers,

Harshil
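
A rough sketch of the parameter-driven idea follows. It is not the nf-core/viralrecon code linked above: the process body, `params.fasta`, the placeholder path and the convention that `params.bowtie2_index` points to a directory of index files are assumptions made for illustration.

```groovy
nextflow.enable.dsl = 2

// Hypothetical parameters; bowtie2_index is assumed to be a directory of *.bt2 files.
params.fasta         = '/path/to/genome.fasta'
params.bowtie2_index = null

process BOWTIE2_BUILD {
    input:
    path fasta

    output:
    path 'bowtie2'

    script:
    """
    mkdir bowtie2
    bowtie2-build ${fasta} bowtie2/${fasta.baseName}
    """
}

workflow {
    if( params.bowtie2_index ) {
        // The user supplied an index: fail early if the path does not exist.
        ch_index = Channel.value( file(params.bowtie2_index, checkIfExists: true) )
    }
    else {
        // No index supplied: build one from the reference.
        ch_index = BOWTIE2_BUILD( file(params.fasta) )
    }

    // Downstream alignment would take ch_index as input.
    ch_index.view { dir -> "Bowtie2 index directory: ${dir}" }
}
```

The decision depends only on a parameter, so the workflow never has to guess where the index lives relative to the reference.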
