Best way to implement conditional pipeline


Marc Logghe

Nov 26, 2015, 3:02:32 AM
to Nextflow
Hi,
What is the best way to ensure that a process/task is executed before all the others and that, depending on its outcome, the rest of the pipeline is not executed?
Some context: the pipeline reads a number of TSV files, counts the rows and populates a MongoDB database. I would now like to add a process that first checks whether that database is already populated and, if it is, exits the workflow.
Thanks and regards,
Marc

Paolo Di Tommaso

Nov 26, 2015, 4:30:49 AM
to nextflow
Hi Marc, 

You need a value channel that triggers (or does not trigger) the remaining part of your pipeline. Something like the following example:


process foo {
    output:
    file count
    '''
    echo 0 > count
    '''
}

count.map { it.text as int }.filter { it > 0 }.first().into { trigger }


process bar {
    input: 
    val trigger  
    '''
    echo RUN THIS
    '''
}

process baz {
    input: 
    val trigger 
    '''
    echo RUN THAT
    '''
}

The first process will execute the DB query and return a number or a flag. You then use that value to create a "trigger" channel that emits either a single value or nothing, depending on the filter condition.
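
To make the gating logic concrete, here is a toy model in plain Python. It is not Nextflow code and says nothing about how Nextflow is implemented; the names make_trigger and run_pipeline are made up for illustration. It only mirrors the map/filter/first chain above: the downstream steps run when a positive count comes through, and are skipped when nothing is emitted.

```python
from typing import Iterable, List, Optional


def make_trigger(count_texts: Iterable[str]) -> Optional[int]:
    """Toy stand-in for: count.map { it.text as int }.filter { it > 0 }.first()"""
    for text in count_texts:
        value = int(text.strip())  # map: file content -> integer
        if value > 0:              # filter: keep only positive counts
            return value           # first: the first item that survives
    return None                    # nothing emitted


def run_pipeline(count_texts: Iterable[str]) -> List[str]:
    """Return the names of the downstream steps that would run (hypothetical)."""
    trigger = make_trigger(count_texts)
    executed = []
    if trigger is not None:
        # Both steps see the same trigger value; it is not used up by the first.
        for step in ("bar", "baz"):
            executed.append(step)
    return executed


skipped = run_pipeline(["0\n"])   # count is 0: filter drops it, nothing runs
started = run_pipeline(["3\n"])   # count is 3: both downstream steps run
```

With the "echo 0 > count" script shown above, the filter drops the value, so bar and baz never start; change the echoed number to something positive and both run.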

Not sure that is the optimal solution, but it works!


Hope this helps.


Cheers,
Paolo




Marc Logghe

Nov 26, 2015, 5:41:52 AM
to Nextflow
That's elegant indeed. Thanks!

But I'm a little confused now. I was convinced that you could only read a channel once. I tend to always make copies of a particular channel if the content is to be used in multiple steps.
E.g. I would have done:
count.map { it.text as int }.filter { it > 0 }.first().into {

 trigger1
 trigger2
}


process bar {
   input:
    val trigger1  
   '''
   echo RUN THIS
   '''
}

process baz {
   input:
    val trigger2
   '''
   echo RUN THAT
   '''
}

Or is the channel only read (and emptied) when you call subscribe()?
To rephrase my additional question: when do you need to create copies of a channel, and in which cases can you reuse a channel?



Paolo Di Tommaso

Nov 26, 2015, 6:35:22 AM
to nextflow
Hi Marc, 

Yes, this may be a bit cryptic mostly because I've never managed to document it clearly. 

There are two kinds of channels: 
  • dataflow stream (or queue)  
  • dataflow value (or singleton)

The first works as you've described. A stream channel is basically an asynchronous FIFO queue or a stream of items. Thus when an item is read, it is removed from the channel. 

The second has different semantics. By definition, a dataflow variable is a partial data structure that can be assigned one and only one value [1]. Thus, once assigned (or bound, in dataflow parlance), it always returns the same value (or the empty value) when read, i.e. it is never consumed.


So the question is: how do you distinguish these two types of channels? A value channel is created by the channel factory Channel.value() or by an operator that returns a single value, e.g. first(), last(), count(), toList(), etc.


All the remaining channels are dataflow queues that emit a sequence of values. 
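
The difference between the two kinds can be sketched with a toy model in plain Python. This is only an illustration of the read semantics described above, not Nextflow's implementation, and the class names QueueChannel and ValueChannel are invented for the example. One simplification to keep in mind: a real dataflow variable would make the reader wait until a value is bound, whereas this sketch raises an error instead.

```python
from collections import deque


class QueueChannel:
    """Toy FIFO channel: reading an item removes it."""

    def __init__(self, items):
        self._items = deque(items)

    def read(self):
        # None stands in for "nothing left to read" in this toy model.
        return self._items.popleft() if self._items else None


class ValueChannel:
    """Toy single-assignment value: bound once, readable any number of times."""

    _UNBOUND = object()

    def __init__(self):
        self._value = self._UNBOUND

    def bind(self, value):
        if self._value is not self._UNBOUND:
            raise ValueError("a value channel can be bound only once")
        self._value = value

    def read(self):
        if self._value is self._UNBOUND:
            # Simplification: a real dataflow variable would block here.
            raise RuntimeError("value channel is not bound yet")
        return self._value  # reading never consumes the value


queue = QueueChannel([1, 2])
queue_reads = [queue.read(), queue.read(), queue.read()]  # third read finds nothing

trigger = ValueChannel()
trigger.bind(1)
value_reads = [trigger.read(), trigger.read()]  # same value both times
```

This is why a single trigger can feed both bar and baz in the earlier example, while a queue channel has to be copied (for instance with into) before two processes can each read every item.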


Does that clarify your doubt?


Cheers,
Paolo





[1] Van Roy, P. & Haridi, S., 2004. Concepts, Techniques, and Models of Computer Programming. MIT Press. Available at: http://www.epsa.org/forms/uploadFiles/3B6300000000.filename.booksingle.pdf

Marc Logghe

Nov 26, 2015, 7:08:34 AM
to Nextflow
Great, yes, that clarifies a lot! Thank you.
Cheers,
Marc
