Drake workflow with set-up task


slu...@gmail.com

Apr 7, 2016, 12:30:25 PM
to drake-workflow
Hi,

I'm trying to find a way to have a "set-up" task in my Drakefile that is always executed, no matter what. As I understand it, that is just a matter of creating a task without input:

setup <-
# store something into $[OUTPUT]

Now the problem is that my next tasks need that "setup" output as input:

task1 <- setup
# something heavy here

task2 <- setup
# something heavy here

task3 <- task1, task2

Is there a way to avoid executing task1 and task2 (and of course task3) if their outputs have already been created?

Thanks.

Aaron Crow

Apr 8, 2016, 7:31:17 PM
to slu...@gmail.com, drake-workflow
Hi Slux,

A core part of drake's dependency logic is, "if a step's input is new, the step should be run". And you're setting up your workflow to always create your setup file that is used as input in task1 and task2. So it looks like you're trying to have it both ways, which I believe is logically impossible in this case.

I should ask: what exactly is the purpose of the setup file, and why do you need to run it every time? I'm asking because what you're currently saying is "I need to run it every time. Later steps need it. But later steps don't need an updated version". Is that right?

Also: is the thing that must be run every time also the thing that populates the setup file? (Maybe a silly question, but I just want to be sure you're not conflating two unrelated things.)



Alessio Di Fazio

Apr 9, 2016, 5:10:06 AM
to drake-workflow
Hi Aaron,

I'll try to explain it better and also explain the workaround/solution I found.

We are trying to use Drake as a workflow engine for something that involves more than files. We have developed a command-line tool that is backed by a database and has a concept of workflow runs and sessions. The setup task initializes the session and gets back a session token, which is stored in $OUTPUT.

Now, the other tasks always need this session token, but they should be treated as normal Drake tasks, meaning that if their output is already there, their execution is skipped.

Reading the manual, I understood that Drake follows the order of the tasks in the Drakefile if no dependencies are detected. This means that if I place my setup task at the beginning of the file, it will always run first. The other tasks, of course, no longer declare any dependency on the setup task.
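
To make that concrete, here is a rough sketch of how such a Drakefile could be laid out. It is only an illustration: `mytool`, its options, and the `data1`/`data2` inputs are hypothetical placeholders, and it assumes both that a step without inputs is executed on every run and that commands run in the directory where the `setup` file is written.

```
; Hypothetical sketch: mytool, its flags, data1 and data2 are placeholders.

; No-input step placed first in the file; assumed to run on every invocation.
setup <-
  mytool start-session > $OUTPUT

; task1 and task2 list only their real data inputs, not setup, so a freshly
; written token does not make their existing outputs look out of date.
; The token is read by path instead of being declared as an input.
task1 <- data1
  read TOKEN < setup
  mytool run --token "$TOKEN" $INPUT > $OUTPUT

task2 <- data2
  read TOKEN < setup
  mytool run --token "$TOKEN" $INPUT > $OUTPUT

task3 <- task1, task2
  cat $INPUT0 $INPUT1 > $OUTPUT
```

The trade-off is that Drake no longer knows task1 and task2 use the token, so the ordering relies purely on the position of setup in the file.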

Not sure it's a good solution, but it works well for us!

Regards,
Alessio

Aaron Crow

Apr 14, 2016, 10:21:08 PM
to Alessio Di Fazio, drake-workflow
Thanks Alessio, I'm glad things are working for you!  8-]
