controlling parallelization


Souheil Inati

Jun 27, 2013, 5:15:02 PM
to nipy...@googlegroups.com
I have an extremely simple pipeline:
[list] -> infosource -> step1 -> step2 -> sink

I did this in my script:
workflow.run(plugin='PBS',plugin_args=dict(qsub_args='-l nodes=1'))
and each node launched its own job.  This surprised me, but I guess it shouldn't have :-)

Suppose that some of the nodes (say step1) do very, very little work; then it doesn't really make sense to parallelize down to the individual node, and it would be better to treat the whole pipeline as a block and just farm out the len([list]) instances of it. Is there a way to do this? This is like Maarten saying that he only needed to parallelize at the subject level. I must be missing something in the docs.
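
For concreteness, here's a minimal sketch of the kind of thing I'm running (the fsl steps are just stand-ins for my real interfaces):

    import nipype.pipeline.engine as pe
    import nipype.interfaces.io as nio
    import nipype.interfaces.utility as niu
    from nipype.interfaces import fsl

    scans = ['s1.nii.gz', 's2.nii.gz', 's3.nii.gz']  # the [list]

    infosource = pe.Node(niu.IdentityInterface(fields=['in_file']),
                         name='infosource')
    infosource.iterables = ('in_file', scans)

    step1 = pe.Node(fsl.BET(), name='step1')  # does very little work
    step2 = pe.Node(fsl.IsotropicSmooth(fwhm=4), name='step2')
    sink = pe.Node(nio.DataSink(base_directory='results'), name='sink')

    wf = pe.Workflow(name='wf')
    wf.connect([(infosource, step1, [('in_file', 'in_file')]),
                (step1, step2, [('out_file', 'in_file')]),
                (step2, sink, [('out_file', 'smoothed')])])

    # every expanded node becomes its own qsub job
    wf.run(plugin='PBS', plugin_args=dict(qsub_args='-l nodes=1'))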

Thanks,
Souheil


Satrajit Ghosh

Jun 27, 2013, 9:59:38 PM
to nipy-user
hi souheil,

nipype parallelizes at the atomic level of the interfaces. currently we do not have a concept of grouping. one way to achieve grouping, though, is to use a Function node and define a Workflow or a sequence of steps within it.
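
as a rough sketch (hypothetical fsl steps again; note the imports live inside the function, since only its source is serialized and executed on the worker):

    import nipype.pipeline.engine as pe
    import nipype.interfaces.utility as niu

    def run_block(in_file):
        # imports go inside the function body: only the source travels
        # to the worker, not the enclosing module's namespace
        from nipype.interfaces import fsl
        res1 = fsl.BET(in_file=in_file).run()
        res2 = fsl.IsotropicSmooth(in_file=res1.outputs.out_file,
                                   fwhm=4).run()
        return res2.outputs.out_file

    block = pe.Node(niu.Function(input_names=['in_file'],
                                 output_names=['out_file'],
                                 function=run_block),
                    name='block')
    block.iterables = ('in_file', ['s1.nii.gz', 's2.nii.gz'])

    # each expanded copy of 'block' is submitted as a single cluster
    # job, with step1 and step2 running back to back inside it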

this is also useful when you might want to do pointer-based operations, like a scikit-learn Pipeline that can run in memory.
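
e.g. something along these lines inside the function, with everything staying in memory within the one job (hypothetical sketch):

    def fit_model(X, y):
        # objects are passed by reference; nothing is written to
        # disk between the pipeline stages
        from sklearn.pipeline import Pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC
        pipe = Pipeline([('scale', StandardScaler()), ('svm', SVC())])
        return pipe.fit(X, y).score(X, y)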

grouping would require some serious graph-to-job engineering, which we haven't done.

relatedly, you can also ask a node to check its hash locally through an execution configuration, so that it won't submit a job to the cluster just to check whether it has already been run.
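
a minimal sketch, assuming the local_hash_check option from the execution config:

    from nipype import config
    # check node hashes in the submitting process instead of spawning
    # a cluster job just to discover the node is already up to date
    config.update_config(dict(execution={'local_hash_check': True}))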

cheers,

satra




Souheil Inati

Jun 27, 2013, 10:52:37 PM
to nipy...@googlegroups.com, nipy-user
Ah, I like the function idea. You don't want the code to automatically figure out the parallelization; you want to provide a container to wrap it all up. I'll try this out and let you know how it works.

Should checking the hash without submitting a job be the default?  I'll dig around and try to find that. 

-SI


Satrajit Ghosh

Jun 27, 2013, 11:16:08 PM
to nipy-user
> Should checking the hash without submitting a job be the default?  I'll dig around and try to find that.

perhaps it's time to switch the default over: http://nipy.org/nipype/users/config_file.html
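
e.g. in ~/.nipype/nipype.cfg (sketch, same local_hash_check assumption as above):

    [execution]
    local_hash_check = true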

the tradeoff is that it will put more load on the headnode.

cheers,

satra

Souheil Inati

Jun 27, 2013, 11:38:44 PM
to nipy...@googlegroups.com
Oh neat :-)
I'm guessing that it's way more overhead to do a qsub than it is to check a hash. Is that true?

Btw, on our cluster we're advised to launch a job at the beginning, so it's not the head node doing this work.


Satrajit Ghosh

Jun 27, 2013, 11:43:45 PM
to nipy-user
> I'm guessing that it's way more overhead to do a qsub than it is to check a hash. Is that true?

yes.