Machine learning capabilities and the Function Interface


Дмитро Бєлєвцов

Oct 7, 2014, 9:28:15 AM
to nipy...@googlegroups.com
Hi all,

I wanted to use the distributed capabilities of nipype for multivariate pattern analysis (i.e. machine learning) of some fMRI data. However, after playing around with the Function Interface, I got the feeling that, due to its heavy dependency on files and the impossibility of passing arbitrary Python objects around, nipype is probably not the most suitable engine for pure machine learning tasks. What are your thoughts on this?

Regards,
Dmytro

Satrajit Ghosh

Oct 7, 2014, 9:40:34 AM
to nipy-user
hi dmytro,

I wanted to use the distributed capabilities of nipype for multivariate pattern analysis (i.e. machine learning) of some fMRI data. However, after playing around with the Function Interface, I got the feeling that, due to its heavy dependency on files and the impossibility of passing arbitrary Python objects around, nipype is probably not the most suitable engine for pure machine learning tasks.

this is a valid observation. 
 
What are your thoughts on this?

however, this partly depends on what you would like to do and on your current cluster setup.

we have contemplated passing pure python objects, but that is difficult in the general cluster setting, where one node cannot simply hand a pointer to an in-memory object to another. doing so essentially requires pickling the object and moving it.

however, on a local multicore machine you might simply be better off using a function node and calling a distributed computing library inside it (either joblib or the ipython parallel machinery).
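
something along these lines should work as a starting point (just a rough, untested sketch; the sqrt call is a stand-in for whatever per-item computation you actually need):

import nipype.pipeline.engine as pe
from nipype.interfaces.utility import Function

def parallel_scores(values, n_jobs):
    # imports must live inside the function: nipype only ships its source
    from joblib import Parallel, delayed
    from math import sqrt  # stand-in for the real per-item work
    return Parallel(n_jobs=n_jobs)(delayed(sqrt)(v) for v in values)

node = pe.Node(Function(input_names=['values', 'n_jobs'],
                        output_names=['scores'],
                        function=parallel_scores),
               name='parallel_scores')
node.inputs.values = range(16)
node.inputs.n_jobs = 4
res = node.run()
print(res.outputs.scores)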

if you have any suggestions on how to improve this, we welcome them, along with pull requests! we are very interested in making the workflow engine as generic as possible, so this would be a welcome addition.

hope that helps.

cheers,

satra

Chris Filo Gorgolewski

Oct 7, 2014, 2:48:08 PM
to Nipy User
Just to clarify: Nipype does allow you to use arbitrary Python objects as inputs and outputs. They will be serialised under the hood, which makes it less efficient in a single-machine environment, but it scales very well on clusters.
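
For example, something like this should work (a minimal, untested sketch; the dictionary is just a stand-in for whatever object you want to pass between nodes):

import nipype.pipeline.engine as pe
from nipype.interfaces.utility import Function

def make_model():
    # any picklable object can be returned; nipype serialises it under the hood
    return {'weights': [0.1, 0.2, 0.7], 'label': 'toy-model'}

def count_weights(model):
    return len(model['weights'])

make = pe.Node(Function(input_names=[], output_names=['model'],
                        function=make_model), name='make')
count = pe.Node(Function(input_names=['model'], output_names=['n'],
                         function=count_weights), name='count')

wf = pe.Workflow(name='object_passing')
wf.connect(make, 'model', count, 'model')
wf.run()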

Best,
Chris



Дмитро Бєлєвцов

Oct 8, 2014, 12:12:05 PM
to nipy...@googlegroups.com
@Satra, thanks. I recently found the PaPy engine, which, if I understood correctly, transparently falls back to file operations whenever there is not enough memory. Maybe nipype could have a plugin for it or something.

@Chris
I tried a function that returns a perfectly picklable object instance and couldn't make it work. Could you please give a really minimal code snippet that shows how to achieve this? I'm trying to wrap PyMVPA or something similar, just in case.

Satrajit Ghosh

Oct 8, 2014, 12:29:02 PM
to nipy-user
hi dmytro,

@Satra, thanks. I recently found the PaPy engine, which, if I understood correctly, transparently falls back to file operations whenever there is not enough memory. Maybe nipype could have a plugin for it or something.

i'll check this out.
 
@Chris
I tried a function that returns a perfectly picklable object instance and couldn't make it work. Could you please give a really minimal code snippet that shows how to achieve this? I'm trying to wrap PyMVPA or something similar, just in case.


Chris Filo Gorgolewski

Oct 8, 2014, 12:33:32 PM
to Nipy User
On Wed, Oct 8, 2014 at 9:12 AM, Дмитро Бєлєвцов <belev...@gmail.com> wrote:
@Satra, thanks. I recently found the PaPy engine, which, if I understood correctly, transparently falls back to file operations whenever there is not enough memory. Maybe nipype could have a plugin for it or something.
Thanks for the reference. This looks exciting - we should have a closer look. 

@Chris
I tried a function that returns a perfectly picklable object instance and couldn't make it work. Could you please give a really minimal code snippet that shows how to achieve this? I'm trying to wrap PyMVPA or something similar, just in case.



Дмитро Бєлєвцов

Oct 8, 2014, 2:55:24 PM
to nipy...@googlegroups.com
Oh, thanks, I see now. The problem was that I was defining the class inside the function itself instead of importing it from an external module, in which case it fails with an error.
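
For the record, roughly the pattern that bit me and the fix (module and class names made up):

# fails: pickle stores instances by the class's import path, and a class
# defined inside the node function has no importable path
def broken_node_func():
    class Result(object):
        pass
    return Result()   # nipype cannot pickle this instance

# works: keep the class in a module that is importable on the worker,
# e.g. a made-up mytools.py somewhere on the PYTHONPATH
def working_node_func():
    from mytools import Result
    return Result()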