Data flows and memory consumption


Martijn Starmans

Nov 21, 2016, 7:57:54 AM
to Fastr Users
Hi all,

I have a bit of a dilemma concerning the data flow in my network. I am processing a large number of images, which sometimes serve as input to a tool all at once or in pairs, and sometimes sequentially, in which case multiple instances of the tool are currently created. The memory consumption of such a large number of tool instances and their corresponding saved images is considerable, which hampers my network. Ideally, I would like to specify per tool how the images are to be processed, and on which other tools its data flow depends and thus should wait. Is such a thing already implemented in the framework?

To give a little more context, in case the above description seems vague: my network basically consists of four core tools and accepts images from two modalities:
- An image converter
- Elastix to align the image pairs
- A tool that calculates several features on an image
- A classifier

I would like the individual images to be converted sequentially, the pairs from the two modalities to be registered sequentially, the feature calculations to be performed sequentially per individual image, and finally the classifier needs all feature files at once.
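To make the desired flow concrete, here is a plain-Python sketch of the pipeline above; this is not fastr code, and the function names (convert, register, compute_features, classify) are stand-ins for the four tools, used only to illustrate which steps run one item at a time and which step needs everything at once:

```python
def convert(image):
    # stand-in for the image converter
    return f"converted({image})"

def register(fixed, moving):
    # stand-in for Elastix: aligns one pair of modalities
    return f"registered({fixed},{moving})"

def compute_features(image):
    # stand-in for the feature-calculation tool
    return f"features({image})"

def classify(feature_files):
    # stand-in for the classifier: needs ALL feature files at once
    return f"classified({len(feature_files)} feature sets)"

modality_a = ["a1.nii", "a2.nii"]
modality_b = ["b1.nii", "b2.nii"]

# Convert sequentially: only one image is processed at a time
conv_a = [convert(img) for img in modality_a]
conv_b = [convert(img) for img in modality_b]

# Register the modality pairs sequentially
registered = [register(f, m) for f, m in zip(conv_a, conv_b)]

# Compute features sequentially, per individual image
features = [compute_features(img) for img in registered]

# The classifier consumes all feature files in one call
result = classify(features)
print(result)  # classified(2 feature sets)
```

The point of the sketch is the shape of the dependencies: the first three stages are per-item maps that could be throttled to one instance at a time, while the last stage is a single aggregating job that must wait for all of them.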

Kind regards,

Martijn

Hakim Achterberg

Nov 28, 2016, 3:44:34 AM
to Fastr Users
Dear Martijn,

At the moment we do not have a system in place for this type of management. If you run fastr on a cluster with the DRMAA executor, you can specify the amount of memory per job, and the scheduler should take that into account. For local execution we do not support that (yet). Limiting how many instances of a tool/node run concurrently is also not yet available, but it might be a good idea; we have some ideas on that for a future release.
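Until such a limit exists in fastr itself, the concurrency cap Hakim describes can be approximated outside the framework. A minimal sketch in plain Python, assuming the jobs run in-process (a semaphore caps how many "tool instances" are active at once, even though the worker pool is larger):

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Allow at most 2 concurrent instances of the (hypothetical) tool
max_instances = threading.Semaphore(2)
peak = 0
running = 0
lock = threading.Lock()

def run_tool(job_id):
    """Stand-in for one tool job; blocks while 2 instances are active."""
    global peak, running
    with max_instances:
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # the tool would do its actual work here
        with lock:
            running -= 1
    return job_id

# Even with 8 workers, the semaphore keeps concurrency at 2
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_tool, range(10)))
```

This only demonstrates the throttling idea; for memory-aware scheduling of real fastr jobs on a cluster, the DRMAA route mentioned above remains the supported option.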

Cheers,
Hakim