hi souheil,
nipype parallelizes at the atomic level of the interfaces: each node becomes one job. currently we do not have a concept of grouping. one way to get it, though, is to use a Function node and define a Workflow or a sequence of steps inside it, so the whole sequence runs as a single job.
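a rough sketch of what i mean (the fsl steps and filenames are just placeholders):

from nipype import Node
from nipype.interfaces.utility import Function

def run_group(in_file):
    # imports have to live inside the function body: nipype serializes
    # the function source and re-executes it on the worker
    from nipype.interfaces import fsl
    bet = fsl.BET(in_file=in_file).run()
    smooth = fsl.IsotropicSmooth(in_file=bet.outputs.out_file,
                                 fwhm=4).run()
    return smooth.outputs.out_file

grouped = Node(Function(input_names=['in_file'],
                        output_names=['out_file'],
                        function=run_group),
               name='grouped_steps')

when the workflow runs with a cluster plugin, everything inside run_group executes as a single job instead of one job per interface.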
the same trick is also useful when you want to do pointer-based operations, for example running a scikit-learn Pipeline entirely in memory.
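again just a sketch, with hypothetical X_file/y_file inputs pointing at saved numpy arrays:

from nipype import Node
from nipype.interfaces.utility import Function

def fit_pipeline(X_file, y_file):
    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    X, y = np.load(X_file), np.load(y_file)
    # intermediate arrays are passed by reference between the
    # pipeline steps, never written to disk
    pipe = Pipeline([('scale', StandardScaler()), ('svc', SVC())])
    return pipe.fit(X, y).score(X, y)

pipeline_node = Node(Function(input_names=['X_file', 'y_file'],
                              output_names=['score'],
                              function=fit_pipeline),
                     name='sklearn_pipeline')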
grouping at the engine level would require some serious graph-to-job engineering, which we haven't done.
relatedly, you can also ask a node to check its hash locally through an execution configuration, so that it won't submit a job to the cluster just to check whether it has already been run.
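if i remember the option name right, it's local_hash_check:

from nipype import Workflow

wf = Workflow(name='example_wf')
# check node hashes in the submitting process itself; cached nodes
# are skipped without ever touching the cluster queue
wf.config['execution'] = {'local_hash_check': True}

# it can also be set on an individual node:
# some_node.config = {'execution': {'local_hash_check': True}}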