[MISC] Interoperability between Hadoop and OpenMPI


Harsh But Determined

Aug 10, 2011, 3:19:56 PM
to vit...@googlegroups.com
As far as I understand, Hadoop is used when the data is large and the chunks can be processed completely independently of one another, while MPI suits a relatively small number of data chunks, task-farming style, where the tasks need a lot of message passing among themselves during processing.

Consider a case where the initial job fits the Hadoop model: the data chunks run into hundreds of GBs or TBs and require relatively little interaction. The output of that job, however, feeds a second task where the data is much smaller but the processing needs a lot of message passing, so it would be better done with MPI. We could switch to MPI for that stage and, once it finishes, hand the result back to a follow-up Hadoop job; or the same thing in the opposite order.
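Just to make the idea concrete, here is a rough sketch of how such a hand-off might be driven from a single script. The jar name, the MPI binary and all the paths are made up, and it assumes the intermediate data is small enough to stage out of HDFS onto a filesystem the MPI nodes can see.

import subprocess

def run(cmd):
    # Fail the whole pipeline if any stage fails.
    print("running:", " ".join(cmd))
    subprocess.run(cmd, check=True)

# Stage 1: the big, embarrassingly parallel part as an ordinary MapReduce job.
# (my-analytics.jar and the HDFS paths are placeholders.)
run(["hadoop", "jar", "my-analytics.jar", "FirstPass",
     "/data/input", "/data/stage1-out"])

# Stage 2: pull the (now much smaller) output onto shared storage
# and run the communication-heavy solver under MPI.
run(["hdfs", "dfs", "-get", "/data/stage1-out", "/shared/stage1-out"])
run(["mpirun", "-np", "16", "./mpi_solver",
     "/shared/stage1-out", "/shared/stage2-out"])

# Stage 3: push the MPI result back into HDFS and continue with Hadoop.
run(["hdfs", "dfs", "-put", "/shared/stage2-out", "/data/stage2-out"])
run(["hadoop", "jar", "my-analytics.jar", "SecondPass",
     "/data/stage2-out", "/data/final-out"])

The copying in and out of HDFS is obviously the clumsy part of doing it this way, which is partly why I am asking the question.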

Now the question is: is this scenario worth working on?



Your friend,
Harsh :)

