Updating Jobs from MRv1 to YARN


JB

unread,
Apr 30, 2015, 5:10:26 PM4/30/15
to cascadi...@googlegroups.com
Apologies if this was posted previously.

We are planning to migrate from an MRv1 cluster to YARN. We have a number of Cascading jobs that we would like to migrate from 2.0 to 2.6.

If my understanding is correct, Cascading has abstracted away the differences in the Hadoop APIs, so we would simply recompile our jobs against the appropriate YARN jars. Clearly we would also need to account for any changes in the Cascading API between 2.0 and 2.6. Is this correct, or is it more complicated than that?

Also, there are a number of separate jars for YARN and MRv1 in the Cascading 2.6 distro. Would we need to specify the YARN-based jars explicitly in Maven?

Is there anyone out there who has done this and has a sample Maven pom.xml?

Thanks in advance.


Andre Kelpe

unread,
Apr 30, 2015, 6:18:30 PM4/30/15
to cascadi...@googlegroups.com
Hi,

That should be straightforward: change your dependency from cascading-hadoop to cascading-hadoop2-mr1 on the Cascading side. Cascading sets all Hadoop dependencies to provided scope, so you can use a version of your liking. In the case of Apache Hadoop you will need these two dependencies:

https://github.com/Cascading/cascading/blob/2.6/cascading-hadoop2-mr1/build.gradle#L43-L44
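For reference, a minimal Maven sketch of what this looks like (version numbers and repository coordinates are illustrative; substitute the Hadoop version your cluster actually runs, and check which repository hosts the Cascading artifacts for your setup):

```xml
<dependencies>
  <!-- Cascading: swap cascading-hadoop for the hadoop2-mr1 artifact -->
  <dependency>
    <groupId>cascading</groupId>
    <artifactId>cascading-hadoop2-mr1</artifactId>
    <version>2.6.3</version>
  </dependency>
  <!-- Hadoop itself is "provided": supplied by the cluster at runtime,
       not bundled into your job jar -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.6.0</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-mapreduce-client-core</artifactId>
    <version>2.6.0</version>
    <scope>provided</scope>
  </dependency>
</dependencies>
```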

Then change the usage of HadoopFlowConnector to Hadoop2MR1FlowConnector, recompile and you are good to go.
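In code, that change is a sketch along these lines (package names as shipped in Cascading 2.6; the properties setup and pipe assembly are elided, and the class name here is hypothetical):

```java
// Before, on MRv1 (Cascading 2.0):
//   import cascading.flow.hadoop.HadoopFlowConnector;
//   FlowConnector connector = new HadoopFlowConnector( properties );

// After, on YARN (Cascading 2.6):
import java.util.Properties;

import cascading.flow.FlowConnector;
import cascading.flow.hadoop2.Hadoop2MR1FlowConnector;
import cascading.property.AppProps;

public class MigratedJob
  {
  public static void main( String[] args )
    {
    Properties properties = new Properties();
    AppProps.setApplicationJarClass( properties, MigratedJob.class );

    // only the connector class changes relative to the MRv1 version
    FlowConnector connector = new Hadoop2MR1FlowConnector( properties );

    // ... build your pipe assembly and connect flows as before ...
    }
  }
```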

You'll find the compatible Hadoop distros listed on our compatibility page: http://www.cascading.org/support/compatibility/

- André

--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/ab1f3737-1408-48b6-802a-7daefcb5f38e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.




Koert Kuipers

unread,
Apr 30, 2015, 6:22:31 PM4/30/15
to cascadi...@googlegroups.com
For what it's worth, we never changed our dependencies from cascading-hadoop to cascading-hadoop2-mr1, and it still runs fine on YARN. I am still unsure what the benefit of the changed dependencies is.

Andre Kelpe

unread,
Apr 30, 2015, 6:52:21 PM4/30/15
to cascadi...@googlegroups.com

It might run right now, but it can break in any release. We do not guarantee that this will work, and we advise against it.

In fact, it will at least print a warning from 2.7 onwards and might break entirely in 3.0. It is a trivial change, so please do it.

- André

Koert Kuipers

unread,
Apr 30, 2015, 7:16:28 PM4/30/15
to cascadi...@googlegroups.com
It's not a trivial change, in that it might stop the job from running on MR1. Currently we are able to publish a single job that we can deploy on both MR1 and YARN clusters, which simplifies things a lot.

Andre Kelpe

unread,
May 1, 2015, 8:06:01 AM5/1/15
to cascadi...@googlegroups.com
For the apps we maintain (lingual, multitool, load) we build different jars from the same code base, one for each of the Cascading platforms. That way we know that it will work. Feel free to copy the code from those projects; it should be straightforward.

- André



Koert Kuipers

unread,
May 3, 2015, 12:20:49 PM5/3/15
to cascadi...@googlegroups.com
YARN pretty much guarantees that MR1 old-API jobs will run without modification.

Currently cascading-hadoop sticks to a pretty narrow set of Hadoop APIs, and it basically runs on the old mapred API, which to me indicates it should continue to run on YARN just fine. I am still unsure what part of it could "break in any release".

If we ever get there (that it does break), we will probably switch to MR2 and simply abandon MR1 compatibility. In the meantime we would be happy to run on Cascading 2.6 for another year if we have to.
