How can Azkaban submit a job to the Hadoop JobTracker?



Sep 25, 2011, 8:13:02 AM
to azkaban

I succeeded in running a Cascading jar file locally as a javaprocess
job, and that works fine. But is there a way to run it on Hadoop by
configuring another job properties file?
So far the only solution I have found is to make a new command job that
runs "hadoop jar <myjar> -run". Is it OK to do it like that?

If there is no other way, then how is Azkaban linked to Hadoop? (Apart
from the fact that I can browse the HDFS file system, which is nice.)

Is there any more documentation, a sample, or anything else that can
help with configuring workflows? For example, what is prop-dependency,
and how do I pass the result of a previous job to the next job in the
workflow?


Richard Park

Oct 4, 2011, 10:44:46 PM
The connection between Azkaban and Hadoop is weak.
You can, as you suggested, run hadoop jar. We tend to invoke Hadoop through a Java program or through the pig job type. There are thoughts of improving the Hadoop integration by reporting a better completion percentage and by showing JobTracker usage as well.

For property propagation, your jar writes a special property file that Azkaban knows to pass to the next job. Unfortunately I'm not as familiar with this area as I should be. We'll need to update the docs for this behaviour.
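
As a rough sketch of what writing such a file could look like: the code below assumes the mechanism described in later Azkaban documentation, where the job reads the path of the output property file from the JOB_OUTPUT_PROP_FILE environment variable and writes a flat JSON object of key/value pairs to it. Whether the Azkaban version discussed in this thread behaves the same way is an assumption and should be checked. The class name OutputProps, the key output.path and its value are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: writes properties for a downstream job as flat JSON.
public class OutputProps {

    // Escapes backslashes and double quotes only; control characters
    // are not handled in this sketch.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Serializes the map as a single-level JSON object of strings.
    static String toJson(Map<String, String> props) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : props.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Azkaban sets this variable for the running job.
        String target = System.getenv("JOB_OUTPUT_PROP_FILE");

        Map<String, String> props = new LinkedHashMap<>();
        props.put("output.path", "/tmp/example-output"); // hypothetical value

        String json = toJson(props);
        if (target == null) {
            // Not running under Azkaban: print instead of writing a file.
            System.out.println(json);
            return;
        }
        Files.write(Paths.get(target), json.getBytes(StandardCharsets.UTF_8));
    }
}
```

Under the same assumption, the next job in the workflow would then see output.path among its own properties.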
