Re: How can I dynamically change input to a pig job?

Richard

May 21, 2013, 7:17:07 PM
to azkab...@googlegroups.com
With the pig job type, you can set a property in the job file called params.<some param name>. That parameter will be passed to the pig script as an input variable.

If you want today's date, Azkaban also sets some runtime variables that you can use: http://azkaban.github.io/azkaban2/documents/2.1/jobconf.html
For example, adding a line like the following to the job file passes a variable named 'year' to the pig script:
params.year=${azkaban.flow.start.year}

We've gotten other dates in several ways. Some people have written shell commands in the script itself to define 'yesterday'. Others have created Pig UDFs.
One common approach is to have the pig loaders resolve something like 'days.ago' in code.
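
As a rough sketch of the shell-command approach, something like the following computes yesterday's date. It assumes GNU date (the -d option is a GNU extension and behaves differently on BSD/macOS), and the variable name 'yesterday' and the YYYY-MM-DD format are only illustrative:

```shell
#!/bin/sh
# Compute yesterday's date as YYYY-MM-DD.
# Assumption: GNU date is installed; -d "1 day ago" is a GNU extension.
yesterday=$(date -d "1 day ago" +%Y-%m-%d)

# Print it so a caller (or a wrapper script) can capture the value.
echo "$yesterday"
```

Pig's parameter substitution can run a backtick command in a %declare statement, so a command along these lines could probably be embedded in the script itself, though the quoting may need adjusting for your setup.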

You can also pass parameters between jobs in Azkaban. Your first job would output something like 'date.to.run.against', and that value could then be passed to the pig script. Parameter passing is described on the same Job Configuration page.
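
A minimal sketch of what that first job might run, assuming Azkaban exports the JOB_OUTPUT_PROP_FILE environment variable and reads the file it points to as JSON (check the Job Configuration page for your version). The 'run_date' variable and the temp-file fallback are hypothetical and only there so the script also runs outside Azkaban:

```shell
#!/bin/sh
# Hypothetical first job: compute yesterday's date and export it for
# downstream jobs.
# Assumption: Azkaban sets JOB_OUTPUT_PROP_FILE; when it is unset
# (e.g. running by hand), fall back to a temp file.
: "${JOB_OUTPUT_PROP_FILE:=$(mktemp)}"

# Assumption: GNU date (-d is a GNU extension).
run_date=$(date -d "1 day ago" +%Y-%m-%d)

# Write the output properties as a single JSON object.
printf '{"date.to.run.against": "%s"}\n' "$run_date" > "$JOB_OUTPUT_PROP_FILE"

# Show what was written, for the job log.
cat "$JOB_OUTPUT_PROP_FILE"
```

The dependent pig job could then reference the value in its job file with a line such as params.date=${date.to.run.against}, where 'date' is just an example parameter name.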
-Richard


On Tuesday, May 21, 2013 11:44:31 AM UTC-7, Jim wrote:
I have my data time-partitioned, so for the nightly jobs I'd like to run the pig job against the data from the previous day. What's the best way to automate that in Azkaban so that each day there's a new dynamic variable for the "date to run against" that gets submitted to pig?

thanks!
