Adding group back in.
Oh *2*.0. For some reason I was just thinking 1.0, which is just a
minor change from the 0.20 series. 2.0 is the whole new YARN setup,
right? Yes, this makes it much more likely that the version change is
the source of the problem.
The error you're seeing seems to be correlated with the way classes
are loaded. 2.0 could very well have changed that process
significantly. What I'd recommend is going through the 2.0 docs and
seeing if/how the InputFormat infrastructure has changed. Either way,
you'll want to update the development dependencies to reflect the new
version. I'll be happy to code review changes, but unfortunately, I
don't have time to do this upgrade myself.
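Since the ExceptionInInitializerError fires while clojure.core is being loaded, one thing worth ruling out is a classpath conflict between classes bundled into the standalone jar and the 2.0.0 classes already on the cluster. A quick way to see what the jar actually ships — this is just a generic sketch, nothing oddjob-specific, relying on the fact that a jar is a plain zip archive readable with Python's zipfile:

```python
import zipfile

def bundled_classes(jar_path, prefix):
    """List .class files under a package prefix inside a jar (jars are zip archives)."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist()
                      if name.startswith(prefix) and name.endswith(".class"))

# For example, check whether the standalone jar bundles its own copies of
# Hadoop classes, which could shadow or conflict with the cluster's 2.0.0 jars:
#   bundled_classes("/home/mrwoof/oddjob-1.0.1-standalone.jar", "org/apache/hadoop/")
```

If that turns up `org/apache/hadoop/...` entries compiled against 0.20, that would point at the build dependencies rather than the InputFormat code itself.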
On Tue, Apr 16, 2013 at 10:37 AM, Michael Richman <m...@bitly.com> wrote:
> One of my colleagues highlighted the fact that we're using Hadoop version
> 2.0.0 while your README says "This version of oddjob is designed to be used
> as a -libjar argument with hadoop 0.20."
>
> We talked about Hadoop version in our general versions mail, but I figured
> I'd raise it again since it came up internally. Do we think the issue could
> lie in there somewhere?
>
>
> On Tue, Apr 16, 2013 at 8:47 AM, Michael Richman <m...@bitly.com> wrote:
>>
>> Hey Jim,
>>
>> Thanks for this! Unfortunately, I get the same error. The full run command
>> and output is below, in case it gives you any more ideas.
>>
>> % python test_unify_one_love.py -o hdfs:///user/mrwoof/test-2013-01 -r
>> hadoop
>> hdfs:///user/mrwoof/keyword_clicks.stream06_ec2.2013-02-01_00.10.log.gz
>> --hadoop-arg -libjars --hadoop-arg /home/mrwoof/oddjob-1.0.1-standalone.jar
>> --no-output --jobconf mapred.reduce.tasks=120
>> --input_file=input_multi_sample.json --month=01 --jobconf
>> mapred.output.compress=true --jobconf
>> mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec
>> --jobconf mapred.job.name=audience-analysis-test-2013-01
>> using configs in /etc/mrjob.conf
>> creating tmp directory
>> /tmp/test_unify_one_love.mrwoof.20130416.144420.430216
>> Copying non-input files into
>> hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/
>> Using Hadoop version 2.0.0
>> HADOOP: Exception in thread "main" java.lang.ExceptionInInitializerError
>> HADOOP: at clojure.core__init.__init0(Unknown Source)
>> HADOOP: at clojure.core__init.<clinit>(Unknown Source)
>> HADOOP: at java.lang.Class.forName0(Native Method)
>> HADOOP: at java.lang.Class.forName(Class.java:266)
>> HADOOP: at clojure.lang.RT.loadClassForName(RT.java:2098)
>> HADOOP: at clojure.lang.RT.load(RT.java:430)
>> HADOOP: at clojure.lang.RT.load(RT.java:411)
>> HADOOP: at clojure.lang.RT.doInit(RT.java:447)
>> HADOOP: at clojure.lang.RT.<clinit>(RT.java:329)
>> HADOOP: at clojure.lang.Namespace.<init>(Namespace.java:34)
>> HADOOP: at clojure.lang.Namespace.findOrCreate(Namespace.java:176)
>> HADOOP: at clojure.lang.Var.internPrivate(Var.java:163)
>> HADOOP: at clojure.lang.Var.invoke(Var.java:415)
>> HADOOP: at clojure.lang.RT.doInit(RT.java:460)
>> HADOOP: at clojure.lang.RT.<clinit>(RT.java:329)
>> HADOOP: ... 28 more
>> Job failed with return code 1: ['/usr/bin/hadoop', 'jar',
>> '/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.1.3.jar',
>> '-files',
>> 'hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/test_unify_one_love.py#test_unify_one_love.py,hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/input_multi_sample.json#input_multi_sample.json',
>> '-archives',
>> 'hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/mrjob.tar.gz#mrjob.tar.gz',
>> '-libjars', '/home/mrwoof/oddjob-1.0.1-standalone.jar', '-D',
>> 'mapred.job.name=audience-analysis-test-2013-01', '-D',
>> 'mapred.output.compress=true', '-D',
>> 'mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec',
>> '-D', 'mapred.reduce.tasks=120', '-cmdenv', 'PYTHONPATH=mrjob.tar.gz',
>> '-outputformat', 'oddjob.MultipleJSONOutputFormat', '-input',
>> 'hdfs:///user/mrwoof/keyword_clicks.stream06_ec2.2013-02-01_00.10.log.gz',
>> '-output', 'hdfs:///user/mrwoof/test-2013-01', '-mapper',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --mapper
>> --input_file input_multi_sample.json --month 01', '-combiner',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --combiner
>> --input_file input_multi_sample.json --month 01', '-reducer',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --reducer
>> --input_file input_multi_sample.json --month 01']
>> Scanning logs for probable cause of failure
>> Traceback (most recent call last):
>>   File "test_unify_one_love.py", line 445, in <module>
>>     HadoopAudienceAnalysis.run()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/job.py", line 545, in run
>>     mr_job.execute()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/job.py", line 561, in execute
>>     self.run_job()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/job.py", line 631, in run_job
>>     runner.run()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/runner.py", line 490, in run
>>     self._run()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/hadoop.py", line 246, in _run
>>     self._run_job_in_hadoop()
>>   File "/bitly/local/lib/python2.5/site-packages/mrjob/hadoop.py", line 449, in _run_job_in_hadoop
>>     raise Exception(msg)
>> Exception: Job failed with return code 1: ['/usr/bin/hadoop', 'jar',
>> '/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.1.3.jar',
>> '-files',
>> 'hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/test_unify_one_love.py#test_unify_one_love.py,hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/input_multi_sample.json#input_multi_sample.json',
>> '-archives',
>> 'hdfs:///user/mrwoof/tmp/mrjob/test_unify_one_love.mrwoof.20130416.144420.430216/files/mrjob.tar.gz#mrjob.tar.gz',
>> '-libjars', '/home/mrwoof/oddjob-1.0.1-standalone.jar', '-D',
>> 'mapred.job.name=audience-analysis-test-2013-01', '-D',
>> 'mapred.output.compress=true', '-D',
>> 'mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec',
>> '-D', 'mapred.reduce.tasks=120', '-cmdenv', 'PYTHONPATH=mrjob.tar.gz',
>> '-outputformat', 'oddjob.MultipleJSONOutputFormat', '-input',
>> 'hdfs:///user/mrwoof/keyword_clicks.stream06_ec2.2013-02-01_00.10.log.gz',
>> '-output', 'hdfs:///user/mrwoof/test-2013-01', '-mapper',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --mapper
>> --input_file input_multi_sample.json --month 01', '-combiner',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --combiner
>> --input_file input_multi_sample.json --month 01', '-reducer',
>> '/bitly/local/bin/python test_unify_one_love.py --step-num=0 --reducer
>> --input_file input_multi_sample.json --month 01']
>>
>>
>> % cat /etc/mrjob.conf
>> runners:
>>   hadoop:
>>     hadoop_bin: /usr/bin/hadoop
>>     hadoop_home: /usr/lib/hadoop
>>     hadoop_streaming_jar: /usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh4.1.3.jar
>>     python_bin: /bitly/local/bin/python
>>
>>
>>
>> On Mon, Apr 15, 2013 at 7:09 PM, Jim Blomo <jbl...@yelp.com> wrote:
>>>
>>> Hi, bumped the dependencies and created this jar. Please give it a try.
>>
>>
>>
>>
>> --
>> _____________________
>> michael richman
>> sys arch
>> m...@bitly.com
>
>
>
>
> --
> _____________________
> michael richman
> sys arch
> m...@bitly.com