Hi,
I am trying to get HelloWorld up and running and am beating my head against the wall. It works fine when I submit the job using Amazon EMR, however, I simply cannot get it to work locally. I am basing my custom job off of HelloWorld, and it is a nightmare to have to debug it running on EC2, as it takes several minutes to upload the JAR to S3 and then wait for the process to spin up. I would very much like to run it locally, as the README file suggests you should be able to.
The problem seems to be an intricate ballet between the commoncrawl, jets3t and hadoop jars. If I try to use the default jets3t jar in Hadoop 1.0.3 (0.6.1), it fails thusly:
Exception in thread "main" java.lang.VerifyError: (class: org/commoncrawl/hadoop/io/JetS3tARCSource, method: configureImpl signature: (Lorg/apache/hadoop/mapred/JobConf;)V) Incompatible argument to function
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:865)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:891)
at org.commoncrawl.hadoop.io.ARCInputFormat.configure(ARCInputFormat.java:159)
...
If I instead kill that file and use version 0.8.1 (the version included as part of HelloWorld), I get much further, but still fail like this:
Exception in thread "main" java.lang.NoSuchMethodError: org.jets3t.service.impl.rest.httpclient.RestS3Service.<init>(Lorg/jets3t/service/security/AWSCredentials;)V
at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.initialize(Jets3tNativeFileSystemStore.java:54)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
at org.apache.hadoop.fs.s3native.$Proxy1.initialize(Unknown Source)
at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:216)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapred.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:110)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:889)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:824)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1261)
at org.commoncrawl.tutorial.HelloWorld.main(HelloWorld.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
In the first case (using 0.6.1), it seems that Hadoop is happy with the jets3t version, but commoncrawl is not.
In the second case (using 0.8.1), commoncrawl is happy with the jets3t version, but hadoop is not
Am I missing something? Can anyone get this example to run locally?
FYI, I am running Hadoop 1.0.3 on OSX Lion. Also tried on Ubuntu 12.04 but am seeing similar problems.
I also tried all versions in between 0.6.1 and 0.9.0 of jets3t.
Any help is greatly appreciated.
Thanks,
Eric