scalding distributed mode - support for jar dependencies


Ophir Yoktan

Nov 21, 2013, 10:39:59 AM
to cascadi...@googlegroups.com
I'm trying to run a simple Scalding job using scald.rb on a Hadoop cluster (local mode worked fine).
The job depends on external jars.

When running native Hadoop jobs, we use the -libjars parameter to distribute the jars.

I added the -libjars parameter to hadoop_command in scald.rb:

def hadoop_command
  "HADOOP_CLASSPATH=/usr/share/java/hadoop-lzo-0.4.15.jar:#{JARBASE}:job-jars/#{JOBJAR} " +
    "hadoop jar #{JARBASE} -libjars job-jars/#{JOBJAR},#{OPTS[:libjars]} #{hadoop_opts} #{JOB} --hdfs " +
    JOB_ARGS
end

(the added code is the -libjars argument)

The resulting command also looks fine:

HADOOP_CLASSPATH=/usr/share/java/hadoop-lzo-0.4.15.jar:scalding-core-assembly-0.9.0rc4.jar:job-jars/simpleJob.jar hadoop jar scalding-core-assembly-0.9.0rc4.jar -libjars job-jars/simpleJob.jar,<comma separated jars> -Dmapred.reduce.tasks=20 -Dmapred.min.split.size=2000000000 simpleJob --hdfs --input /in --output /out

But I get a ClassNotFoundException:

Exception in thread "main" java.lang.Throwable: If you know what exactly caused this error, please consider contributing to GitHub via following link.
        at com.twitter.scalding.Tool$.main(Tool.scala:154)
        at com.twitter.scalding.Tool.main(Tool.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at com.twitter.scalding.Job$.apply(Job.scala:41)
        at com.twitter.scalding.Tool.getJob(Tool.scala:50)
        at com.twitter.scalding.Tool.run(Tool.scala:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at com.twitter.scalding.Tool$.main(Tool.scala:140)
        ... 6 more
Caused by: java.lang.NoClassDefFoundError: com/package/Class
        at simpleJob.<init>(seq.scala:7)
        ... 15 more
Caused by: java.lang.ClassNotFoundException: package.Class
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 16 more


Oscar Boykin

Nov 22, 2013, 3:51:36 PM
to cascadi...@googlegroups.com
Well, on submission it is failing to find the jar.

Note:
Caused by: java.lang.NoClassDefFoundError: com/package/Class
        at simpleJob.<init>(seq.scala:7)
        ... 15 more

I think everything you are adding needs to be both on the classpath AND in libjars. Note that it cannot find com/package/Class: that class is not on the classpath (libjars, for some reason, is independent).

Can you try that?
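
A minimal sketch of that suggestion, assuming the extra jars are available as a plain Ruby array. The helper name hadoop_command_with_deps and its parameters are hypothetical and not part of scald.rb; it only shows each dependency jar being listed twice, colon-separated in HADOOP_CLASSPATH and comma-separated in -libjars. It has not been tested against a cluster.

```ruby
# Hypothetical helper (not part of scald.rb): builds one command string in
# which every dependency jar appears both in HADOOP_CLASSPATH and in -libjars.
def hadoop_command_with_deps(jarbase, jobjar, job, extra_jars, hadoop_opts = "", job_args = "")
  # The job jar plus any extra dependency jars the job needs.
  dep_jars = ["job-jars/#{jobjar}"] + extra_jars

  # HADOOP_CLASSPATH is a colon-separated list, used by the JVM that
  # submits the job.
  classpath = ([jarbase] + dep_jars).join(":")

  # -libjars takes a comma-separated list of jars to ship with the job.
  libjars = dep_jars.join(",")

  "HADOOP_CLASSPATH=#{classpath} " +
    "hadoop jar #{jarbase} -libjars #{libjars} #{hadoop_opts} #{job} --hdfs #{job_args}"
end

# Example call; lib/a.jar and lib/b.jar are placeholder paths.
puts hadoop_command_with_deps(
  "scalding-core-assembly-0.9.0rc4.jar", "simpleJob.jar", "simpleJob",
  ["lib/a.jar", "lib/b.jar"],
  "-Dmapred.reduce.tasks=20", "--input /in --output /out"
)
```

Deriving both lists from the same array keeps the two from drifting apart when a dependency is added or removed.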


--
You received this message because you are subscribed to the Google Groups "cascading-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cascading-use...@googlegroups.com.
To post to this group, send email to cascadi...@googlegroups.com.
Visit this group at http://groups.google.com/group/cascading-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/cascading-user/e16106a5-b403-424f-9334-e9cc6595232b%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Oscar Boykin :: @posco :: http://twitter.com/posco

Ophir Yoktan

Nov 22, 2013, 4:30:37 PM
to cascadi...@googlegroups.com
I tried both (separately and together); no change in the output.
