Issue running on Hadoop


Adrian

Oct 29, 2013, 4:15:04 PM
to camu...@googlegroups.com
So I started to post earlier, but realized I just had a stupid classpath issue.  I resolved that, but I am still not able to complete the map/reduce tasks on Hadoop.  I am running a single node on my own box for testing purposes.  The CamusJob starts correctly and creates the directories, but then warns that it can't find the native Hadoop libraries:
13/10/29 13:55:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable   
and the job finishes reporting a failure.  I catted the job log on HDFS and I see the following exception:

MapAttempt TASK_TYPE="CLEANUP" TASKID="task_201310250938_0010_m_000001" TASK_ATTEMPT_ID="attempt_201310250938_0010_m_000001_0" TASK_STATUS="FAILED" FINISH_TIME="1383076533477" HOSTNAME="beemer" ERROR="Error: java\.lang\.ClassNotFoundException: com\.linkedin\.camus\.etl\.IEtlKey
        at java\.net\.URLClassLoader$1\.run(URLClassLoader\.java:366)
        at java\.net\.URLClassLoader$1\.run(URLClassLoader\.java:355)

Obviously this means the map task is unable to find IEtlKey, but I don't know why.  Does this jar have to be manually copied over to Hadoop somehow?  Am I missing something?

Thanks,
Adrian

Gaurav Gupta

Oct 29, 2013, 4:19:55 PM
to Adrian, camu...@googlegroups.com
There is a parameter in the config file:
hdfs.default.classpath.dir=/hadoop/libs

This property points to a directory on HDFS that holds the libraries Camus needs.
When the job runs, it looks for the jar files at this location and adds them to the job's classpath.

I guess all you need to do is create a directory on HDFS, copy all the dependency jars there, and set this parameter in your config file.
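
For example, something along these lines should do it (the jar names are just illustrative, use whatever jars your Camus build produces; if I remember right, IEtlKey lives in the camus-api jar):

hadoop fs -mkdir /hadoop/libs
hadoop fs -put camus-api-*.jar camus-etl-kafka-*.jar /hadoop/libs/

and then in your camus.properties:

hdfs.default.classpath.dir=/hadoop/libs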
That should take care of the issue.

cti...@gmail.com

Jul 14, 2014, 6:54:45 PM
to camu...@googlegroups.com, adrian....@gmail.com, ggu...@linkedin.com
Gaurav -

I have created the folder and copied the jars into it:

hdfs.default.classpath.dir=/hadoop/libs

I still get an exception:

2014-07-14 15:52:41,147 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201407141543_0003_m_000002_0: Error: java.lang.ClassNotFoundException: com.linkedin.camus.etl.IEtlKey
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:247)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:1713)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:1678)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:1772)
at org.apache.hadoop.mapreduce.task.JobContextImpl.getOutputFormatClass(JobContextImpl.java:227)
at org.apache.hadoop.mapred.Task.initialize(Task.java:521)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:313)
at org.apache.hadoop.mapred.Child$4.