Unable to get anything to work in Scoobi 0.5.0 - class loader problem

Ben Wing

Jul 25, 2012, 6:15:54 AM
to scoobi...@googlegroups.com
Hello.  It ended up taking a lot of work to get everything more or less working with Scoobi 0.5.0 -- more than I expected.  But even still, things aren't in fact working.

The logs show stuff like this:

2012-07-25 04:47:01,900 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201207242004_0007_m_000000_2'
2012-07-25 04:47:01,962 INFO org.apache.hadoop.mapred.JobInProgress: Choosing a failed task task_201207242004_0007_m_000000
2012-07-25 04:47:01,962 INFO org.apache.hadoop.mapred.JobTracker: Adding task (MAP) 'attempt_201207242004_0007_m_000000_3' to tip task_201207242004_0007_m_000000, for tracker 'tracker_c202-113.longhorn:localhost.localdomain/127.0.0.1:54592'
2012-07-25 04:47:01,962 INFO org.apache.hadoop.mapred.JobInProgress: Choosing rack-local task task_201207242004_0007_m_000000
2012-07-25 04:47:03,176 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201207242004_0007_m_000000_3: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
    at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306
2012-07-25 04:47:06,510 INFO org.apache.hadoop.mapred.TaskInProgress: TaskInProgress task_201207242004_0007_m_000000 has failed 4 times.
2012-07-25 04:47:06,510 INFO org.apache.hadoop.mapred.JobInProgress: Aborting job job_201207242004_0007
2012-07-25 04:47:06,511 INFO org.apache.hadoop.mapred.JobInProgress: Killing job 'job_201207242004_0007'


This happens even for a simple WordCount example, with the way that I build and package an assembly.

I've tried absolutely everything I can think of, and no luck:

1. I've tried both using sbt-assembly and sbt-scoobi; same issue both ways.
2. The cause wasn't that I was still using SBT 0.11: I upgraded to SBT 0.12 (RC2) and rebuilt from scratch, but no luck, still the same problem.
3. I tried adding these lines at the beginning of the run method, but still no luck:

    // remove the existing jars on the cluster
    deleteJars
    // upload the dependencies to the cluster
    uploadLibJars
4. I tried removing those lines again and adding 'override def upload = false', in the hope that there was a problem with the (perhaps overly clever) new LibJars stuff, and it might work if I disabled it entirely. (Does 'upload = false' do that?)

At this point I simply don't know what to do.  This used to work in 0.4.0 -- what exactly has changed in initialization/class loading in 0.5.0 that caused this problem? 

I have to add, I seem to have had inordinately bad luck with Hadoop -- everything going wrong in 155 different ways, each time different from the next.  The whole system seems extremely fragile -- lots and lots of little pieces that all have to work perfectly for everything to fly, but they all have a habit of failing more or less randomly (or at least it seems that way).  It sometimes seems as if just blowing on them in the wrong way can make things break.  Is this the experience of others, too, or does it get better once you really understand all the bits and pieces and really know how they fit together?



BTW here's the output on the console, although I imagine it won't help too much:


benwing@c201-109:~ 5:02 32209% peg --verbose --hadoop run opennlp.textgrounder.preprocess.ScoobiWordCount input-naleo-jun-21 output-39 -- scoobi verbose
Executing: hadoop jar /home/01683/benwing/devel/poligrounder/target/PoliGrounder-hadoop-0.1.0.jar opennlp.textgrounder.preprocess.ScoobiWordCount input-naleo-jun-21 output-39 -- scoobi verbose
[INFO] TextInput - Input path: input-naleo-jun-21 (0.010698149GB)
[INFO] TextOutput - Output path: output-39
[INFO] Job - Running job: scoobi-20120725-050215-ScoobiWordCount$-9b79d45c-d58e-414f-8363-c950e6428ec5
[INFO] Job - Number of steps: 1
[INFO] Job - Running step: 1 of 1
[INFO] Job - Number of input channels: 1
[INFO] Job - Number of output channels: 1
[INFO] Step - Total input size: 0.010698149GB
[INFO] Step - Number of reducers: 1
[INFO] FileInputFormat - Total input paths to process : 1
[WARN] LoadSnappy - Snappy native library is available
[INFO] NativeCodeLoader - Loaded the native-hadoop library
[INFO] LoadSnappy - Snappy native library loaded
[INFO] Step - MapReduce job 'job_201207242004_0010' submitted. Please see http://c201-109:50030/jobdetails.jsp?jobid=job_201207242004_0010 for more info.
[INFO] Step - Task attempt 'attempt_201207242004_0010_m_000000_0' failed! Trying again. Please see http://c201-112.longhorn:50060/tasklog?attemptid=attempt_201207242004_0010_m_000000_0&all=true for task attempt logs
[INFO] Step - Task attempt 'attempt_201207242004_0010_m_000000_1' failed! Trying again. Please see http://c202-111.longhorn:50060/tasklog?attemptid=attempt_201207242004_0010_m_000000_1&all=true for task attempt logs
[INFO] Step - Task attempt 'attempt_201207242004_0010_m_000000_2' failed! Trying again. Please see http://c201-113.longhorn:50060/tasklog?attemptid=attempt_201207242004_0010_m_000000_2&all=true for task attempt logs
[INFO] Step - Map 100%    Reduce 100%
[ERROR] Step - Task 'task_201207242004_0010_m_000000' failed! Please see http://c202-101.longhorn:50060/tasklog?attemptid=attempt_201207242004_0010_m_000000_3&all=true for task attempt logs
Exception in thread "main" com.nicta.scoobi.impl.exec.JobExecException: MapReduce job 'job_201207242004_0010' failed! Please see http://c201-109:50030/jobdetails.jsp?jobid=job_201207242004_0010 for more info.
at com.nicta.scoobi.impl.exec.MapReduceJob.run(MapReduceJob.scala:270)
at com.nicta.scoobi.impl.exec.Executor$.executeMSCR(Executor.scala:160)
at com.nicta.scoobi.impl.exec.Executor$.executeArr(Executor.scala:143)
at com.nicta.scoobi.impl.exec.Executor$.executeArrOutput(Executor.scala:139)
at com.nicta.scoobi.application.PFn$$anon$11$$anonfun$execute$1.apply(Persister.scala:314)
at com.nicta.scoobi.application.PFn$$anon$11$$anonfun$execute$1.apply(Persister.scala:312)
at scalaz.package$State$$anon$1.apply(package.scala:138)
at scalaz.package$State$$anon$1.apply(package.scala:137)
at scalaz.StateT$class.eval(StateT.scala:24)
at scalaz.package$State$$anon$1.eval(package.scala:137)
at com.nicta.scoobi.application.Persister$$anon$3.apply(Persister.scala:51)
at com.nicta.scoobi.application.Persister$.persist(Persister.scala:43)
at com.nicta.scoobi.Persist$class.persist(Scoobi.scala:52)
at com.nicta.scoobi.Scoobi$.persist(Scoobi.scala:23)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.run(ScoobiWordCount.scala:29)
at com.nicta.scoobi.application.ScoobiApp$$anonfun$main$1.apply$mcV$sp(ScoobiApp.scala:39)
at com.nicta.scoobi.application.ScoobiApp$$anonfun$main$1.apply(ScoobiApp.scala:37)
at com.nicta.scoobi.application.ScoobiApp$$anonfun$main$1.apply(ScoobiApp.scala:37)
at com.nicta.scoobi.application.Hadoop$class.runOnCluster(Hadoop.scala:59)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.runOnCluster(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.Hadoop$class.executeOnCluster(Hadoop.scala:35)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.executeOnCluster(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.Hadoop$$anonfun$onCluster$1.apply(Hadoop.scala:24)
at com.nicta.scoobi.application.LocalHadoop$class.withTimer(LocalHadoop.scala:58)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.withTimer(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.LocalHadoop$class.showTime(LocalHadoop.scala:66)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.showTime(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.Hadoop$class.onCluster(Hadoop.scala:24)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.onCluster(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.Hadoop$class.onHadoop(Hadoop.scala:28)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.onHadoop(ScoobiWordCount.scala:8)
at com.nicta.scoobi.application.ScoobiApp$class.main(ScoobiApp.scala:37)
at opennlp.textgrounder.preprocess.ScoobiWordCount$.main(ScoobiWordCount.scala:8)
at opennlp.textgrounder.preprocess.ScoobiWordCount.main(ScoobiWordCount.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)


Looking at the assembly, the class in question is indeed present:

benwing@c201-109:~poli/target 4:56 32187% javap -classpath PoliGrounder-hadoop-0.1.0.jar com.nicta.scoobi.impl.exec.MscrMapper 
Compiled from "MscrMapper.scala"
public class com.nicta.scoobi.impl.exec.MscrMapper extends org.apache.hadoop.mapreduce.Mapper implements scala.ScalaObject{
    public final com.nicta.scoobi.impl.rtt.TaggedKey com$nicta$scoobi$impl$exec$MscrMapper$$tk();
    public final com.nicta.scoobi.impl.rtt.TaggedValue com$nicta$scoobi$impl$exec$MscrMapper$$tv();
    public void setup(org.apache.hadoop.mapreduce.Mapper$Context);
    public void map(java.lang.Object, java.lang.Object, org.apache.hadoop.mapreduce.Mapper$Context);
    public void cleanup(org.apache.hadoop.mapreduce.Mapper$Context);
    public com.nicta.scoobi.impl.exec.MscrMapper();
}
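
The same presence check can be scripted with the JDK's java.util.jar API. This is only a minimal sketch: JarCheck, entryName and jarContainsClass are hypothetical names, not part of Scoobi or Hadoop. Note that it only confirms how the jar was packaged; it says nothing about whether the class loader of the task JVM on the cluster can see that jar.

```scala
import java.util.jar.JarFile

// Hypothetical helper, not part of Scoobi or Hadoop.
object JarCheck {
  /** Entry name a class occupies inside a jar, e.g. a.b.C -> a/b/C.class */
  def entryName(className: String): String =
    className.replace('.', '/') + ".class"

  /** True if the jar at `jarPath` contains an entry for `className`. */
  def jarContainsClass(jarPath: String, className: String): Boolean = {
    val jar = new JarFile(jarPath)
    try jar.getEntry(entryName(className)) != null
    finally jar.close()
  }
}
```

For example, JarCheck.jarContainsClass("PoliGrounder-hadoop-0.1.0.jar", "com.nicta.scoobi.impl.exec.MscrMapper") should agree with the javap output above.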


Thanks,

Ben

Eric Springer

Jul 25, 2012, 7:23:18 AM
to scoobi...@googlegroups.com
Hi Ben,

Eric T. will probably be able to help you through this tomorrow, but could you please give us as much information as possible (e.g. what version of Hadoop is on your cluster)? Also, is there anything useful in those linked task logs?

Also, one thing you can try is adding "% provided" to your app's build.sbt file and using `sbt package-hadoop`

e.g. try using this as your build.sbt file

https://github.com/NICTA/scoobi/blob/5037a9b9c9ab77228fb7ec97113326774149adfa/examples/wordCount/build.sbt

Alex Cozzi

Jul 25, 2012, 11:00:33 AM
to scoobi...@googlegroups.com
Hi Ben,
I think I am hitting the same problem as you. I also get the com.nicta.scoobi.impl.exec.MscrMapper class not being found by the class loader, and adding "provided" in the sbt build file does not fix it.

I think it has to do with the way you set the main class of the Hadoop job. In plain MapReduce you call setJarByClass to tell Hadoop which class, and therefore which jar, belongs to your app. I do not know how Scoobi does it, but something must have changed between 0.4.0 and 0.5.0-SNAPSHOT.

Actually, I wonder whether it is possible to have multiple ScoobiApps in the same project and somehow pick a different main via arguments on the command line, like:

hadoop jar Scoobi-app.jar com.example.MyMain1  args args 

and

hadoop jar Scoobi-app.jar com.example.MyMain2  args args 


And about Hadoop: yes, unfortunately, in general Hadoop is a pretty fragile mess. Think of it like a hot rod: it goes really fast but can burst into flames at any moment.
Alex

Ben Wing

Jul 25, 2012, 5:07:41 PM
to scoobi...@googlegroups.com
Hmmm, I do have multiple ScoobiApps in the same project.  They are simply different objects, and I run the appropriate one exactly like you mention below, by telling Hadoop the name of the class to run, like this:

hadoop jar /home/01683/benwing/devel/poligrounder/target/PoliGrounder-hadoop-0.1.0.jar opennlp.textgrounder.preprocess.ScoobiWordCount input-naleo-jun-21 output-40 -- scoobi verbose

Eric, would this break things?  Can you tell me approximately what has changed between 0.4.0 and 0.5.0 in the initial load-up phase?

ben

Ben Wing

Jul 25, 2012, 5:20:30 PM
to scoobi...@googlegroups.com
Eric, the version of Hadoop used to start the cluster is hadoop-0.20.2-cdh3u2 (Cloudera).  This is the same version that the 'hadoop' program itself refers to.  I notice that Scoobi appears to be linked with cdh3u1 instead of cdh3u2, although I'm not sure that would make any difference.

Nothing useful in the task logs, just the same stuff:

2012-07-25 05:12:50,065 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-07-25 05:12:50,205 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2012-07-25 05:12:50,320 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-07-25 05:12:50,325 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
        at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
        ... 8 more
2012-07-25 05:12:50,330 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Ben Wing

Jul 26, 2012, 4:12:09 AM
to scoobi...@googlegroups.com
BTW if there's anything else you need (e.g. the code itself?), let me know.  I do want to get this figured out eventually, else I'll be stuck forever at 0.4.0.  Sounds like Alex is getting the same problem.

ben

Alex Cozzi

Jul 26, 2012, 6:43:11 PM
to scoobi...@googlegroups.com
Just to help in debugging: Ben's fork of scoobi-0.5.0 for cdh4 works (https://github.com/blever/scoobi/tree/cdh4), meaning that I do not have the class loader problem.

I do not mean to imply that the problem is cdh4 vs cdh3, but I mean that the bug must have been introduced somewhere after scoobi-0.5.0 and Ben's fork diverged.

Eric Springer

Jul 26, 2012, 11:46:10 PM
to scoobi...@googlegroups.com
Thanks guys. We're in the middle of doing a few substantial changes for our 0.5.0 release, like proper cdh4 support as well as improvements in the deploy department. It might be a few days before we get everything working together

:)

Alex Cozzi

Jul 27, 2012, 1:35:08 AM
to scoobi...@googlegroups.com
No worries. I can work with the cdh4 version in the meantime, and Ben Wing seems to be fine as well, having reverted to 0.4.0, so I guess we are both good.
Thank you for all your work on Scoobi: even in its early stage it is the best way to write map-reduce jobs yet, so I guess we are all quite eager to move to it as soon as possible.
Alex

Eric Torreborre

Jul 27, 2012, 4:57:10 AM
to scoobi...@googlegroups.com
Hi guys,

Let me walk you through what the code is doing; this might help with debugging.

 1. if LibJars.upload is true (the default) then we upload the jars in the "libjars" directory on the cluster

 1.1. the list of selected jars comes from the current context class loader, more precisely every jar that's referenced with a URL in the classloader and:
     - containing .ivy2 or .m2 (meaning that scoobi-0.5.0.jar should be included if it's been downloaded with sbt)
     - not containing "hadoop-core" to avoid loading a jar that should already be available on the server
   
     => you can debug this by overriding the method in your ScoobiApp and printing the result of super.jars

 1.2. then the jars are uploaded to the "libjars" directory on the cluster

     => you can debug this by checking with hadoop fs -ls libjars what's currently in your directory
 
 2. The next step is to set-up the Configuration object with the right information

  2.1 we add all the jars which are in the libjars directory to the DistributedCache: DistributedCache.addFileToClassPath(path, configuration)
  2.2 we add the paths of those jars, separated by ":", to the value of the mapred.classpath property 

    => one way to debug that is to check the Configuration properties (I think that's accessible in the web interface for the job execution)    
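
To make steps 1.1 and 2.2 concrete, here is a minimal standalone sketch of the selection rule and the ":"-joined classpath value as described above. It is not the actual LibJars source: LibJarsSketch and its method names are hypothetical, and it deliberately leaves out the upload and DistributedCache calls.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical re-statement of the rules in steps 1.1 and 2.2; not Scoobi code.
object LibJarsSketch {
  /** Step 1.1: the path mentions .ivy2 or .m2, and the jar is not hadoop-core. */
  def isSelected(url: URL): Boolean = {
    val file = url.getFile
    Seq(".ivy2", ".m2").exists(s => file.contains(s)) && !file.contains("hadoop-core")
  }

  /** URLs referenced by a class loader; empty if it is not a URLClassLoader. */
  def urlsOf(loader: ClassLoader): Seq[URL] = loader match {
    case u: URLClassLoader => u.getURLs.toSeq
    case _                 => Seq.empty
  }

  /** The jars that would be picked for upload under the rule above. */
  def selectedJars(loader: ClassLoader): Seq[URL] =
    urlsOf(loader).filter(isSelected)

  /** Step 2.2: jar paths joined with ":". */
  def classpathValue(jars: Seq[URL]): String =
    jars.map(_.getFile).mkString(":")
}
```

Calling LibJarsSketch.selectedJars(Thread.currentThread.getContextClassLoader) from the driver prints roughly what step 1.1 would select. Under this rule, a jar whose path is outside .ivy2 and .m2 is never selected.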

Based on these steps, can you please tell me what you observe when the MscrMapper class is not found?

Thanks,

Eric.

Ben Wing

Jul 31, 2012, 1:13:49 AM
to scoobi...@googlegroups.com


On Thursday, July 26, 2012 6:43:11 PM UTC-4, Alex Cozzi wrote:
> Just to help in debugging: Ben's fork of scoobi-0.5.0 for cdh4 works (https://github.com/blever/scoobi/tree/cdh4), meaning that I do not have the class loader problem.
>
> I do not mean to imply that the problem is cdh4 vs cdh3, but I mean that the bug must have been introduced somewhere after scoobi-0.5.0 and Ben's fork diverged.

I'm using cdh3 so this can't be the problem. 

Alex Cozzi

Jul 31, 2012, 8:48:41 PM
to scoobi...@googlegroups.com
I think I might have an idea what the problem could be:

I noticed that on my cluster the libjars directory gets created but is empty. I suspect this is because I build my application's jar with "sbt package-hadoop" on one machine and then copy it to the cluster's gateway machine, so the .ivy2 and .m2 directories on the gateway are empty (the gateways are firewalled anyhow, so they cannot download jars from the internet). So I think the jar files are never uploaded to the distributed cache, and the job fails with:

java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:865)
        at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:195)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:628)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:242)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
        at org.apache.hadoop.mapred.Child.main(Child.java:236)
Caused by: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:818)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:863)
        ... 8 more

Eric Torreborre

Aug 1, 2012, 2:29:42 AM
to scoobi...@googlegroups.com
Hi Alex,

>  I create jar file of my application with "sbt package-hadoop" on one machine and then copy it on the gateway machine

This is indeed probably the issue. You can then override the "jars" method to get the correct jar URLs from either a configuration file, or from the URLClassLoader if it happens to reference them. 

Just replace the existing filter, Seq(".ivy2", ".m2").exists(url.getFile.contains), with something more relevant (at line 32 of com.nicta.scoobi.application.LibJars).
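
One possible "more relevant" source of jars is a plain local directory that was copied to the gateway along with the application. The sketch below is only an illustration under that assumption: LocalJars and jarsIn are hypothetical names, and the exact signature of the "jars" method to override is assumed here rather than checked against the Scoobi source.

```scala
import java.io.File
import java.net.URL

// Hypothetical helper; the directory layout is an assumption.
object LocalJars {
  /** All *.jar files directly inside `dir`, sorted by name; empty if `dir` does not exist. */
  def jarsIn(dir: File): Seq[URL] =
    Option(dir.listFiles)          // listFiles returns null for a missing directory
      .map(_.toSeq)
      .getOrElse(Seq.empty[File])
      .filter(f => f.isFile && f.getName.endsWith(".jar"))
      .sortBy(_.getName)
      .map(_.toURI.toURL)
}

// Inside a ScoobiApp this might then be wired in roughly as follows
// (assumed signature, untested against Scoobi 0.5.0):
//   override def jars = LocalJars.jarsIn(new File("deps"))
```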

Eric T.

Alex Cozzi

Aug 1, 2012, 12:27:14 PM
to scoobi...@googlegroups.com
Well, our gateway machines are firewalled, so I really need to physically move the files onto the machine with ssh. A solution that I like is what Cascading does (using the assembly plug-in for Maven, see: http://blog.mafr.de/2010/07/24/maven-hadoop-job/): it packages all the dependencies in a subdirectory of the job's jar. This way I can copy a single file to the gateway, and Scoobi could then potentially take care of copying them to the distributed cache.
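
As a rough illustration of that layout, the sketch below lists the dependency jars nested under a lib/ subdirectory of a job jar, which is the set a tool would then have to copy to the distributed cache. The "lib/" prefix and the names NestedJars, isNestedLib and nestedLibs are assumptions made for the example. As far as I know, Hadoop itself also puts jars under lib/ inside the job jar on the classpath, but that is worth verifying for your Hadoop version.

```scala
import java.util.jar.JarFile
import scala.collection.mutable.ListBuffer

// Hypothetical helper; assumes dependencies are packaged under lib/ in the job jar.
object NestedJars {
  /** True for entries such as lib/scoobi.jar. */
  def isNestedLib(entryName: String): Boolean =
    entryName.startsWith("lib/") && entryName.endsWith(".jar")

  /** Names of all nested dependency jars inside the job jar at `jarPath`. */
  def nestedLibs(jarPath: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      val names = ListBuffer.empty[String]
      val entries = jar.entries()
      while (entries.hasMoreElements) {
        val name = entries.nextElement().getName
        if (isNestedLib(name)) names += name
      }
      names.toList
    } finally jar.close()
  }
}
```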

Ben Wing

Aug 2, 2012, 7:09:32 PM
to scoobi...@googlegroups.com


On Wednesday, August 1, 2012 2:29:42 AM UTC-4, Eric Torreborre wrote:
> Hi Alex,
>
> >  I create jar file of my application with "sbt package-hadoop" on one machine and then copy it on the gateway machine
>
> This is indeed probably the issue. You can then override the "jars" method to get the correct jar URLs from either a configuration file, or from the URLClassLoader if it happens to reference them.
>
> Just replace the existing filter, Seq(".ivy2", ".m2").exists(url.getFile.contains), with something more relevant (at line 32 of com.nicta.scoobi.application.LibJars).
>
> Eric T.


Eric, I have access to two very different Hadoop configurations.

The one I was using before to test Scoobi 0.5 is a fairly small cluster with a long-term persistent HDFS file system, as well as a single job tracker, a single name node, and 16 task nodes.  I only have ssh access to the job tracker, and AFAIK the other machines are firewalled from the Internet and do not have access to my home directory on the job tracker -- i.e. the only shared file system is HDFS.  I compile and launch the application from the job tracker.  The version of the 'hadoop' client executable is Cloudera 0.20.2 cdh3u3; I'm not sure what the version of HDFS or the servers is but I would guess the same or similar.

I don't understand what Alex's issue is, but I have to ask -- why did this work before, and why doesn't it work now?  I thought the whole point of building an assembly/combined jar was precisely to include *all* the necessary libraries in it.  Why does Scoobi 0.5 screw around with trying to upload libraries itself rather than relying on what's in the assembly, as Scoobi 0.4 did?

Now, the other configuration is completely different.  This system has an enormous number (in the hundreds) of 8-core compute servers, managed by a Sun Grid Engine, where you submit jobs with qsub.  48 of these are set aside for Hadoop usage, but they don't form a normal Hadoop cluster.  Instead, all they really have is an extra local 2 TB disk installed on /hadoop.  Rather, what I do is ssh to a login node and then request some subset of compute servers (e.g. 8 nodes) for some amount of time (you get exclusive use of the servers you request, but for a maximum of only 24 hours!!) using an appropriate qsub script.  This is just a shell script with some extra directives in it telling qsub how many machines you're asking for, which type, for how long, etc., which gets run as soon as your requested resources are available.  It proceeds to set up one of the nodes as a combined job tracker/name node and all the rest as task nodes, and format a new HDFS using all the disks in /hadoop, storing the configuration info in a subdirectory of my home directory. Then, I ssh into the job tracker, copy my data into HDFS, and run my Hadoop tasks -- for a maximum of 24 hours, which is all you get at a time.  In this setup, my home directory as well as a series of ginormous 1000+ TB Lustre partitions are all available on all of the compute servers (as well as the login server), and I can freely ssh into all the compute servers that I've requested and have been given control over, and all of them can connect directly to the Internet.

This second system uses an installation of Hadoop Cloudera 0.20.2 cdh3u2 sitting in my home dir.  The same version is used both for starting HDFS and the various servers and the client 'hadoop' executable.  It took a good deal of dicking around with the configuration and qsub script and such to get it working, so I'd rather not touch it, although conceivably I could update to a newer version, since (as mentioned above) I start a new HDFS each time.

After Alex's and your comment, I tried getting things working on this system, since my home dir is mounted on all the machines, so whatever issue there is with accessing the .ivy2 and .m2 dirs, it shouldn't exist here.

Unfortunately:

(1) I get a warning on some of my code when compiling:

[warn] /home/01683/benwing/devel/poligrounder/src/main/scala/opennlp/textgrounder/util/hadoop.scala:69: method isDir in class FileStatus is deprecated: see corresponding Javadoc for more information.
[warn]       get_file_system(filename).getFileStatus(new Path(filename)).isDir
[warn]                                                                   ^
[warn] one warning found

This is not problematic except that it indicates that everything is being compiled against Hadoop 0.21 or later, which doesn't sound good, and indeed:

(2) I immediately get an error when running, apparently due to an incompatibility between 0.20 and 0.21:

Exception in thread "main" java.io.IOException: Input path input-naleo-jun-21 does not exist.
at com.nicta.scoobi.io.text.TextInput$TextSource$$anonfun$inputCheck$1.apply(TextInput.scala:140)
at com.nicta.scoobi.io.text.TextInput$TextSource$$anonfun$inputCheck$1.apply(TextInput.scala:136)
at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
at scala.collection.immutable.List.foreach(List.scala:76)
at com.nicta.scoobi.io.text.TextInput$TextSource.inputCheck(TextInput.scala:136)
at com.nicta.scoobi.impl.exec.Executor$$anonfun$prepare$4.apply(Executor.scala:79)
at com.nicta.scoobi.impl.exec.Executor$$anonfun$prepare$4.apply(Executor.scala:79)
at scala.collection.immutable.Set$Set1.foreach(Set.scala:86)
at com.nicta.scoobi.impl.exec.Executor$.prepare(Executor.scala:79)
at com.nicta.scoobi.application.Persister$.com$nicta$scoobi$application$Persister$$createPlan(Persister.scala:259)
        ...


So this leads to some more questions:

Am I going to have to upgrade to Hadoop 0.21 or later just to run Scoobi 0.5?  Besides all the hassle involved, this seems like a bad idea, because the Hadoop 0.21 series, whose latest release is Hadoop 2.0.0-alpha, is ...  well, alpha software.  In general, why is Scoobi tracking the bleeding edge like this?  I understand that eventually we will need to upgrade, but it seems premature in this case, particularly since there appear to be significant backward-compatibility issues, and Hadoop 2 is still far from being released in stable form.


Overall, I still don't understand the whole story behind Hadoop configuration and such, but I wonder, why was it necessary to switch away from just building and running a big assembly, and why was it necessary to move to Hadoop 0.21?  

Thanks,

ben

Ben Wing

Aug 2, 2012, 7:43:59 PM
to scoobi...@googlegroups.com
Pardon me, I didn't notice that you still have a cdh3 branch ... now I feel a bit stupid.  In any case, I'll switch over to that branch and see how things go.

ben

Ben Wing

Aug 2, 2012, 8:27:46 PM
to scoobi...@googlegroups.com
Still failure.  Same error.

2012-08-02 19:21:49,618 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201208020237_0005_m_000000_3: java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
        at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306

Keep in mind this is running on a cluster where I build an assembly using package-hadoop on the job tracker itself, and the entire filesystem on the job tracker is visible on all of the task trackers.

Eric Springer

Aug 2, 2012, 8:49:33 PM8/2/12
to scoobi...@googlegroups.com
On Fri, Aug 3, 2012 at 10:27 AM, Ben Wing <b...@benwing.com> wrote:
> Still failure. Same error.

Are you using a Mac? I wonder if this is the issue:

https://github.com/NICTA/scoobi/issues/1

See if that workaround works.

Ben Wing

Aug 2, 2012, 8:59:57 PM8/2/12
to scoobi...@googlegroups.com
BTW, I tried running the WordCount app I have in the same package, just to make sure it's not a problem with my normal app.  Same error -- here's a sample from the task attempt log:

2012-08-02 19:52:22,912 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2012-08-02 19:52:23,005 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hadoop/benwing/local/taskTracker/benwing/job
2012-08-02 19:52:23,008 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /hadoop/benwing/local/taskTracker/benwing/job
2012-08-02 19:52:23,055 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2012-08-02 19:52:23,168 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs` truncater with mapRetainSize=-1 and reduceRetainSize=-1
2012-08-02 19:52:23,172 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
        at org.apache.hadoop.mapreduce.JobContext.getMapperClass(JobContext.java:212)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:602)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: java.lang.ClassNotFoundException: com.nicta.scoobi.impl.exec.MscrMapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
        ... 8 more
2012-08-02 19:52:23,177 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

Ben Wing

Aug 2, 2012, 9:04:53 PM8/2/12
to scoobi...@googlegroups.com


On Thursday, August 2, 2012 8:49:33 PM UTC-4, Eric Springer wrote:
> On Fri, Aug 3, 2012 at 10:27 AM, Ben Wing wrote:
> > Still failure.  Same error.
>
> Are you using a Mac? I wonder if this is the issue:
>
> https://github.com/NICTA/scoobi/issues/1
>
> See if that workaround works.

No, this isn't a Mac.  This is a very large Linux-based cluster.  See the comments in my last long post, just before the "sorry" post ...  even though some of the comments in that post about Hadoop 0.21/CDH4 don't completely apply, most of the rest of them do.  I also ask a question about why Scoobi can't just rely on what's in the assembly JAR, like it does in 0.4, and I'm still curious what the answer is.

As I said in that post, everything in the filesystems in this enormous cluster is shared across the entire cluster and mounted in the same place, except for the local HDFS disks mounted into /hadoop on each task tracker.


ben



Eric Springer

Aug 2, 2012, 9:47:11 PM8/2/12
to scoobi...@googlegroups.com
On Fri, Aug 3, 2012 at 11:04 AM, Ben Wing <b...@benwing.com> wrote:
> I also ask a question about why Scoobi
> can't just rely on what's in the assembly JAR, like it does in 0.4, and I'm
> still curious what the answer is.

I believe the idea is to support this as well. Pre-uploading the
dependencies to the cluster and using them there is just an additional
feature. (That way, you don't have to waste so much time building
the jar and uploading it each time.)
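
To illustrate where the time saving comes from, here is a rough sketch of the "upload only what is missing" idea. It assumes dependency jars can be identified by file name (which usually embeds a version) and that a listing of what is already on the cluster is available from somewhere. `jars_needing_upload` and both of its parameters are hypothetical names for illustration; this is not Scoobi's actual upload logic.

```python
import os


def jars_needing_upload(local_lib_dir, uploaded_names):
    """Return sorted names of jars in local_lib_dir that are not yet uploaded.

    uploaded_names is any iterable of jar file names already present on
    the cluster (how that listing is obtained is outside this sketch).
    """
    uploaded = set(uploaded_names)
    return sorted(
        name
        for name in os.listdir(local_lib_dir)
        if name.endswith(".jar") and name not in uploaded
    )
```

On a second run with unchanged dependencies the returned list is empty, so only the small jar with the job's own classes would need to be built and shipped.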