HDFS, Hadoop 2.2 and Spark error

Richard Conway

Jan 2, 2014, 7:45:20 AM
to spark...@googlegroups.com
Hi all,

Wonder if someone can point me in the right direction ...

I'm using the 0.8.1 binaries downloaded from the Spark website against a YARN cluster deployed with Apache Ambari 1.4.2. HDFS works fine locally through the hadoop fs command shell. The cluster looks like it has a bunch of executors and is delegating tasks correctly, but it can't seem to read from HDFS. Is this a compatibility issue with the prebuilt binaries? I'm at the point where I'm planning to rebuild the assembly myself. If anyone can shed light on this I'd be grateful.
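
The job itself is essentially a one-line textFile count, along these lines (a minimal sketch rather than the exact code; the master URL and HDFS path here are placeholders, not the real ones):

    import org.apache.spark.SparkContext

    // Read a text file from HDFS and count its lines.
    // "spark://master:7077" and the hdfs:// URI are placeholder values.
    val sc = new SparkContext("spark://master:7077", "HdfsReadTest")
    val lines = sc.textFile("hdfs://namenode:8020/user/richard/input.txt")
    println(lines.count())  // this is the count() that dies on the executors, per the trace below
    sc.stop()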

Thanks.

Richard

14/01/02 12:19:24 WARN cluster.ClusterTaskSetManager: Lost TID 3 (task 0.0:1)
14/01/02 12:19:24 WARN cluster.ClusterTaskSetManager: Loss was due to java.lang.NoSuchMethodError
java.lang.NoSuchMethodError: org.apache.commons.io.IOUtils.closeQuietly(Ljava/io/Closeable;)V
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:1052)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:533)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:749)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:793)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:211)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:174)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:206)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:45)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:167)
        at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:150)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
        at scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:400)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:702)
        at org.apache.spark.rdd.RDD$$anonfun$count$1.apply(RDD.scala:698)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:872)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:107)
        at org.apache.spark.scheduler.Task.run(Task.scala:53)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:215)
        at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:50)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:182)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
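
The missing method is suggestive: IOUtils.closeQuietly(Closeable) was only added in commons-io 2.0, so presumably the executors are picking up an older commons-io bundled into a Hadoop-1-built assembly, while Hadoop 2.2's HDFS client expects the newer overload. If rebuilding turns out to be the answer, the Spark 0.8.1 docs describe building the assembly against Hadoop 2.2 with YARN support roughly like this (version values may need adjusting to match the cluster):

    # Rebuild the Spark assembly against Hadoop 2.2.0 with YARN support
    # (per the Spark 0.8.1 build docs; set SPARK_HADOOP_VERSION to your stack's version)
    SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly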