java.util.NoSuchElementException: head of empty list when calling collect() on RDD

Karthik Thiyagarajan

Sep 28, 2012, 1:08:35 PM
to spark...@googlegroups.com
Hi,

I'm running Spark (0.5.1.2) on Mesos.
When I call collect() on an RDD, I get "Exception in thread "main" java.util.NoSuchElementException: head of empty list".
I know it shouldn't matter, but the RDD was non-empty.
I'm able to run other jobs on the cluster successfully. Any insights into when this exception is triggered would be useful. Thanks.
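
To give a sense of the shape of the job, here is a minimal hypothetical sketch, not my actual code; the paths and names are placeholders, and it assumes the 0.5-era SparkContext(master, jobName) constructor:

    // Hypothetical sketch only; placeholder paths and names.
    import spark.SparkContext
    import spark.SparkContext._  // implicit conversions for (key, value) RDDs

    val sc = new SparkContext("mesos://master:5050", "MyJob")
    // A shuffle stage (reduceByKey) followed by collect(), matching the
    // DAGScheduler frames in the stack trace below.
    val counts = sc.textFile("hdfs://namenode/path/to/input")
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    val result = counts.collect()  // "head of empty list" is thrown here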

Stack trace:

Exception in thread "main" java.util.NoSuchElementException: head of empty list
at scala.collection.immutable.Nil$.head(List.scala:371)
at scala.collection.immutable.Nil$.head(List.scala:368)
at spark.DAGScheduler$$anonfun$runJob$7.apply(DAGScheduler.scala:276)
at spark.DAGScheduler$$anonfun$runJob$7.apply(DAGScheduler.scala:276)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:38)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
at scala.collection.mutable.ArrayOps.map(ArrayOps.scala:38)
at spark.DAGScheduler$class.runJob(DAGScheduler.scala:276)
at spark.MesosScheduler.runJob(MesosScheduler.scala:26)
at spark.SparkContext.runJob(SparkContext.scala:306)
at spark.SparkContext.runJob(SparkContext.scala:317)
at spark.SparkContext.runJob(SparkContext.scala:328)
at spark.RDD.collect(RDD.scala:161)

--
Karthik

Matei Zaharia

Sep 28, 2012, 2:08:07 PM
to spark...@googlegroups.com
Can you tell me which commit you're on? Also, did you see any "Lost TID" messages before it, indicating failed tasks?

Matei

Karthik Thiyagarajan

Sep 28, 2012, 2:25:38 PM
to spark...@googlegroups.com

I do see "Lost TID" messages, of the form:

...
12/09/28 00:24:16 INFO spark.SimpleJob: Lost TID 2007 (task 39:3)
12/09/28 00:24:16 INFO spark.SimpleJob: Loss was due to fetch failure from null
12/09/28 00:24:16 INFO spark.MesosScheduler: Marking Stage 38 for resubmision due to a fetch failure
12/09/28 00:24:16 INFO spark.MesosScheduler: The failed fetch was from Stage 39; marking it for resubmission
12/09/28 00:24:16 INFO spark.SimpleJob: Lost TID 2005 (task 39:4)
12/09/28 00:24:16 INFO spark.SimpleJob: Loss was due to fetch failure from http://10.10.5.230:58133

The commit I'm running is 4a9c58913d0a9bd5.

-- 
Karthik

Matei Zaharia

Sep 28, 2012, 2:28:50 PM
to spark...@googlegroups.com
Ah, got it. What was the first one? Was it the one with the null URL above? I'm wondering whether a node crashed earlier, which can cause that "head of empty list" error. (We need to fix that error message in that case, but there could also be another issue in your job.)

Matei

Karthik Thiyagarajan

Sep 28, 2012, 2:38:51 PM
to spark...@googlegroups.com
No. The first "Lost TID" message reports a fetch failure from a valid node:

...
12/09/28 00:21:30 INFO spark.MapOutputTrackerActor: Asked to get map output locations for shuffle 19
12/09/28 00:21:36 INFO spark.MapOutputTrackerActor: Asked to get map output locations for shuffle 19
12/09/28 00:23:51 INFO spark.SimpleJob: Lost TID 1991 (task 39:0)
12/09/28 00:23:51 INFO spark.SimpleJob: Loss was due to fetch failure from http://10.10.5.229:48542


-- 
Karthik

Karthik Thiyagarajan

Oct 1, 2012, 11:31:26 PM
to spark...@googlegroups.com

Found the problem. The collect() was on an RDD of substantial size (a few hundred MB).
I was able to work around it easily, but it wasn't obvious that the lost tasks were due to calling collect() on a big RDD.
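
For anyone who hits this later, a sketch of alternatives that avoid pulling the whole RDD back to the driver (not necessarily the exact workaround I used; bigRdd is a placeholder name):

    // Sketch only; bigRdd is a placeholder for the large RDD.
    // Instead of materializing hundreds of MB on the driver:
    // val everything = bigRdd.collect()

    // Write the results out in parallel from the workers:
    bigRdd.saveAsTextFile("hdfs://namenode/output/results")

    // Or bring back just a small slice for inspection:
    val sample = bigRdd.take(100)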

Thanks,
Karthik

Matei Zaharia

Oct 2, 2012, 1:19:01 AM
to spark...@googlegroups.com
Thanks, that's good to know. I'll see what we can do to improve the error reporting for this.

Matei