java.util.NoSuchElementException: head of empty list when calling collect() on RDD

Karthik Thiyagarajan

Sep 28, 2012, 1:08:35 PM
to spark...@googlegroups.com
Hi,

I'm running Spark (0.5.1.2) on Mesos.
When I call collect() on an RDD, I get "Exception in thread "main" java.util.NoSuchElementException: head of empty list".
I know it shouldn't matter, but the RDD was non-empty.
I'm able to run other jobs on the cluster successfully. Any insights into when this exception is triggered would be useful. Thanks.
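
To give a sense of the shape of the job, here is a minimal hypothetical sketch, not my actual code; the paths and names are placeholders, and it assumes the 0.5-era SparkContext(master, jobName) constructor:

    // Hypothetical sketch only; placeholder paths and names.
    import spark.SparkContext
    import spark.SparkContext._  // implicit conversions for (key, value) RDDs

    val sc = new SparkContext("mesos://master:5050", "MyJob")
    // A shuffle stage (reduceByKey) followed by collect(), matching the
    // DAGScheduler frames in the stack trace below.
    val counts = sc.textFile("hdfs://namenode/path/to/input")
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    val result = counts.collect()  // "head of empty list" is thrown here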

Stack trace:

Exception in thread "main" java.util.NoSuchElementException: head of empty list
at scala.collection.immutable.Nil$.head(List.scala:371)
at scala.collection.immutable.Nil$.head(List.scala:368)
at spark.DAGScheduler$$anonfun$runJob$7.apply(DAGScheduler.scala:276)
at spark.DAGScheduler$$anonfun$runJob$7.apply(DAGScheduler.scala:276)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:194)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:34)
at scala.collection.mutable.ArrayOps.foreach(ArrayOps.scala:38)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:194)
at scala.collection.mutable.ArrayOps.map(ArrayOps.scala:38)
at spark.DAGScheduler$class.runJob(DAGScheduler.scala:276)
at spark.MesosScheduler.runJob(MesosScheduler.scala:26)
at spark.SparkContext.runJob(SparkContext.scala:306)
at spark.SparkContext.runJob(SparkContext.scala:317)
at spark.SparkContext.runJob(SparkContext.scala:328)
at spark.RDD.collect(RDD.scala:161)

--
Karthik

Matei Zaharia

Sep 28, 2012, 2:08:07 PM
to spark...@googlegroups.com
Can you tell me which commit you're on? Also, did you see any "Lost TID" messages before it, indicating failed tasks?

Matei

Karthik Thiyagarajan

Sep 28, 2012, 2:25:38 PM
to spark...@googlegroups.com

I do see "Lost TID" messages, of the form:

...
12/09/28 00:24:16 INFO spark.SimpleJob: Lost TID 2007 (task 39:3)
12/09/28 00:24:16 INFO spark.SimpleJob: Loss was due to fetch failure from null
12/09/28 00:24:16 INFO spark.MesosScheduler: Marking Stage 38 for resubmision due to a fetch failure
12/09/28 00:24:16 INFO spark.MesosScheduler: The failed fetch was from Stage 39; marking it for resubmission
12/09/28 00:24:16 INFO spark.SimpleJob: Lost TID 2005 (task 39:4)
12/09/28 00:24:16 INFO spark.SimpleJob: Loss was due to fetch failure from http://10.10.5.230:58133

The commit I'm running is 4a9c58913d0a9bd5.

-- 
Karthik

Matei Zaharia

Sep 28, 2012, 2:28:50 PM
to spark...@googlegroups.com
Ah, got it. What was the first one? Was it the one with the null URL above? I'm wondering whether a node crashed earlier, which can cause that "head of empty list" error. (We need to fix that error message in that case, but there could also be another issue in your job.)

Matei

Karthik Thiyagarajan

Sep 28, 2012, 2:38:51 PM
to spark...@googlegroups.com
No. The first "Lost TID" message reports a fetch failure from a valid node:

...
12/09/28 00:21:30 INFO spark.MapOutputTrackerActor: Asked to get map output locations for shuffle 19
12/09/28 00:21:36 INFO spark.MapOutputTrackerActor: Asked to get map output locations for shuffle 19
12/09/28 00:23:51 INFO spark.SimpleJob: Lost TID 1991 (task 39:0)
12/09/28 00:23:51 INFO spark.SimpleJob: Loss was due to fetch failure from http://10.10.5.229:48542


-- 
Karthik

Karthik Thiyagarajan

Oct 1, 2012, 11:31:26 PM
to spark...@googlegroups.com

Found the problem. The collect() was on an RDD of substantial size (a few hundred MB).
I was able to work around it easily, but it wasn't obvious that the lost tasks were due to calling collect() on a big RDD.
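
For anyone who hits this later, a sketch of alternatives that avoid pulling the whole RDD back to the driver (not necessarily the exact workaround I used; bigRdd is a placeholder name):

    // Sketch only; bigRdd is a placeholder for the large RDD.
    // Instead of materializing hundreds of MB on the driver:
    // val everything = bigRdd.collect()

    // Write the results out in parallel from the workers:
    bigRdd.saveAsTextFile("hdfs://namenode/output/results")

    // Or bring back just a small slice for inspection:
    val sample = bigRdd.take(100)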

Thanks,
Karthik

Matei Zaharia

Oct 2, 2012, 1:19:01 AM
to spark...@googlegroups.com
Thanks, that's good to know. I'll see what we can do to improve the error reporting for this.

Matei