What does this exception mean?


nazar buzun

Dec 21, 2013, 1:32:03 PM
to spark...@googlegroups.com
13/12/21 21:49:10 INFO FlatMappedRDD: Removing RDD 1 from persistence list
Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000] milliseconds
at akka.dispatch.DefaultPromise.ready(Future.scala:870)
at akka.dispatch.DefaultPromise.result(Future.scala:874)
at akka.dispatch.Await$.result(Future.scala:74)
at org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:110)
at org.apache.spark.rdd.RDD.unpersist(RDD.scala:164)
at egolp.StaticNetwork.mergeWithHubs(StaticNetwork.scala:137)
at egolp.EpidemicProcessObject$.run(EpidemicProcess.scala:97)
at egolp.RunLabelPropagation$.main(RunLabelPropagation.scala:46)
at egolp.RunLabelPropagation.main(RunLabelPropagation.scala)

Mark Hamstra

Dec 21, 2013, 3:07:54 PM
to spark...@googlegroups.com
It means that after unpersist() was called on an RDD, the BlockManager did not answer within 10 seconds when it was asked to remove that RDD from the cache, and that nothing is set up within Spark to handle that timeout case. Looks like a bug to me, probably even more than one: the wait is only 10 seconds (30 seconds in the latest master-branch Spark), even though multiple requests will be made if the first doesn't succeed within that time, each of which will take that long to fail.
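As a rough illustration of the uncaught timeout described above, here is a minimal sketch of a caller-side guard. It uses scala.concurrent.Await from the Scala standard library as a stand-in for the akka.dispatch.Await seen in the stack trace, and a promise that is never completed as a stand-in for the missing reply. The object name UnpersistTimeoutSketch and the helper ignoringTimeout are hypothetical and not part of Spark.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._

object UnpersistTimeoutSketch {
  // Hypothetical helper (not part of Spark): runs a blocking call and turns a
  // TimeoutException into None instead of letting it escape to the caller.
  def ignoringTimeout[A](call: => A): Option[A] =
    try Some(call)
    catch {
      case e: TimeoutException =>
        println("cleanup timed out, continuing: " + e.getMessage)
        None
    }

  def main(args: Array[String]): Unit = {
    // Stand-in for a reply that never arrives: a promise nobody completes.
    val noReply = Promise[Boolean]().future

    // Await.result throws TimeoutException once the wait expires;
    // the helper catches it, so the main thread keeps running.
    val result = ignoringTimeout(Await.result(noReply, 100.millis))
    println(result) // prints "None"
  }
}
```

In the job from this thread the guarded call would presumably be ignoringTimeout(tmp.unpersist()). That only keeps the driver's main thread alive; whether Spark is left in a usable state after the timeout is not established here.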

I haven't checked yet, but what exactly happens to Spark when this exception goes uncaught?  And what are you doing to get to this point?

nazar buzun

Dec 21, 2013, 3:42:10 PM
to spark...@googlegroups.com
private var nodes: RDD[(Int, StaticNode)] = ....some code for init....

val tmp = nodes

nodes = GraphLoader(isDirected).loadNodes(sc, edges_path, sc.defaultParallelism) // persist("DISK")

index2labels = nodes.mapValues(m => m.index).leftOuterJoin(index2labels).mapValues({...some code..})

nodes.foreach(x => {})

Possibly the block manager holding tmp was lost (or removed before the tmp.unpersist() call), but I didn't find a message like
WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(1, hadoop2-11.yandex.ru, 39809, 0) with no recent heart beats: 92791ms exceeds 45000ms
in the time interval from nodes = Grap .... until tmp.unpersist().
That warning did appear earlier, but not in the mentioned interval, so the block manager holding tmp should still be present.