What does this exception mean?

nazar buzun

Dec 21, 2013, 1:32:03 PM

to spark...@googlegroups.com

13/12/21 21:49:10 INFO FlatMappedRDD: Removing RDD 1 from persistence list

Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000] milliseconds
    at akka.dispatch.DefaultPromise.ready(Future.scala:870)
    at akka.dispatch.DefaultPromise.result(Future.scala:874)
    at akka.dispatch.Await$.result(Future.scala:74)
    at org.apache.spark.storage.BlockManagerMaster.removeRdd(BlockManagerMaster.scala:110)
    at org.apache.spark.rdd.RDD.unpersist(RDD.scala:164)
    at egolp.StaticNetwork.mergeWithHubs(StaticNetwork.scala:137)
    at egolp.EpidemicProcessObject$.run(EpidemicProcess.scala:97)
    at egolp.RunLabelPropagation$.main(RunLabelPropagation.scala:46)
    at egolp.RunLabelPropagation.main(RunLabelPropagation.scala)

Mark Hamstra

Dec 21, 2013, 3:07:54 PM

to spark...@googlegroups.com

It means that after unpersist() was called on an RDD, the BlockManager was asked to remove that RDD from the cache and no answer was received within 10 seconds, and that nothing is set up within Spark to handle that timeout case. Looks like a bug to me, probably more than one: the wait is only 10 seconds (30 seconds in the latest master-branch Spark), yet multiple requests will be made if the first doesn't succeed within that time, and each of them will take just as long to fail.

I haven't checked yet, but what exactly happens to Spark when this exception goes uncaught? And what are you doing to get to this point?
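To make the failure mode concrete, here is a minimal sketch of a blocking wait that times out, plus one possible way a caller could keep the exception from reaching main. It does not use Spark at all: askThatNeverAnswers is a stand-in for the remove-RDD request, and tryCleanup is a hypothetical helper name, not anything Spark provides. It only assumes the standard scala.concurrent API.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object UnpersistTimeoutSketch {
  // Stand-in for the blocking request: waits for a reply that never arrives,
  // so Await.result throws a TimeoutException once the timeout elapses.
  def askThatNeverAnswers(timeoutMillis: Long): Unit = {
    val neverCompleted = Promise[Unit]()
    Await.result(neverCompleted.future, timeoutMillis.millis)
  }

  // Hypothetical helper: runs a cleanup action and returns false on a timeout
  // instead of letting the exception propagate. Other failures are rethrown.
  def tryCleanup(action: => Unit): Boolean =
    Try(action) match {
      case Success(_)                   => true
      case Failure(_: TimeoutException) => false
      case Failure(other)               => throw other
    }
}
```

In application code the call would look like tryCleanup(tmp.unpersist()). Whether it is safe to carry on after a failed removal is a separate question; this only stops the driver from dying on the uncaught exception.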

nazar buzun

Dec 21, 2013, 3:42:10 PM

to spark...@googlegroups.com

private var nodes: RDD[(Int, StaticNode)] = ....some code for init....
.......
val tmp = nodes
nodes = GraphLoader(isDirected).loadNodes(sc, edges_path, sc.defaultParallelism) // persist("DISK")
index2labels = nodes.mapValues(m => m.index).leftOuterJoin(index2labels).mapValues({...some code..})
nodes.foreach(x => {})
tmp.unpersist()

Possibly the block manager holding tmp was lost (or removed before the tmp.unpersist() call), but I didn't find a message like

WARN BlockManagerMasterActor: Removing BlockManager BlockManagerId(1, hadoop2-11.yandex.ru, 39809, 0) with no recent heart beats: 92791ms exceeds 45000ms

in the time interval from nodes = Grap .... until tmp.unpersist().

That warning did appear earlier, but not in the interval mentioned, so the block manager holding tmp should still be present in the block manager master.
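For anyone who wants to repeat that check on their own driver log, a rough sketch of the search follows. It assumes each log line starts with a timestamp in the yy/MM/dd HH:mm:ss layout seen in the log line at the top of the thread, and that the warning contains the text "Removing BlockManager". The object and method names are made up for this example.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object HeartbeatLogScan {
  // Timestamp layout assumed from lines such as "13/12/21 21:49:10 INFO ...".
  private val stamp = DateTimeFormatter.ofPattern("yy/MM/dd HH:mm:ss")
  private val stampLength = 17
  private val marker = "Removing BlockManager"

  // Returns the block manager removal lines whose timestamp lies within
  // [from, until], both bounds inclusive. Lines without a parseable
  // leading timestamp are skipped.
  def removalsBetween(lines: Seq[String], from: String, until: String): Seq[String] = {
    val start = LocalDateTime.parse(from, stamp)
    val end = LocalDateTime.parse(until, stamp)
    lines.filter { line =>
      line.contains(marker) &&
        Try(LocalDateTime.parse(line.take(stampLength), stamp)).toOption
          .exists(ts => !ts.isBefore(start) && !ts.isAfter(end))
    }
  }
}
```

An empty result for the interval between reloading nodes and calling tmp.unpersist() would match what is described above: no block manager was dropped for missed heartbeats during that window.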
