Johnpaul Ci
Apr 2, 2013, 1:39:20 AM
to spark...@googlegroups.com
Hi all,
I was wondering if anyone has a guess as to what might cause the following exception, or how I could start debugging it further:
java.lang.ArrayIndexOutOfBoundsException: 1
at spark.bagel.examples.GraphGeneration$$anonfun$2.apply(GraphGeneration.scala:37)
at spark.bagel.examples.GraphGeneration$$anonfun$2.apply(GraphGeneration.scala:33)
at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
at scala.collection.Iterator$class.foreach(Iterator.scala:772)
at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
at spark.CacheManager.getOrCompute(CacheManager.scala:53)
at spark.RDD.iterator(RDD.scala:193)
at spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:19)
at spark.RDD.computeOrReadCheckpoint(RDD.scala:206)
at spark.RDD.iterator(RDD.scala:195)
at spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:125)
at spark.scheduler.ShuffleMapTask.run(ShuffleMapTask.scala:74)
at spark.scheduler.local.LocalScheduler.runTask$1(LocalScheduler.scala:74)
at spark.scheduler.local.LocalScheduler$$anon$1.run(LocalScheduler.scala:50)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
13/03/30 23:19:32 INFO scheduler.DAGScheduler: Failed to run main at <unknown>:0
Exception in thread "main" spark.SparkException: Job failed: ShuffleMapTask(3, 0) failed: ExceptionFailure(java.lang.ArrayIndexOutOfBoundsException: 1)
at spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:629)
at spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:627)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:627)
at spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:588)
at spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:294)
at spark.scheduler.DAGScheduler.spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:358)
at spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:102)
My code calculates PageRank, and the part where the error happens is shown below.
33 val verts = input.map(line => {
34 println(" line " + line)
35 var fields = new Array[String](2)
36 fields = line.split('\t')
37 println("fields(0) is " + fields(0) + "fields(1) is " + fields(1)) //ERROR !!!!!!!
38 val(id,linksStr)=(fields(0),fields(1))
39 val links = linksStr.split(',').map(new GraphGenEdge(_))
40 (id, new GraphGenVertex(id,1.0 / num, links,true))
41 }).cache
My input file for the time being contains 7 entries of a directed graph, and it is as follows.
1 4,6
2 1,4,3
3 5
4 3 5
5 2
6 5
where the first column specifies the source vertex and the second column specifies the destination vertices reachable from that source vertex.
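Since the exception is an ArrayIndexOutOfBoundsException: 1 raised at the fields(1) access on line 37, my guess is that at least one input line does not split into two tab-separated fields. Below is a small, self-contained sketch of that failure mode; the parse helper is hypothetical (not part of my job), just to illustrate a defensive check:

```scala
// Standalone sketch of the suspected failure mode: if a line contains no
// tab character, split('\t') yields a one-element array, so reading
// fields(1) throws ArrayIndexOutOfBoundsException: 1.
object SplitCheck {
  // Hypothetical helper: parse one adjacency line defensively instead of
  // indexing fields(1) unconditionally.
  def parse(line: String): Option[(String, Array[String])] = {
    val fields = line.split('\t')
    if (fields.length < 2) None               // malformed line: no tab found
    else Some((fields(0), fields(1).split(',')))
  }

  def main(args: Array[String]): Unit = {
    // A well-formed line splits into an id and a comma-separated link list.
    println(parse("1\t4,6").map { case (id, links) => (id, links.toList) })
    // prints Some((1,List(4, 6)))

    // A line with spaces instead of a tab yields a single field.
    println(parse("4 3 5"))
    // prints None
  }
}
```

If this is what is happening, a check like the one above (or logging the offending line before indexing fields(1)) should reveal which entry in the input file is malformed.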
Regards
John