[JIRA] (SPARK-755) Kryo serialization failing


Evan Sparks (JIRA)

May 31, 2013, 5:50:57 PM
to spark-...@googlegroups.com
Issue Type: Bug
Affects Versions: 0.8.0
Assignee: Unassigned
Components: Block Manager, Spark Core
Created: 31/May/13 2:48 PM
Description:

When I turn on Kryo serialization, I get the following error as I increase the size of my input dataset from ~10GB to ~100GB. The issue does not appear when I turn Kryo off.

I have code that successfully reads files and parses them into an RDD[(String, Vector)], which can then be .count()'ed. I then run a .flatMap on it, with a function that has the following signature:
def expandData(x: (String, Vector)): Seq[(String, Float, Vector)]
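
For readers who want to see the shape of that step, here is a minimal sketch. The report gives only the signature, so the body of expandData, the stand-in Vector class, and the sample data are all hypothetical; a local Seq is used in place of the RDD so the snippet runs without Spark.

```scala
object ExpandSketch {
  // Hypothetical stand-in for org.mlbase.Vector; only the field name
  // `elements` is taken from the serialization trace in the report.
  final case class Vector(elements: Array[Float])

  // Hypothetical body: emits two labelled copies of each input record.
  def expandData(x: (String, Vector)): Seq[(String, Float, Vector)] = {
    val (key, vec) = x
    Seq((key, 1.0f, vec), (key, -1.0f, vec))
  }

  // A local Seq stands in for the parsed RDD (assumption for illustration).
  val parsed: Seq[(String, Vector)] = Seq(
    ("a", Vector(Array(1.0f, 2.0f))),
    ("b", Vector(Array(3.0f)))
  )

  // flatMap flattens the per-record Seqs into one collection of Tuple3s,
  // which is the record type that appears in the serialization trace.
  val expanded: Seq[(String, Float, Vector)] = parsed.flatMap(expandData)
}
```

Each output record is a scala.Tuple3 whose third field holds the vector, matching the "_3 (scala.Tuple3)" and "elements (org.mlbase.Vector)" lines of the trace.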

Running a .count() on that RDD crashes. The stack trace of the failed task looks like this:
13/05/31 00:16:53 INFO cluster.TaskSetManager: Finished TID 2024 in 23594 ms (progress: 10/1000)
13/05/31 00:16:53 INFO scheduler.DAGScheduler: Completed ResultTask(3, 24)
13/05/31 00:16:53 INFO cluster.ClusterScheduler: parentName:,name:TaskSet_3,runningTasks:151
13/05/31 00:16:53 INFO cluster.TaskSetManager: Starting task 3.0:175 as TID 2161 on slave 14: ip-10-62-199-77.ec2.internal:40850 (NODE_LOCAL)
13/05/31 00:16:53 INFO cluster.TaskSetManager: Serialized task 3.0:175 as 2832 bytes in 0 ms
13/05/31 00:16:53 INFO cluster.TaskSetManager: Lost TID 2053 (task 3.0:49)
13/05/31 00:16:53 INFO cluster.TaskSetManager: Loss was due to com.esotericsoftware.kryo.KryoException
com.esotericsoftware.kryo.KryoException: java.lang.ArrayIndexOutOfBoundsException
Serialization trace:
elements (org.mlbase.Vector)
_3 (scala.Tuple3)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:585)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:504)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.write(FieldSerializer.java:564)
at com.esotericsoftware.kryo.serializers.FieldSerializer.write(FieldSerializer.java:213)
at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:571)
at spark.KryoSerializationStream.writeObject(KryoSerializer.scala:26)
at spark.serializer.SerializationStream$class.writeAll(Serializer.scala:63)
at spark.KryoSerializationStream.writeAll(KryoSerializer.scala:21)
at spark.storage.BlockManager.dataSerialize(BlockManager.scala:910)
at spark.storage.MemoryStore.putValues(MemoryStore.scala:61)
at spark.storage.BlockManager.liftedTree1$1(BlockManager.scala:584)
at spark.storage.BlockManager.put(BlockManager.scala:580)
at spark.CacheManager.getOrCompute(CacheManager.scala:55)
at spark.RDD.iterator(RDD.scala:207)
at spark.scheduler.ResultTask.run(ResultTask.scala:84)
at spark.executor.Executor$TaskRunner.run(Executor.scala:104)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:679)
13/05/31 00:16:53 INFO cluster.ClusterScheduler: parentName:,name:TaskSet_3,runningTasks:151
13/05/31 00:16:53 INFO cluster.TaskSetManager: Starting task 3.0:49 as TID 2162 on slave 12: ip-10-11-46-255.ec2.internal:38878 (NODE_LOCAL)
13/05/31 00:16:53 INFO cluster.TaskSetManager: Serialized task 3.0:49 as 2832 bytes in 0 ms
13/05/31 00:16:53 INFO cluster.ClusterScheduler: parentName:,name:TaskSet_3,runningTasks:152
13/05/31 00:16:54 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Added rdd_7_257 in mem

My Kryo Registrator looks like this:
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.register(classOf[Vector])
    kryo.register(classOf[String])
    kryo.register(classOf[Float])
    kryo.register(classOf[Tuple3[String,Float,Vector]])
    kryo.register(classOf[Seq[Tuple3[String,Float,Vector]]])
    kryo.register(classOf[Map[String,Vector]])
  }
}

"Vector" in this case is an org.mlbase.Vector, which is a slightly modified version of spark.util.Vector (it uses Floats instead of Doubles).
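
The report does not show how Kryo and the registrator were switched on. A minimal sketch, assuming the system-property mechanism used by the pre-Apache spark.* packages that appear in the stack trace; the helper object and its name are hypothetical, and the registrator is assumed to live in the default package.

```scala
object EnableKryoSketch {
  // Sets the two system properties that select Kryo and point it at a
  // registrator class. Assumption: these must be set before the
  // SparkContext is created, since they are read at startup.
  def configure(registratorClass: String): Unit = {
    System.setProperty("spark.serializer", "spark.KryoSerializer")
    System.setProperty("spark.kryo.registrator", registratorClass)
  }
}
```

Calling EnableKryoSketch.configure("MyRegistrator") before constructing the context would correspond to "turning Kryo on" in the description; leaving the properties unset falls back to Java serialization.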

Project: Spark
Priority: Major
Reporter: Evan Sparks
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reynold Xin (JIRA)

May 31, 2013, 5:56:57 PM
to spark-...@googlegroups.com
Reynold Xin commented on Bug SPARK-755

I wonder if the problem is that you've exceeded the Java array size limit. Arrays are indexed with int, so a single byte array tops out at about 2^31 bytes (~2 GB).

Evan Sparks (JIRA)

May 31, 2013, 6:45:57 PM
to spark-...@googlegroups.com
Evan Sparks commented on Bug SPARK-755

Each record is a Tuple3[String, Float, Vector] where internally the vectors are all Array[Float] of size 160000.

Is it possible that Kryo would try to serialize many of these vectors into the same Java array?
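
A back-of-the-envelope check of that question, as a sketch only. It assumes 4 bytes per float and ignores the String, the Float, and any per-object overhead Kryo adds, so the real number of records per array would be somewhat lower. The object and value names are hypothetical.

```scala
object ArrayLimitSketch {
  val floatsPerVector: Int = 160000   // vector size stated in the comment above
  val bytesPerFloat: Int = 4          // a JVM float is 32 bits

  // Raw payload of one vector, computed as Long to avoid Int overflow later.
  val bytesPerVector: Long = floatsPerVector.toLong * bytesPerFloat

  // Largest index a Java array can have, hence the largest byte[] length.
  val maxArrayBytes: Long = Int.MaxValue.toLong

  // How many whole vectors fit into one maximally sized byte array.
  val vectorsPerArray: Long = maxArrayBytes / bytesPerVector
}
```

Under these assumptions one vector is 640,000 bytes, so a single byte array could hold about 3,355 of them. If a partition's records were all written into one buffer, as the BlockManager.dataSerialize frame in the trace may suggest, a partition with more records than that would not fit.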

Evan Sparks (JIRA)

May 31, 2013, 9:51:58 PM
to spark-...@googlegroups.com
Change By: Evan Sparks (31/May/13 6:50 PM)
Summary: Kryo serialization failing - MLbase

Anonymous (JIRA)

May 29, 2015, 5:45:04 AM
to spark-...@googlegroups.com
Anonymous started work on Bug SPARK-755
 
Change By: Anonymous
Status: Open -> In Progress

Anonymous (JIRA)

May 20, 2018, 4:26:01 PM
to spark-...@googlegroups.com
Anonymous stopped work on Bug SPARK-755
 
Change By: Anonymous
Status: In Progress -> Open

Anonymous (JIRA)

Apr 15, 2019, 1:25:19 PM
to spark-...@googlegroups.com
Anonymous started work on Bug SPARK-755
 
Change By: Anonymous
Status: Open -> In Progress

Anonymous (JIRA)

Jun 30, 2019, 7:53:02 AM
to spark-...@googlegroups.com
Anonymous stopped work on Bug SPARK-755
 
Change By: Anonymous
Status: In Progress -> Open