Broadcast value deserialization hits EOFException when using Kryo serialization on master branch


Jiacheng Guo

Feb 6, 2013, 4:29:20 AM
to spark...@googlegroups.com
Hi,
I'm on the master branch, testing the new Kryo 2.0 serialization. The value I'm trying to broadcast is a Trove hash map of several GB, and I use com.esotericsoftware.kryo.serializers.JavaSerializer for it. I can confirm that, when used directly with Kryo, the Trove collection serializes and deserializes normally with com.esotericsoftware.kryo.serializers.JavaSerializer. However, when used with Spark, I encounter an EOFException during the stage that deserializes the broadcast variable.

13/02/04 20:02:39 INFO cluster.TaskSetManager: Loss was due to java.io.EOFException
    at spark.KryoDeserializationStream.readObject(KryoSerializer.scala:44)
    at spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:129)
    at spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:40)
    at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at java.io.ObjectStreamClass.invokeReadObject(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.defaultReadFields(Unknown Source)
    at java.io.ObjectInputStream.readSerialData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:115)
    at java.io.ObjectInputStream.readExternalData(Unknown Source)
    at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
    at java.io.ObjectInputStream.readObject0(Unknown Source)
    at java.io.ObjectInputStream.readObject(Unknown Source)
    at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
    at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:45)
    at spark.executor.Executor$TaskRunner.run(Executor.scala:93)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)

Any suggestions on how to debug this?

Thanks,

Jiacheng Guo

Matei Zaharia

Feb 6, 2013, 6:11:18 PM
to spark...@googlegroups.com
It's still more likely a bug in how Kryo is serializing the objects. Maybe try writing a serializer for them manually. Also, make sure there weren't errors earlier in the log.

Matei
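
As an illustration of the suggestion above, a hand-written serializer typically writes an explicit entry count followed by the entries, so the reader knows exactly how many bytes to expect. The sketch below is only an assumption about what such a serializer could look like: it uses plain java.io streams and a java.util.HashMap as a stand-in for the Trove map, and the class and method names (ManualMapSerializer, write, read, failsWithEof) are hypothetical. It is not the Kryo, Trove, or Spark API. It also shows how a stream that is cut short surfaces as an EOFException on the reading side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a hand-written map serializer (not Kryo's API).
public class ManualMapSerializer {

    // Write the entry count first, then each key/value pair as fixed-width primitives.
    static byte[] write(Map<Integer, Long> map) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            for (Map.Entry<Integer, Long> e : map.entrySet()) {
                out.writeInt(e.getKey());
                out.writeLong(e.getValue());
            }
        } // closing the stream flushes everything into the byte array
        return bytes.toByteArray();
    }

    // Read the entry count, then exactly that many key/value pairs.
    static Map<Integer, Long> read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int size = in.readInt();
            Map<Integer, Long> map = new HashMap<>();
            for (int i = 0; i < size; i++) {
                int key = in.readInt();
                long value = in.readLong();
                map.put(key, value);
            }
            return map;
        }
    }

    // Returns true if reading the given bytes runs off the end of the stream.
    static boolean failsWithEof(byte[] data) throws IOException {
        try {
            read(data);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, Long> original = new HashMap<>();
        original.put(1, 10L);
        original.put(2, 20L);

        byte[] data = write(original);
        // Round trip outside Spark: the map read back should equal the original.
        System.out.println("round trip ok: " + read(data).equals(original));
        // Dropping the last byte simulates a truncated or unflushed stream.
        System.out.println("truncated fails with EOF: "
                + failsWithEof(Arrays.copyOf(data, data.length - 1)));
    }
}
```

Running a round trip like this outside Spark, and comparing the number of bytes written with the number available to the reader, is one way to tell a serializer bug apart from a stream that was truncated in transit.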


Jiacheng Guo

Feb 7, 2013, 2:21:23 AM
to spark...@googlegroups.com
Very strange. A handwritten serializer does fix the problem, but it is somewhat slower than the serialization that comes with the Trove library. I'm not sure why the JavaSerializer that comes with Kryo failed.

Thanks,
Jiacheng Guo

Andrew Milkowski

May 17, 2013, 2:30:23 PM
to spark...@googlegroups.com
I have a similar problem working off master: the deserializer fails. I will fall back to Spark 0.7.0 and repeat the test (submitting Shark/Hive SQL and using Mesos).


13/05/17 14:21:09 INFO spark.KryoSerializer: Running user registrator: shark.KryoRegistrator
13/05/17 14:21:09 INFO broadcast.HttpBroadcast: Started reading broadcast variable 0
13/05/17 14:21:09 ERROR executor.Executor: Exception in task ID 0
java.io.EOFException
    at spark.KryoDeserializationStream.readObject(KryoSerializer.scala:44)
    at spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:129)
    at spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:40)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at scala.collection.immutable.$colon$colon.readObject(List.scala:435)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
    at spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:55)
    at spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:126)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1791)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
    at spark.JavaDeserializationStream.readObject(JavaSerializer.scala:23)
    at spark.JavaSerializerInstance.deserialize(JavaSerializer.scala:45)
    at spark.executor.Executor$TaskRunner.run(Executor.scala:99)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Ryan LeCompte

May 17, 2013, 2:36:08 PM
to spark...@googlegroups.com
My kryo serializer pull request hasn't been merged yet, but you could try to see if your problems go away by manually including the code in this pull request:


Ryan

Andrew Milkowski

May 17, 2013, 4:54:56 PM
to spark...@googlegroups.com
Thanks, Ryan. I isolated my problem to running Spark on Mesos: while submitting HSQL via JavaSharkContext, I get the exception stated previously (this Shark feature has just been merged: https://github.com/amplab/shark/pull/93).

I don't want to cross-post, so I will submit this in the Shark group.

Thanks

(Without Mesos, deserializing works; with Mesos, it does not.)

Andrew Milkowski

May 17, 2013, 5:02:35 PM
to spark...@googlegroups.com
Link to the Shark-specific post, though it may be relevant to Spark overall: https://groups.google.com/forum/?fromgroups=#!topic/shark-users/4OvK8Ux1I4Q

