Scalding issue with Kryo serialization


wudio...@gmail.com

Oct 14, 2014, 10:36:47 PM
to cascadi...@googlegroups.com
I am a new user of Scalding. While writing an M/R job with Scalding, I ran into a serialization problem.

The job first reads a props file and, based on the information in it, dynamically generates some classes using reflection. It then runs the normal M/R job process, and at that point it throws an exception.
The stack trace is below:
2014-10-13 23:50:45,073 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.RuntimeException: Error in configuring object
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:426)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1650)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
Caused by: java.lang.reflect.InvocationTargetException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
	... 9 more
Caused by: cascading.flow.FlowException: internal error during mapper configuration
	at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:99)
	... 14 more
Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: com.ebay.ssti.sampling.A
Serialization trace:
filterList (com.ebay.ssti.sampling.SingleTableSampling)
$outer (com.ebay.ssti.sampling.SingleTableSampling$$anonfun$assemble$2)
$outer (com.ebay.ssti.sampling.SingleTableSampling$$anonfun$assemble$2$$anonfun$apply$1)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
	at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:43)
	at com.twitter.chill.Tuple2Serializer.read(TupleSerializers.scala:34)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
	at com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
	at com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
	at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
	at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:25)
	at com.twitter.chill.SomeSerializer.read(SomeSerializer.scala:19)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
	at com.twitter.chill.SerDeState.readClassAndObject(SerDeState.java:61)
	at com.twitter.chill.KryoPool.fromBytes(KryoPool.java:94)
	at com.twitter.chill.Externalizer.fromBytes(Externalizer.scala:149)
	at com.twitter.chill.Externalizer.maybeReadJavaKryo(Externalizer.scala:162)
	at com.twitter.chill.Externalizer.readExternal(Externalizer.scala:152)
	at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at java.util.HashMap.readObject(HashMap.java:1180)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
	at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
	at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
	at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
	at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
	at cascading.flow.hadoop.util.JavaObjectSerializer.deserialize(JavaObjectSerializer.java:101)
	at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:295)
	at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:276)
	at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:80)
	... 14 more
Caused by: java.lang.ClassNotFoundException: com.ebay.ssti.sampling.A
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:270)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
	... 78 more




The class com.ebay.ssti.sampling.A is dynamically generated during the assembly step and stored in a map. But the node cannot find the class, so I suspect something is wrong with the serialization.

Koert Kuipers

Oct 14, 2014, 11:00:44 PM
to cascadi...@googlegroups.com
If you dynamically generate classes, you have to save them to a new jar file and add that jar file to the distributed cache (for example, using tmpjars in the Hadoop configuration).
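As a rough illustration of that suggestion, here is a minimal sketch using only the JDK. It assumes the generated bytecode has already been written as .class files under a directory on the submitting machine. The class and method names (GeneratedClassJar, pack, appendTmpJar) are hypothetical, and the sketch assumes that the tmpjars property holds a comma-separated list of jar URIs. The actual call on the Hadoop configuration is left as a comment because it depends on how the job is set up.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Scalding, Cascading, or Hadoop.
final class GeneratedClassJar {

    // Packs every .class file under classesDir into jarPath,
    // keeping the package directories as jar entry names.
    static void pack(Path classesDir, Path jarPath) throws IOException {
        List<Path> classFiles;
        try (Stream<Path> walk = Files.walk(classesDir)) {
            classFiles = walk
                .filter(Files::isRegularFile)
                .filter(p -> p.toString().endsWith(".class"))
                .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(jarPath);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (Path file : classFiles) {
                // Jar entry names always use forward slashes.
                String entryName = classesDir.relativize(file).toString().replace('\\', '/');
                jar.putNextEntry(new JarEntry(entryName));
                Files.copy(file, jar);
                jar.closeEntry();
            }
        }
    }

    // Appends the jar's URI to an existing tmpjars value
    // (assumed to be a comma-separated list of jar URIs).
    static String appendTmpJar(String existing, Path jarPath) {
        String uri = jarPath.toAbsolutePath().toUri().toString();
        return (existing == null || existing.isEmpty()) ? uri : existing + "," + uri;
    }

    // On the submitting side, before the flow is started, the result would be
    // written back to the job configuration, for example (assumption):
    //   conf.set("tmpjars", GeneratedClassJar.appendTmpJar(conf.get("tmpjars"), jarPath));
}
```

The jar has to be registered before the job is submitted, since the map and reduce tasks can only load classes that were shipped with the job.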


wudio...@gmail.com

Oct 15, 2014, 3:06:32 AM
to cascadi...@googlegroups.com
The class is dynamically generated only when the job is executed. How can I add it to the jar file? It is not generated during development.

Koert Kuipers

Oct 16, 2014, 11:11:35 PM
to cascadi...@googlegroups.com
We generate classes at execution time too, using Twitter util-eval. We save them to a jar at execution time and add it to the distributed cache.
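To make "generate at execution time, then save" concrete, here is a minimal sketch that swaps in the JDK's built-in javax.tools compiler in place of util-eval, which is not shown here. It compiles a source string into a directory of .class files while the program is running; that directory can then be packed into a jar with java.util.jar.JarOutputStream. The class name RuntimeClassGenerator, the method compileToDir, and the example package com.example.gen are all hypothetical, and the sketch assumes the driver runs on a full JDK so that a system compiler is available.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical helper: compiles generated source at run time so that the
// resulting bytecode exists on disk and can be shipped with the job.
final class RuntimeClassGenerator {

    // Compiles `source` (which must declare `packageName` and `simpleName`)
    // into outDir and returns the path of the produced .class file.
    static Path compileToDir(String packageName, String simpleName,
                             String source, Path outDir) throws IOException {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler available; a JDK is required");
        }
        // Write the source to a temporary file named after the class.
        Path srcDir = Files.createTempDirectory("gen-src");
        Path srcFile = srcDir.resolve(simpleName + ".java");
        Files.write(srcFile, source.getBytes(StandardCharsets.UTF_8));

        Files.createDirectories(outDir);
        // "-d" makes the compiler lay out class files by package under outDir.
        int exitCode = compiler.run(null, null, null,
                "-d", outDir.toString(), srcFile.toString());
        if (exitCode != 0) {
            throw new IllegalStateException("Compilation failed with exit code " + exitCode);
        }
        return outDir.resolve(packageName.replace('.', '/')).resolve(simpleName + ".class");
    }
}
```

A call such as compileToDir("com.example.gen", "A", "package com.example.gen; public class A implements java.io.Serializable {}", outDir) would leave com/example/gen/A.class under outDir. The point is that the bytecode ends up in a file, since a class that exists only in the submitting JVM's memory cannot be found by the task JVMs on the cluster.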
