Error in Configuration/Flow of Hadoop and Cascading App


Pankaj Sinha

Mar 28, 2014, 7:28:40 PM
to cascadi...@googlegroups.com

Hi, I'm getting an error while running a Cascading app on Hadoop (CDH4).


The root error is "java.io.StreamCorruptedException: invalid type code: 68", and I have read in several places that this is caused by a configuration issue on the Hadoop cluster. (I'm also not sure why it is failing with a StreamCorruptedException.)


Could you please suggest what I should be looking for? Any guidance toward a particular direction would be much appreciated. Thanks.


I understand that this issue may not be closely related to Cascading itself, but I'm not able to understand the flow of data.


My app reads data from a Hive SequenceFile (I'm using HCatTap from the Hive-Cascading module as the source), runs aggregations, and stores the result in a text-delimited file in HDFS (a Hive table).



Here is the error log:


2014-03-28 22:48:37,395 WARN org.apache.hadoop.mapred.Child: Error running child

java.lang.RuntimeException: Error in configuring object

        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109)

        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75)

        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)

        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:413)

        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)

        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)

        at java.security.AccessController.doPrivileged(Native Method)

        at javax.security.auth.Subject.doAs(Subject.java:396)

        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)

        at org.apache.hadoop.mapred.Child.main(Child.java:262)

Caused by: java.lang.reflect.InvocationTargetException

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)

        ... 9 more

Caused by: cascading.flow.FlowException: internal error during mapper configuration

        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:99)

        ... 14 more

Caused by: java.io.StreamCorruptedException: invalid type code: 68

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1355)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)

        at java.util.HashMap.readObject(HashMap.java:1030)

        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

        at java.lang.reflect.Method.invoke(Method.java:597)

        at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:969)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1848)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)

        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)

        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)

        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)

        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)

        at cascading.flow.hadoop.util.JavaObjectSerializer.deserialize(JavaObjectSerializer.java:101)

        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:295)

        at cascading.flow.hadoop.util.HadoopUtil.deserializeBase64(HadoopUtil.java:276)

        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:80)

        ... 14 more
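
For reference, here is a minimal sketch of how this exact message can arise in plain Java serialization. It is not taken from the app above, and the class and method names (`CorruptStreamDemo`, `readCorrupted`) are made up for illustration. It serializes a small object, overwrites the byte where `ObjectInputStream` expects a type code, and reads it back. The JDK formats the offending byte in hexadecimal, so "68" in the message is the byte 0x68. This only shows what the exception means (the reader found an unexpected byte where a type marker should be); it does not say why the bytes differ on the cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.StreamCorruptedException;

// Hypothetical demo class, not part of Cascading or Hadoop.
class CorruptStreamDemo {

    // Serializes a String, damages the type-code byte, and returns the
    // message of the exception thrown while reading it back.
    static String readCorrupted() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(buffer);
            out.writeObject("x");
            out.close();

            // Layout: AC ED 00 05 (stream header), 74 (TC_STRING),
            // 00 01 (length), 78 ('x').
            byte[] bytes = buffer.toByteArray();

            // Replace the TC_STRING marker with a byte that is not a
            // valid serialization type code.
            bytes[4] = 0x68;

            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes));
            in.readObject();
            return "no error";
        } catch (StreamCorruptedException e) {
            return e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            return "unexpected: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(readCorrupted()); // prints "invalid type code: 68"
    }
}
```

In the trace above, the same check fails deep inside nested `defaultReadFields` calls, which suggests the reader lost its place partway through an object graph rather than at the start of the stream.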


Sam Ritchie

Mar 28, 2014, 7:39:10 PM
to cascadi...@googlegroups.com
Funky, that looks like you're overriding one of Cascading's custom serialization tokens:

http://docs.cascading.org/cascading/2.0/userguide/htmlsingle/#N21BFA

Tokens up to 128 are reserved by Cascading. You don't happen to have this annotation registered with a number lower than 128, do you?

http://docs.cascading.org/cascading/1.2/javadoc/cascading/tuple/hadoop/SerializationToken.html


--
Sam Ritchie (@sritchie)

Pankaj Sinha

Mar 28, 2014, 8:04:15 PM
to cascadi...@googlegroups.com
Thanks Sam,

I've not defined any annotations explicitly for any of my classes. 

I'm using the extensions for Cascading-Hive (from https://github.com/branky/cascading.hive) and trying to read the Sequence File created by Hive.

I'm using HCatTap from this module as the source and trying to read the data. Do I need to do anything extra to read it? I'm also not writing anything to the same OutputStream, so I'm not sure why I'm getting that exception.

Ken Krugler

Mar 28, 2014, 8:28:39 PM
to cascadi...@googlegroups.com
Could it be that one version of a class is being serialized into the JobConf, but a different version of that class is found on the classpath when it is deserialized on a slave?

I see the stack trace starts with:

        at cascading.flow.hadoop.FlowMapper.configure(FlowMapper.java:80)

Given Pankaj's previous issues with building a job jar, I'd be curious to see what's inside it :)
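
One way to look inside the job jar, as a rough sketch only: the class and method names below (`JobJarInspector`, `classLocations`, `duplicates`) are hypothetical, and the code assumes the usual Hadoop job jar layout where dependency jars sit under `lib/`. It lists every `.class` entry at the top level and inside each nested `lib/*.jar`, then reports classes that appear in more than one place. It cannot see what is on the cluster's own classpath, so a clean result does not rule out a version mismatch there.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarInputStream;

// Hypothetical helper, not part of Cascading or Hadoop.
class JobJarInspector {

    // Maps each .class entry name to the places in the job jar that contain it.
    static Map<String, List<String>> classLocations(String jobJarPath) throws IOException {
        Map<String, List<String>> locations = new TreeMap<>();
        try (JarFile jar = new JarFile(jobJarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.endsWith(".class")) {
                    record(locations, name, "<top level>");
                } else if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    // Read the nested dependency jar as a stream.
                    try (InputStream in = jar.getInputStream(entry);
                         JarInputStream nested = new JarInputStream(in)) {
                        JarEntry inner;
                        while ((inner = nested.getNextJarEntry()) != null) {
                            if (inner.getName().endsWith(".class")) {
                                record(locations, inner.getName(), name);
                            }
                        }
                    }
                }
            }
        }
        return locations;
    }

    // Keeps only the classes that were found in more than one location.
    static Map<String, List<String>> duplicates(Map<String, List<String>> locations) {
        Map<String, List<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : locations.entrySet()) {
            if (e.getValue().size() > 1) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    private static void record(Map<String, List<String>> locations, String className, String where) {
        List<String> seen = locations.get(className);
        if (seen == null) {
            seen = new ArrayList<>();
            locations.put(className, seen);
        }
        seen.add(where);
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the job jar to inspect.
        for (Map.Entry<String, List<String>> e : duplicates(classLocations(args[0])).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

Any Cascading or Hive class that shows up in two nested jars would be a good first thing to check.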

-- Ken


--------------------------
Ken Krugler
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr

Branky Shao

Mar 30, 2014, 10:07:17 AM
to cascadi...@googlegroups.com
Hi Pankaj,

Can you open an issue on GitHub? Please mention the Hive/HCatalog version you're using. Attaching a sample DDL that can be used to reproduce the issue would be even better.

Thanks,
Branky 