InstantiationException ASTNodeOrigin and serialization errors

Tim Harsch

Oct 14, 2014, 2:59:15 PM
to big-...@googlegroups.com
Hi,
I am running into an issue while benchmarking my pseudo-distributed Hadoop 2.4.1/Hive 0.12 cluster (now that I've configured a Postgres metastore and am past the Derby problems). I believe it is the same issue documented in the README under the section "Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin ###"

Here's the bug trail:

https://issues.apache.org/jira/browse/HIVE-6765     ASTNodeOrigin unserializable leads to fail when join with view
     resolution: won't fix

https://issues.apache.org/jira/browse/HIVE-5068     Some queries fail due to XMLEncoder error on JDK7
     Seems to be the issue but is marked resolved as duplicate of HIVE-5263

https://issues.apache.org/jira/browse/HIVE-5263     Query Plan cloning time could be improved by using Kryo
     fixed in 0.13; HIVE-5068 is marked as a duplicate

https://issues.apache.org/jira/browse/HIVE-4583     Make Hive compile and run with JDK7
     fixed in 0.13.

In this comment of HIVE-5009, someone ran into the same issue and solved it by switching to JDK6.
https://issues.apache.org/jira/browse/HIVE-5009?focusedCommentId=13736707&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13736707

The only workaround I can see is to switch to JDK6, which doesn't work for me. Also, I'm very confused because my production Hadoop cluster, which is CDH5/JDK7, doesn't encounter this issue. From the bugs above it looks like JDK7 is supposed to be the cause?! The following is a cleaned (for redundancy/size) error log. Since the issue is mentioned in the README.md, I'm just wondering what happened when it was encountered, or whether you have any clues.


===============================================
Errors in queries
===============================================
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:72:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:73:    at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:106:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:108:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:109:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:125:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:126:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:137:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:284:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:300:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q08_hive_POWER_TEST_IN_PROGRESS_0.log:318:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:69:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:70:    at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:103:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:105:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:106:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:122:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:280:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:281:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:296:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:297:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:313:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q12_hive_POWER_TEST_IN_PROGRESS_0.log:331:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:87:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:88:    at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:121:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:123:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:124:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:140:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:362:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:363:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:378:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:379:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:395:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q19_hive_POWER_TEST_IN_PROGRESS_0.log:413:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

/home/users/tharsch/git/Big-Bench/logs/q20_hive_POWER_TEST_IN_PROGRESS_0.log:68:FAILED: SemanticException [Error 10016]: Line 7:69 Argument type mismatch '0.0': The expression after ELSE should have the same type as those after THEN: "bigint" is expected but "double" is found

/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:126:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:127:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:160:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:162:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:163:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:179:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:180:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:191:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:370:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:385:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:386:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:402:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:420:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:484:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:485:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:518:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:520:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:521:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:537:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:538:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:549:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:550:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:727:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:728:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:743:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:744:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:760:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q23_hive_POWER_TEST_IN_PROGRESS_0.log:778:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:73:java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:74:    at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:107:Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:109:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:110:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:126:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:269:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:284:Caused by: java.lang.RuntimeException: Cannot serialize object
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:285:   at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:301:Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin
/home/users/tharsch/git/Big-Bench/logs/q24_hive_POWER_TEST_IN_PROGRESS_0.log:319:FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

Michael Frank

Oct 15, 2014, 7:34:54 AM
to big-...@googlegroups.com
Hi Tim,

Yes, your problem and the one from the readme.md are related. Both boil down to Hive's XML encoder not being able to serialize the generated execution plan (AST) into XML for distribution between nodes.
This is because the XML serializer in Java 1.7 uses reflection. The failure occurs because a serialized class is missing a zero-argument constructor, which is used to re-instantiate the deserialized AST.
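
To illustrate the zero-argument-constructor point, here is a minimal Java sketch. It is not Hive's code: the class names `XmlEncoderSketch`, `NoDefaultCtor` and `WithDefaultCtor` and the helper `encodeAndCollectErrors` are made up for this example, and `NoDefaultCtor` merely stands in for a class such as ASTNodeOrigin. It only assumes the standard `java.beans.XMLEncoder`, which hands problems to an `ExceptionListener` instead of throwing them.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; none of these names come from Hive.
class XmlEncoderSketch {

    // Stand-in for a class like ASTNodeOrigin: it has no zero-argument constructor.
    public static class NoDefaultCtor {
        private final String label;
        public NoDefaultCtor(String label) { this.label = label; }
        public String getLabel() { return label; }
    }

    // Same data, but shaped like a JavaBean: public zero-argument constructor plus getter/setter.
    public static class WithDefaultCtor {
        private String label;
        public WithDefaultCtor() { }
        public String getLabel() { return label; }
        public void setLabel(String label) { this.label = label; }
    }

    // Encodes the object to an in-memory buffer and returns every exception the encoder reported.
    static List<Exception> encodeAndCollectErrors(Object bean) {
        List<Exception> errors = new ArrayList<>();
        XMLEncoder encoder = new XMLEncoder(new ByteArrayOutputStream());
        // XMLEncoder does not throw on failure; it notifies this listener and carries on.
        encoder.setExceptionListener(e -> errors.add(e));
        encoder.writeObject(bean);
        encoder.close();
        return errors;
    }

    public static void main(String[] args) {
        for (Exception e : encodeAndCollectErrors(new NoDefaultCtor("x"))) {
            System.out.println("reported: " + e);
        }
        WithDefaultCtor ok = new WithDefaultCtor();
        ok.setLabel("x");
        System.out.println("errors with zero-arg constructor: " + encodeAndCollectErrors(ok).size());
    }
}
```

Hive's own listener (the `Utilities$1.exceptionThrown` frame in the log above) appears to rethrow whatever it receives as "Cannot serialize object", which would explain why the InstantiationException shows up as the innermost "Caused by".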

A possible reason why you encounter this error with Big-Bench and not in your production system: the error depends on which HiveQL language features you use. If your production queries just happen to never use the feature tied to the corresponding faulty backend class of Hive, they will work. The same holds in Big-Bench, where many queries work but others fail. Whenever Hive decides to transform a join into a map join, this error may surface in combination with Java 1.7 and Hive 0.12.

One quirk is that there must be an unknown factor involved, because we developed and tested the Hive part of Big-Bench on Hive 0.12 and (Oracle) Java 1.7 without encountering this issue.

If you are able to, you may want to switch to Hive 0.13.

To help other people and maybe even narrow down the "unknown factor", do you mind posting your full (software) specs here?
If you are using a Hadoop distribution (Cloudera, BigInsights, Hortonworks, ...), please post the exact version.
These distributions may ship Hive with several major and minor modifications compared to the official Hive branch. (Hive 0.12 != hive-cloudera 0.12)


There is a second unrelated exception:


/home/users/tharsch/git/Big-Bench/logs/q20_hive_POWER_TEST_IN_PROGRESS_0.log:68:FAILED: SemanticException [Error 10016]: Line 7:69 Argument type mismatch '0.0': The expression after ELSE should have the same type as those after THEN: "bigint" is expected but "double" is found

See: https://issues.apache.org/jira/browse/HIVE-5825

I have made some changes to q20. Please check out the latest version from GitHub.
To re-run q20, run:
 ./scripts/bigBench -q 20  runQuery

I have added the errors you encountered to the FAQ section of the readme.md.

best regards
Michael

Tim Harsch

Oct 15, 2014, 2:23:24 PM
to big-...@googlegroups.com
Hi,
I agree there is some unknown factor in play with this error in Hive 0.12. Unfortunately, I am stuck with Hive 0.12 because our production instance is Cloudera. We will only be able to get Hive 0.13 when Cloudera makes it available. I am not able to use the production instance right now; when I can get back on, I will post a description of the software stack.

It seems the cast issue must also have had an unknown factor in play, since I have run the benchmark 4 times on our CDH cluster (once at SF1, twice at SF50, once at SF100, and once at SF1000) without encountering the problem. I encounter the issue in my vanilla Hadoop pseudo-distributed environment. Just an FYI though, since I think your commit https://github.com/intel-hadoop/Big-Bench/commit/30181c64ee23c4d98038fd07fbf9cc1b41e26d16 will likely avert the issue in any case.

Michael Frank

Oct 16, 2014, 9:15:28 AM
to big-...@googlegroups.com
Hi,


Just an FYI though, since I think your commit https://github.com/intel-hadoop/Big-Bench/commit/30181c64ee23c4d98038fd07fbf9cc1b41e26d16  will likely avert the issue in any case.

This patch will only avert this exception in query 20:

FAILED: SemanticException [Error 10016]: Line 7:69 Argument type mismatch '0.0': The expression after ELSE should have the same type as those after THEN: "bigint" is expected but "double" is found
 
but unfortunately not the more generic:

java.lang.RuntimeException: Cannot serialize object
    at org.apache.hadoop.hive.ql.exec.Utilities$1.exceptionThrown(Utilities.java:652)

Caused by: java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(MapredWork);
[..]
Caused by: java.lang.InstantiationException: org.apache.hadoop.hive.ql.parse.ASTNodeOrigin

FAILED: SemanticException Generate Map Join Task Error: Cannot serialize object

Since it appears to be linked to map joins ([..]Generate Map Join Task Error[..]) you could try disabling map joins (WARNING: big performance penalty!):
set hive.auto.convert.join.noconditionaltask=false;
set hive.auto.convert.join=false;
 
best regards
Michael

Tim Harsch

Oct 16, 2014, 1:49:21 PM
to big-...@googlegroups.com
Hi Michael,
Thanks again!   Yes I was clear on the patch only covering the cast issue in query 20.  

Good to know that disabling map joins might solve the serialization issue. I will try it on the pseudo-distributed system and see what happens.

Tim

Tim Harsch

Oct 16, 2014, 5:42:15 PM
to big-...@googlegroups.com
Looks like the patch did not work. The error has changed but is still similar:
/home/users/tharsch/git/Big-Bench/logs/q20_hive_POWER_TEST_IN_PROGRESS_0.log:66:FAILED: SemanticException [Error 10016]: Line 7:69 Argument type mismatch '0': The expression after ELSE should have the same type as those after THEN: "bigint" is expected but "int" is found

I believe this is probably what you wanted (notice 0L before END)
100.0 * COUNT(distinct (CASE WHEN r_date IS NOT NULL THEN oid ELSE 0L END)) / COUNT(distinct oid) AS r_order_ratio,

Michael Frank

Oct 20, 2014, 9:03:26 AM
to big-...@googlegroups.com
Finally had the time to push the changes, as announced in https://github.com/intel-hadoop/Big-Bench/commit/30181c64ee23c4d98038fd07fbf9cc1b41e26d16
Thanks Tim for pointing that out.