ClassNotFoundException with Druid ORC Extension


shailesh prajapati

Feb 16, 2017, 4:43:59 AM
to Druid User
Hello,

I am trying to load data into Druid from HDFS using the ORC extension, and I got the following error, caused by a Jackson version conflict:

java.lang.VerifyError: class com.fasterxml.jackson.datatype.guava.deser.HostAndPortDeserializer overrides final method deserialize.(Lcom/fasterxml/jackson/core/JsonParser;Lcom/fasterxml/jackson/databind/DeserializationContext;)Ljava/lang/Object;

I resolved this by setting "mapreduce.job.classloader": "true" in the spec, as suggested at http://druid.io/docs/0.9.2/operations/other-hadoop.html

Now the MR jobs are failing for the following reason:

Error: java.lang.RuntimeException: readObject can't find class
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:136)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readFields(TaggedInputSplit.java:120)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
	at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
	at org.apache.hadoop.mapred.MapTask.getSplitDetails(MapTask.java:372)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:754)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.hive.ql.io.orc.OrcNewSplit not found
	at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
	at org.apache.hadoop.mapreduce.lib.input.TaggedInputSplit.readClass(TaggedInputSplit.java:134)

I checked that hive-exec-2.0.0.jar is on the classpath, so I don't know what is going wrong. I am using Druid 0.9.2 and Hadoop 2.7.1.
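
For anyone checking the same thing: the trace above fails inside Configuration.getClassByName, which resolves classes through the Configuration's class loader, so a jar can be present on disk and still be invisible to the loader that does the lookup. A minimal sketch, assuming it is run with the same classpath as the failing task, that reports which jar (if any) a given loader resolves a class from. The class name ClassLocator and its locate method are hypothetical helpers, not part of Druid or Hadoop.

```java
import java.security.CodeSource;

class ClassLocator {
    /** Returns the location the class was loaded from, or a "NOT FOUND" message. */
    static String locate(String className, ClassLoader loader) {
        try {
            // initialize=false: only resolve the class, do not run static initializers.
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the bootstrap loader typically have no code source.
            if (src == null || src.getLocation() == null) {
                return "bootstrap/unknown";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND: " + e;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass another name as an argument.
        String name = args.length > 0 ? args[0]
                : "org.apache.hadoop.hive.ql.io.orc.OrcNewSplit";
        System.out.println("system loader : "
                + locate(name, ClassLoader.getSystemClassLoader()));
        System.out.println("context loader: "
                + locate(name, Thread.currentThread().getContextClassLoader()));
    }
}
```

If the two loaders give different answers, that would point to a class loader visibility problem rather than a missing jar, though this thread does not confirm that as the cause.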

Thanks.

Slim Bouguerra

Feb 16, 2017, 10:53:46 AM
to druid...@googlegroups.com
Is the jar uploaded to HDFS as a dependency?
This is supposed to be done by Druid, and you can check whether it exists under the temporary working directory.
-- 

B-Slim
_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______/\/\/\_______

--
You received this message because you are subscribed to the Google Groups "Druid User" group.
To unsubscribe from this group and stop receiving emails from it, send an email to druid-user+...@googlegroups.com.
To post to this group, send email to druid...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/druid-user/b5241e84-20e1-41a9-a163-ed8140dc2f76%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


shailesh prajapati

Feb 16, 2017, 10:34:52 PM
to Druid User
All the required jars are present in the working directory, but the job is still failing with ClassNotFoundException. Is any other config needed?

Slim Bouguerra

Feb 16, 2017, 10:55:11 PM
to druid...@googlegroups.com
I don't think you need extra settings.
FYI, this extension is not part of Druid core, so as Druid committers we don't have much involvement with it.
Sorry.



shailesh prajapati

Feb 17, 2017, 5:41:08 AM
to Druid User
Now I am trying to ingest a CSV source, but I am getting a Jackson error in the MR jobs:

 Error in custom provider, java.lang.NoSuchMethodError: com.fasterxml.jackson.core.JsonFactory.requiresPropertyOrdering()Z
  at io.druid.jackson.JacksonModule.jsonMapper(JacksonModule.java:46)
  at io.druid.jackson.JacksonModule.jsonMapper(JacksonModule.java:46)
  while locating com.fasterxml.jackson.databind.ObjectMapper annotated with interface io.druid.guice.annotations.Json
  while locating com.fasterxml.jackson.databind.ObjectMapper
    for the 1st parameter of io.druid.guice.JsonConfigurator.<init>(JsonConfigurator.java:64)
  at io.druid.guice.ConfigModule.configure(ConfigModule.java:40)
  while locating io.druid.guice.JsonConfigurator
    for the 2nd parameter of io.druid.guice.JsonConfigProvider.inject(JsonConfigProvider.java:188)
  at io.druid.guice.JsonConfigProvider.bind(JsonConfigProvider.java:131)

This looks like a Jackson version mismatch: my Hadoop uses Jackson 2.2.3 and Druid uses 2.4.6. However, all the newer Jackson jars are already in the HDFS working directory, and I am using "mapreduce.job.classloader": "true" in my spec. Is this related to the classloader?
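
One way to test the version-mismatch theory is to ask the running JVM which jar actually supplied JsonFactory and whether that copy has the method named in the NoSuchMethodError. A minimal sketch using only reflection, assuming it is run with the same classpath as the failing task. MethodProbe and probe are hypothetical names introduced here for illustration.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

class MethodProbe {
    /** Reports whether className has a public no-arg method methodName, and where the class came from. */
    static String probe(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            String origin = (src == null || src.getLocation() == null)
                    ? "bootstrap/unknown"
                    : src.getLocation().toString();
            try {
                // getMethod with no parameter types looks up a public no-arg method.
                Method m = cls.getMethod(methodName);
                return "present (" + m.getReturnType().getName() + ") in " + origin;
            } catch (NoSuchMethodException e) {
                return "MISSING in " + origin;
            }
        } catch (ClassNotFoundException | LinkageError e) {
            return "class not loadable: " + e;
        }
    }

    public static void main(String[] args) {
        // Class and method taken from the error message above.
        System.out.println(probe("com.fasterxml.jackson.core.JsonFactory",
                "requiresPropertyOrdering"));
    }
}
```

A "MISSING in ..." result whose path points at a Hadoop-provided jackson-core jar would be consistent with the older Jackson being picked up ahead of the one Druid ships.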


shailesh prajapati

Feb 17, 2017, 11:30:25 AM
to Druid User
Resolved the CSV ingestion. It was caused by "druid-orc-extensions" being present in druid.extensions.loadList. I removed it and the MR job succeeded. But I am still looking for a way to ingest ORC data and resolve its error.

baoti...@gmail.com

Mar 29, 2017, 11:21:44 PM
to Druid User
Hi, I ran into the same problem recently. Have you figured out how to solve this issue? Looking forward to your reply. BTW, I want to ingest ORC data.
