mongodb-hadoop connector: hive MongoStorageHandler class not found

Rishav Rohit

Aug 27, 2013, 3:10:24 AM
to mongod...@googlegroups.com
Hi,

I am trying to create a Hive table STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler", following http://www.10gen.com/presentations/webinar-whats-new-mongodb-hadoop-integration (slide 75).

My environment details are-
MongoDB 2.4.5
Cloudera 4.3
I have downloaded the mongodb-hadoop CDH connector jars from https://github.com/mongodb/mongo-hadoop and placed all 3 jars in the hadoop, hadoop-hdfs, hadoop-mapreduce, hadoop-0.20-mapreduce and hive lib folders.

Below is my create table statement in hive-
CREATE TABLE Final.MetricDistFact (
Time_Id TIMESTAMP,
TimeLevel STRING,
Nation_DESC STRING,
Region_ID INT,
Region_DESC STRING,
District_ID INT,
District_DESC STRING
)
STORED BY "com.mongodb.hadoop.hive.MongoStorageHandler"
WITH SERDEPROPERTIES( "mongo.columns.mapping" = "Time_Id, TimeLevel, Nation_DESC, Region_ID, Region_DESC, District_ID, District_DESC" )
TBLPROPERTIES ("mongo.uri" = "mongodb://localhost:27017/Final.MetricDistFact");


And the error log from hive is -
2013-08-27 12:09:04,142 INFO  ql.Driver (PerfLogger.java:PerfLogEnd(127)) - </PERFLOG method=TimeToSubmit start=1377585543550 end=1377585544142 duration=592>
2013-08-27 12:09:04,209 ERROR exec.Task (SessionState.java:printError(421)) - Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.com.mongodb.hadoop.hive.MongoStorageHandler
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.com.mongodb.hadoop.hive.MongoStorageHandler
        at org.apache.hadoop.hive.ql.metadata.Table.getStorageHandler(Table.java:284)
        at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3575)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:254)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1374)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1160)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:973)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:893)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:759)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.com.mongodb.hadoop.hive.MongoStorageHandler
        at org.apache.hadoop.hive.ql.metadata.HiveUtils.getStorageHandler(HiveUtils.java:294)
        at org.apache.hadoop.hive.ql.metadata.Table.getStorageHandler(Table.java:279)
        ... 18 more
Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.hive.MongoStorageHandler
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.hive.ql.metadata.HiveUtils.getStorageHandler(HiveUtils.java:287)
        ... 19 more

2013-08-27 12:09:04,218 ERROR ql.Driver (SessionState.java:printError(421)) - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask

I also searched for MongoStorageHandler in all the jars and in the mongo-hadoop Git repository, but I was not able to find it.
Please help me resolve this.

Regards,
Rishav

Rishav Rohit

Aug 28, 2013, 2:12:19 AM
to mongod...@googlegroups.com
Someone please help me.

Rishav

mathias kluba

Oct 11, 2013, 10:49:37 AM
to mongod...@googlegroups.com
You can't find that class because it doesn't exist in the downloaded jar files!
The class is quite new and is not available in the current release (1.1), so you have to download the sources and compile them yourself.
The readme has all the information you need for that.
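
Whether a given set of jars contains the class can be checked directly, since a jar is a zip archive and a class `a.b.C` is stored in it as the entry `a/b/C.class`. The following is only a minimal sketch of that check, not part of the connector: the helper name `find_class_in_jars` and whatever directory is passed to it are assumptions made for illustration.

```python
import zipfile
from pathlib import Path


def find_class_in_jars(lib_dir, class_name):
    """Return the jars under lib_dir that contain class_name.

    class_name is a fully qualified Java name, for example
    com.mongodb.hadoop.hive.MongoStorageHandler.
    """
    # A class a.b.C is stored inside a jar as the zip entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(str(jar))
        except (zipfile.BadZipFile, OSError):
            # Skip unreadable files and files that are not valid archives.
            continue
    return hits
```

Calling it with the Hive lib directory and `"com.mongodb.hadoop.hive.MongoStorageHandler"` returns an empty list when none of the jars found there ship the class, which is the situation described above for the 1.1 release jars.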

Rishav Rohit

Oct 13, 2013, 6:59:16 AM
to mongod...@googlegroups.com
Thanks for the help, Mathias.
I am able to integrate my Hive and MongoDB tables now :)

Just FYI:
When I searched the mongo-hadoop GitHub repository for MongoStorageHandler during the last week of August, I was not able to find it. I also tried to clone the repository (to build the jars) but could not, as I believe the repository was corrupted at that time :(.

Regards,
Rishav

--
--
You received this message because you are subscribed to the Google
Groups "mongodb-user" group.
To post to this group, send email to mongod...@googlegroups.com
To unsubscribe from this group, send email to
mongodb-user...@googlegroups.com
See also the IRC channel -- freenode.net#mongodb
 
---
You received this message because you are subscribed to a topic in the Google Groups "mongodb-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mongodb-user/QJzk5Mj-r7I/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mongodb-user...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Maziyar S.P.

Nov 17, 2013, 4:37:16 AM
to mongod...@googlegroups.com
Hi guys,

I have CDH4 up and running and have all the jars in the same paths (per the readme). Pig is working and can load data from MongoDB, but I get the same error from Hive. I compiled the source and used the jar in hive/target instead of the one produced by ./sbt package, but it still says:

Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Error in loading storage handler.com.mongodb.hadoop.hive.MongoStorageHandler

Could you please guide me in the right direction? Thanks.

Rishav Rohit

Nov 18, 2013, 2:05:45 AM
to mongod...@googlegroups.com
Dear Maziyar,

Have you built the jars using the ./sbt package command?
You need to build the Hive jar yourself because the MongoStorageHandler class is not present in the CDH4 Hive jar available on GitHub.
Refer to the build instructions on GitHub.
Hope this helps.

Rishav



Rishav Rohit

Nov 18, 2013, 2:32:52 AM
to mongod...@googlegroups.com
You can also refer to the blog post at http://rishavrohitblog.blogspot.in/2013/10/connecting-hive-to-mongodb-using.html for an example of using MongoStorageHandler.

Maziyar S.P.

Nov 18, 2013, 4:17:40 AM
to mongod...@googlegroups.com
Hi Rishav,

Thanks for the reply. I saw you said we should build it separately, so I did that using this command: sbt mongo-hadoop-hive/package

One really basic question, since I'm new to Hive: I put these jars in the hadoop/lib folder and also in the hive/lib folder just in case, because the guide doesn't say what to do with the jar files. The Pig readme says to register three jars and the Hadoop readme says to put them all in the Hadoop directories, but the Hive readme says nothing about what to do after building them. Is there anything else I should do after building the jars to use Hive's interactive mode on the command line, like running ADD JAR at the beginning? I tried ADD JAR as well, but it still says it can't find the class. For reference, my CDH version is 4.4.

Thanks for the help Rishav

Maziyar S.P.

Nov 18, 2013, 7:02:19 AM
to mongod...@googlegroups.com
OK, please disregard my last message, because I found the problem. I kept copying the old jar that ./sbt package built. The two jars have the same name, and I forgot that the Hive one is in hive/target; I thought it went to /target.

Thank you Rishav, your blog btw is awesome!

Thanks for the help:)
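
A mix-up like this one, where two jars share a name and the stale one gets copied, can be caught by comparing checksums of the build output and the copy that Hive actually loads. The following is a minimal sketch under that assumption; `sha256_of` and `same_jar` are illustrative helper names, not something shipped with mongo-hadoop, and the paths passed in depend on the local layout.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large jars do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_jar(built_jar, deployed_jar):
    """Return True if the deployed jar is byte-for-byte the built one."""
    return sha256_of(built_jar) == sha256_of(deployed_jar)
```

If `same_jar` returns False for the jar under hive/target and the copy in the Hive lib folder, the copy is not the freshly built jar, even though the file names match.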

Aakash Aggarwal

Aug 18, 2016, 4:45:49 PM
to mongodb-user
Hi Rishav,

Can you tell me the location in the GitHub repository from which you downloaded these jars?

Regards,
Aakash

Luke Lovett

Aug 19, 2016, 12:56:09 PM
to mongodb-user
Aakash, please find the installation instructions for your setup in the wiki for mongo-hadoop here: https://github.com/mongodb/mongo-hadoop/wiki

aakash aggrawal

Aug 19, 2016, 1:37:14 PM
to mongod...@googlegroups.com
Hi like,

Thanks for your help, but I have already installed all these components in my CDH. The only problem is that I am not finding jars that are compatible with the Cloudera platform. All the jars I can find through Google are built for the Apache Hadoop distribution. That's why I am asking for the exact location of the jars in the repository.

Regards,
Aakash Aggarwal


Luke Lovett

Aug 22, 2016, 1:06:00 PM
to mongodb-user
Hi Aakash,

The jars are compiled against Apache Hadoop, but I'm not aware of any issues that make them incompatible with other distributions, like Cloudera Hadoop. What specific problem are you experiencing using the jars mentioned in the mongo-hadoop wiki?

aakash aggrawal

Aug 22, 2016, 1:15:35 PM
to mongod...@googlegroups.com
Hello Luke,

First of all, sorry for misspelling your name in my previous mail. I am trying to create a Hive table on top of MongoDB using MongoStorageHandler. To create that table, we need to add the jars that make the connection between MongoDB and Hive possible. While adding the jars to the classpath, I am getting the error mentioned in that Google document.

Regards,
Aakash Aggarwal


Luke Lovett

Aug 22, 2016, 5:53:47 PM
to mongodb-user
The classpaths for Hadoop/Hive may be slightly different for the Cloudera distribution. You might check the value of the HADOOP_CLASSPATH environment variable and ensure that mongo-hadoop-core and mongo-hadoop-hive jars are both somewhere on there. You can also add the location of the jars to HADOOP_CLASSPATH and set HADOOP_USER_CLASSPATH_FIRST=true. These are in addition to using the "ADD JAR" statement in the Hive shell as documented here: https://github.com/mongodb/mongo-hadoop/wiki/Hive-Usage#installation.
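
The HADOOP_CLASSPATH check suggested above can be scripted. The following is a rough sketch, assuming the jar file names start with `mongo-hadoop-core` and `mongo-hadoop-hive` (actual file names may carry version or distribution suffixes); `missing_from_classpath` is a hypothetical helper written for illustration. It only inspects the classpath string it is given, so jars that Hive picks up from its own lib folder or through ADD JAR are not covered.

```python
import glob
import os


def missing_from_classpath(classpath, required_prefixes):
    """Return the required jar-name prefixes that no classpath entry provides."""
    jar_names = set()
    for entry in classpath.split(os.pathsep):
        if not entry:
            continue
        if entry.endswith("*"):
            # Java-style wildcard entry: every jar in that directory.
            paths = glob.glob(os.path.join(os.path.dirname(entry), "*.jar"))
        elif entry.endswith(".jar"):
            paths = [entry] if os.path.isfile(entry) else []
        else:
            # Plain directories hold .class files, not jars, so skip them.
            paths = []
        jar_names.update(os.path.basename(p) for p in paths)
    return [
        prefix for prefix in required_prefixes
        if not any(name.startswith(prefix) for name in jar_names)
    ]
```

For example, `missing_from_classpath(os.environ.get("HADOOP_CLASSPATH", ""), ["mongo-hadoop-core", "mongo-hadoop-hive"])` lists whichever of the two jars is not reachable through HADOOP_CLASSPATH; an empty list means both were found.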