MongoDB connector to Hadoop

hadar bey

Sep 10, 2017, 4:40:06 PM

to mongodb-user

Hi,

I am a beginner trying to connect MongoDB to Hadoop.

I am following this tutorial:

http://www.oodlestechnologies.com/blogs/Hadoop-connection-with-mongodb-using-mongoDBConnector

(I saw the same steps in other guides.) However, I got the following error:

super29@lab11-rd29-05:/usr/local/hadoop$ hadoop jar ImportWeblogsFromMongo.jar
Exception in thread "main" java.io.IOException: Error opening job jar: ImportWeblogsFromMongo.jar
    at org.apache.hadoop.util.RunJar.run(RunJar.java:173)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:221)
    at java.util.zip.ZipFile.<init>(ZipFile.java:151)
    at java.util.jar.JarFile.<init>(JarFile.java:154)
    at java.util.jar.JarFile.<init>(JarFile.java:91)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:171)
    ... 1 more

How can I fix this?

Wan Bachtiar

Oct 9, 2017, 9:08:52 PM

to mongodb-user

Hi Hadar,

It’s been a while since you posted this question; have you found an answer?

Based on the stack trace that you posted, it’s likely that your hadoop command failed to find or access ImportWeblogsFromMongo.jar.
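One way to narrow this down is to check the jar outside Hadoop. The stack trace shows RunJar constructing a java.util.jar.JarFile, so the sketch below tries to open the same file with the JDK's ZipFile and reports whether it is missing, unreadable, or not a valid zip archive. JarCheck and diagnose are hypothetical names used for illustration; they are not part of Hadoop or mongo-hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical helper (not part of Hadoop): reports why a job jar might fail to open.
final class JarCheck {

    /** Returns a short diagnosis for the given jar path. */
    static String diagnose(Path jar) {
        if (!Files.exists(jar)) {
            return "missing";
        }
        if (!Files.isReadable(jar)) {
            return "unreadable";
        }
        // A jar is a zip archive, so opening it as a ZipFile exercises
        // the same step that failed in the stack trace.
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return "ok (" + zip.size() + " entries)";
        } catch (IOException e) {
            // ZipException is a subclass of IOException.
            return "not a valid zip: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Defaults to the jar name from the original post if no argument is given.
        Path jar = Paths.get(args.length > 0 ? args[0] : "ImportWeblogsFromMongo.jar");
        System.out.println(jar.toAbsolutePath() + ": " + diagnose(jar));
    }
}
```

Saved as JarCheck.java, it can be compiled with javac and run from the same directory in which hadoop jar was invoked, passing the jar path as the only argument. A "missing" result suggests a wrong working directory or file name; a "not a valid zip" result suggests the jar was built or copied incorrectly.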

It is also worth noting that if you are following the tutorial you linked, you’re using an older version of the MongoDB Hadoop Connector. The current release version of mongo-hadoop is v2.0.2.

I would also review the MapReduce Usage wiki page on the mongo-hadoop project page for more information and guidance.

If you still have further questions on this, please also include:

The Hadoop version that you’re using
The MongoDB Hadoop connector version that you’re using
A code snippet, along with your pom.xml file content
Example MongoDB input documents

Regards,
Wan.

Akshesh Doshi

May 28, 2018, 8:00:18 AM

to mongodb-user

Hi Wan,

Thank you for your help in the previous mail.

I am trying the MongoDB-Hadoop connector to store my data in HDFS and query it using MongoDB.

I successfully followed the steps given in the MongoDB documentation and this getting started tutorial. I also see the data in MongoDB when I run db.yield_historical.in.find(), as shown in the article. But when I run hdfs dfs -ls / to check whether the data has been stored in HDFS, I am NOT able to find any new data.

Is there anything that I am doing wrong here, or anything that I've missed? I would really appreciate your help, as I seem to be stuck and there seems to be no step-by-step guide on the web on how to use or configure the connector. I would really like to know if there's an article I can refer to (otherwise I'll write one myself if I'm able to solve this riddle).

If it helps, I am using Hadoop 2.7.3.2.6.3.0-235 with MongoDB v3.6.5.

Wan Bachtiar

May 31, 2018, 2:15:05 AM

to mongodb-user

> I also see the data in MongoDB when I run db.yield_historical.in.find() as shown in the article. But when I run hdfs dfs -ls / to check if the data has been stored in Hadoop HDFS I am NOT able to find any new data.

Hi Akshesh,

The example in Getting Started with Hadoop showcases the mongo-hadoop Treasury Yield Example. The first task of the example is to import the provided yield_historical_in.json into MongoDB. This is why you can see data when you run db.yield_historical.in.find().

> I am trying the MongoDB-Hadoop connector to store my data in HDFS and query it using MongoDB.

The MongoDB Connector for Hadoop is a library that allows MongoDB to be used as an input source or output destination for Hadoop MapReduce tasks; it is not a tool for writing data to HDFS. To write to HDFS, you can use the org.apache.hadoop.fs.FileSystem class to create an output stream.

If you have further questions relating to MongoDB, please open a new discussion thread with the following information:

The MongoDB Hadoop connector version that you’re using
A code snippet
Any error messages that you’re seeing

If you have further questions relating to Hadoop, however, please post a question on StackOverflow: Hadoop, to reach a wider audience with Hadoop expertise.

Regards,
Wan.

Akshesh Doshi

Jun 3, 2018, 6:02:20 AM

to mongodb-user

Thank you for your time and insights, Wan! I'll look further into this and see if this is the right solution to my problem.

Regards

Akshesh

Akshesh Doshi

Jun 7, 2018, 8:00:37 AM

to mongodb-user

As suggested, I've posted my question here: https://stackoverflow.com/questions/50740657/use-mongodb-to-interface-data-in-hdfs
