Hi Hadar,
It’s been a while since you posted this question. Have you found an answer?
Based on the stack trace you posted, it’s likely that your hadoop command could not find, or did not have access to, ImportWeblogsFromMongo.jar.
Also worth noting: if you are following the tutorial you linked, you’re using an older version of the MongoDB Hadoop Connector. The current release version of mongo-hadoop is v2.0.2.
I would also review the MapReduce Usage wiki page on the mongo-hadoop project page for more information and guidance.
If you still have further questions on this, please also include:
pom.xml file content.
Regards,
Wan.
I also see the data in MongoDB when I run db.yield_historical.in.find() as shown in the article. But when I run hdfs dfs -ls / to check if the data has been stored in Hadoop HDFS I am NOT able to find any new data.
Hi Akshesh,
The example in Getting Started with Hadoop showcases the mongo-hadoop Treasury Yield Example. The first task of that example is to import the provided yield_historical_in.json into MongoDB, which is why you can see data when you run db.yield_historical.in.find().
I am trying the MongoDB-Hadoop connector to store my data in HDFS and query it using MongoDB.
The MongoDB Connector for Hadoop is a library that allows MongoDB to be used as an input source or output destination for Hadoop MapReduce tasks; it is not meant for writing data to HDFS. You can use the class org.apache.hadoop.fs.FileSystem to create an output stream and write to HDFS.
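As a rough sketch of the FileSystem approach, assuming the Hadoop client libraries are on your classpath and fs.defaultFS in your core-site.xml points at your cluster (otherwise the local file system is used), something like the following should work. The class name HdfsWriteSketch, the helper writeText and the target path are hypothetical names chosen for illustration only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteSketch {

    // Hypothetical helper: writes a string to the given path on the given
    // file system, overwriting any existing file at that path.
    public static void writeText(FileSystem fs, Path target, String text) throws IOException {
        try (FSDataOutputStream out = fs.create(target, true)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        // Loads core-site.xml / hdfs-site.xml from the classpath if present.
        Configuration conf = new Configuration();

        // Returns the file system named by fs.defaultFS (HDFS on a configured
        // cluster, the local file system with the default configuration).
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical target path; adjust to a directory you can write to.
        Path target = new Path("/tmp/hdfs-write-sketch.txt");

        writeText(fs, target, "written via org.apache.hadoop.fs.FileSystem\n");
        System.out.println("exists: " + fs.exists(target));
    }
}
```

After running it against a cluster, hdfs dfs -ls /tmp should list the new file. Reading the documents out of MongoDB first (for example with the MongoDB Java driver) is a separate step that this sketch does not cover.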
If you have further questions relating to MongoDB, please open a new discussion thread with the following information:
If you have further questions relating to Hadoop, however, please post a question on StackOverflow: Hadoop to reach a wider audience with Hadoop expertise.
Regards,
Wan.