MongoDB Hadoop Connector Set-up


Jon

Apr 18, 2016, 12:49:59 AM
to mongodb-user
Would anyone here be willing to walk me through setting this stuff up on Amazon Web Services?  I need to run MongoDB on an EC2 instance and connect it to an EMR Hadoop cluster for a project, but I've never used any of this stuff (Mongo/Hadoop/the connector/AWS) before, so it's a bit overwhelming.  So far I've downloaded the connector from GitHub.  I believe I need to run "gradlew jar" to build the jars (not really sure what those do either), but after that I'm a bit lost.  I've been searching for about a week now, but I can't find a good step-by-step process for this.  Anyone willing to help?  It'd be really, really appreciated!

Luke Lovett

Apr 18, 2016, 2:41:57 PM
to mongodb-user
Hi Jon,

There's a lot of excellent documentation out there for this already. Check out this tutorial for launching an EMR cluster: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/gsg-launch-cluster.html

Once you've launched your Hadoop cluster, you need to install the connector. While you can compile the connector yourself, it's a lot easier just to download the jars directly from here: http://search.maven.org/#search|ga|1|g%3A%22org.mongodb.mongo-hadoop%22. You can determine which jars you need and how to set them up from the documentation here: https://github.com/mongodb/mongo-hadoop/wiki.

Once you've set everything up, you can check that everything is working by running one of these example jobs: https://github.com/mongodb/mongo-hadoop/tree/master/examples
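
For a rough idea of what the setup boils down to: a job tells the connector where to read and write through the mongo.input.uri and mongo.output.uri properties, each a MongoDB URI of the form mongodb://host:port/database.collection. Here's a minimal sketch in Python that just builds those strings. The hostname, database, and collection names are made up for illustration, and passing the properties as -D flags assumes the job parses generic Hadoop options (for example via ToolRunner); check the wiki for what your job type actually needs.

```python
# Sketch only: the host, database, and collection names are hypothetical.

def mongo_hadoop_properties(host, database, in_coll, out_coll, port=27017):
    """Return the connector's input/output URI properties for one job."""
    base = f"mongodb://{host}:{port}/{database}"
    return {
        "mongo.input.uri": f"{base}.{in_coll}",
        "mongo.output.uri": f"{base}.{out_coll}",
    }


def as_hadoop_flags(props):
    """Render properties as -D flags for a `hadoop jar` command line."""
    return [f"-D{key}={value}" for key, value in props.items()]


# Hypothetical EC2 host running mongod on the default port.
props = mongo_hadoop_properties("ec2-host.example.com", "mydb", "events", "results")
print(" ".join(as_hadoop_flags(props)))
```

The EMR nodes also need network access to the MongoDB port on the EC2 instance (27017 by default), which usually means a security group rule.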

Jon

Apr 19, 2016, 10:09:44 PM
to mongodb-user
Thanks so much for the links!  Will try all of this out asap!