Hi, I'm trying to connect to a MongoDB database using the MongoDB Hadoop Connector on the Hortonworks Sandbox (HDP 2.2).
I've followed the instructions on GitHub:
- Downloaded the repository (version 1.3.1)
- Built it with Gradle
- Copied the jars to the lib directory (I wasn't sure which directory that is on the sandbox, so I used /usr/lib/hadoop/lib)
- Added the MongoDB Java driver (version 2.12.5) to the same lib directory
Then I tried the following Pig script:
REGISTER /usr/lib/hadoop/lib/mongo-java-driver-2.12.5.jar
REGISTER /usr/lib/hadoop/lib/mongo-hadoop-core-1.3.1.jar
REGISTER /usr/lib/hadoop/lib/mongo-hadoop-pig-1.3.1.jar
raw = LOAD 'mongodb://<url>:27017/<database>' USING com.mongodb.hadoop.pig.MongoLoader;
raw_limited = LIMIT raw 3;
dump raw_limited;

However, this gives an error:
Pig Stack Trace
---------------
ERROR 1002: Unable to store alias raw_limited
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias raw_limited
at org.apache.pig.PigServer.openIterator(PigServer.java:935)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:746)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.PigException: ERROR 1002: Unable to store alias raw_limited
at org.apache.pig.PigServer.storeEx(PigServer.java:1038)
at org.apache.pig.PigServer.store(PigServer.java:997)
at org.apache.pig.PigServer.openIterator(PigServer.java:910)
... 13 more
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalArgumentException: Couldn't connect and authenticate to get collection
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:286)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
... 15 more
Caused by: java.lang.IllegalArgumentException: Couldn't connect and authenticate to get collection
at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:353)
at com.mongodb.hadoop.splitter.MongoSplitterFactory.getSplitterByStats(MongoSplitterFactory.java:71)
at com.mongodb.hadoop.splitter.MongoSplitterFactory.getSplitter(MongoSplitterFactory.java:107)
at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:56)
at org.apache.pig.impl.io.ReadToEndLoader.init(ReadToEndLoader.java:190)
at org.apache.pig.impl.io.ReadToEndLoader.<init>(ReadToEndLoader.java:146)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.setUp(POLoad.java:99)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLoad.getNextTuple(POLoad.java:127)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POLimit.getNextTuple(POLimit.java:122)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:307)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore.getNextTuple(POStore.java:159)
at org.apache.pig.backend.hadoop.executionengine.fetch.FetchLauncher.runPipeline(FetchLauncher.java:161)
at org.apache.pig.backend.hadoop.executionengine.fetch.FetchLauncher.launchPig(FetchLauncher.java:81)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:278)
... 18 more
Caused by: java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333)
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988)
at com.mongodb.DBApiLayer.doGetCollection(DBApiLayer.java:123)
at com.mongodb.DBApiLayer.doGetCollection(DBApiLayer.java:33)
at com.mongodb.DB.getCollection(DB.java:164)
at com.mongodb.hadoop.util.MongoConfigUtil.getCollection(MongoConfigUtil.java:351)
... 32 more

What I find weird is the java.lang.IllegalArgumentException: Couldn't connect and authenticate to get collection, because the database does not require authentication. Connecting to the same database via pymongo also works fine.
Am I missing something here?
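
One observation from the trace: the innermost NullPointerException is thrown inside ConcurrentHashMap.get, which does not accept a null key, and it is reached from DB.getCollection. That would be consistent with the collection name being null, and the LOAD URI above names only a database. Below is a minimal Python sketch for checking what a URI contains, assuming the connector expects the form mongodb://host:port/database.collection (an assumption worth confirming against the connector's documentation). The helper name split_mongo_uri and the host and names in the example are hypothetical.

```python
from urllib.parse import urlparse


def split_mongo_uri(uri):
    """Return (database, collection) parsed from a mongodb:// URI.

    Assumes the path has the form /database.collection; either part
    is returned as None when it is absent from the URI.
    """
    path = urlparse(uri).path.lstrip("/")
    # Split on the first dot: the database name comes before it,
    # the collection name (if any) after it.
    database, _, collection = path.partition(".")
    return database or None, collection or None


# Hypothetical host and names, for illustration only.
print(split_mongo_uri("mongodb://example-host:27017/mydb"))
print(split_mongo_uri("mongodb://example-host:27017/mydb.mycoll"))
```

If the second element comes back as None for the URI passed to MongoLoader, the missing collection name may be what the connector is tripping over, rather than authentication.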