rdd = sc.newAPIHadoopRDD(
    inputFormatClass='com.mongodb.hadoop.BSONFileInputFormat',
    keyClass='org.apache.hadoop.io.Text',
    valueClass='org.apache.hadoop.io.MapWritable',
    conf={
        'mapred.input.dir': 's3n://my-bucket/compressed_bson.gz'
    }
)
INFO hadoop.BSONFileInputFormat: File s3n://my-bucket/compressed_bson.gz is compressed so cannot be split.
Traceback (most recent call last):
  File "<stdin>", line 6, in <module>
  File "/home/hadoop/spark/python/pyspark/context.py", line 547, in newAPIHadoopRDD
    jconf, batchSize)
  File "/home/hadoop/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/home/hadoop/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.IllegalArgumentException: Wrong FS: s3n://my-bucket/compressed_bson.gz, expected: hdfs://10.0.2.139:9000
Rafael Aguiar
Data Science Engineer
Mobile: +55 81 99730.0415
Office: +55 81 3127.0881
Website: inlocomedia.com