s3a on emr


dee...@gmail.com

Jan 7, 2019, 2:29:12 PM
to RubiX
Hello,
I'm having trouble connecting via the s3a protocol.
I can connect successfully via s3 or s3n, but not via s3a.

I added the following to my conf:
spark.hadoop.fs.s3.impl         com.qubole.rubix.hadoop2.CachingNativeS3FileSystem
spark.hadoop.fs.s3n.impl        com.qubole.rubix.hadoop2.CachingNativeS3FileSystem

Please help me with s3a.
Thanks

Abhishek Das

Jan 7, 2019, 2:52:46 PM
to RubiX
Hi,

You can do either of these:

spark.hadoop.fs.s3.impl         com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3n.impl        com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3a.impl        com.qubole.rubix.hadoop2.CachingS3AFileSystem

or 

spark.hadoop.fs.s3.impl         com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3n.impl        com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3a.impl        com.qubole.rubix.hadoop2.CachingS3AFileSystem
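
If you prefer to pass these on the command line instead of editing the conf file, here is a rough sketch that builds the equivalent `spark-submit --conf` arguments. The class name is the one from this thread; the helper name `rubix_conf_flags` is made up for illustration and is not part of RubiX or Spark.

```python
# Sketch only: build spark-submit --conf flags for the RubiX caching filesystem.
# CACHING_S3A is the class named in this thread; rubix_conf_flags is a
# hypothetical helper, not a RubiX or Spark API.
CACHING_S3A = "com.qubole.rubix.hadoop2.CachingS3AFileSystem"


def rubix_conf_flags(schemes=("s3", "s3n", "s3a"), impl=CACHING_S3A):
    """Return a flat list of spark-submit arguments, one --conf pair per scheme."""
    flags = []
    for scheme in schemes:
        flags += ["--conf", f"spark.hadoop.fs.{scheme}.impl={impl}"]
    return flags


if __name__ == "__main__":
    # Print the flags so they can be pasted after `spark-submit`.
    print(" ".join(rubix_conf_flags()))
```

Generating the keys from the scheme names keeps the three property names consistent, which helps avoid a typo in any single one.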


Let us know if this works.

Regards,
Abhishek

Dmitry Yatsyuk

Jan 7, 2019, 2:54:14 PM
to Abhishek Das, RubiX
Hello again,
but
spark.hadoop.fs.s3a.impl        com.qubole.rubix.hadoop2.CachingS3AFileSystem
is not working for me.


Abhishek Das

Jan 7, 2019, 2:58:47 PM
to Dmitry Yatsyuk, RubiX
What error are you getting?

Dmitry Yatsyuk

Jan 7, 2019, 2:59:42 PM
to Abhishek Das, RubiX
No errors, it's just not caching.

Abhishek Das

Jan 7, 2019, 3:01:43 PM
to Dmitry Yatsyuk, RubiX
Can you check your Spark executor logs and see which file system class is being used? I have a feeling that the conf is not getting set properly. We have been setting the same config and it's working for us.
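
As a quick sanity check on the conf itself, something like the sketch below can catch a mistyped property name. It only inspects the text of a spark-defaults.conf style file (key and value separated by whitespace or `=`), not what the executors actually load. The function names are made up for illustration, and the expected class is the one quoted earlier in this thread.

```python
import re

# Class name taken from this thread; everything else here is a hypothetical helper.
EXPECTED_IMPL = "com.qubole.rubix.hadoop2.CachingS3AFileSystem"
KEY_PATTERN = re.compile(r"^spark\.hadoop\.fs\.(s3|s3n|s3a)\.impl$")


def parse_conf(text):
    """Parse spark-defaults.conf style lines into a {key: value} dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        if len(parts) == 2:
            props[parts[0]] = parts[1].strip()
    return props


def check_fs_impls(text, expected=EXPECTED_IMPL):
    """Return a list of problems; an empty list means all three schemes look right."""
    found = {}
    for key, value in parse_conf(text).items():
        match = KEY_PATTERN.match(key)
        if match:
            found[match.group(1)] = value
    problems = []
    for scheme in ("s3", "s3n", "s3a"):
        if scheme not in found:
            problems.append(f"{scheme}: no fs.{scheme}.impl entry found")
        elif found[scheme] != expected:
            problems.append(f"{scheme}: set to {found[scheme]}")
    return problems
```

A property whose key is misspelled simply does not match the pattern, so it is reported as missing for that scheme.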

Dmitry Yatsyuk

Jan 8, 2019, 3:29:46 PM
to Abhishek Das, RubiX
Hello,
Thanks so much, it started to work. Looks like it was a misspelling.
Also, will RubiX work on EMR without Presto?

Abhishek Das

Jan 14, 2019, 8:44:50 PM
to RubiX
Hi,

RubiX works on EMR with Presto. To install RubiX on a cluster and start the daemons, you need the rubix-admin Python package. rubix-admin currently supports only Presto. We are working on Spark support in rubix-admin, and it will be done soon.

But if you want, you can deploy the RubiX jars on all the nodes yourself and start the RubiX daemons. Once they are started, you can run your Spark queries with properly configured filesystems (as mentioned in the earlier email).

Hope that answers your questions. Feel free to reply if you have any further queries.

Regards,
Abhishek