SparkR external datasource


Daniel Dean

Aug 21, 2015, 2:10:36 PM
to SparkR Developers
Hi guys,

Is it possible to use an external data source with SparkR? For example, let's say I have my data sitting in a Cloudant database and I'd like to load it directly from there. From what I can see, I can get the appropriate .jar files loaded, but whenever I try to pull any data, SparkR seems to look for it only in HDFS. Any comments would be greatly appreciated.

Regards,
Daniel

Shivaram Venkataraman

Aug 21, 2015, 4:04:10 PM
to Daniel Dean, SparkR Developers
The best way to do this would be to implement a Spark SQL data source
http://spark.apache.org/docs/latest/sql-programming-guide.html#data-sources
and then use `read.df` in SparkR.
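
As a rough sketch of what that could look like from the SparkR side (using the SparkR 1.4/1.5 API), assuming a data source package for the database already exists. The jar path, the source name `com.example.cloudant`, and the `database` option are hypothetical placeholders, not a real connector's API:

```r
library(SparkR)

# Hypothetical jar path: the data source jar has to be on the classpath.
sc <- sparkR.init(sparkJars = "/path/to/datasource.jar")
sqlContext <- sparkRSQL.init(sc)

# `source` names the data source implementation to use; any extra named
# arguments are passed through to that data source as options.
df <- read.df(sqlContext,
              source = "com.example.cloudant",  # hypothetical source name
              database = "mydb")                # hypothetical option

printSchema(df)
head(df)
```

If `source` is omitted, `read.df` uses the default source configured by `spark.sql.sources.default` (Parquet out of the box) and treats `path` as a file path.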

BTW, SparkR development has moved to Apache Spark, so please post
questions to the Spark user / developer mailing lists
http://spark.apache.org/community.html

Thanks
Shivaram