tutorial of the RHadoop?


Felipe Gutierrez

Dec 12, 2013, 1:17:54 PM
to rha...@googlegroups.com

Hi everyone,
I need to run the Revolution software on Hadoop, and it seems I have to do that through RHadoop. When I go to the Revolution Analytics GitHub page (https://github.com/RevolutionAnalytics) there are four packages to download. Do I have to install all four? Is there a tutorial that covers installing them and running the Revolution Analytics software in R with RHadoop?

Thanks!
Felipe

Los

Dec 20, 2013, 5:29:32 PM
to rha...@googlegroups.com
Hi Felipe,

The four packages provide different functionalities.  You might want to get started with just rhdfs to read from and write to HDFS from within R.

The basic process is:
1) Install a Hadoop cluster.
2) Configure a client machine so it can connect to the Hadoop cluster. That is, you should be able to run `hdfs dfs -ls` from it.
3) Install R on the client machine.
4) Tell R where to find Hadoop and Hadoop streaming:
    export HADOOP_HOME=/path/to/hadoop/home
    export HADOOP_STREAMING=/path/to/hadoop-streaming-version.jar
5) Configure R to use Java.
R CMD javareconf
6) Install rJava
R --no-save <<EOT
  install.packages( pkgs="rJava",
                    dependencies=TRUE,
                    repos="http://www.rforge.net" )
EOT
7) Install rhdfs
R CMD INSTALL /path/to/rhdfs_version.tar.gz
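
Before running step 7, it may help to confirm that the earlier steps took effect. The following is only a rough sketch, not part of RHadoop: `check_prereqs` is a hypothetical helper name, and it assumes that `hdfs`, `R` and `java` are expected on the PATH of the client machine and that HADOOP_HOME and HADOOP_STREAMING were exported as in step 4.

```shell
#!/bin/sh
# Hypothetical helper (not part of RHadoop): report which prerequisites
# from the steps above appear to be missing on this client machine.
check_prereqs() {
    rc=0

    # Step 4: HADOOP_HOME should be set and point at an existing directory.
    if [ -z "${HADOOP_HOME:-}" ] || [ ! -d "$HADOOP_HOME" ]; then
        echo "missing: HADOOP_HOME is unset or not a directory"
        rc=1
    fi

    # Step 4: HADOOP_STREAMING should be set and point at an existing file.
    if [ -z "${HADOOP_STREAMING:-}" ] || [ ! -f "$HADOOP_STREAMING" ]; then
        echo "missing: HADOOP_STREAMING is unset or not a file"
        rc=1
    fi

    # Steps 2, 3 and 5: the Hadoop client, R and Java should be on the PATH.
    for cmd in hdfs R java; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd not found on PATH"
            rc=1
        fi
    done

    # 0 when nothing was reported, 1 otherwise.
    return "$rc"
}
```

After defining the function in your shell, `check_prereqs && echo "all prerequisites found"` prints one line per problem it detects, or the success message if it finds none. It only checks that files and commands exist; it does not verify that the cluster is reachable, so running `hdfs dfs -ls` as in step 2 is still worthwhile.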

I hope this helps.

-Carlos