Installing RHadoop (rmr,rhbase,rhdfs) on pseudo-distributed cluster

Krishnanand Khambadkone

Mar 27, 2012, 6:29:49 PM
to rha...@googlegroups.com
I have a CDH3 pseudo-distributed cluster (Hadoop, HBase) installed and running on my laptop and want to install RHadoop (rmr, rhbase, rhdfs) on it. Does anyone have a concise set of instructions on which packages to download, which environment variables to set, and how to install these packages?
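
For the environment-variable part of the question, a quick check from inside R can show which variables are still unset before any package is loaded. This is only a minimal sketch: `missing_env` is a hypothetical helper, and the names `HADOOP_HOME` and `HADOOP_CONF` are assumptions used for illustration. The wiki page for the installed version of each package is the authority on which variables are actually required.

```r
# Hypothetical helper: return the names of environment variables that are
# unset or empty in the current R session.
missing_env <- function(vars) {
  values <- Sys.getenv(vars, unset = NA)
  unname(vars[is.na(values) | values == ""])
}

# ASSUMPTION: these variable names are illustrative only. Replace them with
# the ones listed in the installation notes for your rmr/rhdfs/rhbase version.
required <- c("HADOOP_HOME", "HADOOP_CONF")

unset <- missing_env(required)
if (length(unset) > 0) {
  message("Not set: ", paste(unset, collapse = ", "))
} else {
  message("All listed variables are set.")
}
```

Running this in the same session (or under the same user) that will run the jobs matters, because variables exported in one shell are not visible to an R process started elsewhere.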

Antonio Piccolboni

Mar 27, 2012, 6:37:07 PM
to rha...@googlegroups.com
https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr
https://github.com/RevolutionAnalytics/RHadoop/wiki/rhdfs
https://github.com/RevolutionAnalytics/RHadoop/wiki/rhbase

Krishnanand Khambadkone

Mar 28, 2012, 6:38:02 PM
to rha...@googlegroups.com
Antonio, thank you very much for these pointers. I have installed R and rmr, and both are now up and running. I am able to run the map/reduce examples from j.seidman's R package for streaming; however, when I try to run the rmr sample, I get this error when I load the script:
 
> source('deptdelay-rmr.R')
Loading required package: RJSONIO
Loading required package: itertools
Loading required package: iterators
Loading required package: digest
Error in mapreduce(input = input, output = output, textinputformat = csvtextinputformat,  :
  unused argument(s) (textinputformat = csvtextinputformat)

Antonio Piccolboni

Mar 28, 2012, 6:41:07 PM
to rha...@googlegroups.com
That script was written for rmr 1.1.
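
To illustrate why a script written for an older release fails this way, here is a minimal sketch in plain R. `mapreduce_old` and `mapreduce_new` are hypothetical stand-ins, not the real rmr functions; they only show that R rejects a named argument once it is no longer part of a function's signature.

```r
# Hypothetical stand-ins for two releases of the same function.
# These are NOT the real rmr implementations.
mapreduce_old <- function(input, output, textinputformat = NULL) {
  "ran (old signature)"
}
mapreduce_new <- function(input, output) {
  "ran (new signature)"
}

# Call a function the way the old script does and capture any error message
# instead of halting.
call_with_old_args <- function(fun) {
  tryCatch(
    fun(input = "in", output = "out", textinputformat = identity),
    error = function(e) conditionMessage(e)
  )
}

old_result <- call_with_old_args(mapreduce_old)  # accepted
new_result <- call_with_old_args(mapreduce_new)  # rejected: unused argument
print(old_result)
print(new_result)
```

The second call fails during argument matching, before any of the function body runs, which is why the error appears immediately rather than after a job has been submitted.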


A

Jonathan Seidman

Mar 28, 2012, 6:51:11 PM
to rha...@googlegroups.com
Krishnanand – That's because that jseidman guy is really lame and hasn't updated his code yet. Fortunately, other community members are more diligent, so check out Jeffrey Breen's fork for a version that should work for you: https://github.com/jeffreybreen/hadoop-R

Krishnanand Khambadkone

Mar 28, 2012, 7:51:48 PM
to rha...@googlegroups.com
Gentlemen, thank you so much for the prompt replies and the timely help.

sandy

May 24, 2012, 9:57:25 AM
to RHadoop
Hi,

I have installed rmr on a pseudo-distributed cluster following
(https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr) and took the code from
the link below
(https://github.com/jeffreybreen/hadoop-R/blob/master/airline/src/deptdelay_by_month/R/rmr/deptdelay-rmr.R).
When I run the script, I see the following error. I am new to this
package, so any help would be greatly appreciated.

[root@01HW044544 rhadoop]# ./firstrmr.R
Loading required package: methods
Loading required package: RJSONIO
Loading required package: itertools
Loading required package: iterators
Loading required package: digest
Error in mapreduce(input = input, output = output, textinputformat =
csvtextinputformat, :
unused argument(s) (textinputformat = csvtextinputformat)
Calls: from.dfs -> to.dfs.path -> deptdelay -> mapreduce
Execution halted

Thanks
Sandeep

On Mar 29, 4:51 am, Krishnanand Khambadkone
<danoomistmati...@gmail.com> wrote:
> Gentlemen,  Thank you so much for the prompt replies and the timely help.
>
> On Wednesday, March 28, 2012 6:51:11 PM UTC-4, jseidman wrote:
> > Krishnanand – That's because that jseidman guy is really lame and hasn't
> > updated his code yet. Fortunately other community members are more
> > diligent, so check out Jeffrey Breen's fork for a version that should work
> > for you:
>
> >https://github.com/jeffreybreen/hadoop-R
>
> > On Wed, Mar 28, 2012 at 10:38 PM, Krishnanand Khambadkone <
> > danoomistmati...@gmail.com> wrote:
>
> >> Antonio,  Thank you very much for these pointers.  I have installed R and
> >> rmr and these are now up and running.   I am able to run the map/reduce the
> >> examples from j.seidman's R package for streaming however when I try to run
> >> the rmr sample,  I get this error when I try to load the script
>
> >> > source('deptdelay-rmr.R')
> >> Loading required package: RJSONIO
> >> Loading required package: itertools
> >> Loading required package: iterators
> >> Loading required package: digest
> >> *Error in mapreduce(input = input, output = output, textinputformat =
> >> csvtextinputformat,  :*
> >> *  unused argument(s) (textinputformat = csvtextinputformat)*
>
> >> On Tuesday, March 27, 2012 6:37:07 PM UTC-4, Antonio Piccolboni wrote:
>
> >>> https://github.com/RevolutionAnalytics/RHadoop/wiki/rmr
> >>> https://github.com/RevolutionAnalytics/RHadoop/wiki/rhdfs
> >>> https://github.com/RevolutionAnalytics/RHadoop/wiki/rhbase

Antonio Piccolboni

May 26, 2012, 1:23:43 PM
to rha...@googlegroups.com
Unfortunately, not every line of rmr code updates itself when a new release comes out. If you read the help for the mapreduce function, you can see that the option has been replaced.
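
Besides reading `?mapreduce`, the installed function's signature can be checked directly from R. The sketch below assumes nothing about rmr's actual arguments: `unsupported_args` is a hypothetical helper that compares exact argument names against a function's formals, and `f` is a stand-in so the example runs without rmr installed. The replacement argument name is deliberately not guessed here; the help page for the installed version is the authority.

```r
# Hypothetical helper: which of `arg_names` would `fun` reject?
# Compares exact names only; a function that takes `...` accepts any name.
unsupported_args <- function(fun, arg_names) {
  accepted <- names(formals(fun))
  if ("..." %in% accepted) {
    return(character(0))
  }
  setdiff(arg_names, accepted)
}

# Stand-in function so the sketch runs without rmr installed.
f <- function(input, output) NULL
print(unsupported_args(f, c("input", "output", "textinputformat")))

# With rmr installed, the same check can be run against the real function,
# together with the installed version number.
if (requireNamespace("rmr", quietly = TRUE)) {
  print(packageVersion("rmr"))
  print(unsupported_args(rmr::mapreduce,
                         c("input", "output", "textinputformat")))
}
```

Any name this reports for the real function is one the script needs to stop passing, or to replace with whatever the current help page documents.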

Antonio
