How can I use an R package in RHadoop?


Jingmin

Nov 18, 2013, 12:18:00 AM
to rha...@googlegroups.com
I tried to use some R packages in RHadoop, but I always get an error.
I have heard that R packages cannot be used in RHadoop directly, and that one has to call an R program or something similar. For example, the map function of a mapreduce job might call a local R program on the datanodes.
I am sorry that my English is not good enough to describe my problem exactly. In any case, I want to know how to use an R package in RHadoop, or how to call another R process from RHadoop. Thank you.

There is an example that uses the R package e1071 (svm) in RHadoop. I use the same code, but I do not know why it does not work for me; I always get an error.


Antonio Piccolboni

Nov 18, 2013, 12:29:25 AM
to RHadoop Google Group
On Sun, Nov 17, 2013 at 9:18 PM, Jingmin <jingm...@gmail.com> wrote:
I tried to use some R packages in RHadoop, but I always get an error.

Finding out the exact error would be a first step. There are a number of error logs, all accessible via the web UI; the one I am most interested in is the task attempt stderr.
 
I have heard that R packages cannot be used in RHadoop directly, and that one has to call an R program or something similar. For example, the map function of a mapreduce job might call a local R program on the datanodes.

Not sure which R package you are talking about. The rest of the sentence may be accurate if one is trying to describe how rmr2 is implemented, but I don't think that should be a concern of yours, at least at this stage. 
 
I am sorry that my English is not good enough to describe my problem exactly. In any case, I want to know how to use an R package in RHadoop, or how to call another R process from RHadoop. Thank you.

There is an example that uses the R package e1071 (svm) in RHadoop. I use the same code, but I do not know why it does not work for me; I always get an error.


Did you install e1071 on each node at a system location available to all users? That usually is enough to allow you to access its functions in the map and reduce functions.
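One way to check this on a node is a small base R sketch like the one below, run as the same user the Hadoop tasks run as (which user that is depends on your cluster setup). The helper name `package_location` is made up for illustration; it only wraps `find.package()`. If the package turns up in a per-user library under someone's home directory rather than a system library, the tasks will most likely not see it.

```r
# Hypothetical helper (not part of rmr2 or e1071): return the library
# directory in which a package is installed, or NA if R cannot find it.
package_location <- function(pkg) {
  path <- find.package(pkg, quiet = TRUE)  # character(0) when not installed
  if (length(path) == 0) NA_character_ else dirname(path)
}

# Libraries this R process searches, in order.
print(.libPaths())

# Where e1071 lives for the current user, or NA if it is not visible.
print(package_location("e1071"))
```

Running the same two lines on every node and comparing the output should show quickly whether one node is missing the package or has it only in a private library.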

 

I thought I had answered that one already, and it's marked accepted. What else?


Antonio
 




Jingmin

Nov 18, 2013, 8:03:03 PM
to rha...@googlegroups.com, ant...@piccolboni.info
I had installed e1071 on every datanode, which is why I am so confused: with the same code he can run it without error, while mine fails.
Most importantly, I have to use some R packages in Hadoop. I am trying to use system() in the map function, since it can call external software (R). I want to use the R package in a separate R process outside of RHadoop and get the result back in RHadoop. It has not gone well so far; that also produces an error.
I am still trying. Do you have a better method for using an R package in RHadoop?

Antonio Piccolboni

Nov 18, 2013, 8:12:18 PM
to RHadoop Google Group
On Mon, Nov 18, 2013 at 5:03 PM, Jingmin <jingm...@gmail.com> wrote:
I had installed e1071 on every datanode, which is why I am so confused: with the same code he can run it without error, while mine fails.
Most importantly, I have to use some R packages in Hadoop. I am trying to use system() in the map function, since it can call external software (R).

This is not how rmr2 works; it's Hadoop streaming that calls R. You are free to try any approach you want, but I would say it's remote enough from RHadoop that you are as likely to get help from this group as from any other.
 
I want to use the R package in a separate R process outside of RHadoop and get the result back in RHadoop. It has not gone well so far; that also produces an error.
I am still trying. Do you have a better method for using an R package in RHadoop?


Yes 

library(rmr2)
library(e1071)

mapreduce(input, map = function(k, v) {
  # code including calls to e1071 functions
})

No system(), no R invocation, no nothing. I would say this is not a better method, this is the only method. Any other way of using packages within an rmr2 mapreduce job is not supported. And finally, my usual recommendation: you can try to ignore the documentation and use rmr2 based on your intuition, as you seem to be doing, but the results are going to be disappointing.
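To make the outline above concrete, here is a minimal sketch that can be tried without a cluster, assuming rmr2 and e1071 are both installed locally. The choices in it are illustrative and not from this thread: it uses rmr2's local backend, the built-in iris data set, and a made-up key name. On a real cluster each map call sees only one chunk of the input, so fitting a model per chunk may not be what you want.

```r
library(rmr2)
library(e1071)

# Assumption: run in-process with the local backend so no Hadoop is needed.
rmr.options(backend = "local")

# Write a small data frame to the (local) DFS as job input.
input <- to.dfs(iris)

result <- mapreduce(
  input = input,
  map = function(k, v) {
    # v is a chunk of the input data frame; call e1071 directly.
    model <- svm(Species ~ ., data = v)
    # Emit one key-value pair: the number of support vectors for this chunk.
    keyval("n_support_vectors", model$tot.nSV)
  }
)

# Read the job output back into the R session.
out <- from.dfs(result)
print(out)
```

If this runs locally but fails after switching to `rmr.options(backend = "hadoop")`, the task attempt stderr is the place to look for the reason.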


Antonio