Hi, I'm a master's student in Korea, and I'm studying big data with RHadoop.
I have a problem with the code below. Please give me a tip!
When I input a 10 MB file using this code, everything works fine.
However, with a 200 MB file, the MapReduce job stops:
there are no error messages; it just hangs at Reduce 67%.
My question is: why does my code fail with the 200 MB data?
<My RHadoop code>-------------------------------------------------------------------------------------
library(rmr2)

# Map: split each CSV line on commas, bind the rows into a numeric
# matrix, add 1 to every value, and emit it all under the single key 1.
cs.map = function(., M)
{
  jData <- do.call("rbind", lapply(strsplit(unlist(M), ","), as.numeric))
  keyval(1, jData + 1)
}

# Reduce: identity, pass the collected values through unchanged.
cs.reduce = function(k, Z)
{
  keyval(k, Z)
}

Jinput  = '/JBH/l200M.csv'
Joutput = '/result06'
mapreduce(input = Jinput, output = Joutput, input.format = "text",
          map = cs.map, reduce = cs.reduce, combine = FALSE)
-------------------------------------------------------------------------------------------------------
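While re-reading my map function, I wondered whether the single key is the problem: every row is emitted under key 1, so the whole 200 MB must pass through one reduce call. Below is an untested sketch of a variant that spreads the rows over several keys instead; the key count (10) is an arbitrary guess of mine. Is this the right direction?
<A variant I am considering, untested>-----------------------------------------------------------
# Assumes library(rmr2) is already loaded as above. In rmr2's keyval(),
# a key vector of length nrow(value) pairs one key with each matrix row,
# so the rows are distributed across reduce calls instead of one.
cs.map.spread = function(., M)
{
  jData <- do.call("rbind", lapply(strsplit(unlist(M), ","), as.numeric))
  # draw one key per row from 1..10 (10 is an arbitrary choice)
  keyval(sample(1:10, nrow(jData), replace = TRUE), jData + 1)
}
-------------------------------------------------------------------------------------------------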
<My distributed environment>-----------
Number of nodes : 1 master, 5 slaves
Software versions :
- OS : Ubuntu 14.04 LTS
- Java : 1.7.0
- Hadoop : 0.20.2
- R : 3.1.0
- rmr2 : 3.3.0
- rhdfs : 1.0.8
---------------------------------------
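In case the hang is memory related, I also thought about giving the reduce tasks more heap through rmr2's backend.parameters argument. This is only a guess I have not verified on my cluster; the property name and the -Xmx value below are assumptions of mine.
<Memory guess, not yet verified>-------------------------------------------------------------------
# Hedged sketch: pass a Hadoop property to the streaming job via
# backend.parameters. mapred.child.java.opts is the property name from
# my Hadoop 0.20.2 generation, and 1024m is an arbitrary guess.
mapreduce(input = Jinput, output = Joutput, input.format = "text",
          map = cs.map, reduce = cs.reduce, combine = FALSE,
          backend.parameters = list(
            hadoop = list(D = "mapred.child.java.opts=-Xmx1024m")))
---------------------------------------------------------------------------------------------------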
Any assistance is appreciated.
Best Regards,
Jieun