I am new to RHadoop and want to run a simple function over a large file. The file has the following structure:
YMMKey Trim InternetPrice Mileage Certified SingleOwner
2015ChevroletSuburban LT 0 10 0 0
2014ChevroletEquinox LT 23620 10 0 0
2014ChevroletSilverado 1500 LTZ 41695 10 0 0
2014ChevroletMalibu LT 0 10 0 0
2014ChevroletVolt 36605 10 0 0
2015ChevroletTahoe LT 56480 10 0 0
2015ChevroletSilverado 3500HD LTZ 59145 10 0 0
2014ChevroletSilverado 1500 LTZ 44365 10 0 0
How can I import this file into HDFS and read it from an R script when the file has a large number of rows (about 2 million)?
Could someone also show me how to run a simple map function and a reduce function on the data set once it is imported?
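For reference, this is roughly the shape of job I am hoping to end up with, based on the rmr2 examples I have seen. The column names come from my file above, and the separator, the output path, and the choice of "average InternetPrice per YMMKey" are just assumptions for illustration:

```r
library(rhdfs)
library(rmr2)
hdfs.init()

# copy the local file into HDFS (local and HDFS paths are my guesses)
hdfs.put("vipInputs.csv", "/example/data/vipInputs.csv")

# toy job: average InternetPrice per YMMKey
out <- mapreduce(
  input = "/example/data/vipInputs.csv",
  input.format = make.input.format(
    "csv", sep = "\t",   # assuming tab-separated; adjust to the real delimiter
    col.names = c("YMMKey", "Trim", "InternetPrice",
                  "Mileage", "Certified", "SingleOwner")),
  map = function(k, v) keyval(v$YMMKey, v$InternetPrice),
  reduce = function(k, vv) keyval(k, mean(vv))
)

# pull the (small) result back into the R session
result <- from.dfs(out)
```

Is this the right general pattern, or does a 2-million-row input need something different?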
When I use a simple R function to load the file:

f <- hdfs.file("/example/data/vipInputs.csv", "r", buffersize = 104857600)
m <- hdfs.read(f)
c <- rawToChar(m)
x <- reader$read()   # reader was created earlier with hdfs.line.reader (not shown)

it fails after about 200,000 lines because it runs out of buffer space.
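I assume the fix is to read the file in chunks of lines rather than pulling the whole thing into one raw buffer, along these lines (the chunk size of 10,000 is arbitrary, and I am not sure this scales to 2 million rows):

```r
library(rhdfs)
hdfs.init()

# stream the file chunk by chunk instead of reading it all at once
reader <- hdfs.line.reader("/example/data/vipInputs.csv", n = 10000L)
repeat {
  lines <- reader$read()          # character vector of up to 10000 lines
  if (length(lines) == 0) break   # empty result means end of file
  # ... process this chunk here ...
}
reader$close()
```

Is this the recommended approach, or should the processing happen inside a MapReduce job instead of a client-side loop?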
Thanks in anticipation
Ajay