Rhwrite


Jakub Paulina

May 21, 2016, 10:12:39 PM
to rhipe
Hello,
Is it possible to use rhwrite to split a data.frame in HDFS, so that when I load the files from HDFS with ddf() the data is already split into multiple key-value pairs? I don't want to load one big key-value pair, but several smaller ones.
I tried rhwrite(twitterData, "paulina/BTfile1/twitterDataChunks", numfiles = 5, kvpairs = FALSE)
but after instantiating the ddf() I get this warning:
Warning in str.default(val) : 'str.default': 'le' is NA, so taken as 0

And it loads only one row. I don't fully understand all the arguments, such as chunks. Is passByte something similar to mapred.max.split.size?
Full code below:

rhmkdir("paulina/BTfile1","777")
rhexists("paulina/BTfile1")
rhwrite(twitterData, "paulina/BTfile1/twitterDataChunks", numfiles = 5, kvpairs = FALSE)
connect <- hdfsConn("/paulina/BTfile1/twitterDataChunks/", autoYes = TRUE)
ddfTwitters = ddf(connect)