Hello,
Is it possible to use rhwrite to split a data.frame in HDFS, so that after I load the files from HDFS with ddf() they are already split into multiple key-value pairs? I don't want to load one big key-value pair but multiple smaller ones.
I tried rhwrite(twitterData, "paulina/BTfile1/twitterDataChunks", numfiles = 5, kvpairs = FALSE)
but after calling ddf() I get warnings:
Warning in str.default(val) : 'str.default': 'le' is NA, so taken as 0
and it loads only one row. I don't fully understand all the arguments, e.g. chunk. Is passByte something similar to mapred.max.split.size?
Full code below:
rhmkdir("paulina/BTfile1","777")
rhexists("paulina/BTfile1")
rhwrite(twitterData, "paulina/BTfile1/twitterDataChunks", numfiles = 5, kvpairs = FALSE)
connect <- hdfsConn("/paulina/BTfile1/twitterDataChunks/", autoYes = TRUE)
ddfTwitters = ddf(connect)
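One workaround I also considered is splitting the data.frame into row chunks myself and passing a list of key-value pairs to rhwrite. This is only a sketch, assuming rhwrite accepts a list of list(key, value) elements under its default kvpairs = TRUE; the chunk count of 5 is arbitrary:

```r
# Sketch (untested): split twitterData into 5 row chunks and write each
# chunk as its own key-value pair, so ddf() should see multiple pairs.
nChunks <- 5
rowIdx <- split(seq_len(nrow(twitterData)),
                cut(seq_len(nrow(twitterData)), nChunks, labels = FALSE))

# each element is list(key, value) -- assumes rhwrite's kvpairs = TRUE default
kvList <- lapply(seq_along(rowIdx), function(i)
  list(i, twitterData[rowIdx[[i]], , drop = FALSE]))

rhwrite(kvList, "paulina/BTfile1/twitterDataChunks", numfiles = nChunks)
```

Would this be the intended way to get multiple pairs, or is there an rhwrite argument that does the splitting for me?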