Hello folks,
I have been following this tutorial:
I am running a single-node setup on Ubuntu 14.04 with Java 1.8 and Hadoop 2.6.0. Everything in the tutorial worked until I tried my setup with the provided example, which throws an exception. Here is the minimal code that reproduces it:
#!/usr/bin/env Rscript
Sys.setenv(HADOOP_STREAMING = "/usr/local/hadoop/share/hadoop/tools/sources/hadoop-streaming-2.6.0-sources.jar")
library(rmr2)
## map function
map <- function(k, lines) {
  words.list <- strsplit(lines, '\\s')
  words <- unlist(words.list)
  return( keyval(words, 1) )
}
## reduce function
reduce <- function(word, counts) {
  keyval(word, sum(counts))
}
wordcount <- function (input, output=NULL) {
  mapreduce(input=input, output=output, input.format="text",
            map=map, reduce=reduce)
}
## delete previous result if any
system("hadoop fs -rm -r wordcount/out")
## Submit job
hdfs.root <- 'wordcount'
hdfs.data <- file.path(hdfs.root, 'data')
hdfs.out <- file.path(hdfs.root, 'out')
out <- wordcount(hdfs.data, hdfs.out)
The complete output that I get is:
Loading required package: methods
Please review your hadoop settings. See help(hadoop.settings)
Warning message:
S3 methods ‘gorder.default’, ‘gorder.factor’, ‘gorder.data.frame’, ‘gorder.matrix’, ‘gorder.raw’ were declared in NAMESPACE but not found
15/03/25 23:58:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
rm: `wordcount/out': No such file or directory
Exception in thread "main" java.lang.ClassNotFoundException: -D
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.util.RunJar.run(RunJar.java:214)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
hadoop streaming failed with error code 1
Calls: wordcount -> mapreduce -> mr
Execution halted
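One check that may help narrow this down: the trace shows RunJar trying to load `-D` as a class name, which is what it does when the jar it is given has no Main-Class entry in its manifest and it falls back to treating the next argument as the class. So it seems worth confirming which jar HADOOP_STREAMING actually points to. Below is a minimal sketch for that, assuming the /usr/local/hadoop layout from the script above; the helper name check_streaming_jar is hypothetical and not part of rmr2.

```r
# Hypothetical helper (not part of rmr2): basic sanity checks on a jar path.
check_streaming_jar <- function(path) {
  list(
    # Does the file exist at all?
    exists = file.exists(path),
    # *-sources.jar files normally contain .java sources, not a runnable main class
    is_sources_jar = grepl("-sources\\.jar$", path)
  )
}

# Inspect the value the script above sets (empty string if unset)
jar <- Sys.getenv("HADOOP_STREAMING")
print(check_streaming_jar(jar))

# List every streaming jar under the install; the root path is an assumption
# taken from the script above. Returns character(0) if the directory is absent.
print(list.files("/usr/local/hadoop/share/hadoop/tools",
                 pattern = "hadoop-streaming.*\\.jar$",
                 recursive = TRUE, full.names = TRUE))
```

If is_sources_jar comes back TRUE, the listing should show whether a jar without the -sources suffix is available elsewhere under the same directory.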
Could somebody help me identify what is wrong and how I can fix it?
Thanks.