I've just installed the rmr, rhbase and rhdfs packages, but when I try
the tutorial called "My first MapReduce job"
(https://github.com/RevolutionAnalytics/RHadoop/wiki/Tutorial), I
encounter the following error after the line small.ints = to.dfs(1:10):
>
> library(rmr)
Loading required package: RJSONIO
Loading required package: itertools
Loading required package: iterators
Loading required package: digest
>
> library(rhbase)
>
> library(rhdfs)
Loading required package: rJava
HADOOP_HOME=/usr/local/hadoop/
HADOOP_CONF=/usr/local/hadoop/conf
>
> small.ints = to.dfs(1:10)
12/05/22 15:12:24 ERROR streaming.StreamJob: Missing required options:
input, output
Usage: $HADOOP_HOME/bin/hadoop jar \
$HADOOP_HOME/hadoop-streaming.jar [options]
Options:
-input <path> DFS input file(s) for the Map step
-output <path> DFS output directory for the Reduce step
-mapper <cmd|JavaClassName> The streaming command to run
-combiner <cmd|JavaClassName> The streaming command to run
-reducer <cmd|JavaClassName> The streaming command to run
-file <file> File/dir to be shipped in the Job jar file
-inputformat TextInputFormat(default)|SequenceFileAsTextInputFormat|
JavaClassName Optional.
-outputformat TextOutputFormat(default)|JavaClassName Optional.
-partitioner JavaClassName Optional.
-numReduceTasks <num> Optional.
-inputreader <spec> Optional.
-cmdenv <n>=<v> Optional. Pass env.var to streaming commands
-mapdebug <path> Optional. To run this script when a map task
fails
-reducedebug <path> Optional. To run this script when a reduce task
fails
-verbose
Generic options supported are
-conf <configuration file> specify an application configuration
file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated
files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar
files to include in the classpath.
-archives <comma separated list of archives> specify comma
separated archives to be unarchived on the compute machines.
The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]
For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info
Streaming Job Failed!
Can anyone suggest what might be causing this?
Thanks in advance.
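For completeness, here is roughly how my shell environment is set up before I launch R. The two paths are exactly the values rhdfs echoed back in the session above; I'm not certain these are the only variables rmr consults, so treat this as my setup rather than a known-good configuration:

```shell
# Environment exported before starting R; the values match the
# HADOOP_HOME/HADOOP_CONF lines printed by library(rhdfs) above.
# Adjust for your own Hadoop install location.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF=/usr/local/hadoop/conf
```

I then start R from that same shell, so the session should inherit both variables.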