Error in first Mapreduce job

Chen

unread,

May 22, 2012, 3:23:03 AM5/22/12

to RHadoop

I've just installed rmr, rhbase and rhdfs packages. But whnen i try
the Tutorial called "My first Mapreduce job" (https://github.com/
RevolutionAnalytics/RHadoop/wiki/Tutorial), i encounter the following
error after line small.ints = to.dfs(1:10):

>
> library(rmr)
Loading required package: RJSONIO
Loading required package: itertools
Loading required package: iterators
Loading required package: digest
>
> library(rhbase)
>
> library(rhdfs)
Loading required package: rJava

HADOOP_HOME=/usr/local/hadoop/
HADOOP_CONF=/usr/local/hadoop/conf
>
> small.ints = to.dfs(1:10)
12/05/22 15:12:24 ERROR streaming.StreamJob: Missing required options:
input, output
Usage: $HADOOP_HOME/bin/hadoop jar \
$HADOOP_HOME/hadoop-streaming.jar [options]
Options:
-input <path> DFS input file(s) for the Map step
-output <path> DFS output directory for the Reduce step
-mapper <cmd|JavaClassName> The streaming command to run
-combiner <cmd|JavaClassName> The streaming command to run
-reducer <cmd|JavaClassName> The streaming command to run
-file <file> File/dir to be shipped in the Job jar file
-inputformat TextInputFormat(default)|SequenceFileAsTextInputFormat|
JavaClassName Optional.
-outputformat TextOutputFormat(default)|JavaClassName Optional.
-partitioner JavaClassName Optional.
-numReduceTasks <num> Optional.
-inputreader <spec> Optional.
-cmdenv <n>=<v> Optional. Pass env.var to streaming commands
-mapdebug <path> Optional. To run this script when a map task
fails
-reducedebug <path> Optional. To run this script when a reduce task
fails
-verbose

Generic options supported are
-conf <configuration file> specify an application configuration
file
-D <property=value> use value for given property
-fs <local|namenode:port> specify a namenode
-jt <local|jobtracker:port> specify a job tracker
-files <comma separated list of files> specify comma separated
files to be copied to the map reduce cluster
-libjars <comma separated list of jars> specify comma separated jar
files to include in the classpath.
-archives <comma separated list of archives> specify comma
separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]

For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info

Streaming Job Failed!

Can anyone suggest what might have caused this problem?
thanks in advance

David Champagne

unread,

May 22, 2012, 11:35:06 AM5/22/12

to rha...@googlegroups.com

What distribution and version of Hadoop are you running?

sandy

unread,

May 25, 2012, 12:30:03 AM5/25/12

to RHadoop

Hi ,
I am also seeing the same error and i am using hadoop-0.20.2.

> require(rmr)
Loading required package: rmr

Loading required package: RJSONIO
Loading required package: itertools
Loading required package: iterators
Loading required package: digest

> small.ints = to.dfs(1:10)
12/05/25 09:54:11 ERROR streaming.StreamJob: Missing required options:

input, output
Usage: $HADOOP_HOME/bin/hadoop jar \
$HADOOP_HOME/hadoop-streaming.jar [options]
Options:
-input <path> DFS input file(s) for the Map step
-output <path> DFS output directory for the Reduce step
-mapper <cmd|JavaClassName> The streaming command to run

-combiner <JavaClassName> Combiner has to be a Java class

thanks
Sandeep

On May 22, 8:35 pm, David Champagne

<david.champa...@revolutionanalytics.com> wrote:
> What distribution and version of Hadoop are you running?
>
>
>
>
>
>
>
> On Tuesday, May 22, 2012 12:23:03 AM UTC-7, Chen wrote:
>
> > I've just installed rmr, rhbase and rhdfs packages. But whnen i try
> > the Tutorial called "My first Mapreduce job" (https://github.com/

> > RevolutionAnalytics/RHadoop/wiki/Tutorial<https://github.com/RevolutionAnalytics/RHadoop/wiki/Tutorial>),

David Champagne

unread,

May 25, 2012, 4:41:55 AM5/25/12

to rha...@googlegroups.com

Hello,

rmr has only been certified to work on CDH3 and Apache Hadoop 1.0.2. There are some critical patches that are required for rmr to work properly (see the wiki page: https://github.com/RevolutionAnalytics/RHadoop/wiki/Which-Hadoop-for-rmr)

I hope that help

Kristy Patel

unread,

Mar 10, 2015, 7:42:16 PM3/10/15

to rha...@googlegroups.com

Hello David,

I am using rmr2 and have CDH5 installed. Still getting the error: ERROR streaming.StreamJob: Missing required options: input, output

Could you please assist me on this.

Thanks!

Antonio Piccolboni

unread,

Mar 10, 2015, 7:56:16 PM3/10/15

to RHadoop Google Group

This is an instance of thread hijacking and I am sorry I approved your message but the moderator interface doesn't make it easy to spot it. You are kindly requested to start a new thread with an original, informative subject. The other request is to learn how to report a problem so that you can actually get help. Your report doesn't contain close to enough information to do anything. Thanks

--
post: rha...@googlegroups.com ||
unsubscribe: rhadoop+u...@googlegroups.com ||
web: https://groups.google.com/d/forum/rhadoop?hl=en-US
---
You received this message because you are subscribed to the Google Groups "RHadoop" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rhadoop+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all

Reply to author

Forward