R-rmr2 org.apache.hadoop.streaming.PipeMapRed: java.io.EOFException

naveen kumar

Dec 15, 2013, 9:43:18 PM
to rha...@googlegroups.com
I am running an rmr2 example from http://bighadoop.wordpress.com/2013/02/25/r-and-hadoop-data-analysis-rhadoop/. This is the code I tried:

    Sys.setenv(HADOOP_HOME="/home/istvan/hadoop")
    Sys.setenv(HADOOP_CMD="/home/istvan/hadoop/bin/hadoop")
   
    library(rmr2)
    library(rhdfs)
   
    ints = to.dfs(1:100)
    calc = mapreduce(input = ints,
                       map = function(k, v) cbind(v, 2*v))


I am using hadoop-streaming-1.1.1.jar. After I call the mapreduce function, the job starts and then fails with this exception:


    2013-12-15 18:45:12,400 WARN mapreduce.Counters: Counter name MAP_INPUT_BYTES is deprecated. Use FileInputFormatCounters as group name and  BYTES_READ as counter name instead
    2013-12-15 18:45:12,406 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
    2013-12-15 18:45:12,548 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed exec [/usr/bin/Rscript, ./rmr-streaming-mapa1935822f4b]
    2013-12-15 18:45:12,627 INFO org.apache.hadoop.streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
    2013-12-15 18:45:12,642 INFO org.apache.hadoop.streaming.PipeMapRed: MRErrorThread done
    2013-12-15 18:45:12,642 INFO org.apache.hadoop.streaming.PipeMapRed: PipeMapRed failed!
    2013-12-15 18:45:15,660 WARN org.apache.hadoop.streaming.PipeMapRed: java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:218)
        at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
        at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:51)
        at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:418)
   
    2013-12-15 18:45:15,689 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
    2013-12-15 18:45:15,695 WARN org.apache.hadoop.mapred.Child: Error running child
    java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
    2013-12-15 18:45:15,705 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task

It creates a sequence file in the /tmp directory on HDFS, but it seems it cannot read the file properly and throws an `EOFException`. Any suggestions on how to fix it? Thanks.

Antonio Piccolboni

Dec 15, 2013, 11:50:39 PM
to RHadoop Google Group
Hi,
could you please retry without loading rhdfs? Could you also share the stderr of a failed task attempt (available from the web UI)? Thanks


Antonio



naveen kumar

Dec 28, 2013, 2:06:39 AM
to rha...@googlegroups.com, ant...@piccolboni.info

The exception remains the same without loading rhdfs. This is the stderr from the task tracker:

    java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:576)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:135)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)
    log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.Task).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

This is already part of the jobtracker log.

Antonio Piccolboni

Dec 29, 2013, 12:44:56 AM
to RHadoop Google Group
That looks like a Java stack trace, not R messages; you may have to look harder. The return code is nice to have, but doesn't help me much. If you don't know the difference between a return code and stderr, you may need an introduction to Unix or equivalent.
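
For readers unsure of the distinction, here is a minimal shell sketch. It is not from the original thread: `failing_task`, its message, and the temporary file are hypothetical stand-ins for the streaming subprocess. The return code is a single number, while stderr carries the text that usually explains the failure.

```shell
#!/bin/sh
# Hypothetical stand-in for a failing streaming subprocess: it writes a
# diagnostic message to stderr and finishes with status 2.
failing_task() {
    echo "Error: something went wrong inside the script" >&2
    return 2
}

# Send stderr to a temporary file; the return code is read separately from $?.
err_file=$(mktemp)
failing_task 2> "$err_file"
status=$?

echo "return code: $status"
echo "stderr: $(cat "$err_file")"
```

The number alone ("failed with code 2") only says that something went wrong. The captured stderr text is where the messages from the script itself would appear.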


Antonio