PipeMapRed.waitOutputThreads(): subprocess failed with code 1, Warning: namespace ‘rmr2’ is not ava

Marc Paul

Dec 19, 2014, 3:26:01 AM
to rha...@googlegroups.com

Hi Antonio,

Thanks for your answer. Sorry, I missed the message that the first post is moderated.

I found the link for the debugging guideline: https://github.com/RevolutionAnalytics/RHadoop/wiki/user%3Ermr%3EDebugging-rmr-programs

It is a different link from the one you describe in the intro. Maybe this is useful for you, so that you get more precise error reports.

So here is the actual question, with an exact error log:

The setup: I have a cluster with 8 nodes.

Hadoop 2.2.0 is installed on each node, along with R and all the packages (rhdfs, rJava, rmr2...).

The first mapreduce example (small.ints 1:1000) runs without problems with both backend=local and backend=hadoop.

But I have problems with the wordcount example. It runs perfectly with backend=local, but with backend=hadoop I get many problems.

For your note: I switched the reduce function off, as the debugging guideline suggests.

 

When I use a small .txt file for the wordcount example, it sometimes works without problems. Sometimes I get errors, but the program runs to the end and I get correct output.

The error in this case is the following:

 

14/12/19 08:10:47 INFO mapreduce.Job:  map 0% reduce 0%

14/12/19 08:10:57 INFO mapreduce.Job: Task Id : attempt_1418972737310_0002_m_000001_0, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

            at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)

            at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)

            at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)

            at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)

            at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)

            at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)

            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)

            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)

            at java.security.AccessController.doPrivileged(Native Method)

            at javax.security.auth.Subject.doAs(Subject.java:415)

            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

 

14/12/19 08:10:57 INFO mapreduce.Job: Task Id : attempt_1418972737310_0002_m_000000_0, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

            at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)

            at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)

            at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)

            at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)

            at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)

            at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)

            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)

            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)

            at java.security.AccessController.doPrivileged(Native Method)

            at javax.security.auth.Subject.doAs(Subject.java:415)

            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

 

14/12/19 08:11:03 INFO mapreduce.Job: Task Id : attempt_1418972737310_0002_m_000001_1, Status : FAILED

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

            at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)

            at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)

            at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)

            at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)

            at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)

            at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)

            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)

            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)

            at java.security.AccessController.doPrivileged(Native Method)

            at javax.security.auth.Subject.doAs(Subject.java:415)

            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)

            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)

 

14/12/19 08:11:05 INFO mapreduce.Job:  map 50% reduce 0%

14/12/19 08:11:10 INFO mapreduce.Job:  map 100% reduce 0%

14/12/19 08:11:10 INFO mapreduce.Job: Job job_1418972737310_0002 completed successfully

14/12/19 08:11:10 INFO mapreduce.Job: Counters: 29

            File System Counters

                        FILE: Number of bytes read=0

                        FILE: Number of bytes written=168154

                        FILE: Number of read operations=0

                        FILE: Number of large read operations=0

                        FILE: Number of write operations=0

                        HDFS: Number of bytes read=309

                        HDFS: Number of bytes written=2583

                        HDFS: Number of read operations=10

                        HDFS: Number of large read operations=0

                        HDFS: Number of write operations=4

            Job Counters 

                        Failed map tasks=3

                        Launched map tasks=5

                        Other local map tasks=3

                        Rack-local map tasks=2

                        Total time spent by all maps in occupied slots (ms)=29903

                        Total time spent by all reduces in occupied slots (ms)=0

            Map-Reduce Framework

                        Map input records=3

                        Map output records=24

                        Input split bytes=184

                        Spilled Records=0

                        Failed Shuffles=0

                        Merged Map outputs=0

                        GC time elapsed (ms)=67

                        CPU time spent (ms)=1060

                        Physical memory (bytes) snapshot=300879872

                        Virtual memory (bytes) snapshot=2485153792

                        Total committed heap usage (bytes)=214433792

            File Input Format Counters 

                        Bytes Read=125

            File Output Format Counters 

                        Bytes Written=2583

14/12/19 08:11:10 INFO streaming.StreamJob: Output directory: /tmp/file12c878fae41b

14/12/19 08:11:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

 

 

As described in the guideline, I checked the userlogs:

(The userlogs are sometimes empty even though I got an error in the R console. I don't understand that.)

But here is the error log (note that the small.ints example works):

 

 

administrator@l101-pc01:/usr/local/hadoop/logs/userlogs/application_1418972737310_0002$ ls

container_1418972737310_0002_01_000002  container_1418972737310_0002_01_000003  container_1418972737310_0002_01_000004

 

 

administrator@l101-pc01:/usr/local/hadoop/logs/userlogs/application_1418972737310_0002/container_1418972737310_0002_01_000002$ cat stderr 

Loading objects:

  wordcount

Loading objects:

  backend.parameters

  combine

  combine.file

  combine.line

  debug

  default.input.format

Warning: namespace ‘rmr2’ is not available and has been replaced

by .GlobalEnv when processing object ‘default.input.format’

  default.output.format

  in.folder

  in.memory.combine

  input.format

  libs

  map

  map.file

  map.line

  out.folder

  output.format

  pkg.opts

  postamble

  preamble

  profile.nodes

  reduce

  reduce.file

  reduce.line

  rmr.global.env

  rmr.local.env

  save.env

  tempfile

  vectorized.reduce

  verbose

  work.dir

Loading required package: methods

Loading required package: rmr2

Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : 

  there is no package called ‘stringr’

Warning in FUN(c("base", "methods", "datasets", "utils", "grDevices", "graphics",  :

  can't load rmr2

Loading required package: rJava

Loading required package: rhdfs

Error : .onLoad failed in loadNamespace() for 'rhdfs', details:

  call: fun(libname, pkgname)

  error: Environment variable HADOOP_CMD must be set before loading package rhdfs

Warning in FUN(c("base", "methods", "datasets", "utils", "grDevices", "graphics",  :

  can't load rhdfs

Loading objects:

  backend.parameters

  combine

  combine.file

  combine.line

  debug

  default.input.format

Warning: namespace ‘rmr2’ is not available and has been replaced

by .GlobalEnv when processing object ‘default.input.format’

  default.output.format

  in.folder

  in.memory.combine

  input.format

  libs

  map

  map.file

  map.line

  out.folder

  output.format

  pkg.opts

  postamble

  preamble

  profile.nodes

  reduce

  reduce.file

  reduce.line

  rmr.global.env

  rmr.local.env

  save.env

  tempfile

  vectorized.reduce

  verbose

  work.dir

Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : 

  there is no package called ‘stringr’

Calls: <Anonymous> ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>

No traceback available 

Error during wrapup: 

Execution halted

 

 

The rmr2 package is installed; otherwise the small.ints example would not work, would it?

I don't understand that.

 

My last question is:

I have 8 nodes and can see them on localhost:50070, but when I go to localhost:8088 and look up the application, I see only one node listed:

ApplicationMaster

Attempt Number   Start Time             Node             Logs
1                19-Dec-2014 08:10:40   l101-pc06:8042   logs

 

As you can see in the userlogs, there are three different containers. I don't understand that:

 

And here is the R library from node l101-pc06:

Packages in library ‘/usr/local/lib/R/site-library’:

 

bitops                  Bitwise Operations

caTools                 Tools: moving window statistics, GIF, Base64,

                        ROC AUC, etc.

devtools                Tools to make developing R code easier

digest                  Create Cryptographic Hash Digests of R Objects

evaluate                Parsing and evaluation tools that provide more

                        details than the default.

functional              Curry, Compose, and other higher-order

                        functions

httr                    Tools for Working with URLs and HTTP

iterators               Iterator construct for R

itertools               Iterator Tools

jsonlite                A Robust, High Performance JSON Parser and

                        Generator for R

memoise                 Memoise functions

mime                    Map filenames to MIME types

plyr                    Tools for splitting, applying and combining

                        data

R6                      Classes with reference semantics

Rcpp                    Seamless R and C++ Integration

RCurl                   General network (HTTP/FTP/...) client interface

                        for R

reshape2                Flexibly Reshape Data: A Reboot of the Reshape

                        Package.

rhdfs                   R and Hadoop Distributed Filesystem

rJava                   Low-level R to Java interface

RJSONIO                 Serialize R objects to JSON, JavaScript Object

                        Notation

rmr2                    R and Hadoop Streaming Connector

rstudioapi              Safely access the RStudio API.

stringr                 Make it easier to work with strings.

whisker                 {{mustache}} for R, logicless templating

 

Packages in library ‘/usr/lib/R/library’:

 

base                    The R Base Package

boot                    Bootstrap Functions (originally by Angelo Canty

                        for S)

class                   Functions for Classification

cluster                 Cluster Analysis Extended Rousseeuw et al.

codetools               Code Analysis Tools for R

compiler                The R Compiler Package

datasets                The R Datasets Package

foreign                 Read Data Stored by Minitab, S, SAS, SPSS,

                        Stata, Systat, Weka, dBase, ...

graphics                The R Graphics Package

grDevices               The R Graphics Devices and Support for Colours

                        and Fonts

grid                    The Grid Graphics Package

KernSmooth              Functions for kernel smoothing for Wand & Jones

                        (1995)

lattice                 Lattice Graphics

MASS                    Support Functions and Datasets for Venables and

                        Ripley's MASS

Matrix                  Sparse and Dense Matrix Classes and Methods

methods                 Formal Methods and Classes

mgcv                    Mixed GAM Computation Vehicle with GCV/AIC/REML

                        smoothness estimation

nlme                    Linear and Nonlinear Mixed Effects Models

nnet                    Feed-forward Neural Networks and Multinomial

                        Log-Linear Models

parallel                Support for Parallel computation in R

rpart                   Recursive Partitioning and Regression Trees

spatial                 Functions for Kriging and Point Pattern

                        Analysis

splines                 Regression Spline Functions and Classes

stats                   The R Stats Package

stats4                  Statistical Functions using S4 Classes

survival                Survival Analysis

 

 

 

 

Thank you for your work. I hope you can help me.

 

Best regards

Marc

Antonio Piccolboni

Dec 19, 2014, 10:16:02 AM
to rha...@googlegroups.com
If you go to the end of the stderr log, you can see this message:

Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : 

  there is no package called ‘stringr’


Which makes the execution stop. I am not sure why your program needs stringr, but if it uses functions from stringr you need to install stringr on each node. If it doesn't, I would detach stringr before executing the wordcount program, so that it's not loaded on the nodes either.
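
To see at a glance which packages the failed tasks could not load, the container logs can be scanned in one pass. What follows is only a minimal sketch: the function name missing_r_packages is made up for illustration, and it assumes the log layout shown in the question, that is, one stderr file per container_* directory under the application's userlogs directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop or rmr2): print the name of every
# R package that a container's stderr reported as missing.
# Assumes the layout <application dir>/container_*/stderr.
missing_r_packages() {
    app_dir=$1
    # grep -h drops file names, so the same message from several containers
    # collapses into one entry after sort -u. The first sed expression keeps
    # only what follows the message; the second strips the quotes and any
    # other character that cannot appear in an R package name.
    grep -h "there is no package called" "$app_dir"/container_*/stderr 2>/dev/null |
        sed -e 's/.*there is no package called //' -e 's/[^A-Za-z0-9.]//g' |
        sort -u
}

# Run against a directory when one is given on the command line.
if [ "$#" -gt 0 ]; then
    missing_r_packages "$1"
fi
```

For the job in the question this could be called as missing_r_packages /usr/local/hadoop/logs/userlogs/application_1418972737310_0002. It is probably worth running on every node, since the containers of one job can run on different machines and, as far as I can tell, each node keeps only the logs of the containers it ran. For any package it reports, running Rscript -e 'library(stringr)' on that node, as the user that runs the Hadoop tasks, should show whether the package is visible there.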
