hadoop streaming failed with error code 1


Konstantinos Mammas

Mar 25, 2015, 11:14:11 AM3/25/15
to rha...@googlegroups.com
Hi All,

I have just set up RHadoop and I am trying to run a very simple MapReduce job.


Sys.setenv(HADOOP_HOME = "/opt/mapr/hadoop/hadoop-0.20.2/")
Sys.setenv(HADOOP_CMD = "/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop")
Sys.setenv(HADOOP_CONF = "/opt/mapr/hadoop/conf")
Sys.setenv(HADOOP_STREAMING = "/opt/mapr/hadoop/hadoop-0.20.2/contrib/streaming/hadoop-0.20.2-dev-streaming.jar")
Sys.getenv("HADOOP_CMD")

library(rmr2)
library(rhdfs)
library(plyrmr)
library(jsonlite)
library(functional)
library(ravro)

hdfs.init()

hdfs.ls("/user/user1")

# help(hadoop.settings)

hdfs.delete("/user/user1/testInds")
ints <- to.dfs(1:100, "/user/user1A/testInds")

I am getting the following error:

packageJobJar: [/tmp/hadoop-user1/hadoop-unjar4761534670667064596/] [] /tmp/streamjob6026865890801465662.jar tmpDir=null
15/03/25 05:04:59 INFO fs.JobTrackerWatcher: Current running JobTracker is: hdp0029/10.106.128.39:9001
15/03/25 05:04:59 INFO mapred.FileInputFormat: Total input paths to process : 1
15/03/25 05:04:59 INFO mapred.JobClient: Creating job's output directory at maprfs:/tmp/file8392754a6419
15/03/25 05:04:59 INFO mapred.JobClient: Creating job's user history location directory at maprfs:/tmp/file8392754a6419/_logs
15/03/25 05:04:59 INFO mapred.JobClient: user1, realuser: null
15/03/25 05:04:59 INFO streaming.StreamJob: getLocalDirs(): [/tmp/mapr-hadoop/mapred/local]
15/03/25 05:04:59 INFO streaming.StreamJob: Running job: job_201503181218_4375
15/03/25 05:04:59 INFO streaming.StreamJob: To kill this job, run:
15/03/25 05:04:59 INFO streaming.StreamJob: /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201503181218_4375
15/03/25 05:04:59 INFO streaming.StreamJob: Tracking URL: http://hdp0029:50030/jobdetails.jsp?jobid=job_201503181218_4375
15/03/25 05:05:00 INFO streaming.StreamJob:  map 0%  reduce 0%
15/03/25 05:05:22 INFO streaming.StreamJob:  map 100%  reduce 100%
15/03/25 05:05:22 INFO streaming.StreamJob: To kill this job, run:
15/03/25 05:05:22 INFO streaming.StreamJob: /opt/mapr/hadoop/hadoop-0.20.2/bin/../bin/hadoop job  -Dmapred.job.tracker=maprfs:/// -kill job_201503181218_4375
15/03/25 05:05:22 INFO streaming.StreamJob: Tracking URL: http://hdp0029:50030/jobdetails.jsp?jobid=job_201503181218_4375
15/03/25 05:05:22 ERROR streaming.StreamJob: Job not successful. Error: NA
15/03/25 05:05:22 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  : 
  hadoop streaming failed with error code 1
Deleted maprfs:/tmp/file8392352a63e7


The log file indicates the following issues:


stderr logs
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
log4j:WARN Please initialize the log4j system properly.
java.io.IOException: /tmp/mapr-hadoop/mapred/local/taskTracker/user1/jobcache/job_201503181218_4375/attempt_201503181218_4375_m_000000_3/work/./Rscript is not a file or does not have read permissions
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:202)
at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:353)
at org.apache.hadoop.mapred.Child$4.run(Child.java:282)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1151)
at org.apache.hadoop.mapred.Child.main(Child.java:271)
2015-03-25 05:05:17,2238 ERROR Client fs/client/fileclient/cc/writebuf.cc:154 Thread: 32459 FlushWrite failed: File part-00000, error: Stale File handle(116), pfid 2049.2904.1785684, off 0, fid 2049.2904.1785684


Can anyone provide assistance?

Thanks,

Konstantinos





Antonio Piccolboni

Mar 25, 2015, 11:25:09 AM3/25/15
to RHadoop Google Group
First suggestion: don't load the kitchen sink. You are testing rmr2, so leave the other packages alone. That's the rule of all debugging, not just in RHadoop or Hadoop: simplify and isolate. It doesn't matter that it should work; get one piece working at a time. Second, look at this line of output:

ob_201503181218_4375/attempt_201503181218_4375_m_000000_3/work/./Rscript is not a file or does not have read permissions


It's not finding Rscript, which is part of the R installation. So either R is missing on one of the nodes, or Rscript is not on the PATH (so the Hadoop task process can't find it), or it has the wrong permissions. People have generally had trouble manipulating the PATH as it is set during task execution; the usual suggestion is to install R in a standard location so that this is not necessary.
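For example, a minimal rmr2-only test could look something like the sketch below. It is only a sketch: the HADOOP_CMD and HADOOP_STREAMING values are copied from your message, and the mapreduce call is just the standard rmr2 squaring example, so adjust both for your cluster.

Sys.setenv(HADOOP_CMD = "/opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop")
Sys.setenv(HADOOP_STREAMING = "/opt/mapr/hadoop/hadoop-0.20.2/contrib/streaming/hadoop-0.20.2-dev-streaming.jar")

library(rmr2)  # load only the package under test

# Smallest possible round trip through the DFS; no streaming job involved yet.
small.ints <- to.dfs(1:10)
from.dfs(small.ints)

# Smallest possible streaming job: square each value.
result <- mapreduce(input = small.ints,
                    map = function(k, v) keyval(v, v^2))
from.dfs(result)

# Check where Rscript resolves on this node; repeat the check on every
# task node (e.g. `which Rscript` at a shell prompt over ssh).
Sys.which("Rscript")

If the to.dfs/from.dfs round trip works but the mapreduce call still dies with the Rscript error, the problem is on the task nodes (missing R, non-standard location, or wrong permissions) rather than in the R session submitting the job.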


Konstantinos Mammas

Apr 1, 2015, 2:06:57 PM4/1/15
to rha...@googlegroups.com, ant...@piccolboni.info
Thanks a lot for your suggestion. I am facing other issues, which I will raise in a separate post. Thanks again.