Hi all,
After setting up rmr with Hadoop on Windows 7, following the instructions from:
https://github.com/RevolutionAnalytics/RHadoop/wiki/user%3Ermr%3EHome
I am facing a problem when running the tutorial (My first map reduce job) from:
https://github.com/RevolutionAnalytics/rmr2/blob/master/docs/tutorial.md
The first line of code:
small.ints = to.dfs(1:1000)
appears to work, apart from the warning:
WARN zlib.ZlibFactory: Failed to load/initialize native-zlib library
However when I try to execute the second line:
> mapreduce(input=small.ints, map =function(k,v) cbind(v, v^2))
I get a "hadoop streaming failed" error and the job is not executed. The full output, including the stack trace of the exception, is attached below.
There are a couple of things that I suspect might be the problem:
1. The original WARN message: Failed to load/initialize native-zlib library
2. A second WARN message: Fail to create symbolic links on Windows
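To narrow down the second suspicion, something like the following could be run in the same R session. It is only a minimal sketch and not part of rmr2: the variable names are made up for illustration, and it only checks whether the current user is allowed to create a symbolic link at all, using base R's file.symlink(). It does not show what Hadoop itself does.

```r
# Hypothetical diagnostic, not part of rmr2: try to create a symbolic link
# from the current R session and report whether it worked.
target <- tempfile("symlink-target-")   # throwaway file to link to
link <- tempfile("symlink-link-")       # path where the link should appear
writeLines("test", target)

# file.symlink() returns TRUE on success and FALSE (with a warning) on failure.
ok <- suppressWarnings(file.symlink(target, link))
cat("symbolic link created:", ok, "\n")

# Clean up both paths whether or not the link was created.
unlink(c(link, target))
```

If this reports FALSE, the session presumably lacks the privilege that the Windows warning in the log refers to.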
I should mention that I am able to run MapReduce jobs successfully without using the rmr package.
Also, the line of code
from.dfs(small.ints) works okay.
Any ideas or insight would be really appreciated.
Thank you
Other threads related to this issue might be:
https://groups.google.com/forum/#!searchin/rhadoop/hadoop$20streaming$20failed$20with$20error$20code$201/rhadoop/RtkOkcswJY8/Fjd-6XlNy6oJ
https://groups.google.com/forum/#!searchin/rhadoop/hadoop$20streaming$20failed$20with$20error$20code$201/rhadoop/jwtOWHlWJj8/a1lkvS-WDWkJ
> small.ints = to.dfs(1:1000)
15/02/02 15:06:00 WARN zlib.ZlibFactory: Failed to load/initialize native-zlib library
15/02/02 15:06:00 INFO compress.CodecPool: Got brand-new compressor [.deflate]
> mapreduce(input=small.ints, map =function(k,v) cbind(v, v^2))
15/02/02 15:06:06 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
15/02/02 15:06:06 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
15/02/02 15:06:06 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
15/02/02 15:06:06 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
15/02/02 15:06:06 INFO mapred.FileInputFormat: Total input paths to process : 1
15/02/02 15:06:06 INFO mapreduce.JobSubmitter: number of splits:1
15/02/02 15:06:07 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local767853806_0001
15/02/02 15:06:07 WARN conf.Configuration: file:/tmp/hadoop-dalo9529/mapred/staging/dalo9529767853806/.staging/job_local767853806_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
15/02/02 15:06:07 WARN conf.Configuration: file:/tmp/hadoop-dalo9529/mapred/staging/dalo9529767853806/.staging/job_local767853806_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Creating symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967102\rmr-local-env25f04ccb5e23 <- C:\Users\dalo9529\Documents/rmr-local-env25f04ccb5e23
15/02/02 15:06:07 WARN fs.FileUtil: Fail to create symbolic links on Windows. The default security settings in Windows disallow non-elevated administrators and all non-administrators from creating symbolic links. This behavior can be changed in the Local Security Policy management console
15/02/02 15:06:07 WARN mapred.LocalDistributedCacheManager: Failed to create symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967102\rmr-local-env25f04ccb5e23 <- C:\Users\dalo9529\Documents/rmr-local-env25f04ccb5e23
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Localized file:/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-local-env25f04ccb5e23 as file:/tmp/hadoop-dalo9529/mapred/local/1422885967102/rmr-local-env25f04ccb5e23
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Creating symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967103\rmr-global-env25f010b919c <- C:\Users\dalo9529\Documents/rmr-global-env25f010b919c
15/02/02 15:06:07 WARN fs.FileUtil: Fail to create symbolic links on Windows. The default security settings in Windows disallow non-elevated administrators and all non-administrators from creating symbolic links. This behavior can be changed in the Local Security Policy management console
15/02/02 15:06:07 WARN mapred.LocalDistributedCacheManager: Failed to create symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967103\rmr-global-env25f010b919c <- C:\Users\dalo9529\Documents/rmr-global-env25f010b919c
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Localized file:/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-global-env25f010b919c as file:/tmp/hadoop-dalo9529/mapred/local/1422885967103/rmr-global-env25f010b919c
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Creating symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967104\rmr-streaming-map25f0553aae8 <- C:\Users\dalo9529\Documents/rmr-streaming-map25f0553aae8
15/02/02 15:06:07 WARN fs.FileUtil: Fail to create symbolic links on Windows. The default security settings in Windows disallow non-elevated administrators and all non-administrators from creating symbolic links. This behavior can be changed in the Local Security Policy management console
15/02/02 15:06:07 WARN mapred.LocalDistributedCacheManager: Failed to create symlink: \tmp\hadoop-dalo9529\mapred\local\1422885967104\rmr-streaming-map25f0553aae8 <- C:\Users\dalo9529\Documents/rmr-streaming-map25f0553aae8
15/02/02 15:06:07 INFO mapred.LocalDistributedCacheManager: Localized file:/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-streaming-map25f0553aae8 as file:/tmp/hadoop-dalo9529/mapred/local/1422885967104/rmr-streaming-map25f0553aae8
15/02/02 15:06:07 WARN conf.Configuration: file:/tmp/hadoop-dalo9529/mapred/local/localRunner/dalo9529/job_local767853806_0001/job_local767853806_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring.
15/02/02 15:06:07 WARN conf.Configuration: file:/tmp/hadoop-dalo9529/mapred/local/localRunner/dalo9529/job_local767853806_0001/job_local767853806_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring.
15/02/02 15:06:07 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
15/02/02 15:06:07 INFO mapred.LocalJobRunner: OutputCommitter set in config null
15/02/02 15:06:07 INFO mapreduce.Job: Running job: job_local767853806_0001
15/02/02 15:06:07 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapred.FileOutputCommitter
15/02/02 15:06:07 INFO mapred.LocalJobRunner: Waiting for map tasks
15/02/02 15:06:07 INFO mapred.LocalJobRunner: Starting task: attempt_local767853806_0001_m_000000_0
15/02/02 15:06:07 INFO util.ProcfsBasedProcessTree: ProcfsBasedProcessTree currently is supported only on Linux.
15/02/02 15:06:07 INFO mapred.Task: Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@372033f4
15/02/02 15:06:07 INFO mapred.MapTask: Processing split: hdfs://localhost:9000/tmp/file25f05a1771c:0+2142
15/02/02 15:06:07 WARN zlib.ZlibFactory: Failed to load/initialize native-zlib library
15/02/02 15:06:07 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
15/02/02 15:06:07 INFO mapred.MapTask: numReduceTasks: 0
15/02/02 15:06:07 INFO streaming.PipeMapRed: PipeMapRed exec [Rscript, --vanilla, ./rmr-streaming-map25f0553aae8]
15/02/02 15:06:07 INFO streaming.PipeMapRed: R/W/S=1/0/0 in:NA [rec/s] out:NA [rec/s]
Fatal error: cannot open file './rmr-streaming-map25f0553aae8': No such file or directory
15/02/02 15:06:07 WARN streaming.PipeMapRed: java.io.IOException: The pipe has been ended
15/02/02 15:06:07 INFO streaming.PipeMapRed: MRErrorThread done
15/02/02 15:06:07 INFO streaming.PipeMapRed: PipeMapRed failed!
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/02/02 15:06:07 INFO mapred.LocalJobRunner: map task executor complete.
15/02/02 15:06:08 WARN mapred.LocalJobRunner: job_local767853806_0001
java.lang.Exception: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
15/02/02 15:06:08 INFO mapreduce.Job: Job job_local767853806_0001 running in uber mode : false
15/02/02 15:06:08 INFO mapreduce.Job: map 0% reduce 0%
15/02/02 15:06:08 INFO mapreduce.Job: Job job_local767853806_0001 failed with state FAILED due to: NA
15/02/02 15:06:08 INFO mapreduce.Job: Counters: 0
15/02/02 15:06:08 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce, :
hadoop streaming failed with error code 1
In addition: Warning message:
running command 'D:\prog\hadoop\bin\hadoop jar D:\prog\hadoop\share\hadoop\tools\lib\hadoop-streaming-2.4.1.jar -D "stream.map.input=typedbytes" -D "stream.map.output=typedbytes" -D "stream.reduce.input=typedbytes" -D "stream.reduce.output=typedbytes" -D "mapred.reduce.tasks=0" -D "mapreduce.map.java.opts=-Xmx400M" -D "mapreduce.reduce.java.opts=-Xmx400M" -files "/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-local-env25f04ccb5e23,/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-global-env25f010b919c,/Users/dalo9529/AppData/Local/Temp/RtmpmmfVIg/rmr-streaming-map25f0553aae8" -input "/tmp/file25f05a1771c" -output "/tmp/file25f023b323cb" -mapper "Rscript --vanilla ./rmr-streaming-map25f0553aae8" -inputformat "org.apache.hadoop.streaming.AutoInputFormat" -outputformat "org.apache.hadoop.mapred.SequenceFileOutputFormat" 2>&1' had status 1