Re: rmr2 on yarn based cluster


Antonio Piccolboni

Feb 24, 2013, 12:09:01 PM
to rha...@googlegroups.com
Hi,
it seems the only obstacle is a bug in streaming that's being worked on right now. Exactly when that will be fixed and land in one of the mainstream distros is hard for me to tell (the fix version field is still unset). Thanks


Antonio

On Saturday, February 23, 2013 11:10:41 PM UTC-8, Jagat Singh wrote:
Hi,

What are the plans to support rmr2 on a YARN-based cluster?

Thanks in advance


Antonio Piccolboni

Feb 24, 2013, 11:19:04 PM
to RHadoop Google Group


On Feb 24, 2013 8:04 PM, "Jagat Singh" <jagat...@gmail.com> wrote:
>
> Hi Antonio,
>
> Thanks for your reply.
>
> I see that there is a patch now available for the issue you pointed out.
>
> So if this is the only blocker, building Hadoop on my own should work out for an rmr2 YARN version?

Well, that bug stops the testing very early, so I am not sure it's the only problem. I don't know of any others, and streaming should work just the same under YARN, so I hope it's the only one, but I can't be sure.
>
> I can give it a try if that is the only issue.

That's understandable, but it would be very helpful for the project if you went ahead. You need to patch 1.0.2 or newer. To run the automated tests you need the quickcheck package from GitHub; then run R CMD check path-to-package. If it doesn't pass, I will ask you to share your build and we'll take it from there. Thanks

>
> Thanks in advance for your reply.

> --
> post: rha...@googlegroups.com ||
> unsubscribe: rhadoop+u...@googlegroups.com ||
> web: https://groups.google.com/d/forum/rhadoop?hl=en-US
> ---
> You received this message because you are subscribed to the Google Groups "RHadoop" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to rhadoop+u...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Jagat Singh

Feb 24, 2013, 11:29:32 PM
to rha...@googlegroups.com
Hi,

I am working on the Hadoop 2.x alpha series.

I will apply the patch, build, and hopefully report progress back to you.

Thanks for your help.

Regards,

Jagat Singh

Jagat Singh

Feb 27, 2013, 4:07:22 AM
to rha...@googlegroups.com
Hi, here is the output:


jj@jj-VirtualBox:~/softwares/R/quickcheck-master/build$ sudo R CMD check quickcheck_1.0.tar.gz
* using log directory ‘/home/jj/softwares/R/quickcheck-master/build/quickcheck.Rcheck’
* using R version 2.15.2 (2012-10-26)
* using platform: i686-pc-linux-gnu (32-bit)
* using session charset: UTF-8
* checking for file ‘quickcheck/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘quickcheck’ version ‘1.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking whether package ‘quickcheck’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... WARNING
prepare_Rd: generators.Rd:119: Dropping empty section \keyword
prepare_Rd: generators.Rd:120: Dropping empty section \keyword
prepare_Rd: generators.Rd:97-99: Dropping empty section \details
prepare_Rd: generators.Rd:100-101: Dropping empty section \value
prepare_Rd: generators.Rd:107-109: Dropping empty section \note
prepare_Rd: generators.Rd:104-106: Dropping empty section \author
prepare_Rd: generators.Rd:102-103: Dropping empty section \references
prepare_Rd: generators.Rd:113-115: Dropping empty section \seealso
prepare_Rd: generators.Rd:116-117: Dropping empty section \examples
checkRd: (5) generators.Rd:0-121: multiple sections named '\name' are not allowed
prepare_Rd: quickcheck-package.Rd:24-25: Dropping empty section \references
prepare_Rd: quickcheck-package.Rd:27-29: Dropping empty section \seealso
prepare_Rd: quickcheck-package.Rd:30-32: Dropping empty section \examples
prepare_Rd: unit.test.Rd:27-29: Dropping empty section \details
prepare_Rd: unit.test.Rd:30-36: Dropping empty section \value
prepare_Rd: unit.test.Rd:43-45: Dropping empty section \note
prepare_Rd: unit.test.Rd:40-42: Dropping empty section \author
prepare_Rd: unit.test.Rd:37-39: Dropping empty section \references
prepare_Rd: unit.test.Rd:49-51: Dropping empty section \seealso
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... WARNING
Undocumented code objects:
  ‘catch.out’
All user-level objects in a package should have documentation entries.
See the chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.
* checking for code/documentation mismatches ... WARNING
Codoc mismatches from documentation object 'tdgg.any':
tdgg.any
  Code: function(p.true = 0.5, int.lambda = 100, min = -1, max = 1,
                 list.tdg = tdgg.any(), max.level = 20, len.lambda =
                 10)
  Docs: function(p.true = 0.5, lambda.int = 100, min = -1, max = 1,
                 len.char = 8, len.raw = 8, lambda.list = 10, list.tdg
                 = tdgg.any(), lambda.vector = 10, max.level = 20,
                 vector.tdg = tdgg.double())
  Argument names in code not in docs:
    int.lambda len.lambda
  Argument names in docs not in code:
    lambda.int len.char len.raw lambda.list lambda.vector vector.tdg
  Mismatches in argument names (first 3):
    Position: 2 Code: int.lambda Docs: lambda.int
    Position: 5 Code: list.tdg Docs: len.char
    Position: 6 Code: max.level Docs: len.raw
tdgg.list
  Code: function(tdg = tdgg.any(list.tdg = tdg, len.lambda = lambda,
                 max.level = max.level), lambda = 10, max.level = 20)
  Docs: function(tdg = tdgg.any(list.tdg = tdg, lambda.list = lambda,
                 max.level = max.level), lambda = 10, max.level = 20)
  Mismatches in argument default values:
    Name: 'tdg' Code: tdgg.any(list.tdg = tdg, len.lambda = lambda, max.level = max.level) Docs: tdgg.any(list.tdg = tdg, lambda.list = lambda, max.level = max.level)
tdgg.prototype
  Code: function(prototype, generator = tdgg.any())
  Docs: function(prototype)
  Argument names in code not in docs:
    generator

* checking Rd \usage sections ... WARNING
Undocumented arguments in documentation object 'tdgg.any'
  ‘...’ ‘str.lambda’ ‘len.lambda’ ‘const’ ‘row.lambda’ ‘col.lambda’
  ‘distribution’ ‘lambda’ ‘elem.lambda’ ‘tdg’ ‘prototype’ ‘l’
Objects in \usage without \alias in documentation object 'tdgg.any':
  ‘catch.out’

Undocumented arguments in documentation object 'unit.test'
  ‘stop’

Functions with \usage entries need to have the appropriate \alias
entries, and all their arguments documented.
The \usage entries must correspond to syntactically valid R code.
See the chapter ‘Writing R documentation files’ in the ‘Writing R
Extensions’ manual.
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking examples ... OK
* checking for unstated dependencies in tests ... OK
* checking tests ...
  Running ‘quickcheck.R’
 OK
* checking PDF version of manual ... WARNING
LaTeX errors when creating PDF version.
This typically indicates Rd problems.
* checking PDF version of manual without hyperrefs or index ... ERROR
jj@jj-VirtualBox:~/softwares/R/quickcheck-master/build$

Jagat Singh

Feb 27, 2013, 4:52:14 AM
to rha...@googlegroups.com
After that I ran the wordcount.R test.

I was able to run it successfully.

Is there any other test you would like me to run to confirm that it's working fine?

Thanks
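For readers following along, the wordcount.R test boils down to a map step that splits lines into words and a reduce step that sums the counts. A minimal sketch against the rmr2 API (the helper names wc.map, wc.reduce, and wordcount are illustrative, not the shipped test code, and the input path is hypothetical):

```r
library(rmr2)

# map: receives a chunk of text lines, emits (word, 1) pairs
wc.map <- function(., lines)
  keyval(unlist(strsplit(lines, "[[:space:]]+")), 1)

# reduce: sums the counts for each word
wc.reduce <- function(word, counts)
  keyval(word, sum(counts))

wordcount <- function(input)
  mapreduce(input = input,
            input.format = "text",
            map = wc.map,
            reduce = wc.reduce,
            combine = TRUE)

# from.dfs(wordcount("/some/hdfs/text/dir"))
```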

Antonio Piccolboni

Feb 27, 2013, 10:22:29 AM
to RHadoop Google Group

Yes, you need to run

R CMD check rmr2_2.1.0.tar.gz

quickcheck is only a helper in this process.
Thanks

Antonio


Jagat Singh

Feb 27, 2013, 10:21:15 PM
to rha...@googlegroups.com
jj@jj-VirtualBox:~/softwares/R/rmr2-master/build$ sudo R CMD check rmr2_2.0.2.tar.gz
[sudo] password for jj:
* using log directory ‘/home/jj/softwares/R/rmr2-master/build/rmr2.Rcheck’

* using R version 2.15.2 (2012-10-26)
* using platform: i686-pc-linux-gnu (32-bit)
* using session charset: UTF-8
* checking for file ‘rmr2/DESCRIPTION’ ... OK

* checking extension type ... Package
* this is package ‘rmr2’ version ‘2.0.2’

* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking whether package ‘rmr2’ can be installed ... OK

* checking installed package size ... OK
* checking package directory ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking for unstated dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... NOTE
is.keyval: no visible binding for global variable ‘key’
is.keyval: no visible binding for global variable ‘val’
* checking Rd files ... OK

* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK

* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking line endings in Makefiles ... WARNING
Found the following Makefiles with CR or CRLF line endings:
  src/Makevars
Some Unix ‘make’ programs require LF line endings.
* checking for portable compilation flags in Makevars ... OK
* checking for portable use of $(BLAS_LIBS) and $(LAPACK_LIBS) ... OK
* checking compiled code ... NOTE
File ‘/home/jj/softwares/R/rmr2-master/build/rmr2.Rcheck/rmr2/libs/rmr2.so’:
  Found ‘_ZSt4cerr’, possibly from ‘std::cerr’ (C++)
    Object: ‘typed-bytes.o’

Compiled code should not call functions which might terminate R nor
write to stdout/stderr instead of to the console.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.
* checking examples ... ERROR
Running examples in ‘rmr2-Ex.R’ failed
The error most likely occurred in:

> ### Name: dfs.empty
> ### Title: Get a directory or file size or check if it is empty
> ### Aliases: dfs.empty dfs.size
>
> ### ** Examples
>
> dfs.empty(mapreduce(to.dfs(1:10)))
13/02/28 14:16:44 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
13/02/28 14:16:44 INFO compress.CodecPool: Got brand-new compressor [.deflate]
13/02/28 14:16:46 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [/tmp/RtmpRVqx3A/rmr-local-envf7313172ceb, /tmp/RtmpRVqx3A/rmr-global-envf73268da2cc, /tmp/RtmpRVqx3A/rmr-streaming-mapf733b6dd658] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.2-alpha.jar] /tmp/streamjob4595777032435723960.jar tmpDir=null
13/02/28 14:16:48 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/02/28 14:16:48 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/02/28 14:16:48 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is inited.
13/02/28 14:16:48 INFO service.AbstractService: Service:org.apache.hadoop.yarn.client.YarnClientImpl is started.
13/02/28 14:16:49 INFO mapred.FileInputFormat: Total input paths to process : 1
13/02/28 14:16:50 INFO mapreduce.JobSubmitter: number of splits:2
13/02/28 14:16:50 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
13/02/28 14:16:50 WARN conf.Configuration: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
13/02/28 14:16:50 WARN conf.Configuration: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
13/02/28 14:16:50 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
13/02/28 14:16:50 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
13/02/28 14:16:50 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
13/02/28 14:16:50 WARN conf.Configuration: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
13/02/28 14:16:50 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
13/02/28 14:16:50 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
13/02/28 14:16:50 WARN conf.Configuration: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
13/02/28 14:16:50 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
13/02/28 14:16:50 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
13/02/28 14:16:50 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
13/02/28 14:16:50 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1362021180419_0001
13/02/28 14:16:51 INFO client.YarnClientImpl: Submitted application application_1362021180419_0001 to ResourceManager at /0.0.0.0:8032
13/02/28 14:16:51 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1362021180419_0001/
13/02/28 14:16:51 INFO mapreduce.Job: Running job: job_1362021180419_0001
13/02/28 14:17:16 INFO mapreduce.Job: Job job_1362021180419_0001 running in uber mode : false
13/02/28 14:17:16 INFO mapreduce.Job:  map 0% reduce 0%
13/02/28 14:17:40 INFO mapreduce.Job:  map 50% reduce 0%
13/02/28 14:17:48 INFO mapreduce.Job:  map 0% reduce 0%
13/02/28 14:17:48 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:17:48 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:18:10 INFO mapreduce.Job:  map 50% reduce 0%
13/02/28 14:18:14 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:18:15 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:18:16 INFO mapreduce.Job:  map 0% reduce 0%
13/02/28 14:18:35 INFO mapreduce.Job:  map 50% reduce 0%
13/02/28 14:18:38 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:18:38 INFO mapreduce.Job: Task Id : attempt_1362021180419_0001_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:320)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:533)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:400)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:335)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:157)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1367)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:152)

13/02/28 14:18:39 INFO mapreduce.Job:  map 0% reduce 0%
13/02/28 14:19:06 INFO mapreduce.Job:  map 50% reduce 0%
13/02/28 14:19:11 INFO mapreduce.Job:  map 0% reduce 0%
13/02/28 14:19:11 INFO mapreduce.Job:  map 50% reduce 0%
13/02/28 14:19:11 INFO mapreduce.Job: Job job_1362021180419_0001 failed with state FAILED due to:
13/02/28 14:19:12 INFO mapreduce.Job: Counters: 29
    File System Counters
        FILE: Number of bytes read=120
        FILE: Number of bytes written=70902
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=272
        HDFS: Number of bytes written=122
        HDFS: Number of read operations=4
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=1
    Job Counters
        Failed map tasks=7
        Launched map tasks=8
        Other local map tasks=6
        Rack-local map tasks=2
        Total time spent by all maps in occupied slots (ms)=187378
        Total time spent by all reduces in occupied slots (ms)=0
    Map-Reduce Framework
        Map input records=0
        Map output records=0
        Input split bytes=104
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=271
        CPU time spent (ms)=2260
        Physical memory (bytes) snapshot=69914624
        Virtual memory (bytes) snapshot=425742336
        Total committed heap usage (bytes)=16252928
    File Input Format Counters
        Bytes Read=168
    File Output Format Counters
        Bytes Written=122
13/02/28 14:19:12 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, in.folder = if (is.list(input)) { :
  hadoop streaming failed with error code 1
Calls: dfs.empty -> dfs.size -> to.dfs.path -> mapreduce -> mr
Execution halted
jj@jj-VirtualBox:~/softwares/R/rmr2-master/build$
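One incidental item worth clearing before re-running the check is the CRLF warning on src/Makevars earlier in this output. A minimal sketch of a fix in R (assuming the package source tree is the working directory):

```r
# readLines() strips whatever line endings the file has;
# writeLines() then rewrites the file with plain LF endings.
lines <- readLines("src/Makevars")
writeLines(lines, "src/Makevars")
```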

Antonio Piccolboni

Feb 27, 2013, 11:36:56 PM
to RHadoop Google Group
That's unfortunate but good to know. Could you share your Hadoop build so that I can test it myself? Thanks


Antonio


Edward J. Yoon

Feb 27, 2013, 11:42:16 PM
to rha...@googlegroups.com
Is there any plan to use BSP (Apache Hama), which is a good fit for scientific computation, instead of MapReduce? In k-means clustering, BSP is 1000x faster than MapReduce.

1. http://hama.apache.org/
--
Best Regards, Edward J. Yoon
@eddieyoon

Antonio Piccolboni

Feb 28, 2013, 12:02:15 AM
to rha...@googlegroups.com
This is an interesting question but unrelated to this thread. If you could be so kind as to open a new thread for a completely new question, that would be highly appreciated.


Antonio

Antonio Piccolboni

Mar 19, 2013, 12:47:32 PM
to rha...@googlegroups.com
I don't know if you are still working on this, but I stumbled on MAPREDUCE-1122, and that's certain to break some tests.


Antonio

Austin Chungath

Nov 11, 2013, 4:56:49 AM
to rha...@googlegroups.com
Hi Antonio,

I was trying to get rmr2 working on YARN and I found this thread.
I see that there hasn't been any activity on MAPREDUCE-1122 since Sep 2011.
Is this the only blocker for getting rmr2 to work on YARN? Would rewriting your custom input format using the old API solve this problem?

@Jagat Singh: Have you tried running rmr2 on YARN lately?

Thanks,
Austin

Antonio Piccolboni

Nov 11, 2013, 11:34:55 AM
to RHadoop Google Group
My latest unofficial tests (that is, in standalone mode on my laptop) pass on 2.1.0, which I think defaults to MRv2. I am not sure how that can happen with this bug still open.


Antonio