RHadoop job running beyond physical memory limits

Harsha V.

Mar 19, 2014, 9:57:08 AM
to rha...@googlegroups.com
The Hadoop streaming job that I am running via RHadoop (rmr2) fails with the following out-of-memory error.

      14/03/18 21:02:49 INFO mapreduce.Job:  map 100% reduce 66%
      14/03/18 21:02:54 INFO mapreduce.Job:  map 100% reduce 67%
      14/03/18 22:24:32 INFO mapreduce.Job: Task Id : attempt_1395174999269_0003_r_000073_0, Status :  FAILED
     ...
      Container killed on request. Exit code is 143

The reducer seems to be running out of physical memory, even though none of the map output keys has a large number of values; all the data for any particular key should amount to a few tens of megabytes at most.

Here is the code that sets the MapReduce options.

     hadoop.options <- list(D="mapred.reduce.tasks=108",
                            D="mapred.map.tasks=200",
                            D="mapreduce.reduce.java.opts=-Xmx2048m",
                            D="mapred.compress.map.output=true",
                            D="mapreduce.map.memory.mb=4096",
                            D="mapreduce.reduce.memory.mb=8192",
                            D="mapred.reduce.parallel.copies=10",
                            D="mapred.reduce.slowstart.completed.maps=0.8")

I have seen the answers to similar questions, and it appears that I am setting my Java and reduce memory options correctly. Any insight into debugging this error would be much appreciated.

Harsha V.

Mar 19, 2014, 12:28:18 PM
to rha...@googlegroups.com
Here is the full error log. (In the process-tree dump below, assuming 4 KB pages, the R process accounts for roughly 1,861,984 resident pages ≈ 7.1 GB and the JVM for roughly 511,248 pages ≈ 2.0 GB, which together make up the reported 9.1 GB.)

14/03/18 21:02:54 INFO mapreduce.Job:  map 100% reduce 67%
14/03/18 22:24:32 INFO mapreduce.Job: Task Id : attempt_1395174999269_0003_r_000073_0, Status : FAILED
Container [pid=20617,containerID=container_1395174999269_0003_01_001031] is running beyond physical memory limits. Current usage: 9.1 GB of 8 GB physical memory used; 9.8 GB of 16.8 GB virtual memory used. Killing container.
Dump of the process-tree for container_1395174999269_0003_01_001031 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 20617 3473 20617 20617 (bash) 0 0 65425408 276 /bin/bash -c /usr/jdk64/jdk1.6.0_31/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048m -Djava.io.tmpdir=/hadooptmp/yarn/local/usercache/harshav/appcache/application_1395174999269_0003/container_1395174999269_0003_01_001031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1395174999269_0003/container_1395174999269_0003_01_001031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 192.168.6.46 38160 attempt_1395174999269_0003_r_000073_0 1031 1>/hadoop/yarn/log/application_1395174999269_0003/container_1395174999269_0003_01_001031/stdout 2>/hadoop/yarn/log/application_1395174999269_0003/container_1395174999269_0003_01_001031/stderr  
	|- 20628 20617 20617 20617 (java) 13496 815 2571829248 511248 /usr/jdk64/jdk1.6.0_31/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx2048m -Djava.io.tmpdir=/hadooptmp/yarn/local/usercache/harshav/appcache/application_1395174999269_0003/container_1395174999269_0003_01_001031/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/hadoop/yarn/log/application_1395174999269_0003/container_1395174999269_0003_01_001031 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 192.168.6.46 38160 attempt_1395174999269_0003_r_000073_0 1031 
	|- 23513 23495 20617 20617 (cat) 29 185 60411904 133 cat 
	|- 23495 20628 20617 20617 (R) 487542 6828 7784603648 1861984 /usr/local/lib64/R/bin/exec/R --slave --no-restore --file=./rmr-streaming-reduce77093ef269ef --args 

Container killed on request. Exit code is 143

Antonio Piccolboni

Mar 19, 2014, 12:47:14 PM
to RHadoop Google Group
What you are doing looks reasonable to me, so we need more information. Maybe you could add an rmr.str(gc()) call in your reduce function. This will perform a garbage collection but also print the relevant information to stderr, which you can access later. Also, please provide the rmr and Hadoop versions you are running.

The gold standard is always to give me something I can run myself (a complete program with input); showing the code is the second-best option. I am aware that in certain cases that's impossible, but then it also becomes impossible to debug a more-than-trivial problem. Most likely R is allocating too much memory (> 6 GB). It could be an error in your code or an error in my code, and it's hard to separate the two without me tampering with the rmr2 build.

Finally, keep in mind that a large number of small reduce groups is a slow programming pattern, simply because the reduce function is interpreted. But one thing at a time: let's figure out the memory first, then the speed.
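
To be concrete about the rmr.str(gc()) suggestion, something along these lines should do (the keyval() line is a placeholder for your actual per-key logic):

     reduce <- function(k, vv) {
       rmr.str(gc())   # runs a garbage collection and dumps its report to stderr
       keyval(k, vv)   # placeholder: your real computation goes here
     }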


Antonio



Harsha V.

Mar 19, 2014, 2:30:51 PM
to rha...@googlegroups.com, ant...@piccolboni.info
Antonio,

rmr2 is version 2.3.0
Hadoop version 2.2.0.2.0.6.0-101

Unfortunately, I can't reproduce the error easily: on smaller inputs the job finishes fine, and on the full input it takes hours before failing. I am running the job with your garbage-collection statement now and will update with the results.

Thanks for the quick response.

Antonio Piccolboni

Mar 19, 2014, 2:32:24 PM
to RHadoop Google Group
Can you upgrade to rmr 3.0.0?

Harsha V.

Mar 19, 2014, 3:02:40 PM
to rha...@googlegroups.com, ant...@piccolboni.info
Here is the tail of the output from the rmr.str(gc()) calls, taken from a reducer that failed with the out-of-memory error.

gc()
 num [1:2, 1:6] 7.38e+05 3.82e+06 3.95e+01 2.92e+01 1.27e+06 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "Ncells" "Vcells"
  ..$ : chr [1:6] "used" "(Mb)" "gc trigger" "(Mb)" ...
Dotted pair list of 13
 $ : language (function() {     invisible(if (is.null(formals(load)$verbose)) load("./rmr-local-env50aa18c54f41") else load("./rmr-local-env50aa18c54f41",  ...
 $ : language rmr2:::reduce.loop(reduce = reduce, vectorized = vectorized.reduce, keyval.reader = default.reader(),      keyval.writer = output.writer(), profile = profile.nodes)
 $ : language apply.reduce(complete, red.as.kv)
 $ : language c.keyval(reduce.keyval(kv, reduce))
 $ : language reduce.keyval(kv, reduce)
 $ : language mapply(FUN, keys(kvs), values(kvs), SIMPLIFY = FALSE)
 $ : language (function (...)  do.call(FUN, c(.orig, list(...))))(dots[[1L]][[14L]], dots[[2L]][[14L]])
 $ : language do.call(FUN, c(.orig, list(...)))
 $ : language (function (k, vv, reduce)  as.keyval(reduce(k, vv)))(reduce = function (k, v)  ...
 $ : language as.keyval(reduce(k, vv))
 $ : language is.keyval(x)
 $ : language reduce(k, vv)
 $ :length 2 rmr.str(gc())
  ..- attr(*, "srcref")=Class 'srcref'  atomic [1:8] 22 5 22 17 5 17 22 22
  .. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x1e0c7440> 
gc()
 num [1:2, 1:6] 7.38e+05 3.82e+06 3.95e+01 2.92e+01 1.27e+06 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "Ncells" "Vcells"
  ..$ : chr [1:6] "used" "(Mb)" "gc trigger" "(Mb)" ...
Dotted pair list of 13
 $ : language (function() {     invisible(if (is.null(formals(load)$verbose)) load("./rmr-local-env50aa18c54f41") else load("./rmr-local-env50aa18c54f41",  ...
 $ : language rmr2:::reduce.loop(reduce = reduce, vectorized = vectorized.reduce, keyval.reader = default.reader(),      keyval.writer = output.writer(), profile = profile.nodes)
 $ : language apply.reduce(complete, red.as.kv)
 $ : language c.keyval(reduce.keyval(kv, reduce))
 $ : language reduce.keyval(kv, reduce)
 $ : language mapply(FUN, keys(kvs), values(kvs), SIMPLIFY = FALSE)
 $ : language (function (...)  do.call(FUN, c(.orig, list(...))))(dots[[1L]][[15L]], dots[[2L]][[15L]])
 $ : language do.call(FUN, c(.orig, list(...)))
 $ : language (function (k, vv, reduce)  as.keyval(reduce(k, vv)))(reduce = function (k, v)  ...
 $ : language as.keyval(reduce(k, vv))
 $ : language is.keyval(x)
 $ : language reduce(k, vv)
 $ :length 2 rmr.str(gc())
  ..- attr(*, "srcref")=Class 'srcref'  atomic [1:8] 22 5 22 17 5 17 22 22
  .. .. ..- attr(*, "srcfile")=Classes 'srcfilecopy', 'srcfile' <environment: 0x1e0c7440> 
gc()
 num [1:2, 1:6] 7.38e+05 3.82e+06 3.95e+01 2.92e+01 1.27e+06 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "Ncells" "Vcells"
  ..$ : chr [1:6] "used" "(Mb)" "gc trigger" "(Mb)" ...

Antonio Piccolboni

Mar 19, 2014, 3:08:43 PM
to RHadoop Google Group

On Wed, Mar 19, 2014 at 12:02 PM, Harsha V. <hve...@gmail.com> wrote:
3.95e+01 2.92

This output, albeit truncated, suggests R is using a modest amount of memory, at least by today's standards: those two numbers are from the "(Mb)" used column of gc(), roughly 40 MB of Ncells and 29 MB of Vcells. Your configuration seems to reserve several times that amount. Not sure what to think at the moment.

Antonio

Antonio Piccolboni

Mar 19, 2014, 3:21:39 PM
to rha...@googlegroups.com, ant...@piccolboni.info
I would suggest some system monitoring. You probably have more sophisticated monitoring software, but one way is to ssh into a node and run top during a failing job. How many processes are running, particularly R and Java? How much memory are they taking? Does this information fit your expectations?


Antonio

Antonio Piccolboni

Mar 19, 2014, 3:28:41 PM
to rha...@googlegroups.com, ant...@piccolboni.info
Wait a minute, have you read this? I am just asking because you are using a mix of mapred.* and mapreduce.* properties. I thought that with YARN the mapred.* properties were down and out, and that post says as much. Other than that, there isn't additional information in that post beyond what you have already tried.


Antonio

Harsha V.

Mar 19, 2014, 3:41:16 PM
to rha...@googlegroups.com, ant...@piccolboni.info
I know that the mapred properties are having an effect (for example, the number of reducers is what is specified in hadoop.options). Can you clarify whether rmr2 prefers MapReduce version 1 or version 2?

Harsha V.

Mar 19, 2014, 3:51:45 PM
to rha...@googlegroups.com, ant...@piccolboni.info
I changed all the options to their "mapreduce" form. However, I still get warnings saying "mapred.xxx.yyy is deprecated". Does that have to do with how rmr calls Hadoop streaming, i.e., using MapReduce version 1 APIs as opposed to version 2?


Antonio Piccolboni

Mar 19, 2014, 3:58:47 PM
to RHadoop Google Group
It could. If you tell me exactly which property, and particularly if the warning suggests an alternative, I can try to figure it out. Streaming itself also triggers a few warnings, I think some of that type; for those we need to report upstream to that project (I will do it if necessary). A deprecation warning is innocuous: by definition it only points to potential problems in the future.



Harsha V.

Mar 19, 2014, 4:51:00 PM
to rha...@googlegroups.com, ant...@piccolboni.info
Here are my new options.

  hadoop.options <- list(D="mapreduce.job.reduces=1080",
                         D="mapreduce.reduce.java.opts=-Xmx2048m",
                         D="mapreduce.map.output.compress=true",
                         D="mapreduce.map.memory.mb=4096",
                         D="mapreduce.reduce.memory.mb=8192",
                         D="mapreduce.reduce.shuffle.parallelcopies=10",
                         D="mapreduce.job.reduce.slowstart.completedmaps=0.8")

And here are the warnings.
14/03/19 19:46:15 INFO Configuration.deprecation: user.name is deprecated. Instead, use mapreduce.job.user.name
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.cache.files.filesizes is deprecated. Instead, use mapreduce.job.cache.files.filesizes
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.cache.files is deprecated. Instead, use mapreduce.job.cache.files
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.job.name is deprecated. Instead, use mapreduce.job.name
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.textoutputformat.separator is deprecated. Instead, use mapreduce.output.textoutputformat.separator
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.cache.files.timestamps is deprecated. Instead, use mapreduce.job.cache.files.timestamps
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
14/03/19 19:46:15 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir

The job still died with the "running beyond physical memory limits" error. I am at a loss as to the cause. Thanks, in any case, for all your help.

Antonio Piccolboni

Mar 19, 2014, 4:59:36 PM
to RHadoop Google Group
What does "prefer" mean when applied to a package? rmr2 only uses hadoop streaming. Streaming has been ported from mr1 to mr2. We now test on mr2 because that's the present and future. 2.3.0 was tested on mr1 only because when it was released there were still enough bugs in streaming on mr2 that we couldn't run it. Upgrading one part of you system aggressively while staying put on another puts you in uncharted territory, but I doubt that's the problem here. 
 



Antonio Piccolboni

Mar 19, 2014, 5:04:05 PM
to RHadoop Google Group
A small subset of those could be set by rmr; I will look into that. As for your memory problem, let me know if you learn anything from system monitoring, and I will keep researching. We certainly want to figure out this container issue. Potentially, it is a better situation than mr1, because here we can deliberately allocate extra room for R; that was not possible under mr1, where we had to curtail the number of map and reduce slots.


A


Saar Golde

Mar 20, 2014, 7:53:05 AM
to rha...@googlegroups.com, ant...@piccolboni.info
I would suggest validating that the input to each reducer is indeed as small as you think it is. The logic is:
- There is a huge discrepancy between the memory requirements, the non-failing usage, and the size of the allocated memory when the task fails.
- The process may be killed by the container before the R code even starts running over that specific chunk of data.

A null or missing value for one of the keys (happens a lot with real data...) will result in one key having lots and lots of data. Before you go down the rat-hole of profiling any further, I would recommend running a simplified version of your code that just counts values per key, along the lines of the sketch below, to make sure you're not wasting your time.
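
A minimal sketch of such a counting job (the input path is a placeholder, and it assumes your map key is the grouping key):

     counts <- mapreduce(input = "/path/to/input",            # placeholder
                         map = function(k, v) keyval(k, 1),   # one count per record
                         reduce = function(k, vv) keyval(k, sum(unlist(vv))),
                         combine = TRUE)                      # counts are associative, so a combiner is safe
     from.dfs(counts)                                         # pull the per-key counts back for inspection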

Harsha V.

Mar 20, 2014, 9:15:17 AM
to rha...@googlegroups.com, ant...@piccolboni.info
Antonio,

It turned out that the reducers that failed with out-of-memory errors were indeed getting larger values than expected. The job ran to completion when I put a threshold on the number of rows in the value in the reducer and returned NULL whenever the number of records exceeded it. Even so, the largest value any reducer got was a data frame of 3 * 10^5 rows and 10 columns (all the values were strings of about length 10).
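
Roughly, the guard looks like this (max.rows and process() are stand-ins for my actual cutoff and computation):

     reduce <- function(k, v) {
       if (nrow(v) > max.rows) return(NULL)   # emit nothing for oversized keys
       keyval(k, process(v))
     }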

Thanks for helping me debug this problem. I am still unsure why a reducer with such a small input finds 8 GB insufficient. Also, I will upgrade to rmr2 3.0.0 soon. When I said "prefer", I wanted to know whether rmr2 was tested with only one of the two versions of MapReduce.
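
For what it's worth, a synthetic data frame of that shape comes nowhere near 8 GB (a rough check with made-up strings):

     x  <- sprintf("str%07d", seq_len(3e6))   # 3e6 distinct 10-character strings
     df <- as.data.frame(matrix(x, ncol = 10), stringsAsFactors = FALSE)
     print(object.size(df), units = "Mb")     # on the order of a couple hundred Mb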