java.lang.OutOfMemoryError: Java heap space

Vicky

Jun 26, 2015, 9:43:08 AM

to rha...@googlegroups.com

I am getting an OutOfMemoryError during the map phase. I tried different memory settings, following the suggestions for hadoop.settings, but still no luck. Any help is greatly appreciated.

Config details:

CDH 5.4.0

Hadoop 2.6.0

RMR2 3.3.1

Streaming Jar (hadoop-streaming-2.6.0-cdh5.4.0.jar)

The following are the Memory settings for the job:

mapreduce.map.memory.mb=8192m
mapreduce.map.java.opts=-Xmx4096m
mapreduce.reduce.memory.mb=8192m
mapreduce.reduce.java.opts=-Xmx4096m
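
For readers who want to try similar values on a single job, here is a minimal sketch of how such properties can be handed to rmr2 through the `backend.parameters` argument, as repeated `D = "key=value"` entries. It is an illustration under assumptions, not the configuration used in this thread: the `hadoop.opts` vector is a hypothetical name, and the `memory.mb` values are written as plain megabyte counts (no `m` suffix), which is how those properties are normally specified as far as I know.

```r
# Hypothetical illustration: Hadoop properties as a named character vector.
# The memory.mb values are plain integers (megabytes); the java.opts values
# are ordinary JVM flags.
hadoop.opts <- c(
  "mapreduce.map.memory.mb"    = "8192",
  "mapreduce.map.java.opts"    = "-Xmx4096m",
  "mapreduce.reduce.memory.mb" = "8192",
  "mapreduce.reduce.java.opts" = "-Xmx4096m"
)

# rmr2 expects backend.parameters = list(hadoop = list(D = "k=v", D = "k=v", ...)),
# so every entry is named "D" and holds one "key=value" string.
backend.parameters <- list(
  hadoop = setNames(
    as.list(paste(names(hadoop.opts), hadoop.opts, sep = "=")),
    rep("D", length(hadoop.opts))
  )
)

str(backend.parameters)

# With rmr2 loaded and a working cluster, the list would be passed per job:
# library(rmr2)
# out <- mapreduce(input = some.input, map = some.map,
#                  backend.parameters = backend.parameters)
```

Note that the container size (`memory.mb`) has to cover both the JVM heap (`-Xmx`) and the R process that streaming launches, so leaving headroom between the two values, as above, is the usual approach.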

The stack trace:

15/06/25 18:29:44 INFO mapreduce.Job: Running job: job_1435165845246_0880
15/06/25 18:29:50 INFO mapreduce.Job: Job job_1435165845246_0880 running in uber mode : false
15/06/25 18:29:50 INFO mapreduce.Job:  map 0% reduce 0%
15/06/25 18:30:01 INFO mapreduce.Job:  map 33% reduce 0%
15/06/25 18:30:02 INFO mapreduce.Job:  map 67% reduce 0%
15/06/25 18:30:39 INFO mapreduce.Job: Task Id : attempt_1435165845246_0880_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:336)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:212)
	at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
	at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:51)
	at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:378)

Garbage collection details:

Garbage collection 141 = 124+10+7 (level 2) ... 
23.4 Mbytes of cons cells used (59%)
4.2 Mbytes of vectors used (47%)
Dotted pair list of 1
 $ : language rmr.str(gc(verbose = TRUE, reset = FALSE))
gc(verbose = TRUE, reset = FALSE)
 num [1:2, 1:6] 437558 543467 23.4 4.2 741108 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "Ncells" "Vcells"
  ..$ : chr [1:6] "used" "(Mb)" "gc trigger" "(Mb)" ...
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 437558 23.4     741108 39.6   531268 28.4
Vcells 543467  4.2    1162592  8.9  1162592  8.9

Antonio Piccolboni

Jul 1, 2015, 1:52:35 PM

to rha...@googlegroups.com, vikram...@gmail.com

Please provide a test case. Thanks.

fishball

Sep 29, 2015, 9:33:11 AM

to RHadoop, vikram...@gmail.com

I have the same issue just running:

small.ints <- to.dfs(1:100)

fs.ptr <- mapreduce(input=small.ints, map=function(k,v) cbind(v,v^2))

I've also adjusted my hadoop.settings as Vicky mentioned, to no avail.

More specifically, my issue occurs when running this code from an R Markdown document, but maybe the same solution applies. Thanks.
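
As a possible way to narrow this down, the same two-line test case can be run as a self-contained script against rmr2's local backend, which bypasses Hadoop streaming entirely. This is only a sketch and assumes rmr2 is installed; if it succeeds locally, the failure is more likely on the Hadoop/JVM side than in the R code itself.

```r
library(rmr2)

# Run map and reduce in the local R process instead of on the cluster.
rmr.options(backend = "local")

# The test case from the post above.
small.ints <- to.dfs(1:100)
fs.ptr <- mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))

# Read the result back; from.dfs returns a list with key and val components.
result <- from.dfs(fs.ptr)
str(result)

# Switching back to the cluster for comparison:
# rmr.options(backend = "hadoop")
```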
