java.lang.OutOfMemoryError: Java heap space

Vicky

Jun 26, 2015, 9:43:08 AM
to rha...@googlegroups.com
I'm getting an OutOfMemory error during the map phase. I've tried different memory settings per the suggestions in hadoop.settings, but still no luck. Any help is greatly appreciated.

Config details:
CDH 5.4.0
Hadoop 2.6.0
RMR2 3.3.1
Streaming Jar (hadoop-streaming-2.6.0-cdh5.4.0.jar)

The following are the memory settings for the job:

mapreduce.map.memory.mb=8192m
mapreduce.map.java.opts=-Xmx4096m
mapreduce.reduce.memory.mb=8192m
mapreduce.reduce.java.opts=-Xmx4096m
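
A side note worth checking (a guess from the settings above, not a confirmed fix): mapreduce.map.memory.mb and mapreduce.reduce.memory.mb expect a plain integer number of megabytes, so the "m" suffix may be rejected or silently ignored, leaving the containers at their default size. In rmr2 the same settings can also be passed per job through backend.parameters instead of editing hadoop.settings globally. A minimal sketch, where my.input and the identity map are placeholders rather than the actual job:

library(rmr2)

# Hedged sketch: per-job memory settings via rmr2's backend.parameters.
# Container sizes are plain integers (MB, no suffix); the JVM -Xmx flags
# do take a suffix and should stay well below the container size to
# leave headroom for non-heap memory.
out <- mapreduce(
  input = my.input,                   # placeholder input
  map = function(k, v) keyval(k, v),  # placeholder map
  backend.parameters = list(hadoop = list(
    D = "mapreduce.map.memory.mb=8192",
    D = "mapreduce.map.java.opts=-Xmx4096m",
    D = "mapreduce.reduce.memory.mb=8192",
    D = "mapreduce.reduce.java.opts=-Xmx4096m"
  ))
)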

The stack trace:

15/06/25 18:29:44 INFO mapreduce.Job: Running job: job_1435165845246_0880
15/06/25 18:29:50 INFO mapreduce.Job: Job job_1435165845246_0880 running in uber mode : false
15/06/25 18:29:50 INFO mapreduce.Job:  map 0% reduce 0%
15/06/25 18:30:01 INFO mapreduce.Job:  map 33% reduce 0%
15/06/25 18:30:02 INFO mapreduce.Job:  map 67% reduce 0%
15/06/25 18:30:39 INFO mapreduce.Job: Task Id : attempt_1435165845246_0880_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:336)
	at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:535)
	at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
	at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.OutOfMemoryError: Java heap space
	at org.apache.hadoop.typedbytes.TypedBytesInput.readRawBytes(TypedBytesInput.java:212)
	at org.apache.hadoop.typedbytes.TypedBytesInput.readRaw(TypedBytesInput.java:152)
	at org.apache.hadoop.streaming.io.TypedBytesOutputReader.readKeyValue(TypedBytesOutputReader.java:51)
	at org.apache.hadoop.streaming.PipeMapRed$MROutputThread.run(PipeMapRed.java:378)

Garbage collection details:

Garbage collection 141 = 124+10+7 (level 2) ... 
23.4 Mbytes of cons cells used (59%)
4.2 Mbytes of vectors used (47%)
Dotted pair list of 1
 $ : language rmr.str(gc(verbose = TRUE, reset = FALSE))
gc(verbose = TRUE, reset = FALSE)
 num [1:2, 1:6] 437558 543467 23.4 4.2 741108 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:2] "Ncells" "Vcells"
  ..$ : chr [1:6] "used" "(Mb)" "gc trigger" "(Mb)" ...
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 437558 23.4     741108 39.6   531268 28.4
Vcells 543467  4.2    1162592  8.9  1162592  8.9
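
Reading the trace and the gc() output together (an interpretation, not a confirmed diagnosis): R itself is using well under 30 MB, while the OutOfMemoryError is thrown in TypedBytesInput.readRawBytes, i.e. in the streaming JVM as it allocates a buffer for a single typed-bytes record coming back from the R mapper. That pattern usually points to one very large key/value pair rather than overall heap pressure. If the map builds one big combined object, emitting it as many smaller keyval pairs keeps each serialized record small. A hypothetical sketch, where build.result stands in for whatever the real map computes:

library(rmr2)

chunked.map <- function(k, v) {
  result <- build.result(v)  # hypothetical large data frame
  # split the output into ~10,000-row pieces so no single typed-bytes
  # record has to be buffered whole by the streaming JVM
  idx <- ceiling(seq_len(nrow(result)) / 10000)
  keyval(unique(idx), split(result, idx))
}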

Antonio Piccolboni

Jul 1, 2015, 1:52:35 PM
to rha...@googlegroups.com, vikram...@gmail.com
Please provide a test case. Thanks.

fishball

Sep 29, 2015, 9:33:11 AM
to RHadoop, vikram...@gmail.com
I have the same issue just running:

library(rmr2)
small.ints <- to.dfs(1:100)
fs.ptr <- mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))

I've also adjusted my hadoop.settings as Vicky mentioned, to no avail.

More specifically, my issue occurs while running this code in an R Markdown document, but perhaps the same solution applies. Thanks.
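
As a sanity check (a hedged suggestion, using rmr2's documented local backend), the same snippet can be run without Hadoop at all; if it succeeds locally, the problem lies in the Hadoop/streaming configuration rather than the R code:

library(rmr2)

# Run the job inside the current R session, bypassing Hadoop streaming.
rmr.options(backend = "local")
small.ints <- to.dfs(1:100)
fs.ptr <- mapreduce(input = small.ints, map = function(k, v) cbind(v, v^2))
from.dfs(fs.ptr)                 # inspect the result
rmr.options(backend = "hadoop")  # switch back for the cluster run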