Container is running beyond physical memory limits


Qi Wang

Jul 30, 2015, 5:56:23 PM
to Druid User
Hi,

I'm trying to ingest data from HDFS, but I keep getting errors like the following. It seems like the task gets too much data to handle, but I only have dates as timestamps, so I can't break the data into smaller time granularities. Any suggestions on how to solve this issue? Thanks!

Container [pid=70947,containerID=container_e22_1432969244945_6026_01_000075] is running beyond physical memory limits. Current usage: 4.0 GB of 4 GB physical memory used; 5.8 GB of 8.4 GB virtual memory used. Killing container.

Here is more log info.

2015-07-30T00:36:19,528 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2015-07-30T00:36:29,554 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 15%
2015-07-30T00:36:37,577 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 17%
2015-07-30T00:36:40,586 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 23%
2015-07-30T00:36:46,601 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 26%
2015-07-30T00:36:49,609 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 32%
2015-07-30T00:36:57,628 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 45%
2015-07-30T00:37:00,635 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 66%
2015-07-30T00:37:03,643 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 67%
2015-07-30T00:37:49,754 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 68%
2015-07-30T00:38:43,888 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 69%
2015-07-30T00:39:35,006 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 70%
2015-07-30T00:40:30,140 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 71%
2015-07-30T00:41:21,270 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 72%
2015-07-30T00:42:16,396 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 73%
2015-07-30T00:45:19,858 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job - Task Id : attempt_1432969244945_6026_r_000000_2, Status : FAILED
Container [pid=77928,containerID=container_e22_1432969244945_6026_01_000077] is running beyond physical memory limits. Current usage: 4.0 GB of 4 GB physical memory used; 5.8 GB of 8.4 GB virtual memory used. Killing container.
Dump of the process-tree for container_e22_1432969244945_6026_01_000077 :
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 77932 77928 77928 77928 (java) 76785 9822 6220464128 1049133 /usr/lib/jvm/j2sdk1.8-oracle/jre/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx3460300800 -Djava.io.tmpdir=/mnt/hdfs_12o/yarn/nm/usercache/qi_wang/appcache/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.123.204.75 60033 attempt_1432969244945_6026_r_000000_2 77 
	|- 77928 77926 77928 77928 (bash) 0 0 9822208 289 /bin/bash -c /usr/lib/jvm/j2sdk1.8-oracle/jre/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Djava.net.preferIPv4Stack=true -Xmx3460300800 -Djava.io.tmpdir=/mnt/hdfs_12o/yarn/nm/usercache/qi_wang/appcache/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.123.204.75 60033 attempt_1432969244945_6026_r_000000_2 77 1>/var/log/hadoop-yarn/container/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077/stdout 2>/var/log/hadoop-yarn/container/application_1432969244945_6026/container_e22_1432969244945_6026_01_000077/stderr  

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

2015-07-30T00:45:20,862 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2015-07-30T00:45:41,910 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 11%
2015-07-30T00:45:44,917 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 13%
2015-07-30T00:45:53,938 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 18%
2015-07-30T00:45:56,945 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 22%
2015-07-30T00:46:03,965 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 25%
2015-07-30T00:46:06,972 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 32%
2015-07-30T00:46:14,992 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 52%
2015-07-30T00:46:17,999 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 67%
2015-07-30T00:47:12,128 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 68%
2015-07-30T00:48:06,250 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 69%
2015-07-30T00:49:01,361 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 70%
2015-07-30T00:49:52,465 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 71%
2015-07-30T00:50:43,574 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 72%
2015-07-30T00:51:35,685 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 73%
2015-07-30T00:54:30,044 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
2015-07-30T00:54:31,052 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job - Job job_1432969244945_6026 failed with state FAILED due to: Task failed task_1432969244945_6026_r_000000
Job failed as tasks failed. failedMaps:0 failedReduces:1

2015-07-30T00:54:31,139 INFO [task-runner-0] org.apache.hadoop.mapreduce.Job - Counters: 38
	File System Counters
		FILE: Number of bytes read=0
		FILE: Number of bytes written=810738421
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=2326661134
		HDFS: Number of bytes written=0
		HDFS: Number of read operations=219
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=0
	Job Counters 
		Failed reduce tasks=4
		Launched map tasks=73
		Launched reduce tasks=4
		Data-local map tasks=61
		Rack-local map tasks=12
		Total time spent by all maps in occupied slots (ms)=1428866
		Total time spent by all reduces in occupied slots (ms)=2184837
		Total time spent by all map tasks (ms)=1428866
		Total time spent by all reduce tasks (ms)=2184837
		Total vcore-seconds taken by all map tasks=1428866
		Total vcore-seconds taken by all reduce tasks=2184837
		Total megabyte-seconds taken by all map tasks=5852635136
		Total megabyte-seconds taken by all reduce tasks=8949092352
	Map-Reduce Framework
		Map input records=30597989
		Map output records=30597989
		Map output bytes=3795354607
		Map output materialized bytes=802113853
		Input split bytes=10001
		Combine input records=0
		Spilled Records=30597989
		Failed Shuffles=0
		Merged Map outputs=0
		GC time elapsed (ms)=277433
		CPU time spent (ms)=5533170
		Physical memory (bytes) snapshot=123575734272
		Virtual memory (bytes) snapshot=395487993856
		Total committed heap usage (bytes)=224425672704
	File Input Format Counters 
		Bytes Read=2326651133

Gian Merlino

Jul 30, 2015, 6:43:39 PM
to druid...@googlegroups.com
-Xmx3460300800 (about 3300 MB) is a bit high for a 4 GB YARN container, given that the Druid indexer allocates a bunch of off-heap storage too. I usually use these settings:

mapreduce.map.memory.mb=2048
mapreduce.map.java.opts="-server -Xmx1536m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
mapreduce.reduce.memory.mb=6144
mapreduce.reduce.java.opts="-server -Xmx2560m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
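
The arithmetic behind those numbers, roughly (my own back-of-the-envelope sketch, not from your logs; actual off-heap overhead varies):

map:    2048 MB container - 1536 MB heap = 512 MB headroom for off-heap use
reduce: 6144 MB container - 2560 MB heap = 3584 MB headroom for off-heap use

Your failing setup: 4096 MB container - 3300 MB heap (-Xmx3460300800 bytes) leaves only ~800 MB of headroom, which the indexer's off-heap allocations can easily blow through.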


Qi Wang

Jul 30, 2015, 7:50:53 PM
to Druid User, gianm...@gmail.com
Hi Gian,

Thanks for the help! Where do I put those configurations? I tried putting them in config/overlord/runtime.properties, but it doesn't seem to work. I also saw the Hadoop configuration page http://druid.io/docs/latest/configuration/hadoop.html, but we haven't really used it before.

Thanks!

Gian Merlino

Jul 30, 2015, 8:00:57 PM
to Qi Wang, Druid User
If you have your Hadoop config XMLs on the classpath, you can add those configs to mapred-site.xml.
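
For example, the reduce-side settings above would look something like this in mapred-site.xml (a sketch; substitute whatever values fit your cluster):

<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>6144</value>
</property>
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-server -Xmx2560m -Duser.timezone=UTC -Dfile.encoding=UTF-8 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps</value>
</property>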

Qi Wang

Jul 30, 2015, 8:15:02 PM
to Druid User, steve....@gmail.com, gianm...@gmail.com
I see. Are there any settings we can apply on the Druid side instead of the YARN side?

Gian Merlino

Jul 30, 2015, 8:16:51 PM
to Qi Wang, Druid User
They can still be added on the Druid side: you can put different XMLs on the Druid classpath than on the YARN NodeManager classpath. The ones on the Druid classpath will take priority, since they'll be the ones present at job submission.

Although if you'd rather use the same XMLs everywhere, you can instead add the properties to the "jobProperties" of the "tuningConfig" in the Druid indexing spec.
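
The relevant part of the spec would look roughly like this (a minimal sketch; the rest of the tuningConfig and the surrounding spec are omitted):

"tuningConfig" : {
  "type" : "hadoop",
  "jobProperties" : {
    "mapreduce.map.memory.mb" : "2048",
    "mapreduce.map.java.opts" : "-server -Xmx1536m -Duser.timezone=UTC -Dfile.encoding=UTF-8",
    "mapreduce.reduce.memory.mb" : "6144",
    "mapreduce.reduce.java.opts" : "-server -Xmx2560m -Duser.timezone=UTC -Dfile.encoding=UTF-8"
  }
}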

Qi Wang

Jul 30, 2015, 8:18:56 PM
to Druid User, steve....@gmail.com, gianm...@gmail.com
Got it. Will try. Thanks!