Memory cgroup out of memory: Kill process


ABHISHEK PALIWAL

Apr 7, 2016, 7:49:28 AM
to inside...@googlegroups.com

Hi,
How can we fix a cgroup out-of-memory problem? It seems one of our applications is being killed, and as a result the failure escalates to a board restart.
According to the logs below, the application appears to have been killed because it hit the memory limit configured for it by the cgroup.
0004: Apr 04 07:28:04 000400 kernel: loco_sps_evo_ed invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
0004: Apr 04 07:28:04 000400 kernel: loco_sps_evo_ed cpuset=00050001 mems_allowed=0
0004: Apr 04 07:28:04 000400 kernel: CPU: 19 PID: 29944 Comm: loco_sps_evo_ed Tainted: G           O 3.14.39ltsi-WR7.0.0.11_standard #1
0004: Apr 04 07:28:04 000400 kernel: Call Trace:
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96b770] [c00000000501a6fc] .show_stack+0x168/0x278 (unreliable)
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96b860] [c000000005870460] .dump_stack+0x9c/0xfc
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96b8e0] [c000000005173a5c] .dump_header.isra.9+0x9c/0x250
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96b9b0] [c000000005174218] .oom_kill_process+0x2d8/0x4a0
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96ba80] [c0000000051df4b0] .mem_cgroup_oom_synchronize+0x644/0x7a8
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96bb90] [c000000005174a04] .pagefault_out_of_memory+0x1c/0x84
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96bc00] [c0000000058665b4] .do_page_fault+0x7e8/0x86c
0004: Apr 04 07:28:04 000400 kernel: [c00000001e96be30] [c00000000502f1dc] storage_fault_common+0x20/0x44
0004: Apr 04 07:28:04 000400 kernel: Task in /00000005/00050001 killed as a result of limit of /00000005/00050001
0004: Apr 04 07:28:04 000400 kernel: memory: usage 205824kB, limit 205824kB, failcnt 22
0004: Apr 04 07:28:04 000400 kernel: memory+swap: usage 205824kB, limit 18014398509481983kB, failcnt 0
0004: Apr 04 07:28:04 000400 kernel: kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
0004: Apr 04 07:28:05 000400 kernel: Memory cgroup stats for /00000005/00050001: cache:268KB rss:205556KB rss_huge:0KB mapped_file:264KB writeback:0KB swap:0KB inactive_anon:268KB active_anon:205500KB inactive_file:0KB active_file:0KB unevictable:0KB
0004: Apr 04 07:28:05 000400 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
0004: Apr 04 07:28:05 000400 kernel: [28771]     0 28771   194002    52320     157        0             0 loco-sps-ip-lx-
0004: Apr 04 07:28:05 000400 kernel: Memory cgroup out of memory: Kill process 28771 (loco-sps-ip-lx-) score 989 or sacrifice child
0004: Apr 04 07:28:05 000400 kernel: Killed process 28771 (loco-sps-ip-lx-) total-vm:776008kB, anon-rss:205268kB, file-rss:4012kB
0004: Apr 04 07:28:05 000400 pghd[28193]: program (0x50001) terminated
0004: Apr 04 07:28:05 000400 pghd[28193]: pgh_cb_pgmterm for pid=0x50001,raw status=9
0004: Apr 04 07:28:05 000400 pghd[28193]: Program 1 terminated abnormally by signal for pid=0x50001
0004: Apr 04 07:28:05 000400 pghd[28193]: Program terminated abnormally with signal number=9 for pid=0x50001
0004: Apr 04 07:28:05 000400 pghd[28193]: pgh_cb_pgmterm 3 0x25e5
0004: Apr 04 07:28:05 000400 TRI_SERVER[28198]: rlog: $ Restart request$ 2016-04-04 07:28:05$ Board Manager$ - $ Warm$ -$ -$ Restart of program identity:CXC1322156%3_P91A132, container handle:CXC1322156%3_P91A132/1 program handle:loco-sps-ip-lx-lm/1 with escalation set to board restart.
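To make the reading of the log concrete, here is a small sketch that parses the three accounting lines from the OOM report and lists which counter had reached its limit. The sample text is copied from the log with the timestamp prefix trimmed and each entry written on one line; the function names `parse_counters` and `exhausted` are made up for this illustration. For reference, 205824 kB is exactly 201 MiB.

```python
import re

# Accounting lines from the OOM report (prefix trimmed, one entry per line).
LOG = """\
kernel: memory: usage 205824kB, limit 205824kB, failcnt 22
kernel: memory+swap: usage 205824kB, limit 18014398509481983kB, failcnt 0
kernel: kmem: usage 0kB, limit 18014398509481983kB, failcnt 0
"""

# Matches "<counter>: usage <n>kB, limit <n>kB, failcnt <n>" after "kernel: ".
PATTERN = re.compile(
    r"kernel: (?P<counter>[\w+]+): usage (?P<usage>\d+)kB, "
    r"limit (?P<limit>\d+)kB, failcnt (?P<failcnt>\d+)"
)


def parse_counters(text):
    """Return {counter: (usage_kb, limit_kb, failcnt)} for each matching line."""
    counters = {}
    for match in PATTERN.finditer(text):
        counters[match["counter"]] = (
            int(match["usage"]),
            int(match["limit"]),
            int(match["failcnt"]),
        )
    return counters


def exhausted(counters):
    """Names of the counters whose usage has reached the configured limit."""
    return [name for name, (usage, limit, _) in counters.items() if usage >= limit]


counters = parse_counters(LOG)
print(exhausted(counters))  # prints ['memory']
```

Only the plain `memory` counter is at its limit (with a non-zero failcnt); the `memory+swap` and `kmem` limits are effectively unlimited, which is consistent with the "killed as a result of limit of /00000005/00050001" line.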
In this case we do not get any core dump.
We are unsure how to investigate such problems without a core dump or a snapshot of the system.

Could you please check and let us know whether there is a way to configure or modify the kernel so that it generates a dump in this scenario?
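One possible way to get at least a partial snapshot, sketched here under assumptions, is to record the cgroup's own memory counters from user space before the limit is reached. The file names below are those of the cgroup v1 memory controller (the log shows a 3.14 kernel); the mount point `/sys/fs/cgroup/memory` is an assumption and may differ on this board, and the helper names `read_counters` and `near_limit` are hypothetical.

```python
from pathlib import Path

# Counter files exposed by the cgroup v1 memory controller.
COUNTER_FILES = (
    "memory.usage_in_bytes",
    "memory.max_usage_in_bytes",
    "memory.limit_in_bytes",
    "memory.failcnt",
)


def read_counters(cgroup_dir):
    """Read the integer counters of one memory cgroup; skip files that are absent."""
    counters = {}
    for name in COUNTER_FILES:
        path = Path(cgroup_dir) / name
        if path.is_file():
            counters[name] = int(path.read_text().strip())
    return counters


def near_limit(counters, threshold=0.9):
    """True when usage is at or above `threshold` (a fraction) of the limit."""
    usage = counters.get("memory.usage_in_bytes")
    limit = counters.get("memory.limit_in_bytes")
    if usage is None or not limit:
        return False
    return usage >= threshold * limit


if __name__ == "__main__":
    # Assumed mount point; the group path is the one named in the OOM report.
    group = "/sys/fs/cgroup/memory/00000005/00050001"
    snapshot = read_counters(group)
    print(snapshot, "near limit:", near_limit(snapshot))
```

Run periodically (for example from a monitoring process outside the limited group), this would show how usage and `memory.failcnt` evolve in the run-up to a kill, which a core dump taken at the moment of death would not show.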

--
Regards
Abhishek Paliwal

Anil Kumar Pugalia

Apr 15, 2016, 6:03:52 AM
to inside...@googlegroups.com
If it is acceptable, you could try changing the memory limit. Also, to make sure you get a core dump, check that the maximum core file size is set to a non-zero value using ulimit -c.
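As a rough illustration of the core file size check, the sketch below reads the limit that `ulimit -c` reports and raises the soft value to the hard cap from inside a process, using Python's standard `resource` module. The helper names are hypothetical. One caveat worth keeping in mind: the log shows the program ending with signal 9 (SIGKILL), whose default action is to terminate without writing a core file, so this limit matters for signals such as SIGSEGV or SIGABRT rather than for the OOM kill itself.

```python
import resource


def core_limit():
    """Return the (soft, hard) core file size limits of this process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


def raise_core_limit():
    """Raise the soft core limit to the hard limit and return the new pair."""
    _, hard = core_limit()
    # Setting soft == hard needs no privileges; raising hard would.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return core_limit()


def show(value):
    """Render a limit the way `ulimit -c` would describe it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = raise_core_limit()
print("core soft limit:", show(soft), "hard limit:", show(hard))
```

Child processes inherit the limit, so a launcher that calls something like `raise_core_limit()` before starting the application would have the same effect as running `ulimit -c` in the shell that starts it.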

Regards
Anil
Passion: http://sysplay.in (Playing with Systems)