Hello!
I mentioned this issue over pizza at last week's monthly meeting, but I wanted to post it here too, since I'm having a hard time recalling some of the recommendations.
We have a webserver running Ubuntu 14.04.2 LTS x64, Apache 2.4.7, MySQL 5.5.43. Everything is from the regular Ubuntu repositories.
This server hosts only one WordPress site, running 4.2.2.
The server is a VM running on XenServer 6.5. I've given it 3 GB of RAM and 2 vCPUs.
The problem: the server periodically runs out of memory, and the OOM killer takes MySQL down. Obviously one solution would be to just give it more RAM, and I would, but since it's crashing at random times that don't seem to correlate with high traffic, I think there's some sort of leak elsewhere. The cause could be the site itself being pretty heavy (it was designed by a marketing company), or my MySQL or Apache configuration.
Most of the time, top shows the server to be calm. The current load is "load average: 0.58, 0.54, 1.66" (the 1.66 is left over from a CPU spike a little while ago).
Here's the top of the report from the Apache server-status module:
- Server Version: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.9 OpenSSL/1.0.1f
- Server MPM: prefork
- Server Built: Mar 10 2015 13:05:59
- Current Time: Monday, 08-Jun-2015 11:21:39 PDT
- Restart Time: Friday, 29-May-2015 09:00:07 PDT
- Parent Server Config. Generation: 3
- Parent Server MPM Generation: 2
- Server uptime: 10 days 2 hours 21 minutes 31 seconds
- Server load: 0.32 0.59 1.37
- Total accesses: 702809 - Total Traffic: 29.4 GB
- CPU Usage: u814.48 s67.08 cu0 cs0 - .101% CPU load
- .806 requests/sec - 35.3 kB/second - 43.8 kB/request
- 5 requests currently being processed, 5 idle workers
_.K__.KWK.........W...__........................................
................................................................
......................
Scoreboard Key:
"_" Waiting for Connection,
"S" Starting up,
"R" Reading Request,
"W" Sending Reply,
"K" Keepalive (read),
"D" DNS Lookup,
"C" Closing connection,
"L" Logging,
"G" Gracefully finishing,
"I" Idle cleanup of worker,
"." Open slot with no current process
-----
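As a quick sanity check on the scoreboard above, the worker states can be tallied from the string itself. This is only a sketch: it uses the leading part of the scoreboard (everything after it in the report is "." open slots), and the variable names are my own.

```python
from collections import Counter

# Leading part of the scoreboard quoted above; the remainder of the
# report is all "." (open slots with no current process).
scoreboard = "_.K__.KWK.........W...__"

counts = Counter(scoreboard)

idle = counts["_"]        # "_" = waiting for connection
open_slots = counts["."]  # "." = open slot, no process attached
# Every other character is a worker that is doing something (K, W, R, ...).
busy = sum(n for state, n in counts.items() if state not in "_.")

print(f"idle workers: {idle}")
print(f"busy workers: {busy} (keepalive={counts['K']}, sending={counts['W']})")
print(f"open slots in this sample: {open_slots}")
```

The tally agrees with the "5 requests currently being processed, 5 idle workers" line, so at the moment of this snapshot only about 10 apache2 processes existed.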
Selected Syslog entries show:
Jun 8 10:09:24 webserver kernel: [4467899.128655] apache2 invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
Jun 8 10:09:24 webserver kernel: [4467899.128664] apache2 cpuset=/ mems_allowed=0
Jun 8 10:09:24 webserver kernel: [4467899.128670] CPU: 1 PID: 28451 Comm: apache2 Not tainted 3.13.0-32-generic #57-Ubuntu
Jun 8 10:09:24 webserver kernel: [4467899.128814] Mem-Info:
Jun 8 10:09:24 webserver kernel: [4467899.128817] Node 0 DMA per-cpu:
Jun 8 10:09:24 webserver kernel: [4467899.128821] CPU 0: hi: 0, btch: 1 usd: 0
Jun 8 10:09:24 webserver kernel: [4467899.128823] CPU 1: hi: 0, btch: 1 usd: 0
Jun 8 10:09:24 webserver kernel: [4467899.128826] Node 0 DMA32 per-cpu:
Jun 8 10:09:24 webserver kernel: [4467899.128829] CPU 0: hi: 186, btch: 31 usd: 182
Jun 8 10:09:24 webserver kernel: [4467899.128831] CPU 1: hi: 186, btch: 31 usd: 62
Jun 8 10:09:24 webserver kernel: [4467899.128837] active_anon:574844 inactive_anon:124512 isolated_anon:96
Jun 8 10:09:24 webserver kernel: [4467899.128837] active_file:786 inactive_file:685 isolated_file:135
Jun 8 10:09:24 webserver kernel: [4467899.128837] unevictable:0 dirty:0 writeback:7 unstable:0
Jun 8 10:09:24 webserver kernel: [4467899.128837] free:14205 slab_reclaimable:11248 slab_unreclaimable:11351
Jun 8 10:09:24 webserver kernel: [4467899.128837] mapped:4578 shmem:11100 pagetables:22369 bounce:0
Jun 8 10:09:24 webserver kernel: [4467899.128837] free_cma:0
Jun 8 10:09:24 webserver kernel: [4467899.128842] Node 0 DMA free:12164kB min:232kB low:288kB high:348kB active_anon:1648kB inactive_anon:1748kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0$
Jun 8 10:09:24 webserver kernel: [4467899.128848] lowmem_reserve[]: 0 2986 2986 2986
Jun 8 10:09:24 webserver kernel: [4467899.128853] Node 0 DMA32 free:44656kB min:44820kB low:56024kB high:67228kB active_anon:2297728kB inactive_anon:496300kB active_file:3144kB inactive_file:2740kB unevictable:0$
Jun 8 10:09:24 webserver kernel: [4467899.128859] lowmem_reserve[]: 0 0 0 0
Jun 8 10:09:24 webserver kernel: [4467899.128863] Node 0 DMA: 3*4kB (UEM) 7*8kB (EM) 6*16kB (UEM) 1*32kB (U) 3*64kB (UEM) 2*128kB (UE) 1*256kB (E) 4*512kB (EM) 3*1024kB (UEM) 1*2048kB (R) 1*4096kB (M) = 12164kB
Jun 8 10:09:24 webserver kernel: [4467899.128880] Node 0 DMA32: 253*4kB (UE) 238*8kB (UEM) 239*16kB (UE) 253*32kB (UEM) 131*64kB (UEM) 55*128kB (EM) 42*256kB (EM) 5*512kB (UEM) 1*1024kB (E) 0*2048kB 0*4096kB = 4$
Jun 8 10:09:24 webserver kernel: [4467899.128895] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 8 10:09:24 webserver kernel: [4467899.128897] 14180 total pagecache pages
Jun 8 10:09:24 webserver kernel: [4467899.128901] 1468 pages in swap cache
Jun 8 10:09:24 webserver kernel: [4467899.128911] Swap cache stats: add 19182781, delete 19181313, find 164728409/166240070
Jun 8 10:09:24 webserver kernel: [4467899.128914] Free swap = 0kB
Jun 8 10:09:24 webserver kernel: [4467899.128917] Total swap = 1302524kB
Jun 8 10:09:24 webserver kernel: [4467899.128920] 785309 pages RAM
Jun 8 10:09:24 webserver kernel: [4467899.128922] 0 pages HighMem/MovableOnly
Jun 8 10:09:24 webserver kernel: [4467899.128924] 16142 pages reserved
(The process list that follows contains a lot of apache2 processes, which makes me suspect that Apache is either starting too many or not killing them off properly.)
-----
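To make the Mem-Info numbers easier to read, the page counts can be converted to megabytes. The sketch below assumes 4 kB pages, which is consistent with the log itself (the per-zone active_anon figures, 1648 kB + 2297728 kB, equal 574844 pages x 4 kB). The dictionary and function names are just for illustration.

```python
# Values copied from the Mem-Info section of the OOM report above.
PAGE_KB = 4  # assumption: 4 kB pages, the usual x86-64 page size

mem_info_pages = {
    "active_anon": 574844,
    "inactive_anon": 124512,
    "free": 14205,
    "pagetables": 22369,
    "ram_total": 785309,   # "785309 pages RAM"
}
swap_total_kb = 1302524    # "Total swap = 1302524kB"
swap_free_kb = 0           # "Free swap = 0kB"


def pages_to_mb(pages: int) -> float:
    """Convert a kernel page count to megabytes."""
    return pages * PAGE_KB / 1024


anon_pages = mem_info_pages["active_anon"] + mem_info_pages["inactive_anon"]
anon_mb = pages_to_mb(anon_pages)
ram_mb = pages_to_mb(mem_info_pages["ram_total"])
anon_share = anon_pages / mem_info_pages["ram_total"]
swap_used_mb = (swap_total_kb - swap_free_kb) / 1024

print(f"RAM total:        {ram_mb:8.1f} MB")
print(f"anonymous memory: {anon_mb:8.1f} MB ({anon_share:.0%} of RAM)")
print(f"free:             {pages_to_mb(mem_info_pages['free']):8.1f} MB")
print(f"page tables:      {pages_to_mb(mem_info_pages['pagetables']):8.1f} MB")
print(f"swap in use:      {swap_used_mb:8.1f} MB")
```

Read this way, almost all of RAM was anonymous (process) memory and swap was completely full when the OOM killer fired, while the page cache was nearly empty. That points at process memory rather than file caching, though it does not by itself say which process is responsible.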
After those log entries, MySQL tries to respawn and succeeds, but then the OOM killer kills it off again (the server is still busy). It respawns once more, but init stops it with "Respawning too fast".
mpm_prefork is being used and is configured with:
<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxRequestWorkers 150
MaxConnectionsPerChild 0
# MaxConnectionsPerChild 2000
</IfModule>
Apache Memory Usage (MB): 885.859
Average Process Size (MB): 42.1838
mysqltuner isn't helpful right now since MySQL hasn't been running for longer than 24 hours.
Is there any other information I can round up to determine the cause of this? In rounding up all this info, I realized Apache doesn't have ServerLimit or MaxClients defined (defaults to 256). I just calculated that we should be able to support at least 30 Apache processes at the average memory usage above. I seriously doubt that we're seeing that much traffic when I get these alarms (some during the day, but a lot in the early morning hours).
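For reference, the worker arithmetic can be sketched as below. Two caveats: in Apache 2.4, MaxClients is the older name for MaxRequestWorkers, so the 150 in the prefork block above is the effective limit; and the 1024 MB set aside for MySQL and the OS is a made-up placeholder, not a measured value. The function name is hypothetical too. Average process size also tends to overcount memory shared between children, so treat the result as a conservative estimate.

```python
import math

# Figures from the post.
ram_mb = 3 * 1024            # 3 GB given to the VM
apache_total_mb = 885.859    # "Apache Memory Usage (MB)"
avg_process_mb = 42.1838     # average apache2 process size
max_request_workers = 150    # from the mpm_prefork block

# Assumption: memory reserved for MySQL, the kernel, and everything else.
# This is a placeholder; measure the real figure before relying on it.
reserved_mb = 1024


def max_workers(total_mb: float, reserved: float, per_worker_mb: float) -> int:
    """Largest worker count whose combined size fits in the remaining RAM."""
    return math.floor((total_mb - reserved) / per_worker_mb)


running = apache_total_mb / avg_process_mb
worst_case_mb = max_request_workers * avg_process_mb
fits = max_workers(ram_mb, reserved_mb, avg_process_mb)

print(f"processes behind the measurement: {running:.1f}")
print(f"memory if all {max_request_workers} workers spawn: {worst_case_mb:.0f} MB")
print(f"workers that fit with {reserved_mb} MB reserved: {fits}")
```

Under these assumptions, a burst that drives prefork toward its configured limit would need roughly twice the RAM the VM has, well before 150 workers exist. That would be one possible explanation for OOM events that do not line up with sustained high traffic, for example a short crawler or bot burst in the early morning.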
Thanks! I'm looking at this as a learning project, so if there's anything I'm not understanding correctly, feel free to call me out.
-Stead