nginx worker virtual memory size keeps growing during load


Stefan Parvu

Oct 21, 2015, 1:35:32 PM10/21/15
to openre...@googlegroups.com
Hi guys,

We are running OpenResty + Redis and we see a continuous increase in the
virtual size of our main worker processes. The workers are indeed
receiving work constantly, since we are running a performance exercise
simulating 500-600 clients sending work. Top shows (FreeBSD 10.2):

last pid: 95475; load averages: 3.40, 3.44, 3.35    up 15+05:20:08  17:19:31
40 processes: 3 running, 37 sleeping
CPU: 55.0% user, 0.0% nice, 19.7% system, 0.5% interrupt, 24.8% idle
Mem: 3196M Active, 774M Inact, 2889M Wired, 20K Cache, 1072M Free
ARC: 1594M Total, 329M MFU, 1145M MRU, 17M Anon, 13M Header, 90M Other
Swap: 5120M Total, 5120M Free

PID   USERNAME THR PRI NICE SIZE   RES    STATE  C TIME   WCPU   COMMAND
93329 krmx     3   36  0    2517M  2488M  uwait  1 321:44 98.97% redis-server
93335 krmx     1   100 0    803M   775M   RUN    1 297:08 96.88% nginx
93334 krmx     1   88  0    578M   550M   CPU0   0 160:42 56.69% nginx
93324 krmx     3   27  0    58876K 42744K uwait  2 153:13 56.49% redis-server
93363 krmx     1   20  0    57416K 19056K kqread 3 5:14   0.78%  nginx
93364 krmx     1   20  0    57416K 18488K kqread 1 5:18   0.10%  nginx
93357 krmx     3   29  0    22012K 6356K  uwait  2 3:11   0.10%  redis-server

We are using OpenResty for numerical calculations. If we dig inside one
worker with procstat, we see the following picture: high allocation of
private resident pages, the PRES column. See below.

Questions:

1. This might look normal since the workers are receiving work continuously,
but is there a way to ensure we don't grow beyond a certain number of MB?

2. Is there any way to find out what's keeping all these allocations busy?
(stack / data? The df segments point to the stack within the process.)


PID   START       END         PRT RES   PRES   REF SHD FL   TP PATH
93335 0x1000      0x61000     rw- 96    0      1   0   C--- df
93335 0x61000     0x81000     rw- 32    0      1   0   C--- df
93335 0x82000     0xa2000     rw- 32    0      1   0   C--- df
93335 0xa2000     0xc2000     rw- 32    32     1   0   ---- df
93335 0xc3000     0x123000    rw- 96    0      1   0   C--- df
93335 0x123000    0x143000    rw- 32    48     2   0   ---- df
93335 0x143000    0x153000    r-x 16    48     2   0   ---- df
93335 0x15a000    0x1ba000    rw- 96    0      1   0   C--- df
93335 0x1ba000    0x2da000    rw- 288   304    2   0   ---- df
93335 0x2da000    0x2ea000    r-x 16    304    2   0   ---- df
93335 0x2f0000    0x3d0000    rw- 224   256    2   0   ---- df
93335 0x3d0000    0x3e0000    r-x 16    16     1   0   ---- df
93335 0x3e0000    0x400000    rw- 32    256    2   0   ---- df
93335 0x400000    0x57a000    r-x 349   371    4   1   CN-- vn /opt/kronometrix/kernel/nginx/sbin/nginx
93335 0x57a000    0x71a000    rw- 416   492    3   0   ---- df
93335 0x71a000    0x72a000    r-x 16    16     1   0   ---- df
93335 0x72a000    0x76a000    rw- 64    492    3   0   ---- df
93335 0x76a000    0x77a000    r-x 12    492    3   0   ---- df
93335 0x77a000    0x78f000    rw- 21    0      1   0   C--- vn /opt/kronometrix/kernel/nginx/sbin/nginx
93335 0x78f000    0x7a2000    rw- 6     0      1   0   C--- df
93335 0x7a2000    0x2bc2000   rw- 9248  101824 7   0   ---- df
93335 0x2bc2000   0x2be2000   rw- 32    32     1   0   ---- df
93335 0x2be3000   0x4403000   rw- 6176  101824 7   0   ---- df
93335 0x4403000   0x4443000   rw- 64    64     1   0   ---- df
93335 0x4444000   0x4f24000   rw- 2784  101824 7   0   ---- df
93335 0x4f24000   0x5024000   rw- 256   256    1   0   ---- df
93335 0x5025000   0x7f05000   rw- 12000 101824 7   0   ---- df
93335 0x7f05000   0x8105000   rw- 512   512    1   0   ---- df
93335 0x8106000   0x8646000   rw- 1344  101824 7   0   ---- df
93335 0x8646000   0x8846000   rw- 512   512    1   0   ---- df
93335 0x8847000   0xc907000   rw- 16576 101824 7   0   ---- df
93335 0xc907000   0xcd07000   rw- 1024  1024   1   0   ---- df
93335 0xcd08000   0x19ec8000  rw- 53696 101824 7   0   ---- df
93335 0x19ec8000  0x1b588000  rw- 5824  8128   2   0   ---- df
93335 0x1b588000  0x1bd88000  rw- 2048  2048   1   0   ---- df
93335 0x1bd89000  0x1c689000  rw- 2304  8128   2   0   ---- df
93335 0x1c689000  0x1ce89000  rw- 2048  2048   1   0   ---- df
93335 0x1ce8a000  0x21aca000  rw- 19520 29622  3   0   ---- df
93335 0x21aca000  0x222ca000  rw- 2048  2048   1   0   ---- df
93335 0x222cb000  0x23ccb000  rw- 6656  29622  3   0   ---- df
93335 0x244cc000  0x2524c000  rw- 3446  29622  3   0   ----
...

nginx version: openresty/1.7.10.2
built by clang 3.4.1 (tags/RELEASE_34/dot1-final 208032) 20140512
TLS SNI support enabled
configure arguments: --prefix=/opt/kronometrix/kernel/nginx
--with-cc-opt=-O2 --add-module=../ngx_devel_kit-0.2.19
--add-module=../echo-nginx-module-0.58
--add-module=../xss-nginx-module-0.05 --add-module=../ngx_coolkit-0.2rc3
--add-module=../set-misc-nginx-module-0.29
--add-module=../form-input-nginx-module-0.11
--add-module=../encrypted-session-nginx-module-0.04
--add-module=../srcache-nginx-module-0.30 --add-module=../ngx_lua-0.9.16
--add-module=../ngx_lua_upstream-0.03
--add-module=../headers-more-nginx-module-0.26
--add-module=../array-var-nginx-module-0.04
--add-module=../memc-nginx-module-0.16
--add-module=../redis2-nginx-module-0.12
--add-module=../redis-nginx-module-0.3.7
--add-module=../rds-json-nginx-module-0.14
--add-module=../rds-csv-nginx-module-0.06
--with-ld-opt=-Wl,-rpath,/opt/kronometrix/kernel/luajit/lib
--with-cc=/usr/bin/cc --with-http_ssl_module

Thanks,

--
Stefan Parvu <spa...@kronometrix.org>

Lord Nynex

Oct 21, 2015, 4:37:15 PM10/21/15
to openre...@googlegroups.com
Hello,

I notice you're running a slightly older version of OpenResty. Is it possible to update your build? This is likely unrelated to the behavior you're seeing, as I ran 1.7.10 in a heavy-load environment for a very long time, but it can't hurt to run the latest stable build.

You haven't given enough data to do more than speculate about the behavior you're seeing. I strongly suspect your Lua code would become problematic far before nginx would (this is a thoroughly tested version of nginx). It would be helpful to see a working example of the Lua code you're seeing this issue with. If I'm reading your output correctly, the nginx process is consuming an abnormal amount of memory, and it seems likely your Lua code is the culprit.

At this point, if it were my issue, I would fire up SystemTap and start profiling the process to see exactly what is causing this paging. Since you don't have SystemTap on FreeBSD, I'd recommend looking into DTrace probes, though I'm not familiar enough with BSD to give any detailed guidance.

From the process list you've shown, you're running Redis on the same host, which leads me to believe this is a reasonably straightforward issue. If Redis is allocating large amounts of memory (all or most of it), then of course you would start to see pages moved to disk. If I'm reading this correctly, the machine has roughly 8GB of memory, Redis is occupying 2.5-3GB of it, and you have a very bloated nginx process. The behavior you're seeing is unsurprising.





Yichun Zhang (agentzh)

Oct 21, 2015, 11:08:58 PM10/21/15
to openresty-en
Hello!

On Thu, Oct 22, 2015 at 1:35 AM, Stefan Parvu wrote:
> 1. This might look normal since the workers are receiving work continuously,
> but is there a way to ensure we don't grow beyond a certain number of MB?
>

Yes, sure. In my own experience, optimizing the memory usage of the Lua
code usually helps a lot if that is the memory hog.

Maybe you have a memory leak in your Lua code (like an ever-growing
VM-level global Lua string or Lua table). Or maybe you're just creating
a lot of temporary Lua GC objects (tables, functions, strings, etc.) so
excessively, like there's no tomorrow, that the GC cannot really catch
up.
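As a hypothetical illustration (not taken from Stefan's actual code) of how temporary objects pile up: repeated string concatenation allocates a fresh intermediate Lua string on every iteration, while collecting the pieces into a table and concatenating once produces far fewer GC objects:

```lua
-- GC-heavy pattern: every ".." in the loop creates a new
-- intermediate Lua string that immediately becomes garbage.
local function build_report_slow(rows)
    local out = ""
    for i = 1, #rows do
        out = out .. rows[i] .. "\n"
    end
    return out
end

-- Lighter on the GC: collect the pieces, concatenate once.
local function build_report_fast(rows)
    local buf = {}
    for i = 1, #rows do
        buf[i] = rows[i]
    end
    return table.concat(buf, "\n") .. "\n"
end
```

On a hot code path serving hundreds of concurrent clients, the first form can easily outrun the collector's default pacing.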

But there's also a chance that you have many long-running cosockets
that keep allocating new blocks in nginx's request memory pool.

We need to use tools to be sure.
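In the meantime, a cheap first check is to ask the LuaJIT GC how much Lua-land memory a worker actually holds, via `collectgarbage("count")` (which returns the Lua heap size in KB). A minimal sketch, where the location name and output format are made up for illustration:

```nginx
# Hypothetical debug endpoint: reports the per-worker Lua GC heap size.
# Whichever worker answers the request reports its own number.
location = /lua-mem {
    content_by_lua '
        ngx.say(string.format("worker pid %d: Lua GC heap = %.1f KB",
                              ngx.var.pid, collectgarbage("count")))
    ';
}
```

If the figure reported here stays small while the worker's resident size keeps growing, the growth is more likely in nginx request pools or cosocket buffers than in Lua objects.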

> 2. Is there any way to find out what's keeping all these allocations busy?
> (stack / data? The df segments point to the stack within the process.)
>

Yes, there are tools to debug memory usage in nginx. Unfortunately
you're on FreeBSD, which means all of those systemtap-based tools [1]
are not applicable in your environment. Still, you can try using the
gdb tools to debug things:

https://github.com/openresty/nginx-gdb-utils

The "lgc", "lgcstat", and "lgcpath" gdb commands there can be
particularly helpful for debugging Lua-land memory issues.

Regards,
-agentzh

[1] See https://github.com/openresty/nginx-systemtap-toolkit and
https://github.com/openresty/stapxx#samples

Stefan Parvu

Oct 22, 2015, 4:36:42 PM10/22/15
to openre...@googlegroups.com
Thank you all for the comments.

It looks like it's our application, for sure. We need to debug and find
out what's going on.

Cheers,


--
Stefan Parvu <spa...@kronometrix.org>