Hello Jesse and Andrea -
I can confirm there is definitely a leak. Our last sync point with
cloudfoundry/vcap is f7f7354c290285c477715986d1b6f399b7e01c1a.
In our current deployment, HM can quickly balloon to over 8G resident.
It maxed out memory on the box and triggered the OOM killer, causing
general mayhem, and it continues to leak at a very high rate.
Like Andrea, we were running a very old build (circa January) with no
problems; updating to the latest (pre-repo-split) code is the first
time we've seen this kind of memory use.
We haven't yet tried Andrea's Gemfile solution, but I would guess the
problem is not specific to HM. We actually run HM on a different box
from CC, and we're seeing CC leak memory as well, though at a much,
much slower rate; it's currently hovering at 1-2G resident. Given the
elevated memory profile on CC in addition to HM, it would seem some
call that happens in CC happens much more frequently in HM, and that
call is what's leaking memory.
I added a periodic_timer to check the RSS of Process.pid every 3s
(monit restarts HM once it goes past 256M).
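In case it's useful to others, here's a minimal sketch of that timer
under EventMachine; RSS is read via ps (in kB), and the log prefix is
just illustrative, not our exact logging setup:

    require 'eventmachine'

    EM.run do
      # Log this process's resident set size (kB) every 3 seconds.
      # In HM this hangs off the existing reactor; shown standalone here.
      EM.add_periodic_timer(3) do
        rss_kb = `ps -o rss= -p #{Process.pid}`.strip
        puts "#{Time.now.strftime('%b %d %H:%M:%S')} | pid=#{Process.pid} | MEM:#{rss_kb}"
      end
    end

That produced the trace below (the pid change at 12:12:00 is monit
restarting HM):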
Jun 14 12:11:31 | pid=13930 | MEM:430516
Jun 14 12:11:35 | pid=13930 | MEM:450012
Jun 14 12:11:38 | pid=13930 | MEM:455308
Jun 14 12:11:41 | pid=13930 | MEM:455864
Jun 14 12:11:44 | pid=13930 | MEM:468984
Jun 14 12:11:47 | pid=13930 | MEM:481332
Jun 14 12:11:51 | pid=13930 | MEM:481696
Jun 14 12:12:00 | pid=15558 | MEM:87104
Jun 14 12:12:03 | pid=15558 | MEM:89824
Jun 14 12:12:06 | pid=15558 | MEM:90344
Jun 14 12:12:09 | pid=15558 | MEM:92232
Jun 14 12:12:13 | pid=15558 | MEM:92436
Jun 14 12:12:16 | pid=15558 | MEM:92444
Jun 14 12:12:19 | pid=15558 | MEM:92656
Jun 14 12:12:22 | pid=15558 | MEM:93160
Jun 14 12:12:25 | pid=15558 | MEM:90840
Jun 14 12:12:28 | pid=15558 | MEM:91712
Jun 14 12:12:31 | pid=15558 | MEM:91712
Jun 14 12:12:34 | pid=15558 | MEM:92560
Jun 14 12:12:37 | pid=15558 | MEM:92708
Jun 14 12:12:40 | pid=15558 | MEM:124248
Jun 14 12:12:43 | pid=15558 | MEM:124248
Jun 14 12:12:46 | pid=15558 | MEM:127540
Jun 14 12:12:49 | pid=15558 | MEM:145416
Jun 14 12:12:53 | pid=15558 | MEM:149556
Jun 14 12:12:56 | pid=15558 | MEM:149656
Jun 14 12:13:02 | pid=15558 | MEM:192392
As best as I can tell, the big steps in memory correspond to the
analyze_all_apps timer.
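If anyone wants to double-check that correlation, a quick (hypothetical)
way is to wrap the method and log RSS around each pass; I'm writing the
HealthManager class/method names from memory, so adjust to the actual
tree:

    class HealthManager
      alias_method :analyze_all_apps_without_rss, :analyze_all_apps

      # Log RSS before/after each analysis pass to confirm it's the
      # step where the big jumps happen.
      def analyze_all_apps(*args)
        before = `ps -o rss= -p #{Process.pid}`.to_i
        result = analyze_all_apps_without_rss(*args)
        after  = `ps -o rss= -p #{Process.pid}`.to_i
        puts "analyze_all_apps: rss #{before} -> #{after} kB (+#{after - before})"
        result
      end
    end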
- Albert
On Jun 14, 12:33 am, Andrea Campi <andrea.ca...@zephirworks.com> wrote: