Finding native memory leaks in node

Jimb Esser

Mar 15, 2012, 7:49:42 PM
to nodejs
We have some servers running node.js under Ubuntu. After a day or two
of running, once in a while, a node process will be using something
like 4gb of memory. process.memoryUsage() indicates something like
rss: 4gb, heapTotal: 50mb, heapUsed: 40mb, so the memory is not in
anything a V8 heap snapshot would show me. There are lots of possible
culprits (Buffers, any number of things native modules are allocating
- I'm 90% certain I know which module, but no idea where/what
specifically).

So, my general question, how do people go about debugging memory leaks
in native code in node (or node itself)?

Coming from PC development, I'm familiar with 101 ways to skin a heap
on windows, but haven't the slightest idea on Linux in general (and a
bunch of the native modules we use do not work on Windows since node
stopped supporting cygwin, so no short term/easy solutions there -
plus this only happens after days on our production Linux servers).
I'd heard good things about Google's perftools' heap profiler, but I
attempted to link Node with that and it simply segfaulted very shortly
after startup.
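
A minimal sketch of how the gap described in this post could be tracked: it subtracts heapTotal from rss in the object returned by process.memoryUsage(), so growth outside the V8 heap shows up as its own number. The helper names nonHeapBytes and formatUsage are made up for illustration, and the subtraction is only a rough indicator, since rss also covers code, stacks and other memory that is not a leak.

```javascript
// Bytes of resident memory that the V8 heap does not account for.
// Rough indicator only: rss also includes code, stacks, etc.
function nonHeapBytes(usage) {
  return usage.rss - usage.heapTotal;
}

// Render a memoryUsage()-shaped object in megabytes, one line per sample.
function formatUsage(usage) {
  const mb = (n) => (n / (1024 * 1024)).toFixed(1) + 'mb';
  return 'rss: ' + mb(usage.rss) +
    ', heapTotal: ' + mb(usage.heapTotal) +
    ', heapUsed: ' + mb(usage.heapUsed) +
    ', non-heap: ' + mb(nonHeapBytes(usage));
}

// Print one sample for the current process.
console.log(formatUsage(process.memoryUsage()));
```

Calling formatUsage(process.memoryUsage()) from a timer and logging the result would show whether the non-heap figure climbs steadily while heapUsed stays flat, which is the pattern reported here.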

Ben Noordhuis

Mar 15, 2012, 8:58:24 PM
to nod...@googlegroups.com

In development environments, valgrind is the way to go. That won't
help in production because it's unbearably slow (essentially your code
runs inside a virtual machine that tracks every read and write).

I have had moderate success with making the process dump core and do a
post-mortem inspection of the heap in gdb. Use `gcore` or a simple
`kill -QUIT <pid>` to dump core, and set `ulimit -c unlimited` before
you start the process.

robot1125

Mar 16, 2012, 7:34:38 AM
to nodejs
> So, my general question, how do people go about debugging memory leaks
> in native code in node (or node itself)?

You may find this interview with Bryan Cantrill of Joyent interesting.

http://www.infoq.com/interviews/operating-nodejs-production-bryan-cantrill

He discusses some of the tools they use to debug node apps in
production at Joyent. Some of them require running under their
homebrew OS, but the discussion may give you some ideas.

George Stagas

Mar 16, 2012, 11:34:01 AM
to nod...@googlegroups.com
You can pinpoint the leak by writing stress tests. Don't wait days for
the leak to show in production. Find which component causes it. Make a
list of suspects. If it's an http server, start with simple requests.
Does the memory go up? No? Maybe it's the db? Try a million
read/writes. Did you find the component that leaks? Follow the chain
of events, commenting out suspicious code and trying again until the
leak disappears.
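
The bisection approach above can be sketched as a small harness that runs one suspect many times and reports how far rss and heapUsed moved. measureGrowth and leakySuspect are hypothetical names; the harness only handles synchronous suspects, and a single before/after reading is noisy, so treat the numbers as a hint rather than proof.

```javascript
// Run `suspect` `iterations` times and report how rss and heapUsed changed.
// Assumption: `suspect` is synchronous; async work would need awaiting.
function measureGrowth(suspect, iterations) {
  const before = process.memoryUsage();
  for (let i = 0; i < iterations; i++) {
    suspect(i);
  }
  const after = process.memoryUsage();
  return {
    iterations,
    rssDelta: after.rss - before.rss,
    heapUsedDelta: after.heapUsed - before.heapUsed,
  };
}

// Hypothetical suspect: keeps every Buffer alive, so memory outside the
// V8 heap can grow while heapUsed barely moves.
const retained = [];
function leakySuspect() {
  retained.push(Buffer.alloc(64 * 1024));
}

console.log(measureGrowth(leakySuspect, 1000));
```

Swapping leakySuspect for one real component at a time (a request handler, a database call, a native binding) gives the "list of suspects" loop described above.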

Jimb Esser

Mar 16, 2012, 4:46:09 PM
to nodejs
The server in question is not an http server, but a back-end
simulation server running physics simulation for an online game using
a Bullet native library. Yeah, I know, not exactly a typical (or
perhaps wise...) use of node.js. We're 95% certain the Bullet module
is the culprit, but that is hundreds of thousands of lines of 3rd
party C++ code, not something feasible to poke in, and, like most
physics simulations, not particularly deterministic when combined with
the randomness of network latency and real user actions. Standalone
stress tests we've tried never exhibit the problem, and since it takes
a fully loaded server a day to exhibit it, it's not likely to
reproduce in a development environment. That being said, it
consistently does reproduce on the production servers, so that is,
theoretically, an easy way to debug it with post-mortem debugging
(albeit with a day-long turn-around to test fixes). Heap dumps are a
much more reliable way of tracking down heap issues in a large system
than any "poke at different parts of the system at random" method; I
was just hoping there was an easy way to get them reliably...

I'll try poking around in a gdb dump. I'm guessing the default heap
won't carry any allocation-site information on its entries, but it
should at least let me quickly sample the heap to determine the
primary content type (strings, floats, ints, etc.) that's leaking,
which may provide some insight.

mscdex

Mar 16, 2012, 5:12:07 PM
to nodejs
FWIW, I wonder whether either of these JavaScript Bullet ports is
worth trying or looking into: https://github.com/adambom/bullet.js/
and https://github.com/kripken/ammo.js/

George Stagas

Mar 17, 2012, 8:22:13 AM
to nod...@googlegroups.com
I hate it when that happens; you decide to use a 3rd party lib because
it'll make your life easier. Instead you spend 95% of the development
time fixing bugs, hacking to get the required behaviour. Then you
think you would be better off using that time building the library you
needed, with the exact features you needed and with actual control
over what happens in your code. I think that's where node wins, with
its low level API. Don't give me features, give me control.

Ilya Dmitrichenko

Mar 17, 2012, 9:21:31 AM
to nod...@googlegroups.com
Fabric Engine provides Bullet bindings, among other things it can do
for you. They are about to release the source code any day now, and I
definitely recommend having a look at Fabric if you are doing
something of that kind in Node. It would take too long to give you all
the info, so just see a couple of videos:

http://www.youtube.com/watch?v=WWjJE-6Ln24
http://fabric-engine.com/2011/08/fabric-architectural-overview/

Mark Hahn

Mar 17, 2012, 3:22:33 PM
to nod...@googlegroups.com
> Don't give me features, give me control.

Amen.  +1.  Whatever.

Jimb Esser

Mar 19, 2012, 2:30:46 AM
to nodejs
After quite a bit of digging, I found "mtrace", which is part of glibc
on Linux and provides a memory trace log that can be used to list
outstanding allocations. I whipped up a native module exposing access
to this, and am now able to get a reasonable heap dump (or, rather, a
heap delta between starting and stopping tracing, which is perhaps
more useful since I can just start it going after the server has spun
up all of its initial loading).

When linking against C++ code, it's not particularly useful, because
the primary call site for all allocations is "operator new", but for C-
like code, it works great.  I replaced a bunch of unnecessary "new"
statements with some "mallocs" in our offending module and this
identified the particular leak immediately.

The module isn't NPMified, but I threw it up on GitHub for anyone who
wants to play with it.  It also includes a mtrace log parser written
in node to generate high-level summary information on outstanding
allocs.

https://github.com/Jimbly/node-mtrace
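
For anyone curious what summarising such a log involves, here is a rough sketch. It is not the parser shipped in node-mtrace. It assumes each record looks like "@ <call site> + <address> <size>" for an allocation and "@ <call site> - <address>" for a free, with "<" and ">" marking the two halves of a realloc; the function name summarizeMtrace and the sample log are invented for illustration, so check the format against a real trace before relying on it.

```javascript
// Summarise allocations that are still outstanding at the end of an
// mtrace log, grouped by call site and sorted by bytes (largest first).
function summarizeMtrace(logText) {
  const live = new Map(); // address -> { site, size }
  // Assumed record shape: "@ <call site> <op> <address> [<size>]"
  const record = /^@ (.*) ([+\-<>]) (0x[0-9a-f]+)(?: (0x[0-9a-f]+))?\s*$/i;
  for (const line of logText.split('\n')) {
    const m = record.exec(line);
    if (!m) continue; // skips "= Start", "= End" and anything unrecognised
    const [, site, op, address, size] = m;
    if (op === '+' || op === '>') {
      live.set(address, { site, size: size ? parseInt(size, 16) : 0 });
    } else {
      live.delete(address); // '-' is a free, '<' is the old half of a realloc
    }
  }
  const bySite = new Map();
  for (const { site, size } of live.values()) {
    const entry = bySite.get(site) || { site, count: 0, bytes: 0 };
    entry.count += 1;
    entry.bytes += size;
    bySite.set(site, entry);
  }
  return [...bySite.values()].sort((a, b) => b.bytes - a.bytes);
}

// Invented sample log: three allocations, one of which is freed.
const sampleLog = [
  '= Start',
  '@ ./sim:[0x400624] + 0x1f0c460 0x14',
  '@ ./sim:[0x400624] + 0x1f0c480 0x14',
  '@ ./sim:[0x400700] + 0x1f0c4a0 0x100',
  '@ ./sim:[0x400624] - 0x1f0c460',
  '= End',
].join('\n');

console.log(summarizeMtrace(sampleLog));
```

Grouping by call site is what makes the "operator new" problem mentioned above visible: if every C++ allocation reports the same site, the summary collapses into a single row.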

Oliver Leics

Mar 19, 2012, 7:40:53 AM
to nod...@googlegroups.com
WARN: OT

On Sat, Mar 17, 2012 at 8:22 PM, Mark Hahn <ma...@hahnca.com> wrote:
>>  Don't give me features, give me control.
>
> Amen.  +1.  Whatever.

*poke* all those sexy but bloated node-modules.

Ilya Dmitrichenko

Mar 19, 2012, 8:00:09 AM
to nod...@googlegroups.com
On 19 March 2012 06:30, Jimb Esser <wast...@gmail.com> wrote:
> The module isn't NPMified, but I threw it up on GitHub for anyone who
> wants to play with it.  It also includes a mtrace log parser written
> in node to generate high-level summary information on outstanding
> allocs.
>
> https://github.com/Jimbly/node-mtrace

Excellent work! Thanks a lot :)

I'll definitely give it a try sometime soon!
