How to identify a memory leak


foreverman

Apr 28, 2011, 6:57:58 AM
to Phusion Passenger Discussions
Hey,

We are using Passenger (3.0.6) + Nginx (nginx/0.8.54) to host our Rails
applications and we keep running into 'out of memory' issues. I think I
might get more help here even if the issue is not related to
Passenger. I am not very familiar with how Passenger and Ruby manage
memory, so forgive me if I ask some naive questions.

I added a memory usage logger to debug this issue. It looks like the
following:

module MemoryUsageLogger
  def self.included(klass)
    klass.class_eval do
      around_filter :log_memory_usage
    end
  end

  private

  # Logs the resident set size (RSS, in KB, as reported by ps) of the
  # current process before and after the wrapped action runs.
  def log_memory_usage
    before_memory_usage = `ps -o rss= -p #{$$}`.to_i
    yield
    after_memory_usage = `ps -o rss= -p #{$$}`.to_i
    logger.info("Memory usage: before-#{before_memory_usage}KB|" \
                "after-#{after_memory_usage}KB|" \
                "change-#{after_memory_usage - before_memory_usage}KB|PID: #{$$}")
  end
end

With this I found requests whose memory usage changes by more than 30 MB
(some by more than 100 MB). Here are some questions:

If the memory change for a request is, for example, 30 MB, does that
imply a memory leak? Or, put another way, is it possible that we simply
load too much into memory and the memory occupied will be reclaimed later?

Another question is about the output of 'passenger-memory-stats'. Assume
it prints the following:
----- Passenger processes -----
PID    VMSize    Private  Name
-------------------------------
1635   93.3 MB   43.5 MB  Rails: /mnt/app/current
1639   102.4 MB  52.1 MB  Rails: /mnt/app/current
25783  3.9 MB    0.2 MB   PassengerWatchdog
25786  21.5 MB   2.4 MB   PassengerHelperAgent
25788  20.3 MB   6.0 MB   Passenger spawn server
25794  9.4 MB    0.4 MB   PassengerLoggingAgent
### Processes: 6
### Total private dirty RSS: 104.61 MB

Does the '6.0 MB' of 'Passenger spawn server' consist of memory occupied
by 'Application code' and 'Rails framework code'? And what does the
memory of each worker process contain?

Thanks a lot!



Hongli Lai

Apr 28, 2011, 3:54:37 PM
to phusion-...@googlegroups.com

No. It could just mean the garbage collector hasn't run yet. In a lot
of cases growth in memory usage is not a leak. That said, the GC may
not be able to release memory back to the OS even though it marks the
memory as free for reuse within the same process.

A major source of "leaks" is plain bloat. For example, suppose your app
fetches 5000 database records into memory. They are garbage collected
eventually, but they cause the heap to grow a lot, and Ruby may not be
able to release that memory back to the OS after GC.
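
The difference between "objects still alive" and "heap that has grown" can be observed from inside the process. The following is only a rough sketch: it assumes MRI Ruby 2.2 or newer, where GC.stat reports the keys :heap_live_slots and :heap_available_slots (the Ruby 1.8/REE builds of this era expose different statistics), and the method names and the record count are made up for illustration.

```ruby
# Capture how many object slots are in use and how many the heap holds.
def heap_snapshot
  stat = GC.stat
  { :live => stat[:heap_live_slots], :available => stat[:heap_available_slots] }
end

# Simulate a "bloaty" request: build many objects, drop them, force a GC,
# and record heap statistics before, at the peak, and afterwards.
def simulate_bloat(record_count = 200_000)
  GC.start
  before = heap_snapshot

  records = Array.new(record_count) { |i| "record #{i}" }
  peak = heap_snapshot

  records.clear      # release the elements
  records = nil      # and the array itself
  GC.start           # force a collection
  after = heap_snapshot

  { :before => before, :peak => peak, :after => after }
end

stats = simulate_bloat
[:before, :peak, :after].each do |phase|
  puts "#{phase}: live=#{stats[phase][:live]} available=#{stats[phase][:available]}"
end
```

On many setups the live count falls back after the forced GC while the available slot count stays above its starting value, which is the bloat pattern described above. The exact numbers depend on the Ruby version and GC settings.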


> Or put it another way, is there a possibility that we load too
> much in memory and the memory occupied will be reclaimed later?

Yes.

> Another question is about output of 'passenger-memory-stats', assuming
> the following output of 'passenger-memory-stats':
> ----- Passenger processes -----
> PID    VMSize    Private  Name
> -------------------------------
> 1635   93.3 MB   43.5 MB  Rails: /mnt/app/current
> 1639   102.4 MB  52.1 MB  Rails: /mnt/app/current
> 25783  3.9 MB    0.2 MB   PassengerWatchdog
> 25786  21.5 MB   2.4 MB   PassengerHelperAgent
> 25788  20.3 MB   6.0 MB   Passenger spawn server
> 25794  9.4 MB    0.4 MB   PassengerLoggingAgent
> ### Processes: 6
> ### Total private dirty RSS: 104.61 MB
>
> Does '6.0MB' of 'Passenger spawn server' consists of memory occupied
> by 'Application code' and 'Rails framework code'?

No. It works the other way around.

> And what does the
> memory of every worker process contain?

It shares memory with the ApplicationSpawner, provided you use Ruby
Enterprise Edition (REE), whose garbage collector is copy-on-write
friendly. See:
http://www.modrails.com/documentation/Users%20guide%20Apache.html#spawning_methods_explained


--
Phusion | Ruby & Rails deployment, scaling and tuning solutions

Web: http://www.phusion.nl/
E-mail: in...@phusion.nl
Chamber of commerce no: 08173483 (The Netherlands)

Joachim Buechse

Apr 29, 2011, 11:45:25 AM
to Phusion Passenger Discussions
I think your best bet is to check what's on the heap after a leaky
request. By "after" I mean during the next request.

A relatively easy way to find unexpected bloat is to implement an info
view that includes counts of ActiveRecord objects, using something like
this:

# Returns [class name, count] pairs for all ActiveRecord objects on the
# heap, most numerous first.
def ar_space
  GC.start
  h = Hash.new(0)
  ObjectSpace.each_object(ActiveRecord::Base) do |o|
    next if o.equal?(self)
    h[o.class.to_s] += 1
  end
  h.sort { |a, b| b[1] <=> a[1] }
end

Counts of the ActiveRecord objects hanging around in your heap are often
a good indicator of where things might be going wrong. My debugging
routine is this:
- restart mongrel/webrick
- execute the leaky/bloating request
- execute the info request
- start searching in the code
If you modify code, make sure you restart mongrel. I don't know why
(maybe AR caching again), but even though the changed code was being
executed, I sometimes saw the memory allocation behavior change only
after a mongrel restart (working in development mode, of course).
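
The routine above amounts to comparing two heap snapshots. Here is a minimal plain-Ruby sketch of that comparison, with no Rails involved. DemoRecord and DemoMessage are hypothetical stand-ins for model classes, and the helper names are invented for this example.

```ruby
# Count live instances per class name for base_class and its subclasses.
def instance_counts(base_class)
  GC.start  # collect first so that mostly reachable objects are counted
  counts = Hash.new(0)
  ObjectSpace.each_object(base_class) { |obj| counts[obj.class.name] += 1 }
  counts
end

# List the classes that gained instances between two snapshots,
# largest growth first.
def count_growth(before, after)
  growth = {}
  after.each do |name, count|
    delta = count - before.fetch(name, 0)
    growth[name] = delta if delta > 0
  end
  growth.sort_by { |_, delta| -delta }
end

# Hypothetical stand-ins for model classes.
class DemoRecord; end
class DemoMessage < DemoRecord; end

before = instance_counts(DemoRecord)
RETAINED = Array.new(3) { DemoMessage.new }  # simulates objects kept after a request
after = instance_counts(DemoRecord)
p count_growth(before, after)
```

In an application, the first snapshot would be taken right after the restart and the second one in the info request, so that only classes whose instances survived the suspicious request show up.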

I was quite surprised by how Ruby+Rails+Passenger deals with memory.
Passenger re-uses the processes, so anything that has been loaded by a
request and is still referenced somehow stays on the heap. One kind of
reference I hadn't thought about is closures, i.e. blocks that you
pass to Hash#map etc. Closures keep references to all visible variables
and hold onto them until the closure itself is collected. Combined with
caching (or effects I still haven't understood ;-) this means the
process holds on to objects until the same controller/action is called
again.
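
The closure behavior described here can be checked with a small sketch. It assumes Ruby 2.1 or newer for Binding#local_variable_get, and build_callback and payload are hypothetical names. Note that this kind of retention only matters while the proc object itself is stored somewhere long-lived; a block that is merely passed to a method and never stored goes away when the call returns.

```ruby
# Returns a lambda that never mentions `payload`, yet still captures the
# whole local scope in which it was created.
def build_callback
  payload = Array.new(1_000) { |i| "row #{i}" }  # pretend this is request data
  counter = 0
  lambda { counter += 1 }
end

callback = build_callback
callback.call

# The closure's binding still reaches the local variable, so the array
# cannot be collected for as long as `callback` is referenced.
captured = callback.binding.local_variable_get(:payload)
puts "payload still reachable: #{captured.size} rows"
```

Dropping the last reference to the proc, or setting the large local to nil before the method returns, makes the data collectable again.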

As far as I know there is very little that can be done about this (I
think Ruby should change the definition of closures ;-). However, you
can try to avoid loading too much stuff, especially duplicates. Loading
duplicates can happen in ways one wouldn't expect. Rails does some
caching with ActiveRecord, but don't expect too much.

Let's say you have

class Result < ActiveRecord::Base
  has_many :links
  has_many :messages
end

class Link < ActiveRecord::Base
  belongs_to :result
end

class Message < ActiveRecord::Base
  belongs_to :result
end

and a controller + view that try to generate an overview, doing
something like this:

messages = Message.find(:all, :readonly => true,
                        :conditions => { :nof_answers => 0 })
...
do something
...
<% messages.each do |m| %>
  <%= rewrite_links(m, m.result.links) %>
<% end %>
...
do something else
...

You may end up re-creating (new) Result objects and Link objects every
time the m.result.links chain is executed. And there is a good chance
all of those will still be referenced somehow.

To solve the problem in the above example, you can use:

messages = Message.find(:all, :readonly => true,
                        :conditions => { :nof_answers => 0 },
                        :include => ['result'])
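
To show why loading each parent once can reduce the number of objects, here is a plain-Ruby model of the two loading styles. It does not use ActiveRecord at all: FakeResult, FakeMessage, and the two loader methods are hypothetical, and how closely real ActiveRecord matches this depends on the Rails version.

```ruby
# Stand-in for a parent record; counts how many instances get built.
class FakeResult
  @instantiated = 0
  class << self
    attr_accessor :instantiated
  end

  attr_reader :id

  def initialize(id)
    @id = id
    self.class.instantiated += 1
  end
end

FakeMessage = Struct.new(:id, :result_id, :result)

# Six messages that point at only two distinct results.
def build_messages
  (1..6).map { |i| FakeMessage.new(i, i.odd? ? 1 : 2, nil) }
end

# Lazy style: every message builds its own copy of the parent.
def load_lazily(messages)
  messages.each { |m| m.result = FakeResult.new(m.result_id) }
end

# Eager style: build each distinct parent once and share the instance.
def load_eagerly(messages)
  by_id = {}
  messages.map(&:result_id).uniq.each { |id| by_id[id] = FakeResult.new(id) }
  messages.each { |m| m.result = by_id[m.result_id] }
end

FakeResult.instantiated = 0
load_lazily(build_messages)
lazy_count = FakeResult.instantiated

FakeResult.instantiated = 0
load_eagerly(build_messages)
eager_count = FakeResult.instantiated

puts "lazy: #{lazy_count} objects, eager: #{eager_count} objects"
# prints "lazy: 6 objects, eager: 2 objects"
```

With many messages per result, the lazy style multiplies the number of parent objects on the heap, which is the kind of duplicate loading described above.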


Hope this helps,
Joachim

Hongli Lai

Apr 30, 2011, 1:27:27 PM
to phusion-...@googlegroups.com
On Fri, Apr 29, 2011 at 5:45 PM, Joachim Buechse <jbue...@greenliff.com> wrote:
> I was quite surprised how Ruby+Rails+Passenger deals with memory.
> Passenger re-uses the processes so anything that's been loaded in by a
> request and is still somehow referenced stays on the heap. One of the
> things referencing (which I didn't think about) are closures - ie
> blocks that you passed to Hash.map etc. Closures keep references to
> all visible variables and hold onto them until the closure itself is
> collected. Combined with caching (or effects I still haven't
> understood;-) this means the process holds on to objects until the
> same controller/action is called again.

You will find that pretty much every web application platform out there
except PHP works this way. They all load the application into memory and
keep the processes around to handle multiple requests, instead of
setting up and tearing down on every request the way PHP does. Mongrel,
Thin, JBoss, Tomcat, Node.js, etc. all work like this.

Joachim Buechse

May 2, 2011, 3:46:26 AM
to Phusion Passenger Discussions
Good day,

I think one should distinguish between code and data. It's great that
Ruby/Passenger pre-loads the framework code and keeps the app code
loaded after a request. But I'm sure you will find that almost any
technology other than Ruby (and especially Java) does not have the same
memory issues regarding data. I.e. the next request will not find data
from a previous request on the heap unless you have explicitly added
code to store this data.


Hongli Lai

May 2, 2011, 3:57:09 AM
to phusion-...@googlegroups.com
On Mon, May 2, 2011 at 9:46 AM, Joachim Buechse <jbue...@greenliff.com> wrote:
> Good day,
>
> I think that one should make a difference between code and data. It's
> great that Ruby/Passenger pre-loads the framework code and keeps the
> app code loaded after a request. But I'm sure you will find, that
> almost any technology other than ruby (and especially Java) does not
> have the same memory issues regarding data. Ie the next request will
> not find data from a previous request on the heap unless you have
> explicitly added code to store this data.

Neither Ruby nor Phusion Passenger does that, so I'm not sure what
you're talking about.

Joachim Buechse

May 2, 2011, 12:20:32 PM
to Phusion Passenger Discussions
Just take the example from my previous post. You will find plenty of
ActiveRecord objects on the heap from the previous request...


Hongli Lai

May 2, 2011, 12:26:44 PM
to phusion-...@googlegroups.com
On Mon, May 2, 2011 at 6:20 PM, Joachim Buechse <jbue...@greenliff.com> wrote:
> Just take the example from my previous post. You will find plenty of
> ActiveRecord objects on the heap from the previous request...

Actually, what you described is not what you think it is. Ruby's
garbage collector eventually cleans those objects up because they have
no references from the root set. It does not do that immediately, nor
at the end of the request, but at some time it deems appropriate.
ObjectSpace.each_object just happens to let you access dead objects;
that does not mean they will stay alive. Java's garbage collector is no
different: it does not clean up dead objects immediately either, nor at
the end of the web request, but at some time it deems appropriate. The
difference is that Java has no ObjectSpace.each_object equivalent, so
there really is no way to access dead objects.
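
A small experiment along these lines can show the effect. This is only a sketch for MRI Ruby: Scratch is a hypothetical throwaway class, and the counts vary from run to run because the collector may already have run while the objects were being allocated.

```ruby
class Scratch; end  # hypothetical throwaway class

# Number of Scratch-like objects that ObjectSpace can currently see.
def count_instances(klass)
  ObjectSpace.each_object(klass).count
end

# Create objects without keeping any reference to them.
def make_garbage(n)
  n.times { Scratch.new }
  nil
end

make_garbage(10_000)
before_gc = count_instances(Scratch)  # may still include unreachable objects
GC.start
after_gc = count_instances(Scratch)

puts "visible before GC: #{before_gc}, after GC: #{after_gc}"
```

If the first number is much larger than the second, the objects seen by each_object were already unreachable and simply had not been swept yet.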


foreverman

May 3, 2011, 7:26:55 AM
to Phusion Passenger Discussions
Hey Hongli, Joachim,

Thanks for all your replies and suggestions. Sorry for the late reply.
About the out of memory issue: I found some places in our code with a
'memory bloat' problem, that is, they load too much into memory.
But I don't understand why the memory occupied didn't go down after a
few requests (with the memory bloat issue). I set
'passenger_max_pool_size' to 1 so I can inspect a single process, and I
repeatedly issued the same request, which loads too much into memory.
Here is the memory change history:

mem_bloat_req:change-308392KB
mem_bloat_req:change-2436KB
mem_bloat_req:change-1248KB
mem_bloat_req:change-464KB
mem_bloat_req:change-564KB
mem_bloat_req:change-0KB

So you can see that memory goes up and never decreases. My question is:
is this normal? (Sure, there might also be a memory leak in our code.)

By the way, I created a controller/action that calls 'GC.start' and
issued a request to it after the memory bloat request, but it didn't
change anything (I ran passenger-memory-stats after that).

I also read the article at
http://www.scribd.com/doc/27174770/Garbage-Collection-and-the-Ruby-Heap
to understand how Ruby manages memory. Consider this scenario: we need
an extra 15 MB to hold new data but there are no free slots, so the GC
creates a new heap slab (assume 30 MB per slab). Will the unused 15 MB
(30 MB - 15 MB) be counted as 'consumed memory', i.e. will it be
included in the numbers printed by passenger-memory-stats?

Thanks.


foreverman

May 4, 2011, 4:39:35 AM
to Phusion Passenger Discussions
Another question is about the value of 'passenger_max_pool_size'.
Assuming we have 1.4 GB of memory in total per host, what is the
optimal value of 'passenger_max_pool_size' for that amount of memory?
