Memory Leak with EM - where do I start?

Avner Cohen

Feb 22, 2015, 3:48:52 PM
to eventm...@googlegroups.com
A few months back I posted this:

The war story of how I migrated a socket.io (Node.js-based) socket server to an EventMachine-based solution, and all the problems went away.

It turns out I'm still going in circles. The impact of the memory leak is reduced, but I still need to restart my server every 24 hours: memory grows from 250 MB to 600 MB, and at 600 MB the CPU spikes to 100% and the server hangs.

I use EM with EM http and em-websocket, all at their latest versions.
Full code is visible in this gist - https://gist.github.com/AvnerCohen/72540e2dc13a56b4be87

Questions:
1. Any suggestions as to what could be causing this leak?
Or,
2. How do I go about inspecting a memory leak in EventMachine? Given that it is mainly C++ code and a few years old, can I assume this area is free of leaks by now?

Any help or directions would be much appreciated.

Aman Gupta

Feb 23, 2015, 10:05:24 PM
to eventm...@googlegroups.com
What ruby version are you using? Which platform (mac, linux, windows)? Which eventmachine version? And do you have epoll or kqueue enabled?

The first step would be to narrow it down to a C/C++ leak vs. a Ruby leak. You can use ObjectSpace.count_objects and GC.stat to see if the number of Ruby objects is increasing over time. If so, it's likely a Ruby-level reference leak. If not, something might be leaking at the C layer.
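
A minimal sketch of that first step, assuming two illustrative helpers named `heap_snapshot` and `heap_growth` (the names and the choice of tracked types are assumptions, not part of EventMachine). It relies only on `ObjectSpace.count_objects` and `GC.stat`. The key that holds the live slot count differs between Ruby versions, so that field may be nil on some of them.

```ruby
# Hypothetical helper: one compact snapshot of Ruby-level heap state.
def heap_snapshot
  tracked = [:T_OBJECT, :T_CLASS, :T_STRING, :T_HASH, :T_ARRAY, :T_DATA]
  gc = GC.stat
  {
    time: Time.now,
    gc_runs: gc[:count],
    # The key name varies across Ruby versions; this may be nil on older ones.
    live_slots: gc[:heap_live_slots] || gc[:heap_live_slot],
    objects: ObjectSpace.count_objects.select { |type, _| tracked.include?(type) }
  }
end

# Per-type difference between two snapshots (after minus before).
def heap_growth(before, after)
  after[:objects].map { |type, n| [type, n - before[:objects].fetch(type, 0)] }.to_h
end

before = heap_snapshot
junk = Array.new(1_000) { |i| "string #{i}" } # stand-in for retained objects
after = heap_snapshot
p heap_growth(before, after)
```

Inside a running reactor, the snapshot could be logged from a timer such as `EM.add_periodic_timer(60) { ... }`. Counts that keep climbing across many samples suggest a Ruby-level reference leak; flat counts with growing process memory point toward the C layer.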

Aman

--
You received this message because you are subscribed to the Google Groups "EventMachine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to eventmachine...@googlegroups.com.
To post to this group, send email to eventm...@googlegroups.com.
Visit this group at http://groups.google.com/group/eventmachine.
For more options, visit https://groups.google.com/d/optout.

Avner Cohen

Mar 1, 2015, 8:37:33 AM
to eventm...@googlegroups.com
1. Using Ruby 2.1.5, no specific GC configuration, just the defaults.
2. Running on Ubuntu 12.04.3.

I logged the ObjectSpace counts every now and then and saw the following pattern:
1. :FREE=>100213, :T_OBJECT=>8548,  :T_CLASS=>6714, :T_STRING=>34772, :T_HASH=>4488, :T_DATA=>24134
2. :FREE=>98593,  :T_OBJECT=>8845,  :T_CLASS=>6862, :T_STRING=>35157, :T_HASH=>4685, :T_DATA=>24731
3. :FREE=>93513,  :T_OBJECT=>9746,  :T_CLASS=>7313, :T_STRING=>36422, :T_HASH=>5345, :T_DATA=>26529
4. :FREE=>84027,  :T_OBJECT=>11434, :T_CLASS=>8157, :T_STRING=>38782, :T_HASH=>6562, :T_DATA=>29910
5. :FREE=>82279,  :T_OBJECT=>11737, :T_CLASS=>8308, :T_STRING=>39264, :T_HASH=>6767, :T_DATA=>30514

So it does look like a Ruby-level leak. However, I can't see what exactly is leaking.
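
To make the trend in those samples easier to read, the first and last rows can be subtracted from each other. This is only arithmetic on the numbers logged above; the variable names are illustrative.

```ruby
# First and last samples, copied from the log above.
first = { FREE: 100213, T_OBJECT: 8548,  T_CLASS: 6714, T_STRING: 34772, T_HASH: 4488, T_DATA: 24134 }
last  = { FREE: 82279,  T_OBJECT: 11737, T_CLASS: 8308, T_STRING: 39264, T_HASH: 6767, T_DATA: 30514 }

# Absolute growth per type.
growth = last.map { |type, n| [type, n - first[type]] }.to_h

# Print largest growth first, with the relative change next to it.
growth.sort_by { |_, delta| -delta }.each do |type, delta|
  pct = 100.0 * delta / first[type]
  puts format("%-9s %+7d (%+.1f%%)", type, delta, pct)
end
```

Every tracked type grows while FREE shrinks. T_DATA gains the most slots in absolute terms and T_HASH grows fastest relative to its starting count.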

Avner Cohen

Mar 2, 2015, 2:56:53 AM
to eventm...@googlegroups.com
Some more investigation and info points to the following possible issues:

Leaked 1949 STRING objects of size 0/441032 at: ./gems/em-websocket-0.5.1/lib/em-websocket/framing07.rb:8
Leaked 1936 CLASS objects of size 0/1378432 at: ./gems/eventmachine-1.0.7/lib/em/connection.rb:49
Leaked 1932 STRING objects of size 0/77280 at: ./gems/em-websocket-0.5.1/lib/em-websocket/framing07.rb:9
Leaked 1930 OBJECT objects of size 0/186848 at: ./gems/em-websocket-0.5.1/lib/em-websocket/connection.rb:112
Leaked 1929 OBJECT objects of size 0/262344 at: ./gems/eventmachine-1.0.7/lib/em/connection.rb:49
Leaked 1899 HASH objects of size 0/75960 at: ./gems/em-websocket-0.5.1/lib/em-websocket/connection.rb:46
Leaked 230 HASH objects of size 0/53360 at: ./gems/em-websocket-0.5.1/lib/em-websocket/message_processor_06.rb:23
Leaked 224 STRING objects of size 0/8960 at: ./gems/em-websocket-0.5.1/lib/em-websocket/masking04.rb:28
Leaked 28 HASH objects of size 0/6496 at: ./gems/em-websocket-0.5.1/lib/em-websocket/handler.rb:79
Leaked 13 NODE objects of size 0/520 at: ./gems/eventmachine-1.0.7/lib/eventmachine.rb:187
Leaked 4 STRING objects of size 220/364 at: ./gems/eventmachine-1.0.7/lib/eventmachine.rb:187
Leaked 2 ARRAY objects of size 0/80 at: ./gems/eventmachine-1.0.7/lib/eventmachine.rb:187
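
The report above attributes live objects to the line that allocated them. On Ruby 2.1 and later, a similar per-site count can be approximated with the `objspace` standard library. This is a sketch under that assumption, not the tool that produced the report, and `live_allocation_sites` is a hypothetical helper name.

```ruby
require 'objspace'

# Hypothetical helper: count live objects per allocation site ("file:line").
# Only objects allocated while tracing was enabled carry a site.
def live_allocation_sites(limit = 10)
  GC.start # drop garbage so only reachable objects are counted
  sites = Hash.new(0)
  ObjectSpace.each_object do |obj|
    file = ObjectSpace.allocation_sourcefile(obj)
    next unless file # allocated before tracing started
    # Note: this helper allocates too, so its own line may appear in the report.
    sites["#{file}:#{ObjectSpace.allocation_sourceline(obj)}"] += 1
  end
  sites.sort_by { |_, n| -n }.first(limit)
end

ObjectSpace.trace_object_allocations_start
retained = Array.new(100) { "leak" * 10 } # stand-in for objects a leak would retain
report = live_allocation_sites
ObjectSpace.trace_object_allocations_stop

report.each { |site, n| puts "#{n}\t#{site}" }
```

Allocation tracing adds overhead, so on a production server it is probably best enabled for a limited window rather than for the whole process lifetime.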

Vaio Mike

Oct 30, 2015, 3:43:10 PM
to EventMachine
Hi Avner,

Just curious, were you ever able to resolve the problem?
I see a similar pattern with EM and would appreciate it if you could share your experience.

Am on ruby 1.9.3-p551, Ubuntu 14.04, EM 1.0.7&1.0.8
My ObjectSpace.count_objects looks like this:

initially:
{:TOTAL=>101376, :FREE=>1024, :T_OBJECT=>2497, :T_CLASS=>2644, :T_MODULE=>238, :T_FLOAT=>2931, :T_STRING=>50977, :T_REGEXP=>583, :T_ARRAY=>19595, :T_HASH=>1602, :T_STRUCT=>262, :T_BIGNUM=>913, :T_FILE=>6, :T_DATA=>10079, :T_MATCH=>177, :T_COMPLEX=>1, :T_RATIONAL=>799, :T_NODE=>6824, :T_ICLASS=>224}
{:T_OBJECT=>78248, :T_CLASS=>1684144, :T_MODULE=>275360, :T_STRING=>1064583, :T_REGEXP=>390709, :T_ARRAY=>659536, :T_HASH=>766784, :T_STRUCT=>576, :T_FILE=>9488, :T_DATA=>14723296, :T_MATCH=>4880, :TOTAL=>19657604}


after a few days:
{:TOTAL=>23086193, :FREE=>285158, :T_OBJECT=>2559624, :T_CLASS=>199023, :T_MODULE=>238, :T_FLOAT=>33268, :T_STRING=>6176197, :T_REGEXP=>584, :T_ARRAY=>5153883, :T_HASH=>4125014, :T_STRUCT=>196631, :T_BIGNUM=>8758, :T_FILE=>7, :T_DATA=>3742914, :T_MATCH=>2088, :T_COMPLEX=>1, :T_RATIONAL=>4767, :T_NODE=>597813, :T_ICLASS=>225}
{:T_OBJECT=>187081256, :T_CLASS=>86526872, :T_MODULE=>275360, :T_STRING=>20057846, :T_REGEXP=>391224, :T_ARRAY=>1623497024, :T_HASH=>1231021656, :T_STRUCT=>6784, :T_FILE=>9704, :T_DATA=>322313000, :T_MATCH=>167040, :TOTAL=>3471347766}


According to Aman's comment, this should clearly point to a Ruby-level leak. The question remains: how can one track this down further?

Thanks,
Mike

Avner Cohen

Nov 1, 2015, 2:30:49 AM
to eventm...@googlegroups.com
We were never able to track down the root cause. We ended up porting this part of the code to Go, and it has worked flawlessly since.

-Avner

Bill Kelly

Nov 1, 2015, 2:20:29 PM
to eventm...@googlegroups.com
Hi,

> Am on ruby 1.9.3-p551, Ubuntu 14.04, EM 1.0.7&1.0.8
> My ObjectSpace.count_objects looks like this:
>
> initially:
> {:TOTAL=>101376, :FREE=>1024, :T_OBJECT=>2497, :T_CLASS=>2644,
> :T_MODULE=>238, :T_FLOAT=>2931, :T_STRING=>50977, :T_REGEXP=>583,
> :T_ARRAY=>19595, :T_HASH=>1602, :T_STRUCT=>262, :T_BIGNUM=>913,
> :T_FILE=>6, :T_DATA=>10079, :T_MATCH=>177, :T_COMPLEX=>1,
> :T_RATIONAL=>799, :T_NODE=>6824, :T_ICLASS=>224}
> {:T_OBJECT=>78248, :T_CLASS=>1684144, :T_MODULE=>275360,
> :T_STRING=>1064583, :T_REGEXP=>390709, :T_ARRAY=>659536,
> :T_HASH=>766784, :T_STRUCT=>576, :T_FILE=>9488, :T_DATA=>14723296,
> :T_MATCH=>4880, :TOTAL=>19657604}

Not sure if this might be of help, but the following snippet should
show the class names of the objects in the system:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

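obj_types = Hash.new(0)

GC.start  # collect garbage first so only live objects are counted

ObjectSpace.each_object do |obj|
  # BasicObject instances do not respond to respond_to?, hence the rescue
  typename = if (obj.respond_to?(:respond_to?) rescue nil)
    if obj.respond_to? :ancestors
      # classes and modules; anonymous ones have a nil name
      (obj.name rescue nil) || "UnknownClass"
    else
      (obj.class.name rescue nil) || "UnknownInstance"
    end
  else
    "UnknownBasicObject"
  end
  obj_types[typename] += 1
end

# highest counts first, ties broken by name
obj_types.to_a.sort_by {|name, count| [-count, name]}.each {|name, count| puts "#{count}\t#{name}"}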

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(Tested on ruby 2.2.4p175)
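
A single listing shows what is alive but not what is growing. One way to take this further is to compare two listings taken some time apart. The following is a sketch under that assumption; `class_counts` and `top_growth` are hypothetical helper names, and the array of plain objects only simulates a leak.

```ruby
# Hypothetical helper: live instance counts keyed by class name.
def class_counts
  GC.start # drop garbage so only reachable objects are counted
  counts = Hash.new(0)
  # Restricting to Object skips bare BasicObject instances.
  ObjectSpace.each_object(Object) do |obj|
    name = (obj.class.name rescue nil).to_s
    name = "(anonymous)" if name.empty?
    counts[name] += 1
  end
  counts
end

# Classes whose instance count grew between two snapshots, largest first.
def top_growth(before, after, limit = 10)
  after.map { |name, n| [name, n - before.fetch(name, 0)] }
       .select { |_, delta| delta > 0 }
       .sort_by { |name, delta| [-delta, name] }
       .first(limit)
end

baseline = class_counts
held = Array.new(500) { Object.new } # stand-in for objects a leak would retain
top_growth(baseline, class_counts).each { |name, delta| puts "#{delta}\t#{name}" }
```

In a long-running server the two snapshots would be taken minutes or hours apart. Classes that stay at the top of the growth list across several comparisons are the ones worth inspecting for lingering references.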


Regards,

Bill
