God's memory leak - a scientific treatment


Tom Werner

Feb 13, 2008, 2:02:40 PM
to god.rb
As some of you may have experienced, god has a slow memory leak. Since
my informal techniques have so far been fruitless, it's time to bust
out the scientific method. I will use this thread to document the
process and hopefully we can solve this problem.

Let's begin. I ask "Why is god leaking?"

Looking at /proc/<pid>/status for a god process (ruby 1.8.5
(2006-12-25 patchlevel 12) [x86_64-linux]) that is obviously leaking,
I see the following:

VmPeak: 297144 kB
VmSize: 297132 kB
VmLck: 0 kB
VmHWM: 206344 kB
VmRSS: 206336 kB
VmData: 234948 kB
VmStk: 228 kB
VmExe: 772 kB
VmLib: 5000 kB
VmPTE: 584 kB

The stack size is small and the data size is large, which means that
the leak is on the heap.
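
For anyone who wants to check their own god process, a few lines of Ruby along these lines will pull those fields out of /proc on a Linux box (a throwaway convenience script, not part of god):

# Print the interesting Vm* fields from /proc/<pid>/status (Linux only).
# Usage: ruby vm_fields.rb <pid>
pid = ARGV[0] or abort "usage: #{$0} <pid>"

wanted = %w[VmPeak VmSize VmRSS VmData VmStk]
File.readlines("/proc/#{pid}/status").each do |line|
  key, value = line.split(':', 2)
  puts "#{key}: #{value.strip}" if wanted.include?(key)
end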

Ruby can leak to the heap in two ways. By not garbage collecting from
the Ruby heap or by leaking memory to the C heap. I always assume it
is my own code that is broken, so my hypothesis is that I am somehow
causing Ruby heap objects to stick around and not be GC'ed. To test
this hypothesis I will use the excellent BleakHouse from Evan Weaver.
It produces a dump of the Ruby heap on demand and can then analyze the
frame dumps to calculate how many objects are never collected.

The test is against god 0.7.3 on ruby 1.8.5 (2006-12-25 patchlevel 12)
[i686-darwin8.9.1]. The config file contains a single watch with
interval = 0 (meaning it will run the condition continuously) that
runs a process_running condition. Heap snapshots are taken at the
beginning and end of the driver loop.
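
The config is essentially a single watch along these lines (the real file lives at test/configs/stress/stress.god in the repo; the watch name and start command below are just placeholders):

# Sketch of the stress config: one watch, interval 0, polling a
# process_running condition as fast as the driver loop allows.
God.watch do |w|
  w.name     = 'stress'
  w.interval = 0                    # run the poll cycle continuously
  w.start    = '/bin/sleep 100000'  # placeholder process to keep alive

  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.running = false             # fire when the process is not running
    end
  end
end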

After several seconds, god holds steady at around 9150kb for ten
minutes or so. Then it starts leaking. When I finally stop god about
20 minutes later, it is up to around 9500kb. This test run produces a
20 GB (!) dump file. An overnight analysis gives me this:

-----------------------------------

88733 full frames. Removing 15 frames from each end of the run to account for startup overhead and GC lag.

5220649 total births, 5220403 total deaths, 611 uncollected objects.

Tags sorted by persistent uncollected objects. These objects did not exist at startup, were instantiated by the associated tags, and were never garbage collected:
end-driver leaked (over 44350 requests):
178 String
146 Array
111 Time
4 MatchData
3 God::DriverEvent
1 IO
1 Float
1 God::System::Process
1 File
1 Process::Status
begin-driver leaked (over 44355 requests):
29 String
3 Array
1 NoMethodError

-----------------------------------

The analysis shows that no unexpected Ruby heap leaks are occurring.
God is set to hold on to the last 100 log lines (which are stored as
[text, time]) accounting for 100 Array objects, 100 String objects,
and 100 Time objects. The remaining objects represent the working set
that god uses during a normal driver cycle and are within my
expectations.
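
To make that accounting concrete, the log buffer behaves roughly like this (an illustrative sketch, not the actual god logging code):

# Keep only the last 100 [text, time] pairs. Each retained entry pins one
# Array, one String, and one Time on the Ruby heap, which lines up with the
# counts in the dump above.
class RecentLines
  MAX = 100

  def initialize
    @lines = []
  end

  def push(text)
    @lines << [text, Time.now]
    @lines.shift while @lines.size > MAX
  end
end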

I conclude that under this setup, uncollectable objects do not account
for the memory leak. This leaves C heap leaks as the likely culprit.
My next experiment will formalize this hypothesis.

Tom Werner

Feb 13, 2008, 8:00:39 PM
to god.rb
I know that the memory leak isn't caused by leaking Ruby objects and I
know that a basic looping Ruby program does not leak memory. There
must, then, be some part of the code that causes Ruby to leak memory
internally. Since the memory leak seems to be related to how
frequently the driver loop executes, I hypothesize that most or all of
the memory leak is contained within the driver loop handlers.

Experiment B

In this experiment I will disable the handler part of the driver loop,
instead rescheduling the condition immediately. The code for this test
can be found at

https://github.com/mojombo/god/tree/f59621c270c5d804ff362c152823946d6942e07c

The specific change is highlighted at

https://github.com/mojombo/god/tree/f59621c270c5d804ff362c152823946d6942e07c/lib/god/driver.rb#L73-75
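
Conceptually, the change amounts to something like this (a paraphrase of the idea, not the real lib/god/driver.rb; all names here are made up):

require 'thread'

# Sketch: pop the next scheduled event, skip the handler entirely, and put the
# condition straight back on the queue so the loop keeps spinning.
queue = Queue.new
queue.push(:poll_condition)

5.times do               # bounded here; the real driver loop runs forever
  event = queue.pop
  # handle(event)        # normal handler dispatch, disabled for this experiment
  queue.push(event)      # reschedule the condition immediately
end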

For these memory leak tests, I will consider god NOT leaking if it
does not vary significantly from the startup memory usage after 10,000
seconds. Here are the test results:

memory in kb (second)
---------------------------------
7604 (1)
7604 (2)
7604 (3)
7604 (4)
7604 (5)
...
7588 (9996)
7588 (9997)
7588 (9998)
7588 (9999)
7588 (10000)

This is running under ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.9.1].
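
The numbers in these tables are just once-a-second samples of the god process's memory; a script along these lines produces the same kind of output (the exact harness doesn't matter; this is the idea, assuming the figure tracked is resident set size as reported by ps):

# Emit "<memory in kb> (<second>)" once per second for a given pid.
pid = ARGV[0] or abort "usage: #{$0} <pid>"

second = 0
loop do
  second += 1
  rss = `ps -o rss= -p #{pid}`.strip
  break if rss.empty?    # the process has exited
  puts "#{rss} (#{second})"
  sleep 1
end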

I conclude that the memory leak is caused by code within the driver
handlers. Some Ruby call or calls must be leaking memory within that
code. Isolating the offending code should hopefully allow me to work
around it.

Tom Werner

Feb 14, 2008, 1:49:15 PM
to god.rb
As an addendum to Experiment B, I did the same test with the 1.8.6
BleakHouse Ruby.

$ ruby-bleak-house -v
ruby 1.8.6 (2007-09-23 patchlevel 110) [i686-darwin8.11.1]

$ ruby-bleak-house -I/p/libdata/ruby/1.8 bin/god -c test/configs/stress/stress.god -D --no-events

Here are the results:

14356 (1)
14356 (2)
14356 (3)
14356 (4)
14356 (5)
...
14528 (10000)
...
14720 (20000)
...
14892 (30000)
...
15064 (40000)
...
15244 (50000)
...
15448 (62402)
15448 (62403)
15448 (62404)
15448 (62405)
15448 (62406)

Over the 17 hour test, memory increased by 1.1 MB. This represents
perhaps 1,000,000 times through the driver loop (the test used 100% of
both cores on my MBP).

I'm not sure what to make of this data right now. It appears to be leaking, albeit very, very slowly. I'm also not sure why initial memory usage is higher. More investigation needs to go into 1.8.6 behavior, but for the moment I will concentrate on 1.8.5, as it shows no memory leak for this test, which should make it easier to isolate the leaking code.

Rubinius is almost at the point where it can run god. I will keep an eye on their progress, and once a test run can be done against Rubinius I will do an experiment under that environment.

Tom Werner

Feb 14, 2008, 5:21:12 PM
to god.rb
Experiment B showed that god does not leak memory under 1.8.5 if the
main processing logic is removed. This is for a single Watch. It's
possible that thread interactions will cause the leak to manifest. I
hypothesize that configuring 10 watches (one driver thread per watch)
will also NOT leak. I will perform a 10,000 second experiment to test
this.

Experiment C

http://github.com/mojombo/god/tree/ea49aacb71b6d90fa0c65e19254299b060723822

7896 (0)
7896 (1)
7896 (2)
7896 (3)
7896 (4)
...
7940 (2500)
...
7988 (5000)
...
8032 (7500)
...
8084 (9996)
8084 (9997)
8084 (9998)
8084 (9999)
8084 (10000)

So it looks like a slow leak. Memory usage after 10,000 seconds (2.78 hours) was 188kb higher than at the start. While I'd say this is a real leak, it accounts for only a tiny portion of the problem, so I will shelve it for later inspection. If the main leak is not caused by threaded interactions at this level, then it is time to delve into the handler code and see if I can isolate the leaking code.

Tom Werner

Feb 15, 2008, 1:22:59 PM
to god.rb
It occurs to me that I have not run a control test under the decided-upon setup.

$ ruby -v
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.9.1]

Experiment D

http://github.com/mojombo/god/tree/b91e3172af83c5a937a97ddbab88972922a01cf1

This experiment will act as the control, showing the memory leak in
god 0.7.2 (at full speed). A 10,000 second test with one Watch will be
performed if my machine can hold on that long.

8088 (0)
8136 (1)
8188 (2)
8240 (3)
8288 (4)
...
54208 (1000)
...
98304 (2000)
...
142992 (3000)
...
186936 (4000)
...
229920 (5000)
...
272516 (6000)
...
314444 (7000)
...
356028 (8000)
...
396920 (9000)
...
437708 (9996)
437752 (9997)
437788 (9998)
437832 (9999)
437868 (10000)

So there you have it. 437 MB memory usage after 10,000 seconds.
Amazing.

Tom Werner

Feb 15, 2008, 4:27:06 PM
to god.rb
Intuition tells me that the logging subsystem is a likely candidate for memory leaks. There's a lot of data being pushed around in there. I hypothesize that disabling it will at least reduce the memory leak.

Experiment E

I will disable all logging from within the handlers.

http://github.com/mojombo/god/commit/e4c85a3221fc1a40add0b73abad9403ba6d2005d

Here are the results

7404 (0)
7404 (1)
7404 (2)
7396 (3)
7396 (4)
...
7384 (9996)
7384 (9997)
7384 (9998)
7384 (9999)
7384 (10000)

Wow, this is great news! No leaks with logging removed. Slowly the
pincers close...

Tom Werner

Feb 15, 2008, 7:53:34 PM
to god.rb
So far I've shown that the leak is in the log_line method of task.rb.

Experiment F:

I will play around with the enabled parts of log_line and see if I can
isolate a single line in the method that is causing the leak.

...

After enabling all the code, I've found that this line, when removed,
prevents the leak (at least for short runs):

debug_message = watch.name + ' ' + condition.base_name + " [#{result}] " + self.dest_desc(metric, condition)

...

Further testing shows that it's the condition.base_name part of the
line that is responsible. Here's the code:

def base_name
  self.class.name.split('::').last
end

Amazingly, this seemingly benign method rings a very familiar bell.
Another developer at Powerset had identified a piece of code in his
app that looked very much like this and resulted in a HUGE memory leak
as well. The odd but functional fix that we identified was to create a
local scope by declaring a local variable. Here's the diff that I will
try for this test run:

  def base_name
+   x = 1
    self.class.name.split('::').last
  end

And, drumroll please:

7444 (0)
7444 (1)
7444 (2)
7444 (3)
7444 (4)
...
7432 (9996)
7432 (9997)
7432 (9998)
7432 (9999)
7432 (10000)

All I can say is SCIENCE!

I can't tell you how glad I am to have finally solved this problem.
The fun isn't over yet, though. I'll now confirm the fix for 10
watches and on 1.8.6. Stay tuned!

Chris Van Pelt

Feb 15, 2008, 10:11:30 PM
to god...@googlegroups.com
Holy crap, it would be interesting to figure out how the F defining a local variable prevents the leak... It almost makes me more frustrated that this is the fix :)

Andrew Stewart

Feb 16, 2008, 7:34:03 AM
to god...@googlegroups.com

On 16 Feb 2008, at 00:53, Tom Werner wrote:
> Further testing shows that it's the condition.base_name part of the
> line that is responsible. Here's the code:
>
> def base_name
> self.class.name.split('::').last
> end

Well done for isolating this; a triumph of empirical legwork.


> Amazingly, this seemingly benign method rings a very familiar bell.
> Another developer at Powerset had identified a piece of code in his
> app that looked very much like this and resulted in a HUGE memory leak
> as well. The odd but functional fix that we identified was to create a
> local scope by declaring a local variable. Here's the diff that I will
> try for this test run:
>
> def base_name
> + x = 1
> self.class.name.split('::').last
> end

Odd indeed. This looks like a bug in Ruby. Would Ruby Talk be the
best place to raise this in the hope of solving the problem at source?

Regards,
Andy Stewart

-------
http://airbladesoftware.com

Bob Hutchison

Feb 16, 2008, 11:40:04 AM
to god...@googlegroups.com, Bob Hutchison
Hi,

On 15-Feb-08, at 7:53 PM, Tom Werner wrote:

> All I can say is SCIENCE!
>
> I can't tell you how glad I am to have finally solved this problem.
> The fun isn't over yet, though. I'll now confirm the fix for 10
> watches and on 1.8.6. Stay tuned!

Well done!

Here's a test case that demonstrates what is surely a bug. If you
want, you can post on ruby-talk or someplace more efficient to get
some attention and a fix. Make sure you post the test case.

module Play
  class Toy
    def peek_at_proc
      puts `ps u -p #{Process.pid}`
    end

    def base_name
      # x = 1 # <<<<<< REMOVE COMMENT TO FIX MEM LEAK
      "a,b".split(',')
    end

    def test
      peek_at_proc

      1000000.times do
        base_name
      end

      peek_at_proc
    end
  end
end

Play::Toy.new.test

Kevin Clark

Feb 16, 2008, 3:36:39 PM
to god...@googlegroups.com, Bob Hutchison
I was actually able to reduce the leak further when we found it last month:

class Bar
  def self.class_name
    name.split(/::/)
  end
end

loop { Bar.class_name }

Leaks rather a lot.

--
Kevin Clark
http://glu.ttono.us

Phil Murray

Feb 17, 2008, 4:59:05 AM
to god.rb
Seems you're not the only ones to discover it

See here: http://opensynapse.net/post/24570619

Kevin Clark

Feb 17, 2008, 3:17:18 PM
to god...@googlegroups.com
On Feb 17, 2008 1:59 AM, Phil Murray <pmu...@open2view.com> wrote:
>
> Seems you're not the only ones to discover it
>
> See here: http://opensynapse.net/post/24570619

He's another Powerset developer, referring to our internal ruby list ;)

Tom Werner

Feb 18, 2008, 6:33:27 PM
to god.rb
This experiment will test god with the memory leak fix against the BleakHouse build of Ruby 1.8.6.

Experiment G:

Same code as Experiment F.

http://github.com/mojombo/god/commit/97f4e2a4d0785d22798996d99e29191243a1a711

$ ruby-bleak-house -v
ruby 1.8.6 (2007-09-23 patchlevel 110) [i686-darwin8.11.1]

I'll fire this off before I leave today and let it run over the weekend.

14380 (0)
14380 (1)
14380 (2)
14380 (3)
14380 (4)
...
14684 (50000)
...
14688 (100000)
...
14688 (150000)
...
14692 (200000)
...
14696 (249996)
14696 (249997)
14696 (249998)
14696 (249999)
14696 (250000)

This is interesting. I'm not sure whether this represents a long-term leak or not. Memory usage increased by 4k between 3 of the last 4 samples above. Considering that each 50,000 second sample represents about 75 million times through the driver loop, that would be an awfully slow leak. I am happy to see that it plateaus in 1.8.6, even if that plateau is not completely flat. That's good enough for now, and I'll put aside some time in the future to investigate further. The fact that 1.8.6 settles at about twice the memory usage of 1.8.5 is a bit curious and also deserves more attention.

Tom Werner

Feb 19, 2008, 1:36:00 PM
to god.rb
Experiment H:

Now I will test the memory-fixed god on 1.8.5 with 40 watches.

13336 (0)
13556 (1)
13768 (2)
...
21428 (10000)
...
26584 (20000)
...
31800 (30000)
...
37096 (40000)
...
42344 (50000)
...
47680 (60000)
...
50648 (65901)
50648 (65902)
50648 (65903)

It seems the memory leak fix for a single watch does not eliminate leaks under many watches. Unfortunate.

Looks like it's time to start a new set of experiments based on the
many-watch behavior.

Tom Werner

Feb 19, 2008, 5:46:09 PM
to god.rb
In order to get a better understanding of the situation, I'm going to
do a set of runs with various numbers of watches to see how memory
usage compares between them. This may shed some light on where the
problem is situated.

ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.9.1]
bin/god -c test/configs/stress/stress.god -D

Test runs will be 1000 seconds each for 1, 2, 4 and 8 watches. All
values are in kilobytes.

1 watch
7520..7524

2 watches
7940..40312

4 watches
7696..46300

8 watches
14704..41140

The results show that for a single watch there is no memory leak, as
I've already concluded. Adding just one more watch causes memory to
explode. I want to verify again that removing the driver handler will
stop the leak.

8 watches (no driver handlers)
7916..7388

Oh look, memory usage actually decreased! Isn't that a sight to
behold! So the leak is somewhere within the handler, once again.

This wraps up this set of tests. Next I'll dig into the event handler and start narrowing the field.

Tom Werner

Feb 19, 2008, 8:49:11 PM
to god.rb
Experiment I:

This experiment tests my exploratory narrowing of logging code to a
single method in log_line. The codebase is

http://github.com/mojombo/god/tree/b2a50c284036e094e1f751d5745501f8e5d836c3

This is a standard 10000 second test on Ruby 1.8.5 with 8 watches.

14240 (1)
...
13104 (10000)

So the leak is caused somewhere in applog, but only for multiple watches. Curious indeed. Here is the specific line that causes the leak (the leak goes away when it is commented out):

http://github.com/mojombo/god/tree/b2a50c284036e094e1f751d5745501f8e5d836c3/lib/god/task.rb#L420

Tom Werner

Feb 20, 2008, 1:33:33 PM
to god.rb
Experiment J:

Digging around in the internals of the god logger has allowed me to isolate the leak inside Ruby's Logger class. To verify that it leaks in a multi-threaded setup, I wrote the following simple case:

-------------------------------

require 'logger'

log = Logger.new(STDOUT)

threads = []

10.times do
  threads << Thread.new do
    loop do
      log.info("foo")
    end
  end
end

threads.each { |t| t.join }

---------------------------

This does indeed leak like mad under 1.8.5 and 1.8.6, as confirmed by several people. This test, then, involves writing a very simple logger to replace Ruby's Logger and seeing if that eliminates the leak. The version I am testing can be found here:

http://github.com/mojombo/god/tree/f67f6090a548f9139a28aa58205e0e402517e5d9
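
The shape of the replacement is roughly this (a minimal sketch of the idea; see the branch above for the real code):

require 'thread'

# A mutex around a plain IO write, with the timestamp formatted by hand, so
# Ruby's Logger never gets involved.
class TinyLogger
  def initialize(io = STDOUT)
    @io    = io
    @mutex = Mutex.new
  end

  def info(message)
    @mutex.synchronize do
      @io.puts "I [#{Time.now.strftime('%Y-%m-%d %H:%M:%S')}] INFO: #{message}"
    end
  end
end

Swapping log = Logger.new(STDOUT) for log = TinyLogger.new in the repro script above is the essence of the test.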

And now for the test. I will run this one overnight to be extra sure
of what's going on.

14412 (0)
...
14612 (10000)
...
14788 (20000)
...
14964 (30000)
...
15148 (40000)
...
15328 (50000)
...
15428 (55822)

There appears to be a leak at a rate of about 200k per 10,000 seconds
with the simple logger. I'm going to flesh out the rest of the simple
logger and then release a version with the fix. While a leak seems to
persist, it is 3 orders of magnitude smaller now. That will please
many users.

Maybe I'll even work on some new features for once. =)

Tom Werner

Feb 21, 2008, 4:08:51 PM
to god.rb
Experiment K:

This is an experiment with the new SimpleLogger that only supports the
functionality that I need.

An interesting thing happened when I did some informal tests with the
new logger, which can be found at

http://github.com/mojombo/god/tree/ab3489cc21b26461fdd643dd7e0ff8d91bd40984

While this code is almost exactly the same as Loggy (the simple logger I wrote on the science branch), it now leaks again. In isolating the leak, I narrowed it down to a change I had made in the assignment of severity level constants. Loggy had

DEBUG = 4
INFO = 3
WARN = 2
ERROR = 1
FATAL = 0

which was backwards from what the Ruby Logger has, which is

DEBUG = 0
INFO = 1
WARN = 2
ERROR = 3
FATAL = 4

It turns out that the problem is in assigning the value 1 to INFO. If
that constant is changed to any other integer, it does not leak nearly
as fast (it still has that 200k/10,000 leak though). This is totally
incomprehensible to me, but if you'd like to try to replicate it, here
are the two git commits that represent the fast leak and slow leak
versions.

fast leak: http://github.com/mojombo/god/tree/c96249b0e8a97a49771039bc4e57dacd029d05ba
slow leak: http://github.com/mojombo/god/tree/e88d81654afd164e7b3fbaf05b7a6334e3a6b2a2

and the diff between them:

diff --git a/lib/god/logger.rb b/lib/god/logger.rb
index 7150fca..6f55d1c 100644
--- a/lib/god/logger.rb
+++ b/lib/god/logger.rb
@@ -1,7 +1,7 @@
 module God
   class Loggy
     DEBUG = 0
-    INFO = 1
+    INFO = 3
     WARN = 2
     ERROR = 3
     FATAL = 4

I may go into this a bit more later as I discovered some other things
about this weirdness, but for now, let's move on. Simply changing the
constant values so that none of them are 1 gets us back to the slow
leak.

http://github.com/mojombo/god/tree/4ff157ce00017811414a6ef0b94d6d6e257932c3

And here is the 10k test to prove it.

14612 (0)
...
14836 (10000)
...
15052 (20000)
...
15264 (30000)
...
15480 (40000)
...
15708 (50000)
...
15920 (60000)
...
16128 (69269)

Once again, here is the 200k/10,000 second leak that we had in the
science branch with the hackish logger. Next, let's discover where
this 200k leak is coming from!

Andrew Stewart

Feb 22, 2008, 5:37:29 AM
to god...@googlegroups.com

On 21 Feb 2008, at 21:08, Tom Werner wrote:
> It turns out that the problem is in assigning the value 1 to INFO. If
> that constant is changed to any other integer, it does not leak nearly
> as fast (it still has that 200k/10,000 leak though). This is totally
> incomprehensible to me, ....

This is like Alice in Wonderland. Curiouser and curiouser....

Keep up the good detective work!

psada...@gmail.com

Feb 29, 2008, 12:21:56 PM
to god.rb
This is pretty interesting stuff. I've noticed there haven't been any updates for a week, though. Is this still progressing? Any workarounds (can god be made to monitor and restart god)? Anything I can do to help?

Thanks for the awesomeness of god
Paul

chris [at] thewebfellas.com

Mar 6, 2008, 8:31:54 PM
to god.rb
Thanks for the in-depth approach. Noticed my god process had got up to 400 MB, googled, and got straight here. Refreshing to be given such an in-depth analysis. Keep up the good work!

gobigdave

Apr 9, 2008, 8:19:44 PM
to god.rb
Any new updates on this issue? It makes me a little nervous that my monitor leaks (a few hundred megs in 10 days). After reading this thread, I'm also really interested in hearing what the problem is.

Chris

Apr 17, 2008, 12:15:46 AM
to god.rb
The code that Bob Hutchison posted still leaks in ruby trunk. There
are two open ruby bug reports that describe the leak:
Newer one that refers to this thread:
http://rubyforge.org/tracker/?group_id=426&atid=1698&func=detail&aid=19088
Older one that actually suggests a patch:
http://rubyforge.org/tracker/?group_id=426&atid=1698&func=detail&aid=15425

I'm unsure about the line numbers in the older bug report, but I made
the following change and Bob's code does not leak.

Index: parse.y
===================================================================
--- parse.y (revision 16057)
+++ parse.y (working copy)
@@ -5752,6 +5752,7 @@
 if (!(ruby_scope->flags & SCOPE_CLONE))
     xfree(ruby_scope->local_tbl);
 }
+ruby_scope->local_vars[-1] = 0;
 ruby_scope->local_tbl = local_tbl();

Chris

Apr 17, 2008, 1:48:58 PM
to god.rb
Clarification: this still leaks on the ruby_1_8 branch. I just tried
it on ruby trunk (1.9.0 r16065) and it appears to be fixed.

gobigdave

May 11, 2008, 7:25:02 PM
to god.rb
How much are people leaking? My god process starts at around 11M, and
in less than a week it's up to 55M. I end up restarting it every few
days. I hate to think about this, but since I can't go to the next
version of Ruby right now, Monit may be in my future.

Moses Hohman

May 11, 2008, 8:12:08 PM
to god...@googlegroups.com
We just restart god with a cron job every night, and it really doesn't cause us much grief; it just means our app isn't monitored for a couple of seconds every night.

Moses

Moses Hohman

May 11, 2008, 8:16:10 PM
to god...@googlegroups.com
In detail this time.

Here's our init.d script for god:

#!/usr/bin/env bash

case "$1" in
  start)
    RAILS_ENV=production RAILS_ROOT=/your_app_root/current \
      /usr/bin/god -P /var/run/god.pid -l /var/log/god.log
    /usr/bin/god load /your_app_root/current/config/monitor.god
    ;;

  stop)
    # stops god, leaves monitored processes running
    /usr/bin/god quit
    ;;

  terminate)
    # stops god, killing monitored processes
    /usr/bin/god terminate
    ;;

  restart)
    $0 stop
    $0 start
    ;;

  *)
    # Unknown argument.
    echo "usage: $0 {start|stop|restart|terminate}"
    exit 1
    ;;
esac

exit 0

And the root cron job is:

0 2 * * * /etc/init.d/god restart

Tom Werner

May 11, 2008, 9:52:27 PM
to god...@googlegroups.com
Another stop-gap that you can try until the memleak is fixed is to run
god under init with respawn, and then create a watch that watches god
itself. If god exceeds some limit, just kill it and init will respawn
it.
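
To make that concrete, here's a rough sketch; every path, name, and threshold in it is an assumption, not something from my setup:

# The init half is a respawn entry in /etc/inittab along the lines of:
#
#   g1:2345:respawn:/usr/bin/god -D -c /etc/god/master.god -l /var/log/god.log
#
# The watch half can then use god's :memory_usage condition against god's own
# pid file, with a "restart" that simply kills the process and lets init
# respawn it.
God.watch do |w|
  w.name     = 'god-self'
  w.interval = 60.seconds
  w.pid_file = '/var/run/god.pid'
  w.start    = '/bin/true'                    # placeholder; init, not god, starts god
  w.restart  = 'kill `cat /var/run/god.pid`'  # init's respawn brings god back

  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 100.megabytes
      c.times = [3, 5]    # 3 of the last 5 samples over the limit
    end
  end
end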

We'll be focusing on the memory leak in the coming weeks at Powerset,
so hopefully this won't continue to be a problem for much longer.

Tom

gobigdave

May 20, 2008, 8:43:45 AM
to god.rb
Just upgraded to 0.7.5, and my god process hasn't leaked in 18 hours.
It's never gone this long without leaking before. So far, the changes
seem to be working.


gobigdave

May 22, 2008, 3:24:29 PM
to god.rb
My site can't thank you enough for fixing the memory leak. I was running close to my memory limit, so an extra 20-30 MB caused me to begin swapping. I never ran into any issues before god.rb because my processes all max out and stay there. Before this update, god would grow enough in a day to cause swapping. Now, everything is back to happy again, and I can stop looking at monit.

Thank you!

Tom Werner

May 22, 2008, 4:19:29 PM
to god...@googlegroups.com
Hurray! Just out of curiosity, what OS are you on? God still leaks under certain conditions, and I'm trying to pin those down.

Tom

gobigdave

May 28, 2008, 8:05:34 AM
to god.rb
Sorry for the delay. I'm running on two 64-bit instances of CentOS
5.1.
