gc doesn't collect?

Roger Pack

Jul 30, 2008, 8:13:59 PM
Any ideas why:

1.times { a = 'a'*1000};
30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

prints out 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...' despite the fact that
the a's should have been collected?
Thanks! More questions to come, most likely.
-R
--
Posted via http://www.ruby-forum.com/.

Eric Hodel

Jul 30, 2008, 8:40:12 PM
On Jul 30, 2008, at 17:13 PM, Roger Pack wrote:
> Any ideas why:
>
> 1.times { a = 'a'*1000};
> 30.times { GC.start };
> print ObjectSpace.each_object{|o| print o}
>
> prints out 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...' despite the fact that
> the a's should have been collected?

Ruby's garbage collector walks the C stack looking for values that
appear to point to the ruby heap. Ruby thinks you still have a
reference to your string because of this.
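
One way to observe this without printing the whole ObjectSpace is a weak reference. The following is a minimal sketch, assuming the standard weakref library; make_weak_string is a hypothetical helper name, not something from this thread. Whether the string survives GC.start depends on what happens to be left on the machine stack, so the result can differ between platforms and Ruby versions.

```ruby
require 'weakref'

# Hypothetical helper: create the string in its own method frame and
# hand back only a weak reference to it.
def make_weak_string
  WeakRef.new('a' * 1000)
end

ref = make_weak_string
GC.start

# With a conservative stack scan a stale pointer may keep the string
# alive, so either branch is possible here.
if ref.weakref_alive?
  puts 'string is still alive'
else
  puts 'string was collected'
end
```

An object that is still strongly referenced always reports alive, which is a quick sanity check that the weak reference itself works.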

Ryan Davis

Jul 30, 2008, 11:39:18 PM

well... I think in this case it is because he never dereferenced a, so
it is still a valid live object. {} doesn't scope variables the same
way as in, say, C.


Sandor Szücs

Jul 31, 2008, 6:05:22 AM
On 31.07.2008, at 02:13, Roger Pack wrote:

> Any ideas why:
>
> 1.times { a = 'a'*1000};
> 30.times { GC.start };
> print ObjectSpace.each_object{|o| print o}
>
> prints out 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...' despite the fact that
> the a's should have been collected?

In my opinion GC has not collected 'a' at the time you run through your
ObjectSpace.
If you wait longer then 'a' will be collected.

# this one collects 'a' on my machine:
$ ruby -e "1.times { a = 'a'*1000};
32.times { GC.start; sleep(1) };
print ObjectSpace.each_object{|o| print o}
"

regards, Sandor Szücs
--

Dave Bass

Jul 31, 2008, 12:24:53 PM
The thing with garbage collection (in most languages, I don't know about
Ruby specifically) is that it happens when the interpreter/compiler feels
like doing it, not when you tell it to do it. When it "feels like doing
it" could depend on a lot of factors. (I imagine it as being a low
priority child process.)

As far as I know there's no way to force garbage collection to happen,
although on the face of it this would seem to be a useful facility.

M. Edward (Ed) Borasky

Jul 31, 2008, 11:50:01 PM
GC.start ... right?
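
For reference, a small sketch of forcing a run and confirming that it happened. It assumes a Ruby that provides GC.count, which is newer than the 1.8 interpreters discussed in this thread; forced_gc_runs is a hypothetical helper name.

```ruby
# Hypothetical helper: returns how many GC runs one GC.start triggered.
def forced_gc_runs
  before = GC.count
  GC.start            # request a full collection right now
  GC.count - before
end

puts "GC runs triggered: #{forced_gc_runs}"
```

This only shows that a collection ran, not that any particular object was freed by it.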
--
M. Edward (Ed) Borasky
ruby-perspectives.blogspot.com

"A mathematician is a machine for turning coffee into theorems." --
Alfréd Rényi via Paul Erdős


Roger Pack

Aug 1, 2008, 1:18:41 AM
Just to clear up confusion:
I believe that GC.start 'forces' a garbage collection, and that
do...end and {...} blocks do indeed have their own scope and local
variables, as methods do.

interestingly,

def go
  1.times { a = 'a'*1000};
end
go

30.times { GC.start };
print ObjectSpace.each_object{|o| print o}

yields the same errant results. I might look into it sometime. Very
weird.

Now for some questions:

currently the GC marks live objects then sweeps to find any free
objects--except it doesn't actually free any objects that are free but
need finalization. It seems to only do finalizations when a user
explicitly calls GC.start, or when the program terminates. Is there a
reason for this 'deferred_final_list' activity?

Also is it true that objects marked FL_SINGLETON should never be freed,
even if they are no longer referenced by any live code? Or is
FL_SINGLETON just used as an internal GC marker to mean 'the heap this
object comes from is entirely free--don't bother adding it to the
freelist since it is on the chopping block to be free'ed' and nothing
else?

Thanks!
-R

John Winters

Aug 1, 2008, 2:15:14 AM
Roger Pack wrote:
> Any ideas why:
>
> 1.times { a = 'a'*1000};
> 30.times { GC.start };
> print ObjectSpace.each_object{|o| print o}

I presume you don't really want both those invocations of "print"?
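
To illustrate the point about the two prints: each_object called with a block returns the number of objects it visited, so the outer print writes that count after the inner one has printed every object. A minimal sketch (restricting the walk to String is my own choice, to keep it short):

```ruby
# Walk only String objects. The block does nothing, so the walk itself
# does not create new strings.
visited = ObjectSpace.each_object(String) { |s| s }

# The return value is a count, not a collection of objects.
puts visited.class
puts "visited #{visited} strings"
```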

> prints out 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa...' despite the fact that
> the a's should have been collected?

For the record, I've just tested this with Ruby 1.8.7 running on Debian
Lenny (x86-64) and it does not print 'aaaaaaa...'

I also tried this code:

1.times { a = 'a'*1000};

puts a

This was to test the earlier assertion that "a" has not gone out of scope.
It produces a run-time error because "a" has in fact gone out of scope, so
Roger seems right in expecting it to have been garbage collected.
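
The same check can be written so that it does not abort the script. A sketch, where block_local_leaks? is a hypothetical method name:

```ruby
# Returns true if a variable assigned inside a block were still visible
# after the block, false if looking it up raises NameError.
def block_local_leaks?
  1.times { a = 'a' * 1000 }
  a      # no local "a" exists here, so this lookup raises NameError
  true
rescue NameError
  false
end

puts block_local_leaks?
```

This only says something about Ruby-level scope. It does not show whether the string object itself has been collected.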

I've also tested it with and without the explicit call on GC.start.
Without the call the object is still there. With just a single call (30
calls not needed) the object is gone.

John

Eric Hodel

Aug 1, 2008, 3:08:57 AM
On Jul 31, 2008, at 22:18 PM, Roger Pack wrote:
> Just to clear up confusion:
> I believe that
>
> GC.start 'forces' a garbage collection

Well, if there's no garbage then there's no collection.

> , and that
> do...end and
> {...} scopes do indeed have their own scope and local variables, as
> methods do.

Ruby scope, yes, but that doesn't mean the C stack has no pointers to
your object. There's no guarantee that all references to your object
have been clobbered by subsequent calls. (Or that there are no values on
the C stack that merely look like pointers to your objects.)

> interestingly,
>
> def go
> 1.times { a = 'a'*1000};
> end
> go
> 30.times { GC.start };
> print ObjectSpace.each_object{|o| print o}
>
> yields the same errant results. I might look into it sometime. Very
> weird.

This is simply how ruby's conservative collector works.
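
A less noisy way to check for survivors than printing every object is to count the strings equal to the marker. This is a sketch under assumptions: TARGET and live_copies are hypothetical names, and calling each_object without a block to get an enumerator needs a reasonably recent Ruby.

```ruby
TARGET = ('a' * 1000).freeze

# Count live strings equal to TARGET, not counting TARGET itself.
def live_copies
  ObjectSpace.each_object(String).count { |s| s == TARGET && !s.equal?(TARGET) }
end

def go
  1.times { a = 'a' * 1000 }
  nil
end

go
GC.start

# 0 means the copy was collected; anything else means something,
# possibly a stale value on the stack, still appears to reference it.
puts "copies still alive after GC.start: #{live_copies}"
```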

> Now for some questions:
>
> currently the GC marks live objects then sweeps to find any free
> objects--except it doesn't actually free any objects that are free but
> need finalization.

I think you found a bug. Ruby 1.6 called finalizers after sweep, but
1.8.6 doesn't.

$ cat final.rb
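# Shared finalizer; it receives the object id, not the object itself.
$finalizer_proc = proc do |obj_id| puts "#{obj_id} finalized" end

# Deep call chain, presumably so that no stale reference to the object
# is left near the top of the C stack once the calls have returned.
def a() b end
def b() c end
def c() d end
def d() e end
def e() f end
def f() g end
def g() h end
def h() i end
def i() j end
def j() k end
def k() make_obj end

def make_obj
  o = Object.new
  ObjectSpace.define_finalizer o, $finalizer_proc
  o.__id__
end

obj_id = a

puts "#{obj_id} created"

a = []
s = 'a'

# Loop until _id2ref raises RangeError, i.e. the object has been collected.
begin
  loop do
    ObjectSpace._id2ref obj_id
    a << s.succ!
    print "#{s}\r"
  end
rescue RangeError
  puts
  puts "#{obj_id} collected"
end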

$ ruby -v final.rb
ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
81660 created
avr
81660 collected
81660 finalized
$ ruby16 -v final.rb
ruby 1.6.8 (2005-09-21) [i386-darwin9.4.0]
1117686 created
1117686 finalized
omu
1117686 collected
$

> It seems to only do finalizations when a user
> explicitly calls GC.start, or when the program terminates. Is there a
> reason for this 'deferred_final_list' activity?

This patch seems to restore 1.6 behavior:

$ svn diff gc.c
Index: gc.c
===================================================================
--- gc.c (revision 18230)
+++ gc.c (working copy)
@@ -1196,7 +1196,7 @@ gc_sweep()

      /* clear finalization list */
      if (final_list) {
-       deferred_final_list = final_list;
+       finalize_list(final_list);
        return;
      }
      free_unused_heaps();
$ ./miniruby -I./lib -I.ext/common -I./- -r./ext/purelib.rb ./runruby.rb --extout=.ext -- ~/final.rb
605300 created
605300 finalized
rek
605300 collected

I'm not sure whether it was removed accidentally or not; the log for r7090
points to several segmentation faults due to evil things done with
finalizers and threads:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/24536

> Also is it true that objects marked FL_SINGLETON should never be
> freed,
> even if they are no longer referenced by any live code? Or is
> FL_SINGLETON just used as an internal GC marker to mean 'the heap this
> object comes from is entirely free--don't bother adding it to the
> freelist since it is on the chopping block to be free'ed' and nothing
> else?

I'm not sure about this. It was added in the same changeset as above.

I'll make a pointer to this thread on ruby-core.

Robert Klemme

Aug 1, 2008, 4:30:21 AM
On 1 Aug., 07:18, Roger Pack <rogerpack2...@gmail.com> wrote:
> Just to clear up confusion:
> I believe that
>
> GC.start 'forces' a garbage collection,

Yes, it forces a GC run. But I would not be so sure about whether it
forces actual collection of all collectible instances. In other
words: the GC is run, but if it decides that there's nothing to collect
yet, it won't collect anything even if there were objects that could be
freed.

> and that
> do...end and
> {...} scopes do indeed have their own scope and local variables, as
> methods do.

Yes.

> interestingly,
>
> def go
>   1.times { a = 'a'*1000};

You do not need the block here, as the method already provides its own scope.

> end
> go
> 30.times { GC.start };
> print ObjectSpace.each_object{|o| print o}
>
> yields the same errant results.  I might look into it sometime.  Very
> weird.
>
> Now for some questions:
>
> currently the GC marks live objects then sweeps to find any free
> objects--except it doesn't actually free any objects that are free but
> need finalization.  It seems to only do finalizations when a user
> explicitly calls GC.start, or when the program terminates.  Is there a
> reason for this 'deferred_final_list' activity?

Which code led you to this conclusion? I am asking because
finalizers are sometimes hard to get right. For example, you cannot
define a finalizer with a block inside an instance method because the
block will hold on to self and thus prevent collection. In that case
you will see finalization happen only at program exit.
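
A sketch of that pitfall and one common way around it. TempResource and its method names are hypothetical, and Binding#receiver assumes Ruby 2.2 or later, so this will not run on 1.8:

```ruby
class TempResource
  def initialize(name)
    @name = name
  end

  # Problematic: a proc created inside an instance method captures self,
  # so using it as the finalizer would keep the object reachable.
  def leaky_finalizer
    proc { |id| puts "#{id} finalized" }
  end

  # Safer: build the proc in a class method, where self is the class
  # and only the name string is captured.
  def self.safe_finalizer(name)
    proc { |id| puts "#{name} (#{id}) finalized" }
  end

  def track!
    ObjectSpace.define_finalizer(self, self.class.safe_finalizer(@name))
    self
  end
end

res = TempResource.new('tmp').track!

# Inspect what each proc holds on to via its binding.
puts res.leaky_finalizer.binding.receiver.equal?(res)
puts TempResource.safe_finalizer('tmp').binding.receiver.equal?(res)
```

The first check shows that the instance-method proc refers back to the object, while the class-method proc does not.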

Kind regards

robert

Robert Klemme

Aug 1, 2008, 4:33:50 AM
On 1 Aug., 09:08, Eric Hodel <drbr...@segment7.net> wrote:
> $ ruby -v final.rb
> ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
> 81660 created
> avr
> 81660 collected
> 81660 finalized
> $ ruby16 -v final.rb
> ruby 1.6.8 (2005-09-21) [i386-darwin9.4.0]
> 1117686 created
> 1117686 finalized
> omu
> 1117686 collected
> $
>
> > It seems to only do finalizations when a user
> > explicitly calls GC.start, or when the program terminates.  Is there a
> > reason for this 'deferred_final_list' activity?
>
> This patch seems to restore 1.6 behavior:

But does it also make finalization happen before program exit? (This
was Roger's main point IIRC.) There was probably a good reason why
the order was reversed, namely to make sure that objects were gone
before invoking their finalizers. Actually this is how it is defined,
i.e. the finalizer is called after the object has vanished (see
Pickaxe for example).

> $ svn diff gc.c
> Index: gc.c
> ===================================================================
> --- gc.c        (revision 18230)
> +++ gc.c        (working copy)
> @@ -1196,7 +1196,7 @@ gc_sweep()
>
>       /* clear finalization list */
>       if (final_list) {
> -       deferred_final_list = final_list;
> +       finalize_list(final_list);
>         return;
>       }
>       free_unused_heaps();

Cheers

robert

Eric Hodel

Aug 1, 2008, 2:43:37 PM
On Aug 1, 2008, at 01:34 AM, Robert Klemme wrote:
> On 1 Aug., 09:08, Eric Hodel <drbr...@segment7.net> wrote:
>> $ ruby -v final.rb
>> ruby 1.8.6 (2008-03-03 patchlevel 114) [universal-darwin9.0]
>> 81660 created
>> avr
>> 81660 collected
>> 81660 finalized
>> $ ruby16 -v final.rb
>> ruby 1.6.8 (2005-09-21) [i386-darwin9.4.0]
>> 1117686 created
>> 1117686 finalized
>> omu
>> 1117686 collected
>> $
>>
>>> It seems to only do finalizations when a user
>>> explicitly calls GC.start, or when the program terminates. Is
>>> there a
>>> reason for this 'deferred_final_list' activity?
>>
>> This patch seems to restore 1.6 behavior:
>
> But does it also make finalization happen before program exit? (This
> was Roger's main point IIRC.)

Yes, but the suggested patch was not correct. See [ruby-core:18050]
for the proper patch. The problem was that versions of 1.8 never ran
finalizers unless you called GC.start or were exiting.

> There was probably a good reason why
> the order was reversed namely to make sure that objects were gone
> before invoking their finalizers.

No, finalizers were never called before collection. It looks like it
was a simple oversight while fixing various SEGV bugs when doing evil
things to ruby.

> Actually this is how it is defined,
> i.e. the finalizer is called after the object has vanished (see
> Pickaxe for example).

There was no code for running finalizers, except at exit or when
calling GC.start.

Roger Pack

Aug 2, 2008, 11:48:19 AM
John Winters wrote:

> For the record, I've just tested this with Ruby 1.8.7 running on Debian
> Lenny (x86-64) and it does not print 'aaaaaaa...'

Interesting. Maybe there's a difference among versions. For me it shows
the 'odd' behavior on Ubuntu with Ruby 1.8.6 patchlevel 111,
OS X, and Windows MinGW (all 32-bit), and with Ruby 1.8.5 [x86_64-linux].
Perhaps the 64-bit aspect is clearing the false positives.

I also tested the following code on those same platforms, all running
1.8.6:

def go
  1.times { a = 'a'*1000};
end

go

def recurse b
  recurse b
end

begin
  recurse 33 # an attempt to "clear the stack"
rescue => e
  print "rescued #{e}"
end

30.times { GC.start };
ObjectSpace.each_object{|o| print o}

This still showed the miscreant a's on Ubuntu 32-bit, x86_64 Linux, and
Windows, but not on OS X [it actually worked there]. FWIW.
I guess this corroborates the theory that it's a false positive, but I'm
still uneasy about it. I guess it's not a huge problem, but still
somewhat disconcerting.
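
A note for anyone rerunning this on a current Ruby: there, SystemStackError descends directly from Exception rather than StandardError, so the bare rescue above would not catch it. A sketch of the same stack-clearing attempt with an explicit rescue (overflow_and_recover is a hypothetical name):

```ruby
def recurse(depth)
  recurse(depth + 1)
end

# Run the stack up to its limit, then unwind back to this frame.
def overflow_and_recover
  recurse(0)
  :no_overflow
rescue SystemStackError
  :stack_overflow_rescued
end

puts overflow_and_recover
GC.start
```

Overwriting the stack this way makes stale pointers less likely, but as the results above suggest, it is not a guarantee that none remain.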


Some questions:

Currently it appears that if there is a freeable object that wants
finalization within a page that can be freed, it doesn't add that
page to the freelist... I can't tell from the code, however, whether that
page is basically 'pinned' "forever" or not when that happens. And
with ruby's current GC, it seems almost impossible to reclaim memory, so
I don't even know how to test this [the test case being that there are
finalizable objects within a page that wants to be freed--does that page
ever get freed eventually?]. FL_SINGLETON seems to play some role, but I'm
not sure what.

In other news, please accept my apologies: it appears that
rb_gc_finalize_deferred IS called by eval0 "every 256 eval0's". I'm not
sure if that is optimal or not, or even a good idea, but at least it
gets called.

-R
