parrot and refcounting semantics


Robin Redeker

Apr 26, 2005, 2:41:18 PM
to perl6-i...@perl.org
Hello!

The last few weeks I've had the joy of implementing a small VM for a
special runtime environment. As I'm a regular reader of the perl.perl6.*
mailing lists and know about Parrot, I wondered how easy it would be
to throw away my own VM solution and use Parrot.
There are currently only two things that bug me a little bit:

1. I wonder how to load bytecode from memory into Parrot when
embedding it. I've read embed.pod and couldn't find a function that lets me
create a packfile, or something I can run, from a memory buffer that holds the
bytecode.

2. As I currently use refcounting and like the timely destruction
semantics, I wondered whether Parrot can give me that.
(I have a database in the background and don't want too much
garbage to pile up in it.)

I've read that Parrot wants to do a GC run over specially marked objects that
'need' timely destruction. (Correct me if I'm wrong.)
So, what would happen if I have many marked objects and call subroutines
often?

Or are there any other plans? What do other languages do that
have refcounting and want to port to Parrot?


cya,
Robin

--
el...@x-paste.de / ro...@nethype.de
Robin Redeker

Leopold Toetsch

Apr 27, 2005, 2:09:48 AM
to Robin Redeker, perl6-i...@perl.org
Robin Redeker <el...@x-paste.de> wrote:
> Hello!

> The last few weeks I've had the joy of implementing a small VM for a
> special runtime environment. As I'm a regular reader of the perl.perl6.*
> mailing lists and know about Parrot, I wondered how easy it would be
> to throw away my own VM solution and use Parrot.
> There are currently only two things that bug me a little bit:

> 1. I wonder how to load bytecode from memory into Parrot when
> embedding it. I've read embed.pod and couldn't find a function that lets me
> create a packfile, or something I can run, from a memory buffer that holds the
> bytecode.

from src/embed.c:Parrot_readbc()

pf = PackFile_new(interpreter, is_mapped);
PackFile_unpack(interpreter, pf, (opcode_t *)program_code, program_size);

though it isn't part of the official embedding API (yet).

> 2. As I currently use refcounting and like the timely destruction
> semantics, I wondered whether Parrot can give me that.

Yes.

> I've read that Parrot wants to do a GC run over specially marked objects that
> 'need' timely destruction. (Correct me if I'm wrong.)

You can flag a PMC for timely destruction:

needs_destroy $P0

On subroutine return (or scope exit) you can trigger a lazy DOD run,
which does nothing if there aren't any objects that need timely
destruction:

sweep 1 # lazy DOD

This will eventually be replaced by:

enter_scope 0 | 1 # NEEDS_TIMELY_DESTRUCTION
exit_scope

or some such. (Currently the register frame that the "sweep 1" sits in
keeps these objects alive, so the lazy DOD run can only happen after the
register frame is discarded.)

> So, what would happen if I have many marked objects and call subroutines
> often?

We'll try to make the lazy DOD run as fast as possible. Objects that
need timely destruction are counted. If all such objects have been seen
during DOD, the DOD run is aborted.

> Or are there any other plans? What do other languages do that
> have refcounting and want to port to Parrot?

You might have a look at Ponie (Perl 5 on Parrot). And while Python
doesn't guarantee timely destruction, people nevertheless rely on it,
because CPython provides timely destruction through refcounting.

That means: most of our major target HLLs need timely destruction and
we'll provide it.

> cya,
> Robin

leo

Robin Redeker

Apr 27, 2005, 4:29:18 AM
to Leopold Toetsch, perl6-i...@perl.org
On Wed, Apr 27, 2005 at 08:09:48AM +0200, Leopold Toetsch wrote:
> Robin Redeker <el...@x-paste.de> wrote:
[...]

> > 1. I wonder how to load bytecode from memory into Parrot when
> > embedding it. I've read embed.pod and couldn't find a function that lets me
> > create a packfile, or something I can run, from a memory buffer that holds the
> > bytecode.
>
> from src/embed.c:Parrot_readbc()
>
> pf = PackFile_new(interpreter, is_mapped);
> PackFile_unpack(interpreter, pf, (opcode_t *)program_code, program_size);
>
> though it isn't part of the official embedding API (yet).

Ok, thanks, I'll have a look at that.
Any idea when there will be a more complete official embedding API?
Any hints for spotting functions that 'maybe' go into the API?

> > I've read that Parrot wants to do a GC run over specially marked objects that
> > 'need' timely destruction. (Correct me if I'm wrong.)
>
> You can flag a PMC for timely destruction:
>
> needs_destroy $P0
>
> On subroutine return (or scope exit) you can trigger a lazy DOD run,
> which does nothing if there aren't any objects that need timely
> destruction:
>
> sweep 1 # lazy DOD
>
> This will eventually be replaced by:
>
> enter_scope 0 | 1 # NEEDS_TIMELY_DESTRUCTION
> exit_scope

Does this mean I have to know whether a subroutine handles objects
that need timely destruction?

>
> or some such. (Currently the register frame that the "sweep 1" sits in
> keeps these objects alive, so the lazy DOD run can only happen after the
> register frame is discarded.)
>
> > So, what would happen if I have many marked objects and call subroutines
> > often?
>
> We'll try to make the lazy DOD run as fast as possible. Objects that
> need timely destruction are counted. If all such objects have been seen
> during DOD, the DOD run is aborted.

Well, what happens if some objects are not seen at the beginning, but
only very late in the run? Does that mean I may have to wait 100 ms (or
longer; I have no idea how long it might take) if I have around
10^3 - 10^5 objects that need timely destruction?

And does that mean the time spent in GC is very nondeterministic?

>
> > Or are there any other plans? What do other languages do that
> > have refcounting and want to port to Parrot?
>
> You might have a look at Ponie (Perl 5 on Parrot). And while Python
> doesn't guarantee timely destruction, people nevertheless rely on it,
> because CPython provides timely destruction through refcounting.

Thanks, i'll have a look.

>
> That means: most of our major target HLLs need timely destruction and
> we'll provide it.

cu,
robin

Leopold Toetsch

Apr 27, 2005, 8:27:58 AM
to Robin Redeker, perl6-i...@perl.org
Robin Redeker <el...@x-paste.de> wrote:
> On Wed, Apr 27, 2005 at 08:09:48AM +0200, Leopold Toetsch wrote:

>> though it isn't part of the official embedding API (yet).

> Ok, thanks, I'll have a look at that.
> Any idea when there will be a more complete official embedding API?
> Any hints for spotting functions that 'maybe' go into the API?

The APIs aren't finished yet and are extended more or less on demand.

>> This will eventually be replaced by:
>>
>> enter_scope 0 | 1 # NEEDS_TIMELY_DESTRUCTION
>> exit_scope

> Does this mean I have to know whether a subroutine handles objects
> that need timely destruction?

Normally you don't know it in the general case. So the default for
languages with timely destruction semantics would be to emit
"enter_scope 1", unless the compiler (or an optimization hint) can deduce
that the scope doesn't need timely destruction.

> Well, what happens if some objects are not seen at the beginning, but
> only very late in the run? Does that mean I may have to wait 100 ms (or
> longer; I have no idea how long it might take) if I have around
> 10^3 - 10^5 objects that need timely destruction?

> And does that mean the time spent in GC is very nondeterministic?

It really depends, but the worst case can be a full GC run through all
objects, with O(n) cost. That worst case can usually be avoided
by good coding practice.

We'll have a generational garbage collector eventually, where one
generation is one scope that needs timely destruction. We can keep
counts of active objects per generation. The "all seen" condition is then
usually achieved by scavenging just the youngest generation on scope exit,
*if* the counts for the old generations are still valid.

But there is of course the possibility of inter-generational stores. If
your code sticks a freshly created object into a rather old one,
liveness of the young object is determined by the old one. And if the
overwritten object needs timely destruction too, it isn't known whether
the last reference to it is gone or not.

Note: "old" and "young" are terms of reachability in the object graph,
mostly related to object creation time too (for subroutines) but not for
more global-like objects.

> cu,
> robin

leo

Robin Redeker

Apr 27, 2005, 11:40:03 AM
to Leopold Toetsch, perl6-i...@perl.org
On Wed, Apr 27, 2005 at 02:27:58PM +0200, Leopold Toetsch wrote:
> Robin Redeker <el...@x-paste.de> wrote:
[...]
> > Any hints for spotting functions that 'maybe' go into the API?
>
> The APIs aren't finished yet and are extended more or less on demand.
>

Ah, ok :) Who will decide what is demanded?

> > Well, what happens if some objects are not seen at the beginning, but
> > only very late in the run? Does that mean I may have to wait 100 ms (or
> > longer; I have no idea how long it might take) if I have around
> > 10^3 - 10^5 objects that need timely destruction?
>
> > And does that mean the time spent in GC is very nondeterministic?
>
> It really depends, but the worst case can be a full GC run through all
> objects, with O(n) cost. That worst case can usually be avoided
> by good coding practice.

Hm, ok, I guess I will try to port the language and environment I have
to Parrot and see what I get.

Just out of curiosity: what was the design decision behind the GC
solution? Was refcounting that bad? Refcounting does give a more global
speed hit, but it's more deterministic and you won't run into
(potentially) long halts during GC. (Java programs often suffer from this,
and it results in bad latency.)

You said that most languages will have refcount semantics; it just sounds
funny to me to implement a GC then.

>
> We'll have a generational garbage collector eventually, where one
> generation is one scope that needs timely destruction. We can have
> counts of active objects per generation. The "all seen" is then usually
> achived by scavenging just the youngest generation on scope exit, *if*
> the numbers of the old generation are still valid.

Hm, that doesn't sound bad, if it works. Are there prototypes? Has
this been tried out?

>
> But there is of course the possibility of inter-generational stores. If
> your code sticks a freshly created object into a rather old one,
> liveness of the young object is determined by the old one. And if the
> overwritten object needs timely destruction too, it isn't known whether
> the last reference to it is gone or not.

Well, what's the solution here then? Sticking fresh objects into old ones
isn't that rare, I guess (and in my case it won't be that rare either).

>
> Note: "old" and "young" are terms of reachability in the object graph,
> mostly related to object creation time too (for subroutines) but not for
> more global-like objects.
>

How do you combine generational garbage collection and timely destruction
there? Having a policy that doesn't check 'old' objects (often) won't
work for 'old' objects that need timely destruction.

So, are objects that need timely destruction in the youngest generation all the time?

How do you check that a 'young' object that needs timely destruction
isn't referenced by old ones?
Don't you have to go through all (even old) objects to determine whether
'this' object that needs timely destruction is referenced?

I'm not an expert on garbage collectors; those are just some questions
that popped into my mind. Maybe you can point me to a paper or an
old discussion about this?

Dan Sugalski

Apr 27, 2005, 3:43:32 PM
to Robin Redeker, perl6-i...@perl.org
At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
>Just out of curiosity: what was the design decision behind the GC
>solution? Was refcounting that bad? Refcounting does give a more global
>speed hit, but it's more deterministic and you won't run into
>(potentially) long halts during GC. (Java programs often suffer from this,
>and it results in bad latency.)

I'll answer this one, since I'm the one responsible for it.

Refcounting has three big issues:

1) It's expensive
2) It's error-prone
3) It misses circular garbage

#2 was the big one, so I suppose the list is in the wrong order, but I'm
feeling too lazy to edit. If you watch the changelogs for the P
languages (perl, python, and ruby) you'll find that for almost every
release there's a refcount bug patched, sometimes more than one.
Throw in the problems that extensions have and you'll find a *lot* of
refcount problems.

The expense is non-trivial as well. Yeah, it's all little tiny bits
of time, but that adds up. It's all overhead, and useless overhead
for the most part.

The circular garbage thing's also an issue. Yes, there are
interesting hacks around it (python has one -- clever, but definitely
a hack) that essentially involves writing a separate mark&sweep
garbage collector.

The thing is, there aren't any languages we care about that require
true instant destruction -- the ones that care (or, rather, just Perl;
Python and Ruby don't care) only guarantee block-boundary collection,
and in most cases (heck, in a near-overwhelming number of cases)
timely destruction is utterly irrelevant to the code being run. Yeah,
you *want* timely destruction, but you neither need nor notice it in
most program runs, since there's nothing that will notice.

Having been too deep into the guts of perl, and having written more
extensions in C than I care to admit to, I wanted refcounting *dead*.
It's just spread across far too much code, tracking down errors is a
massive pain, and, well, yech. Yes, non-refcounting GC systems are a
bit more complex, but the complexity is well-contained and
manageable. (I wrote the first cut of the GC system in an afternoon
sitting in my local public library) There's also the added bonus that
you can swap in all sorts of different GC schemes without disturbing
99% of the code base.

There are a couple of features built into the codebase to reduce GC
pauses. (Though none of these are, strictly speaking, required by the
semantics parrot expresses)

Firstly, we don't allocate base structures (PMC and buffer structs)
from random memory -- they all come out of arenas, which leaves us
with a small and relatively restricted set of places to scan for
objects. There are a number of operations that the GC system needs to
do full sweeps for, and keeping things tight together makes that a
lot easier.

We also have a split GC system (or, at least we do now -- it might
not stay). Looking for dead objects is a separate operation from
looking for dead memory. This is generally a win for us, since
*generally* we're chewing through the memory hanging off of objects
much faster than we are the objects themselves. This isn't
characteristic of all languages, certainly, but for Perl and its
cohorts it is.

The PMCs themselves are designed to reduce the number of objects that
have to actually exist (though cutting out the keyed version of all
the binary ops reduced the effectiveness of it a bit) since your
aggregates don't have to actually instantiate the things inside them
if they don't want to. A 10 million element integer array can be
represented with exactly one PMC.

Finally we're really, *really* picky about where pointers to objects
can be put, and I think this is the place where we get a win over
other GC schemes, and one of the spots where Java traditionally falls
down. If pointers to objects can be anywhere, it complicates the GC
system a lot -- you have to assume that any random blob of memory
*may* hold pointers to objects. It probably doesn't, but... how can
the GC know? Generally it can't, since it just doesn't have
sufficient knowledge of the contents, so it has to be careful, and
careful costs. For us, the only spot where we have to be conservative
is with the stack, the rest is all exact for GC purposes, which is
nice, and there are a few common special cases (like the 'this is an
array of PMC pointers' one)

All this does make it more expensive, on a per-incident basis, to
guarantee timely destruction. Scope exit, if there are extant objects
that need timely destruction, has a lot more one-shot overhead than a
refcounting scheme does because you do need a full sweep but... we
make it so that full sweeps are relatively cheap (with the known
pointer location stuff) and have to go through a relatively small
number of objects (since most of your objects will end up in
aggregates, and we're trying to make it so that aggregates don't have
to actually contain objects too often).

For regular GC sweeps we can cut down on the pauses with generational
collectors and other such interesting things, which can be plugged in
mostly seamlessly. Right now we're usually in stop-the-world mode but
heck we compile parrot with *no* C compiler optimizations too, given
that things are all in development as it is.

Also, with all this stuff, people are going to find timely
destruction is less useful than they might want, what with threads
and continuations, which'll screw *everything* up, as they are wont
to do. I know I've been making heavy use of continuations myself, and
this is for a compiler for a language that doesn't even have
subroutines in it. Continuations screw everything up, which is always
fun, for unusual values of 'fun'.

I really need to go profile perl 5 some time to get some real stats,
but I think it's likely that most programs (well, most programs I run
at least) have less than 0.1% of the variables with destructors, and
maybe one or two variables *total* that have a need for timely
destruction. (And most of the time they get cleaned up by global
destruction, which makes 'em not actually timely cleaned up)

>You said, that most languages will have refcount semantics, just sounds
>funny for me to implement a GC then.

Actually most languages won't have refcount semantics. Perl 5's the
only one that really guarantees that sort of thing now, though I
think it's in there for perl 6. I doubt the python, ruby, Lisp, or
Tcl compilers will emit the cleanup-at-block-boundary sweep code.
--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski even samurai
d...@sidhe.org have teddy bears and even
teddy bears get drunk

Bob Rogers

Apr 27, 2005, 9:41:39 PM
to Dan Sugalski, Robin Redeker, perl6-i...@perl.org
From: Dan Sugalski <d...@sidhe.org>
Date: Wed, 27 Apr 2005 15:43:32 -0400

At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
>Just out of curiosity: what was the design decision behind the GC
>solution? Was refcounting that bad? . . .

I'll answer this one, since I'm the one responsible for it.

Refcounting has three big issues:

1) It's expensive
2) It's error-prone
3) It misses circular garbage

. . .

The circular garbage thing's also an issue. Yes, there are
interesting hacks around it (python has one -- clever, but definitely
a hack) that essentially involves writing a separate mark&sweep
garbage collector.

The circular garbage issue is just as important as the others, IMHO.
Because Perl 5 can't deal with circular garbage, many reasonable data
structures are awkward to implement, and Perl 5 lexical closures are
close to useless because they leak memory.

. . .

Having been too deep into the guts of perl, and having written more
extensions in C than I care to admit to, I wanted refcounting *dead*.

Yay, Dan!

>You said, that most languages will have refcount semantics, just sounds
>funny for me to implement a GC then.

Actually most languages won't have refcount semantics. Perl 5's the
only one that really guarantees that sort of thing now, though I
think it's in there for perl 6. I doubt the python, ruby, Lisp, or
Tcl compilers will emit the cleanup-at-block-boundary sweep code.
--
Dan

For the record, Common Lisp does not even provide a finalization
mechanism; if you want something destroyed, you must do it explicitly,
while you still have a reference to the thing.

-- Bob Rogers
http://rgrjr.dyndns.org/

Leopold Toetsch

Apr 27, 2005, 12:54:40 PM
to Robin Redeker, perl6-i...@perl.org
Robin Redeker <ro...@nethype.de> wrote:
> On Wed, Apr 27, 2005 at 02:27:58PM +0200, Leopold Toetsch wrote:

>> The APIs aren't finished yet and are extended more or less on demand.

> Ah, ok :) Who will decide what is demanded?

A patch on p6l is a good indication for a demand :)

> Hm, ok, I guess I will try to port the language and environment I have
> to Parrot and see what I get.

Fine.

> Just out of curiosity: what was the design decision behind the GC
> solution? Was refcounting that bad? Refcounting does give a more global
> speed hit, but it's more deterministic and you won't run into
> (potentially) long halts during GC. (Java programs often suffer from this,
> and it results in bad latency.)

Yep. Well, there is one more issue besides speed: code complexity and
correctness. Writing correct refcounting code is tedious, and all
extensions suffer from that problem too. Long halts can be avoided by
either an incremental GC or by doing less work with a generational GC.
None of the GC philosophies is faster per se; it really depends on the
implementation.

> You said that most languages will have refcount semantics; it just sounds
> funny to me to implement a GC then.

They have timely destruction semantics. I'm pretty sure that we can run
generational GC as fast as a refcounting one.

> Hm, that doesn't sound bad, if it works. Are there prototypes? Has
> this been tried out?

A very early and not yet functional generational GC is in src/gc_gms.c.
It's lacking basically the interpreter support for scope enter/exit.

> ... Sticking fresh objects into old ones
> isn't that rare, I guess (and in my case it won't be that rare either).

Combined with:

>> Note: "old" and "young" are terms of reachability in the object graph,

It depends. Having everything in globals will hurt. But it's a bit too
early to say more about that.

>> mostly related to object creation time too (for subroutines) but not for
>> more global-like objects.
>>

> How do you combine generational garbage collection and timely destruction
> there? Having a policy that doesn't check 'old' objects (often) won't
> work for 'old' objects that need timely destruction.

When objects are ordered according to their scopes, you basically only
have to clean up the top scope (or generation of objects).

> So, are objects that need timely destruction in the youngest generation all the time?

No. But objects that need timely destruction can be put at the front of
their generation so that they are scanned first.

> How do you check that a 'young' object that needs timely destruction
> isn't referenced by old ones?
> Don't you have to go through all (even old) objects to determine whether
> 'this' object that needs timely destruction is referenced?

Worst case yes.

> I'm not an expert on garbage collectors; those are just some questions
> that popped into my mind. Maybe you can point me to a paper or an
> old discussion about this?

I've no bookmarks, sorry. But google has them :)

> robin

leo

Robin Redeker

Apr 27, 2005, 6:12:50 PM
to Luke Palmer, perl6-i...@perl.org
On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:
> Dan Sugalski writes:
> > Also, with all this stuff, people are going to find timely destruction
> > is less useful than they might want, what with threads and
> > continuations, which'll screw *everything* up, as they are wont to do.
> > I know I've been making heavy use of continuations myself, and this is
> > for a compiler for a language that doesn't even have subroutines in
> > it. Continuations screw everything up, which is always fun, for
> > unusual values of 'fun'.
>
> When I programmed in C++ with proper use of the STL and other such
> abstractions, almost the only time I needed destructors were for
> block-exit actions. Perl 6 has block-exit hooks, so you don't need to
> use destructors to fudge those anymore. And there's a lot less explicit
> memory management in languages with GC (that's the point), so
> destructors become even less useful.
>
> I agree with Dan completely here. People make such a big fuss over
> timely destruction when they don't realize that they don't really need
> it. (But they want it). I think, more importantly, they don't
> understand what they're getting in return for giving it up.

Could you point out what I get?

I use TD to handle resources: filehandles, database handles, GUI
windows and all that. Refcounting does this with a little overhead, but
in a fast and deterministic O(1) way.

And how do you want to implement GLib objects with Parrot?
They are refcounted.

Luke Palmer

Apr 27, 2005, 5:59:05 PM
to Robin Redeker, perl6-i...@perl.org
Robin Redeker writes:
> On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:
> > I think, more importantly, they don't understand what they're
> > getting in return for giving [refcounting] up.

>
> Could you point out what I get?
>
> I use TD to handle resources: filehandles, database handles, GUI
> windows and all that. Refcounting does this with a little overhead, but
> in a fast and deterministic O(1) way.

If it's deterministic behavior that you want, you probably ought to be
doing things explicitly. What you get as a Perl programmer is, well,
the ability to use circular data structures. Big deal, right? But if
you're an extension writer, you get relief from days of refcount
debugging and nice fast throughput (though the former is what
convinces me). Hacking on Parrot is such a joy, because I can just
forget that I need to free memory.

> And how do you want to implement GLib objects with Parrot? They are
> refcounted.

How do you do it in Perl 5? GLib doesn't use Perl's refcounts; it uses
its own, right? (I don't actually know.) But if the behavior of your
program depends on where things are destroyed, and caching a reference
somewhere can keep your window from closing, I'm under the impression
that you should be doing it explicitly. Perl 6 in particular provides
so many hooks that pure destruction-based program behavior is probably
already too obfuscated.

At least, that's my experience writing the ODE module. I initially had
the destroy_body calls inside the destructor. But I found that bodies
would magically disappear under my nose, and others would never go away.
I ended up switching to an explicit destroy call, and now my life is
much less painful.

But that's my opinion. :-)

Luke

Luke Palmer

Apr 27, 2005, 2:33:30 PM
to Dan Sugalski, Robin Redeker, perl6-i...@perl.org
Dan Sugalski writes:
> Also, with all this stuff, people are going to find timely destruction
> is less useful than they might want, what with threads and
> continuations, which'll screw *everything* up, as they are wont to do.
> I know I've been making heavy use of continuations myself, and this is
> for a compiler for a language that doesn't even have subroutines in
> it. Continuations screw everything up, which is always fun, for
> unusual values of 'fun'.

When I programmed in C++ with proper use of the STL and other such
abstractions, almost the only time I needed destructors was for
block-exit actions. Perl 6 has block-exit hooks, so you don't need to
use destructors to fudge those anymore. And there's a lot less explicit
memory management in languages with GC (that's the point), so
destructors become even less useful.

I agree with Dan completely here. People make such a big fuss over
timely destruction when they don't realize that they don't really need
it. (But they want it.) I think, more importantly, they don't
understand what they're getting in return for giving it up.

Luke

Robin Redeker

Apr 28, 2005, 11:00:53 AM
to Luke Palmer, perl6-i...@perl.org
On Wed, Apr 27, 2005 at 03:59:05PM -0600, Luke Palmer wrote:
> Robin Redeker writes:
> > On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:
> > > I think, more importantly, they don't understand what they're
> > > getting in return for giving [refcounting] up.
> >
> > Could you point out what I get?
> >
> > I use TD to handle resources: filehandles, database handles, GUI
> > windows and all that. Refcounting does this with a little overhead, but
> > in a fast and deterministic O(1) way.
>
> If it's deterministic behavior that you want, you probably ought to be
> doing things explicitly. What you get as a Perl programmer is, well,
> the ability to use circular data structures. Big deal, right?

Well, I and the people I know don't think that collection of circular
data structures gives you much.

> But if you're an extension writer, you get relief from days of refcount
> debugging and nice fast throughput (though the former is what
> convinces me). Hacking on Parrot is such a joy, because I can just
> forget that I need to free memory.

Memory is the one thing you can then forget about. But a pure GC costs you
automatic resource management. Right: a GC is for memory management.

>
> > And how do you want to implement GLib objects with Parrot? They are
> > refcounted.
>

[..]


> I ended up switching to an explicit destroy call, and now my life is
> much less painful.

Well, I don't want the explicit destructor-calling hell you have in Java.
Some things I just don't want to do explicitly: file destruction, for
example. And what if my class manages a database handle? Do I have to do
something like:

    $my_cool_handle_obj->free_my_resources(); ?

This would be manual memory management:

    {
        my $s = malloc CoolClass;
        ...
        free $s;
    }

And with explicit resource handling (without timely destruction) it might be:

    {
        my $s = new CoolClass;
        ...
        destroy $s;
    }

Not that big a difference. And this is what we have with
refcounting/timely destruction:

    {
        my $s = new CoolClass;
        ...
    }

It's just that, in my opinion, timely destruction isn't something people
use rarely. It's a do-what-I-mean feature. And wasn't that a feature of
Perl 6?

I don't know whether most/many people think that timely destruction is
something we don't need much.

But it lets you forget about memory management AND resource management.

And if Leopold and Dan and others think we can get efficient timely
destruction with this, and it works, and it's fast,
I'm completely fine with it.

>
> But that's my opinion. :-)
>

And this is mine ;)


cu,

Robin Redeker

Apr 28, 2005, 11:57:10 AM
to Dan Sugalski, perl6-i...@perl.org
On Wed, Apr 27, 2005 at 03:43:32PM -0400, Dan Sugalski wrote:
> At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
> >Just out of curiosity: what was the design decision behind the GC
> >solution? Was refcounting that bad? Refcounting does give a more global
> >speed hit, but it's more deterministic and you won't run into
> >(potentially) long halts during GC. (Java programs often suffer from this,
> >and it results in bad latency.)
>
> I'll answer this one, since I'm the one responsible for it.
>
> Refcounting has three big issues:
>
> 1) It's expensive
> 2) It's error-prone
> 3) It misses circular garbage
>
[...]

>
> The expense is non-trivial as well. Yeah, it's all little tiny bits
> of time, but that adds up. It's all overhead, and useless overhead
> for the most part.

Yes, but do we know whether refcounting is really slower than a garbage
collector in the end?

>
> The circular garbage thing's also an issue. Yes, there are
> interesting hacks around it (python has one -- clever, but definitely
> a hack) that essentially involves writing a separate mark&sweep
> garbage collector.

I don't think circular references are used that much. This is maybe
something a programmer still has to think a little bit about.
And if it means that timely destruction becomes slow only for the
sake of collecting circular references... I don't know if that's a big
feature.

Are circular references such a big issue in Perl 5? I heard that the
built-in GC in Perl 5 only runs at program end and captures the circular
references, which sometimes causes a segfault (as a friend of mine experienced).
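(Aside: the Python "hack" Dan mentions is easy to observe in CPython.
With the cycle detector disabled, pure refcounting never reclaims a
cycle; an explicit gc.collect(), which is essentially the separate
tracing pass, does. This is an illustrative sketch, not Parrot code:)

```python
import gc
import weakref

class Node:
    """Two of these pointing at each other form a reference cycle."""
    def __init__(self):
        self.other = None

def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a      # refcounts can never drop to zero now
    return weakref.ref(a)        # a weak ref lets us watch for collection

gc.disable()                     # leave pure refcounting in charge
probe = make_cycle()
assert probe() is not None       # the cycle leaked: refcounting missed it
gc.collect()                     # the separate tracing pass...
assert probe() is None           # ...does reclaim the cycle
gc.enable()                      # restore normal behaviour
```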

>
> The thing is, there aren't any languages we care about that require
> true instant destruction -- the ones that care (or, rather, perl.
> Python and Ruby don't care) only guarantee block boundary collection,
> and in most cases (heck, in a near-overwhelming number of cases)
> timely destruction is utterly irrelevant to the code being run. Yeah,
> you *want* timely destruction, but you neither need nor notice it in
> most program runs, since there's nothing that will notice.

In many program runs you won't notice the overhead of refcounting either.
And in scripts that run for at most a minute, you won't even
notice if the memory isn't managed at all.

And timely destruction is still a feature that's used much more often than
collection of circular references would be (IMHO).

>
> Having been too deep into the guts of perl, and having written more
> extensions in C than I care to admit to, I wanted refcounting *dead*.
> It's just spread across far too much code, tracking down errors is a
> massive pain, and, well, yech. Yes, non-refcounting GC systems are a
> bit more complex, but the complexity is well-contained and
> manageable. (I wrote the first cut of the GC system in an afternoon
> sitting in my local public library) There's also the added bonus that
> you can swap in all sorts of different GC schemes without disturbing
> 99% of the code base.

Just because refcounting is error-prone, it doesn't mean that a garbage
collector is better (and less error-prone).
I agree that the code is more localized. But I guess that memory leaks
(and resource leaks) caused by a bug in a garbage collector aren't that easy
to find and fix either.

And there are many other things that are error-prone:
- a JIT is error-prone
- optimizations are error-prone
- register allocation is error-prone
- parsers/lexers/etc. are error-prone

But I understand your argument and see that it's maybe better to
localize the complex memory management stuff.

>
[.snipped interesting technical stuff about optimizing GC.]
>

I guess we will see what we get, and if GC turns out to deal
with all the issues (timely destruction etc.) well and is the
perfect solution, that's great!

I don't have the ability to look into the future, but Java still has
(after so many years) quite some GC issues (my Freenet process
seems to leak like hell, and the GC runs cause full stops of
0.6s, which causes heavy latency).

But I don't say (nor do I know) that Parrot will have these problems.

I was just afraid that GC won't be faster and lower-overhead than a
refcounted solution, and we'd end up with a laggy Parrot ;)
But many people here are convinced that the GC solution will be
faster and easier, and as Parrot is still in development, I guess
there will be profiles and benchmarks in the future.

> I really need to go profile perl 5 some time to get some real stats,
> but I think it's likely that most programs (well, most programs I run
> at least) have less than 0.1% of the variables with destructors, and
> maybe one or two variables *total* that have a need for timely
> destruction. (And most of the time they get cleaned up by global
> destruction, which makes 'em not actually timely cleaned up)

IMHO timely destruction is not that useless. The alternative would be
to free resources and close files explicitly, as one has to do in Java.

And doing resource freeing explicitly doesn't feel very Perl-ish to me.
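(To be fair, explicit freeing can be made fairly painless without
destructor timing. Python, for instance, scopes the explicit close in
a `with` block that also fires when something throws. A sketch with a
made-up Handle class -- the names here are illustrative, not a real
library API:)

```python
class Handle:
    """Hypothetical resource wrapper; stands in for a file or DB handle."""
    def __init__(self, name):
        self.name = name
        self.open = True

    def close(self):
        self.open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()     # runs on normal exit *and* when something throws
        return False     # don't swallow the exception

h = None
try:
    with Handle("db") as h:
        raise RuntimeError("something threw")   # simulate code that throws
except RuntimeError:
    pass

assert h is not None and not h.open   # released at block exit despite the throw
```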

> >You said, that most languages will have refcount semantics, just sounds
> >funny for me to implement a GC then.
>
> Actually most languages won't have refcount semantics. Perl 5's the
> only one that really guarantees that sort of thing now, though I
> think it's in there for perl 6. I doubt the python, ruby, Lisp, or
> Tcl compilers will emit the cleanup-at-block-boundary sweep code.

Well, Python doesn't guarantee it. But I've heard some people rely on
the feature because the implementation provides it.

But OK, as you have to manage resources manually in Lisp and Ruby
anyway, the compilers won't output the boundary sweep.

It's there for Perl 6, right, and I thought Parrot was first and foremost
there for Perl 6. And there may be new and other languages that
rely at least on timely destruction.

cya,

MrJoltCola

Apr 28, 2005, 1:39:46 PM4/28/05
to Dan Sugalski, Robin Redeker, perl6-i...@perl.org
At 01:10 PM 4/28/2005, Dan Sugalski wrote:

>At 5:57 PM +0200 4/28/05, Robin Redeker wrote:
>>On Wed, Apr 27, 2005 at 03:43:32PM -0400, Dan Sugalski wrote:
>>> At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
>>> The expense is non-trivial as well. Yeah, it's all little tiny bits
>>> of time, but that adds up. It's all overhead, and useless overhead
>>> for the most part.
>>
>>Yes, but do we know whether refcounting is really slower than a garbage
>>collector in the end?
>Yes, we do. This is a well-researched topic and one that's been gone over
>pretty thoroughly for the past twenty years or so. There's a lot of
>literature on this -- it's worth a run through citeseer for some of the
>papers or, if you've a copy handy in a local library, the book "Garbage
>Collection" by Jones and Lins, which is a good summary of most of the
>techniques, their costs, drawbacks, and implementation details.

Research indicates that, in theory, garbage collection can even be faster
than explicit memory management. One paper I remember in particular was
done by IBM, using various GCs for the JVM.

-Melvin

Dan Sugalski

Apr 28, 2005, 1:10:00 PM4/28/05
to Robin Redeker, perl6-i...@perl.org
At 5:57 PM +0200 4/28/05, Robin Redeker wrote:
>On Wed, Apr 27, 2005 at 03:43:32PM -0400, Dan Sugalski wrote:
>> At 5:40 PM +0200 4/27/05, Robin Redeker wrote:
>> >Just for the curious me: What was the design decision behind the GC
>> >solution? Was refcounting that bad? Refcounting gives a more global
>> >speed hit indeed, but it's more deterministic and you wont run into
>> >(probably) long halts during GC. (Java programs often suffer from this,
>> >and it results in bad latency).
>>
>> I'll answer this one, since I'm the one responsible for it.
>>
>> Refcounting has three big issues:
>>
>> 1) It's expensive
>> 2) It's error-prone
>> 3) It misses circular garbage
>>
>[...]
>>
>> The expense is non-trivial as well. Yeah, it's all little tiny bits
>> of time, but that adds up. It's all overhead, and useless overhead
>> for the most part.
>
>Yes, but do we know whether refcounting is really slower than a garbage
>collector in the end?

Yes, we do. This is a well-researched topic and one that's been gone
over pretty thoroughly for the past twenty years or so. There's a
lot of literature on this -- it's worth a run through citeseer for
some of the papers or, if you've a copy handy in a local library, the
book "Garbage Collection" by Jones and Lins, which is a good summary
of most of the techniques, their costs, drawbacks, and implementation
details.

> > The circular garbage thing's also an issue. Yes, there are
>> interesting hacks around it (python has one -- clever, but definitely
>> a hack) that essentially involves writing a separate mark&sweep
>> garbage collector.
>
>I don't think circular references are used that much. This is maybe
>something a programmer still has to think a little bit about.
>And if it means, that timely destruction maybe becomes slow only for the
>sake of collecting circular references... don't know if thats a big
>feature.

Circular references are far more common than objects that truly need
timely destruction, yes, and the larger your programs get the more of
an issue it is. Neither are terribly common, though.

>Are circular references such a big issue in Perl 5? I heard that the
>built-in GC in Perl 5 only runs at program end and captures the circular
>references, what sometimes causes a segfault (as a friend of mine
>experienced).

Perl 5, like all other refcounting GC systems, is essentially
incremental and things get collected as the program runs. There's a
final global destruction sweep that clears up anything still alive
when the program exits, but this sweep doesn't guarantee ordering
(and it really can't in the general case when there's circular
garbage) and can in some cases cause segfaults if you've got
finalizers written in C that don't properly handle last-gasp out of
order cleanup.

> > The thing is, there aren't any languages we care about that require
>> true instant destruction -- the ones that care (or, rather, perl.
>> Python and Ruby don't care) only guarantee block boundary collection,
>> and in most cases (heck, in a near-overwhelming number of cases)
>> timely destruction is utterly irrelevant to the code being run. Yeah,
>> you *want* timely destruction, but you neither need nor notice it in
>> most program runs, since there's nothing that will notice.
>
>In many program runs you won't notice the overhead of refcounting either.
>And in scripts, that only run up to (max) a minute, you won't even
>notice if the memory isn't managed at all.

We're building a general purpose engine, remember. It needs to handle
programs with 10 objects that run in 10ms as well as ones with 10M
objects that run for 10 weeks.

That argument actually favors non-refcount GC -- there's a minor
speed win to a non-refcount system at the short end of program runs
and a significant one at the large end of program runs.

>And timely destruction is still a feature thats used much more than
>collection of circular references would be (IMHO)

In this case, YHO would turn out to be incorrect. Don't get me wrong,
it's a sensible thing to think, it's just that it doesn't hold up on
closer inspection.

> > Having been too deep into the guts of perl, and having written more
>> extensions in C than I care to admit to, I wanted refcounting *dead*.
>> It's just spread across far too much code, tracking down errors is a
>> massive pain, and, well, yech. Yes, non-refcounting GC systems are a
>> bit more complex, but the complexity is well-contained and
>> manageable. (I wrote the first cut of the GC system in an afternoon
>> sitting in my local public library) There's also the added bonus that
>> you can swap in all sorts of different GC schemes without disturbing
>> 99% of the code base.
>
>Just because refcounting is error-prone it doesn't mean that a garbage
>collector is better (and less error-prone).
>I agree, the code is more localized. But i guess that memory leaks
>(and resource leaks) that are caused by a bug in a garbage collector
>aren't that easy
>to find and fix also.

Actually they are, significantly. Bugs in a centralized GC system
show up reasonably quickly and usually very fatally -- they're pretty
darn catastrophic in most cases, either causing programs to consume
all memory and die, or collecting too soon and causing things to die.
Indeed, if you take a look through the list archives, most of the
subtle GC issues are in those places where we bypass the GC for some
reason, usually in tracking external resources.

> > I really need to go profile perl 5 some time to get some real stats,
>> but I think it's likely that most programs (well, most programs I run
>> at least) have less than 0.1% of the variables with destructors, and
>> maybe one or two variables *total* that have a need for timely
>> destruction. (And most of the time they get cleaned up by global
>> destruction, which makes 'em not actually timely cleaned up)
>
>IMHO timely destruction is not that useless. The alternative would be
>to free resources and close files explicit, like one has to do in Java.
>
>And doing resource freeing explicit doesn't feel very Perl*-ish to me.

Oh, I'm not saying that automatic resource freeing's a bad idea -- it
isn't. Neither am I saying that timely destruction's entirely useless
-- it isn't. What I *am* saying is that a true need for timely
destruction is rare when other facilities are provided, and as such
it doesn't warrant being optimized for.

> > >You said, that most languages will have refcount semantics, just sounds
>> >funny for me to implement a GC then.
>>
>> Actually most languages won't have refcount semantics. Perl 5's the
>> only one that really guarantees that sort of thing now, though I
>> think it's in there for perl 6. I doubt the python, ruby, Lisp, or
>> Tcl compilers will emit the cleanup-at-block-boundary sweep code.
>
>Well, Python doesn't guarantee it. But I heard some people rely on
>that feature as the implementation does provide it.

Guido explicitly told me it's OK to break this sort of thing if I wanted. :)

>But ok, as you have to manage resources manually in Lisp and Ruby
>anyway, the compilers won't output the boundary-sweep.
>
>It's there for Perl6, right, and i thought that parrot is in first place
>there for Perl6.

Well.... no. Not for a number of years, but that's a very long story.

> And there maybe will be new and other languages that
>rely at least on timely destruction.

At the moment there aren't, and certainly not in the group of
languages that we primarily care about. Parrot has, for example,
features that make languages like C significantly difficult to
implement on top of parrot, but we don't really care because C isn't
one of our target languages.

Dan Sugalski

Apr 28, 2005, 2:34:03 PM4/28/05
to Robin Redeker, Luke Palmer, perl6-i...@perl.org
At 12:12 AM +0200 4/28/05, Robin Redeker wrote:
>On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:
>> Dan Sugalski writes:
>> > Also, with all this stuff, people are going to find timely destruction
>> > is less useful than they might want, what with threads and
>> > continuations, which'll screw *everything* up, as they are wont to do.
>> > I know I've been making heavy use of continuations myself, and this is
>> > for a compiler for a language that doesn't even have subroutines in
>> > it. Continuations screw everything up, which is always fun, for
>> > unusual values of 'fun'.
>>
>> When I programmed in C++ with proper use of the STL and other such
>> abstractions, almost the only time I needed destructors were for
>> block-exit actions. Perl 6 has block-exit hooks, so you don't need to
>> use destructors to fudge those anymore. And there's a lot less explicit
>> memory management in languages with GC (that's the point), so
>> destructors become even less useful.
>>
>> I agree with Dan completely here. People make such a big fuss over
>> timely destruction when they don't realize that they don't really need
>> it. (But they want it). I think, more importantly, they don't
>> understand what they're getting in return for giving it up.
>
>Could you point out what i get?
>
>I use TD is to handle resources: filehandles, database handles, gui
>windows and all that.

Sure, most people do. Heck, I do. And... I don't need timely
destruction for them. Not in the "at scope exit this will be cleaned
up" sense, at least.

"Timely" may be getting in the way here. Timely is as soon as
possible but with very few exceptions it isn't necessary. (There are
a couple of perl idioms that really demand it at the block level, but
that's about it) A number of languages use it as an alternative to
providing block exit actions, though those aren't really a great
place for forcing finalizers, since you can't be sure that a thing is
actually in a state to be finalized.

You've really got three classes of things that the GC system needs to
know about. There are the things that need cleaning eventually
(basically the no-finalizer objects), things that need cleaning up
reasonably quickly (most things with finalizers, including most file
and DB handles) where reasonably quickly's in the 100-200ms range,
and things that need cleaning as soon as they die.

We don't provide any good support for instant destruction. People are
just going to have to live with that, but honestly I don't think
anyone's going to notice. (Though compilers can force a call to a
PMC's finalizer if they want) For block-level cleanup there's
conditional garbage collection and tagging PMCs as needing GC
attention, which provides the support that most everyone needs. How
expensive a sweep is depends entirely on how many PMCs you have, but
the tracing part of the collector's darned fast, especially if you're
using the aggregate PMCs that we provide, since they're special-cased
in the collector.
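(The tagging scheme described above can be sketched in a few lines.
The names here are illustrative, not Parrot's real API: the
constructor tags objects that need expedited destruction, and the op
emitted at block boundaries sweeps only when some tag is outstanding,
so the common case stays a cheap no-op:)

```python
class Interp:
    """Toy interpreter state: a heap plus a count of tagged objects."""
    def __init__(self):
        self.live = []            # all allocated objects
        self.needs_sweep = 0      # count of objects tagged for expedited cleanup

    def allocate(self, obj, timely=False):
        obj.timely = timely
        obj.reachable = True
        self.live.append(obj)
        if timely:
            self.needs_sweep += 1
        return obj

    def sweep_if_needed(self):
        """Emitted at block exit: a no-op unless something is tagged."""
        if self.needs_sweep == 0:
            return 0              # the common, cheap case
        finalized = [o for o in self.live if not o.reachable]
        for o in finalized:
            self.live.remove(o)   # "finalize" the dead object
            if o.timely:
                self.needs_sweep -= 1
        return len(finalized)

class Obj:
    pass

interp = Interp()
plain = interp.allocate(Obj())               # no finalizer, no tag
handle = interp.allocate(Obj(), timely=True) # e.g. a filehandle-like object
handle.reachable = False                     # it goes out of scope
swept = interp.sweep_if_needed()             # block exit: sweeps exactly once
```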

>Refcounting does this with a little overhead, but
>in a fast and deterministic O(1) way.

Deterministic, yes. Fast, no. Refcounting, in general, is about twice
as expensive over the life of a program as other GC schemes.

>And how do you want to implement glib objects with parrot?
>They are refcounted.

They'll be wrapped, so it's no big deal. When the PMC wrapping them
dies, its finalizer decrements the refcount of the wrapped glib thing
and leaves it to die on its own.

Luke Palmer

Apr 28, 2005, 1:48:16 PM4/28/05
to Robin Redeker, perl6-i...@perl.org
Robin Redeker writes:
> This should actually be, to prevent the resource from leaking:

>
> {
>     my $s = new CoolClass;
>     eval {
>         ... do stuff that may throw ...
>     };
>     destroy $s;
> }

Or, with the "block hooks" that I keep claiming makes timely destruction
almost never needed, it is:

    {
        my $s = new CoolClass;

        # ... do stuff that may throw ...
        LEAVE { destroy $s }
    }

This destroys properly and even propagates the exception.
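(For comparison, in a language without LEAVE blocks the same guarantee
is usually spelled try/finally. A Python sketch, with destroy()
standing in for whatever cleanup the object needs -- all names here
are illustrative:)

```python
class CoolClass:
    def __init__(self):
        self.destroyed = False

    def destroy(self):
        self.destroyed = True

created = []   # keep track of every instance so we can check cleanup

def work(should_throw):
    s = CoolClass()
    created.append(s)
    try:
        if should_throw:
            raise ValueError("something threw")
        return s
    finally:
        s.destroy()       # runs on every exit path, like LEAVE

s = work(False)
assert s.destroyed        # destroyed on the normal path

thrown = False
try:
    work(True)
except ValueError:
    thrown = True
assert thrown                                # the exception still propagates
assert all(c.destroyed for c in created)     # ...and cleanup ran anyway
```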

Luke

Dan Sugalski

Apr 28, 2005, 4:29:34 PM4/28/05
to Luke Palmer, Robin Redeker, perl6-i...@perl.org

Actually it destroys improperly, since it destroys unconditionally,
which is likely wrong. The right thing is to have the constructor for
CoolClass tag its constructed object as needing expedited destruction
and have the language compiler emit conditional sweep ops as part of
its block finishing code.

Explicit destruction like that will clean up the object even if there
are outstanding references, which is likely the wrong thing to do.

Dan Sugalski

Apr 28, 2005, 4:37:21 PM4/28/05
to Robin Redeker, Luke Palmer, perl6-i...@perl.org
At 7:24 PM +0200 4/28/05, Robin Redeker wrote:
>I just wanted to correct my small example:

>
>On Thu, Apr 28, 2005 at 05:00:53PM +0200, Robin Redeker wrote:
>> > Robin Redeker writes:
>> And with explicit resource handling (without timely destruction) it may be:
>>
>> {
>> my $s = new CoolClass;
>> ...
>> destroy $s;
>> }
>
>This should actually be, to prevent the resource from leaking:
>
> {
> my $s = new CoolClass;
> eval {
> ... do stuff that may throws ...
> };
> destroy $s;
> }

That's not necessary to prevent the resource from leaking. It's only
necessary if you want to unconditionally destroy the object when
control is leaving that block for the first time. You probably don't
want to, since that means any outstanding references will now refer
to a dead object.

If you don't have the destroy, and don't tag the object as needing
expedited cleanup, then the finalizer *will* still be called. You
just don't have any control over when its called.

> > Not that big difference. And this is what we have with
> > refcounting/timely destruction:
>>
>> {
>> my $s = new CoolClass;
>> ...
>> }
>>
>

>The latter example will destruct nicely if something throws.

Regardless of GC method, yes. The only question is when.

It's *really* important to note that I explained how Parrot does GC
-- that wasn't opening a discussion on redesigning the feature.
Parrot doesn't have, and isn't going to have, a refcounting GC
system. That's just not an option.

Parrot's got a tracing GC system triggered on demand by failure to
allocate resources, explicitly by user code, or at regular intervals.
Switching to a different tracing system than what's in now is doable.
Switching away from a tracing system *isn't*. That'd require changing
the entire source base, and just isn't feasible.

Robin Redeker

Apr 28, 2005, 1:24:49 PM4/28/05
to Luke Palmer, perl6-i...@perl.org
I just wanted to correct my small example:

On Thu, Apr 28, 2005 at 05:00:53PM +0200, Robin Redeker wrote:
> > Robin Redeker writes:

> And with explicit resource handling (without timely destruction) it may be:
>
> {
> my $s = new CoolClass;
> ...
> destroy $s;
> }

This should actually be, to prevent the resource from leaking:

    {
        my $s = new CoolClass;
        eval {
            ... do stuff that may throw ...
        };
        destroy $s;
    }

>
> Not that big difference. And this is what we have with
> refcounting/timely destruction:
>
> {
> my $s = new CoolClass;
> ...
> }
>

The latter example will destruct nicely if something throws.

--

Dave Mitchell

Apr 28, 2005, 6:25:44 PM4/28/05
to Robin Redeker, Dan Sugalski, perl6-i...@perl.org
On Thu, Apr 28, 2005 at 05:57:10PM +0200, Robin Redeker wrote:
> Just because refcounting is error-prone it doesn't mean that a garbage
> collector is better (and less error-prone).

I'm one of the maintainers of the perl5 core. perl5 is very mature, with
relatively few new features being added, and much emphasis on fixing
existing bugs. A quick look at my last 60 patches shows that about 1/3 of
them were related to ref count and memory leak bugs. Ref counting is nearly
impossible to get 100% right.

One of the advantages of GC over RC is that you are automating the
process, and once the automation has the bugs wrinkled out of it, the whole
rest of your code reaps the advantages. With RC, every single line of code
in the core (and extensions) has to be carefully thought about to make
sure there aren't leaks or premature frees, and every single change risks
introducing new errors.

(On the other hand, I *enjoy* hunting down refcnt bugs, so
Down With GC !! :-)

--
That he said that that that that is is is debatable, is debatable.

Martin D Kealey

Apr 28, 2005, 9:43:49 PM4/28/05
to perl6-i...@perl.org
On Thu, 28 Apr 2005, Robin Redeker wrote:
> I don't think circular references are used that much.

Circular references are useful any time you need to be able to iterate
over a collection, and also have to identify which collection a given object
is in.

This may even be implicit from other requirements, such as:

* enforcing that an object may be in at most one collection of objects at
a time

* instructing an object to detach from its current collection (especially
when forcibly destroying the object)


In fact I often want BOTH timely destruction AND circular data structures;
for example, I want to hold an object that represents a row in a database,
and I don't want to have to care which transaction, table, database or
connection it belongs to.

But at the other end, the object representing a transaction has to be able
to reverse changes to all its component row objects. And when the
transaction goes out of scope it needs to be either committed or rolled
back.

-Martin

Bob Rogers

Apr 28, 2005, 10:55:49 PM4/28/05
to Robin Redeker, Dan Sugalski, perl6-i...@perl.org
From: Robin Redeker <ro...@nethype.de>
Date: Thu, 28 Apr 2005 00:12:50 +0200

On Wed, Apr 27, 2005 at 12:33:30PM -0600, Luke Palmer wrote:

> I agree with Dan completely here. People make such a big fuss over
> timely destruction when they don't realize that they don't really need
> it. (But they want it). I think, more importantly, they don't
> understand what they're getting in return for giving it up.

Could you point out what i get?

I use TD is to handle resources: filehandles, database handles, gui
windows and all that.

My idea of "timely" is RIGHT NOW for filehandles, when the program ends
for DB handles, and whenever for the rest. So I don't see TD as much of
a win, even if it were completely for free. But it's not; the rest of
the world would have to forego entire classes of data structures in
order to save the few lines of extra code you would need to manage a few
of yours explicitly. This seems like a bad tradeoff.

Refcounting does this with a little overhead, but in a fast and
deterministic O(1) way.

This is the first half of an apples-to-oranges comparison, and so is
misleading even if partly true. Refcounting may be proportional
(approximately) to the amount of reference manipulation, but GC is
proportional (though even more approximately, and with a different
constant) to the amount of memory allocated [1]. Because of this, a
program that runs for a long time without allocating much additional
memory incurs zero GC cost during that time; this is a clear (and even
deterministic) win over refcounting.

From: Dan Sugalski <d...@sidhe.org>
Date: Thu, 28 Apr 2005 13:10:00 -0400

. . .

>I don't think circular references are used that much. This is maybe
>something a programmer still has to think a little bit about.
>And if it means, that timely destruction maybe becomes slow only for the
>sake of collecting circular references... don't know if thats a big
>feature.

Circular references are far more common than objects that truly need
timely destruction, yes, and the larger your programs get the more of
an issue it is. Neither are terribly common, though.

I'm astounded. Do neither of you ever design data structures with
symmetrical parent<->child pointers? No trees with parents? No
doubly-linked lists? In my (probably skewed) experience, circular
references are used frequently in languages like C or Lisp that don't
penalize them.

[1] Work done by a copying GC when it runs is proportional to the
amount of non-garbage copied, but that doesn't apply to Parrot, and
we were talking about GC frequency in any case.

Martin D Kealey

Apr 28, 2005, 10:59:36 PM4/28/05
to perl6-i...@perl.org
On Thu, 28 Apr 2005, Luke Palmer wrote:
> Or, with the "block hooks" that I keep claiming makes timely destruction
> almost never needed, it is:
>
> {
> my $s = new CoolClass;
> # ... do stuff that may throw ...
> LEAVE { destroy $s }
> }
>
> This destroys properly and even propagates the exception.

That's good ... but can we have a shorthand? Stealing an old C keyword, I'd
suggest:

    {
        auto $a = new CoolClass;
        # ... do stuff that may throw ...
    }


Uri Guttman

Apr 29, 2005, 12:37:57 AM4/29/05
to Robin Redeker, Dan Sugalski, perl6-i...@perl.org
>>>>> "RR" == Robin Redeker <ro...@nethype.de> writes:

RR> I don't think circular references are used that much. This is
RR> maybe something a programmer still has to think a little bit
RR> about. And if it means, that timely destruction maybe becomes
RR> slow only for the sake of collecting circular references... don't
RR> know if thats a big feature.

ever do any callback stuff? an object needs to be called back from a
service (say an i/o handler). it must pass itself to the service to be
stored there. the service returns a handle which needs to be stored in
the object so it can be used to manage it (start/stop/abort/etc.). there
is a quick circular ref.

now, weakrefs can fix that but i don't like that solution. GC fixes it
with no problems at all. you can just drop the object and it and the
service handle (whatever that is) will be reclaimed soon enough.
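(The callback cycle and the weakref fix look like this in Python terms
-- the service holds only a weak reference, so dropping the object is
enough even under pure refcounting. The Service/Client names are
illustrative, not any real library:)

```python
import weakref

class Service:
    """Toy i/o service: maps handles to callbacks into registered objects."""
    def __init__(self):
        self.callbacks = {}
        self.next_id = 0

    def register(self, obj):
        self.next_id += 1
        # A weak reference breaks the service -> object -> handle cycle
        self.callbacks[self.next_id] = weakref.ref(obj)
        return self.next_id

class Client:
    def __init__(self, service):
        self.handle = service.register(self)   # handle stored back in object

svc = Service()
client = Client(svc)
handle = client.handle
assert svc.callbacks[handle]() is client       # callback still reaches it

del client                                     # drop the only strong reference
assert svc.callbacks[handle]() is None         # refcounting reclaimed it at once
```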

and there are many useful data structures that need backrefs. not
everything can be represented in clean tree. i won't go into these but
there is plenty of literature on them.

by far the biggest problem of refcounting (from the coder's perspective,
not its speed or internal debugging stuff) is circular refs. i currently
require explicit destroys to make sure i break them. i could (maybe i
will one day) switch to weakrefs. but i would much prefer to not know or
care about circular refs at all and allow stuff be collected when they
can.

but there is another reason i need explicit destroys, the service must
be shut down for a given handle. you can't have a select loop running
with lots of dead sockets and callbacks to nowhere. even weakrefs and
refcounting can't fix this and neither can GC. waiting later to close
down external resources like DB handles can work but select loops need
finer control.

in any case, i am totally for dan's decision to go with GC. it removes a
major weakness in perl's memory management. refcounting just doesn't
win in speed nor in internal complexity nor in safety of coding.

uri

--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
--Perl Consulting, Stem Development, Systems Architecture, Design and Coding-
Search or Offer Perl Jobs ---------------------------- http://jobs.perl.org

Gerald Butler

Apr 29, 2005, 8:14:38 AM4/29/05
to Martin D Kealey, perl6-i...@perl.org

Isn't there something like:

    {
        my $s does LEAVE { destroy $s } = new CoolClass;
        # ... do stuff that may throw ...
    }

Or something like that?



Larry Wall

Apr 29, 2005, 8:48:40 AM4/29/05
to perl6-i...@perl.org
On Fri, Apr 29, 2005 at 08:14:38AM -0400, Butler, Gerald wrote:
:
: Isn't there something like:
:
:     {
:         my $s does LEAVE { destroy $s } = new CoolClass;
:         # ... do stuff that may throw ...
:     }
:
: Or something like that?

Yes, it's

    my $s will leave { destroy $s } = new CoolClass;

and then it only destroys if $s has been initialized. Also handy
are transactional versions that work on success or failure of the block:

    my $s will keep { destroy $s } = new CoolClass;
    my $s will undo { destroy $s } = new CoolClass;

There is no requirement for timely destruction in Perl 6.

Larry

Leopold Toetsch

Apr 29, 2005, 9:05:14 AM4/29/05
to Gerald Butler, perl6-i...@perl.org
Gerald Butler <gerald...@jewels.com> wrote:

> Isn't there something like:

> {
> my $s does LEAVE { destroy $s } = new CoolClass;
> # ... do stuff that may throw ...
> }

> Or something like that?

Not currently. There used to be a C<destroy> opcode, but I've deleted
it because I thought it was too dangerous. But it could be useful *if*
the compiler or Parrot hacker really knows that there isn't any
reference to that object around.

Above would roughly translate to

        leave = newsub .Closure, _LEAVE
        pushaction leave
        ...
        s = newclass "CoolClass"
        ...
    .end

    .sub _LEAVE
        s = find_lex "$s"
        destroy s
    .end

(variable declarations omitted)

leo

Dan Sugalski

Apr 29, 2005, 3:06:14 PM4/29/05
to Uri Guttman, Robin Redeker, perl6-i...@perl.org
At 12:37 AM -0400 4/29/05, Uri Guttman wrote:
> >>>>> "RR" == Robin Redeker <ro...@nethype.de> writes:
>
> RR> I don't think circular references are used that much. This is
> RR> maybe something a programmer still has to think a little bit
> RR> about. And if it means, that timely destruction maybe becomes
> RR> slow only for the sake of collecting circular references... don't
> RR> know if thats a big feature.
>
>ever do any callback stuff? an object needs to be called back from a
>service (say a i/o handler). it must pass itself to the service to be
>stored there. the service returns a handle which needs to be stored in
>the object so it can be used to manage it (start/stop/abort/etc.). there
>is a quick circular ref.

Oh, I realize that, along with a number of other useful uses of
circular references. I can't speak for Robin, but when I said
circular refs weren't that common I was talking about the overall
population of objects. The large majority of objects are dead-stupid
things that have no finalizers and no references to anything; the
second-largest (and definitely smaller) case is objects that refer to
other objects in a non-circular fashion, then circular objects, then
objects that need timely destruction.

Anyway, that usage pattern argues for handling simple PMCs most
efficiently, reference/aggregate PMCs second, and ones with timely
destructors last, which is how parrot's DOD/GC system is set up.
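A minimal sketch of that tiering, assuming an invented object model (this
is not Parrot's actual DOD code): simple objects are marked but contribute
no children to scan, aggregates are traced, and finalizers are deferred so
they run last.

```python
class Obj:
    """Toy PMC: 'simple' if it has no children and no finalizer."""
    def __init__(self, children=(), finalizer=None):
        self.children = list(children)
        self.finalizer = finalizer
        self.marked = False

def mark(roots):
    """Mark everything reachable; collect finalizable objects for later."""
    need_finalize = []
    stack = list(roots)
    while stack:
        o = stack.pop()
        if o.marked:
            continue
        o.marked = True
        if o.finalizer:
            need_finalize.append(o)   # defer: timely destructors run last
        stack.extend(o.children)      # cheap no-op for simple objects
    return need_finalize

log = []
leaf = Obj()                                        # simple: no refs
agg = Obj(children=[leaf])                          # aggregate
timely = Obj(finalizer=lambda: log.append("fin"))   # needs destruction
for o in mark([agg, timely]):
    o.finalizer()
print(log, leaf.marked)   # → ['fin'] True
```

The common case (simple objects) costs one mark and nothing more; only the
rare finalizable objects pay the extra bookkeeping.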

Dan Sugalski

Apr 29, 2005, 3:23:47 PM
to Bob Rogers, Robin Redeker, perl6-i...@perl.org
At 10:55 PM -0400 4/28/05, Bob Rogers wrote:
> From: Robin Redeker <ro...@nethype.de>
> Date: Thu, 28 Apr 2005 00:12:50 +0200
> Refcounting does this with a little overhead, but in a fast and
> deterministic O(1) way.
>
>This is the first half of an apples-to-oranges comparison, and so is
>misleading even if partly true. Refcounting may be proportional
>(approximately) to the amount of reference manipulation, but GC is
>proportional (though even more approximately, and with a different
>constant) to the amount of memory allocated [1].

Actually it's proportional to the number of live objects.

A refcounting scheme has to touch *all* objects at least twice, while
a tracing scheme generally has to touch only the objects that are
live at trace time. For the most part, refcounting's cost is
proportional to the total number of objects ever created, while
tracing's cost is proportional to the number of live objects.

It's definitely possible to work up degenerate examples for both
refcount and tracing systems that show them in a horribly bad light
relative to the other, but in the general case the tracing schemes
are significantly less expensive.
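A toy simulation of the proportionality claim (invented accounting, not a
real collector): count one "touch" per refcount update at creation and at
free, versus one touch per live object visited by a mark pass.

```python
def refcount_run(total):
    """Every object is touched at least twice over its lifetime."""
    touches = 0
    for _ in range(total):
        touches += 1        # set refcount to 1 at creation
        touches += 1        # decrement to 0 and free
    return touches

def tracing_run(live_roots):
    """A mark pass visits only objects reachable at trace time."""
    touches = 0
    stack = list(live_roots)
    seen = set()
    while stack:
        o = stack.pop()
        if id(o) in seen:
            continue
        seen.add(id(o))
        touches += 1        # mark the live object
        stack.extend(o)     # children, if this object is a container
    return touches

total = 1000
live = [[] for _ in range(10)]   # only 10 objects survive to trace time
print(refcount_run(total), tracing_run(live))   # → 2000 10
```

With 1000 allocations and 10 survivors, the refcount side does 2000
touches to the tracer's 10, which is the general-case gap Dan describes.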

> From: Dan Sugalski <d...@sidhe.org>
> Date: Thu, 28 Apr 2005 13:10:00 -0400
>
> . . .
>
> >I don't think circular references are used that much. This is maybe
> >something a programmer still has to think a little bit about.
> >And if it means, that timely destruction maybe becomes slow only for the
> >sake of collecting circular references... don't know if that's a big
> >feature.
>
> Circular references are far more common than objects that truly need
> timely destruction, yes, and the larger your programs get the more of
> an issue it is. Neither are terribly common, though.
>
>I'm astounded. Do neither of you ever design data structures with
>symmetrical parent<->child pointers? No trees with parents? No
>doubly-linked lists? In my (probably skewed) experience, circular
>references are used frequently in languages like C or Lisp that don't
>penalize them.

I responded to Uri on this, but note that I said "neither are
terribly common", and they aren't. Relative to the total number of
GC-able things, objects in circular structures are a very small
minority. That, of course, doesn't help much when, as an app
designer, you have to deal with these things, but it is important to
know when designing the back end, since the relative usage of
features has to be kept in mind when making design tradeoffs. One of
those annoying engineering things. (Just once I'd love to have my
cake *and* eat it too, dammit! :)
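Bob's parent<->child case can be demonstrated with CPython itself, which
combines refcounting with a tracing cycle collector; with the cycle
collector disabled, refcounts alone never reclaim the pair (`Node` and
`probe` are names invented for the demo).

```python
import gc
import weakref

class Node:
    pass

gc.disable()                 # leave only reference counting
parent = Node()
child = Node()
parent.child = child         # symmetrical parent<->child pointers
child.parent = parent
probe = weakref.ref(parent)  # watch the parent without keeping it alive

del parent, child            # drop the external references
print(probe() is not None)   # → True: the cycle keeps itself alive

gc.collect()                 # the tracing cycle collector reclaims it
print(probe() is None)       # → True
gc.enable()
```

This is exactly the hybrid tradeoff under discussion: refcounting gives
timely destruction for the acyclic majority, and a tracing pass mops up
the cycles.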

Dan Sugalski

Apr 29, 2005, 3:07:36 PM