
Re: multithreading.


Chris Thomasson

Mar 28, 2008, 11:09:26 PM
[comp.programming.threads added]

On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:

> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13uqrd4...@corp.supernews.com...
>>> You should really look at a language with a concurrent garbage collector
>>> if you want to do multithreading. C++ really falls down here because it
>>> lacks a good foundation.
>>
>> Why do you think one needs a garbage collector in order to implement
>> multi-threaded algorithms?
>
> Unless your allocation lifetimes happen to be statically well defined,
> you'll essentially end up reinventing a garbage collector. That is hard
> enough in single threaded applications but it is much harder in the
> presence of multithreading because it is so error prone.

That does not make sense. I choose the right lifetime management scheme
for the algorithms I need to implement. To this date, I have not
"needed" full-blown garbage collection. I mainly make use of distributed
reference counting and proxy collectors to manage the memory in
non-blocking algorithms. This is not reinventing a garbage collector at
all. BTW, garbage collectors have their share of problems; here are some
of them:


http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d


James Kanze

Mar 29, 2008, 7:33:47 AM
On 29 mar, 04:09, Chris Thomasson <cris...@comcast.net> wrote:
> On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:
> > Chris Thomasson wrote:
> >> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> >>news:13uqrd4...@corp.supernews.com...
> >>> You should really look at a language with a concurrent garbage collector
> >>> if you want to do multithreading. C++ really falls down here because it
> >>> lacks a good foundation.

> >> Why do you think one needs a garbage collector in order to
> >> implement multi-threaded algorithms?

> > Unless your allocation lifetimes happen to be statically
> > well defined, you'll essentially end up reinventing a
> > garbage collector. That is hard enough in single threaded
> > applications but it is much harder in the presence of
> > multithreading because it is so error prone.

> That does not make sense. I choose the right lifetime
> management scheme for the algorithms I need to implement.

Object lifetime and garbage collection are orthogonal---garbage
collectors do NOT manage object lifetime. And of course, both
are orthogonal to threading---except that like just about
everything else, it's harder to implement the algorithms used
(garbage collector, or the alternatives) in the presence of
threads.

> To this date, I have not "needed" full-blown garbage
> collection.

You never "need" it. (For that matter, you never need C++, or
even C. You can do everything in assembler.)

> I mainly make use of distributed reference counting and proxy
> collectors to manage the memory in non-blocking algorithms.
> This is not reinventing a garbage collector at all. BTW,
> garbage collectors have their share of problems; here are some
> of them:

> http://groups.google.com/group/comp.programming.threads/browse_frm/th...

Nothing is perfect, and there are contexts where garbage
collection may not be the answer. Or garbage collection mixed
with something else. It's a nice option to have, and makes
coding a few specific things easier, but it's not a silver
bullet.

--
James Kanze (GABI Software) email:james...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Jon Harrop

Mar 29, 2008, 10:05:21 AM
Chris Thomasson wrote:
> On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:
>> Unless your allocation lifetimes happen to be statically well defined,
>> you'll essentially end up reinventing a garbage collector. That is hard
>> enough in single threaded applications but it is much harder in the
>> presence of multithreading because it is so error prone.
>
> That does not make sense. I choose the right lifetime management scheme
> for the algorithms I need to implement. To this date, I have not
> "needed" full-blown garbage collection. I mainly make use of distributed
> reference counting and proxy collectors to manage the memory in
> non-blocking algorithms. This is not reinventing a garbage collector at
> all.

Reference counting was one of the earliest forms of garbage collection and
it is riddled with many very serious and well known problems. You are
literally reinventing the GC and you are now several decades out of date.

> BTW, garbage collectors have their share of problems; here are some
> of them:
>
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d

The thread you cite is a discussion about two different articles, neither of
which substantiate your claim:

The first is an article describing problems with a specific Java data
structure (WeakHashMap):

http://blogs.azulsystems.com/cliff/2007/08/why-weakhashmap.html

As the commenters on that post have already explained in detail, the
author's use of this data structure is wrong and many of his statements
about it are wrong because he didn't bother to read the documentation. For
example, read the comment:

"I'm hoping the irony in the title is intentional. There are several
layers of "suck" going on here and WeakHashMap is the least of them.
Assuming you have a correct hashcode and concurrent GC, as you said
everything gets cleared out really fast and the Javadocs explain all this."

i.e. the poster's problem is a direct result of the GC being too efficient
at collecting unused values. This problem is commonly cited in many GC'd
languages by newbies who are trying to abuse the garbage collector.

The second article describes a problem with memory fragmentation:

http://www.coversant.net/Coversant/Blogs/tabid/88/EntryID/9/Default.aspx

As the poster described, this problem is entirely a result of pinning, i.e.
manual memory management. As he explains, if the code were entirely
high-level with automatic memory management this problem would never have
occurred because the (moving) GC automatically defragments the heap.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?u

Chris Thomasson

Mar 29, 2008, 8:23:15 PM
"Jon Harrop" <use...@jdh30.plus.com> wrote in message
news:13usjnh...@corp.supernews.com...

> Chris Thomasson wrote:
>> On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:
>>> Unless your allocation lifetimes happen to be statically well defined,
>>> you'll essentially end up reinventing a garbage collector. That is hard
>>> enough in single threaded applications but it is much harder in the
>>> presence of multithreading because it is so error prone.
>>
>> That does not make sense. I choose the right lifetime management scheme
>> for the algorithms I need to implement. To this date, I have not
>> "needed" full-blown garbage collection. I mainly make use of distributed
>> reference counting and proxy collectors to manage the memory in
>> non-blocking algorithms. This is not reinventing a garbage collector at
>> all.
>
> Reference counting was one of the earliest forms of garbage collection and
> it is riddled with many very serious and well known problems.

Cycles aside, can you list several other very serious problems? BTW, IMHO,
cycles are a red herring and can usually be designed around.


> You are
> literally reinventing the GC and you are now several decades out of date.

How are virtually zero-overhead counting algorithms out-of-date? Please
explain...


>> BTW, garbage collectors have their share of problems; here are some
>> of them:
>>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d
>
> The thread you cite is a discussion about two different articles, neither
> of
> which substantiate your claim:
>
> The first is an article describing problems with a specific Java data
> structure (WeakHashMap):

[...]

How do you effectively cache objects in a "collect-the-world" environment?
GC and caching don't get along _sometimes_... From the application's point
of view, an object in the cache is in a quiescent state. However, from the
GC's perspective, a node in the cache is a pointer to a "live object".

Chris Thomasson

Mar 29, 2008, 8:26:59 PM
"James Kanze" <james...@gmail.com> wrote in message
news:5593adfa-6fbd-4b8a...@s50g2000hsb.googlegroups.com...

On 29 mar, 04:09, Chris Thomasson <cris...@comcast.net> wrote:
> > On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:
> > > Chris Thomasson wrote:
> > >> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> > >>news:13uqrd4...@corp.supernews.com...
> > >>> You should really look at a language with a concurrent garbage
> > >>> collector
> > >>> if you want to do multithreading. C++ really falls down here because
> > >>> it
> > >>> lacks a good foundation.

> > >> Why do you think one needs a garbage collector in order to
> > >> implement multi-threaded algorithms?

> > > Unless your allocation lifetimes happen to be statically
> > > well defined, you'll essentially end up reinventing a
> > > garbage collector. That is hard enough in single threaded
> > > applications but it is much harder in the presence of
> > > multithreading because it is so error prone.

> > That does not make sense. I choose the right lifetime
> > management scheme for the algorithms I need to implement.

> Object lifetime and garbage collection are orthogonal---garbage
> collectors do NOT manage object lifetime.

They determine when an object is quiescent, which is probably the most
important aspect of lifetime management schemes in general.


> And of course, both
> are orthogonal to threading---except that like just about
> everything else, it's harder to implement the algorithms used
> (garbage collector, or the alternatives) in the presence of
> threads.

> > To this date, I have not "needed" full-blown garbage
> > collection.

> You never "need" it. (For that matter, you never need C++, or
> even C. You can do everything in assembler.)

Agreed.


> > I mainly make use of distributed reference counting and proxy
> > collectors to manage the memory in non-blocking algorithms.
> > This is not reinventing a garbage collector at all. BTW,
> > garbage collectors have their share of problems; here are some
> > of them:

> > http://groups.google.com/group/comp.programming.threads/browse_frm/th...

> Nothing is perfect, and there are contexts where garbage
> collection may not be the answer. Or garbage collection mixed
> with something else. It's a nice option to have, and makes
> coding a few specific things easier, but it's not a silver
> bullet.

Agreed.

Chris Thomasson

Mar 29, 2008, 8:52:13 PM
"Jon Harrop" <use...@jdh30.plus.com> wrote in message
news:13usjnh...@corp.supernews.com...

> Chris Thomasson wrote:
>> On Sat, 29 Mar 2008 01:39:41 +0000, Jon Harrop wrote:
>>> Unless your allocation lifetimes happen to be statically well defined,
>>> you'll essentially end up reinventing a garbage collector. That is hard
>>> enough in single threaded applications but it is much harder in the
>>> presence of multithreading because it is so error prone.
>>
>> That does not make sense. I choose the right lifetime management scheme
>> for the algorithms I need to implement. To this date, I have not
>> "needed" full-blown garbage collection. I mainly make use of distributed
>> reference counting and proxy collectors to manage the memory in
>> non-blocking algorithms. This is not reinventing a garbage collector at
>> all.
>
> Reference counting was one of the earliest forms of garbage collection and
> it is riddled with many very serious and well known problems. You are
> literally reinventing the GC and you are now several decades out of date.
[...]

How can you say that when you have absolutely no idea what reference
counting algorithms I use, or how I use them? Anyway, the only type of GC
that I have found to be really useful is Proxy GC.

Jon Harrop

Mar 29, 2008, 8:52:16 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13usjnh...@corp.supernews.com...
>> Reference counting was one of the earliest forms of garbage collection
>> and it is riddled with many very serious and well known problems. You are
>> literally reinventing the GC and you are now several decades out of date.
> [...]
>
> How can you say that when you have absolutely no idea what reference
> counting algorithms I use, or how I use them?

Because the problems are a consequence of counting references. The precise
details don't matter beyond the fact that you're still counting references.

This has been well documented over the past 48 years.

Chris Thomasson

Mar 29, 2008, 9:12:48 PM

"Jon Harrop" <use...@jdh30.plus.com> wrote in message
news:13utpke...@corp.supernews.com...

> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13usjnh...@corp.supernews.com...
>>> Reference counting was one of the earliest forms of garbage collection
>>> and it is riddled with many very serious and well known problems. You
>>> are
>>> literally reinventing the GC and you are now several decades out of
>>> date.
>> [...]
>>
>> How can you say that when you have absolutely no idea what reference
>> counting algorithms I use, or how I use them?
>
> Because the problems are a consequence of counting references. The precise
> details don't matter beyond the fact that you're still counting
> references.

The precise details don't matter? REALLY? Before I go on, please explain
yourself. You are starting to sound like you don't know any of the new
reference counting algorithms that are out there. Before I waste my time
trying to explain them to you, I need to hear your explanation of how the
details don't matter.

:^/


>
> This has been well documented over the past 48 years.

For what algorithm? You are talking nonsense because some of the new
counting tricks have only recently been invented.

Jon Harrop

Mar 29, 2008, 9:17:28 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13utpke...@corp.supernews.com...

>> Because the problems are a consequence of counting references. The
>> precise details don't matter beyond the fact that you're still counting
>> references.
>
> The precise details don't matter? REALLY? Before I go on, please explain
> yourself. You are starting to sound like you don't know any of the new
> reference counting algorithms that are out there. Before I waste my time
> trying to explain them to you, I need to hear your explanation of how the
> details don't matter.

I only just explained that and gave you a reference. Just read the
reference.

> You are talking nonsense because some of the new counting tricks have
> only recently been invented.

You make it sound as if reference counting is making a come-back. I'll give
you the benefit of the doubt though: can you cite any references indicating
that reference counting can be even vaguely competitive compared to a real
GC?

Jon Harrop

Mar 29, 2008, 9:28:34 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13usjnh...@corp.supernews.com...
>> Reference counting was one of the earliest forms of garbage collection
>> and it is riddled with many very serious and well known problems.
>
> Cycles aside, can you list several other very serious problems?

Sure:

. Fragility.
. Memory overhead.
. Performance degradation.
. Fragmentation.
. Not incremental, i.e. awful worst case performance.
...

> BTW, IMHO, cycles are a red herring and can usually be designed around.

Of course: by reinventing the GC.

>> You are literally reinventing the GC and you are now several decades out
>> of date.
>
> How are virtually zero-overhead counting algorithms out-of-date? Please
> explain...

Are you referring to your own patented algorithm for which there are no
verifiable results?

>>> BTW, garbage collectors have their share of problems; here are some
>>> of them:
>>>
>>
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d
>>
>> The thread you cite is a discussion about two different articles, neither
>> of
>> which substantiate your claim:
>>
>> The first is an article describing problems with a specific Java data
>> structure (WeakHashMap):
> [...]
>
> How do you effectively cache objects in a "collect-the-world"
> environment? GC and caching don't get along _sometimes_... From the
> application's point of view, an object in the cache is in a quiescent
> state. However, from the GC's perspective, a node in the cache is a
> pointer to a "live object".

You are simply restating that author's misconceptions. Read the
documentation.

Chris Thomasson

Mar 29, 2008, 9:47:55 PM
"Jon Harrop" <use...@jdh30.plus.com> wrote in message
news:13utr3l...@corp.supernews.com...

Okay, I will give just a couple of examples... You seem to think that one
needs to reference count every single object; you do not. A simple example
of this is a Proxy GC implemented with reference counting. Here are some
more details on the concept of proxy GC in general:

http://groups.google.com/group/comp.programming.threads/msg/41f29efe33e7f124

http://groups.google.com/group/comp.programming.threads/search?hl=en&group=comp.programming.threads&q=proxy+gc

http://groups.google.com/group/comp.lang.c++/msg/e24a9459777ec430

You can download Joe Seigh's atomic-ptr-plus package from here:

http://atomic-ptr-plus.sourceforge.net

http://sourceforge.net/project/showfiles.php?group_id=127837

Dmitriy V'jukov has created a nice one here:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/42b035cd0280bb31


Here is my work in the area:

http://groups.google.com/group/comp.programming.threads/search?hl=en&group=comp.programming.threads&q=vzoom&qt_g=Search+this+group

http://appft1.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PG01&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.html&r=1&f=G&l=50&s1=%2220070067770%22.PGNR.&OS=DN/20070067770&RS=DN/20070067770


The vZOOM algorithm uses per-thread reference counts which don't need any
memory barriers or atomic operations in order to update them.

Here is a proxy collector of mine that I released into the public domain:

http://home.comcast.net/~vzoom/demos/pc_sample.c

http://appcore.home.comcast.net/misc/pc_sample_h_v1.html


Here is a reader/writer usage pattern:

http://groups.google.com/group/comp.programming.threads/msg/325a046ac6ed2800


Dmitriy V'jukov has also created this low-overhead reference counting
algorithm:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/dab22a17c32c6b13


Before I go on and on:

This is a lot of information to absorb and you probably won't be able to
learn all of it any time soon. So, before you label the work cited above as
nonsense, please try to understand some of it. The folks over on
'comp.programming.threads' have been working with these types of algorithms
for almost a decade and we can help you out.

:^)

Chris Thomasson

Mar 29, 2008, 10:00:36 PM
[comp.programming.threads added]


"Jon Harrop" <use...@jdh30.plus.com> wrote in message

news:13utrof...@corp.supernews.com...


> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13usjnh...@corp.supernews.com...
>>> Reference counting was one of the earliest forms of garbage collection
>>> and it is riddled with many very serious and well known problems.
>>
>> Cycles aside, can you list several other very serious problems?
>
> Sure:
>
> . Fragility.
> . Memory overhead.
> . Performance degradation.
> . Fragmentation.
> . Not incremental, i.e. awful worst case performance.

Reference counting is fragile? I don't think so; it's only as fragile as
the programmer using it. There are no silver bullets.


Memory overhead? No, a reference count releases the object when the count
drops to zero, which is more accurate than a traditional GC could ever be.
Are you talking about the extra word of space per object? Did you know
that many garbage collectors use per-object meta-data as well?


Performance degradation? Again, I ask you what specific algorithms are you
talking about? It does matter.


Not incremental? Drill down on that one some more please.


> ...
>
>> BTW, IMHO, cycles are a red herring and can usually be designed around.
>
> Of course: by reinventing the GC.
>
>>> You are literally reinventing the GC and you are now several decades out
>>> of date.
>>
>> How are virtually zero-overhead counting algorithms out-of-date? Please
>> explain...
>
> Are you referring to your own patented algorithm for which there are no
> verifiable results?

That's not the only one out there. Anyway, my counting algorithm uses
per-thread counters that do not need atomics or membars to be mutated. BTW,
this algorithm won me a T2000 from Sun:

https://coolthreads.dev.java.net
(the vzoom project)


It has made me some $$$ because I managed to license it to a few companies
who are very pleased with the results. One of them was involved with
creating embedded devices (QUADROS on an ARM9) and wanted to decrease the
overhead of their reference counting; vZOOM helped them out. Also, the
low-overhead memory allocator allowed them to create a heap out of QUADROS
tasks; they were VERY pleased with that because they were getting ready to
alter the OS source code:

http://groups.google.com/group/comp.programming.threads/msg/f6d16f5323311361
(last paragraph...)


>>>> BTW, garbage collectors have their share of problems; here are some
>>>> of them:
>>>>
>>>
> http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d
>>>
>>> The thread you cite is a discussion about two different articles,
>>> neither
>>> of
>>> which substantiate your claim:
>>>
>>> The first is an article describing problems with a specific Java data
>>> structure (WeakHashMap):
>> [...]
>>
>> How do you effectively cache objects in a "collect-the-world"
>> environment? GC and caching don't get along _sometimes_... From the
>> application's point of view, an object in the cache is in a quiescent
>> state.
>> However, from the GC's perspective, a node in the cache is a pointer to
>> a "live object".
>
> You are simply restating that author's misconceptions. Read the
> documentation.

I am asking you how to do it. Explain how to create effective object
caches using "traditional" collect-the-world GC...

Jon Harrop

Mar 29, 2008, 10:56:03 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13utrof...@corp.supernews.com...

>> . Fragility.
>> . Memory overhead.
>> . Performance degradation.
>> . Fragmentation.
>> . Not incremental, i.e. awful worst case performance.
>
> Reference counting is fragile? I don't think so; it's only as fragile as
> the programmer using it. There are no silver bullets.

You cannot have your cake and eat it: you said that the programmer must be
aware of the low-level details of their data structures in order to work
around the pathological performance problems inherent with reference
counting. That is a form of fragility. Fragmentation is one specific
example.

With a real GC you just fire and forget. Exceptional circumstances are
extremely rare. Improper use of weak hash tables is not an example of this.

> Memory overhead? No, a reference count releases the object when the
> reference drops to zero which is more accurate than a traditional GC could
> ever be.

That is commonly claimed but actually wrong. Scope can keep reference counts
above zero and values alive unnecessarily when a real GC can collect them
because they are unreferenced even though they are still in scope.

> Are you talking about the extra word of space per-object?

Yes.

> Did you know that many garbage collectors use per-object meta-data as
> well?

None that we use. Which GCs are you referring to?

> Performance degradation? Again, I ask you what specific algorithms are you
> talking about? It does matter.

I benchmarked all the mainstream reference counters for C++ many years ago.
If you think the situation has improved, perhaps we could benchmark C++
against some GC'd languages for collection-intensive tasks now?

> Not incremental? Drill down on that one some more please.

You actually just described the problem: reference counting releases values
when their reference count drops to zero. At the end of a scope, many
reference counts can be zeroed at the same time and the ensuing collections
can stall the program for an unacceptable amount of time. This can also be
extremely difficult to work around without simply resorting to a real GC.

We had this exact problem on a 250kLOC C++ product line and eventually fixed
it by rewriting everything from scratch in a more modern language. The
worst case performance is now 5x faster. We wasted a couple of months
trying to fix the problem in C++ before having the revelation that we were
just reinventing a garbage collector and doing a poor job of it at that.

>> You are simply restating that author's misconceptions. Read the
>> documentation.
>
> I am asking you how to do it. Explain how to create effectively object
> caches using "traditional" collect-the-world GC...

You use a weak hash table to cache data without keeping it alive.

You do not use a weak hash table to cache data that has no other references
to keep it alive on the assumption that the GC will be inefficient enough
for your cache to work.

Chris Thomasson

Mar 30, 2008, 12:22:54 AM
"Jon Harrop" <use...@jdh30.plus.com> wrote in message
news:13uu0sg...@corp.supernews.com...

> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13utrof...@corp.supernews.com...
>>> . Fragility.
>>> . Memory overhead.
>>> . Performance degradation.
>>> . Fragmentation.
>>> . Not incremental, i.e. awful worst case performance.
>>
>> Reference counting is fragile? I don't think so; it's only as fragile as
>> the programmer using it. There are no silver bullets.
>
> You cannot have your cake and eat it: you said that the programmer must be
> aware of the low-level details of their data structures in order to work
> around the pathological performance problems inherent with reference
> counting. That is a form of fragility. Fragmentation is one specific
> example.

Wrong. You completely misunderstood me. I was talking about the implementers
of refcounting. The implementation of reference counting should try to
avoid atomic instructions and memory barriers. This is transparent to the
user.


> With a real GC you just fire and forget. Exceptional circumstances are
> extremely rare. Improper use of weak hash tables is not an example of
> this.

Fire and forget? lol. You can get into big trouble if you fire and forget;
that's a simple example of VERY poor programming practice. Improper use of
weak references is a direct example of the consequences that arise when a
naive programmer fires and forgets. There are no silver bullets. GC is a
tool, and it has weaknesses.


>> Memory overhead? No, a reference count releases the object when the
>> reference drops to zero which is more accurate than a traditional GC
>> could
>> ever be.
>
> That is commonly claimed but actually wrong.

Please show me a 100% accurate GC; good luck finding one. Well, let me help
you... A proxy GC can be accurate indeed. But they are hardly traditional.
In fact, they were invented on comp.programming.threads by Joe Seigh.


> Scope can keep reference counts
> above zero and values alive unnecessarily when a real GC can collect them
> because they are unreferenced even though they are still in scope.

It will only collect anything when the GC decides to run!! It does not run
all the time. If it did, the performance would be so crappy that nobody
would ever even think about using it. If you think this is comparable to a
proxy GC then you're simply not getting it. Anyway, traditional GC is NOT
about performance; it's about helping programmers who do not know how, or
do not want, to manage their memory manually.


>> Are you talking about the extra word of space per-object?
>
> Yes.

How is that different from GC meta-data? BTW, there are refcounting
algorithms that do not need any per-object meta-data. For instance, vZOOM
keeps this information on a per-thread basis. There is GC meta-data that
keeps track of objects. There has to be. Think about it.


>> Did you know that many garbage collectors use per-object meta-data as
>> well?
>
> None that we use. Which GCs are you referring to?

Many GC'd languages use object headers (Java, for instance), or keep their
object meta-data in another place. It does not matter, because it all takes
up memory.


>> Performance degradation? Again, I ask you what specific algorithms are
>> you
>> talking about? It does matter.
>
> I benchmarked all the mainstream reference counters for C++ many years
> ago.
> If you think the situation has improved, perhaps we could benchmark C++
> against some GC'd languages for collection-intensive tasks now?

C++ has no GC. Anyway, are you talking about lock-free reader patterns?
What collection-intensive tasks? I know I can compete with, and most likely
beat, a traditional GC with the handcrafted implementations that C++ gives
me the freedom to work with. C++ gives me the flexibility to use custom
allocators, and I can use architecture-specific optimizations. BTW, how
would you construct a proper benchmark? What patterns? IMO, I think that a
lock-free reader pattern would be okay.


>> Not incremental? Drill down on that one some more please.
>
> You actually just described the problem: reference counting releases
> values
> when their reference drops to zero.

Yeah. This is NOT true with a traditional GC.


> At the end of a scope, many reference
> counts can be zeroed at the same time and the ensuing collections can
> stall
> the program for an unacceptable amount of time. This can also be extremely
> difficult to workaround without simply resorting to a real GC.

Stall? Are you sure you know how Proxy GC works? There is no stall. There
is no contrived introspection of thread stacks. There is no
collect-the-world. There is no mark-and-sweep. There is no thread
suspension. Etc., etc... No stop-the-world... None of that nonsense. Stall!
Get real.


> We had this exact problem on a 250kLOC C++ product line and eventually
> fixed
> it by rewriting everything from scratch in a more modern language. The
> worst case performance is now 5x faster. We wasted a couple of months
> trying to fix the problem in C++ before having the revelation that we were
> just reinventing a garbage collector and doing a poor job of it at that.

Well, given your false comments on reference counting, I would expect you
not to be able to beat a traditional garbage collector in any way, shape,
or form.


>>> You are simply restating that author's misconceptions. Read the
>>> documentation.
>>
>> I am asking you how to do it. Explain how to create effectively object
>> caches using "traditional" collect-the-world GC...
>
> You use a weak hash table to cache data without keeping it alive.

> You do not use a weak hash table to cache data that has no other
> references
> to keep it alive on the assumption that the GC will be inefficient enough
> for your cache to work.

Wrong; let me inform you that a cache keeps QUIESCENT objects around for
further optimized allocations. Are you sure you know what a cache is? A
cache keeps objects that have no references around in order to optimize
further allocations. Do you know about the Solaris slab allocator?

Chris Thomasson
Mar 30, 2008, 1:37:36 AM

"Chris Thomasson" <cri...@comcast.net> wrote in message
news:m4mdnbbimOjpiHLa...@comcast.com...

> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13uu0sg...@corp.supernews.com...
>> Chris Thomasson wrote:
>>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>>> news:13utrof...@corp.supernews.com...
>>>> . Fragility.
>>>> . Memory overhead.
>>>> . Performance degradation.
>>>> . Fragmentation.
>>>> . Not incremental, i.e. awful worst case performance.
>>>
>>> Reference counting is fragile? I don't think so; its only as fragile as
>>> the programmer using them. There are no silver bullets.
>>
>> You cannot have your cake and eat it: you said that the programmer must
>> be aware of the low-level details of their data structures in order to
>> work around the pathological performance problems inherent with reference
>> counting. That is a form of fragility. Fragmentation is one specific
>> example.
>
> Wrong. You completely misunderstood me. I was talking about the
> implementers of refcounting. The implementation of reference counting
> should try and avoid making calls into atomic instructions and memory
> barriers. This is transparent to the user.
[...]

Perhaps I misunderstood you... I did state that ref-counting is only as
fragile as the programmer using it. Well, guess what, that applies to GC as
well:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d

Ref-counting and GC have their problems. Are you trying to convince me that
GC is just there and you can fire objects at it and forget? Oh, wait... You
did say exactly that. Remember?

Jerry Coffin
Mar 30, 2008, 3:19:57 PM
In article <13utr3l...@corp.supernews.com>, use...@jdh30.plus.com
says...

[ ... ]

> You make it sound as if reference counting is making a come-back. I'll give
> you the benefit of the doubt though: can you cite any references indicating
> that reference counting can be even vaguely competitive compared to a real
> GC?

Paul Wilson's garbage collector survey, section 2.1.

http://www.cs.utexas.edu/ftp/pub/garbage/gcsurvey.ps

Most of what you've said about reference counting is complete nonsense.
For example, you claim that one of the problems with reference counting
is:

> . Not incremental, i.e. awful worst case performance.

As the reference above makes clear to anybody capable of reading at all:

One advantage of reference counting is the _incremental_
nature of most of its operation--garbage collection work
(updating reference counts) is interleaved closely with the
running program's own execution. It can easily be made
completely incremental and _real time_; that is,
performing at most a small and bounded amount of work per
unit of program execution.

he goes on to say:

This is not to say that reference counting is a dead
technique. It still has advantages in terms of the
immediacy with which it reclaims most garbage, and
corresponding beneficial effects on locality of reference;
a reference counting system may perform with little
degradation when almost all of the heap space is occupied
by live objects, while other collectors rely on trading
more space for higher efficiency. Reference counts
themselves may be valuable in some systems. For example,
they may support optimizations in functional garbage
collection implementations by allowing destructive modification
to uniquely-referenced objects. Distributed garbage collection
is often done with reference-counting between nodes of a
distributed system, combined with mark-sweep or copying
collection within a node. Future systems may find other uses
for reference counting, perhaps in hybrid collectors also
involving other techniques, or when augmented by specialized
hardware.

He mentions two points of particular interest with respect to reference
counting making a "comeback". The difference in speed between CPUs and
main memory seems to be growing, forcing ever-greater dependence on
caching -- which means that the improved locality of reference with
reference counting means more all the time. This also means that the
primary performance cost of reference counting (having to update counts)
now means relatively little as a rule -- in most cases, a few extra
operations inside the CPU mean nearly nothing, while extra references to
main memory mean a great deal.

Likewise, individual computers now correspond much more closely to the
distributed systems at the time of the survey. In particular, it's now
quite common to see a number of separate OS instances running in virtual
machines in a single box.

Your opinions on garbage collection reflect a tremendous enthusiasm, but
equally tremendous ignorance of the subject matter.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Jon Harrop
Mar 30, 2008, 4:35:34 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13uu0sg...@corp.supernews.com...

>> You cannot have your cake and eat it: you said that the programmer must
>> be aware of the low-level details of their data structures in order to
>> work around the pathological performance problems inherent with reference
>> counting. That is a form of fragility. Fragmentation is one specific
>> example.
>
> Wrong. You completely misunderstood me. I was talking about the
> implementers of refcounting. The implementation of reference counting
> should try and avoid making calls into atomic instructions and memory
> barriers. This is transparent to the user.

A data structure implementor using reference counting must be aware of the
overhead of reference counting and work around it by amortizing reference
counts whenever possible. You yourself only just described this in the
context of performance.

>>> Memory overhead? No, a reference count releases the object when the
>>> reference drops to zero which is more accurate than a traditional GC
>>> could ever be.
>>
>> That is commonly claimed but actually wrong.
>

> Please show me a 100% accurate GC; good luck finding one...

That is irrelevant. Your argument in favor of reference counting was
completely fallacious.

You claimed that reference counting is "more accurate than a traditional
GC could ever be". Consider:

{
    Bar bar;
    f(bar);
    g();
}

Reference counting will keep "bar" alive until its reference count happens
to be zeroed when it drops out of scope even though it is not reachable
during the call to "g()". Real garbage collectors can collect "bar" during
the call to "g" because it is no longer reachable.

So GCs can clearly collect sooner than reference counters.

>> Scope can keep reference counts above zero and values alive unnecessarily
>> when a real GC can collect them because they are unreferenced even though
>> they are still in scope.
>
> It will only collect anything when the GC decides to run!

Which may well be before the scope happens to end.

> Anyway, traditional GC is NOT about performance, its about helping some
> programmers that do not know how, or want, to manage their memory
> manually.

By the same logic: C++ is for programmers who don't know, or want, to write
assembler manually.

>>> Are you talking about the extra word of space per-object?
>>
>> Yes.
>
> How is that different than GC meta-data?

Reference counts consume a lot more space.

> BTW, there are refcounting
> algorithms that do not need any per-object meta-data. For instance, vZOOM
> keeps this information on a per-thread basis. There is GC meta-data that
> keeps track of objects. There has to be. Think about it.

Yes. For example, OCaml hides two bits inside each pointer totalling zero
overhead. In contrast, a reference counting system is likely to add a
machine word, bloating each and every value by 8 bytes unnecessarily on
modern hardware.

>>> Did you know that many garbage collectors use per-object meta-data as
>>> well?
>>
>> None that we use. Which GCs are you referring to?
>
> Many GC'd languages use object headers; Java, for example.

Just Java?

>> I benchmarked all the mainstream reference counters for C++ many years
>> ago. If you think the situation has improved, perhaps we could benchmark
>> C++ against some GC'd languages for collection-intensive tasks now?
>
> C++ has no GC. Anyway, are you talking about a lock-free reader patterns?

Let's do a single-threaded benchmark first.

> What collection-intensive tasks?

Symbolic rewriting would make a good benchmark. Here is one:

http://www.lambdassociates.org/studies/study10.htm

Try implementing this using reference counting and we can see how it
compares (probably on a trickier rewrite). Replace the arbitrary-precision
ints with doubles.

> I know I can compete, and most likely beat, a traditional GC with
> handcrafted implementations that C++ give me the freedom to work with.

I admire your faith but I would like to see some evidence to back up such
claims because they run contrary to common wisdom accumulated over the past
few decades.

>>> Not incremental? Drill down on that one some more please.
>>
>> You actually just described the problem: reference counting releases
>> values when their reference drops to zero.
>
> Yeah. This is NOT true with a traditional GC.

Absolutely. GCs do not have to wait that long: they can collect sooner.

>> At the end of a scope, many reference
>> counts can be zeroed at the same time and the ensuing collections can
>> stall the program for an unacceptable amount of time. This can also be
>> extremely difficult to workaround without simply resorting to a real GC.
>
> Stall? Are you sure you know how Proxy GC works? There is no stall. There
> is no contrived introspection of thread stacks. There is no
> collect-the-world. There is no mark-and-sweep. There is no thread
> suspension. Etc., etc...

You've just made a series of irrelevant references here.

>> We had this exact problem on a 250kLOC C++ product line and eventually
>> fixed it by rewriting everything from scratch in a more modern language.
>> The worst case performance is now 5x faster. We wasted a couple of months
>> trying to fix the problem in C++ before having the revelation that we
>> were just reinventing a garbage collector and doing a poor job of it at
>> that.
>
> Well, from your false comments on reference counting, I would expect you
> not to be able to beat a traditional garbage collector in any way, shape
> or form.

Of course. I fully expect you to fail on the above benchmark but am willing
to give you the benefit of the doubt.

>>>> You are simply restating that author's misconceptions. Read the
>>>> documentation.
>>>
>>> I am asking you how to do it. Explain how to effectively create object
>>> caches using "traditional" collect-the-world GC...
>>
>> You use a weak hash table to cache data without keeping it alive.
>
>> You do not use a weak hash table to cache data that has no other
>> references to keep it alive on the assumption that the GC will be
>> inefficient enough for your cache to work.
>
> Wrong, let me inform you that a cache keeps QUIESCENT objects for further
> optimized allocations.

That would be a specific kind of cache. Caches do not inherently have
anything to do with allocation.

> Are you sure you know what a cache is? A cache keeps objects that have no
> references around to optimize further allocations.

Apparently what you wanted to ask me was how I would implement an allocation
cache in a garbage collected language.

The answer depends upon the language. If you look at OCaml, for example,
there is no need to because the run-time has already automated this.

> Do you know about the Solaris slab allocator?

No.

Jon Harrop
Mar 30, 2008, 4:50:57 PM
Chris Thomasson wrote:
> Perhaps I misunderstood you... I did state that ref-counting is only as
> fragile as the programmer using it. Well, guess what, that applies to GC
> as well:
>
>
http://groups.google.com/group/comp.programming.threads/browse_frm/thread/5e9357a6fb746e5d
>
> Ref-counting and GC have their problems. Are you trying to convince me
> that GC is just there and you can fire objects at it and forget? Oh,
> wait... You did say exactly that. Remember?

We already discussed that exact reference and it has nothing to do with
garbage collection. Read the comments left on his blog post.

Jon Harrop
Mar 30, 2008, 5:01:57 PM
Jerry Coffin wrote:
> In article <13utr3l...@corp.supernews.com>, use...@jdh30.plus.com
> says...
>> You make it sound as if reference counting is making a come-back. I'll
>> give you the benefit of the doubt though: can you cite any references
>> indicating that reference counting can be even vaguely competitive
>> compared to a real GC?
>
> Paul Wilson's garbage collector survey, section 2.1.
>
> http://www.cs.utexas.edu/ftp/pub/garbage/gcsurvey.ps

That is 16 years out of date.

> Most of what you've said about reference counting is complete nonsense.

Then why is your best counterexample ancient history?

And sixteen years on his prophecy has not come true: we do not have
dedicated hardware for reference counting and none of the major GCs are
based upon reference counting.

Python and Mathematica are reference counted but both are slow and suffer
from stalls due to lack of incremental collection.

> He mentions two points of particular interest with respect to reference
> counting making a "comeback". The difference in speed between CPUs and
> main memory seems to be growing, forcing ever-greater dependence on
> caching --

Yes.

> which means that the improved locality of reference with
> reference counting means more all the time. This also means that the
> primary performance cost of reference counting (having to update counts)
> now means relatively little as a rule -- in most cases, a few extra
> operations inside the CPU mean nearly nothing, while extra references to
> main memory mean a great deal.

No. Locality is worse with reference counting which is why it has gotten
relatively slower since that ancient survey. You can easily test this
yourself.

> Likewise, individual computers now correspond much more closely to the
> distributed systems at the time of the survey. In particlar, it's now
> quite common to see a number of separate OS instances running in virtual
> machines in a single box.
>
> You opinions on garbage collection reflect a tremendous enthusiasm, but
> equally tremendous ignorance of the subject matter.

Yet you cannot cite a single relevant reference or piece of objective
evidence to support an argument that flies in the face of almost all modern
software development (outside restricted memory environments).

George Peter Staplin
Apr 5, 2008, 12:58:04 PM
Chris Thomasson wrote:
> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
> news:13uu0sg...@corp.supernews.com...

>> Chris Thomasson wrote:
>>> Memory overhead? No, a reference count releases the object when the
>>> reference drops to zero which is more accurate than a traditional GC
>>> could
>>> ever be.
>>
>> That is commonly claimed but actually wrong.
>
> Please show me a 100% accurate GC; good luck finding one. Well, let me help
> you... A proxy GC can be accurate indeed. But, they are hardly traditional.
> In fact they were invented on comp.programming.threads by Joe Seigh.

There are accurate GCs. However, there are not many for C or C++. You
really need compiler support in C for a proper non-conservative GC.


>> Scope can keep reference counts
>> above zero and values alive unnecessarily when a real GC can collect them
>> because they are unreferenced even though they are still in scope.
>
> It will only collect anything when the GC decides to run!! It does not run
> all the time. If it did the performance would be so crappy that nobody would
> ever even think about using it. If you think this is comparable to a proxy
> GC then you're simply not getting it. Anyway, traditional GC is NOT about
> performance, it's about helping some programmers that do not know how, or
> want, to manage their memory manually.

That is not true. There are GCs that run all of the time. Dijkstra
wrote a concurrent collector many years ago, with a very clever coloring
algorithm designed to make objects move in stages, while preventing the
reuse of an object that is still in use.

I recommend that you read the book _Garbage Collection: Algorithms for
Automatic Dynamic Memory_. That book covers Dijkstra's work, and the
newer concurrent generational collectors used in languages like Java.

It's very difficult to make a GC work well in all situations and usage
patterns, though in practice that isn't usually a problem. The primary
language I work with uses reference-counted objects.

Reference counting introduces problems with circular references, and
thereby requires copying more data in some circumstances. On the other
hand reference counting tends to have more consistent runtime costs, so
it's sometimes preferable in realtime cases.

Knuth's TAOCP refers to a paper about a method for circular lists that
use reference counting, but I haven't thus far been able to find a copy
freely available online. I think it was used in a LISP or Scheme
implementation.

Thankfully, mark-and-sweep (from the 1950's IIRC) is not the end of the
research on GC.


George

Chris Thomasson
Apr 5, 2008, 6:53:50 PM
"George Peter Staplin" <georgeps...@xmission.com> wrote in message
news:ft8b2s$lsj$1...@news.xmission.com...

> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13uu0sg...@corp.supernews.com...
>>> Chris Thomasson wrote:
>>>> Memory overhead? No, a reference count releases the object when the
>>>> reference drops to zero which is more accurate than a traditional GC
>>>> could ever be.
>>>
>>> That is commonly claimed but actually wrong.
>>
>> Please show me a 100% accurate GC; good luck finding one. Well, let me
>> help
>> you... A proxy GC can be accurate indeed. But, they are hardly
>> traditional.
>> In fact they were invented on comp.programming.threads by Joe Seigh.
>
> There are accurate GC. However, there are not many for C or C++. You
> really need compiler support in C for a proper non-conservative GC.

Indeed.


>>> Scope can keep reference counts above zero and values alive
>>> unnecessarily when a real GC can collect them because they are
>>> unreferenced even though they are still in scope.
>>
>> It will only collect anything when the GC decides to run!! It does not
>> run all the time. If it did the performance would be so crappy that
>> nobody would ever even think about using it. If you think this is
>> comparable to a proxy GC then you're simply not getting it. Anyway,
>> traditional GC is NOT about performance, it's about helping some
>> programmers that do not know how, or want, to manage their memory
>> manually.
>
> That is not true. There are GC that run all of the time. Dijkstra
> wrote a concurrent collector many years ago, with a very clever coloring
> algorithm designed to make objects move in stages, while preventing the
> reuse of an object that is still in use.

How do the mutators interact with the GC which "runs all the time"? Are you
talking about threads from a per-application GC thread pool running non-stop?
How much GC-to-thread introspection is going on? Any thread synchronization
between GC thread(s) and mutator threads?


> I recommend that you read the book _Garbage Collection: Algorithms for
> Automatic Dynamic Memory_. That book covers Dijkstra's work, and the
> newer concurrent generational collectors used in languages like Java.
>
> It's very difficult to make a GC work well in all situations and usage
> patterns, though in practice that isn't usually a problem.
> The primary language I work with uses reference-counted objects.

Just pick the right tools for the job at hand.


> Reference counting introduces problems with circular references, and
> thereby requires copying more data in some circumstances. On the other
> hand reference counting tends to have more consistent runtime costs, so
> it's sometimes preferable in realtime cases.

The choice can be very simple for some types of reference counting: don't
use them if you simply cannot remove cycles in a given data structure. If
your application requires deterministic, prompt destruction of objects, then
some types of ref-counting can help.


> Knuth's TAOCP refers to a paper about a method for circular lists that
> use reference counting, but I haven't thus far been able to find a copy
> freely available online. I think it was used in a LISP or Scheme
> implementation.
>
> Thankfully, mark-and-sweep (from the 1950's IIRC) is not the end of the
> research on GC.

There are proxy garbage collectors which can work out very well because they
can be made deterministic. Here is some more information on proxy GC:

http://groups.google.com/group/comp.programming.threads/msg/41f29efe33e7f124

http://groups.google.com/group/comp.programming.threads/search?hl=en&group=comp.programming.threads&q=proxy+gc

http://groups.google.com/group/comp.lang.c++/msg/e24a9459777ec430


RCU is also a proxy GC. Its quiescent states are analogous to an application
thread releasing a reference to a proxy collector.

Chris Thomasson
Apr 5, 2008, 6:59:44 PM
"George Peter Staplin" <georgeps...@xmission.com> wrote in message
news:ft8b2s$lsj$1...@news.xmission.com...
> Chris Thomasson wrote:
>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>> news:13uu0sg...@corp.supernews.com...
>>> Chris Thomasson wrote:
>>>> Memory overhead? No, a reference count releases the object when the
>>>> reference drops to zero which is more accurate than a traditional GC
>>>> could ever be.
>>>
>>> That is commonly claimed but actually wrong.
>>
>> Please show me a 100% accurate GC; good luck finding one. Well, let me
>> help you... A proxy GC can be accurate indeed. But, they are hardly
>> traditional. In fact they were invented on comp.programming.threads by
>> Joe Seigh.
>
> There are accurate GC. However, there are not many for C or C++.
[...]

My favorite algorithms within that class would have to be proxy GC. They can
be implemented with virtually zero overhead. They can work very well within
C++. Here is some work that you can check out:

http://atomic-ptr-plus.sourceforge.net

George Peter Staplin
Apr 6, 2008, 8:59:25 PM
Chris Thomasson wrote:
> "George Peter Staplin" <georgeps...@xmission.com> wrote in message
> news:ft8b2s$lsj$1...@news.xmission.com...
>> Chris Thomasson wrote:
>>> "Jon Harrop" <use...@jdh30.plus.com> wrote in message
>>> news:13uu0sg...@corp.supernews.com...
>>>> Scope can keep reference counts above zero and values alive
>>>> unnecessarily when a real GC can collect them because they are
>>>> unreferenced even though they are still in scope.
>>>
>>> It will only collect anything when the GC decides to run!! It does not
>>> run all the time. If it did the performance would be so crappy that
>>> nobody would ever even think about using it. If you think this is
>>> comparable to a proxy GC then you're simply not getting it. Anyway,
>>> traditional GC is NOT about performance, it's about helping some
>>> programmers that do not know how, or want, to manage their memory
>>> manually.
>>
>> That is not true. There are GC that run all of the time. Dijkstra
>> wrote a concurrent collector many years ago, with a very clever coloring
>> algorithm designed to make objects move in stages, while preventing the
>> reuse of an object that is still in use.
>
> How do the mutators interact with the GC which "runs all the time"? Are you
> talking about threads from a per-application GC thread pool running non-stop?
> How much GC-to-thread introspection is going on? Any thread synchronization
> between GC thread(s) and mutator threads?


This explains how Dijkstra's GC algorithm works (with a dedicated
"processor"):
http://www.cs.utexas.edu/~EWD/transcriptions/EWD04xx/EWD492.html

There might also be a later writing about his work on that.

I think that Steele also wrote a variation on Dijkstra's algorithm. If
you have an ACM account you can probably access it.

Guy L. Steele, Jr., Multiprocessing Compactifying Garbage Collection,
Comm. ACM, Sep. 1975, vol. 18, No. 9, pp. 495-508.


>> Knuth's TAOCP refers to a paper about a method for circular lists that
>> use reference counting, but I haven't thus far been able to find a copy
>> freely available online. I think it was used in a LISP or Scheme
>> implementation.
>>
>> Thankfully, mark-and-sweep (from the 1950's IIRC) is not the end of the
>> research on GC.
>
> There are proxy garbage collectors which can work out very well because they
> can be made deterministic. Here is some more information on proxy GC:
>
> http://groups.google.com/group/comp.programming.threads/msg/41f29efe33e7f124
>
> http://groups.google.com/group/comp.programming.threads/search?hl=en&group=comp.programming.threads&q=proxy+gc
>
> http://groups.google.com/group/comp.lang.c++/msg/e24a9459777ec430
>
>
> RCU is also a proxy GC. Its quiescent states are analogous to an application
> thread releasing a reference to a proxy collector.

I'll read those. Thank you.


George

Jerry Coffin
Apr 7, 2008, 1:02:19 AM
In article <ft8b2s$lsj$1...@news.xmission.com>,
georgeps...@xmission.com says...

[ ... ]

> There are accurate GC. However, there are not many for C or C++. You
> really need compiler support in C for a proper non-conservative GC.

Not disagreeing at all, but just to clarify: you really need compiler
support to have non-conservative GC in almost any language.

[ ... ]



> > It will only collect anything when the GC decides to run!! It does not run
> > all the time. If it did the performance would be so crappy that nobody would
> > ever even think about using it. If you think this is comparable to a proxy
> > GC then you're simply not getting it. Anyway, traditional GC is NOT about
> > performance, it's about helping some programmers that do not know how, or
> > want, to manage their memory manually.
>
> That is not true. There are GC that run all of the time. Dijkstra
> wrote a concurrent collector many years ago, with a very clever coloring
> algorithm designed to make objects move in stages, while preventing the
> reuse of an object that is still in use.

Most concurrent collectors don't run all the time though. While it's
possible to do so, it's almost always a lousy idea. In fact, one of the
major advantages of most implementations of GC over most implementations
of manual collection is specifically that they work lazily.

When/if you really try to collect garbage immediately after it's
generated, you end up doing a collection when there's very little
garbage. That's generally quite inefficient -- with a mark/sweep style
of collector, it means that your heap is generally fairly fragmented.
With a copying collector, it means most objects are still alive when you
do a collection -- and the time taken by a copying collector is (mostly)
proportional to the number of objects that remain alive when the
collection cycle runs.

As such, with any sort of copying-based collector, you generally want to
delay collection as long as possible, to give as many of the objects in
the current heap a chance to die before the collection takes place.

[ ... ]

> Thankfully, mark-and-sweep (from the 1950's IIRC) is not the end of the
> research on GC.

Far from it. In fact, it's pretty much just the beginning of GC
research.

Since this is being discussed in comp.lang.c++ (among other places) I
think it's worth adding one other point though: most GC research is
conducted using languages (e.g. Lisp, Smalltalk, Self) where _all_ (or
at least almost all) variables live in the garbage collected memory
pool, and all that's ever created on the stack are references, pointers,
or whatever you prefer to call them, to the real objects.

This means a lot of GC research is only marginally applicable to a
language like C or C++. It's almost unimaginable that anybody (except,
perhaps, a recent Java expatriate) would write C++ code like:

int *i = new int;
for (*i = 0; *i < whatever; ++*i)
    // ...

In C++ (and C, for that matter) such temporary objects are inevitably
created directly on the stack, NOT dynamically allocated. This means
that most objects that are created on the heap are likely to be
relatively long-lived. That, in turn, means that other forms of
collection (including reference counting) are generally likely to be FAR
more competitive with things like copying collectors in C++ than would
appear to be the case in most GC research.

Jerry Coffin
Apr 7, 2008, 2:11:10 AM
In article <ftbrld$4s8$1...@news.xmission.com>,
georgeps...@xmission.com says...

[ ... ]

> I think that Steele also wrote a variation on Dijkstra's algorithm. If
> you have an ACM account you can probably access it.
>
> Guy L. Steele, Jr., Multiprocessing Compactifying Garbage Collection,
> Comm. ACM, Sep. 1975, vol. 18, No. 9, pp. 495-508.

There has been a LOT of work on concurrent collectors since then.

Another paper that may be particularly applicable to this thread is _A
Unified Theory of Garbage Collection_, which deals specifically with not
only the differences but, especially, the similarities between garbage
collection based on reference counting and tracing:

http://www.research.ibm.com/people/d/dfb/papers/Bacon04Unified.pdf

Along with interesting content of its own, this has an _excellent_
bibliography.

Jerry Coffin
Apr 7, 2008, 2:11:09 AM
In article <AMqdnYMgKLNdn2Xa...@comcast.com>,
cri...@comcast.net says...

[ ... ]

> How do the mutators interact with the GC which "runs all the time"? Are you
> talking about threads from a per-application GC thread pool running non-stop?
> How much GC-to-thread introspection is going on? Any thread synchronization
> between GC thread(s) and mutator threads?

Most concurrent garbage collectors require some sort of either read
barriers or write barriers. As such, they're really only somewhat
concurrent -- they don't (necessarily) require that you stop the world,
but they may stop certain parts of the world. In fact, some go so far as
to call them "mostly concurrent" instead of truly "concurrent." E.g.
see:

http://research.sun.com/techrep/2000/smli_tr-2000-88.pdf

Dijkstra's algorithm is typically implemented using read barriers. These
are (horrendously) inefficient without direct hardware support (that's
not present in typically mainstream hardware).

If you want to look at concurrent GC in more detail, you might want to
look at what claims to be _the_ Garbage Collection Bibliography:

http://www.cs.kent.ac.uk/people/staff/rej/gcbib/gcbibH.html

Jon Harrop
Apr 8, 2008, 10:42:47 AM
Jerry Coffin wrote:
> When/if you really try to collect garbage immediately after it's
> generated, you end up doing a collection when there's very little
> garbage. That's generally quite inefficient -- with a mark/sweep style
> of collector, it means that your heap is generally fairly fragmented.
> With a copying collector, it means most objects are still alive when you
> do a collection -- and the time taken by a copying collector is (mostly)
> proportional to the number of objects that remain alive when the
> collection cycle runs.
>
> As such, with any sort of copying-based collector, you generally want to
> delay collection as long as possible, to give as many of the objects in
> the current heap a chance to die before the collection takes place.

That information is very out of date. Incremental generational collectors
that are based upon both mark-and-sweep and copying have been ubiquitous
for some time.

> Since this is being discussed in comp.lang.c++ (among other places) I
> think it's worth adding out one other point though: most GC research is
> conducted using languages (e.g. Lisp, Smalltalk, Self) where _all_ (or
> at least almost all) variables live in the garbage collected memory
> pool, and all that's ever created on the stack are references, pointers,
> or whatever you prefer to call them, to the real objects.
>
> This means a lot of GC research is only marginally applicable to a
> language like C or C++. It's almost unimaginable that anybody (except,
> perhaps, a recent Java expatriate) would write C++ code like:
>
> int *i=new int;
> for (*i=0; *i<whatever; ++*i)
> // ...

This is just absolute nonsense.

Values are typically only boxed if they do not fit in a machine word. In
many cases (e.g. MLton, Stalin), the compiler automatically unboxes for you
and in other cases (e.g. F#) the programmer can control unboxing.

--
Dr Jon D Harrop, Flying Frog Consultancy

http://www.ffconsultancy.com/products/?u

Stephen J. Bevan

May 14, 2008, 12:00:04 AM
Jon Harrop <use...@jdh30.plus.com> writes:
> Let's do a single-threaded benchmark first.
>
>> What collection-intensive tasks?
>
> Symbolic rewriting would make a good benchmark. Here is one:
>
> http://www.lambdassociates.org/studies/study10.htm
>
> Try implementing this using reference counting and we can see how it
> compares (probably on a trickier rewrite). Replace the arbitrary-precision
> ints with doubles.

The (old) machine I read news on is not nearly so powerful as Mark's
so for reference here's the time for a slightly modified version[1] of
the O'Caml code :-

$ ocamlopt -v
The Objective Caml native-code compiler, version 3.08.2
Standard library directory: /usr/lib/ocaml/3.08
$ ocamlopt simplify.ml -o simplify
$ time ./simplify
Took 5.020000s

real 0m5.137s
user 0m5.021s
sys 0m0.005s

Now for C code that uses naive reference counting :-

$ cc -v
Reading specs from /usr/lib/gcc-lib/i486-linux/3.3.5/specs
Configured with: ../src/configure -v --enable-languages=c,c++,java,f77,pascal,objc,ada,treelang --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --with-system-zlib --enable-nls --without-included-gettext --enable-__cxa_atexit --enable-clocale=gnu --enable-debug --enable-java-gc=boehm --enable-java-awt=xlib --enable-objc-gc i486-linux
Thread model: posix
gcc version 3.3.5 (Debian 1:3.3.5-8ubuntu2.1)
$ cc -O3 -Wall simplify.c -o simplify
$ time ./simplify
real 0m16.757s
user 0m16.553s
sys 0m0.003s
$

That's over 3x slower than O'Caml so O'Caml/GC is the winner, right?

Well, if your only choice is O'Caml or (naive) reference counting for
this problem then yes. However, there are other approaches. For
example, the following is the time for C code that uses a hand-coded
version of what a sufficiently smart compiler that implemented
linear/unique types could produce :-

$ time ./simplify
real 0m2.812s
user 0m2.751s
sys 0m0.002s

So on my machine naive reference counting is 3x slower than O'Caml GC
but the linear version is 2x faster than O'Caml. Should we conclude
that (O'Caml) GC and naive reference counting both suck?

My conclusion is that while the benchmark clearly measures something
it isn't clear to me that what it measures is interesting/useful. It
may be possible to make it useful in which case the benchmark should
also track peak/average heap size. At present that's pointless
because the test expression is so small that the working set for
each iteration should only be a few hundred bytes. This will fit in
the nursery of any generational collector or only require a copying
collector to traverse a few hundred bytes. That's very GC friendly.

------------------

[1] Compiling the code from the site gave the following :-

$ ocamlopt simplify.ml -o simplify
No implementations provided for the following modules:
Num referenced from simplify.cmx

Rather than work out if I'm suffering from using an out of date
O'Caml or it just isn't installed right, I just changed the code to
use a good old data type and that works fine :-

$ cat simplify.ml
type expr = Int of int | Var of string | Add of expr * expr | Mul of expr * expr;;

let int n = (Int n);;

let ( +: ) f g = Add(f, g) and ( *: ) f g = Mul(f, g);;

let test_expr =
  Var "x" *: ((int 12 *: int 0 +: (int 23 +: int 8)) +: Var "y");;

let time f x =
  let t = Sys.time() in
  let f_x = f x in
  Printf.printf "Took %fs\n" (Sys.time() -. t);
  f_x;;

let rec loop n f x = if n=1 then f x else (ignore(f x); loop (n-1) f x);;

let rec ( +: ) f g = match f, g with
  | Int n, Int m -> Int (n + m)
  | (Int 0), e | e, (Int 0) -> e
  | f, Add(g, h) -> f +: g +: h
  | f, g -> Add(f, g);;

let rec ( *: ) f g = match f, g with
  | Int n, Int m -> Int (n * m)
  | (Int 0), e | e, (Int 0) -> (Int 0)
  | (Int 1), e | e, (Int 1) -> e
  | f, Mul(g, h) -> f *: g *: h
  | f, g -> Mul(f, g);;

let rec simplify = function
  | Int _ | Var _ as f -> f
  | Add (f, g) -> simplify f +: simplify g
  | Mul (f, g) -> simplify f *: simplify g;;

time (loop 10000000 simplify) test_expr;;

Chris Thomasson

May 14, 2008, 12:23:14 AM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:87d4npo...@dnsalias.com...

> Jon Harrop <use...@jdh30.plus.com> writes:
>> Let's do a single-threaded benchmark first.
>>
>>> What collection-intensive tasks?
>>
>> Symbolic rewriting would make a good benchmark. Here is one:
>>
>> http://www.lambdassociates.org/studies/study10.htm
>>
>> Try implementing this using reference counting and we can see how it
>> compares (probably on a trickier rewrite). Replace the
>> arbitrary-precision
>> ints with doubles.
[...]

Try the test using the following allocator:

http://groups.google.com/group/comp.lang.c++/browse_frm/thread/68bdc76f9792de1f

What results do you get?

Chris Thomasson

May 14, 2008, 12:48:10 AM
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:yIednXsQ-LQe9bfV...@comcast.com...

Ah, never mind. Who cares.

Stephen J. Bevan

May 14, 2008, 9:44:46 AM
"Chris Thomasson" <cri...@comcast.net> writes:

Using malloc(3) directly causes the time for the reference counting C
code to be 30 seconds i.e. 6x slower than O'Caml. So I used a simple
object pooling scheme. Although there are 50000011 allocations, only
14 of them result in a call to malloc(3). The same pool is used in
the linear code so while it may be possible to improve the pool, the
fact that the linear code is 8x quicker than the reference counting
version shows that the (naive) reference counting is the place to look
for performance improvements.

The above, and the fact that your code is C++ and I'm using C, means
that there is little incentive to convert your C++ to C to try it.

Chris Thomasson

May 14, 2008, 5:50:34 PM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:878wydn...@dnsalias.com...

Here is a crude C version of my single-threaded region allocator:

http://pastebin.com/m52ba914

http://groups.google.com/group/comp.lang.c/browse_frm/thread/f8f65273b09b4229

Stephen J. Bevan

May 14, 2008, 9:50:28 PM
"Chris Thomasson" <cri...@comcast.net> writes:
> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>> Using malloc(3) directly causes the time for the reference counting C
>> code to be 30 seconds i.e. 6x slower than O'Caml. So I used a simple
>> object pooling scheme. Although there are 50000011 allocations, only
>> 14 of them result in a call to malloc(3). The same pool is used in
>> the linear code so while it may be possible to improve the pool, the
>> fact that the linear code is 8x quicker than the reference counting
>> version shows that the (native) reference couting is the place to look
>> for performance improvements.
>>
>> The above and the fact that your code is C++ and I'm using C means
>> that there is is little incentive to convert your C++ to C to try it.
>
> Here is a crude C version of my single-threaded region allocator:

There were two reasons for not using your C++ allocator, the fact that
it was in C++ was the second reason. Having a C version does nothing
to address the first reason: the linear version 8x quicker than the
reference counting version using the _same_ allocator. Thus improving
the allocator will not make the reference counting version faster than
the linear version. It _may_ make the reference counting version
faster than O'Caml. However, looking at your C allocator code it does
observably more work per allocation than the pooling allocator I'm
using so it isn't going to be faster. Talk is cheap so I actually
tested it :-

$ time ./simplify speed
(0x804b148/0x804e008/1/8192)::allocator_sys_expand
(0x804b148/0x8052028/2/16384)::allocator_sys_expand
(0x804b148/0x804e008/2/8192)::allocator_sys_promote

real 0m17.530s
user 0m17.307s
sys 0m0.005s
$

This is one second slower than my original code. Not a huge slowdown,
so as code goes it is fine. However, it clearly doesn't make things
faster and so just reinforces my first reason for not using your
C++ code: allocation is not the bottleneck, (naive) reference counting is.

Chris Thomasson

May 15, 2008, 12:44:43 AM

"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:871w44n...@dnsalias.com...

> "Chris Thomasson" <cri...@comcast.net> writes:
>> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>>> Using malloc(3) directly causes the time for the reference counting C
>>> code to be 30 seconds i.e. 6x slower than O'Caml. So I used a simple
>>> object pooling scheme. Although there are 50000011 allocations, only
>>> 14 of them result in a call to malloc(3). The same pool is used in
>>> the linear code so while it may be possible to improve the pool, the
>>> fact that the linear code is 8x quicker than the reference counting
>>> version shows that the (native) reference couting is the place to look
>>> for performance improvements.
>>>
>>> The above and the fact that your code is C++ and I'm using C means
>>> that there is is little incentive to convert your C++ to C to try it.
>>
>> Here is a crude C version of my single-threaded region allocator:
>
> There were two reasons for not using your C++ allocator, the fact that
> it was in C++ was the second reason. Having a C version does nothing
> to address the first reason: the linear version 8x quicker than the
> reference counting version using the _same_ allocator. Thus improving
> the allocator will not make the reference counting version faster than
> the linear version. It _may_ make the reference counting version
> faster than O'Caml. However, looking at your C allocator code it does
> observably more work per allocation than the pooling allocator I'm
> using so it isn't going to be faster.

What type of pool allocator are you using? AFAICT, a pool allocator is
analogous to a region allocator in that allocation and reclamation can be
as simple as:
_____________________________________________________________
#define BUF_SIZE 8192


typedef struct region {
    unsigned char buf[BUF_SIZE];
    size_t offset;
} region;


void* region_malloc(
    region* const _this,
    size_t size
) {
    size_t const offset = _this->offset;
    /* round the size up so every allocation stays suitably aligned */
    size = (size + sizeof(void*) - 1) & ~(sizeof(void*) - 1);
    if (size + offset <= BUF_SIZE) {
        _this->offset += size;
        return _this->buf + offset;
    }
    return NULL;
}


void region_reset(
    region* const _this
) {
    _this->offset = 0;
}
_____________________________________________________________


Now you can do stuff like:


void thread() {
    int o, i1, i2;
    region region[2] = { { { '\0' }, 0 }, { { '\0' }, 0 } };
    for (o = 1 ;; ++o) {
        region_malloc(region, (o % 128) + 1);
        for (i1 = 1; i1 < (o % 512) + 2; ++i1) {
            for (i2 = 1; i2 < (o % 32) + 2; ++i2) {
                region_malloc(region + 1, (i2 % 16) + 1);
            }
            region_reset(region + 1);
        }
        region_reset(region);
    }
}


The ratio of mallocs to frees does not need to be equal. Instead, it
can be N mallocs to 1 reset. The granularity can be improved by clever
partitioning of multiple regions.


> Talk is cheap so I actually
> tested it :-
>
> $ time ./simplify speed
> (0x804b148/0x804e008/1/8192)::allocator_sys_expand
> (0x804b148/0x8052028/2/16384)::allocator_sys_expand
> (0x804b148/0x804e008/2/8192)::allocator_sys_promote
>
> real 0m17.530s
> user 0m17.307s
> sys 0m0.005s
> $
>
> This is one second slower than my original code.

Did you free each object individually or did you take advantage of the
allocator_reset procedure? The region allocator I posted does not perform
all that well when you call allocator_release. It's best to create multiple
allocators and partition them into zones, and only call reset on a zone when
it's in a persistent quiescent state.


> Not a huge slowdown,
> so as code goes it is fine. However, it clearly doesn't make things
> faster and so just re-inforces my first reason for not using your
> C++ code: allocation is not the bottlneck, (naive) reference counting is.

I clearly misunderstood you; sorry about that nonsense. I was under the
impression that the allocator was the bottleneck. Hmm, can you post the
test code you're using? Perhaps it contains some reference counting patterns
that are compatible with fine/medium-grain partitioned reference counting.
Think of zone-based object reclamation. You can amortize a plurality of
reference increments/decrements at so-called check-points that only need to
be executed on a periodic/episodic basis.

Also, if the benchmark is single-threaded, well, you could apply more and
more elaborate amortizations for existing refcounting techniques.

Chris Thomasson

May 15, 2008, 1:03:23 AM
"Chris Thomasson" <cri...@comcast.net> wrote in message
news:g4adnXQ_oNWXIrbV...@comcast.com...

[...]

>> Not a huge slowdown,
>> so as code goes it is fine. However, it clearly doesn't make things
>> faster and so just re-inforces my first reason for not using your
>> C++ code: allocation is not the bottlneck, (naive) reference counting is.
>
> I clearly misunderstood you; sorry about that non-sense. I was under the
> impression that the allocator was the bottle neck. Humm, can you post the
> test code your using? Perhaps it contains some reference counting patterns
> that are compatible with fine/medium-grain partitioned reference counting.
> Think of zone-based object reclamation. You can amortize a plurality of
> reference increments/decrements at so called check-points that only need
> to
> be executed on a periodic/episodic basis.
>
> Also, if the benchmark is single-threaded, well, you could apply more and
> more elaborate amortizations for existing refcounting techniques.

If you apply optimizations to naive reference counting, well, then it's no
longer all that naive, is it? Conclusion: use the right tool for the job at
hand.

;^)

Stephen J. Bevan

May 15, 2008, 1:25:38 AM
"Chris Thomasson" <cri...@comcast.net> writes:
> Why type of pool allocator are you using? AFAICT, pool allocator is
> analogous to region allocator in that allocation and reclamations can be
> as simple as:

Given that the allocator performance is not the bottleneck, the
method I used is irrelevant.

> The ratio of mallocs to frees does not need to be equal. Instead, it
> can be N mallocs to 1 reset. The granularity can be improved by clever
> partitioning of multiple regions.

Agreed, but unless you have a clever partition for this problem then
that idea is moot.

> Did you free each object individually or did you take advantage of
> the allocator_reset procedure? The region allocator I posted does not
> perform all that well when you call allocator_release. Its best to
> create multiple allocators and partition them into zones, and only
> call reset on a zone when its in a persistent quiescent state.

I used it as a straight malloc & free replacement. There is no point in
trying to take advantage of anything because allocation is not the
bottleneck that is causing the reference counting version to be 8x
slower than a linear version.


> I clearly misunderstood you; sorry about that non-sense. I was under the
> impression that the allocator was the bottle neck. Humm, can you post the
> test code your using?

I could but it is trivial. If you want to take up Jon's challenge
then you can easily write the code using whatever reference counting
pattern you like. Changing to one form of deferred reference counting
decreases the time to ~11 seconds. Still 2x slower than O'Caml and 5x
slower than a linear version.


> Think of zone-based object reclamation.

How about you think about it? Jon was arguing for GC, you were arguing
for reference counting. All I did was measure the two on Jon's
problem on my machine and show that on this test both suck and
linearity wins, though GC sucks less than two forms of reference
counting.

Stephen J. Bevan

May 15, 2008, 1:27:48 AM
"Chris Thomasson" <cri...@comcast.net> writes:
> If you apply optimizations to naive reference counting, well, then its
> no longer all that naive is it? Conclusion, use the right tool for the
> job at hand.

And what might that tool be? It doesn't appear to be GC or reference
counting.

Chris Thomasson

May 15, 2008, 2:24:30 AM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:87od78m...@dnsalias.com...

> "Chris Thomasson" <cri...@comcast.net> writes:
>> Why type of pool allocator are you using? AFAICT, pool allocator is
>> analogous to region allocator in that allocation and reclamations can be
>> as simple as:
>
> Given that the allocator performance is not the bottlenck so the
> method I used is irrelevant.
>
>> The ratio of mallocs to frees does not need to be equal. Instead, it
>> can be N mallocs to 1 reset. The granularity can be improved by clever
>> partitioning of multiple regions.
>
> Agreed, but unless you have a clever partition for this problem then
> that idea is moot.

I don't think it's worth the effort. Oh well. Nobody is paying me here...

;^)


>> Did you free each object individually or did you take advantage of
>> the allocator_reset procedure? The region allocator I posted does not
>> perform all that well when you call allocator_release. Its best to
>> create multiple allocators and partition them into zones, and only
>> call reset on a zone when its in a persistent quiescent state.
>
> I used it as a straight malloc&free replacment. There is no point in
> trying to take advantage of anything because allocation is not the
> bottleneck that is causing the reference counting version to be 8x
> slower than a linear version.
>
>
>> I clearly misunderstood you; sorry about that non-sense. I was under the
>> impression that the allocator was the bottle neck. Humm, can you post the
>> test code your using?
>
> I could but it is trivial. If you want to take up Jon's challenge
> then you can easily write the code using whatever reference counting
> pattern you like. Changing to one form of deferred reference counting
> decreases the time to ~11 seconds. Still 2x slower than O'Caml and 5x
> slower than a linear version.
>
>
>> Think of zone-based object reclamation.
>
> How about you think about it? Jon was arguing for GC you were arguing
> for reference counting.

I was trying to address the various false blanket claims that Jon was making
about every form of reference counting.


> All I did was measure the two on Jon's
> problem on my machine and show that on this test both suck and
> linearity wins, though GC sucks less than two forms of reference
> counting.

Well, don't use refcounting for this problem then... Refcounting is not the
right answer to every problem. :^)

Dmitriy V'jukov

May 15, 2008, 3:22:44 AM
On 15 May, 09:27, step...@dino.dnsalias.com (Stephen J. Bevan) wrote:


Maybe this:
http://groups.google.ru/group/comp.programming.threads/browse_frm/thread/dab22a17c32c6b13
http://groups.google.ru/group/comp.programming.threads/msg/59c2bc2366831d29


Btw, what do you mean by "reference counting"? Something like this:

void rc_acquire(rc_t* obj)
{
    atomic_inc(&obj->rc);
}

void rc_release(rc_t* obj)
{
    if (0 == atomic_dec(&obj->rc))
        delete obj;
}


Are you testing on SMP/multicore machine?


Dmitriy V'jukov

Stephen J. Bevan

May 15, 2008, 9:45:41 AM

They may or may not help in an SMP situation but in a single-threaded
scenario all they do is add more overhead and thus make (naive)
reference counting even slower.


> Btw, what you mean by "reference counting"? Something like this:

[snip]

Yes, but since there is no SMP the inc/dec don't have to be atomic.
If they are, the performance is even worse. Though to be fair I'd then
need to test against an SMP-capable GC too.


> Are you testing on SMP/multicore machine?

No.

Stephen J. Bevan

May 15, 2008, 9:51:42 AM
"Chris Thomasson" <cri...@comcast.net> writes:
> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>> Agreed, but unless you have a clever partition for this problem then
>> that idea is moot.
>
> I don't think its worth the effort. Oh well. Nobody is paying me here...

Nobody was paying you to argue with Jon that reference counting was
sufficient either.


>> How about you think about it? Jon was arguing for GC you were arguing
>> for reference counting.
>
> I was trying to address the various false blanket claims that Jon was
> making that he was making about every form of reference counting.

Jon challenged you to back up your claims that his claims were false
by providing a reference counting version that is faster than GC for
a specific test. I saw no response from you, thus I don't see how you
addressed his claims at all.


>> All I did was measure the two on Jon's
>> problem on my machine and show that on this test both suck and
>> linearity wins, though GC sucks less than two forms of reference
>> counting.
>
> Well, don't use refcounting for this problem then... Refcounting is
> not the right answer to every problem. :^)

I agree, but then you were the one arguing for reference counting and
against GC not me.

Chris Thomasson

May 15, 2008, 10:28:45 AM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:878wybn...@dnsalias.com...

> "Chris Thomasson" <cri...@comcast.net> writes:
>> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>>> Agreed, but unless you have a clever partition for this problem then
>>> that idea is moot.
>>
>> I don't think its worth the effort. Oh well. Nobody is paying me here...
>
> Nobody was paying you to argue with Jon that reference counting was
> sufficient either.

That's not what I was arguing about. You need to re-read the thread. Jon
made misleading claims on reference counting, and I pointed them out.


>>> How about you think about it? Jon was arguing for GC you were arguing
>>> for reference counting.
>>
>> I was trying to address the various false blanket claims that Jon was
>> making that he was making about every form of reference counting.
>
> Jon challenged you to back up your claims that this claims were false
> by providing a reference counting version that is faster than GC for
> a specific test. I saw no response from you thus I don't see how you
> addressed his claims at all.

When did I say that reference counting in general is always faster than GC?
I said that some forms of reference counting are deterministic and GC is not.
Also, most reference counting algorithms do not require some of the "heavy
handed" techniques that some GCs employ: thread suspension, thread-stack
introspection, stop-the-world, etc. Does this fact make reference counting
better? Nah.


>>> All I did was measure the two on Jon's
>>> problem on my machine and show that on this test both suck and
>>> linearity wins, though GC sucks less than two forms of reference
>>> counting.
>>
>> Well, don't use refcounting for this problem then... Refcounting is
>> not the right answer to every problem. :^)
>
> I agree, but then you were the one arguing for reference counting and
> against GC not me.

When did I argue against GC? I was responding to Jon blanket false claims.

Stephen J. Bevan

May 15, 2008, 9:51:08 PM
"Chris Thomasson" <cri...@comcast.net> writes:

> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
> news:878wybn...@dnsalias.com...
>> "Chris Thomasson" <cri...@comcast.net> writes:
>>> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>>>> Agreed, but unless you have a clever partition for this problem then
>>>> that idea is moot.
>>>
>>> I don't think its worth the effort. Oh well. Nobody is paying me here...
>>
>> Nobody was paying you to argue with Jon that reference counting was
>> sufficient either.
>
> That's not what I was arguing about. You need to re-read the
> thread. Jon made misleading claims on reference counting, and I pointed
> them out.

Jon's arguments against RC were :-

. Fragility.
. Memory overhead.
. Performance degradation.
. Fragmentation.
. Not incremental, i.e. awful worst case performance.

Whether each one is misleading is a matter of debate. I'd argue that
the first isn't. The last isn't unless you consider at least some
kind of deferred reference counting scheme, which just adds to the
first issue and somewhat adds to the second depending on just when the
deferred work is done. The penultimate issue is true when compared to
a copying GC but not, say, a mark-sweep GC. Regarding performance
degradation for "collection-intensive tasks" you wrote in
<m4mdnbbimOjpiHLa...@comcast.com> :-

What collection-intensive tasks? I know I can compete, and
most likely beat, a traditional GC with handcrafted implementations
that C++ give me the freedom to work with.

This led Jon to suggest a specific benchmark which is clearly small
enough to do that without requiring an onerous amount of work to
prove the point either way. One can certainly question the benchmark
(I certainly do) but as far as I can see you didn't post any kind of
followup at all. That's fine, you don't have to, but then it is rather
bold to now claim you were pointing out misleading claims without
specifying which claims were misleading and which were not.


>> Jon challenged you to back up your claims that this claims were false
>> by providing a reference counting version that is faster than GC for
>> a specific test. I saw no response from you thus I don't see how you
>> addressed his claims at all.
>
> When did I say that reference counting in general is always faster
> than GC?

I quoted what you claimed and so you were free to use reference
counting or not as you see fit. You didn't offer a solution.


> When did I argue against GC? I was responding to Jon blanket false
> claims.

Claiming they are false is not the same as showing they are false.

Jerry Coffin

May 16, 2008, 12:45:28 AM
In article <874p8zm...@dnsalias.com>, ste...@dino.dnsalias.com
says...

[ ... ]

> Jon's arguments against RC were :-
>
> . Fragility.
> . Memory overhead.
> . Performance degradation.
> . Fragmentation.
> . Not incremental, i.e. awful worst case performance.
>
> Whether each one is misleading is a matter of debate.

I'd say I disagree, except that disagreeing more or less confirms your
point. Some, however, are clearly false, so they _shouldn't_ be open to
debate, even if they are.

> I'd argue that the first isn't.

When stated more specifically (i.e. noting the nature of the fragility)
I'd accept it as quite accurate.

> The last isn't unless you consider at least some
> kind of deferred reference counting scheme which just adds to the
> first issue, somewhat adds to the second depending on just when the
> deferred work is done.

More accurately, the last is really false (while still giving a fairly
accurate impression). Worse, your comments are incorrect as well --
deferring collection doesn't add to fragility at all.

Your claim of added memory overhead really IS quite misleading -- it's
easy to limit the extra memory to a single pointer.

Incremental collection is one of the major advantages of reference
counting -- most other forms of garbage collection have substantially
_worse_ worst-case performance, and limiting the worst case adds
considerably more to their complexity as well.

> The penultimate issue is true when compared to
> a copying GC but not say a mark-sweep GC.

Chris was talking about claims that were technically true, but
misleading anyway. This is the opposite: technically false, but it
mostly gives an accurate idea anyway.

Reference counting is concerned only with determining _when_ objects can
be collected. It is entirely orthogonal to what you do with the memory
once you determine that objects are no longer needed.

At the same time, to compact the heap, you need to carry out something
on the order of the mark phase of a mark-sweep collector to find all the
pointers you're going to need to modify when you move an object. As
such, while there's nothing about RC that prevents it, I've never heard
of an RC-based collector that did it (and from a practical viewpoint, I
have a hard time imagining such a thing either).

[ ... ]

> Claiming they are false is not the same as showing they are false.

Claiming they're true is not the same as showing they're true either.

I'd score the claims as:

Fragility: True -- when properly qualified
Memory Overhead: generally misleading
Performance degradation: Open to (lots of) debate
Fragmentation: technically false, practically true
Not Incremental: clearly false

Since this is cross-posted to c.l.c++, I feel obliged to point out that
most GC testing is done in languages (e.g. Lisp, Smalltalk, Self) in
which _all_ objects are allocated dynamically, and open to garbage
collection. This is NOT the case in C++; many optimizations in garbage
collectors are based on usage patterns that appear far less likely to
happen in C++. In particular, most GC optimization depends on the fact
that MOST objects die before a collection cycle happens. This is FAR
less likely in C++. While the exact lifetime may not be known, the average
life-span of a dynamically allocated object in C++ is MUCH longer than
in these other languages.

Chris Thomasson

May 16, 2008, 3:48:04 AM

"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:874p8zm...@dnsalias.com...

> "Chris Thomasson" <cri...@comcast.net> writes:
>
>> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>> news:878wybn...@dnsalias.com...
>>> "Chris Thomasson" <cri...@comcast.net> writes:
>>>> "Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
>>>>> Agreed, but unless you have a clever partition for this problem then
>>>>> that idea is moot.
>>>>
>>>> I don't think its worth the effort. Oh well. Nobody is paying me
>>>> here...
>>>
>>> Nobody was paying you to argue with Jon that reference counting was
>>> sufficient either.
>>
>> That's not what I was arguing about. You need to re-read the
>> thread. Jon made misleading claims on reference couning, and I pointed
>> them out.
>
> Jon's arguments against RC were :-
>
> . Fragility.
> . Memory overhead.
> . Performance degradation.
> . Fragmentation.
> . Not incremental, i.e. awful worst case performance.
[...]

First question, I don't have time right now to make proper response...

;^(

Stephen, do you honestly think that Jon's blanket arguments apply to every
type of reference counting algorithm out there? Keep in mind that he did not
know that some state-of-the-art reference counting algorithms existed.
Please re-read the threads. He even brought up exception handling in the
context of RC to slam C++ and prop up another high-level language. IMVHO,
he was doing a little trolling and I got hooked!

:^/


GC and RC are both great tools to have. Please try to understand that this
has been my argument all along. When Jon writes about his false
claims/misunderstandings on "some" RC algorithms, I point them out. So be
it.


AFAICT, Jon's main mistake is that he totally confuses the eligibility of an
object to be collected with the fact that a GC does not collect anything
unless it actually runs a collection cycle. His claims are wrong because he
is posting in a C++ group, which has no GC:


{
    refcount<Foo> f(new Foo);
    foo(f);
    bar();
}


Jon says that GC will collect f while bar is executing. Well, his claim is
totally misleading because he cannot possibly know when the GC is going to
decide to run a cycle. I have been over this with Jon. He refuses to
understand. Oh well.


He even slammed Linux, and insulted Linus Torvalds' intelligence because
Linus happened to approve a proxy GC over a traditional one...

Stephen J. Bevan

May 16, 2008, 10:57:58 AM
>> The last isn't unless you consider at least some
>> kind of deferred reference counting scheme which just adds to the
>> first issue, somewhat adds to the second depending on just when the
>> deferred work is done.
>
> More accurately, the last is really false (while still giving a fairly
> accurate impression). Worse, your comments are incorrect as well --
> deferring collection doesn't add to fragility at all.

So while you think that reference counting can be considered fragile,
you do not consider that using a deferral mechanism makes it any more
fragile. Interesting. I disagree, but I don't think that is
important.


> Your claim of added memory overhead really IS quite misleading -- it's
> easy to limit the extra memory to a single pointer.

I wrote "[deferred reference counting] somewhat adds to the second
depending on just when the deferred work is done." That has nothing
to do with any extra pointer and everything to do with increased heap
occupancy that occurs due to not freeing memory right away as a naive
reference counting implementation would. Whether it actually has a
higher overhead than any other kind of GC would depend on the other GC
and more than likely the program(s).


> At the same time, to compact the heap, you need to carry out something
> on the order of the mark phase of a mark-sweep collector to find all the
> pointers you're going to need to modify when you move an object. As
> such, while there's nothing about RC that prevents it, I've never heard
> of an RC-based collector that did it (and from a practical viewpoint, I
> have a hard time imagining such a thing either).

One-bit RC implementations are typically coupled with a mark-sweep
collector which may or may not compact.

>> Claiming they are false is not the same as showing they are false.
>
> Claiming they're true is not the same as showing they're true either.

Indeed, and you'll note that I didn't enter this thread debating each
of the individual claims; I only addressed one of them: performance
degradation. I didn't even expect Chris to respond since it was Jon's
challenge to beat GC, and while RC doesn't beat it, a linear version does.
The only reason for listing the claims again was that Chris wrote :-

> That's not what I was arguing about. You need to re-read the
> thread. Jon made misleading claims on reference counting, and I
> pointed them out.

> I was responding to Jon's blanket false claims.

and so I posted the list to clarify what those claims were and, as a
side issue, to note that they aren't all "blanket false claims", as you
yourself agreed.

Stephen J. Bevan

May 16, 2008, 10:59:34 PM
"Chris Thomasson" <cri...@comcast.net> writes:
> Stephen, do you honestly think that Jon's blanket arguments apply to
> every type of reference counting algorithm out there? Keep in mind
> that he did not know that some state-of-the-art reference counting
> algorithms existed.

The only citations I could find are to three techniques posted by
people in comp.programming.threads (one of which is your own). Any RC
mechanism which doesn't put an RC on every object but instead on some
kind of container/proxy for a group of objects is going to reduce the
number of inc/dec calls at the expense of something. There's nothing
particularly state-of-the-art about that idea. Whether the method is
actually applicable depends on the problem. It can be used for Jon's
challenge since there is nothing in the challenge which forces fine
grain sharing between terms. Change the problem slightly (e.g. graph
re-writing) and it is much harder to use a proxy or linearity.


> His claims are wrong because he is posting in C++ group which has no GC:
>
> {
> refcount<Foo> f(new Foo);
> foo(f);
> bar();
> }
>
> Jon says that GC will collect f while bar is executing. Well, his
> claim is totally misleading because he cannot possibly know when the
> GC is going to decide to run a cycle. I have been over this with
> Jon. He refuses to understand. Oh well.

Since you didn't quote Jon I had a look back through the thread for
that example and I can find it in <13uvuuv...@corp.supernews.com>
where Jon writes "can collect" not "will collect". To me that makes a
big difference as to whether it is misleading or not.


> He even slammed Linux, and insulted Linus Torvalds intelligence
> because he happened to approve a proxy GC over a traditional one...

Whether he slammed Linux or insulted Linus is irrelevant. Just
because I don't agree with him about one issue doesn't mean I
automatically dismiss what he says about another.

Chris Thomasson

May 16, 2008, 11:48:30 PM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:87abipl...@dnsalias.com...

> "Chris Thomasson" <cri...@comcast.net> writes:
>> Stephen, do you honestly think that Jon's blanket arguments apply to
>> every type of reference counting algorithm out there? Keep in mind
>> that he did not know that some state-of-the-art reference counting
>> algorithms existed.
>
> The only citations I could find are to three techniques posted by
> people in comp.programming.threads (one of which is your own). Any RC
> mechanism which doesn't put an RC on every object but instead on some
> kind of container/proxy for a group of objects is going to reduce the
> number of inc/dec calls at the expense of something. There's nothing
> particularly state-of-the-art about that idea.

The most important aspect of state-of-the-art reference counting is the fact
that no dependence on any interlocked RMW and/or memory-barrier instructions
is needed at all. Of course, this means that various MAJOR benefits in the
realm of scalability, throughput and performance can indeed be realized.


> Whether the method is
> actually applicable depends on the problem. It can be used for Jon's
> challenge since there is nothing in the challenge which forces fine
> grain sharing between terms. Change the problem slightly (e.g. graph
> re-writing) and it is much harder to use a proxy or linearity.

Proxy GC works very well within certain problem arenas. For instance, the
classic reader-writer problem is efficiently solved through clever use of
the technique.


>> His claims are wrong because he is posting in C++ group which has no GC:
>>
>> {
>> refcount<Foo> f(new Foo);
>> foo(f);
>> bar();
>> }
>>
>> Jon says that GC will collect f while bar is executing. Well, his
>> claim is totally misleading because he cannot possibly know when the
>> GC is going to decide to run a cycle. I have been over this with
>> Jon. He refuses to understand. Oh well.
>
> Since you didn't quote Jon I had a look back through the thread for
> that example and I can find it in <13uvuuv...@corp.supernews.com>
> where Jon writes "can collect" not "will collect". To me that makes a
> big difference as to whether it is misleading or not.

Well, that rendered his argument completely moot. Of course a GC "can"
collect f while bar is executing. The misleading part is that he fails to
explain that "can" is predicated on the fact that a GC must run a cycle
while bar is executing. A small tweak to the program changes things:

{
    {
        refcount<Foo> f(new Foo);
        foo(f);
    }
    bar();
}


Now, a naive RC WILL collect f _before_ bar has executed. Of course Jon said
that this fact is totally irrelevant. I called him on that point because it
takes advantage of the fact that some types of RC are deterministic. Most GC
algorithms cannot rise to that level of determinism. We got into a spiral
argument that had no end in sight. Well, that was until he showed his true
colors and started implying that C++ is crap because it has no GC and that
OCaml exceptions are faster.


>> He even slammed Linux, and insulted Linus Torvalds intelligence
>> because he happened to approve a proxy GC over a traditional one...
>
> Whether he slammed Linux or insulted Linus is irrelevant.

AFAICT, his insults against Linux and/or Linus are based in a form of
ignorance. Much like his attribution of several problems to _all_ forms of
RC. IMVHO, it's relevant indeed. Jon __seems__ to hate RC no matter what
color, shape or form it presents itself in. Hate is "usually" based on utter
ignorance. Oh well, life goes on.


> Just
> because I don't agree with him about one issue doesn't mean I
> automatically dismiss what he says about another.

Well, it gave me an insight into why Jon casts his blanket assertions across
ALL forms of RC. This is why I use the term "BLANKET". Some person might be
misled into thinking that Jon's assertions are true across all forms of RC.
This is totally false and I think Jon knew better. He posted that nonsense
on purpose for some odd reason. Why? I don't know. Perhaps you can tell
me???

Chris Thomasson

May 16, 2008, 11:49:12 PM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:87y76al...@dnsalias.com...

> Jerry Coffin <jco...@taeus.com> writes:
>> In article <874p8zm...@dnsalias.com>, ste...@dino.dnsalias.com
>>> The last isn't unless you consider at least some
>>> kind of deferred reference counting scheme which just adds to the
>>> first issue, somewhat adds to the second depending on just when the
>>> deferred work is done.
>>
>> More accurately, the last is really false (while still giving a fairly
>> accurate impression). Worse, your comments are incorrect as well --
>> deferring collection doesn't add to fragility at all.
>
> So while you think that reference counting can be considered fragile
> you do not consider that using a deferral mechanism makes it any more
> fragile. Interesting. I disagree but I don't think that is
> important.

[...]

RC is ONLY as fragile as the programmer who makes use of it.

Chris Thomasson

May 16, 2008, 11:52:09 PM

"Chris Thomasson" <cri...@comcast.net> wrote in message
news:jvadnXf0ZIcpyLPV...@comcast.com...

BTW, would you call GC fragile because a programmer can still leak memory?
In certain scenarios, forgetting to set a reference to NULL is analogous to
forgetting to call free.

Stephen J. Bevan

May 17, 2008, 10:28:08 AM
"Chris Thomasson" <cri...@comcast.net> writes:
>> RC is ONLY as fragile as the programmer who makes use of it.

Indeed, but for the most part humans write programs and humans make
mistakes. The important thing is whether the mistakes can easily be
found or whether they can be missed and only show up in production.
I'm reasonably confident that anyone who has used RC in a non-trivial
program (i.e. at least measured in tens of thousands of lines of code,
written by multiple people and actively used by third parties) has
some war stories about spending time trying to track down bugs due to
RC errors. I certainly have, but I have no choice but to use RC in
one or more of its forms. That doesn't mean I'm oblivious to its
problems and wouldn't prefer to use GC if it were feasible (it isn't in my
memory-constrained environment).


> BTW, would you call GC fragile because a programmer can still leak
> memory? In certain scenarios, forgetting to set a reference to NULL is
> analogous to forgetting to call free.

GC and RC share one failure mode: potential to retain memory that is
semantically free. RC (when used manually and not generated by a
compiler) has the other which is to reclaim too early. Neither mode
is good. But two modes of failure puts RC higher up the fragility
scale than GC.

Stephen J. Bevan

May 17, 2008, 11:03:47 AM
"Chris Thomasson" <cri...@comcast.net> writes:
> The most important aspect of state-of-the-art reference counting is
> the fact that no dependence on any interlocked RMW and/or
> memory-barrier instructions is needed at all. Of course, this means
> that various MAJOR benefits in the realm of scalability, throughput
> and performance can indeed be realized.

Despite the thread subject, Jon's challenge is single-threaded, thus
RMW, memory barriers, scalability, etc. aren't an issue.


> Proxy GC works very well within certain problem arenas. For instance,
> classic reader-writer problem is efficiently solved through clever use
> of the technique.

Proxy GC may be fantastic in its area, but unless you think that Jon's
challenge is in this area and are willing to write the code and time
it then Proxy GC is irrelevant.


> Of course a GC "can" collect f while bar is executing. The
> misleading part is that he fails to explain that "can" is predicated
> on the fact that a GC must run a cycle while bar is executing.

He probably failed to explain it because he assumed it was obvious
that the GC would have to run. I wouldn't call it misleading to
assume that someone knows a basic fact about GC.


> A small tweak to the program changes things:
>
> {
> {
> refcount<Foo> f(new Foo);
> foo(f);
> }
> bar();
> }
>
>
> Now, a naive RC WILL collect f _before_ bar has executed. Of course
> Jon said that this fact is totally irrelevant.

The reason being that most people don't write code like that from
scratch, they'd only do it once if/when they'd discovered a problem
that forced them to do it. Jon is selling the idea that if you use GC
then you'd never have to go through the problem discovery stage, the
GC would take care of it for you. Whether anyone is buying depends on
how often they see this kind of problem. It isn't high up on my list.


> AFAICT, his insults against Linux and/or Linus are based in a form of
> ignorance. Much like his attribution of several problems to _all_
> forms of RC. IMVHO, its relevant indeed.

To you, but not to me.

Chris Thomasson

May 17, 2008, 6:42:53 PM
"Stephen J. Bevan" <ste...@dino.dnsalias.com> wrote in message
news:87zlqpj...@dnsalias.com...


Fair enough.

Jerry Coffin

May 18, 2008, 2:10:29 PM
In article <874p8xk...@dnsalias.com>, ste...@dino.dnsalias.com
says...

[ ... ]

> GC and RC share one failure mode: potential to retain memory that is
> semantically free. RC (when used manually and not generated by a
> compiler) has the other which is to reclaim too early. Neither mode
> is good. But two modes of failure puts RC higher up the fragility
> scale than GC.

How would RC free memory that's still in use, short of a bug in the RC
code itself?

Stephen J. Bevan

May 24, 2008, 7:34:52 PM
Jerry Coffin <jco...@taeus.com> writes:
> In article <874p8xk...@dnsalias.com>, ste...@dino.dnsalias.com
> says...
>
> [ ... ]
>
>> GC and RC share one failure mode: potential to retain memory that is
>> semantically free. RC (when used manually and not generated by a
>> compiler) has the other which is to reclaim too early. Neither mode
>> is good. But two modes of failure puts RC higher up the fragility
>> scale than GC.
>
> How would RC free memory that's still in use, short of a bug in the RC
> code itself?

This isn't an RC bug per se; rather, it is a bug inherent to all
manual memory (de)allocation methods: human error in writing the
program such that, in some context, an extra put/release is called,
which causes memory to be freed while another part of the program still
thinks it has a (final) reference to the object and so can access it
safely.
