GC

Jon Harrop

unread,

Mar 26, 2009, 1:49:39 PM3/26/09

to unladen...@googlegroups.com

The project plan indicates that you would like to incorporate an existing
conservative GC into this project. May I ask why you are not aiming to
provide precise GC instead?

--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

Talin

unread,

Mar 26, 2009, 1:54:21 PM3/26/09

to unladen...@googlegroups.com

Precise collection is the goal - I'm not finding anywhere in the project plan that mentions conservative collection, if there is it probably needs to be fixed.

-- Talin

Jon Harrop

unread,

Mar 26, 2009, 5:01:10 PM3/26/09

to unladen...@googlegroups.com

On Thursday 26 March 2009 17:54:21 Talin wrote:
> Precise collection is the goal - I'm not finding anywhere in the project
> plan that mentions conservative collection, if there is it probably needs
> to be fixed.

My mistake: I got a link from a comment on the project plan and not from the
plan itself:

http://www.opensource.apple.com/darwinsource/10.5.5/autozone-77.1/

It refers to a conservative GC.

I recently added a precise GC to my HLVM project that is built upon LLVM. I
see that your overall objectives are as pragmatic as my own so it would
probably be very constructive to discuss designs.

My current GC is non-moving non-compacting non-incremental non-generational
mark and sweep using calloc and free directly and a shadow stack to record
local roots. I wrote it in a couple of days. The only unusual aspect is that
the code to traverse a value type is generated and JIT compiled on-the-fly as
new value types are encountered (e.g. tuples). That allowed me to use a
non-uniform representation and avoid boxing whenever possible which is
essential if I am to obtain optimal performance for numeric and string
processing (which is one of my main goals). It also means that HLVM jumps
directly to values of reference types rather than traversing every word at
run-time as OCaml's GC must.

Many people including the developers of Haskell compilers and Xavier Leroy
(lead dev of OCaml) asserted that such an approach would be extremely slow
and that using the system stack via stack maps (i.e. LLVM's GC intrinsics) is
essential in order to obtain usable performance. Consequently, I did not
expect to get within an order of magnitude of the most heavily optimized
single-threaded moving incremental generational GCs such as OCaml's. However,
HLVM is actually only 2.15x slower than ocamlopt on an allocation- and
GC-intensive list-based 10-queens solver.

May I ask what your plan is with regard to preserving the deterministic
collection of objects in today's Python? Will you break backward
compatibility in that respect?

Regards,

Talin

unread,

Mar 26, 2009, 5:13:40 PM3/26/09

to unladen...@googlegroups.com

Well, what I am working on at the moment is a general framework for accurate GC that is based on the general idea of C++ policy-based design. This means that every aspect of the GC - synchronization primitives, tracing strategies, error reporting, and so on - can all be replaced by substituting the appropriate template parameter.

The idea is to allow maximum compatibility with various VMs - each runtime can supply its own definition of atomic vars, mutexes, error reporting and logging, and so on.

The idea is inspired by MMTk, which is the GC framework for JikesRVM. The difference is that MMTk is written in Java and uses Java-centric design patterns for customization and extension.

This project (which I have named "Scarcity") is separate from unladen-swallow, but closely related to it. The reason for the separation is that I plan to make Scarcity useful for languages other than Python.

At the moment, I have the following pieces:

-- A heap manager and freelist allocator, using similar algorithms as the popular dlmalloc allocator, but tweaked in various ways to allow for integration with a collector (such as an efficient API for iteration over the heap for sweeping, and plenty of reserved bits in the allocation header for use by the GC)

-- Various utility classes for error logging, performancing timing, and so forth.

-- Extensive unit tests for all of the above.

At the moment I am working on sketching out the general framework for collector / mutator synchronization, and the various policy classes for low-level synchronization primitives (mutexes, barriers, latches, and so forth.)

-- Talin

Thomas Wouters

unread,

Mar 26, 2009, 5:21:57 PM3/26/09

to j...@ffconsultancy.com, unladen...@googlegroups.com

On Thu, Mar 26, 2009 at 22:01, Jon Harrop <j...@ffconsultancy.com> wrote:

May I ask what your plan is with regard to preserving the deterministic
collection of objects in today's Python? Will you break backward
compatibility in that respect?

I think the consensus is that we can't have deterministic collection of objects and performance in the face of multiple (POSIX) threads. CPython is currently the only Python implementation that has deterministic collection anyway, and barring discoveries of completely unprecedented ways of doing GC, we'll certainly get rid of it in Unladen Swallow.

--
Thomas Wouters <twou...@google.com>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

Nathan

unread,

Apr 14, 2009, 3:17:33 AM4/14/09

to Unladen Swallow

Just wanted to point out that AutoZone linked above could be
beneficial for the PyObjC project, as it is what Objective-C and
MacRuby use (as you know, MacRuby is similarly incorporating LLVM in
the next 0.50 release).

http://www.opensource.apple.com/darwinsource/10.5.5/autozone-77.1/README.html

Whether it's the fastest, sufficiently cross-platform, and whether it
meets your other requirements, I couldn't say. All this is way over my
head, just wanted to add my 2 cents, and say thanks for taking on this
project. ^_^

Talin

unread,

Apr 14, 2009, 1:10:53 PM4/14/09

to unladen...@googlegroups.com

Autozone is a conservative collector, and I think we would prefer an accurate (or precise, depending on your terminology) collector. In particular, I'm interested in something that will operate in conjunction with LLVM's various hooks for accurate collection.

-- Talin

Jeffrey Yasskin

unread,

Apr 14, 2009, 1:35:36 PM4/14/09

to ta...@google.com, unladen...@googlegroups.com

We're not actually religious about having an accurate vs conservative
collector. We're likely to use any collector that's ready when we need
it and that does what we need (at least eventual finalization, weak
references, and enough speed to let us kill the GIL). It's likely that
an accurate collector is the best long-term strategy, and we'll try
hard to avoid making any permanent commitments to a particular
collector, but if a conservative collector is good enough and easiest
to integrate when we need it, we'll use it.

Reply all

Reply to author

Forward