Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

how to learn the BIBOP

207 views
Skip to first unread message

Marco Maggi

unread,
Aug 6, 2012, 2:09:48 AM8/6/12
to
Ciao,

to gain control of Vicare Scheme's source code I have to
learn how the BIBOP garbage collection algorithm works;
however, I cannot find papers on the Net describing the
actual algorithm in detail, in a way accessible to "non
academics".

There are papers of course[1], but it seems to me they all
take for granted that the reader already knows the
algorithm. Can someone suggest a suitable paper available
on the Net? (Unfortunately I have no access to a University
library.)

TIA

[1] Dybvig et al., "Don't stop the BIBOP".
--
Marco Maggi

Luca Saiu

unread,
Aug 6, 2012, 6:24:13 AM8/6/12
to
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 2012-08-06 at 08:09, Marco Maggi wrote:

> There are papers of course[1], but it seems to me they all
> take for granted that the reader already knows the
> algorithm. Can someone suggest a suitable paper available
> on the Net? (Unfortunately I have no access to a University
> library.)

There are many variants, but the basic idea is pretty simple. I think
it was first described in the MIT AI Memo 25, "Data Representations in
PDP-10 MACLISP": http://dspace.mit.edu/handle/1721.1/6278 .

Maybe you can find useful the first part of this draft of mine as well:
http://www-lipn.univ-paris13.fr/~saiu/papers/gc-draft.pdf

Regards,

- --
Luca Saiu
Home page: http://www-lipn.univ-paris13.fr/~saiu
My blog: http://ageinghacker.net/blog
GNU epsilon: http://www.gnu.org/software/epsilon
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)

iEYEARECAAYFAlAfm00ACgkQvzOavibF0obMbQCggbDB9rY633jzLmG/ICVxYQGk
6M0AoKWqcw8EVyBgDig4LmeFnlIe0Agm
=OmhW
-----END PGP SIGNATURE-----

John Cowan

unread,
Aug 6, 2012, 8:09:53 AM8/6/12
to
On Monday, August 6, 2012 2:09:48 AM UTC-4, Marco Maggi wrote:

> to gain control of Vicare Scheme's source code I have to
> learn how the BIBOP garbage collection algorithm works;

It's not a GC algorithm but an allocation algorithm. The idea is that you divide memory into pages of fixed size (aligned with hardware pages or not), and each page is used to allocate objects of the same dynamic type. For example, page 12043 may be used to allocate only pairs, and page 12044 may be used to allocate only bignums.

In this way, the GC can determine the dynamic type of a pointer by right-shifting it by the logarithm of the page size, and then using the result to index a small table of pointers to (or codes for) types.

Note well that there is no requirement that all entries of a given type fit on a single page.

I hope that helps.

--
John Cowan

George Neuner

unread,
Aug 6, 2012, 2:01:34 PM8/6/12
to
On Mon, 06 Aug 2012 08:09:48 +0200, Marco Maggi <ma...@maggi.invalid>
wrote:

>Ciao,
>
> to gain control of Vicare Scheme's source code I have to
>learn how the BIBOP garbage collection algorithm works;
>however, I cannot find papers on the Net describing the
>actual algorithm in detail, in a way accessible to "non
>academics".

As John Cowan said, BiBOP is not a GC algorithm. The original purpose
of BIBOP was to permit full width untagged immediates, and small
structured objects such as pairs, by shifting the type tags into the
object pointers. In practice, a separate logical heap is created for
each type that is intended to be discriminated by address.

Sometimes in BiBOP implementations only built-in types or only fixed
sized types are segregated while everything else is put into a common
heap. GC with BiBOP is similar to generational GC because the heaps
are globally interconnected.

> There are papers of course[1], but it seems to me they all
>take for granted that the reader already knows the
>algorithm. Can someone suggest a suitable paper available
>on the Net? (Unfortunately I have no access to a University
>library.)
>
>[1] Dybvig et al., "Don't stop the BIBOP".

I suggest a book instead:
Jones and Lins, "Garbage Collection: Algorithms for Automatic
Dynamic Memory Management", ISBN 0-471-94148-4.

http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484

You can spend a lifetime reading academic papers on the web ... this
book effectively is THE REFERENCE on GC and is better suited for
teaching a novice.


>TIA
Hope this helps,
George

Marco Maggi

unread,
Aug 6, 2012, 2:56:55 PM8/6/12
to
All right, thanks to everyone.

George Neuner wrote:

> As John Cowan said, BiBOP is not a GC algorithm.

I thought that value allocation, value collection and memory
page handling were deeply entangled. However, I do not know
enough to draw the line that separates implementation
specific choices from known algorithms definitions. But, it
is interesting that BIBOP is not concerned with GC; maybe it
means that I just need to put more effort into reading
Dybvig's paper.

> Sometimes in BiBOP implementations only built-in types or
> only fixed sized types are segregated while everything
> else is put into a common heap. GC with BiBOP is similar
> to generational GC because the heaps are globally
> interconnected.

He! He! To understand this paragraph one has to know the
algorithms. ;-) I already understand how individual values
are tagged and allocated in Ikarus/Vicare (each page can
contain values of any type), what I do not understand is how
memory pages are classified, recycled, whatever, and what
the hell is a dirty vector?

> You can spend a lifetime reading academic papers on the
> web ...

Yes...

> I suggest a book instead:
> Jones and Lins, "Garbage Collection: Algorithms for Automatic
> Dynamic Memory Management", ISBN 0-471-94148-4.
> [...]
> this book effectively is THE REFERENCE on GC and is better
> suited for teaching a novice.

Thanks, unfortunately it is out of my reach.
--
Marco Maggi

George Neuner

unread,
Aug 7, 2012, 1:13:38 AM8/7/12
to
On Mon, 06 Aug 2012 20:56:55 +0200, Marco Maggi <mrc...@gmail.com>
wrote:

>All right, thanks to everyone.
>
>George Neuner wrote:
>
>> As John Cowan said, BiBOP is not a GC algorithm.
>
>I thought that value allocation, value collection and memory
>page handling were deeply entangled. However, I do not know
>enough to draw the line that separates implementation
>specific choices from known algorithms definitions. But, it
>is interesting that BIBOP is not concerned with GC; maybe it
>means that I just need to put more effort into reading
>Dybvig's paper.

Allocation and collection are entangled, but VMM is an orthogonal
dimension ... some systems are purely software and do not deliberately
invoke VMM mechanisms at all.

At the simplest, a BiBOP heap is simply a contiguous address range
that stores objects of a single type. The complexity comes in how
each individual heap is managed.

For example, you might want to compact heaps to reduce the VMM working
set ... but do you slide the objects in order towards one end of the
heap or do you rearrange the order? Do you just fill in allocation
holes or do you sort the objects in some way? Do you want lists or
trees of linked objects to be contiguous in memory? What's the best
sort order - depth first, breadth first, allocation order, reversed of
any of these? Does it make sense to generically handle fixed size and
variably sized objects in the same way or do you want to special case
for performance?

What kind of latency can your applications tolerate? Are you going to
halt the mutator for the duration or let it continue to run during GC?
Are there multiple mutators (threads)? How and what do they share?



>> Sometimes in BiBOP implementations only built-in types or
>> only fixed sized types are segregated while everything
>> else is put into a common heap. GC with BiBOP is similar
>> to generational GC because the heaps are globally
>> interconnected.
>
>He! He! To understand this paragraph one has to know the
>algorithms. ;-) I already understand how individual values
>are tagged and allocated in Ikarus/Vicare (each page can
>contain values of any type), what I do not understand is how
>memory pages are classified, recycled, whatever, and what
>the hell is a dirty vector?

Sounds like a 2 stage allocator. But if so, it exists somewhere in
that orthogonal dimension I mentioned above ... it doesn't have
anything to do with the GC algorithm per se, but has to do with the
handling of pages that are found to be empty afterward.

The problem is that - even with really excellent code documentation -
you are going to have to learn some GC theory - and maybe also some
VMM mechanics - to know what to look for and to know if and how it can
be modified. There are multiple approaches to GC, each with numerous
variants, and virtually all practical systems are hybrids that employ
elements of more than one approach.


>> I suggest a book instead:
>> Jones and Lins, "Garbage Collection: Algorithms for Automatic
>> Dynamic Memory Management", ISBN 0-471-94148-4.
>> [...]
>> this book effectively is THE REFERENCE on GC and is better
>> suited for teaching a novice.
>
>Thanks, unfortunately it is out of my reach.

I can try to pull together a study packet for you. Might take a
couple of days as I don't have some of the fundamental papers handy in
digital form (the hazard of stocked bookshelves 8-). I presume your
email works?

George

Marco Maggi

unread,
Aug 7, 2012, 2:27:34 AM8/7/12
to
George Neuner wrote:
> For example, you might want to [...]

Right now I do not want to change Vicare's GC algorithm; I
just want to understand it and change the code to make it
more readable and comprehensible to people at my skill
level. It is a single-threaded process and while the GC is
running nothing else happens; it is my understanding that it
is similar to the algorithm described in Dybvig's paper.

> The problem is that - even with really excellent code
> documentation - you are going to have to learn some GC
> theory - and maybe also some VMM mechanics - to know what
> to look for and to know if and how it can be modified.

I acknowledge this.

> I can try to pull together a study packet for you.

Thanks! But do not put too much effort in this, I have to
do the hard work by myself. :-) I can be contacted from:
<http://github.com/marcomaggi/>.
--
Marco Maggi

Vitaly Magerya

unread,
Aug 7, 2012, 7:22:48 AM8/7/12
to
Marco Maggi wrote:
> to gain control of Vicare Scheme's source code I have to
> learn how the BIBOP garbage collection algorithm works;

Here is a complete list of papers any novice should read to get
a solid understanding of automatic memory management:

* "Uniprocessor Garbage Collection Techniques" by Paul R. Wilson [1].

* "Dynamic Storage Allocation: A Survey and Critical Review" by
Paul R. Wilson, Mark S. Johnstone, Michael Neely and David
Boles [2].

* "Representing Type Information in Dynamically Typed Languages"
by David Gudeman [3].

Neither of these focus on BIBOP specifically, but after reading
them a few times you'll be able to parse papers like Dybvig's
"Don't stop the BIBOP" without much sweat. You'll also see that
BIBOP is not an isolated technique, but rather one of the tricks
you can employ.

I very much recommend you to take the time to understand all of
the above papers (skip the third one if you know how a tagged
pointer works; read it first otherwise).

Specifically about the GC that Vicare/Ikarus uses, you might
find the work of Hans Boehm useful: his widely popular conservative
garbage collector uses BIBOP-like technique too (the fact that
it is conservative does not change much). See Boehm's "Fast
Multiprocessor Memory Allocation and Garbage Collection" [4] and
"Reducing Garbage Collector Cache Misses" [5]. These papers
aren't as detailed as you'd hope, but they do provide useful
details on memory organization and the collection algorithm.

[1] ftp://ftp.cs.utexas.edu/pub/garbage/bigsurv.ps
[2] ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps
[3] ftp://ftp.cs.indiana.edu/pub/scheme-repository/doc/pubs/typeinfo.ps.gz
[4] http://www.hpl.hp.com/techreports/2000/HPL-2000-165.pdf
[5] http://www.hpl.hp.com/techreports/2000/HPL-2000-99.pdf

George Neuner

unread,
Aug 7, 2012, 3:31:55 PM8/7/12
to
On Tue, 07 Aug 2012 08:27:34 +0200, Marco Maggi <ma...@maggi.invalid>
wrote:

>George Neuner wrote:
>> For example, you might want to [...]
>
>Right now I do not want to change Vicare's GC algorithm;

I get that. My point was simply to illustrate that there are many
dimensions that you might encounter in a real world system.


>I just want to understand it and change the code to make it
>more readable and comprehensible to people at my skill
>level. It is a single-threaded process and while the GC is
>running nothing else happens; it is my understanding that it
>is similar to the algorithm described in Dybvig's paper.

Dybvig's paper describes a collector which is both generational and
adapted to separated BiBOP type heaps. Moreover, the BiBOP model it
describes is not the classic model but a dynamic variant. In other
words, Dybvig's system is a hybrid in multiple dimensions.

To understand Dybvig's system, you need foundation:

- the tri-color abstraction
- read/update barriers
- remember sets
- book/page/card marking (dirty vectors)

some algorithms:

- 2 level allocation
- copying collection
- generational copying collection
and maybe
- mark-compact collection (large objects)

and some miscellanea:

- Unix memory management API
- VMM page protection, mapping & fault handling


>> The problem is that - even with really excellent code
>> documentation - you are going to have to learn some GC
>> theory - and maybe also some VMM mechanics - to know what
>> to look for and to know if and how it can be modified.
>
>I acknowledge this.
>
>> I can try to pull together a study packet for you.
>
>Thanks! But do not put too much effort in this, I have to
>do the hard work by myself. :-)

I don't mind. Trust me, you *will* do the hard work 8-)

I see Vitaly gave you some pointers. Wilson's paper is an excellent
survey of general technique, but it does assume you know the
foundation material.

The problem is that most papers assume you know foundations and only
talk about how the presented algorithms satisfy (or don't) the
foundation requirements - which usually are mentioned only in passing.
The vast majority of papers are presenting new variants or new
combinations of algorithms and will assume that you are familiar with,
at least, the classic versions.


Another problem is that a great many important papers were originally
published in the journals of the ACM. ACM retains the rights to them
and they can be hard to get hold of unless you are a member (I am) or
at university. Some authors make available draft versions of their
ACM published papers, but you frequently have to dig some to find
them.

Some places to look:

Rick Jones's biblio/collection:
http://www.cs.kent.ac.uk/people/staff/rej/gc.html

Henry Baker's biblio/collection:
http://home.pipeline.com/~hbaker1/

Edsger Dijkstra's collection:
http://www.cs.utexas.edu/users/EWD/

the biblio at:
http://www.memorymanagement.org/bib/

and poke around on Citeseer
http://citeseerx.ist.psu.edu
http://citeseer.ist.psu.edu/stats/articles


George

Marco Maggi

unread,
Aug 8, 2012, 1:46:18 AM8/8/12
to
Vitaly Magerya wrote:
> Here is a complete list of papers any novice should read
> to get a solid understanding of automatic memory
> management:

Thanks to everyone again; I now have plenty to study.
--
Marco Maggi
0 new messages