Confusion regarding 'mark-sweep-compact' naming


Peter Veentjer

Aug 14, 2017, 1:47:01 AM
to mechanical-sympathy
I have been improving my GC knowledge and there is some confusion on my side regarding the 'mark-sweep-compact' algorithm I frequently see mentioned in posts, articles, and some not-too-formal books on the topic. E.g.

https://plumbr.eu/handbook/garbage-collection-algorithms/removing-unused-objects/compact
https://www.infoq.com/news/2015/12/java-garbage-collection-minibook

AFAIK mark-sweep-compact is not correct. It is either mark-sweep, or mark-compact, where the dead objects are removed as part of the compacting process. When I check more formal literature like "The Garbage Collection Handbook: The Art of Automatic Memory Management", there is no mention of mark-sweep-compact.

So I guess that when 'mark-sweep-compact' is mentioned, they are actually referring to 'mark-compact'. Is this assumption correct?

Kirk Pepperdine

Aug 14, 2017, 2:45:09 AM
to mechanica...@googlegroups.com
Hi,


> On Aug 14, 2017, at 7:47 AM, Peter Veentjer <alarm...@gmail.com> wrote:
>
> I have been improving my GC knowledge and there is some confusion on my side regarding the 'mark-sweep-compact' algorithm I frequently see mentioned in posts, articles, and some not-too-formal books on the topic. E.g.
>
> AFAIK mark-sweep-compact is not correct. It is either mark-sweep, or mark-compact, where the dead objects are removed as part of the compacting process. When I check more formal literature like "The Garbage Collection Handbook: The Art of Automatic Memory Management", there is no mention of mark-sweep-compact.
>
> So I guess that when 'mark-sweep-compact' is mentioned, they are actually referring to 'mark-compact'. Is this assumption correct?

The terminology around this is very loose, so the very short answer is… sort of. So, here is a very terse but hopefully useful summary of collectors.

When we talk about mark-sweep we are generally referring to a class of garbage collectors known as tracing collectors. The actual details of how the mark and the sweep are implemented in each of the different tracing collectors depend upon a couple of factors. One of the most important factors is how the memory pool(s) (the data structures in heap space) are organized. I tend to label a collector as either “in-place” or “evacuating”. The most trivial examples of each: if you only have a single memory pool, you will be performing an “in-place” collection, as the data remains in that memory pool at the end of the collection. If you have two spaces, you can set this up as what is known as a hemispheric collection. In this case you mark the data in the currently active memory pool and then “evacuate” it to the inactive pool. When complete, active becomes inactive (thus empty) and inactive becomes active. Wash, rinse, repeat…

In-place collections need a free-list and need to somehow deal with fragmentation (of the free-list). Thus they compact. When and how often is an implementation detail. Evacuating collectors use a top-of-heap pointer (and other more complex tricks) that acts as the free list. Since they evacuate to another pool, the data is naturally compacted, so no separate compaction is necessary. As you can imagine, the time/space cost models for these two general approaches are quite different.
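
To make the contrast concrete, here is a sketch of the two allocation paths (in Java, purely for illustration; the names and the first-fit policy are mine, not any particular JVM's):

/** The two allocation styles described above (all names hypothetical). */
final class Allocators {

    // Evacuating spaces: free memory is one contiguous run, so allocation
    // is a bump of the top-of-space pointer plus a limit check.
    long top, end;

    long allocBump(long size) {
        if (top + size > end) return -1;      // space full: time to evacuate
        long a = top;
        top += size;
        return a;
    }

    // In-place spaces: free memory is whatever holes the collector found,
    // kept on a free-list. First-fit search here; fragmentation can make
    // this fail even when total free space is sufficient, hence the
    // periodic compaction.
    static final class Hole { long addr, size; Hole next; }
    Hole freeList;

    long allocFirstFit(long size) {
        Hole prev = null;
        for (Hole h = freeList; h != null; prev = h, h = h.next) {
            if (h.size < size) continue;
            long a = h.addr;
            h.addr += size;                   // carve the allocation from the hole
            h.size -= size;
            if (h.size == 0) {                // hole exhausted: unlink it
                if (prev == null) freeList = h.next; else prev.next = h.next;
            }
            return a;
        }
        return -1;                            // no single hole fits: compact or collect
    }
}

(The bump path is also why evacuating spaces allocate so cheaply: a couple of instructions, versus a list walk plus fragmentation worries for the in-place path.)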

In OpenJDK we have both generational and regional collectors. The generational collector divides memory into 4 memory pools named Eden, Survivor 0, Survivor 1, and Tenured space. The first three are grouped into young space and Tenured into old space. In young we can use mark-evacuation, whereas in old we are forced to use mark-sweep-compact. I’m going to claim (I’m sure Gil will want to chime in here) that G1, C4, Balanced and Shenandoah are regional collectors in that they divide the heap into a large number of memory pools (regions). G1 aims for 2048 memory pools. In all 4 cases, the collectors are very, very different variants of mark-evacuation, each with its own quirks. The advantage of using regions is that you only need an evacuating collector, as you don’t end up with a terminal pool (tenured) where you are forced to use an in-place collector. If you are careful in how you select the regions to be recovered, you can realize some nice wins over generational collectors. However, these collectors tend to be far more complex and harder on your electron budget than generational collectors are.

Gil Tene

Aug 14, 2017, 11:11:38 AM
to mechanical-sympathy
I've taken to using Mark-Sweep-Compact, Mark-Sweep (no Compact), and Mark-Compact (no sweep) to describe the variants, because you will find all of them in the wild these days.

The confusion starts with the fact that some algorithms that people often refer to as "Mark-Sweep" are actually Mark-Sweep-Compact, while others are purely Mark-Sweep (with no compaction). Mark-Sweep with no compaction usually means that free memory is recycled in place, with no object movement. A good example of this in practice is the CMS collector in the HotSpot JVM, which tracks unused memory in the oldgen in some form of free lists, but does not move any objects in the oldgen to compact it. Mark-Sweep *with* compaction (Mark-Sweep-Compact) obviously does compact. It usually implies the use of the knowledge established during a sweep (e.g. a list of the empty ranges) to compact the space "in place". It also usually involves a complete fixup of all references in all live objects. E.g. a simplistic description of a common Mark-Sweep-Compact technique would involve: (A) mark the live objects in the heap, (B) sweep the heap to find the empty ranges, (C) move all live objects to one side of the heap (filling in the empty spaces), (D) linearly scan all live objects and fix up any references in them to point to the correct target location. A good example of this in practice is the ParallelGC and SerialGC collectors in the HotSpot JVM.
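
To make (A) through (D) concrete, here is a toy, runnable sketch over a word-addressed heap (the layout and all names are invented for illustration; they don't correspond to HotSpot's actual data structures):

/** Toy stop-the-world Mark-Sweep-Compact. An object is [header, fields...];
 *  the header holds the object's size in words (its sign bit doubles as the
 *  mark bit); each field is the heap index of another object, or -1 for null. */
final class ToyMarkSweepCompact {
    static final int NULL = -1;
    final int[] heap;
    final int[] forwarding;  // post-compaction address, indexed by old address
    int top;                 // first free word

    ToyMarkSweepCompact(int words) { heap = new int[words]; forwarding = new int[words]; }

    int alloc(int fields) {  // bump allocation, just to build a test heap
        int a = top;
        heap[a] = fields + 1;
        top += fields + 1;
        java.util.Arrays.fill(heap, a + 1, top, NULL);
        return a;
    }

    int size(int a)       { return heap[a] & Integer.MAX_VALUE; }
    boolean marked(int a) { return heap[a] < 0; }

    // (A) Mark: trace from the roots (recursion stands in for a mark stack).
    void mark(int a) {
        if (a == NULL || marked(a)) return;
        heap[a] |= Integer.MIN_VALUE;
        for (int i = 1; i < size(a); i++) mark(heap[a + i]);
    }

    void collect(int[] roots) {
        for (int r : roots) mark(r);

        // (B) Sweep: stride over EVERY object, dead or alive, parsing each
        // header for its size, and record each live object's target address.
        // This pass is linear in heap size, not in live-set size.
        int free = 0;
        for (int a = 0; a < top; a += size(a))
            if (marked(a)) { forwarding[a] = free; free += size(a); }

        // (C) Compact: slide the live objects to one side of the heap.
        for (int a = 0, sz; a < top; a += sz) {
            sz = size(a);
            if (marked(a)) {
                heap[a] &= Integer.MAX_VALUE;                 // clear the mark
                System.arraycopy(heap, a, heap, forwarding[a], sz);
            }
        }
        top = free;

        // (D) Fixup: linearly scan the now-compacted live objects (and the
        // roots), redirecting every reference through the forwarding table.
        for (int a = 0; a < top; a += size(a))
            for (int i = 1; i < size(a); i++)
                if (heap[a + i] != NULL) heap[a + i] = forwarding[heap[a + i]];
        for (int i = 0; i < roots.length; i++)
            if (roots[i] != NULL) roots[i] = forwarding[roots[i]];
    }
}

Note how (B) is the only pass that must parse dead objects, which is exactly what makes its cost linear in heap size rather than in the live set.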

Similarly, more confusion is caused by the fact that some algorithms that people often refer to as "Mark-Compact" are actually "Mark-Sweep-Compact", while others are purely Mark-Compact (no sweep). Again, a good example of the first in common use would be ParallelGC in the HotSpot JVM (and the Wikipedia entry https://en.wikipedia.org/wiki/Mark-compact_algorithm). Good examples of the latter in common use can be found in various regional evacuating collectors (like C4, G1, etc.), where there is no sweep per se (live objects are evacuated to outside of the originally marked heap, and there is some mechanism for finding/tracking them to do so, but that mechanism is not the traditional sweep used by the Mark-Sweep style of collectors). Fixup behavior in such algorithms can also vary quite a bit (e.g. C4 normally uses the next cycle's Mark to perform fixups for the previous Compact's relocated objects, and as such does not require a scanning fixup pass).

Another key difference often pointed to is that with in-place compaction (Mark-Sweep-Compact), no additional empty memory is required to guarantee full compaction, while regional evacuating collectors need to evacuate into some empty memory, so their ability to compact may be limited by the amount of empty memory available outside of the currently occupied heap regions. However, several well documented regional collector algorithms (e.g. Pauseless, C4, Compressor) perform hand-over-hand compaction to guarantee forward progress in evacuation, such that a single empty region is sufficient to achieve full compaction in a single pass.

Gil Tene

Aug 14, 2017, 11:30:26 AM
to mechanica...@googlegroups.com


On Sunday, August 13, 2017 at 11:45:09 PM UTC-7, Kirk Pepperdine wrote:

The terminology around this is very loose, so the very short answer is… sort of. So, here is a very terse but hopefully useful summary of collectors.

When we talk about mark-sweep we are generally referring to a class of garbage collectors known as tracing collectors. The actual details of how the mark and the sweep are implemented in each of the different tracing collectors depend upon a couple of factors. One of the most important factors is how the memory pool(s) (the data structures in heap space) are organized. I tend to label a collector as either “in-place” or “evacuating”. The most trivial examples of each: if you only have a single memory pool, you will be performing an “in-place” collection, as the data remains in that memory pool at the end of the collection. If you have two spaces, you can set this up as what is known as a hemispheric collection. In this case you mark the data in the currently active memory pool and then “evacuate” it to the inactive pool. When complete, active becomes inactive (thus empty) and inactive becomes active. Wash, rinse, repeat…

In-place collections need a free-list and need to somehow deal with fragmentation (of the free-list). Thus they compact. When and how often is an implementation detail. Evacuating collectors use a top-of-heap pointer (and other more complex tricks) that acts as the free list. Since they evacuate to another pool, the data is naturally compacted, so no separate compaction is necessary. As you can imagine, the time/space cost models for these two general approaches are quite different.

In OpenJDK we have both generational and regional collectors. The generational collector divides memory into 4 memory pools named Eden, Survivor 0, Survivor 1, and Tenured space. The first three are grouped into young space and Tenured into old space. In young we can use mark-evacuation, whereas in old we are forced to use mark-sweep-compact. I’m going to claim (I’m sure Gil will want to chime in here) that G1, C4, Balanced and Shenandoah are regional collectors in that they divide the heap into a large number of memory pools (regions).

Agreed. These are all regional Mark-Compact evacuating collectors. But G1, Balanced and Shenandoah are oldgen-only regional collectors (or single-generation in the current Shenandoah case), while C4 uses both a regional Mark-Compact newgen collector and a regional Mark-Compact oldgen collector. As such, it gets all the nice-for-your-electron-budget benefits of generational collection along with all the nice benefits of regional collection.
 
G1 aims for 2048 memory pools. In all 4 cases, the collectors are very, very different variants of mark-evacuation, each with its own quirks. The advantage of using regions is that you only need an evacuating collector, as you don’t end up with a terminal pool (tenured) where you are forced to use an in-place collector. If you are careful in how you select the regions to be recovered, you can realize some nice wins over generational collectors. However, these collectors tend to be far more complex and harder on your electron budget than generational collectors are.

The harder-on-electron-budget thing is not algorithmically inherent to evacuating regional collectors. It usually has to do with other algorithmic choices that may be made in collectors that happen to be regional. E.g. some region-based incremental STW compactors (G1, Balanced) generally try to compact a small increment of the heap and fix up a hopefully-limited set of regions that refer to the compacted set, all in a single STW pause. These algorithms tend to rely on cross-region remembered-set tracking, which involves some interesting electron budget implications, but more importantly, because the same regions will be involved in multiple fixup passes, the resulting N^2 complexity potential can also cause the electron budget to significantly inflate. Evacuating regional collector algorithms that do not rely on cross-region remembered sets, and do not attempt to perform compaction in short incremental STW pauses (e.g. Pauseless, Compressor, C4, Shenandoah, ...), do not incur these specific extra costs.


 

Peter Veentjer

Aug 6, 2019, 1:38:31 AM
to mechanical-sympathy
There is still some confusion on my side. This time it is regarding the algorithmic complexity of mark-sweep-compact.

The complexity of the mark phase is linear to the live set.
The complexity of the sweep phase is linear to the heap size.
The complexity of the compact phase is linear to the live set.

So the complexity of mark-compact would be linear to the live set, since you only deal with live objects.

Why would you ever want to include a sweep phase? It makes the complexity of mark-sweep-compact linear to the heap size; a lot worse compared to 'linear to the live set'. Get rid of the sweep and the complexity goes down to 'linear to the live set'. What is the added value of such an inefficient GC implementation?

I'm sure there is value; but currently it doesn't compute.

Aleksey Shipilev

Aug 6, 2019, 5:41:28 AM
to mechanica...@googlegroups.com, Peter Veentjer
On 8/6/19 7:38 AM, Peter Veentjer wrote:
> There is still some confusion on my side. This time it is regarding the algorithmic complexity of
> mark-sweep-compact.
>
> The complexity of the mark phase is linear to the live set.
> The complexity of the sweep phase is linear to the heap size.
> The complexity of the compact phase is linear to the live set.

Well...

The multiplicative factors in either case might make one linearity perform better than the other on
wildly different heap sizes and live sets. The adversarial example of this is N large byte arrays of
size M. Compact might have to copy N*M bytes (especially when doing sliding compaction), and sweep
might only need to visit N locations to check on object metadata.
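
(To put rough, made-up numbers on that: with N = 1,000,000 live byte arrays of M = 8192 bytes each,
a sliding compaction may move on the order of N*M = 8 GB of payload, while the sweep inspects
about N = 1,000,000 headers; orders of magnitude fewer memory touches, despite the "worse" linearity.)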

There are second-order heap size effects for both mark and compact. Take two heaps with the same
live set, one very dense and one very sparse: the mere fact that you have to walk memory far away
adds a secondary effect of heap size. It would not linger for compacting collectors, though, as
the first few compaction cycles would collapse the heap into the denser variant.

That is all to say that algorithmic complexity is a seriously bad way to reason about (GC) performance.

> Why would you ever want to include a sweep phase? It makes the complexity of mark-sweep-compact
> linear to the heap size; a lot worse compared to 'linear to the live set'. Get rid of the sweep
> and the complexity goes down to 'linear to the live set'. What is the added value of such an inefficient
> GC implementation?

I think m-s-c nomenclature is confusing, as the actual implementations are doing either m-s or m-c
during their individual phases. The benefit of doing either is well covered.

The hard naming question is how to properly label the collector where different phases employ
different algos. For example, in CMS the young collection is copying, old collection is mark-sweep,
fall-back full GC is mark-compact. How would you label CMS then? I believe marking it m-s-c is
borderline tolerable, until users get confused thinking both sweep and compact happen in the same phase.

Also, with implementations that manage backing storage in fine-grained chunks (for example, free
lists), "sweep" might mean dealing with them on a per-dead-object basis. For example, this might mean
preparing the storage for subsequent compaction by "sweeping" out old objects before compaction
kicks in, which requires free space. Practical implementations I know of opt to instead reconstruct the
backing storage metadata after massive compaction moves, though.

My $0.02c, anyway.

-Aleksey


Peter Veentjer

Aug 14, 2019, 2:34:15 AM
to mechanical-sympathy
Thanks for your answer, Aleksey,

comments inline.

On Tuesday, August 6, 2019 at 12:41:28 PM UTC+3, Aleksey Shipilev wrote:
On 8/6/19 7:38 AM, Peter Veentjer wrote:
> There is still some confusion on my side. This time it is regarding the algorithmic complexity of
> mark-sweep-compact.
>
> The complexity of the mark phase is linear to the live set.
> The complexity of the sweep phase is linear to the heap size.
> The complexity of the compact phase is linear to the live set.

Well...

The multiplicative factors in either case might make one linearity perform better than the other on
wildly different heap sizes and live sets. The adversarial example of this is N large byte arrays of
size M. Compact might have to copy N*M bytes (especially when doing sliding compaction), and sweep
might only need to visit N locations to check on object metadata.

There are second-order heap size effects for both mark and compact. Take two heaps with the same
live set, one very dense and one very sparse: the mere fact that you have to walk memory far away
adds a secondary effect of heap size. It would not linger for compacting collectors, though, as
the first few compaction cycles would collapse the heap into the denser variant.

That is all to say that algorithmic complexity is a seriously bad way to reason about (GC) performance.

I understand. I created a bunch of flashcards to get a better conceptual understanding of the different GC algorithms;
I'll add some additional flashcards to refine my understanding.
 

> Why would you ever want to include a sweep phase? It makes the complexity of mark-sweep-compact
> linear to the heap size; a lot worse compared to 'linear to the live set'. Get rid of the sweep
> and the complexity goes down to 'linear to the live set'. What is the added value of such an inefficient
> GC implementation?

I think m-s-c nomenclature is confusing, as the actual implementations are doing either m-s or m-c
during their individual phases.

Exactly. And this was the source of my confusion, caused by an abuse of names.

 
The benefit of doing either is well covered.

The hard naming question is how to properly label the collector where different phases employ
different algos. For example, in CMS the young collection is copying, old collection is mark-sweep,
fall-back full GC is mark-compact. How would you label CMS then?

It would be easier/simpler to label the young and old collectors independently, but I understand your point.

Peter Veentjer

Aug 14, 2019, 2:44:31 AM
to mechanical-sympathy
Giving it a bit more thought:

the parallel and serial collectors are categorized as 'mark-sweep-compact' collectors.


But in both cases, the young collector is an evacuating collector while the old collectors are mark-compact algorithms.

This brings up the question: how does sweep enter the picture?

CMS could be categorized as mark-sweep-compact: it does a mark-sweep for the old generation, and in case of a concurrent mode failure it will fall back on a stop-the-world full collection which does a mark-compact. So in that case I could understand why this one would be called mark-sweep-compact; but CMS is never called mark-sweep-compact.

Gil Tene

Aug 14, 2019, 5:30:10 AM
to mechanica...@googlegroups.com
There are very real and popular collectors that do Mark, Sweep, *and* Compact, e.g. the OpenJDK oldgen collectors in both ParallelGC and SerialGC. There are very real and popular collectors that do only Mark and Sweep, e.g. the non-compacting oldgen of the CMS collector (as long as it does not fall back to the STW compacting fullGC). And there are very real and popular collectors that do only Mark and Compact (no sweep), e.g. C4, Pauseless, Compressor, and recently ZGC.

For an easy-to-follow code example of Mark, Sweep, and Compact, look at the (ahem...) psMarkSweep.cpp code for the compacting OpenJDK ParallelGC oldgen implementation, which does 4 separate passes
(http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201):

1. A tracing, pointer-chasing pass that marks the live set.
    Line 514: // Recursively traverse all live objects and mark them

2. A sweeping pass that computes the new object addresses 
    Line 579: // Now all live objects are marked, compute the new object addresses.
    (new addresses are computed to ”shift everything to one side”,
     which, when actually done in pass 4, will compact the heap in place)

3. A sweeping pass to adjust references to point to the new object addresses:
    Line 598: // Adjust the pointers to reflect the new locations
    (this is sometimes referred to as a fixup pass)

4. A compact pass that moves the objects to their previously chosen target locations.
    Line 645: // All pointers are now adjusted, move objects accordingly

This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits live objects, while passes 2-4 linearly traverse the entire heap (live or dead), going forward one object at a time, and skipping over the dead objects one by one. For at least the first sweeping pass, when you arrive at a dead object’s header, you still need to follow its type indicator to determine the object size in order to know how far to skip ahead to the next object. So you visit and follow pointer information in each and every object header in the heap, dead or alive.

In contrast, a pure mark-compact can be built to never do a “move from one object to another, including the dead ones” pass. E.g. an evacuating mark compact can use the mark pass to produce an efficient way to move from one live object to the next when evacuating and doing pointer fixup.

There are various schemes for doing this efficiently without ever visiting a dead object. E.g. in C4 we keep live marks in a bit vector with one bit per 64-bit word of heap (we mark in this bit vector, not in the object headers). When it comes time to evacuate e.g. a 2MB region, we just scan through the associated 32KB of memory that contains the live bits for that 2MB region, using a find-first-bit operation (which on many processors can be done efficiently with direct bit scan, leading zeros, or trailing zeros instructions), stopping only on set live bits, and visiting only the live objects they correspond to. The evacuating compact never looks at a dead object header or follows its klass indicator to figure out its size.
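
A sketch of that kind of scan (in Java for illustration only; the names are invented, and the real C4 code is of course different):

/** Visit only live objects via a side live-bit vector, one bit per 64-bit
 *  heap word, as described above (all names hypothetical). */
final class LiveBitScan {
    final long[] liveBits;   // bit i set => heap word i starts a live object

    LiveBitScan(int heapWords) { liveBits = new long[(heapWords + 63) >> 6]; }

    void setLive(int word) { liveBits[word >> 6] |= 1L << (word & 63); }

    /** Cost is proportional to the number of live objects plus a skim of
     *  the bitmap itself (1/64th of the region size); dead objects are
     *  never visited or parsed. */
    void forEachLive(java.util.function.IntConsumer visit) {
        for (int w = 0; w < liveBits.length; w++) {
            long bits = liveBits[w];
            while (bits != 0) {
                int b = Long.numberOfTrailingZeros(bits);  // the find-first-bit op
                visit.accept((w << 6) + b);                // start word of a live object
                bits &= bits - 1;                          // clear the lowest set bit
            }
        }
    }
}

The 2MB-region/32KB-bitmap ratio above falls straight out of this layout: one bit per 8-byte word is 1/64th of the region, and 2MB / 64 = 32KB.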

An evacuating compactor will typically fix object references (to point to their correct location) after copying all the object bodies. Since all references in the heap may potentially need fixing, all reference fields in all live objects that could possibly point to any relocated objects will need to be visited at some point to do this, and each non-null reference visited will need to potentially follow a forwarding pointer somewhere to find the object’s actual target address.

People often point to this fixup pass as a “sweep” of sorts, as it is often performed linearly. In collectors where this pass is performed on a compacted set of objects, such a “sweep” will only visit live objects, so it can still be dramatically less work than visiting all dead object headers and following their Klass pointers. 

However, with the invention of “Quick Release” (first published in the Pauseless and C4 papers, but also used since by Compressor and ZGC, see explanation in section 2.5 of the C4 paper [1]), at least some concurrent evacuating regional collectors are able to avoid having a separate fixup pass altogether. Storing forwarding pointers outside of evacuated regions (as opposed to the traditional way of storing them in or near evacuated object headers) enables the recycling of an evacuated region’s memory for new allocations and for hand-over-hand compaction without stomping the forwarding pointers. This not only allows the compaction of the entire heap with no “target” memory space needed, it enables lazy-fixup, since fixup is no longer required to occur before memory can be completely recycled.

With quick release and lazy fixup (supported by a loaded reference value barrier like LVB), fixup can be delayed as far as the next GC cycle, which means fixup work can be performed in the same pointer-chasing tracing pass that performs the marking for the next GC cycle, with some pretty big savings (eliminating an entire pass over all objects in the heap).

This means that a modern regional collector that supports concurrent evacuation can do a pure mark-compact (no sweep), and can achieve that with exactly two passes: a single mark+fixup pass (mark for this GC cycle, fixup for the previous one), and a single evacuating relocate (but no fixup) pass. Pauseless, C4, Compressor, and now ZGC all do exactly that.

Note that the collector *has* to support concurrent compaction (or at the very least concurrent fixup) to get this dramatic improvement in efficiency, as without it the application could not be allowed to run between the compacting evacuation and the eventual fixup. This is a practical example of how a concurrent compacting collector can actually be built to be more CPU-efficient than a stop-the-world compacting collector, and a busting of the myth that concurrency always comes at a cost. Because of Quick Release, a concurrent collector can fundamentally beat a stop-the-world collector in efficiency.

[1] C4: The Continuously Concurrent Compacting Collector. ISMM’11 https://www.azul.com/files/c4_paper_acm1.pdf


 

Gil Tene

Aug 14, 2019, 5:59:49 AM
to mechanica...@googlegroups.com



On Aug 13, 2019, at 11:44 PM, Peter Veentjer <pe...@hazelcast.com> wrote:

Giving it a bit more thought:

the parallel and serial collectors are categorized as 'mark-sweep-compact' collectors.


But in both cases, the young collector is an evacuating collector while the old collectors are mark-compact algorithms.

The old collector and the young collector are often completely different collectors, which can (and quite often do) use completely different collector algorithms. There is no point in trying to mix the separate collectors and collector algorithms and classify them into one category.

There is also no point in trying to rationalize how the OpenJDK newgen collectors (all of which are copying collectors) fit into the Mark/Sweep/Compact categorization. They don’t.

Copying collection is a completely different category. It is a form of evacuating collection, and it does use a tracing pass. But it (generally) performs the entire collection in a single tracing pass (there is no separate mark, sweep, and/or compact), and is linear only to the live set, which makes it fundamentally efficient in very sparsely populated heaps, and an obvious fit for the youngest generation of a multi-generation collector (as the generational hypothesis is still nearly-universally applicable, and sparseness is extremely common, likely, and easy to achieve).

Its “downsides” include needing an empty to-space that is as big as the from-space to be able to run safely, since the from-space needs to be completely evacuated once evacuation starts, and the cool single-pass efficient copy thing cannot know (before starting) how much of the from-space is alive. This (2x heap size needed) makes copying collection a poor fit for the oldest generation of a collector (multi-generational or not).

Newgen collectors that are copying collectors overcome the downsides by being able to promote into oldgen as a bail-out mechanism (if the to-space is not large enough to hold the newgen live set). And as a result of this oldgen safety net, the to-space in newgen collectors is typically configured to be a small fraction of the worst case they’d have to keep around if they had nowhere else to put stuff.

Note that there are variants of copying collectors that are “abortable” in the sense that the collection can safely stop without completely evacuating the from-space. But the variants I’ve seen that do that are less efficient (I don’t think I’ve seen a single-pass variant that is abortable).
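
For reference, here is a minimal single-pass (Cheney-style) copying collector over a toy heap, to make the "entire collection in one tracing pass" point concrete (the layout and names are invented for illustration):

/** Toy semispace copying collector. An object is [fieldCount, fields...];
 *  a field is a heap index or -1 for null. A negative header means the
 *  object was already evacuated and encodes its forwarding address. */
final class ToySemispace {
    static final int NULL = -1;
    private int[] from, to;
    private int top;     // bump pointer in the active space
    private int toTop;   // bump pointer in to-space during a collection

    ToySemispace(int words) { from = new int[words]; to = new int[words]; }

    int alloc(int fields) {            // allocation is just a pointer bump
        int a = top;
        from[a] = fields;
        top += fields + 1;
        java.util.Arrays.fill(from, a + 1, top, NULL);
        return a;
    }

    /** Copy one object into to-space, or return its forwarding address. */
    private int evacuate(int a) {
        if (a == NULL) return NULL;
        int header = from[a];
        if (header < 0) return -header - 1;          // already moved: follow it
        int dst = toTop;
        System.arraycopy(from, a, to, dst, header + 1);
        toTop += header + 1;
        from[a] = -dst - 1;                          // leave a forwarding pointer
        return dst;
    }

    /** One Cheney pass: no mark, no sweep, no separate fixup. Only live
     *  objects are ever touched, so the cost is linear in the live set. */
    void collect(int[] roots) {
        toTop = 0;
        for (int i = 0; i < roots.length; i++) roots[i] = evacuate(roots[i]);
        // 'scan' chases the wavefront of copied objects, copying and fixing
        // their fields as it goes, until it catches up with the copying.
        for (int scan = 0; scan < toTop; scan += to[scan] + 1)
            for (int i = 1; i <= to[scan]; i++)
                to[scan + i] = evacuate(to[scan + i]);
        int[] t = from; from = to; to = t;           // flip the hemispheres
        top = toTop;
    }
}

Note that nothing here ever looks at a dead object, and also that the from-space has to be fully evacuated once collect() starts, which is exactly the "need an equally sized empty to-space" downside above.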


Peter Veentjer

Aug 14, 2019, 8:18:24 AM
to mechanical-sympathy
Hi Gil,

thanks for your detailed answer.

So mark-sweep-compact implementations really do exist. I'll need to update a set of flashcards.

Aleksey Shipilev

Aug 14, 2019, 1:40:21 PM
to mechanica...@googlegroups.com, Gil Tene
On 8/14/19 11:30 AM, Gil Tene wrote:
> For an easy-to-follow code example of Mark, Sweep, and Compact, look at the (ahem...) psMarkSweep.cpp
> code for the compacting OpenJDK ParallelGC oldgen implementation, which does 4 separate passes (
> http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/7576bbd5a03c/src/share/vm/gc_implementation/parallelScavenge/psMarkSweep.cpp#l201 ):
>
> 1. A  tracing, pointer chasing pass that marks the live set.
>     Line 514: // Recursively traverse all live objects and mark them
>
> 2. A sweeping pass that computes the new object addresses 
>     Line 579: // Now all live objects are marked, compute the new object addresses.
>     (new addresses are computed to ”shift everything to one side”,
>      which, when actually done in pass 4, will compact the heap in place)
>
> 3. A sweeping pass to adjust references to point to the new object addresses:
>     Line 598: // Adjust the pointers to reflect the new locations
>     (this is sometimes referred to as a fixup pass)
>
> 4. A compact pass that moves the objects to their previously chosen target locations.
>     Line 645: // All pointers are now adjusted, move objects accordingly
>
> This is a classic Mark, Sweep, and Compact collector. Pass 1 only visits live objects, while passes
> 2-4 linearly traverse the entire heap (live or dead), going forward one object at a time, and
> skipping over the dead objects one by one.

I think the descriptions of steps (2) and (3) seriously stretch the definition of "sweep". If we
creatively redefine it to mean any (linear) heap walk, then most (all, except for copying?)
collectors should be marked as "sweep", which defies the purpose of the definition.

This describes a Lisp2-style collector, which is listed as (*drumroll*) "Mark-Compact":
https://en.wikipedia.org/wiki/Mark-compact_algorithm#LISP2_algorithm

The GC handbook mentions only "Mark-Compact" and "Mark-Sweep" styles, no "Mark-Sweep-Compact". The MM
Glossary [https://www.memorymanagement.org/glossary/s.html#sweeping] says: "Sweeping is the second
phase (“the sweep phase”) of the mark-sweep algorithm. It performs a sequential (address-order) pass
over memory ***to recycle unmarked blocks.***" (emphasis mine)

I would leave "sweep" as something that reclaims the storage of the objects (i.e. "sweeps away" the
garbage), which makes M-C and M-S classes more clearly defined and easier to reason about.

Introducing the fixup pass is not very relevant here, because it talks about how the heap references
are managed, not how the object storage is treated. Regardless of what modern concurrent collector
does (separate update-refs phase, piggybacking on next marking, self-fixing barriers) to update the
references, the objects are already either copied/compacted away, or the garbage is swept away in
(per-object-sized) blocks.

-Aleksey


Gil Tene

Aug 14, 2019, 3:21:09 PM
to Aleksey Shipilev, mechanica...@googlegroups.com
Yup, those parts were written before Quick Release was commonly used to make pure
Mark/Compact (no sweep) practical and profitable (over the other two variants).
I think that the G1GC evacuation stuff was also "too young" at the time of the writing to
consider the implications on terminology there, as G1GC-style collectors (and I think Shenandoah
as well, since they share the same SATB marker logic) tend to keep separate live marks in
bits outside of the object header for the same avoid-the-sweep reasons, and don't visit dead
objects when evacuating regions.

> The MM
> Glossary [https://www.memorymanagement.org/glossary/s.html#sweeping] says: "Sweeping is the second
> phase (“the sweep phase”) of the mark-sweep algorithm. It performs a sequential (address-order) pass
> over memory ***to recycle unmarked blocks.***" (emphasis mine)

Pass 2 above does exactly the "sequential (address-order) pass over memory to recycle unmarked blocks"
thing, and that makes it a classic sweep. It does a classic "find, visit, parse, and track all the dead stuff"
pass, and will use that information to recycle all of them (by compacting those holes). The way it tracks
the "free stuff" is just one of many possible ways to track the recycle-able holes identified in a sweep:
Pass 2 tracks them by reflecting the information about where the dead holes are in the newly computed
address for the next live object it finds (the "how much to shift the next live object's address" amount
grows by the size of each dead object encountered and parsed to determine that size). If you wanted to,
you could directly deduce or reconstruct the list of dead hole addresses and sizes by walking the newly
computed object addresses backwards. It's just that the way this "list" is stored is optimized for the next
passes in this collector.

The way different sweepers track the recycle-able holes they find varies dramatically (lists, trees, bucketed
lists, buddy lists, some magic encoding in other data), but the work done to find them is fundamentally the
same in all sweepers: they linearly parse the heap, one object at a time, figuring out where the next object
starts by looking at the material within the object (dead or alive) being visited, and store (or embed) the
information they find about where the holes are into some data structure.

The fact that they visit and parse dead objects in order to identify recycle-able memory is what makes them
sweepers. Other techniques that don't visit dead objects do not perform a sweep. Those usually take the
approach of "take all the live objects somewhere else, and the whole thing becomes one big free blob"
over the "sweep through the individual dead things and track them for some recycling operation that fills
them in individually". Examples of those clearly-non-sweepers include copying collectors that evacuate and
fixup an entire from-space in one tracing pass that only ever visits live objects. Regional evacuating
collectors that evacuate one region at a time (as opposed to an entire from-space in an all-or-nothing
approach) but do so without ever visiting or parsing a dead object also fall into this "does not do a sweep"
category.

>
> I would leave "sweep" as something that reclaims the storage of the objects (i.e. "sweeps away" the
> garbage), which makes M-C and M-S classes more clearly defined and easier to reason about.

When M-C implies "also does a full linear traversal that visits all dead objects" (aka "a sweep"), that
terminology fails. Examples like C4, G1GC, Shenandoah, and ZGC are all Mark-Compact (NO Sweep)
collectors, while ParallelGC and Serial GC are Mark-Compact (WITH Sweep) collectors. Since the math
behind those varies dramatically in ways that affect everyday choices, using "M-C" as a classification
that covers both leads people to misunderstand the interactions between e.g. heap size and GC work.

For example, I know that for C4 and ZGC (and I believe for Shenandoah as well), all of which
are variants of Mark/Compact (the non-sweeping kind), increasing the heap without increasing the
application live set adds virtually no cost to the cost the collector bears per GC cycle, and as a
result, it improves the collector's efficiency linearly (i.e. doubling the heap size improves the efficiency
by 2x) without elongating any GC work time (pausing or not). This quality makes "use more heap if
you have the memory for it" an always-good thing to do (it never hurts).
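
To put back-of-the-envelope numbers on that linearity: if a no-sweep cycle's work is roughly
c * liveset, and each cycle frees about (heap - liveset) to allocate into before the next cycle is
needed, then the collector's cost per allocated byte is about c * liveset / (heap - liveset).
With, say, a 1GB live set, going from a 10GB heap to a 20GB heap cuts that cost from c/9 to c/19,
close to the 2x above, while nothing in the cycle itself gets longer.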

But in contrast, in ParallelGC and SerialGC, whose oldgen collectors are also variants of Mark/Compact
(the also-sweeping kind), increasing the oldgen heap size significantly increases the work the collector
does in each cycle, since the sweeping passes will fundamentally take longer to perform. This creates
a tradeoff between efficiency and other things. A larger heap for a given live set does mean less frequent
collection, and overall less work per allocation unit, since even as each collection gets individually longer,
some parts of each cycle (the tracing and object moving parts) stay the same length, so the increase
in cycle length is smaller than the decrease in frequency. But if cycle time matters for something other
than efficiency (like pause time in a STW collector), or if longer cycle times end up costing elsewhere
(heuristics causing more frequent triggering in a concurrent collector to leave more room for the cycle to
complete), those efficiency benefits may be offset, and "use a larger heap" is then often
not the right thing to do, and can seriously hurt.

The difference between those two comes down to the difference between Mark-Compact (no Sweep)
and Mark-Compact (with Sweep).

>
> Introducing the fixup pass is not very relevant here, because it talks about how the heap references
> are managed, not how the object storage is treated. Regardless of what modern concurrent collector
> does (separate update-refs phase, piggybacking on next marking, self-fixing barriers) to update the
> references, the objects are already either copied/compacted away, or the garbage is swept away in
> (per-object-sized) blocks.

The fixup is simply an integral part of compacting, as compacting is not complete until the
fixup is complete. I agree that it is not part of the Sweep part. But whether or not a separate
visit-every-live-object fixup pass happens in a compacting collector has as direct
a relationship to its overall cost as whether or not a separate visit-all-the-dead-objects
sweep pass is done.

This difference is not as critical in very sparse heaps (like newgen in most cases, or in oldgens
that are sized to only have a small fraction of the heap size live), because the cost considerations
around doing or not doing a visit-all-dead-objects sweep greatly outweigh any cost
considerations around doing or not doing a separate visit-all-the-live-objects pass for fixup in
those cases.

But it becomes fundamental at larger heap occupancies, where the live set may be a significant
portion of the overall heap size. This does not show up in silly things like SpecJBB (which
tends to have a live set of ~1GB but is usually run with heaps in the tens of GB or higher),
but is commonplace in large in-heap state situations that are actually trying to make use of
memory (like any form of in-memory cache or data, e.g. in Spark, Elastic/Solr, Cassandra,
Hadoop name nodes, or name-your-app-that-actually-keeps-lots-of-stuff-in-the-heap).

When heap utilizations are higher (e.g. if you actually have 40GB or more of live data in that
60GB heap), the difference in cost between doing two passes (one tracing mark+fixup, one
linear relocate) or three passes (one tracing mark, one linear relocate, and an additional (either
linear or tracing) fixup) is huge. In the end, the relevant live data (the object headers and bodies)
will either be brought into your CPU caches twice or 3 times. That's a 50% difference. And it is
as big as the difference of doing or not doing a sweep.

But we probably don't have room for the "with separate fixup pass" vs. "without separate fixup pass"
distinction in the already mixed-up Mark-Compact subcategories.




Aleksey Shipilev

Aug 15, 2019, 4:42:23 AM
to Gil Tene, mechanica...@googlegroups.com
On 8/14/19 9:21 PM, Gil Tene wrote:
> The fact that they visit and parse dead objects in order to identify recycle-able memory is what makes them
> sweepers. Other techniques that don't visit dead objects do not perform a sweep.

I am trying to understand what is your central argument here. This seems to be it. Are you saying
that "sweeping" is when you visit dead objects, and non-"sweeping" is when you do not?

>> I would leave "sweep" as something that reclaims the storage of the objects (i.e. "sweeps away" the
>> garbage), which makes M-C and M-S classes more clearly defined and easier to reason about.
>
> When M-C implies "also does a full linear traversal that visits all dead objects" (aka "a sweep"), that
> terminology fails. Examples like C4, G1GC, Shenandoah, and ZGC are all Mark-Compact (NO Sweep)
> collectors, while ParallelGC and Serial GC are Mark-Compact (WITH Sweep) collectors. Since the math
> behind those varies dramatically in ways that affect everyday choices, using "M-C" as a classification
> that covers both leads people to misunderstand the interactions between e.g. heap size and GC work.

M-C does not imply "full linear traversal that visits all dead objects". As the counter-example, it
is very practical to use off-side marking bitmaps to walk only the live objects of the heap. Careful
design of such a marking bitmap would yield dramatically better results than using whatever
self-parsable-heap walk. You can even make it multi-level and quite dense to skip over large chunks
of sparse heap.

When the marking phase is a separate phase, it stands to reason that the subsequent phases would
have to process the marking results somehow. That would naturally do some sort of walk over the data
structure that handles marks. If we call that walk "sweeping", then every Mark-* algorithm is
necessarily "sweeping" too. So we have to break this somehow to make the definition viable.

Is walking _dead_ objects the discriminator for "sweeping" then? So in your book, if we take the
same Lisp2 collector, and compute-new-addresses and adjust-pointers steps are walking the
self-parsable heap (visiting dead objects along the way), it is M-S-C? But if it uses marking bitmap
(thus only visiting live objects), it becomes M-C? [This would be a weird flex, but okay].

-Aleksey



Gil Tene

Aug 16, 2019, 4:07:32 AM
to mechanical-sympathy
You make a good point about historical terminology classifying "mark/sweep" and
"mark/compact". But that terminology predates, and could not have anticipated, the
multitude of (sometimes fairly dramatic) variants within the universe of compacting
collectors that came in the decades that followed.

I'll remind folks that back when those terms were used to mean those specific
things, the term "parallel" was also used to describe what we've evolved to
calling "concurrent" collectors for the past 25 years or so.

Classification terms evolve as the art evolves.

On Thursday, August 15, 2019 at 1:42:23 AM UTC-7, Aleksey Shipilev wrote:
On 8/14/19 9:21 PM, Gil Tene wrote:
> The fact that they visit and parse dead objects in order to identify recycle-able memory is what makes them
> sweepers. Other techniques that don't visit dead objects do not perform a sweep.

I am trying to understand what is your central argument here. This seems to be it. Are you saying
that "sweeping" is when you visit dead objects, and non-"sweeping" is when you do not?

Yes. I think that's a very good summary.

To draw a specific line: if you end up visiting each individual dead object, you are sweeping.
If you only visit each "hole" once (which may comprise 1 or more dead objects, and covers
everything between one live object and the next), the number of visits you make will be no
larger than the number of live objects, and you would not be sweeping.
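
In code, the distinction looks something like this (a schematic sketch; the helpers are hypothetical
stand-ins for a self-parsable heap walk versus a live-mark vector):

/** The two walks that the "line" above separates (names hypothetical). */
abstract class HeapWalks {
    abstract int top();                 // end of the used part of the heap
    abstract int sizeOf(int addr);      // parses the object's header for its size
    abstract boolean isLive(int addr);  // checks the mark
    abstract int nextLive(int addr);    // next live object at or after addr
                                        // (e.g. find-first-bit in a livemark
                                        // vector), or -1 when there is none
    abstract void process(int addr);

    /** Sweeping: visits (and parses the header of) every individual object,
     *  dead or alive. Cost is linear in heap size. */
    void sweepingWalk() {
        for (int a = 0; a < top(); a += sizeOf(a))
            if (isLive(a)) process(a);
    }

    /** Not sweeping: hops from live object to live object; each hole, no
     *  matter how many dead objects it holds, is skipped in one step.
     *  Cost is linear in the number of live objects. */
    void liveOnlyWalk() {
        for (int a = nextLive(0); a != -1; a = nextLive(a + sizeOf(a)))
            process(a);
    }
}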
 

>> I would leave "sweep" as something that reclaims the storage of the objects (i.e. "sweeps away" the
>> garbage), which makes M-C and M-S classes more clearly defined and easier to reason about.
>
> When M-C implies "also does a full linear traversal that visits all dead objects" (aka "a sweep"), that
> terminology fails. Examples like C4, G1GC, Shenandoah, and ZGC are all Mark-Compact (NO Sweep)
> collectors, while ParallelGC and Serial GC are Mark-Compact (WITH Sweep) collectors. Since the math
> behind those varies dramatically in ways that affect everyday choices, using "M-C" as a classification
> that covers both leads people to misunderstand the interactions between e.g. heap size and GC work.

M-C does not imply "full linear traversal that visits all dead objects".

We agree on that.

I am saying that M-C (with sweep) does imply that, and M-C (with no sweep) does not.

And I am pointing out that the difference between the two is profound enough that
use of "M-C" with no qualification makes the following misquotation of Stanislaw Ulam
a good fit:

"Describing all non-copying relocating collectors as 'Mark Compact' is like referring to the
bulk of zoology as the study of non-elephant animals."
 
As the counter-example, it is very practical to use off-side marking bitmaps to walk only the live
objects of the heap. Careful design of such a marking bitmap would yield dramatically better
results than using whatever self-parsable-heap walk. You can even make it multi-level and
quite dense to skip over large chunks of sparse heap.

Yes. Both straight livemark bit vectors and livemark bit vectors with summaries allow you to
avoid visiting individual dead objects, by efficiently jumping from one live object to another
without ever visiting or parsing a dead object. And neither one would be "sweeping".


When the marking phase is a separate phase, it stands to reason that the subsequent phases would
have to process the marking results somehow. That would naturally do some sort of walk over the data
structure that handles marks. If we call that walk "sweeping", then every Mark-* algorithm is
necessarily "sweeping" too. So we have to break this somehow to make the definition viable.

Sweeping is not the processing of marking results, or the visiting or processing of the live objects.
It is the visiting and processing of dead objects.
 

Is walking _dead_ objects the discriminator for "sweeping" then? So in your book, if we take the
same Lisp2 collector, and compute-new-addresses and adjust-pointers steps are walking the
self-parsable heap (visiting dead objects along the way), it is M-S-C? But if it uses marking bitmap
(thus only visiting live objects), it becomes M-C? [This would be a weird flex, but okay].

Exactly. "In my book" adding an efficient livemark bit vector with possible summary layers would
covert the classic LISP2 GC from a Mark-Sweep-Compact to a Mark-Compact (with no sweep).

ParallelGC doesn't do that, and sticks with the classic Mark-Sweep-Compact. But if it did
add that mechanism, it would significantly improve in efficiency for low-occupancy oldgens,
and significantly reduce its sensitivity to heap size (for a fixed live set).

The reason I am keenly aware of the difference is that we use a concurrent Mark-Compact
(no sweep) collector not only for the oldgen in C4, but also for the young generation.
Since the young generation tends to be very sparse, the difference between having a sweep 
or not is very profound, and efficient hopping from one live-mark to the next is key.
 



Aleksey Shipilev

Aug 16, 2019, 4:30:10 AM
to mechanica...@googlegroups.com, Gil Tene
On 8/16/19 10:07 AM, Gil Tene wrote:
> Classification terms evolve as the art evolves.

What I want to emphasize here is that this discussion reinvents the meaning of "sweep". That definition
is not used in the way you describe in the sources I know of. Granted, definitions drift over time,
but we have to be careful to separate what is the "agreed-on" definition from whatever
definition we want to be the emergent one.

> On Thursday, August 15, 2019 at 1:42:23 AM UTC-7, Aleksey Shipilev wrote:
> I am trying to understand what is your central argument here. This seems to be it. Are you saying
> that "sweeping" is when you visit dead objects, and non-"sweeping" is when you do not?
>
> Yes. I think that's a very good summary.

Would be helpful if we started from this next time around!

> Is walking _dead_ objects the discriminator for "sweeping" then? So in your book, if we take the
> same Lisp2 collector, and compute-new-addresses and adjust-pointers steps are walking the
> self-parsable heap (visiting dead objects along the way), it is M-S-C? But if it uses marking
> bitmap
> (thus only visiting live objects), it becomes M-C? [This would be a weird flex, but okay].
>
>
> Exactly. "In my book" adding an efficient livemark bit vector with possible summary layers would
> convert the classic LISP2 GC from a Mark-Sweep-Compact to a Mark-Compact (with no sweep).

So this is what irks me here. In my Epsilon's mark-compact toy collector
(https://shipilev.net/jvm/diy-gc/), changing from mark bitmaps to self-parsable heap traversal is
about 5 lines of code. That does not feel like passing the bar for reclassifying the collector
algo between major classes. It feels more like an implementation detail that can swing both ways,
depending on circumstances.

-Aleksey


Steve Blackburn

Aug 16, 2019, 5:53:45 AM
to mechanical-sympathy
For what it is worth, Kathryn and I have previously clarified some of this nomenclature (see our Immix paper).

Marking: identifying liveness via a trace
Sweeping: identifying free space while leaving live objects in place (classic mark-sweep and immix do this)
Evacuating: freeing space by moving survivors to another place
Compacting: freeing space by moving survivors within the same place.

We use the term 'defragmentation' to refer to the use of evacuation to free up space in a regional collector.

--Steve

Gil Tene

Aug 16, 2019, 1:08:47 PM
to mechanica...@googlegroups.com



On Aug 16, 2019, at 2:53 AM, Steve Blackburn <steve1b...@gmail.com> wrote:

For what it is worth, Kathryn and I have previously clarified some of this nomenclature (see our Immix paper).

Marking: identifying liveness via a trace
Sweeping: identifying free space while leaving live objects in place (classic mark-sweep and immix do this)
Evacuating: freeing space by moving survivors to another place
Compacting: freeing space by moving survivors within the same place.

:-)

Now we have the “is it Compacting or Evacuating?” question...

Because of Quick Release (C4 paper [1], section 2.5, originally described in 2005 in the Pauseless paper [2], section 4.2), C4 [1], Pauseless [2], Collie [3], Compressor [4], and ZGC [reference needed] are all examples of collectors created in the past ~15 years that can be classified as either/both Compacting or Evacuating under the nomenclature above.

At first glance, one might be tempted to say that they are simply Evacuating collectors: from the point of view of each region, they clearly appear to free space by moving survivors to another place (to outside of the region), rather than moving them within the same place (the same region).

But from an actual memory space perspective (the actual memory storage used by any set of regions being compacted) they are clearly Compacting, since with the hand-over-hand compaction enabled by Quick Release, surviving objects are clearly moved within the same “place”: the set of actual storage being compacted in a given relocate phase, where the survivors were before the moving started. These collectors exhibit a tell-tale quality of “in place” compaction: they move survivors directly on top of where other survivors were just before they themselves were moved, all within the same pass.

In virtual memory, these collectors have a clear separation between the two places that are the “from space” and the “to space” for a given relocation pass. Each is a distinct set of virtual memory ranges, and they do not overlap.

But in actual storage, that is not the case in any of these collectors, as the “from space” and “to space” for a given relocation pass overlap with each other in physical space.

I obviously consider C4 to be a Compacting collector (it’s right there in the name: “The Continuously Concurrent Compacting Collector”), and would therefore classify Pauseless, Collie, Compressor, and ZGC as Compacting as well.

But I would also consider them all to be Evacuating regional collectors. Because that behavior (in logical/virtual memory, which is where our thoughts focus most of the time) clearly fits too. Region evacuation is core to Pauseless and C4, and the C4 name was actually a nod at how it frees memory by “blowing up” individual regions and moving all live objects “elsewhere”.

However, I wouldn’t call C4/Pauseless/Collie/Compressor/ZGC “in place Compactors”, because even though at the physical storage level that is what happens, my brain hurts when I try to keep that picture in my head too long.

We too attempted to further clarify the nomenclature in the Collie paper [3] by differentiating different forms of Compaction in terms of whether or not the virtual address ranges of the from-space and to-space overlap, with “in-place compactors” being an example of a subset of compactors that exhibit such overlap:

“... Compaction of objects involves copying the reachable objects from a from-space into mostly contiguous memory locations in a to-space, replacing all pointers to from-space locations with corresponding pointers to to-space locations and reclaiming from-space memory for re-use. While the “from” and the “to” spaces could technically overlap in some collectors (such as the case in in-place compactors), this paper is mostly concerned with compactors that use non-overlapping from-space and to-space address ranges.”

This description is useful in the Collie paper, because it then delves into the ways that consistency of object state is maintained during compaction in any concurrent compacting collector, and introduces new terms useful in analysis: e.g. referrer set, transplantation, the constraints needed for general concurrent transplantation, and the key distinction between individually transplantable and non-individually transplantable objects, which the paper then builds on to describe a wait-free compactor.

[1] “C4: The Continuously Concurrent Compacting Collector” https://www.azul.com/files/c4_paper_acm1.pdf
[2] “The Pauseless GC Algorithm”, VEE ’05
[3] “The Collie: A Wait-Free Compacting Collector” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.406.7321&rep=rep1&type=pdf
[4] “The Compressor: Concurrent, Incremental, and Parallel Compaction” https://www.cs.utexas.edu/~speedway/fp031-kermany.pdf


Gil Tene

unread,
Aug 18, 2019, 3:34:59 PM8/18/19
to mechanical-sympathy


On Friday, August 16, 2019 at 1:30:10 AM UTC-7, Aleksey Shipilev wrote:
On 8/16/19 10:07 AM, Gil Tene wrote:
> Classification terms evolve as the art evolves.

What I want to emphasize here is that this discussion reinvents the meaning of "sweep". That definition
is not used in the way you describe in the sources I know of. Granted, definitions drift over time,
but we have to be careful to separate the "agreed on" definition from whatever definition we want
to be the emergent one.

> On Thursday, August 15, 2019 at 1:42:23 AM UTC-7, Aleksey Shipilev wrote:
>     I am trying to understand what your central argument is here. This seems to be it. Are you saying
>     that "sweeping" is when you visit dead objects, and non-"sweeping" is when you do not?
>
> Yes. I think that's a very good summary.

Would be helpful if we started from this next time around!

True. Maybe this discussion will help others start from there next time. It's certainly helped me home
in on that specific summary.
 

>     Is walking _dead_ objects the discriminator for "sweeping" then? So in your book, if we take the
>     same Lisp2 collector, and compute-new-addresses and adjust-pointers steps are walking the
>     self-parsable heap (visiting dead objects along the way), it is M-S-C? But if it uses marking
>     bitmap
>     (thus only visiting live objects), it becomes M-C? [This would be a weird flex, but okay].
>
>
> Exactly. "In my book" adding an efficient livemark bit vector with possible summary layers would
> convert the classic LISP2 GC from a Mark-Sweep-Compact to a Mark-Compact (with no sweep).

So this is what irks me here. In my Epsilon's mark-compact toy collector
(https://shipilev.net/jvm/diy-gc/), changing from mark bitmaps to self-parsable heap traversal is
about 5 lines of code. This does not feel like passing the bar for reclassification of the collector
algo between major classes.

Self-parseable heaps are expensive to traverse, and that expense grows with the size of the non-live
part of the heap. Being sensitive to live-occupancy % (or not) is a fundamental quality of
a collector.
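
To make that cost asymmetry concrete, here is a toy sketch of the two traversal styles under discussion. All the interfaces are hypothetical stand-ins (this is not Epsilon's or ParallelGC's actual code): the first walk must parse every object, dead or alive, so its cost tracks heap size; the second consults only mark bits, so its cost tracks the live set.

// Hypothetical minimal interfaces, for illustration only.
interface Heap {
    long start();
    long end();
    long sizeAt(long addr);          // object size read from its header (self-parsable heap)
    boolean isMarkedLiveAt(long addr);
}

interface MarkBitmap {
    long nextLive(long fromAddr);    // address of next live object at/after fromAddr, or -1
}

interface Visitor { void visit(long addr); }

final class Traversals {
    // "Sweeping" walk: every object, dead or alive, is parsed just to be skipped.
    // Cost grows with heap size, regardless of live occupancy.
    static void walkSelfParsableHeap(Heap heap, Visitor v) {
        for (long addr = heap.start(); addr < heap.end(); addr += heap.sizeAt(addr)) {
            if (heap.isMarkedLiveAt(addr)) v.visit(addr);
        }
    }

    // Bitmap walk: only live objects are ever touched.
    // Cost grows with the live set, regardless of heap size.
    static void walkMarkBitmap(MarkBitmap bits, Heap heap, Visitor v) {
        for (long addr = bits.nextLive(heap.start()); addr >= 0; addr = bits.nextLive(addr + 1)) {
            v.visit(addr);
        }
    }
}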

If a "5 lines of code" change allows your collector to no longer be sensitive to heap size, and
stay sensitive only to the live set, it is well worth it.  And it also changes its classification.

E.g. sometimes changing from a linked list to an array list touches only one line of code, but
that change can dramatically alter the classification from a complexity and fit-for-use point
of view.
 
It feels more like an implementation detail that can swing both ways,
depending on circumstances.

ParallelGC probably wouldn't call the file that implements its
Mark-Compact (with sweep) algorithm "MarkSweep.cpp" if the authors
of that code thought of their implementation as a Mark-Compact that
does not include a sweep, or if they thought of the sweeping as a
small implementation detail.

If it were just an implementation detail, there wouldn't be scores of academic
papers that seem to classify "Mark Compact" (the historic kind that visits all
dead objects, and so implies sweeping) as inherently less CPU-efficient, and back
that category-wide conclusion with analyses and measurements that all use
sweeping in their Mark-Compact implementations.

And there wouldn't be similar analyses that classify all non-Mark-Compact
moving collectors (putting them into the only remaining categories of semi-space or
regional evacuating) as less memory-efficient, based on their assumed need
for space beyond the "in place" compaction that canonical Mark-Compact
(with implied sweeping) is known to be able to do.

These classifications (historically assuming that Mark-Compact includes
visiting all dead objects, and/or that anything else can't do in-place compaction)
both lead to historical conclusions that no longer apply to more modern
implementations that still clearly belong in the Mark-Compact category.

Here are two classifications of implementation-orthogonal characteristics
of any moving collector, both of which are fundamental to analyzing fitness
for use in various common scenarios:

1. Do you visit all objects, or just the live ones?
[which I call "do you sweep?"]

2. Do you need extra space in order to compact, or can you
perform the compaction within the memory previously occupied
by dead objects?
[which I don't have a great name for, but it is demonstrably not
 the same as "do you do in-place compaction?"]

E.g. for an algorithm used in a young generation collector, which fundamentally 
hopes to leverage the observed quality that the generation's live set is
much smaller than the generation's heap size, the first quality is critical. 

And e.g. for an algorithm used in the oldest generation, which needs to
handle large live-occupancy percentages (e.g. 40%-90%), the second
quality matters quite a bit: without it, heap occupancy is either
capped (as is the case in many semi-space collectors), or CPU-efficiency
drops dramatically (much faster than the linear relationship to the heap:live ratio)
in the case of compactors that can stall mid-cycle because they have
run out of room to complete the compaction.


-Aleksey

Steve Blackburn

unread,
Aug 19, 2019, 7:41:19 AM8/19/19
to mechanical-sympathy
It seems my last reply got lost, so I'll try again...

For me the distinction between evacuating and compacting is crisp: while both move objects, evacuation requires a reserve, compaction does not.

A compacting collector can use all available space because it moves objects into space regained in the process of its collection.

By contrast, an evacuating collector must put aside a reserve (however small).   So semispace is the poster-child evacuating collector.  More subtly, I describe immix's defragmentation as evacuation even though it does not need an explicit reserve---it moves objects into 'holes' generated by a previous collection.   This is an implicit reserve---space un-utilized by the mutator, not space scavenged by the same instance of the collection that performs the object movement.

This distinction between requiring a reserve and not requiring a reserve is algorithmically fundamental, which makes it a useful distinction in the nomenclature of collectors.

As an aside, sweeping need not visit dead objects.   For example, in a classic segregated-fits free list, a sweep will identify entirely free blocks without visiting any of their constituents.   Likewise, we describe immix as 'sweep to region', where 'holes' are identified but the objects within them are not visited.   I'd describe sweeping as freeing memory by traversing the entire collected region (or metadata corresponding to that entire region), whether or not objects within that region are visited.
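
For instance, a block-granular sweep in the style described above might look like this sketch. The types are hypothetical (this is not Immix's or any real allocator's code); the point is that the sweep reads only per-block mark metadata and never parses an object, dead or alive.

// Sketch with hypothetical types: sweep consults per-block mark metadata,
// never the objects themselves.
interface Block {
    int liveMarkCount();                 // maintained by the marker
    void rebuildFreeRunsFromMarkBits();  // recycle holes inside a partly-live block
}

interface FreeBlockList { void add(Block b); }

final class BlockSweep {
    static void sweep(Block[] blocks, FreeBlockList freeBlocks) {
        for (Block b : blocks) {
            if (b.liveMarkCount() == 0) {
                freeBlocks.add(b);                // entirely dead block reclaimed wholesale
            } else {
                b.rebuildFreeRunsFromMarkBits();  // "sweep to region": holes, not corpses
            }
        }
    }
}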

--Steve

Steve Blackburn

unread,
Aug 19, 2019, 7:41:19 AM8/19/19
to mechanical-sympathy
What I said above was not as sharp as it should have been.

I see a crisp distinction between 'evacuation' and 'compaction'.

- Evacuation moves (evacuates) survivors into a pre-existing reserve (however small) of heap space.
- Compaction requires no such reserve; survivors are only moved into space freed in the process of performing the compaction.

Evacuation may be done at a very fine grain, requiring absolutely minimal reserves, or, at the other extreme, it may require reserving half the heap (semi-space).   Still, I would say 'evacuation' so long as some free reserve (however small) is required in order for the algorithm to proceed.   Thus, while "in-place compaction" is a good description, I'd say it is a tautology.

I know this crisp distinction has not been universally observed in the literature, but I think it is helpful because it is absolutely crisp, and cuts to the very heart of the underlying algorithm.

We use the term 'defragmentation' to describe a fine-grained evacuation strategy (for example, one that moves single objects into small holes in the heap).   Still, we'd call this an evacuating collector, not a compacting collector.   We thought long and hard about this at the time we did our work, and settled on the above for the above reasons.

So I'd ask the question: does this algorithm require a reserve of heap space (even a small one) before it can proceed?   If so, it's evacuation.
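
As a sketch of that litmus test (hypothetical types again, for illustration only): a sliding compactor's destination cursor can never pass its scan cursor, so the space freed by the pass itself is all it needs, while an evacuator cannot even start without a non-empty reserve.

// Hypothetical types, illustration only.
interface MovableHeap {
    long start();
    long sizeAt(long addr);
    void copyObject(long from, long to);
}

interface Region { long allocate(long size); } // fails when the reserve is exhausted

final class ReserveLitmus {
    // Compaction: survivors slide toward the bottom of the same space.
    // The invariant dest <= src holds throughout, so no reserve is required;
    // the only space consumed was freed by this very pass.
    static void slideCompact(long[] liveAddrsAscending, MovableHeap heap) {
        long dest = heap.start();
        for (long src : liveAddrsAscending) {
            long size = heap.sizeAt(src);
            heap.copyObject(src, dest);
            dest += size;
        }
    }

    // Evacuation: cannot proceed at all without a pre-existing reserve,
    // however small -- the defining dependency.
    static void evacuate(long[] liveAddrs, MovableHeap heap, Region reserve) {
        for (long src : liveAddrs) {
            long size = heap.sizeAt(src);
            long dest = reserve.allocate(size);
            heap.copyObject(src, dest);
        }
    }
}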

One more thing: obviously we can and do routinely build collectors that are hybrids, so there's absolutely no reason why a collector could not do both.

--Steve