Yup. those parts were written before Quick Release was commonly used to makes pure
Mark/Compact (no sweep) practical and profitable (over the other two variants).
I think that the G1GC evacuation stuff was also "too young" at the time of the writing to
consider the implications on terminology there, as G1GC-style collectors, and (I think Shenandoah
as well, since they share the same SATB marker logic), tend to keep separate live marks in
bits outside of object header for the same avoid-the-sweep reasons, and don't visit dead
objects when evacuating regions.
Pass 2 above does exactly the "sequential (address-order) pass over memory to recycle unmarked blocks"
thing, and that makes it a classic sweep. It does a classic "find, visit, parse, and track all the dead stuff"
pass, and will use that information to recycle all of them (by compacting those holes). The way it tracks
the "free stuff" is just one of many possible ways to track the recycle-able holes identified in a sweep:
It 2 tracks them by reflecting the information about where the the dead holes are in the newly computed
address for the next live object it finds (the "how much to shift the next live object's address" amount
grows by the size of each dead object encountered and parsed to determine that size). If you wanted to,
you could directly deduce or reconstruct the list of dead hole addresses and sizes by walking the newly
computed object addresses backwards. It's just that the way this "list" is stored is optimized for the next
passes in this collector.
The way different sweepers track the recycle-able holes they find varies dramatically (lists, trees, bucketed
lists, buddy lists, some magic encoding in other data) but the work done to find them is fundamentally the
same in all sweepers: they linearly parse the heap, one object at a time, figuring out where the next object
starts by looking the at material within the object (dead or alive) being visited, and store (or embed) the
information they find about where the holes are into some data structure.
The fact that they visit and parse dead objects in order to identify recycle-able memory is what makes them
sweepers. Other techniques that don't visit dead objects do not perform a sweep. Those usually take the
apparoiach of "take all the live objects somewhere else, and the whole thing becomes one big free blob"
over the "sweep through the individual dead things and rack them for some recycling operation that fills
them in individually". Examples of those clearly-non-sweepers include copying collectors that evacuate and
fixup an entire from space in one tracing pass that only ever visits live objects. Regional evacuating
collectors that evacuate one region at a time (as opposed to an entire from space in an all-or-nothing
approach) but do so without ever visiting or parsing a dead object also fall into this "does not do a sweep"
category.
>
> I would leave "sweep" as something that reclaims the storage of the objects (i.e. "sweeps away" the
> garbage), which makes M-C and M-S classes more clearly defined and easier to reason about.
When M-C implies "also does a full linear traversal that visits all dead objects" (aka "a sweep"), that
terminology fails. Examples like C4, G1GC, Shenandoah, and ZGC are all Mark-Compact (NO Sweep)
collectors, while ParallelGC and Serial GC are Mark-Compact (WITH Sweep) collectors. Since the math
behind those varies dramatically in wasy that effect everyday choices, using "M-C" as a classification
that covers both leads people to misunderstand the interactions between e.g. heap size and GC work.
For example, I know that for C4 and ZGC (and I believe that for Shenandoah as well), all of which
are variants of Mark/Compact (the non-sweeping kind), increasing the heap without increasing the
application live set adds virtually no cost to the cost the collector bears per GC cycle, and as a
result, it improves the collector's efficiency linearly (i.e. doubling the heap size improves the efficiency
by 2x) without elongating any GC work time (pausing or not). This quality makes "use more heap if
you ahve the memory for it" an always-good thing to do (it never hurts).
But in contrast, in ParallelGC and SerialGC, whose oldgen collectors are also variants of Mark/Compact
(the also sweeping kind). increasing the oldgen heap size significantly increase the work the collector
does in each cycle, since the sweeping passes will fundamentally take longer to perform. This creates
a tradeoff between efficiency and other things. A larger heap for a given live set does mean less frequent
collection, and overall less work per allocation unit, since even as each collection gets individually longer,
some parts of each cycle (the tracing and object moving parts) stay the same length, so the of increase
in cycle length is smaller than the decrease in frequency. But if cycle time matters for something other
than efficiency (like pause time in a STW collector), or if longer cycle times end up costing elsewhere
(heuristics causing more frequent triggering in a concurrent collector to leave more room for the cycle to
complete), those efficiency benefits may be offset, and "use a larger heap" is an often
"Not the right thing to do", and can seriously hurt.
The difference between those two comes down to the difference between Mark-Compact (no Sweep)
and Mark-Compact (with Sweep).
>
> Introducing the fixup pass is not very relevant here, because it talks about how the heap references
> are managed, not how the object storage is treated. Regardless of what modern concurrent collector
> does (separate update-refs phase, piggybacking on next marking, self-fixing barriers) to update the
> references, the objects are already either copied/compacted away, or the garbage is swept away in
> (per-object-sized) blocks.
The fixup ia simply an integral part of compacting, as compacting is not complete until the
fixup is complete. I agree that it is not part of the Sweep part. But whether or not a separate
visit-every-live-object fixup pass happens or not in a compacting collector has as direct
a relationship on it's overall cost as whether or not a separate visit-all-the-dead-objects
sweep pass is done.
This difference is not as critical in very parse heaps (like newgen in most cases, or in Oldgens
that are sized to only have a small fraction of the heap size live), because the cost considerations
around doing or of not doing a visit-all-dead-objects sweep greatly out-weigh any costs
considerations around doing/not-doing a separate visit-all-the-live-objects pass for fixup in
those cases.
But it becomes fundamental at larger heap occupancies, where the live set may be a significant
portion of the overall heap size. This does not show up in silly things like SpecJBB (which
tends to have a alive set of ~1GB but is usually run with heaps in the tens of GB or higher)
but is commonplace in large-in-heap state situations that are actually trying to make use of
memory (like any form on in-memory cache or data, and e.g. in Spark, Elastic/Solr, Cassandra,
Hadoop-name-nodes, or name-your-app-that-actually-keeps-lostof-stuff-in-the-heap)
When heap utilizations are higher (e.g. if you actually have 40GB or more of live data in that
60GB heap), the difference in cost between doing two passes (one tracing MarkFixup, one
linear relocate), or three passes (one tracing Mark, one linear relocate, and an additional (either
linear or tracing) fixup is huge. In the end, the relevant live data (the object headers and bodies)
will either be brought into your CPI caches twice or 3 times. That's a 50% difference. And it is
as big as the difference of doing or not doing a sweep.
But we probably don't have room for the "with separate fixup pass" vs. "without separarte fixup pass"
distinction in the already mixed up Mark-Compact subcategories.
>
> -Aleksey
>