Are there plans to support pauseless GC algorithm in Go?


Elazar Leibovich

unread,
Oct 10, 2012, 5:33:04 AM10/10/12
to golan...@googlegroups.com
In certain use cases, Java GC pauses pose a real problem; see e.g. [1] [2]. Azul Systems has an interesting solution that sounds really appealing [3]; I know they sell VMs whose main benefit is a pauseless GC algorithm.

Is Go considering a similar approach? I have very little understanding of GC, but it looks like a pauseless GC would be a unique selling point for Go, not offered by any other open source language.

Are there plans to support such an algorithm?

Ian Lance Taylor

unread,
Oct 10, 2012, 9:05:27 AM10/10/12
to Elazar Leibovich, golan...@googlegroups.com
On Wed, Oct 10, 2012 at 2:33 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> In certain use cases, Java GC pauses pose a real problem; see e.g. [1]
> [2]. Azul Systems has an interesting solution that sounds really appealing
> [3]; I know they sell VMs whose main benefit is a pauseless GC algorithm.
>
> Is Go considering a similar approach? I have very little understanding of
> GC, but it looks like a pauseless GC would be a unique selling point for Go,
> not offered by any other open source language.
>
> Are there plans to support such an algorithm?

I'm not aware of anybody working on this.

The pauseless GC algorithms I've seen impose a severe performance
penalty on all code running on ordinary multicore processors. It's
the right choice for some programs, but not for most.

Ian

Ian Lance Taylor

unread,
Oct 10, 2012, 11:47:04 AM10/10/12
to g...@azulsystems.com, golan...@googlegroups.com, Elazar Leibovich
On Wed, Oct 10, 2012 at 7:53 AM, <g...@azulsystems.com> wrote:
>
> The [common] impression that concurrent collection has to come at some
> significant overhead on either single core or multi-core processors is
> mostly due to misinformation.

Thanks for the info. Any interest in trying to implement this algorithm for Go?

> As for the overhead on individual threads (not multi-core related), and the
> impact on overall throughput achieved, the self-healing Loaded Value Barrier
> (LVB) used by C4 on modern, commodity multi-core processors exhibits a very
> small overhead compared to non-concurrent (Parallel GC). Mileage obviously
> varies by workload, but specifically, Figure 5 in the C4 paper
> (http://dl.acm.org/citation.cfm?id=1993491&dl=ACM&coll=DL&CFID=172051598&CFTOKEN=22910064)
> includes a direct throughput comparison between C4, ParallelGC, and CMS on a
> multicore workload based on SPECjbb. C4 is within 6% of the per-core
> throughput that ParallelGC achieves on the same load, and more than 20%
> better than the throughput CMS achieves.

I can't access the paper. What processor were the comparisons run on?

Ian

Gil Tene

unread,
Oct 10, 2012, 12:02:57 PM10/10/12
to Ian Lance Taylor, <golang-dev@googlegroups.com>, Elazar Leibovich
> Thanks for the info. Any interest in trying to implement this algorithm for Go?

Unfortunately, we're a bit busy working on the Zing JVM and on OpenJDK stuff, and don't have cycles to put into Go... Getting a full-blown concurrent GC implemented in an actual runtime is typically a year+ effort, mostly dominated by runtime-related details and edge conditions (and not by the classic GC'ed heap issues), e.g. class loading/unloading and namespace management, generated code lifecycle management, lock lifecycle management, weak/soft/phantom reference management, etc.

The paper is an ISMM/ACM publication, and is freely available to ACM members... You can also find a copy on the Azul site at http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp (but it will want you to register).

The tests were on a 2-socket, 12-core Intel X5680 server with 96GB of memory. They spanned heap sizes from 3 to 35GB.

-- Gil.

On Oct 10, 2012, at 8:47 AM, Ian Lance Taylor <ia...@google.com>
wrote:

Adam Langley

unread,
Oct 10, 2012, 12:19:49 PM10/10/12
to Gil Tene, Ian Lance Taylor, <golang-dev@googlegroups.com>, Elazar Leibovich
On Wed, Oct 10, 2012 at 12:02 PM, Gil Tene <g...@azulsystems.com> wrote:
> The paper is an ISMM/ACM publication, and is freely available to ACM members... You can also find a copy on the Azul site at http://www.azulsystems.com/products/zing/c4-java-garbage-collector-wp (but it will want you to register).
>
> The tests were on a 2-socket, 12-core Intel X5680 server with 96GB of memory. They spanned heap sizes from 3 to 35GB.

Azul's GC work is seriously impressive. To answer the original
question, I think a GC of that quality would be very valuable in Go,
but I don't believe that anyone is currently working on it. As Gil
pointed out, it is a substantial endeavor.


Cheers

AGL

Ian Lance Taylor

unread,
Oct 10, 2012, 1:29:11 PM10/10/12
to Gil Tene, <golang-dev@googlegroups.com>, Elazar Leibovich
On Wed, Oct 10, 2012 at 9:02 AM, Gil Tene <g...@azulsystems.com> wrote:
>> Thanks for the info. Any interest in trying to implement this algorithm for Go?
>
> Unfortunately, we're a bit busy working on the Zing JVM and on OpenJDK stuff, and don't have cycles to put into Go... Getting a full-blown concurrent GC implemented in an actual runtime is typically a year+ effort, mostly dominated by runtime-related details and edge conditions (and not by the classic GC'ed heap issues), e.g. class loading/unloading and namespace management, generated code lifecycle management, lock lifecycle management, weak/soft/phantom reference management, etc.

Several of those issues don't arise for Go, of course. Go does have a
different issue: the language permits pointers to the interior of
objects.

Ian

Gil Tene

unread,
Oct 10, 2012, 1:51:15 PM10/10/12
to Ian Lance Taylor, <golang-dev@googlegroups.com>, Elazar Leibovich
Refs to fields and array members are not a significant issue for most algorithms. They are just a twist that takes some extra work and some annoying backtracking to deal with. .Net has those too.

The starting point is usually a precise, generational stop-the-world collector. Before a runtime (or language execution environment) reaches that level, talking about concurrency is usually premature. The burden of supporting precise GC, write barriers (needed for any generational collector) and read barriers (needed for practical concurrent collectors) falls mostly on the compilers and runtime. This is true regardless of whether JITs or pre-compilers are used, and whether the runtime takes the form of libraries or more intricate other stuff is not really relevant there (meta-circular runtimes are probably the easiest to deal with, but those rarely appear in production yet). Reflection and introspection (I don't know enough about Go to tell if they matter here) add their own silly twists, depending on how they are implemented.

-- Gil.

Elazar Leibovich

unread,
Oct 10, 2012, 3:28:05 PM10/10/12
to Adam Langley, Gil Tene, Ian Lance Taylor, <golang-dev@googlegroups.com>
I really think that you underestimate the importance of pauseless GC. Already today, users of Go are having problems with Go at scale: YouTube's vitess pushes memory off-heap to memcached [1] because of GC issues.

Another example is indexing all of Wikipedia in memory: simply impossible even with Java's excellent GC algorithms (5% of requests have a delay of ~5 seconds) [2].

Yet another example is the HBase book, which recommends not using more than a 16GB heap, since otherwise GC pauses will make your peers think you're dead (this is what made me think).

These issues are inherent to servers, since servers typically have hundreds of gigabytes of memory, and the situation will probably get "worse" in the near future.

This is not a question of clearing a slightly higher bar in a certain benchmark game; it is a question of whether Go will be able to solve some significant engineering problems while still offering a fully garbage-collected environment. It could be the added value that allows Go to fight the "good enough" solutions.

PS

Kyle Lemons

unread,
Oct 10, 2012, 3:55:51 PM10/10/12
to Elazar Leibovich, Adam Langley, Gil Tene, Ian Lance Taylor, <golang-dev@googlegroups.com>
On Wed, Oct 10, 2012 at 12:28 PM, Elazar Leibovich <ela...@gmail.com> wrote:
I really think that you underestimate the importance of pauseless GC. Already today, users of Go are having problems with Go at scale: YouTube's vitess pushes memory off-heap to memcached [1] because of GC issues.

Another example is indexing all of Wikipedia in memory: simply impossible even with Java's excellent GC algorithms (5% of requests have a delay of ~5 seconds) [2].

Yet another example is the HBase book, which recommends not using more than a 16GB heap, since otherwise GC pauses will make your peers think you're dead (this is what made me think).

These issues are inherent to servers, since servers typically have hundreds of gigabytes of memory, and the situation will probably get "worse" in the near future.

This is not a question of clearing a slightly higher bar in a certain benchmark game; it is a question of whether Go will be able to solve some significant engineering problems while still offering a fully garbage-collected environment. It could be the added value that allows Go to fight the "good enough" solutions.

Go, the language, does not specify a particular garbage collection mechanism.  It is also very young in the grand scheme of things.  I would expect the scheduler and the garbage collector to be two active areas of development for some time to come.

Rob Pike

unread,
Oct 10, 2012, 3:58:52 PM10/10/12
to Elazar Leibovich, Adam Langley, Gil Tene, Ian Lance Taylor, <golang-dev@googlegroups.com>
On Thu, Oct 11, 2012 at 6:28 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> I really think that you underestimate the importance of pauseless GC.

What makes you say that? No one here has said it's not a good thing.
We've wanted it from the beginning, we spent quite a bit of time early
on thinking about it, but it's very hard and a research-level problem.
As Gil says, it's a year or more of work. We don't have a year handy
at the moment and we're a very small team. And what if we spend that
year and find it doesn't work as well as we'd hoped?

I'd love to have a pauseless GC, especially if it adds no overhead for
regular code, but I can't just snap my fingers and have one appear.

Look at it this way: Why doesn't the standard Java implementation have one?

-rob

Elazar Leibovich

unread,
Oct 10, 2012, 4:12:54 PM10/10/12
to Rob Pike, Adam Langley, Gil Tene, Ian Lance Taylor, <golang-dev@googlegroups.com>
On Wed, Oct 10, 2012 at 9:58 PM, Rob Pike <r...@golang.org> wrote:
On Thu, Oct 11, 2012 at 6:28 AM, Elazar Leibovich <ela...@gmail.com> wrote:
> I really think that you underestimate the importance of pauseless GC.

What makes you say that?

Indeed no one said that. It's bad phrasing on my part, I apologize.

I was just trying to emphasize that, in my view, pauseless GC is more than "let's make Go faster"; it's "let's make certain important problems solvable with Go".

My apologies again if I put words in anyone's mouth.

Ian Lance Taylor

unread,
Oct 11, 2012, 12:36:14 PM10/11/12
to 2pau...@googlemail.com, golan...@googlegroups.com, Rob Pike, Adam Langley, Gil Tene
On Thu, Oct 11, 2012 at 3:23 AM, <2pau...@googlemail.com> wrote:
>
> Maybe this is a really blunt question, but could somebody pin a number on
> the project, a dollar amount? With that information, the problem would be
> significantly narrowed down to the level of a fundraising problem.

I don't think there is any point to trying to raise funds without
first identifying the people who would actually do it. This is not a
project that can be put out to bid. The result has to be acceptable
to the Go maintainers.

Ian

Gil Tene

unread,
Oct 11, 2012, 1:03:44 PM10/11/12
to Ian Lance Taylor, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley
As an external observer with significant knowledge on what it takes to get there (we at Azul seem to currently be the only ones to have actually done it in a shipping commercial product), I would recommend an incremental approach. It is my understanding (based on offline discussion) that Go currently uses a conservative, non-relocating collector, so there is a lot of work yet to be done to support any form of long-running collector (which almost invariably means supporting a moving collector).

Starting with a precise and generational stop-the-world implementation that is robust is a must, and a good launching pad towards a concurrent compacting collector (which is what a "pauseless" collector must be in server-scale environments). Each of those qualities (precise, generational) slaps serious requirements on the execution environment and on the compilers (whether they are pre-compilers or JIT compilers doesn't matter): precise collectors require full identification of all references at code safepoints, and also require a robust safepoint mechanism. Code safepoints must be frequent (usually devolve to being at every method entry and loop back edge), and support in non-compiler-generated code (e.g. library and runtime code written in C/C++) usually involves some form of reference handle support around safepoints. Generational collectors require a write barrier (a ref-store barrier to be precise) with full coverage for any heap reference store operations (in compiler-generated code and in all runtime code).

It is my opinion that investing in the above capabilities early in the process (i.e. start now!) is critical. Environments that skip this step for too long and try to live with conservative GC, in order to avoid putting in the work required to support precise collectors in the compilers, runtime, and libraries, find themselves heavily invested in compiler code that would need to be completely revamped to move forward. Some (like Mono) get stuck there for many years, and end up with complicated things like mostly-precise-but-still-conservative scanning that reduce the efficiency of generational collection and bump-pointer allocation. This usually happens because the investment in already-existing compilers that do not provide precise information, and the lack of systemic and disciplined support for safepoints and write barriers in the runtime and non-generated-code libraries, become too big a thing to do a "from scratch overhaul" of. [Write-barrier support in generated code is usually a localized, near-trivial piece of work; it's the coverage of write barriers everywhere in non-generated code that usually takes a long time, and has a long bug tail.]

C4-like capability would add the need for a full-coverage read barrier (a ref-load barrier to be precise) for any heap reference load operations (in compiler-generated code and in all runtime code), so it would be good to keep that in mind as part of the overall effort.

-- Gil.

On Oct 11, 2012, at 9:36 AM, Ian Lance Taylor <ia...@google.com>
wrote:

Florian Weimer

unread,
Oct 13, 2012, 5:15:51 AM10/13/12
to Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley
* Gil Tene:

> Starting with a precise and generational stop-the-world
> implementation that is robust is a must, and a good launching pad
> towards a concurrent compacting collector (which is what a
> "pauseless" collector must be in server-scale environments).

I wonder why compaction is required. We've got many long-running
processes which use explicit memory management, and fragmentation does
not seem to be a huge issue there. Does something specific to garbage
collection make matters worse?

Robert Griesemer

unread,
Oct 13, 2012, 2:35:19 PM10/13/12
to Florian Weimer, Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley
Without compaction, one major benefit of a generational collector goes away: fast allocation (besides, are there generational collectors w/o compaction? - I don't know).

With compaction, allocation essentially becomes a single test and increment of a pointer (by the size of the allocated amount of memory). If there's one "eden" or "newspace" (where the memory is allocated from), this operation has to be atomic; if there's one eden per core, it doesn't have to be. Either way, this form of allocation beats any other scheme in performance by some margin (looking at allocation speed alone) and compensates to some extent for the extra cost incurred by the read and write barriers required in a generational or incremental scheme.

The edens are usually smallish (a few hundred KBs) and they can be collected and compacted in the order of a few ms. The collection time is proportional to the amount of data surviving - another important benefit. Very large objects and objects that don't contain pointers usually go elsewhere.

As has been pointed out before, the main problem with compacting schemes for Go is the presence of interior pointers. It certainly can be done at the cost of extra memory (a simple approach would be to make every pointer a pointer to the base of an object, plus an offset). But perhaps there's a smarter approach (along the lines of computing the bases and offsets when garbage collecting only, and undoing it again afterwards).

- gri

Florian Weimer

unread,
Oct 14, 2012, 6:28:13 AM10/14/12
to Robert Griesemer, Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley
* Robert Griesemer:

> Without compaction, one major benefit of a generational collector
> goes away: fast allocation (besides, are there generational
> collectors w/o compaction? - I don't know).

You can have size-specific processor-local allocation lists, which
could have similar performance characteristics.

Non-moving generational collectors exist. Lua 5.2 has one, and the
Boehm-Demers-Weiser collector provides two generational modes (but
both are used rarely).

I'm not disputing the potential usefulness of compaction. But I find
it odd that large, long-running processes are apparently fine without
it (perhaps because they use explicit memory management). And
compaction certainly comes with costs: heavy write traffic during
compactions, and the need to copy or pin objects when they are
referenced from C.

minux

unread,
Oct 14, 2012, 6:53:18 AM10/14/12
to Robert Griesemer, Florian Weimer, Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley

On Sun, Oct 14, 2012 at 2:35 AM, Robert Griesemer <g...@golang.org> wrote:
As has been pointed out before, the main problem with compacting schemes for Go is the presence of interior pointers. It certainly can be done at the cost of extra memory (a simple approach would be to make every pointer a pointer to the base of an object, plus an offset). But perhaps there's a smarter approach (along the lines of computing the bases and offsets when garbage collecting only, and undoing it again afterwards).
Without unsafe, pointers in Go objects can't move backward (decrease in value), so I think the interior
pointer problem is not that big. (I think it only affects finding out the type of value a pointer points to.)

For example, if I allocate a large []byte to p and then return p[someLargeNumber:] (and don't retain any
references to the slices), I think it would be great if the memory used by p[0:someLargeNumber] could be
collected. The same applies to strings, too.

Also, for a pointer to a struct field, if it's the only reference to that whole struct, then the memory occupied
by all the other fields could be collected. Yes, this is odd, but valid (albeit aggressive) GC behavior, right?

I might get something fundamentally wrong here; please correct me if that's the case, thank you.

Ian Lance Taylor

unread,
Oct 15, 2012, 12:06:27 AM10/15/12
to Robert Griesemer, Florian Weimer, Gil Tene, <2paul.de@googlemail.com>, <golang-dev@googlegroups.com>, Rob Pike, Adam Langley
On Sat, Oct 13, 2012 at 11:35 AM, Robert Griesemer <g...@golang.org> wrote:
>
> As has been pointed out before, the main problem with compacting schemes for
> Go is the presence of interior pointers.

Another issue is pointers passed into C code via cgo, which according
to the current rules can be collected but can not be moved. I don't
know how significant that would be in practice.

Ian

Dmitry Vyukov

unread,
Oct 15, 2012, 12:10:09 AM10/15/12
to golan...@googlegroups.com, Robert Griesemer, Florian Weimer, Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, Rob Pike, Adam Langley


On Monday, October 15, 2012 5:11:21 AM UTC+4, (unknown) wrote:
Thinking about freeing partial objects makes my head hurt. I suspect it may make other things hurt too.

From a liveness semantic point of view, the simple, "makes sense" treatment of interior pointers is to think of them as keeping the object they point *into* reachable (and not just the field they point to). This keeps the problem within the classic heap GC realm, and simply requires a way to derive the object address from an interior pointer (something most generational collectors can already do due to card marking, see previous note).

Even if there was some specific partial-freeing optimization possible (e.g. through knowing that pointers only move forward as mentioned below), I doubt much real-world gains would be had from applying them, at least not in a way that would be worth the complexity and correctness risk...


I've heard that a small substring can hold a huge string alive in Java, and sometimes it represents a serious problem (putting a small substring of a big request into a persistent container). I understand that it's possible to resolve it manually by copying the string, but wouldn't it be nice to resolve it automatically?

Patrick Higgins

unread,
Oct 15, 2012, 2:18:41 PM10/15/12
to golan...@googlegroups.com, Robert Griesemer, Florian Weimer, Gil Tene, Ian Lance Taylor, <2paul.de@googlemail.com>, Rob Pike, Adam Langley
I am glad this topic is being discussed, as it was my earliest concern. The first Go code I read was the garbage collector, and I was impressed by how simple it was. I bought some books on garbage collection and considered implementing a generational collector for Go, but then realized that in my experience with Java, heap size was limited to 2-4GB because compaction pauses became too long for a web server above that. That seems ridiculously low on modern hardware. C4 is the only thing I have heard of which allows really large heaps to be used with acceptable pause times. Is anyone familiar with anything else?

Anyway, thinking about the problem also made me realize that Java is able to allow different GC strategies to be selected because it is JIT compiled--read barriers and write barriers can be inserted if needed for the chosen strategy. Has anyone thought about how Go could support different strategies without requiring a full recompile of your app and all its dependencies?

Florian Weimer

unread,
Oct 15, 2012, 3:02:56 PM10/15/12
to g...@azulsystems.com, golan...@googlegroups.com, Robert Griesemer, Ian Lance Taylor, <2paul.de@googlemail.com>, Rob Pike, Adam Langley
> This keeps the problem within the classic heap GC realm, and simply
> requires a way to derive the object address from an interior pointer
> (something most generational collectors can already do due to card
> marking, see previous note).

I'm not sure if the problems are comparable. Marked cards are rare,
and they can be processed in the order of increasing addresses.
Neither applies to generic pointer traversal.

moru...@gmail.com

unread,
Dec 24, 2013, 5:02:33 PM12/24/13
to golan...@googlegroups.com
Having built large realtime data processing systems, I'd like to point out that, from an industry perspective, GC is one of the most important issues, if not the most important.
Especially in distributed, latency-sensitive, message-driven systems (multicast messaging dislikes GC), GC is the major issue when using Java.
I'd say in many cases losing 30% performance to get a near-pauseless GC could be considered a good tradeoff.
In fact, the major point favouring C++ over Java is GC, not performance (with modern class libs and smart pointers, C++ performance is frequently below Java's).
I can't see Go squeezing in between C++ and Java without a significant improvement in GC. As Gil Tene said above, the later one starts addressing the issue, the more expensive it gets. It should be top priority.

⚛

unread,
Dec 25, 2013, 2:39:01 AM12/25/13
to golan...@googlegroups.com, moru...@gmail.com
The incomplete Tulgo compiler is aiming for immediate deallocation of objects that don't form part of a cycle. Tulgo uses reference counting as the starting point for implementing the garbage collector, while the standard Go compiler chose a stop-the-world garbage collector as its starting point.

Elazar Leibovich

unread,
Dec 25, 2013, 2:44:49 AM12/25/13
to ⚛, golang-dev, moru...@gmail.com
Won't cycle breaking still cause stop-the-world pauses? You still need to scan the entire heap from time to time.


--
 
---
You received this message because you are subscribed to a topic in the Google Groups "golang-dev" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-dev/GvA0DaCI2BU/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

⚛

unread,
Dec 25, 2013, 3:06:47 AM12/25/13
to golan...@googlegroups.com, ⚛, moru...@gmail.com
On Wednesday, December 25, 2013 8:44:49 AM UTC+1, Elazar Leibovich wrote:
Won't cycle breaking still cause stop-the-world pauses? You still need to scan the entire heap from time to time.

I do not want to comment on this now because the Tul compiler isn't there yet. Garbage collection experience with Tul will be different from the experience people have with other compilers.

I imagine that (if the Tulgo compiler is ever finished) the biggest obstacle for use by people other than myself will be how to gain access to the compiler.

Rüdiger Möller

unread,
Dec 25, 2013, 10:06:24 AM12/25/13
to ⚛, golan...@googlegroups.com
The issues with GC have persisted since Smalltalk's days. I think one should explore opportunities to relax the burden put on garbage collectors by doing something in between fully automated GC and malloc/free allocation. Having tested Azul's C4, I can confirm it is really impressive; it showed me some program-induced latencies which we had attributed to GC before :-).
However, it seems to be very complex to implement; I always get a headache when Gil starts going into the details of concurrent GC in a concurrent system :-).

As memory has become very cheap, heaps beyond 64GB will become the norm.

What are we doing in Java to deal with that?
In many cases it is easy to identify the data structures which are semi-static and will make up the bulk of heap consumption. In Java we put this stuff 'off-heap' (e.g. fast-serialization structs) and access it via a transaction-internal API facade. This memory is then managed manually; however, this is not that big of a problem, as stuff is copied to the "GC'ed" heap when needed (or referenced, for simple cases). So we have a 2-heap model, where the dynamic data (frequently <500MB) is GC-managed and the semi-static data is managed manually. However, dealing with off-heap in Java is a big hack.

I could imagine introducing a dual-heap model at the language level. Objects living in the manual heap are forbidden to reference objects in the GC'ed heap (this should be checked at runtime). The application needs to deep-copy an object to move it from the manual to the GC'ed heap and vice versa.
Objects in the GC'ed heap could be allowed to contain references to the manual heap, as we'd like to contain the size of the memory being scanned.

This way one can still write fully GC'ed applications, but there is an escape hatch in case you have to deal with 80GB of reference data :-). For extreme realtime requirements, parts of an application can choose to stay completely GC-free.

Regarding reference counting: AFAIK locality suffers a lot, because each store touches two memory regions, so the performance degradation will be pretty harsh due to cache misses.





2013/12/25 ⚛ <0xe2.0x...@gmail.com>

2pau...@googlemail.com

unread,
Dec 26, 2013, 2:41:58 PM12/26/13
to golan...@googlegroups.com


On Tuesday, December 24, 2013 11:02:33 PM UTC+1, Rüdiger Möller wrote:
Having built large realtime data processing systems, I'd like to point out that, from an industry perspective, GC is one of the most important issues, if not the most important.
Especially in distributed, latency-sensitive, message-driven systems (multicast messaging dislikes GC), GC is the major issue when using Java.
I'd say in many cases losing 30% performance to get a near-pauseless GC could be considered a good tradeoff.
In fact, the major point favouring C++ over Java is GC, not performance (with modern class libs and smart pointers, C++ performance is frequently below Java's).
I can't see Go squeezing in between C++ and Java without a significant improvement in GC. As Gil Tene said above, the later one starts addressing the issue, the more expensive it gets. It should be top priority.

As far as priorities go, have you read this thread: Go compiler and other tools written in Go?

It looks to me like the priorities for the next 12 months have already been set with sufficient ambition. Since it is planned that in this process the code will be refactored and restructured into idiomatic Go, I suspect that there may be openings (I am just speculating) for improving and innovating in storage management as well. At least I think it's a good opportunity to get involved.

Frankly, a 30% performance hit looks like a very tough sell to me. I have come to understand that maintenance is also a big consideration.

Dave Cheney

unread,
Dec 26, 2013, 5:19:17 PM12/26/13
to 2pau...@googlemail.com, golan...@googlegroups.com


On 27 Dec 2013, at 6:41, 2pau...@googlemail.com wrote:



On Tuesday, December 24, 2013 11:02:33 PM UTC+1, Rüdiger Möller wrote:
Having built large realtime data processing systems, I'd like to point out that, from an industry perspective, GC is one of the most important issues, if not the most important.
Especially in distributed, latency-sensitive, message-driven systems (multicast messaging dislikes GC), GC is the major issue when using Java.
I'd say in many cases losing 30% performance to get a near-pauseless GC could be considered a good tradeoff.
In fact, the major point favouring C++ over Java is GC, not performance (with modern class libs and smart pointers, C++ performance is frequently below Java's).
I can't see Go squeezing in between C++ and Java without a significant improvement in GC. As Gil Tene said above, the later one starts addressing the issue, the more expensive it gets. It should be top priority.

As far as priorities go, have you read this thread: Go compiler and other tools written in Go?

It looks to me like the priorities for the next 12 months have already been set with sufficient ambition. Since it is planned that in this process the code will be refactored and restructured into idiomatic Go, I suspect that there may be openings (I am just speculating) for improving and innovating in storage management as well. At least I think it's a good opportunity to get involved.

I cannot think of any other runtime, save maybe C with LD_LOAD_PATH hacks, that lets you interchange your GC algorithm like the JVM does. Pluggable GC should be viewed as an outlier, one that has generated its own subspecialty of tooling and consulting, rather than the goal of every language runtime.


Frankly, a 30% performance hit looks like a very tough sell to me. I have come to understand that maintenance is also a big consideration.

I agree. Gil is amazingly important and Azul solves problems for people who have no other choice, but I think it is still preferable to avoid creating the garbage in the first place, which is what Go allows you to do if you are careful. 


Am Mittwoch, 10. Oktober 2012 11:33:04 UTC+2 schrieb Elazar Leibovich:
In certain use cases, Java GC pauses impose real problems, see e.g. [1] [2]. Azul Systems have an interesting solution which sounds really appealing [3]; I know they're selling VMs whose main benefit is a pauseless GC algorithm.

Is Go considering a similar approach? I have very little understanding of GC, but it looks like pauseless GC would be a unique selling point for Go, not offered by any other open source language.


Rüdiger Möller

unread,
Dec 28, 2013, 9:06:52 AM12/28/13
to golan...@googlegroups.com
"
> but I think it is still preferable to avoid creating the garbage in the first place, which is what Go allows you to do if you are careful. 
"

That does not help at all if you have 64 GB of data allocated (which is only 25-50% of a modern server's available memory). At some point there will be a GC, and it will stop your Go/Java program for more than 20-40 seconds with a 64 GB heap. Full-GC duration depends on the heap size, not on the amount of garbage created.

There is an underlying assumption in all GC implementations that the size of available memory scales proportionally with memory traversal speed, but this does not hold true. Generational GC and/or a careful programming style reduce the frequency, but not the duration, of full GCs. Even fully concurrent Azul-style collection suffers: the further memory size and traversal speed diverge, the more buffer space is required, so it might easily require more than 3x the heap size compared to the actual data size.

I really think a two-heap memory model is urgently needed. It would just generalize the workarounds currently being made.

Dave Cheney

unread,
Dec 28, 2013, 9:08:30 AM12/28/13
to Rüdiger Möller, golang-dev
I think a model that allows you to store values outside the heap is
what is needed.

Rüdiger Möller

unread,
Dec 28, 2013, 9:13:11 AM12/28/13
to golan...@googlegroups.com
"store values outside heap"

read my post, that's what I proposed. However why abandon a language's object model to store values off-heap ?. By adding a second manually managed heap with proper migration semantics (GC Heap <=> manual managed heap), use of off-heap memory could be much smoother and faster than with in-memory databases/cache workarounds. It would still be possible to access off-heap objects directly from the language.

Dmitry Vyukov

unread,
Dec 28, 2013, 9:25:30 AM12/28/13
to Rüdiger Möller, golang-dev
It's already available in Go.
If you put your big data in big slices of structs without pointers, then
the GC won't scan those slices at all. This allows you to have a 128 GB
heap, but the GC will work as if the heap were 128 MB.

Rüdiger Möller

unread,
Dec 28, 2013, 9:43:47 AM12/28/13
to golan...@googlegroups.com
> It's already available in Go.
> If you put your big data in big slices of structs without pointers, then
> the GC won't scan those slices at all. This allows you to have a 128 GB
> heap, but the GC will work as if the heap were 128 MB.

That's nice, but it still prevents use of common language structures like hashmaps, etc. Adding a second "manual heap", then disallowing pointers from the manual to the GC'ed heap (and vice versa), would allow nearly uniform reuse of types and data structures whether they are allocated in the GC'ed heap or in the manual heap. This could also open up really convenient data management down the memory hierarchy (e.g. persisting object graphs to SSD or HD).

unread,
Dec 28, 2013, 10:43:31 AM12/28/13
to golan...@googlegroups.com
I find it too hard to reason about this without actual example code.

If it is possible to reduce your application's code to no more than 100 lines of Go code demonstrating the core problem, please send the code to golang-dev. I will try to analyse it and add it to the Tul compiler as a benchmark. The code should import standard Go packages only.