gc-aware pool draining policy

2,223 views
Skip to first unread message

Russ Cox

unread,
Nov 27, 2013, 9:57:41 AM11/27/13
to golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
Here is a proposal for a gc-aware pool type.

// A Pool is a set of allocated objects to be reused.
// It may be used by multiple goroutines simultaneously.
// The Pool is drained during garbage collection.
type Pool

// Put adds x to the pool.
func (p *Pool) Put(x interface{})

// Get removes and returns a value from the pool.
// If the pool is empty, Get returns nil.
func (p *Pool) Get() interface{}

Any implementation must of course drain pools once in a while, to avoid unwanted memory retention.

There are details about how to implement the synchronization, whether there should be per-M or per-P shards, whether the API should be slightly different (should Get invoke a function for you to avoid the nil return) and so on, but let's not worry about those things here. The key point here is the draining policy, which was the biggest problem with the earlier proposals.

Previous proposals got hung up on exactly how the pools are sized, or equivalently, when and how quickly they are drained. There were suggestions about using pool operations as a virtual clock, but then a pool that is not being used at all never drains. Using real time also seems wrong.

Most commonly, the goal of a pool is to make allocated memory available for reuse earlier than, and avoiding the cost of, the next garbage collection. Keeping allocated memory in a pool conflicts with the goal of the garbage collection, which is to make no longer used memory available for new uses or to return it to the operating system. That is, draining a pool before a collection is less than ideal because the memory has no chance of reuse until the collection. Retaining a value in a pool beyond a collection is less than ideal because the memory might profitably be collected and used as some other type or returned to the operating system. 

If before garbage collection is too early and after garbage collection too late, then the right time to drain the pool must be during garbage collection. That is, the semantics of the Pool type must be that it drains at each garbage collection.

The most direct way to implement the draining is to maintain a list of non-empty pools. A  call to Put on an empty pool adds the pool to this list. At the start of a collection, the garbage collector drains all pools on the list and then clears the list. Since the collection ends with all pools empty, the work is bounded by the number of pool operations since the last collection.

Note that this proposal is much simpler than weak pointers. A weak pointer is only reset to nil during a collection when no ordinary ("strong") pointers point to the same object. That definition effectively requires a two pass collector: one pass to find the memory to free and then a second pass to zero the corresponding weak pointers. Pools are drained during collection regardless of the rest of memory. In general a pointer in a pools should not be in use elsewhere in memory (or else it is a bug to reuse it), so the expensive weak pointer condition would add no benefit to a pool implementation.

Thoughts?
Russ

Jan Mercl

unread,
Nov 27, 2013, 10:09:35 AM11/27/13
to Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Wed, Nov 27, 2013 at 3:57 PM, Russ Cox <r...@golang.org> wrote:
> Here is a proposal for a gc-aware pool type.

...

> Thoughts?

Looks pretty reasonable to me. One thing which seems to not be a part
of this proposal is: How would pools of different types (pools holding
instances of different types) be declared? Can you please clarify that
detail? TIA.

-j

Russ Cox

unread,
Nov 27, 2013, 10:20:33 AM11/27/13
to Jan Mercl, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Wed, Nov 27, 2013 at 10:09 AM, Jan Mercl <0xj...@gmail.com> wrote:
Looks pretty reasonable to me. One thing which seems to not be a part
of this proposal is: How would pools of different types (pools holding
instances of different types) be declared? Can you please clarify that
detail? TIA.

The same way we used to use container/vector. There is no explicit type annotation for the elements. You put interface{} values in, and you take interface{} values out and type assert them. If you use the wrong type assertion, it fails.

Russ

Brad Fitzpatrick

unread,
Nov 27, 2013, 11:50:26 AM11/27/13
to Russ Cox, Carl Shapiro, Keith Randall, Dmitry Vyukov, golang-dev

SGTM

--
 
---
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Gustavo Niemeyer

unread,
Nov 27, 2013, 1:07:52 PM11/27/13
to Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
Rather than stating the pool is necessarily drained during garbage
collection, perhaps it's best to keep the draining logic
implementation-dependent, even if the first iteration is simply on
first GC. This way we can potentially evolve the logic over time as we
learn more about the problem. Well, unless someone actually has good
background on pools that collect on first GC (I don't).
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "golang-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-dev+...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.



--

gustavo @ http://niemeyer.net

Ian Lance Taylor

unread,
Nov 27, 2013, 1:09:20 PM11/27/13
to Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Wed, Nov 27, 2013 at 6:57 AM, Russ Cox <r...@golang.org> wrote:
> Here is a proposal for a gc-aware pool type.
>
> // A Pool is a set of allocated objects to be reused.
>
> // It may be used by multiple goroutines simultaneously.
>
> // The Pool is drained during garbage collection.
>
> type Pool
>
> // Put adds x to the pool.
> func (p *Pool) Put(x interface{})
>
> // Get removes and returns a value from the pool.
> // If the pool is empty, Get returns nil.
> func (p *Pool) Get() interface{}

To clarify this for myself, I think what you are saying is: you can
add anything to a Pool using Put. Get will return some element placed
in the Pool by Put. The Pool retains the values added via Put. At
garbage collection time, those retained values will be discarded. If
those values contain pointers, and nothing else points to the objects
to which they point, those objects will be collected as usual.

That seems like a good plan.

Ian

Russ Cox

unread,
Nov 27, 2013, 1:33:21 PM11/27/13
to Ian Lance Taylor, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Wed, Nov 27, 2013 at 1:09 PM, Ian Lance Taylor <ia...@golang.org> wrote:
To clarify this for myself, I think what you are saying is: you can
add anything to a Pool using Put.  Get will return some element placed
in the Pool by Put.  The Pool retains the values added via Put.  At
garbage collection time, those retained values will be discarded.  If
those values contain pointers, and nothing else points to the objects
to which they point, those objects will be collected as usual.

Yes, thanks.

Russ

Ingo Oeser

unread,
Nov 27, 2013, 4:33:40 PM11/27/13
to golan...@googlegroups.com
Will it be fully or partially drained?

Andrew Gerrand

unread,
Nov 27, 2013, 4:38:43 PM11/27/13
to Gustavo Niemeyer, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
Elegant solution. SGTM

On 28 November 2013 05:07, Gustavo Niemeyer <gus...@niemeyer.net> wrote:
Rather than stating the pool is necessarily drained during garbage
collection, perhaps it's best to keep the draining logic
implementation-dependent, even if the first iteration is simply on
first GC. This way we can potentially evolve the logic over time as we
learn more about the problem. Well, unless someone actually has good
background on pools that collect on first GC (I don't).

I'm not sure we need to worry about over-specification here, because the behavior of the garbage collector itself is not guaranteed.


Josh Hoak

unread,
Nov 28, 2013, 9:15:52 AM11/28/13
to Andrew Gerrand, Gustavo Niemeyer, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall

tracey....@gmail.com

unread,
Nov 30, 2013, 3:47:06 AM11/30/13
to golan...@googlegroups.com, Andrew Gerrand, Gustavo Niemeyer, Russ Cox, Dmitry Vyukov, Carl Shapiro, Keith Randall
Assuming this comes into existence, can there be an accompanying usage guide? I don't think I understand the proper time to use this structure, especially in the context of concurrency.

Florian Weimer

unread,
Dec 1, 2013, 6:37:51 AM12/1/13
to Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
* Russ Cox:

> Note that this proposal is much simpler than weak pointers. A weak pointer
> is only reset to nil during a collection when no ordinary ("strong")
> pointers point to the same object. That definition effectively requires a
> two pass collector: one pass to find the memory to free and then a second
> pass to zero the corresponding weak pointers.

Finalizers require this as well and are closely related.

> Pools are drained during
> collection regardless of the rest of memory. In general a pointer in a
> pools should not be in use elsewhere in memory (or else it is a bug to
> reuse it), so the expensive weak pointer condition would add no benefit to
> a pool implementation.

Pooling objects is most beneficial if they are expensive to construct.
Such objects often need finalizers, too. Even if the pools are
cleared, parts of the objects stored in them would end up in
finalization queues.

The proposed interface does not appear to be adequate to implement a
pool of reusable objects which is indexed with some key. A map with
pool values would not remove the corresponding keys when the pool
values are cleared. A pool of maps would not work because in order to
use the maps, you'd have to remove them from the pool.

Ian Lance Taylor

unread,
Dec 1, 2013, 11:18:49 AM12/1/13
to Florian Weimer, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Sun, Dec 1, 2013 at 3:37 AM, Florian Weimer <f...@deneb.enyo.de> wrote:
>
>> Pools are drained during
>> collection regardless of the rest of memory. In general a pointer in a
>> pools should not be in use elsewhere in memory (or else it is a bug to
>> reuse it), so the expensive weak pointer condition would add no benefit to
>> a pool implementation.
>
> Pooling objects is most beneficial if they are expensive to construct.
> Such objects often need finalizers, too. Even if the pools are
> cleared, parts of the objects stored in them would end up in
> finalization queues.

There are other uses for pooling objects. For example, the fmt
package today uses a cache for pp structs, to make printing more
efficient. The proposal here would permit a generic implementation of
that cache, rather than one specific to fmt. Defining a generic
implementation makes it more reasonable to hook it into the runtime
library, permitting existing entries to be returned without having to
take a lock.


> The proposed interface does not appear to be adequate to implement a
> pool of reusable objects which is indexed with some key. A map with
> pool values would not remove the corresponding keys when the pool
> values are cleared. A pool of maps would not work because in order to
> use the maps, you'd have to remove them from the pool.

That is true: this proposal is not a weak pointer proposal. A weak
pointer proposal is harder to understand and, particularly, to
implement safely in a language like Go.

One approach that might get something like what you want would be the
ability to register functions that would be called before running a
full garbage collection. I'm not at all sure that that would be a
good idea, though.

Ian

Brad Fitzpatrick

unread,
Dec 1, 2013, 12:04:54 PM12/1/13
to Ian Lance Taylor, Florian Weimer, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
You already can do that, though: http://play.golang.org/p/58Ky2KPPr-

Russ Cox

unread,
Dec 1, 2013, 3:58:56 PM12/1/13
to Florian Weimer, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Sun, Dec 1, 2013 at 6:37 AM, Florian Weimer <f...@deneb.enyo.de> wrote:
* Russ Cox:

> Note that this proposal is much simpler than weak pointers. A weak pointer
> is only reset to nil during a collection when no ordinary ("strong")
> pointers point to the same object. That definition effectively requires a
> two pass collector: one pass to find the memory to free and then a second
> pass to zero the corresponding weak pointers.

Finalizers require this as well and are closely related.

They may be related to weak pointers, but finalizers absolutely do not require this second pass to update the heap. We already have finalizers without this pass.

Russ

Florian Weimer

unread,
Dec 1, 2013, 4:21:46 PM12/1/13
to Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
* Russ Cox:

>> > Note that this proposal is much simpler than weak pointers. A weak
>> pointer
>> > is only reset to nil during a collection when no ordinary ("strong")
>> > pointers point to the same object. That definition effectively requires a
>> > two pass collector: one pass to find the memory to free and then a second
>> > pass to zero the corresponding weak pointers.
>>
>> Finalizers require this as well and are closely related.

> They may be related to weak pointers, but finalizers absolutely do not
> require this second pass to update the heap. We already have finalizers
> without this pass.

I might be wrong about this, but a quick glance at the run-time
library suggests that Go currently has a two-phase mark-and-sweep
collector, and the finalizer processing is done during the (sweeping)
pass over the heap.

Even with a different collector which doesn't perform a second heap
traversal, it's only necessary to scan the set of weak references (or
finalizable objects) and process the objects found there which the
collector has identified as non-reachable.

Florian Weimer

unread,
Dec 1, 2013, 4:27:24 PM12/1/13
to Ian Lance Taylor, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
* Ian Lance Taylor:

> That is true: this proposal is not a weak pointer proposal. A weak
> pointer proposal is harder to understand and, particularly, to
> implement safely in a language like Go.

Weak pointers and finalizers are dual, you can use one of them to
implement the other.

Ian Lance Taylor

unread,
Dec 2, 2013, 12:47:27 PM12/2/13
to Florian Weimer, Russ Cox, golang-dev, Dmitry Vyukov, Carl Shapiro, Keith Randall
On Sun, Dec 1, 2013 at 1:21 PM, Florian Weimer <f...@deneb.enyo.de> wrote:
>
>>> > Note that this proposal is much simpler than weak pointers. A weak
>>> pointer
>>> > is only reset to nil during a collection when no ordinary ("strong")
>>> > pointers point to the same object. That definition effectively requires a
>>> > two pass collector: one pass to find the memory to free and then a second
>>> > pass to zero the corresponding weak pointers.
>>>
>>> Finalizers require this as well and are closely related.
>
>> They may be related to weak pointers, but finalizers absolutely do not
>> require this second pass to update the heap. We already have finalizers
>> without this pass.
>
> I might be wrong about this, but a quick glance at the run-time
> library suggests that Go currently has a two-phase mark-and-sweep
> collector, and the finalizer processing is done during the (sweeping)
> pass over the heap.

That is true, but I'm not sure what the conclusion is. There is no
second pass to implement finalizers.


> Even with a different collector which doesn't perform a second heap
> traversal, it's only necessary to scan the set of weak references (or
> finalizable objects) and process the objects found there which the
> collector has identified as non-reachable.

In Go's current implementation, blocks with finalizers are treated as
roots but are not marked themselves. When the gc is about to free an
object with a finalizer, it instead does not free the object, queues
up the finalizer to run, and marks the object as not having a
finalizer. After the finalizer runs, there will presumably be no
remaining pointers to the object, and it will be freed in a later GC.
There is no additional scan.

I guess what you are suggesting is that in the mark phase the GC
should collect all the weak pointers. Then in the sweep phase the GC
can examine each weak pointer and see if it points to an unmarked
object. If it does, the pointer can be set to nil. But of course
this only works if we require that all weak pointers be in the heap.
And we need some way to mark those weak pointers. Actually I guess
that would be doable though unfortunately it would not work for maps
of weak pointers, since map entries are not addressable.


> Weak pointers and finalizers are dual, you can use one of them to
> implement the other.

Good, then since we have finalizers we are set.

Ian

Carl Shapiro

unread,
Dec 2, 2013, 5:25:58 PM12/2/13
to Russ Cox, golang-dev, Dmitry Vyukov, Keith Randall
On Wed, Nov 27, 2013 at 6:57 AM, Russ Cox <r...@golang.org> wrote:
The most direct way to implement the draining is to maintain a list of non-empty pools. A  call to Put on an empty pool adds the pool to this list. At the start of a collection, the garbage collector drains all pools on the list and then clears the list. Since the collection ends with all pools empty, the work is bounded by the number of pool operations since the last collection.

This typically does interact well with partial garbage collection (such as a generational collector).  Partial garbage collections typically occur at a higher frequency than full collection and so by your definition pools would be cleared more often limiting the benefit of the pool.  

Furthermore, with such a garbage collection mechanism, a pool might point to objects in a region of the heap that is not being collected, so clearing the pool can lead to unwanted bloat.  One could try to be clever and not clear all pools but it is very hard to get a good clearing signal for what to get rid of in a pool because, in most cases, quantifying the storage benefit of clearing a reference is a non-local operation.  Plus, complicating the definition of the eviction policy makes pools less useful as users cannot rely on when they will be cleared or understand why they are cleared.
 
Note that this proposal is much simpler than weak pointers. A weak pointer is only reset to nil during a collection when no ordinary ("strong") pointers point to the same object. That definition effectively requires a two pass collector: one pass to find the memory to free and then a second pass to zero the corresponding weak pointers. Pools are drained during collection regardless of the rest of memory. In general a pointer in a pools should not be in use elsewhere in memory (or else it is a bug to reuse it), so the expensive weak pointer condition would add no benefit to a pool implementation.

Weak pointers can be cleared in time strictly proportional to the number of weak pointers that need clearing.  The complexity is no higher than the implementation of pools you have described.  You still have to visit every pointer in every pool and clear it.

There is another assumption here which is one of atomicity.  Can the user ever observe a pool partially clear?  Tying some action to the start of a garbage collection induces a lot of latency.  Now you have to wait for all threads to stop before clearing a pool.  What happens if I run the mark concurrently with my application?  What gets cleared then?  What if I remove the pauses in my collector so there is no atomic phase at all?  Do adding pools prevent me from doing that?
 
Thoughts?

Tying the semantics of your program to the behavior of the garbage collector is fraught with peril.  It is not uncommon in Java to try an implement cache eviction using weak or their so-called "soft" references.  It seldom, if ever, works out well because the behavior of the collector is never predictable.

There is also an underlying assumption in your proposal that garbage collection events are caused by a low storage condition.  This need not be true.  Garbage collections often occur opportunistically (a user calls runtime.GC because now is a good time to collect so the runtime will not collect later) and that fouls up stuff like this because the pools are prevented from retaining objects.

I would avoid adding an API like this.  It is better to implement a cache eviction policy based on the amount of storage being retained by the cache and not involve the collector.  You'll need a different API and different signals, but it's easier for users to reason about.

Carl Shapiro

unread,
Dec 2, 2013, 6:57:52 PM12/2/13
to Russ Cox, golang-dev, Dmitry Vyukov, Keith Randall
On Mon, Dec 2, 2013 at 2:25 PM, Carl Shapiro <csha...@golang.org> wrote:
This typically does interact well with partial garbage collection (such as a generational collector).  Partial garbage

s/does interact/does not interact/

Russ Cox

unread,
Dec 2, 2013, 9:16:41 PM12/2/13
to Carl Shapiro, golang-dev, Dmitry Vyukov, Keith Randall
On Mon, Dec 2, 2013 at 5:25 PM, Carl Shapiro <csha...@golang.org> wrote:
I would avoid adding an API like this.  It is better to implement a cache eviction policy based on the amount of storage being retained by the cache and not involve the collector.  You'll need a different API and different signals, but it's easier for users to reason about.

I am not sure the API needs to change. The sentence about the garbage collector I would not actually put in the real API docs; I'd just say that the pool can discard objects at any time. And until there is a generational collector it seems like collection time is a good time. 

Supposing we end up with other collection implementations or other reasons not to tie the pool draining to collection, do you have any suggestions for a better signal?

Russ

Gustavo Niemeyer

unread,
Dec 2, 2013, 10:13:44 PM12/2/13
to Russ Cox, Carl Shapiro, golang-dev, Dmitry Vyukov, Keith Randall
On Tue, Dec 3, 2013 at 12:16 AM, Russ Cox <r...@golang.org> wrote:
> Supposing we end up with other collection implementations or other reasons
> not to tie the pool draining to collection, do you have any suggestions for
> a better signal?

There are two interesting metrics that are cheap to evaluate, and take
into account how the pool is being used by the application:

1) Was there an attempt to put into pool that failed because it was full?
2) Was there an attempt to get from the pool that failed as it was empty?

Growing the pool size has a better chance of having a positive effect
on the application if 2 happened after 1. It's also very cheap to grow
the pool after 2 since the pool is empty at that point. On the other
hand, if 1 happens without 2, growing the pool size wouldn't help, as
the current number of entries has satisfied all the get requests; and
if 2 happens without 1, growing it would also not help, because there
was never an attempt to put more than the current capacity.

Then the question is when to give some of the pool content back to the
system. Again thinking of an actual impact metric of cheap evaluation:

3) How many garbage collections have happened without the pool going
below a given level?

If the pool stays above a level for so long, the entries below that level are
unnecessary garbage. This also hooks into the growth logic, as 3 won't
kick in if 2 has happened.


gustavo @ http://niemeyer.net

Keith Randall

unread,
Dec 2, 2013, 10:58:43 PM12/2/13
to Gustavo Niemeyer, Russ Cox, Carl Shapiro, golang-dev, Dmitry Vyukov, Keith Randall
I have a similar issue with stack copying - when to shrink the stack?  The obvious choice right now is to do it at GC as the threads are already stopped, but in the future with incremental and/or concurrent GC there will need to be some other "do it occasionally" mechanism.  Sounds like both pools and stack copying will need such a mechanism eventually.
Reply all
Reply to author
Forward
0 new messages