When does a sync.Pool release its contents?


Jakob Borg

Aug 19, 2014, 3:38:47 PM
to golang-nuts

The Go 1.3 sync.Pool seems awesome, but I'm failing at using it to
conserve memory / reuse buffers. I had *thought* that when a GC
happens, a sync.Pool would drop its contents - somewhat like having
"weak references" to the contents, or having made the objects
available for garbage collection but being able to reuse them if a GC
hadn't happened yet when they were needed again. However in practice
it often leads to a very large footprint.

I compared two very simple (synthetic, yes) test programs. One uses a
"traditional" channel approach to store temporary buffers:


The other uses a sync.Pool:


In both cases the behavior is the same: the pool/channel is filled at
twice the rate it's emptied. Obviously this is synthetic, but it's not
*too* far from some use cases I've tried where buffers are Put()
opportunistically and not necessarily used immediately afterwards.

The result is that the channel based approach holds steady at about 2
MB heap, while the pool based approach grows without bounds.

Is this expected? If it is, how is it envisioned that we use a
sync.Pool efficiently - do we need to periodically manually empty the
pool by repeatedly calling Get() until it returns nil?


Caleb Spare

Aug 19, 2014, 3:59:56 PM
to Jakob Borg, golang-nuts
This does not address the question you asked, but: you probably
shouldn't copy a sync.Pool value. Share a *sync.Pool between the
goroutines so that you're operating on the same Pool.


Jakob Borg

Aug 19, 2014, 4:04:33 PM
to Caleb Spare, golang-nuts
Ah, you are of course right, the program is buggy;
http://play.golang.org/p/RjVCHPlwbW is corrected (var p sync.Pool =>
var p = &sync.Pool{}). It halves the growth rate, obviously, but the
behavior at large remains. I guess the conclusion is that it's not
supported to Put() more into a pool than you Get() from it, but this
seems like a very limiting ... limitation, and opposite to what is
implied by the documentation?


Dave Collins

Aug 19, 2014, 4:09:54 PM
to golan...@googlegroups.com
Your examples are not functionally equivalent.  The first example is limited
because it's using a non-blocking select which means once the channel is filled
with 100 items, it *drops* the rest.  The second example, on the other hand,
just keeps adding entries at twice the rate they are consumed and hence it
grows without bounds as you noted.
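
The two original programs are not reproduced in this thread, so the following is only a minimal sketch of the non-blocking select behavior described above. The helper name `offer` and the buffer size are assumptions made for illustration; only the channel capacity of 100 comes from the description.

```go
package main

import "fmt"

// offer tries to hand buf to a bounded, channel-backed free list without
// blocking. It reports false when the channel is full and buf was dropped.
// (offer is a hypothetical helper, not code from the original programs.)
func offer(free chan []byte, buf []byte) bool {
	select {
	case free <- buf:
		return true
	default:
		// Channel is full: buf is not stored and becomes garbage.
		return false
	}
}

func main() {
	free := make(chan []byte, 100) // capacity 100, as described above

	kept, dropped := 0, 0
	for i := 0; i < 250; i++ {
		if offer(free, make([]byte, 1024)) {
			kept++
		} else {
			dropped++
		}
	}
	fmt.Println(kept, dropped, len(free)) // prints "100 150 100"
}
```

Because the send falls through to `default` once the channel is full, the free list can never hold more than its capacity, which is why the channel version stays flat while an unbalanced sync.Pool does not.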

Carlos Castillo

Aug 19, 2014, 4:12:56 PM
to golan...@googlegroups.com
Short answer: That is not how to use a sync.Pool.

The expected pattern for using a sync.Pool is to set the pool's "New" field, which is a function returning an interface{}. If you need a new object you don't create one yourself; you call Get, and it returns either a pooled object or the result of calling New. Then, before you use it, you should assert it to the correct type and initialize it. When you are done with the object, you put it back in the pool for future use.

I.e., a single goroutine does:
  1. x := pool.Get() // either returns a pool object, or the result of pool.New
  2. y := x.(sometype)
  3. y.Init()
  4. y.DoStuff()
  5. pool.Put(y)
This way Gets and Puts are balanced, and there will never be more objects in existence than at the time of greatest concurrency. Should the concurrency level drop, the pool fills, and it will drop its unneeded contents on/after a GC.
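
As a rough sketch of the five steps above in real Go, assuming a pool of *bytes.Buffer values (the names `bufPool` and `render` are made up for this example, and Reset stands in for the "Init" step):

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out *bytes.Buffer values. New is only called when the pool
// has nothing to offer. (bufPool and render are hypothetical names.)
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// render follows the balanced pattern: Get, assert, init, use, Put.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer) // steps 1 and 2: Get and type-assert
	buf.Reset()                          // step 3: a reused buffer is not empty
	defer bufPool.Put(buf)               // step 5: exactly one Put per Get

	buf.WriteString("hello, ") // step 4: do the work
	buf.WriteString(name)
	return buf.String() // String copies, so the result outlives the Put
}

func main() {
	fmt.Println(render("gopher")) // prints "hello, gopher"
	fmt.Println(render("nuts"))   // prints "hello, nuts"
}
```

Every Put here is paired with an earlier Get in the same function, so the number of live buffers is bounded by the number of concurrent calls to `render`.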

What you've discovered is likely the result of the pool being drained after the GC performed its marking phase, so the new memory limit (for the next GC) is set without knowing that the pool will be empty. As a result, the memory used before the next GC will be higher than before and the pool has more room to "fill" before the GC is called again. Under proper use this would never happen though.

Jakob Borg

Aug 19, 2014, 4:20:38 PM
to Carlos Castillo, golang-nuts
Right, so Put() must be balanced by Get() or, easier, "Don't Put()
what you didn't Get()"...

Now, to figure out why this (the apparent memory leakage) happens in a
much more complicated code base than my example... :)


Caleb Spare

Aug 19, 2014, 4:26:49 PM
to Carlos Castillo, golang-nuts
Carlos, I agree that this is a misuse of sync.Pool but I don't think
your GC theory holds up. You'll see the same behavior if you
explicitly call runtime.GC periodically as well.

Carlos Castillo

Aug 19, 2014, 5:37:54 PM
to golan...@googlegroups.com, cook...@gmail.com
Reiterating and expanding on what was discussed privately:

If you are using a sync.Pool to manage a collection of a single type of object, it should be used *instead of* (or at least before) allocating memory yourself. Calling pool.Get() returns you a previously Put() object. If none exists, you either get nil and need to allocate memory yourself, or you get the result of pool.New (if set) which should create a new object.

The difference between using a pool object and allocating one yourself is that you may have received a previously used object, which won't be zeroed, so if that matters you must init it yourself.

When you are done with a pool object, you should Put() it back in the pool for future re-use. You technically don't have to, since the GC will collect it when it's considered garbage, but then you've gained nothing from using a sync.Pool. When you call Put() it allows the object to be re-used immediately by someone else (calling Get()), avoiding the need to create a new object.

The real benefit of a pool is to help create a pattern in the code where values are re-used without needing new allocations; the GC is usually only called when allocations push memory use over a limit, so by avoiding the allocations, you postpone having the GC run. Also, a sync.Pool has some optimizations that improve performance of the pool when used on multi-processor machines as opposed to using a channel, or some other simple scheme.

Using a pool is meant to be an optimization, and is best used when certain conditions apply:
 - All objects are the same type and roughly the same size
 - Instances of the type are allocated frequently (otherwise why bother?)
 - Further allocations are usually not needed (e.g. the object doesn't contain pointers, slices, maps, or channels which might need more memory)
 - Instances have a clear point when they are no longer needed
 - When calling Put(x), x is the only reference to the object.

In the common case some/most of these conditions won't apply, and it's probably simpler to just use regular allocation.

Carlos Castillo

Aug 19, 2014, 5:42:13 PM
to golan...@googlegroups.com, cook...@gmail.com
True, I am mostly guessing what was going on with the expanding memory (my other guess would be the concurrent GC sweep). 

The important bit is that the sync.Pool is being used improperly, and I doubt much effort will be put into making the wrong way work "correctly".


Dec 28, 2015, 12:36:02 PM
to golang-nuts
I am reviving this dead thread because it is referenced in a blog post at http://elithrar.github.io/article/using-buffer-pools-with-go/ , and I don't want it to mislead future gophers.

FTR, here's an example of using a sync.Pool the "right" way: http://play.golang.org/p/RXFN2ACuNC . In this example, the allocation of a []byte only happens three times: once for each of the three concurrent goroutines. 

OTOH, look at this example where I deliberately invoke the GC a bunch: http://play.golang.org/p/fwUs1JN-YM . Here the pool keeps being drained before it can be used, so there are a ton of allocations.

The point of a pool is to minimize the number of allocations without a bunch of fiddly knob twiddling (that is, you don't have to guess how big of a pool to create, when to drain it, etc.). Just define the New function at startup time, then always invoke Get and Put to ensure memory reuse.

Also, note that when you want to use a type that takes or returns an interface{}, it's trivial to get back to type safety land: just wrap it in a statically typed method that calls the underlying dynamically typed method. Since you control its inputs and outputs, you can know they're type safe.
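
One possible shape for such a wrapper, as a sketch only: `BufferPool` and `NewBufferPool` are names invented here, and *bytes.Buffer is an assumed element type (the linked examples use []byte).

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// BufferPool is a statically typed wrapper around sync.Pool. Only
// *bytes.Buffer values can go in or come out, so the type assertion in Get
// cannot fail. (BufferPool is a hypothetical type for illustration.)
type BufferPool struct {
	pool sync.Pool
}

// NewBufferPool returns a pointer so callers share one pool rather than
// copying it.
func NewBufferPool() *BufferPool {
	return &BufferPool{
		pool: sync.Pool{
			New: func() interface{} { return new(bytes.Buffer) },
		},
	}
}

// Get returns an empty buffer, either reused or freshly allocated.
func (bp *BufferPool) Get() *bytes.Buffer {
	buf := bp.pool.Get().(*bytes.Buffer)
	buf.Reset()
	return buf
}

// Put returns buf to the pool; the caller must not use buf afterwards.
func (bp *BufferPool) Put(buf *bytes.Buffer) {
	bp.pool.Put(buf)
}

func main() {
	bp := NewBufferPool()
	buf := bp.Get()
	buf.WriteString("typed")
	fmt.Println(buf.String()) // prints "typed"
	bp.Put(buf)
}
```

Callers never see interface{}, so a wrong type is a compile error instead of a runtime panic.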

Matt Silverlock

Dec 29, 2015, 5:03:29 AM
to golang-nuts
As the author of that post: it's not intended to suggest avoiding sync.Pool. There are however use cases where a sync.Pool may not be objectively better than a channel-based pool of your own.

sync.Pool's dynamism is a huge plus and allows you to avoid any kind of tuning: particularly useful in a library context where the pool is an implementation detail.

In other cases, you want to explicitly retain the pool members: e.g. a "bursty" network application may not want the GC to clean up during a period of idle time, as that would potentially require a bunch of re-allocation shortly after. The downside is that you have to size the pool yourself (and potentially its members), which can be non-trivial.
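
A minimal sketch of that kind of channel-based pool, assuming []byte members of a fixed size. `FixedPool` and its methods are illustrative names, not the code from the blog post.

```go
package main

import (
	"fmt"
	"runtime"
)

// FixedPool keeps up to cap(free) buffers alive. The channel holds ordinary
// references, so a GC never empties it. (FixedPool is a hypothetical type.)
type FixedPool struct {
	free chan []byte
	size int
}

// NewFixedPool sizes the pool by hand: capacity buffers of size bytes each.
func NewFixedPool(capacity, size int) *FixedPool {
	return &FixedPool{free: make(chan []byte, capacity), size: size}
}

// Get reuses a retained buffer if one is available, else allocates.
// A reused buffer still holds its old contents.
func (p *FixedPool) Get() []byte {
	select {
	case b := <-p.free:
		return b
	default:
		return make([]byte, p.size)
	}
}

// Put retains b if there is room and silently drops it otherwise.
func (p *FixedPool) Put(b []byte) {
	select {
	case p.free <- b:
	default:
	}
}

// Len reports how many buffers are currently retained.
func (p *FixedPool) Len() int { return len(p.free) }

func main() {
	p := NewFixedPool(4, 1024)
	p.Put(make([]byte, 1024))
	runtime.GC()         // an idle-time collection does not drain the channel
	fmt.Println(p.Len()) // prints "1"
}
```

The capacity and buffer size passed to `NewFixedPool` are exactly the tuning knobs that sync.Pool lets you avoid.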

(Horses for courses)