--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
Hi Norman
Thanks for replying.
Netty 4 buffer pool is on my to-read list. Talk to you soon :)
Cheers
Gg
The most efficient pool is one which is just pre-allocated and you keep reusing. Passing resources, especially byte[], between threads is very expensive to recycle, as the byte[] will actually get copied back between CPU caches even if you overwrite it, i.e. the CPU cache is too stupid not to read in a cache line you are about to overwrite.
Using finalize is very expensive, as it adds nodes to a queue that is processed in a background thread. This is likely to be more expensive than letting the GC do the work. It is made worse if a finalized object blocks for a while, as that prevents anything else being finalized.
I suggest looking at using a ring buffer of buffers. This has the advantage that when you have a free slot to write data, you also have a recycled buffer to write to as well. It looks something like this:

RingBuffer<ByteBuffer> ring = ...

// producer
ByteBuffer bytes = ring.nextFree();
populate(bytes);
ring.writeComplete(bytes);

// consumer
ByteBuffer bytes = ring.nextUsed();
process(bytes);
ring.readComplete(bytes);

This doesn't solve the problem of writing the buffer back again, but it does mean you can queue buffers and recycle them in one action.
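A much-simplified, single-threaded sketch of such a ring of pre-allocated buffers might look like the following. The method names (nextFree, writeComplete, nextUsed, readComplete) follow the sketch above; everything else is my own assumption — a real single-producer/single-consumer ring would publish its sequence counters properly (volatile/VarHandle), which is omitted here to keep the recycling idea visible:

```java
import java.nio.ByteBuffer;

// Simplified "ring buffer of buffers": a free slot and a recycled
// buffer come as one unit. Not thread-safe as written (a sketch only).
public class BufferRing {
    private final ByteBuffer[] slots;
    private long writeSeq; // next slot the producer will fill
    private long readSeq;  // next slot the consumer will drain

    public BufferRing(int capacity, int bufferSize) {
        slots = new ByteBuffer[capacity];
        for (int i = 0; i < capacity; i++) {
            slots[i] = ByteBuffer.allocate(bufferSize); // allocated once, reused forever
        }
    }

    // Producer: claim the recycled buffer sitting in the next free slot.
    public ByteBuffer nextFree() {
        if (writeSeq - readSeq == slots.length) return null; // ring full
        ByteBuffer b = slots[(int) (writeSeq % slots.length)];
        b.clear(); // same backing array, just reset for overwriting
        return b;
    }

    public void writeComplete(ByteBuffer b) { writeSeq++; } // publish the slot

    // Consumer: take the next filled buffer.
    public ByteBuffer nextUsed() {
        if (readSeq == writeSeq) return null; // ring empty
        ByteBuffer b = slots[(int) (readSeq % slots.length)];
        b.flip(); // switch from writing to reading
        return b;
    }

    public void readComplete(ByteBuffer b) { readSeq++; } // slot free again

    public static void main(String[] args) {
        BufferRing ring = new BufferRing(4, 64);
        ByteBuffer out = ring.nextFree();
        out.putInt(42);
        ring.writeComplete(out);
        ByteBuffer in = ring.nextUsed();
        System.out.println(in.getInt()); // prints 42
        ring.readComplete(in);
    }
}
```

Note that readComplete is what makes the buffer reusable: once the consumer calls it, the next nextFree on that slot hands the producer the very same ByteBuffer.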
Hi Kirk
That's quite factual! :)
Because the pooled object wrapping the byte array is not 'resurrected', do you think this still applies to the byte array?
I'm going to look at openjdk source code, you got my interest!
Anyway
Non-finalizer 3 - 0 finalizer
It's hard to fight :)
I have a lot to explore now
I'll keep you posted if I find anything interesting
Many thanks for your time
Gg
There are multiple parts to address here. I'll start with the small style stuff and progress through to the "this is very inefficient" conclusion.
First, the use of a finalizer is correctly frowned upon. Virtually anything you can do with a finalizer can be more cleanly done with either a weak reference (for postmortem processing) or a phantom reference (for pre-mortem processing). In this specific case, the cleaner way to express what you are trying to achieve (the recycling of the byte array upon death of the containing object) is to use a weak reference. You do so by creating a subclass of WeakReference, say PooledByteArrayReference, which would include a field that strongly references the byte array. PooledByteArrayReference instances would be created in the PooledByteArray constructor, and the queue they are put in is then processed in the background to reap and recycle byte buffers. This way there are no additional liveness states, and PooledByteArray instances properly die when they can.
Second, the notion that allocating byte buffers instead of recycling them with this specific recycling pattern would be expensive usually turns out to be false (actually measuring it would be the best way to tell). The byte zeroing avoidance here is not much of a savings unless your byte buffers are multi-MB ones, since (assuming you actually use the byte buffers for storage) the storage of contents into a buffer right after allocation will usually overlap with the cost of zeroing, making the actual zeroing cost near-zero. This happens because both the zeroing and the contents stores have to bring the buffer's cache lines into the cache in exclusive mode, and once a line is in the cache due to the zeroing, the contents store is a very likely cache hit.
Third, the added de-reference cost on every array access operation here is likely to be MUCH more expensive than any (nearly zero to begin with) savings that come from avoiding the zeroing of buffers in normal allocation.
Fourth (and most dominant): Unlike both a fixed-size or limited-size pool with explicit freeing, or simply allocating the buffers and letting GC pick them up, this recycling pattern is likely to fill the heap up with many more live byte buffers than are actually needed when under pressure, and throw GC into thrashing conditions. The reason for this is that recycling won't happen until the heap has filled up enough to trigger a GC. But when GC eventually triggers with this recycling approach, it will find tons of live byte buffers in the heap (in contrast to lots of dead ones in the trivial allocation/GC pattern without this recycling approach).
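The weak-reference scheme from the first point could be sketched roughly as below. PooledByteArray and PooledByteArrayReference are the names used in the text; the pool class itself, its fields, and the reap-on-acquire shortcut are my assumptions (the text suggests reaping in a background thread instead):

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Recycling via WeakReference: no finalizer, no resurrection, no extra
// liveness states -- the wrapper dies normally and only its byte[] is kept.
public class ByteArrayPool {

    public static final class PooledByteArray {
        final byte[] array;
        PooledByteArray(byte[] array) { this.array = array; }
        public int length() { return array.length; }
        public byte get(int i) { return array[i]; }       // extra dereference per access
        public void set(int i, byte v) { array[i] = v; }
    }

    // Strongly holds the byte[] so it survives its wrapper's death.
    static final class PooledByteArrayReference extends WeakReference<PooledByteArray> {
        final byte[] array;
        PooledByteArrayReference(PooledByteArray referent, ReferenceQueue<PooledByteArray> q) {
            super(referent, q);
            this.array = referent.array;
        }
    }

    private final ReferenceQueue<PooledByteArray> deadWrappers = new ReferenceQueue<>();
    // The Reference objects themselves must stay strongly reachable,
    // or the GC will never enqueue them.
    private final Set<PooledByteArrayReference> pending = ConcurrentHashMap.newKeySet();
    private final ConcurrentLinkedQueue<byte[]> free = new ConcurrentLinkedQueue<>();
    private final int size;

    public ByteArrayPool(int size) { this.size = size; }

    public PooledByteArray acquire() {
        reap(); // a real pool would do this in a background thread
        byte[] a = free.poll();
        if (a == null) a = new byte[size]; // pool miss: plain allocation
        PooledByteArray p = new PooledByteArray(a);
        pending.add(new PooledByteArrayReference(p, deadWrappers));
        return p;
    }

    // Recycle arrays whose wrappers the GC has cleared and enqueued.
    void reap() {
        Reference<? extends PooledByteArray> r;
        while ((r = deadWrappers.poll()) != null) {
            PooledByteArrayReference ref = (PooledByteArrayReference) r;
            pending.remove(ref);
            free.offer(ref.array);
        }
    }
}
```

Even cleanly implemented, of course, the scheme keeps the costs described in the second through fourth points: recycling still only happens after a GC has cleared the wrappers.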
Fourth (and most dominant): Unlike both the fixed sized or limited size pool with explicit freeing, or simply allocating the buffers and letting GC pick them up, this recycling pattern is likely to fill the heap up with many more live byte buffers than are actually needed when under pressure, and throw GC into thrashing conditions. The reason for this is that recycling won't happen until the heap has filled up enough to trigger a GC. But when GC eventually triggers with this recycling approach, it will find tons of live byte buffers in the heap (in contrast to lots of dead ones in the trivial allocation/GC pattern without this recycling approach).
I'm not too worried about this one, because the life cycle of a trivial byte[] vs a PooledByteArray would be the same: collected in young or old depending on the situation. It would be easy to then limit the maximum number of byte[] recycled in the pool, to avoid going through the roof after exceptional spikes. This would have the same footprint as a fixed/limited size pool. Am I missing something?
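One way to get that cap, assuming the simple explicit-release style Gil contrasts with (the class and method names here are mine, for illustration only):

```java
import java.util.concurrent.ArrayBlockingQueue;

// A bounded free list: release() uses offer(), which silently refuses
// the buffer when the pool is already full, so after an exceptional
// spike the excess byte[]s become ordinary garbage for the GC instead
// of permanently-live pool entries.
public class BoundedBytePool {
    private final ArrayBlockingQueue<byte[]> free;
    private final int bufferSize;

    public BoundedBytePool(int maxPooled, int bufferSize) {
        this.free = new ArrayBlockingQueue<>(maxPooled);
        this.bufferSize = bufferSize;
    }

    public byte[] acquire() {
        byte[] a = free.poll();
        return (a != null) ? a : new byte[bufferSize]; // allocate on miss
    }

    public boolean release(byte[] a) {
        return free.offer(a); // false = pool full, let the GC take it
    }

    public int pooled() { return free.size(); }
}
```

Note this only bounds the pool itself, not the weak-reference recycling delay Gil's fourth point is about — buffers still held by live wrappers are unaffected.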
Thanks Gil
You know I trust any word you say :)
I'm going to do some benchmarks just to get to the bottom of it.
I'll keep you posted with the results.
Thanks again
Gg