I’ve been trying to track down a NIO memory leak that occurs in a Netty application I am porting from Netty 3 to Netty 4. This leak does not occur in the Netty 3 version of the application.
For now, I’m using only unpooled heap buffers in Netty 4, but NIO buffers do come into play for socket communication.
I’ve captured a few heap dumps from affected instances, and in each it appears that the leaked DirectByteBuf Java objects are rooted in an io.netty.util.Recycler.
These buffers remain indefinitely: I can disable the application to drain traffic and force GCs, but the number of NIO buffers and the amount of NIO allocated space stay flat.
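For reference, the NIO numbers I'm watching can be read in-process via the JDK's BufferPoolMXBean (this is plain JDK monitoring, nothing Netty-specific; the DirectBufferStats class name is just for illustration):

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

public class DirectBufferStats {
    // Returns "count=N used=B capacity=C" for the named NIO buffer pool
    // ("direct" or "mapped"), or null if no pool by that name exists.
    static String poolStats(String poolName) {
        List<BufferPoolMXBean> pools =
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if (pool.getName().equals(poolName)) {
                return "count=" + pool.getCount()
                        + " used=" + pool.getMemoryUsed()
                        + " capacity=" + pool.getTotalCapacity();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("before: " + poolStats("direct"));
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20); // pins 1 MiB of NIO memory
        System.out.println("after:  " + poolStats("direct"));
    }
}
```

If the leak is real, "count" and "capacity" for the "direct" pool stay high even after traffic is drained and GCs are forced.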
The issue is likely related to slow readers. However, the leak persists long after all channels have been closed.
I implemented a writability listener, and the leak does appear to go away if I stop writing to a channel once it becomes unwritable. That's good, but I'm still worried this only makes the problem less likely, since it's still possible to write/flush and have pending data: writability just limits how much data will be buffered.
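For context, writability in Netty 4 flips on the outbound buffer's high/low water marks over pending bytes. Here is a rough JDK-only model of that rule (the class and method names are mine, and the 64 KiB / 32 KiB values are my understanding of the Netty 4 defaults):

```java
// Models Netty 4's writability rule: a channel goes unwritable once pending
// outbound bytes exceed the high water mark, and becomes writable again only
// after they drop below the low water mark (assumed defaults: 64 KiB / 32 KiB).
public class WritabilityModel {
    static final int LOW = 32 * 1024;
    static final int HIGH = 64 * 1024;

    private long pending;
    private boolean writable = true;

    void buffered(int bytes) {   // a write was queued in the outbound buffer
        pending += bytes;
        if (pending > HIGH) {
            writable = false;    // Netty would fire channelWritabilityChanged here
        }
    }

    void flushed(int bytes) {    // bytes actually made it to the socket
        pending -= bytes;
        if (pending < LOW) {
            writable = true;     // Netty would fire channelWritabilityChanged here
        }
    }

    boolean isWritable() { return writable; }
}
```

A real handler would check ctx.channel().isWritable() before each write and resume writing from channelWritabilityChanged(); the gap between the marks is exactly why pending data can still accumulate between flips.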
Digging into ChannelOutboundBuffer, I see the following stanza in close():
// Release all unflushed messages.
try {
    Entry e = unflushedEntry;
    while (e != null) {
        // Just decrease; do not trigger any events via decrementPendingOutboundBytes()
        int size = e.pendingSize;
        TOTAL_PENDING_SIZE_UPDATER.addAndGet(this, -size);
        if (!e.cancelled) {
            ReferenceCountUtil.safeRelease(e.msg);
            safeFail(e.promise, cause);
        }
        e = e.recycleAndGetNext();
    }
} finally {
    inFail = false;
}
clearNioBuffers();
This seems a bit curious to me: why are flushed buffers not released here? Since the leak seems to be rooted in the Recycler, this could be the culprit…What do you think?
--
You received this message because you are subscribed to the Google Groups "Netty discussions" group.
To unsubscribe from this group and stop receiving emails from it, send an email to netty+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/netty/CA%2B%3DgZKADssKFcs-WCc8%2Br2RWrvbgg3csaJPdcsXL_mCD5yG2bg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
On 20 Jul 2016, at 19:19, Chris Conroy <cco...@squareup.com> wrote:
Here's an example leaked buf
<Screen Shot 2016-07-20 at 1.18.24 PM.png>
On 20 Jul 2016, at 19:39, Chris Conroy <cco...@squareup.com> wrote:
It's no problem! I'm sorry for all the back and forth. I'd just send you the heap dump if I could, but alas it will be difficult to impossible to sanitize it of sensitive data. (As an aside, I really wish there were tools that let you interact with Java heap dumps more programmatically...) In the particular heap dump I'm looking at, I have 12,762 such buffers. Interestingly, I see 123k Recycler$DefaultHandles in the heap...
<Screen Shot 2016-07-20 at 1.31.40 PM.png>
Here's the only path to GC roots from the leaked byte bufs:
<Screen Shot 2016-07-20 at 1.34.50 PM.png>
The threads appear to all be server worker threads. The Biggest Objects - Dominators view for strong references in YourKit shows me that the server worker threads are the dominant roots in the heap:
<Screen Shot 2016-07-20 at 1.38.19 PM.png>
If the recycler is used by each EventLoopGroup, then it probably should have a per-EventLoopGroup configuration, since you'll need lower thresholds for more threads. In practice I'm only seeing much usage of the recycler on one of my EventLoopGroups, but I would be worried about running out of memory unnecessarily in some other situation where another group ends up buffering a large amount of data due to some slowdown.
It would be easier to configure safe automatic defaults if this were a global (instead of per-thread) recycler. How crazy would that be? Without that, the recyclers need to be small enough to multiply per thread in the app, or there needs to be some kind of coordination mechanism to disable recycler growth in some threads if other threads are currently using a lot of capacity. There also might be some value in expiring older buffers so that after high-pressure periods they can be reclaimed (I have had success doing this in MRU object pools elsewhere).
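The expiry idea can be sketched as an MRU pool that evicts entries idle longer than a TTL; this is a JDK-only toy (all names are illustrative, and the injected clock is just for testability, not Netty code):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

// MRU pool: release() pushes to the front and acquire() pops from the front, so
// the hottest objects are reused first; entries idle past ttlNanos are evicted
// from the tail, letting the pool shrink again after a burst of pressure.
public class MruPool<T> {
    private static final class Entry<T> {
        final T value;
        final long releasedAt;
        Entry(T value, long releasedAt) { this.value = value; this.releasedAt = releasedAt; }
    }

    private final Deque<Entry<T>> deque = new ArrayDeque<>();
    private final long ttlNanos;
    private final LongSupplier clock;   // injected so tests can control time

    public MruPool(long ttlNanos, LongSupplier clock) {
        this.ttlNanos = ttlNanos;
        this.clock = clock;
    }

    public void release(T obj) {
        deque.addFirst(new Entry<>(obj, clock.getAsLong()));
    }

    public T acquire() {
        evictStale();
        Entry<T> e = deque.pollFirst();
        return e == null ? null : e.value;
    }

    public int size() {
        evictStale();
        return deque.size();
    }

    private void evictStale() {
        long now = clock.getAsLong();
        Entry<T> tail;
        // The tail always holds the least-recently-released entry.
        while ((tail = deque.peekLast()) != null && now - tail.releasedAt > ttlNanos) {
            deque.pollLast();
        }
    }
}
```

Because eviction only ever touches the cold tail, reuse of hot objects stays O(1) while idle capacity drains away on its own.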
For HTTP applications, the default chunk size is 4k, so I imagine most buffers would be under that size? I'm currently seeing good memory usage results without too much extra GC using -Dio.netty.recycler.maxCapacity=4096 -Dio.netty.threadLocalDirectBufferSize=8192, but I haven't explored too many other options yet. For whatever reason, though, I did not seem to get much utilization with a thread-local buffer size threshold of 4k.
At the moment these Recyclers are not tied to an EventLoopGroup, as they are not even aware of anything like EventLoops. For example, you may use buffers without EventLoops at all.
So you are talking about automatically dropping stuff if it is not used for some timeframe X?
Ping: What do you think about a global recycler instead of many thread-local recyclers?
Also, can you provide some more context on the rationale behind the recycler? Especially with the PooledByteBufAllocator, NIO allocations should be very cheap, so why bother to reuse the buffers?
On Fri, Jul 29, 2016 at 12:26 AM, ‘Norman Maurer’ via Netty discussions <ne...@googlegroups.com> wrote (comments inline):
On a global recycler: I'm not sure this can be done without too much overhead. But if you want to cook up a PR and show it with benchmarks, I would be interested for sure :)
On the recycler's rationale: it's because of object allocation. It basically reuses the “ByteBuf” container object (not the actual memory here).
The ByteBuf objects do pin the NIO memory with an unpooled allocator. Are you saying that this is not the case in the pooled allocator?
Object allocation is always very cheap. Garbage collection in the eden space is incredibly cheap, and most buffers are short-lived. I suspect that this may be a premature micro-optimization. I see no difference in JVM pause time or GC run rates when I disable the recycler completely.
I'm working on another fix for the problem you see. You may also be interested in these:
What do you mean here? In the PooledByteBufAllocator, the memory is “pooled” separately from the ByteBuf instance.
In the past I saw issues because of heavy object allocation (even for these short-lived objects). If you don't have this issue, you could just disable the recycler.
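If disabling is the route you want, my understanding (worth verifying against your exact Netty version) is that setting io.netty.recycler.maxCapacity to 0 turns recycling off entirely. The property must be set before any Recycler class loads, so the -D JVM flag is the safer option; the programmatic equivalent looks like this:

```java
public class DisableRecycler {
    public static void main(String[] args) {
        // Equivalent to passing -Dio.netty.recycler.maxCapacity=0 on the JVM
        // command line. This must run before any Netty Recycler class is
        // initialized, or the property is read too late to take effect.
        System.setProperty("io.netty.recycler.maxCapacity", "0");
    }
}
```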
On 29 Jul 2016, at 19:14, 'Chris Conroy' via Netty discussions <ne...@googlegroups.com> wrote:
With the unpooled allocator, holding on to `ByteBuf` references causes the corresponding NIO memory to be held for the lifetime of the `ByteBuf`. This is why we exhausted our NIO space with the default settings: the recycler-held `ByteBuf`s were taking up all available NIO memory. I haven't tested, but it looks like perhaps this is not the case when using pooled allocation, since I don't see any `retain` or `release` calls inside the recycler.
Yep we can definitely do that.
Configuration by type could be useful, but so far I’m unable to detect any performance degradation when leaving the recycler out altogether. Before adding any more complexity here, I think it would be illuminating to poll the community of Netty 4 users to see what sorts of workloads, if any, impact JVM pause time or GC rates.
Object pooling makes a lot of sense when object setup is expensive. For example, in my Netty based proxy, I pool Channels since connection setup (especially with TLS) is an expensive operation. There are definitely individual circumstances where it may make sense to pool objects without this characteristic, but it’s usually better to just let the JVM handle this. After all, an object pool ends up duplicating the same work that the GC would be doing.
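The trade-off above can be made concrete with a toy pool around an expensive-to-create resource (all names here are illustrative; imagine a TLS handshake in the constructor):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy illustration of when pooling pays off: reuse skips the expensive setup
// (stand-in for a TLS handshake); for cheap objects the same structure would
// only add bookkeeping that the eden-space collector already does for free.
public class ConnectionPool {
    // Stand-in for an expensive-to-create resource.
    static class Connection {
        static int handshakes = 0;
        Connection() { handshakes++; }  // imagine a TLS handshake here
    }

    private final Deque<Connection> idle = new ArrayDeque<>();

    Connection acquire() {
        Connection c = idle.pollFirst();
        return c != null ? c : new Connection();  // reuse if possible, else pay setup
    }

    void release(Connection c) {
        idle.addFirst(c);                          // most-recently-used first
    }
}
```

Each reuse avoids one simulated handshake, which is exactly the property that justifies pooling Channels but not, arguably, plain buffer containers.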
That blog post touches on the PooledByteBufAllocator, which operates at a level below the Recycler object pool.