| Just my 2 cents... Jon Sten, I tried your cleanup script with mixed results. It was able to empty retryQueues but we still had a big leak. After isolating the problem, I noticed that if the HTTP session from the user expires, that script is not able to find the leaking EventDispatchers. They are, however, still in the GuavaPubSubBus subscribers map, and can be seen with:
import org.jenkinsci.plugins.pubsub.PubsubBus;
this.bus = PubsubBus.getBus();
this.bus.subscribers.each{ channelSubscriber, guavaSubscriber -> println channelSubscriber }
println "Done"
This outputs some lines containing: org.jenkinsci.plugins.ssegateway.sse.EventDispatcher$SSEChannelSubscriber@3472875b which is the Inner class that implements the ChannelSubscriber interface. I suspect that the memory is still leaking as it is being referenced through this subscriber. I saw there is already a PR https://github.com/jenkinsci/sse-gateway-plugin/pull/27 and a fork of the plugin https://github.com/taboola/sse-gateway-plugin The approach of this fix is changing the handling of retries so they are aborted and the queue cleared after some amount of time or retries. Howevery, I wonder if the problem is the cleanup should be done when the HttpSession is destroyed. Currently, I can see in https://github.com/jenkinsci/sse-gateway-plugin/blob/master/src/main/java/org/jenkinsci/plugins/ssegateway/sse/EventDispatcher.java the cleanup code for sessionDestroyed is:
/**
* Http session listener.
*/
@Extension
public static final class SSEHttpSessionListener extends HttpSessionListener {
@Override
public void sessionDestroyed(HttpSessionEvent httpSessionEvent) {
try {
Map<String, EventDispatcher> dispatchers = EventDispatcherFactory.getDispatchers(httpSessionEvent.getSession());
try {
for (EventDispatcher dispatcher : dispatchers.values()) {
try {
dispatcher.unsubscribeAll();
} catch (Exception e) {
LOGGER.log(Level.FINE, "Error during unsubscribeAll() for dispatcher " + dispatcher.getId() + ".", e);
}
}
} finally {
dispatchers.clear();
}
} catch (Exception e) {
LOGGER.log(Level.FINE, "Error during session cleanup. The session has probably timed out.", e);
}
}
}
but although I can see that for every dispatcher there is a call to dispatcher.unsubscribeAll(), I am missing a call to retryQueue.clear(); But I am just guessing here... Anyone knowing the internals of the plugin can confirm this? I might try forking the plugin and testing a modified version. Regards. |