Pax Logging loses track of loggers because of change made when you added generics in December 2015

192 views
Skip to first unread message

Monica Ron

unread,
Dec 3, 2019, 3:26:51 PM12/3/19
to OPS4J
We were using Pax Logging 1.6.1 for a long time (yes, I know it's old). We are now upgrading all of our infrastructure. We tried using Pax Logging 1.10.2. After much debugging, I finally figured out why our logging isn't working the same way as before.



In the original code, the m_loggers static variable for all LogFactory types (Commons Logging, SLF4J, Log4J, etc.) was initialized without generics as:
m_loggers = new WeakHashMap();


In the new code, this became:
m_loggers = new WeakHashMap<String, JclLogger>(); // Or <String, Slf4jLogger>, or <String, Logger>, etc., depending on type.

However, the original code was using m_loggers with the logger as the key and the name as the value, which means the generics should have been:
m_loggers = new WeakHashMap<JclLogger, String>(); // Or <Slf4jLogger, String>, or <Logger, String>, etc., depending on type.


    public Log getInstance(String name)
       
throws LogConfigurationException
   
{
       
PaxLogger logger;
       
if (m_paxLogging == null)
       
{
            logger
= FallbackLogFactory.createFallbackLog(null, name);
       
}
       
else
       
{
            logger
= m_paxLogging.getLogger(name, JclLogger.JCL_FQCN);
       
}
       
JclLogger jclLogger = new JclLogger(logger);
       
synchronized (m_loggers) {
            m_loggers
.put(jclLogger, name); // THIS LINE GOT CHANGED WHEN YOU ADDED GENERICS. KEY/VALUE GOT SWAPPED.
       
}
       
return jclLogger;
   
}


After generics were added, the indicated line became:

            m_loggers.put(name, jclLogger); // THIS LINE GOT CHANGED WHEN YOU ADDED GENERICS.


Even in the old code, even if the same name was passed to getInstance twice, each one resulted in an independent JclLogger. But, both JclLogger instances were stored in m_loggers (because the key was the logger itself, and the name was the value), and both (or all, if the same name was used more than twice) got the proper Pax Logging configuration once the Pax Logging API and Service got started (the DefaultServiceLog, or whatever came out of the FallbackLogFactory, got replaced with a TrackingLogger, and then the logging properties from org.ops4j.pax.logging.cfg were respected).

With the new code, if you pass the same name to getInstance twice, each one still gets an independent JclLogger. But, when the call to m_loggers.put is made with a name that already has a JclLogger, the old JclLogger is removed from m_loggers, and m_loggers.get(name) only returns the new JclLogger. When Pax Logging API and Service get started, only the most recent JclLogger for the name gets its DefaultServiceLog replaced by a TrackingLogger, and the other JclLogger instances for that name still use a DefaultServiceLog, because those JclLogger instances aren't stored in m_loggers anymore. Then, when an attempt is made to log to the JclLogger that still uses DefaultServiceLog, the logging properties from Pax Logging in org.ops4j.pax.logging.cfg are ignored.

This happens for all loggers (Log4J, SLF4J, etc.), not just JclLogger.

We have Pax Logging API and Pax Logging Log4J2 1.10.2 jars in /var/payara5/glassfish/modules/autostart (which correlates to where we always had the old Pax Logging 1.6.1 jars in Glassfish 3.1.2.2--/var/glassfish/glassfish/modules/autostart). When I use the "autodeploy" directory in my Glassfish domain to deploy a new war after the Glassfish domain is already up and running with Pax Logging started, all of the logging works correctly, because it never sees a DefaultServiceLog at all (it gets TrackingLogger immediately). If I stop the domain and restart it with a war already deployed, the war starts up with DefaultServiceLog instances (the war starts up before Pax Logging starts [war starts creating logger instances before the Activator for Pax Logging API is called])), and not all instances get changed to TrackingLogger, as described above. Thus, the logging is all messed up. If I "touch" the war (to force Glassfish to redeploy while Glassfish is running), then it gets the proper logging configuration, because Pax is started by now.

======================
Based on the observed behavior, I regard this as a bug in Pax Logging. I should not have to redeploy my wars every time I restart my domain, just so that logging is picked up correctly. There is no Jira ticket associated with the commit to "add generics". Thus, it is not clear that this was an intentional change to the behavior of Pax Logging. I'm surprised no one complained about it before.

Can you look at this and tell me if this is intentional, or if there is any other way to deal with it besides redeploy my wars every time we restart the Glassfish domain?

Thank you,
Monica

Grzegorz Grzybek

unread,
Dec 4, 2019, 3:55:25 AM12/4/19
to op...@googlegroups.com
Hello

I can't speak for this particular commit.

I can only try suggesting 1.11.3 version where I've done major refactoring. See https://ops4j1.jira.com/browse/PAXLOGGING-252 for the issue and for new and improved (IMHO) documentation: https://ops4j1.jira.com/wiki/spaces/paxlogging/pages/499351646/Documentation

You mention "m_loggers" weak hashmap for individual logging facades - with the PAXLOGGING-252 refactoring, I changed it into single, global map.

I also increased number of integration tests from 0 to 100+ showing different aspects of logging (including tests that check what happens when you stop/start/refresh pax logging bundles - both pax-logging-api and bundles for particular "backends").

About your case of restarting domain, I added some tests and mechanism that refresh (or not if needed) logging configuration.

I hope 1.11.3 will work for you - if don't, please create PAXLOGGING bug.

kind regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/952b7207-cd2b-4229-af4a-f60c934d2c1f%40googlegroups.com.

Monica Ron

unread,
Dec 4, 2019, 9:39:37 PM12/4/19
to OPS4J
Thanks.

I will write a bug in PAXLOGGING soon.

Version 1.11.3 doesn't run on Payara 5.192 (different OSGi version, maybe?). Version 1.10.4 runs, but has the same issue I reported. I downloaded 1.10.4 and swapped the order of the generic parameters to match the original code (i.e., I made key=Logger, value=String). When I did that, my logging worked properly when I restarted the domain. Even with all loggers being stored in a single map in the Activator in 1.11.3, I'll run into the same issue I saw with the actual 1.10.4. When a second logger is registered in that global map with a name that already exists, the first logger will be lost, and won't get the TrackingLogger when the Activator starts up.

I am still trying to run the tests for 1.11.3. I need to configure a few things in my system for that. When I write a bug, I will attach a modified unit test that exhibits the behavior that I am seeing.

Grzegorz Grzybek

unread,
Dec 5, 2019, 4:39:18 AM12/5/19
to op...@googlegroups.com
Hello

I'll run into the same issue I saw with the actual 1.10.4. When a second logger is registered in that global map with a name that already exists, the first logger will be lost

This was a case with previous (than 1.11.x) versions too and there was respective PAXLOGGING issue for this: https://ops4j1.jira.com/browse/PAXLOGGING-243

But it'd be interested if you created PAXLOGGING issue with steps to reproduce your problem on Glassfish/Payara.

regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.

Monica Ron

unread,
Dec 5, 2019, 9:57:56 PM12/5/19
to OPS4J
I got the tests to run, but I am still working on modifying them to illustrate my problem. It is clear to me from studying the pax logging code and observing its behavior that having the m_loggers maps (whether per logging API or as a single global map) with key=name and value=logger is wrong, and can cause some loggers to get stuck with DefaultServiceLog. However, I don't know yet know how to show it in a unit test.

What happens in my Glassfish/Payara setup is:
1. Wars that were already deployed when Payara is restarted start up first (before the pax logging bundles are started up), with pax-logging-api and pax-logging-log4j2 on the classpath, but those pax bundles haven't been activated yet (the Activator class has not run).

2. During #1, the JclLogger and Slf4jLogger get a DefaultServiceLog instance, because they can see the unactivated pax-logging-api code. I don't use either of these directly (our code uses Log4J2), but Mongo uses SLF4J, and Spring uses JCL. During startup of my wars, Mongo uses the same name multiple times when getting loggers, and Spring uses the same name multiple times for getting loggers. This is based on Mongo and Spring behavior (not mine). As I described, when the name is used a second time, the new JclLogger (or Slf4jLogger) replaces the first JclLogger/Slf4jLogger as a value in the m_loggers map. When m_loggers.put(name, jclLogger) is called, the previous JclLogger is returned (standard behavior of the "put" method on a map), and the second (or subsequent, as some come up many times) time "put" is called, the return value of "put" is not null.

3. Payara goes to OSGi start level 2, and activates pax-logging-api by running Activator, and the JclLogger/Slf4jLogger instances that are in the m_loggers maps get updated with a TrackingLogger instead of the original DefaultServiceLog. Then when pax-logging-log4j2 is activated, the JclLogger/Slf4jLogger instances log through the pax logging, based on my settings in org.ops4j.pax.logging.cfg. The JclLogger and Slf4jLogger instances that got removed from the map in #2 still use the DefaultServiceLog, not a TrackingLogger, because the Activator does not know that these instances exist. Mongo logs some status data every few minutes, and some of that still uses the DefaultServiceLog, because the name was used multiple times. The DefaultServiceLog does not respect my configuration that says "don't log this status data [it's at "debug" level, and I set that category to "info" level, but my setting is ignored]. Some Spring code also does not respect my configuration, because some instances of the logger still uses DefaultServiceLog.

4. With Payara now up and running with pax-logging-api and pax-logging-log4j2 fully activated, if I redeploy my war, all instances of JclLogger/Slf4jLogger immediately get TrackingLogger (no DefaultServiceLog), and log based on my org.ops4j.pax.logging.cfg. The Mongo status logging at debug level and the Spring logging that I don't want now goes away.

5. If I restart the domain again, the whole thing starts over--I get stuck with DefaultServiceLog until I redeploy the war while pax-logging-api and pax-logging-log4j2 are fully activated.


If I run a profiler (e.g., jvisualvm) on my process, I can see lots of DefaultServiceLog instances still exist after Step #3. After Step #4, many of those DefaultServiceLog instances go away, because the re-deployed war uses only TrackingLogger (because pax-logging-api is already running). If pax-logging was properly replacing all DefaultServiceLog with the TrackingLogger, then Step #3 (as soon as pax-logging-api got activated), then I think *all* of the DefaultServiceLog instances should go away.

If m_loggers was stored as key=logger, value=name (as it did before someone added generics in December 2015), then when iterating through the loggers in Activator (would iterate over the keys, not the values), the Activator would still be able to see *all* instances of loggers that ever got created, instead of losing some of them (losing them when name is repeated, if name is the key).

===========
I tried running the SimplestPaxLoggingServiceIntegrationTest, but as soon as the test starts, the logger already uses a TrackingLogger (not the DefaultServiceLog). That is, by the time the test starts, the pax-logging-api has already been activated. I need to run a test that creates two or more loggers with the same name *before* pax-logging-api Activator gets called. I can't tell if there are any tests that mimic my situation of getting loggers before the pax-logging-api is activated (so that the DefaultServiceLog is used), and then activating it as part of the test (so that the DefaultServiceLog is replaced with TrackingLogger during the test run).

Monica

Grzegorz Grzybek

unread,
Dec 6, 2019, 12:44:12 AM12/6/19
to op...@googlegroups.com
Hello

Thank you for this analysis - you are correct.
The goal of removing map-per-logging facade was to unify logging behavior and allowing two bundles to use two different loggers with the same name (though now I think the problem was caused by this generification you've mentioned).

There's also https://github.com/ops4j/org.ops4j.pax.logging/blob/logging-1.11.3/pax-logging-it/src/test/java/org/ops4j/pax/logging/it/Log4J2RestartBothPaxLoggingBundlesIntegrationTest.java test that shows what happens when logger was obtained and pax-logging-api+pax-logging-log4j2 bundles were restarted.

I added a comment that when logger is obtained when pax-logging-api is ACTIVE and then it's stopped, it WON'T be reconnected to a backend.
Though it seems that if logger was obtained when pax-logging-api is stopped (INSTALLED state), then it WILL be reconnected to actual backend...

From your description, I believe I can check your problem, but not this week.

regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.

Monica Ron

unread,
Dec 9, 2019, 4:35:39 PM12/9/19
to OPS4J
Thanks. If you do change the code to fix my problem (e.g., swap order of the generic parameters, or whatever other way you fix it), I hope you can release a 1.10.x as well as fixing it on the newer 1.11.x, since I can't use 1.11.x yet.

I can't even use 1.10.4, as I'm having trouble with a NoSuchMethodError in some of the pax-logging-log4j2 when printing stack traces. I'll submit a bug in Jira for that. I didn't try 1.10.3, but 1.10.2 works for printing stack traces.

Monica
To unsubscribe from this group and stop receiving emails from it, send an email to op...@googlegroups.com.

Grzegorz Grzybek

unread,
Jan 2, 2020, 9:48:43 AM1/2/20
to op...@googlegroups.com
Hello

I created https://ops4j1.jira.com/browse/PAXLOGGING-307 to track this issue.

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/132b64d3-6418-47b0-85f7-fd7b9199b8d9%40googlegroups.com.

Grzegorz Grzybek

unread,
Jan 2, 2020, 11:32:07 AM1/2/20
to op...@googlegroups.com
Hello

Thanks for describing the problem in such details!

I reproduced the problem and indeed - simple:

Logger log1a = LoggerFactory.getLogger(this.getClass());
Logger log1b = LoggerFactory.getLogger(this.getClass());

(getting same logger twice), when pax-logging-api is not yet started gives me two instances of org.ops4j.pax.logging.slf4j.Slf4jLogger, but only one is cached in (now global) org.ops4j.pax.logging.internal.Activator#m_loggers.

I changed this org.ops4j.pax.logging.internal.Activator#m_loggers map to ... list and it works fine ;) Both the above loggers have their m_delegate replaced when pax-logging-api starts.

What's more - I added special unit test (in pax-logging 1.11.x+) to prove this. But I can also backport the fix to pax-logging 1.10.x.

What's even more, here's resulting behavior:

1) when pax-logging-api is not yet started:

 - all bundles starting "earlier" can already use any logging facade provided by (not even started) pax-logging-api - pax-logging-api needs to be at least resolved, otherwise other bundles would not be resolved too.
 - all obtained loggers use underlying m_delegate set to DefaultServiceLog

2) when pax-logging-api finally starts:

 - all bundles added to (now a list) org.ops4j.pax.logging.internal.Activator#m_loggers have their org.ops4j.pax.logging.PaxLoggingManagerAwareLogger#setPaxLoggingManager() method called and underlying m_delegate switched from DefaultServiceLog to TrackingLogger obtained from org.ops4j.pax.logging.OSGIPaxLoggingManager#getLogger() and org.ops4j.pax.logging.internal.Activator#m_loggers is cleared! because we no longer need to replace underlying m_delegate
 - these underlying TrackingLoggers will switch back and forth between DefaultServiceLog and actual PaxLoggerImpl when pax-logging-log4j1/pax-logging-log4j2/pax-logging-logback bundles are restarted OR refreshed

3) when pax-logging-api stops

 - org.ops4j.pax.logging.internal.Activator#m_loggers is already empty, so nothing is done in this list
 - however org.ops4j.pax.logging.OSGIPaxLoggingManager#m_loggers is not empty - it contains ALL tracking loggers for all facades. These loggers have their m_delegate switched back to DefaultServiceLog

Conclusion:

 - loggers do NOT switch again to TrackingLoggers when pax-logging-api starts again
 - the above COULD be improved if org.ops4j.pax.logging.OSGIPaxLoggingManager#m_loggers was static, but:
    - I think stopping pax-logging-api is not something that should be done frequently
    - loggers would still be "disconnected" when pax-logging-api is REFRESHED - but this would lead to refresh of almost everything... So we could consider preserving the map of TrackingLoggers in the future
 - loggers switch to proper m_delegate when pax-logging-log4j1/pax-logging-log4j2/pax-logging-logback restarts/refreshes

regards
Grzegorz Grzybek

Monica Ron

unread,
Jan 2, 2020, 11:42:09 AM1/2/20
to OPS4J
I am glad you were able to reproduce the issue and see what I meant.

Yes, please backport the fix to 1.10.x. I tried the fix for PAXLOGGING-306 for getting the pax-logging-log4j2 in the 1.10.x branch to print stack traces, and it worked--my stack traces were printed correctly. So, if this issue PAXLOGGING-307 for losing track of loggers is also fixed in 1.10.x, it would be much appreciated.

Thanks again!
Monica


On Thursday, January 2, 2020 at 11:32:07 AM UTC-5, Grzegorz Grzybek wrote:
Hello

Grzegorz Grzybek

unread,
Jan 2, 2020, 11:48:40 AM1/2/20
to op...@googlegroups.com
Just checked new integration tests for 1.11.5-SNAPSHOT and 2.0.0-SNAPSHOT - all is fine.
So I'm backporting this change to pax-logging 1.10.x (without tests...)

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/96f01aae-bbb3-4b76-bcc8-9b9f050d1e4a%40googlegroups.com.

Grzegorz Grzybek

unread,
Jan 3, 2020, 3:49:48 AM1/3/20
to op...@googlegroups.com
Hello Monica

I've fixed https://ops4j1.jira.com/browse/PAXLOGGING-307 and while pax-logging 1.11.x and 2.0.x have proper tests, there are no such tests in 1.10.x. And also 1.10.x is not the version I was refactoring. So - may I ask you kindly to check https://github.com/ops4j/org.ops4j.pax.logging/commits/pax-logging-1.10.x branch if it solves your problem?

regards
Grzegorz Grzybek

Monica Ron

unread,
Jan 3, 2020, 10:43:59 AM1/3/20
to OPS4J
I downloaded the pax-logging-api-1.10.5.jar and pax-logging-log4j2-1.10.5.jar from Maven Central, and deployed them in my Payara glassfish/modules/autostart directory.

Now when I start Payara with the wars already deployed, it is properly connecting to the pax-logging, and is respecting my configuration in org.ops4j.pax.logging.cfg . If I stop the domains and restart, the configuration is immediately taken into account, and I don't have to re-deploy my wars with pax-logging already activated (as mentioned in my bug report above, I had to do a re-deploy with pax-logging 1.10.2 and 1.10.4).

The stack traces also get printed correctly now with 1.10.5 (as I mentioned in the thread related to that).

My problems seem to be solved. I'll let you know if I find anything else, but it seems good for now. My code uses Log4J2 API directly, and the third-party code I use (such as Mongo, Dozer, and Spring) use SLF4J and JCL Logging.

Thank you very much. I spent many hours trying to figure out what was wrong with my updated POM files and Payara5 deployment before I finally realized that the problem was with Pax Logging, not with my own code. I am glad you were able to fix this and make a new official release of 1.10.x before we released our code for running on Payara5.

Have a great day!
Monica

Grzegorz Grzybek

unread,
Jan 3, 2020, 10:50:46 AM1/3/20
to op...@googlegroups.com
Hello

I'm glad it worked!

Thank you very much. I spent many hours trying to figure out what was wrong with my updated POM files and Payara5 deployment before I finally realized that the problem was with Pax Logging, not with my own code. I am glad you were able to fix this and make a new official release of 1.10.x before we released our code for running on Payara5.

Thank *you* for such detailed and precise analysis! Who'd have thought that "adding generics" will cause such problems ;)

regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.

Monica Ron

unread,
Jan 3, 2020, 8:38:17 PM1/3/20
to OPS4J
You're welcome. And maybe if we had kept our code and dependencies up-to-date, the problem would have been noted 4 years ago when the "adding generics" change was first made. ;)

Monica
To unsubscribe from this group and stop receiving emails from it, send an email to op...@googlegroups.com.

Monica Ron

unread,
Jan 3, 2020, 8:58:55 PM1/3/20
to OPS4J
The new code works for me. Once we deploy our wars, we don't generally re-deploy until we have a new release, and we usually shutdown the Glassfish/Payara domain to do that deployment. So, shutting down the JVM obviously clears all memory.

I'm not really worried (I think our code should be fine, based on how we use it), but I do have one question:
Hasn't the garbage-collection behavior changed by changing the original WeakHashMap<Logger, String> (if generics had been added to the code as it is in 1.10.x [with separate m_loggers for each logging API] without otherwise changing underlying code) to WeakHashMap<String, List<Logger>>?

For 1.11.x, the change made for PAXLOGGING-307 was:
public static final Map<String, PaxLoggingManagerAwareLogger> m_loggers = new WeakHashMap<String, PaxLoggingManagerAwareLogger>();
became:
public static final List<PaxLoggingManagerAwareLogger> m_loggers = new LinkedList<>();

Based on the pre-generics code, the 1.11.x map should have had the logger as the key:
public static final Map<PaxLoggingManagerAwareLogger, String> m_loggers = new WeakHashMap<PaxLoggingManagerAwareLogger, String>();


For a WeakHashMap, if all of the other references to the key got discarded elsewhere, the key-value pair automatically gets discarded from the WeakHashMap, correct? So, if an instance of a class with a reference to a PaxLoggingManagerAwareLogger (or Logger [of whatever logging API you used] for 1.10.x) got garbage-collected, the logger would also have been removed from m_loggers, correct? This automatic removal from m_loggers will not occur with a List<PaxLoggingManagerAwareLogger> (for 1.11.x), or with a WeakHashMap<String, List<Logger>> (for 1.10.x).

As I said, I don't think this will be a problem for us, but it may be something to consider, if the original WeakHashMap was by intention to help with garbage collection. By this time, no one may know what the original intention is...

Thanks again,
Monica


Grzegorz Grzybek

unread,
Jan 7, 2020, 4:55:57 AM1/7/20
to op...@googlegroups.com
Hello

sob., 4 sty 2020 o 02:58 Monica Ron <moni...@gmail.com> napisał(a):
The new code works for me. Once we deploy our wars, we don't generally re-deploy until we have a new release, and we usually shutdown the Glassfish/Payara domain to do that deployment. So, shutting down the JVM obviously clears all memory.

I'm not really worried (I think our code should be fine, based on how we use it), but I do have one question:
Hasn't the garbage-collection behavior changed by changing the original WeakHashMap<Logger, String> (if generics had been added to the code as it is in 1.10.x [with separate m_loggers for each logging API] without otherwise changing underlying code) to WeakHashMap<String, List<Logger>>?

Actually both cases are a bit incorrect IMO... My new case (weak map of string → list<Logger>) only ensures that we don't loose loggers, but doesn't do any weak-functionality (because the logger strongly references key anyway). Previous case (weak map of logger → string) was better, but keys (loggers) were mostly used by strong references anyway throughout the code - because most libraries use "logger log = loggerFactory.getLogger()" idiom all the time - and mostly using static references, so the weak keys of old WeakHashMap<Logger, String> were not released anyway.

In pax-logging 1.11.x+, the List<Logger> is definitely not weak, but activator of pax-logging-api explicitly clears this list when everything is done, so it's even better (IMO).
 

For 1.11.x, the change made for PAXLOGGING-307 was:
public static final Map<String, PaxLoggingManagerAwareLogger> m_loggers = new WeakHashMap<String, PaxLoggingManagerAwareLogger>();
became:
public static final List<PaxLoggingManagerAwareLogger> m_loggers = new LinkedList<>();

Based on the pre-generics code, the 1.11.x map should have had the logger as the key:
public static final Map<PaxLoggingManagerAwareLogger, String> m_loggers = new WeakHashMap<PaxLoggingManagerAwareLogger, String>();


For a WeakHashMap, if all of the other references to the key got discarded elsewhere, the key-value pair automatically gets discarded from the WeakHashMap, correct? So, if an instance of a class with a reference to a PaxLoggingManagerAwareLogger (or Logger [of whatever logging API you used] for 1.10.x) got garbage-collected, the logger would also have been removed from m_loggers, correct? This automatic removal from m_loggers will not occur with a List<PaxLoggingManagerAwareLogger> (for 1.11.x), or with a WeakHashMap<String, List<Logger>> (for 1.10.x).


For 1.10.x, taking into account the fact that loggers are created once and mostly held statictly, it's not a big issue IMO.
 

As I said, I don't think this will be a problem for us, but it may be something to consider, if the original WeakHashMap was by intention to help with garbage collection. By this time, no one may know what the original intention is...

:)

regards
Grzegorz Grzybek


Thanks again,
Monica


--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/d4a9fb64-9389-4d08-9b3e-8a735b57d6ef%40googlegroups.com.

Monica Ron

unread,
Mar 11, 2020, 11:39:53 AM3/11/20
to OPS4J
Hi. I know it's been a while since this was last sent. The change in order of the generic parameters of the WeakHashMap definitely caused a change in memory behavior, and thus a memory leak in our application.

In some of our code, we have:
public abstract class AbstractBase {
   protected Logger logger = LogManager.getLogger(getClass());

   public void doSomething() {
       logger.debug("Base method.");
   }
}

And then we create instances of subclasses:
public class MyClassAAA extends AbstractBase {
   public void someMethodA() {
      logger.debug("Something.");
   }
}

public class MyClassBBB extends AbstractBase {
   public void someMethodB() {
      logger.debug("Something.");
   }
}

The inherited logger gets the name of the subclass, so you can see in the log whether an instance of MyClassAAA did the logging or MyClassBBB did the logging. This is a common usage of loggers from what I can tell. Not all loggers are static. We also create some names dynamically (so we get "MyClassAAA-1" and "MyClassAAA-2"), so that we can distinguish multiple instances of it, so we can track data through a single thread (where we might have 5 threads, each running a separate instance of the MyClassAAA class), for example.

With the nature of our program, instances of some of these classes with inherited loggers come and go frequently. Some are very short lived (an object is created, some processing is performed on it, and the instance is then dropped and garbage-collected). Some may live for days or for a few weeks (to consume data from an ActiveMQ queue), but then are removed, never to be used again. With the old Pax 1.6.1, because the WeakHashMap was effectively "WeakHashMap<Logger, String>" (even though generic parameters weren't specified, this was how it was used), the logger would be garbage-collected when the instance of our class was garbage-collected.

With the current behavior of Pax Logging 1.10.5 with the generic parameters switched to "WeakHashMap<String, Logger>" (and similar for other logging types), each instance of a class with an inherited logger may get a new instance of the logger, and those loggers are never garbage-collected. I don't know the behavior of the real Log4J, Log4J2, SLF4J, JCL, etc. in terms of memory management in our use case, but since our application has always used Pax Logging, the new behavior is definitely a change, and creates a memory leak.

I downloaded the 1.10.x branch (commit 398a8234c6c6ef831e64bd458263c637795a4087) and changed the following classes to switch the WeakHashMap parameters back to the original <Logger, String>:
pax-logging-api/src/main/java/org/apache/commons/logging/LogFactory.java
pax-logging-api/src/main/java/org/apache/juli/logging/LogFactory.java
pax-logging-api/src/main/java/org/apache/log4j/Logger.java
pax-logging-api/src/main/java/org/ops4j/pax/logging/avalon/AvalonLogFactory.java
pax-logging-api/src/main/java/org/ops4j/pax/logging/slf4j/Slf4jLoggerFactory.java

I also changed this:
pax-logging-api/src/main/java/org/ops4j/pax/logging/log4jv2/Log4jv2LoggerContext.java

to use a WeakHashMap<Log4jv2Logger, String> instead of a ConcurrentHashMap<String, Log4jv2Logger>. That one hadn't ever been ConcurrentHashMap<Log4jv2Logger, String>, but changing it to a WeakHashMap<Log4jv2Logger, String> improved the memory cleanup when our instances were garbage-collected (since our upgraded code uses Log4jv2 as an API instead of Log4J, but we are still using the PAX Logging API and Service at runtime). I synchronized the loggers access in Log4jv2LoggerContext, since I no longer had a ConcurrentHashMap.

For now, I have built a local copy of pax-logging-api with changes to the above files, and it has helped our memory management compared to the 1.10.5 release.

Regards,
Monica
Hello

To unsubscribe from this group and stop receiving emails from it, send an email to op...@googlegroups.com.

Grzegorz Grzybek

unread,
Mar 11, 2020, 1:08:48 PM3/11/20
to op...@googlegroups.com
Hello

Great analysis - as always. I'll have a look very soon!

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/fdd62aeb-e3ae-4d33-879f-6c54d4ddce43%40googlegroups.com.

Monica Ron

unread,
Mar 11, 2020, 1:44:40 PM3/11/20
to OPS4J
Thanks. I can send my patch, if it would help.

Monica

Grzegorz Grzybek

unread,
Mar 13, 2020, 7:53:41 AM3/13/20
to op...@googlegroups.com
Hello

I did simple test:

@Test
public void memoryIssues() throws IOException {
    for (int i = 0; i < 1_000_000; i++) {
        new MyClass().run();
    }
    LOG.info("Done");
}
private static class MyClass {
    private Logger nonStaticLogger = LoggerFactory.getLogger(UUID.randomUUID().toString());
    public void run() {
        // running a method
        nonStaticLogger.trace("Hello!");
    }
}

And I got:

OSGIPaxLoggingManager@4270
 tracker: org.osgi.util.tracker.ServiceTracker  = {org.osgi.util.tracker.ServiceTracker@4271}
 m_logService: org.ops4j.pax.logging.PaxLoggingService  = {org.ops4j.pax.logging.log4j2.internal.PaxLoggingServiceImpl$1ManagedPaxLoggingService@4249}
 m_logServiceRef: org.osgi.framework.ServiceReference  = {org.apache.felix.framework.ServiceRegistrationImpl$ServiceReferenceImpl@4272} "[org.osgi.service.log.LogService, org.osgi.service.log.LoggerFactory, org.ops4j.pax.logging.PaxLoggingService]"
 m_loggers: java.util.Map  = {java.util.HashMap@4273}  size = 1000014
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#516419e9-4bbc-4e5c-9518-204751cb669c#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4379}
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#8536c55c-631a-4fd1-8118-78b1eeda329a#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4381}
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#e4da2934-f756-41dc-ac02-b9a9d4290de5#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4383}
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#1c40dbe4-2475-4634-ba95-54c5bf2a2a5b#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4385}
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#0d80d39b-63aa-4719-b5ce-0265449cc924#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4387}
  "org.ops4j.pax.logging.slf4j.Slf4jLogger#919916db-6aef-4b77-a48e-7fccce2f8740#18" -> {org.ops4j.pax.logging.internal.TrackingLogger@4389}
...

m_loggers had million+ entries - definitely not good ;) I'm working on definitive solution.

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/abc30993-9caf-461a-a675-50cfa95ddf03%40googlegroups.com.

Monica Ron

unread,
Mar 13, 2020, 8:50:53 AM3/13/20
to OPS4J
I'm glad you can see what I saw. :)

I attached my patch to this post. I first downloaded the 1.10.x branch, and made changes. The patch is the output from 'git diff'. To build what I have using this patch, you can do the following:

git init
git fetch origin
git merge 398a8234c6c6ef831e64bd458263c637795a4087
git apply PaxLoggingWeakHashMapChange.patch

And then build with Maven. I only replaced the pax-logging-api.jar, not any of the other ones. We are using the pax-logging-log4j2-1.10.5.jar and my custom jar made with this patch.

I don't know if there are any other memory issues, but this helped with some of them.

Monica
PaxLoggingWeakHashMapChange.patch

Grzegorz Grzybek

unread,
Mar 13, 2020, 10:22:45 AM3/13/20
to op...@googlegroups.com
I've created https://ops4j1.jira.com/browse/PAXLOGGING-311 and I'm checking it.

But simple switching org.ops4j.pax.logging.OSGIPaxLoggingManager#m_loggers to WeakHashMap gave me OOME where the problem was not the order of key-value in the map (TrackingLoggers were correctly disposed, as well as those Pax Web crafted String keys). The problem is:
image.png

So Log4j2 itself holds huge number of non-GCed loggers.

I'm checking what can be done with it.

regards
Grzegorz Grzybek


To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/27968583-d090-43a0-b963-12a923c31fd4%40googlegroups.com.

Grzegorz Grzybek

unread,
Mar 13, 2020, 11:12:00 AM3/13/20
to op...@googlegroups.com
Unfortunately I don't think there's much we can do...


Once a Logger is created it will not be deleted until the LoggerContext it is associated with is deleted. Typically, this will only happen when the application is shut down or un-deployed. Each call to getLogger with the same logger name will return the same Logger instance.

 I even changed LoggerContext to use LoggerRegistry (log4j2) with org.apache.logging.log4j.spi.LoggerRegistry.WeakMapFactory), but this doesn't change anything - the created map uses weak key (strings - logger names), but the same key (string) is kept in a Logger instance - the value of WeakHashMap - this breaks WeakHashMap contract:

Thus care should be taken to ensure that value objects do not strongly refer to their own keys

Monica - which pax-logging backed are you using - Log4j2? Did you check a heapdump? How many actual different loggers do you create?

In my test, I tried creating 1,000,000 loggers with unique names and used -Xmx128M. OOME (heap) happened after ~300K loggers created.

The only change I needed in Pax Logging 1.11+ was change of org.ops4j.pax.logging.OSGIPaxLoggingManager#m_loggers to WeakHashMap.

I don't think there's a need to reverse back the key-value order - especially in 1.11+.

Is there a chance to analyze a heapdump of your application?

regards
Grzegorz Grzybek

regards
Grzegorz Grzybek

Grzegorz Grzybek

unread,
Mar 13, 2020, 2:42:22 PM3/13/20
to op...@googlegroups.com
Hello again

After little more investigation and careful usage of WeakHashMaps and WeakReferences I managed to create 1M of unique loggers on -Xmx128M that were correctly garbage collected in:
However it CAN'T be done easily with Logback, because the child loggers (and UUID-named logger is child of root logger) are held on a list...

there's ONE test faling were I check what happens with a reference to LogService after pax-logging-api and pax-logging-log4j2 are refreshed while the test method is running...


regards
Grzegorz Grzybek

Monica Ron

unread,
Mar 13, 2020, 4:40:13 PM3/13/20
to OPS4J
Thanks for the additional information.

We use pax-logging-log4j2-1.10.5.jar for the backend. I believe I did see in my previous heap analysis investigations that the Log4J2 (pax-logging-log4j2-1.10.5.jar) was holding onto references as you described, but my patch to pax-logging-api at least helped with garbage collection of the stuff in m_loggers in the pax-logging-api side. I didn't investigate the Log4J2 too hard. I couldn't get Pax Logging 1.11.x to deploy on Payara 5.192 at all (we started an upgrade from Glassfish 3.1.2.2 to Payara 5 last summer, when 5.192 was the latest...after getting things to at least deploy there, I couldn't get it running on future versions of 5.192, but we decided that 5.192 was good enough for now [still way newer than our old Glassfish]). While doing the upgrade, I ran into various logging issues, which I eventually tracked down to the Pax Logging issues I reported a few months ago. So, I stuck with pax-logging 1.10.x, using 1.10.5 after you fixed my previously reported issues, instead of trying 1.11.x. (On a side note, I had a non-logging-related memory leak that I tracked down to the osgi-adapter.jar that came from Payara 5.192, because Payara included a version that had a memory leak [osgi-adapter-2.5.0.jar fixed that issue, but the Payara 5.192 has a slightly earlier version].)


So, in terms of usage:
I don't have millions of uniquely-named loggers.

The short-lived loggers (a few minutes each, maybe less) use the same few names over and over (no suffix to distinguish the instance), so even if Log4J2 did store it and gave us back the same one every time, that would be okay. But, the pax-logging-api 1.10.5 is holding onto the loggers without my patch, so if I create 500 of these short-lived loggers with the same name, pax-logging-api holds 500 loggers, as you saw with your 1,000,000 test (even if you used the same name 1,000,000 times, your test would still show 1,000,000 entries in m_loggers). With my patch, when my short-lived loggers get garbage-collected, they get removed from the pax-logging-api m_loggers maps, even if not everything from Log4J2 maps (from the pax-logging-log4j2.jar) is garbage-collected.

The loggers that live for a few days or weeks are in groups of about 200 hundred (when I stopped one of the groups now, there were 214 threads that went away that had custom names for the loggers [the MyClassAAA-1 or MyClassAAA-2 or MyClassAAA-3 that I mentioned earlier]). So, without my patch, pax-logging-api held onto loggers for those 214 names even after my classes using the loggers were stopped and garbage-collected. Log4J2 probably still holds on to them (I didn't check just now), but if we start the same group again, if Log4J2 gives us a previously-created logger (because it already saw the name), that's fine.

If we re-start the same group of classes, the loggers (the 214 loggers in this case, and all groups should be the same size) from Log4J2 would be reused, because the names would be the same as before (the counting starts over at 0, for example). If we start a different group, we'll get different names (there is another suffix in there, so it's more like 'MyClassAAA-GroupX-1, MyClassAAA-GroupX-2 [two instances of MyClassAAA from GroupX], MyClassAAA-GroupY-1, MyClassAAA-GroupY-2 [two instances of MyClassAAA from GroupB]). Two (or more) groups can be started at the same time in our system. Then we can tell from the logs which group is doing what. So, if we start GroupX, we'll have one set of loggers, and if we start GroupY, we'll have a separate set of loggers. If we stop GroupX, then Log4J2 might hold on to them, but if we start it again, they'd be reused from pax-logging-log4j2. However, when we stop GroupX, the pax-logging-api 1.10.5 would still be holding onto the old loggers. When we restart, pax-logging-api 1.10.5 still has the old loggers in m_loggers, but they aren't re-used. Instead, pax-logging-api 1.10.5 would create new loggers with the names. With my patch to 1.10.5, stopping GroupX would cause the old loggers from pax-logging-api m_loggers to go away, and m_loggers would only have the new loggers.

We don't use LogBack directly, and none of the libraries we use use LogBack, either.

Ralph commented on the Log4J2 Jira issue that, "[he] should also be surprised if the same problem doesn't exist with other logging implementations." This echoes my comment from earlier, about not knowing the behavior of the other logging implementations. Because I was using Pax Logging (our application has used that for many years, before I was here), I only looked at Pax Logging behavior.

Since the Log4J2's behavior of holding onto logger references that might not be used again should be a slow leak, now that we are aware of it, we can watch for it and restart the Java process periodically to clean memory. There are breaks between when the Groups need to run (as I said, some might be a few days, some a few weeks, and the loggers for a running Group are needed during those few weeks, so it's not a leak until we "stop" the Group), and some Groups are safe enough to stop long enough to restart the Java process and restart. However, the pax-logging-api definitely holds onto too many loggers that aren't needed anymore (and my patch seems to address the pax-logging-api part) from the short-lived objects.

Thanks again for investigating this,
Monica


On Friday, March 13, 2020 at 2:42:22 PM UTC-4, Grzegorz Grzybek wrote:
Hello again

After little more investigation and careful usage of WeakHashMaps and WeakReferences I managed to create 1M of unique loggers on -Xmx128M that were correctly garbage collected in:
However it CAN'T be done easily with Logback, because the child loggers (and UUID-named logger is child of root logger) are held on a list...

there's ONE test faling were I check what happens with a reference to LogService after pax-logging-api and pax-logging-log4j2 are refreshed while the test method is running...


regards
Grzegorz Grzybek

pt., 13 mar 2020 o 16:11 Grzegorz Grzybek <gr.g...@gmail.com> napisał(a):
Unfortunately I don't think there's much we can do...


Once a Logger is created it will not be deleted until the LoggerContext it is associated with is deleted. Typically, this will only happen when the application is shut down or un-deployed. Each call to getLogger with the same logger name will return the same Logger instance.

 I even changed LoggerContext to use LoggerRegistry (log4j2) with org.apache.logging.log4j.spi.LoggerRegistry.WeakMapFactory), but this doesn't change anything - the created map uses weak key (strings - logger names), but the same key (string) is kept in a Logger instance - the value of WeakHashMap - this breaks WeakHashMap contract:

Thus care should be taken to ensure that value objects do not strongly refer to their own keys

Monica - which pax-logging backed are you using - Log4j2? Did you check a heapdump? How many actual different loggers do you create?

In my test, I tried creating 1,000,000 loggers with unique names and used -Xmx128M. OOME (heap) happened after ~300K loggers created.

The only change I needed in Pax Logging 1.11+ was change of org.ops4j.pax.logging.OSGIPaxLoggingManager#m_loggers to WeakHashMap.

I don't think there's a need to reverse back the key-value order - especially in 1.11+.

Is there a chance to analyze a heapdump of your application?

regards
Grzegorz Grzybek

regards
Grzegorz Grzybek

Grzegorz Grzybek

unread,
Mar 14, 2020, 6:13:14 AM3/14/20
to op...@googlegroups.com
Hello again

I've backported 3 "memory" tests from Pax Logging 1.11.x to https://github.com/ops4j/org.ops4j.pax.logging/commits/pax-logging-1.10.x

The test (e.g., org.ops4j.pax.logging.it.Log4J2MemoryTest) does this:

@Test
public void memoryIssues() throws IOException {
    LOG.info("Starting");
    String[] loggerNames = new String[10000];
    for (int i = 0; i < 10000; i++) {
        loggerNames[i] = UUID.randomUUID().toString();
    }
    for (int i = 0; i < 1000000; i++) {
        if (i % 10000 == 0) {
            LOG.info("iteration {}", i);
            System.gc();
        }
        new Log4J2MemoryTest.MyClass(loggerNames[i % 10000]).run();

    }
    LOG.info("Done");
}

private static class MyClass {
    private Logger nonStaticLogger;

    public MyClass(String name) {
        this.nonStaticLogger = LoggerFactory.getLogger(name);

    }

    public void run() {
        // running a method
        nonStaticLogger.trace("Hello!");
    }
}

You see - million logger instances using 10000 different unique names. What I'm not sure about is the "group" you've mentioned - are the new groups reusing logger names? or each "group" introduces completely new set of loggers?

I checked the result in heap dump and I got:

image.png

which means that in pax-logging 1.10.6-SNAPSHOT (not changed comparing to 1.10.5) there's exactly 10000 TrackingLoggers (+11 for non-test related ones).

The test uses SLF4J facade to get pax logging loggers.

May I ask you to change the tests, so I can try reproducing your problem? I agree that using WeakHashMap helps, but if the number of unique logger names doesn't increase endlessly, I think it's not necessary to use WeakHashMaps.

BTW, I've closed https://issues.apache.org/jira/browse/LOG4J2-2806 even if I was able to ensure proper weak-reference behavior inside Log4J2.

Thanks for great cooperation!

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/316e85a1-39de-47dd-b839-9d6be9f9660f%40googlegroups.com.

Monica Ron

unread,
Mar 17, 2020, 12:17:11 PM3/17/20
to OPS4J
It took me some time to figure out how to run the tests. For one thing, it didn't like the fact that I wasn't using the default repository (I wasn't using .m2/repository in my home directory). Eventually, I got it working by using that default repository.

Anyway, I'm still working on writing a test.

In the meantime, several observations (I noted #2 when originally writing my patch, but did not change it):

1. Log4J1 in pax-logging-api behaves differently from SLF4J in pax-logging-api in your Log4JMemoryTest.

    If you take the Log4JMemoryTest and use the Log4J1 API instead of SLF4J API in the MyClass to get the nonStaticLogger, then you end up with 1,000,000 Loggers within m_loggers in org.apache.log4j.Logger in pax-logging-api. To make this change, I changed only the MyClass constructor, to be:
        public MyClass(String name) {
            this.nonStaticLogger = Logger.getLogger(name);
        }

and changed the imports from
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

to

import org.apache.log4j.Logger;

This behavior is what my patch addresses. With my patch, there were at maximum 1,535 entries in m_loggers. (The 1,535 varies as garbage collection is not deterministic, but 1,535 is still better than 1,000,000.)

===================
===================
2. Explanation of #1 (difference in whether or not the new logger instance is stored in m_loggers when m_paxLogging is not null):
  • SLF4J in pax-logging-api: In org.ops4j.pax.logging.slf4j.Slf4jLoggerFactory getLogger, if m_paxLogging is not null (Line 95), then a new Slf4jLogger is created, but the Slf4jLogger is not stored in m_loggers.
  • Log4J1 in pax-logging-apiIn org.apache.log4j.Logger getLogger, the new Logger created on line 140 is stored in m_loggers regardless of whether m_paxLogging is null or not null.
  • Others in pax-logging-api: The other log implementations (Apache Commons Logging in org.apache.commons.logging.LogFactory, JULI Logging in org.apache.juli.logging.LogFactory, Avalon Logging in org.ops4j.pax.logging.avalon.AvalonLogFactory) behave like org.apache.log4j.Logger, and store new logger instance in m_loggers regardless of whether m_paxLogging is null or not null.
  • Only org.ops4j.pax.logging.log4jv2.Log4jv2LoggerContext checks for existence of a logger with the same name before creating a new logger to put in the loggers map.

===================
===================
I'll look at writing a test to match my usage with what I called "groups". Markers as the Ralph from Log4J2 suggested may help (I haven't tried how that works yet---I've just been using the existing logging we've had for years). However, without changing our own logging, the change in behavior of pax-logging-api between the very old 1.6.1 and the new 1.10.5 because of the change in the order of WeakHashMap parameters definitely had a negative effect on the memory behavior of pax-logging-api. Even if the TrackingLogger using the same name sticks around after we're done with that (I can deal with that), the fact that the pax-logging-api holds on to things longer doesn't seem too good. I guess since I now use pax-logging-api Log4J2 instead of pax-logging-api Log4J1, that does help, because Log4J2 in pax-logging-api only creates new instances for new names. Log4J1, as stated in #1 above, creates and holds onto new instances forever, even if the name is re-used.

Thanks again,
Monica

Monica Ron

unread,
Mar 18, 2020, 5:47:23 PM3/18/20
to OPS4J
I have a test that shows my groups usage. Should I just attach it as a part of a post to this forum? It definitely behaves differently with the 1.10.5 vs. with my patch, with regards to how many logger instances get stored in m_loggers (especially if I use Log4J1 vs. Log4J2 as my API).

I use the Log4J2 API in my real code, as I've stated before (but third-party code we use uses SLF4J or JCL, and maybe others). I tried to use the ThreadContext in my code (instead of the Markers that Ralph mentioned), and ran into trouble, because I ran into the problem described in https://ops4j1.jira.com/browse/PAXLOGGING-244 , for which the fix was not applied to the 1.10.5 branch. Once I backported that fix to the 1.10.5 branch (making a new pax-logging-api and new pax-logging-log4j2, the ThreadContext worked, and I could re-use logger names and still see which "group" my log statements were from.

Even if I change my code to use ThreadContext, the memory behavior of 1.10.5 with regards to m_loggers is still a leak compared to the old 1.6.1 we were using, as I have been stating all along.

I think the inconsistencies with regard to the following two items (mentioned in my previous post) is also an issue:
1. storing values in the m_loggers maps when m_paxLogging is non-null (only SLF4J API in pax-logging-api 1.10.5 does *not* store it if m_paxLogging is non-null), and
2. getting a new logger even if a name is reused vs. re-using the old logger (only Log4J2 API in pax-logging-api 1.10.5 reuses the logger if the name was already used--other implementations just keep creating new loggers for the same name, and store all of those loggers in m_loggers)

Because of #1 and #2, if I was using Log4J1 API in pax-logging-api 1.10.5, then even if I re-used the name for a non-static logger, the m_loggers just keeps growing. At least with Log4J2, if I re-use the name for a non-static logger, the m_loggers does not grow.

Thanks again,
Monica

Grzegorz Grzybek

unread,
Apr 21, 2020, 2:48:57 AM4/21/20
to op...@googlegroups.com
Hello

Sorry for big delay... I still remember about this issue and I think I can do something about it soon. Just a little bit patience please ;)

regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.

Monica Ron

unread,
Apr 21, 2020, 8:30:31 AM4/21/20
to OPS4J
Thanks. I decided to change my approach. I am not using the previous patch anymore.

I patched the ThreadContext (based on PAXLOGGING-244), reworked my code to use the ThreadContext instead of modifying the logger name, and also made some changes to the pax-logging-api to fix some of the leak issues and to address inconsistencies between the various logging implementations. For my pax-logging-api changes, some of it follows what was done in the 1.11.x branch for PAXLOGGING-307. I no longer swap the order of the WeakHashMap parameters back to the original <Logger, String>. My patch keeps it with the new <String, Logger> parameters, but does not store Logger implementations in the map if the Pax Logging Manager is already created (as mentioned earlier, SLF4J already had this check, but Log4J1 did not). 

I attached my two patches and the instructions I wrote so that my teammates could build the new jars. Feel free to use them or modify them as needed.

Monica
To unsubscribe from this group and stop receiving emails from it, send an email to op...@googlegroups.com.
buildPaxLogging_README.txt
PaxLoggingThreadContext.patch
PaxLoggingMemoryLeak.patch

Grzegorz Grzybek

unread,
Apr 22, 2020, 12:46:37 AM4/22/20
to op...@googlegroups.com
Thanks! Definitely I'll use these patches to fix it in the project.

regards and stay healthy!
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/60e642dd-33d3-4249-beb4-87d2b65d7944%40googlegroups.com.

Grzegorz Grzybek

unread,
May 4, 2020, 2:02:29 AM5/4/20
to op...@googlegroups.com
Hello

FYI, I've changed the memory tests to do logging via 7 "frontends" for each of 3 "backends". These frontends are:

org.slf4j.Logger slf4jLogger = org.slf4j.LoggerFactory.getLogger(name);
slf4jLogger.trace("TRACE through SLF4J");

org.apache.commons.logging.Log commonsLogger = org.apache.commons.logging.LogFactory.getLog(name);
commonsLogger.trace("TRACE through Apache Commons Logging");

org.apache.juli.logging.Log juliLogger = org.apache.juli.logging.LogFactory.getLog(name);
juliLogger.trace("TRACE through JULI Logging");

org.apache.avalon.framework.logger.Logger avalonLogger = org.ops4j.pax.logging.avalon.AvalonLogFactory.getLogger(name);
avalonLogger.debug("DEBUG through Avalon Logger API");

org.jboss.logging.Logger jbossLogger = org.jboss.logging.Logger.getLogger(name);
jbossLogger.trace("TRACE through JBoss Logging Logger API");

org.apache.log4j.Logger log4j1Logger = org.apache.log4j.Logger.getLogger(name);
log4j1Logger.trace("TRACE through Log41 v2 API");

org.apache.logging.log4j.Logger log4j2Logger = org.apache.logging.log4j.LogManager.getLogger(name);
log4j2Logger.trace("TRACE through Log4J v2 API");

Tests with -Xmx64M run like this:

[INFO] Running org.ops4j.pax.logging.it.Log4J1MemoryIntegrationTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 65.928 s - in org.ops4j.pax.logging.it.Log4J1MemoryIntegrationTest
[INFO] Running org.ops4j.pax.logging.it.Log4J2MemoryIntegrationTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 73.524 s - in org.ops4j.pax.logging.it.Log4J2MemoryIntegrationTest
[INFO] Running org.ops4j.pax.logging.it.LogbackMemoryIntegrationTest
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 68.748 s - in org.ops4j.pax.logging.it.LogbackMemoryIntegrationTest

and in memory dump I saw exactly 70016 instances of PaxLoggerImpl and TrackingLogger - which perfectly match what I wanted to achieve with 2.0.x and 1.11.x

Now I'll check these tests with 1.10.x.

regards
Grzegorz Grzybek

Grzegorz Grzybek

unread,
May 4, 2020, 2:44:05 AM5/4/20
to op...@googlegroups.com
Hello²

In Pax Logging 1.10.x it's not that good.

 - org.ops4j.pax.logging.log4jv2.Log4jv2Logger - 10001 instances - ok
 - org.ops4j.pax.logging.log4j2.internal.PaxLoggerImpl - 10010 instances - ok
 - org.apache.logging.log4j.core.Logger - 10010 instances - ok
 - org.ops4j.pax.logging.internal.TrackingLogger - 60011 instances - ok
 - org.apache.log4j.logger - 185599 instances - not ok
 - org.ops4j.pax.logging.avalon.AvalongLogger - 185599 instances - not ok
 - org.apache.commons.logging.internal.JclLogger - 185600 instances - not ok
 - org.apache.juli.logging.internal.JuliLogger - 185600 instances - not ok

SLF4J, JBossLogging seems to be properly GCed. Log4j2 loggers are ok. Log4j1, Avalon, JCL and JULI are broken in Pax Logging 1.10.x

Checking your patches now ;)

regards
Grzegorz Grzybek

Grzegorz Grzybek

unread,
May 4, 2020, 3:10:41 AM5/4/20
to op...@googlegroups.com
Hello³

And finally - many many thanks for your patch! I'm grateful because after applying your patch without changes, my Memory tests (extended to cover all remaining logging APIs/facades) pass without memory leaks on -Xmx64m.


Now the last thing is - if you want a release 1.10.6, just let me know.

kind regards
Grzegorz Grzybek

Grzegorz Grzybek

unread,
May 4, 2020, 4:40:55 AM5/4/20
to op...@googlegroups.com
Hello again

Without waiting, I've just released pax-logging 1.10.6 version - I hope it'll solve all your (Monica Ron) problems ;)

regards
Grzegorz Grzybek

Jean-Baptiste Onofré

unread,
May 4, 2020, 4:41:42 AM5/4/20
to OPS4J
Hi Grzegorz,

Pax Logging 1.11.x is not impacted ?

Regards
JB

Grzegorz Grzybek

unread,
May 4, 2020, 4:43:18 AM5/4/20
to op...@googlegroups.com
Hello

Nope, it's not. Though I've enhanced the memory-related tests. And I'll prepare new 1.11.x/2.0.x releases too, because of https://ops4j1.jira.com/browse/PAXLOGGING-312 == https://issues.apache.org/jira/browse/LOG4J2-2819 == CVE-2020-9488 soon.

regards
Grzegorz Grzybek

Jean-Baptiste Onofré

unread,
May 4, 2020, 4:45:31 AM5/4/20
to OPS4J
Good, thanks for the update.

Regards
JB

Monica Ron

unread,
May 4, 2020, 10:34:21 AM5/4/20
to OPS4J
Hi.
I very much appreciate the update and the new release. Thank you. I'm glad that the memory leak patch helped and that you could see from your updated tests what I had seen regarding leaks.

I hate to tell you this, but you'll need to make some fixes and release 1.10.x again.

First, you didn't include my Log4J2 ThreadContext patch for fixing https://ops4j1.jira.com/browse/PAXLOGGING-244 on the updated 1.10.x branch. Without that fix, my updated code that uses the ThreadContext instead of using so many loggers with different names does not work (because the ThreadContext gets cleared too soon). That PAXLOGGING-244 fix affects the pax-logging-log4j2.jar, not just pax-logging-api.jar. My ThreadContext patch was included in a previous post. As shown in the "README" attached to that post, I applied my ThreadContext patch first, and then applied my memory leak patch.


Second, besides missing the PAXLOGGING-244 fix, you have the following issues in what you did patch. The patch seems to have been applied incorrectly:

pax-logging-api/src/main/java/org/apache/commons/logging/LogFactory.java
Line 262/264: jclLogger gets added to the m_loggers.get(name) list TWICE. (This is wrong--need to delete 264 [keep 262 so that it gets added inside the synchronized block].)

pax-logging-api/src/main/java/org/apache/juli/logging/LogFactory.java
Line 37: Import of java.util.Collections is no longer used, and can be removed.
Line 159: Extra semicolon (no big deal, but might as well remove it while you fix the other issues).
Line 209/211: juliLogger gets added to the m_loggers.get(name) list TWICE. (This is wrong--need to delete 211 [keep 209 so that it gets added inside the synchronized block].)

pax-logging-api/src/main/java/org/apache/log4j/Logger.java
Line 20: Import of java.util.Collections is no longer used, and can be removed.
Line 147/149: logger gets added to the m_loggers.get(name) list TWICE. (This is wrong--need to delete 149 [keep 147 so that it gets added inside the synchronized block].)

pax-logging-api/src/main/java/org/ops4j/pax/logging/avalon/AvalonLogFactory.java
Line 20: Import of java.util.Collections is no longer used, and can be removed.
Line 86-94/95-96: A local variable 'paxLogger' is created on 86-94, but then is not used. Line 95-96 needs to use the paxLogger instead of creating and using a new local variable 'logger'. (Line 95-96 ignores the possibility of needing the "fallback logger".) (Looks like I missed this in my patch. I don't have anything that uses Avalon Log, so I didn't test it. I changed Avalon Log to make all of the logging implementations work similarly with regards to the fallback logger and the real m_paxLogging.getLogger, clearing the m_loggers list in setBundleContext, and only storing a value in m_loggers at all if the m_paxLogging was null when a new instance of a logger was created). I think line 86-94 need to use newName, then line 95 needs to be deleted, and Line 96 needs to use the paxLogger variable.)
Line 102/104: avalonLogger gets added to the m_loggers.get(newName) list TWICE. (This is wrong--need to delete 104 [keep it inside the synchronized block].)

Thus, the getLogger method needs to be the following (to get the correct PaxLogger implementation based on the newName, including possibility of using a "fallback logger"). This is different from my original patch.:
    public static Logger getLogger( AvalonLogger parent, String name )
    {
        String newName;
        if( parent == null )
        {
            newName = name;
        }
        else
        {
            newName = parent.getName() + "." + name;
        }
        PaxLogger paxLogger;
        if( m_paxLogging == null )
        {
            paxLogger = FallbackLogFactory.createFallbackLog( null, newName );
        }
        else
        {
            paxLogger = m_paxLogging.getLogger( newName, AvalonLogger.AVALON_FQCN );
        }
        AvalonLogger avalonLogger = new AvalonLogger( paxLogger );
        if (m_paxLogging == null) {
            synchronized (m_loggers) {
                if (!m_loggers.containsKey(newName)) {
                    m_loggers.put(newName, new LinkedList<AvalonLogger>());
                }
                m_loggers.get(newName).add(avalonLogger);
            }
        }
        return avalonLogger;
    }



pax-logging-api/src/main/java/org/ops4j/pax/logging/log4jv2/Log4jv2LoggerContext.java
Line 49/52: paxLogging.open() is called twice. (This is wrong--need to delete 52 [keep 49 so that it gets updated inside the synchronized block].)

pax-logging-api/src/main/java/org/ops4j/pax/logging/slf4j/Slf4jLoggerFactory.java
Line 38: m_paxLogging was made "package protected" instead of "private". It still works, but not sure why the change, since it isn't used anywhere outside the class? It can be left "package protected" if you want, but it seemed odd (since it is not clear why the change was made).


===============
Please fix the ThreadContext problem (patch provided earlier, based on changes on 1.11.x branch for PAXLOGGING-244) and address the above problems. If you want me to look at it before you actually make the 1.10.7 release, please let me know.

Thanks again!
Monica

Grzegorz Grzybek

unread,
May 5, 2020, 1:53:30 AM5/5/20
to op...@googlegroups.com
Hello again ;)

Thanks for detailed check! I confess - I wasn't able to do `git apply < xxx.patch`, so I did it manually and:
 - I forgot about changing the clear() to removal of "bundle.*" entries
 - I didn't notice double put()

This time I did 2nd review and here's the result: https://github.com/ops4j/org.ops4j.pax.logging/commit/3250cd60675a48530000f74877fb49a68a2313ec - this time before new release ;)

So this time, if you (please) check it, I'll release 1.10.7.

regards
Grzegorz Grzybek

--
--
------------------
OPS4J - http://www.ops4j.org - op...@googlegroups.com

---
You received this message because you are subscribed to the Google Groups "OPS4J" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.

Monica Ron

unread,
May 5, 2020, 3:13:19 PM5/5/20
to OPS4J
Hi. The new checkin looks much better, and matches what I have from my patch.

Please go ahead and release 1.10.7 with the updated code. You might want to mention in the Release Notes that PAXLOGGING-244 (the ThreadContext problem) is now fixed on the 1.10.x branch (and/or update the "Fix Versions" on PAXLOGGING-244 to include 1.10.7). It might help someone to have that info.

Thanks for applying my patches and making an official release. Being able to use an official version is easier than our having to keep around our own patched version.

The investigation of all this improved our code, too, as we now use the ThreadContext instead of making quite so many distinct loggers.

Hopefully this will be the last issue for a while. ;)

Monica
To unsubscribe from this group and stop receiving emails from it, send an email to op...@googlegroups.com.

Grzegorz Grzybek

unread,
May 6, 2020, 5:58:45 AM5/6/20
to op...@googlegroups.com
Hello

I've updated PAXLOGGING-244 issue and I'm just releasing pax-logging 1.10.7.

regards
Grzegorz Grzybek

To unsubscribe from this group and stop receiving emails from it, send an email to ops4j+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/ops4j/a80b948c-a403-44bf-bb34-9e373c0d2f3c%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages