Tomcat8 in Windows 2012 experiencing CPU spiking event

922 views
Skip to first unread message

Feed My Lambs Esq.

unread,
Dec 8, 2015, 10:24:10 AM12/8/15
to DSpace Technical Support
I have posted about this problem in the Stack Overflow community and thought I would cross-post here for posterity. This is happening on our TEST server in the hopes of going live on DSpace 5 asap so any help you can provide at calming Tomcat down would be greatly appreciated.


Here's a direct link to an image of the CPU stack trace / thread dump during the spiking events:

Many thanks for any insight you can give us so we can begin using the new version.

helix84

unread,
Dec 8, 2015, 10:31:02 AM12/8/15
to Feed My Lambs Esq., DSpace Technical Support
YourKit will also tell you when garbage collection occurred. Do the
CPU activity times correspond to GC events? If they do, it's normal
behaviour. GC in Java can be tuned to a high extent (but GC events
will always occur).


Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Feed My Lambs Esq.

unread,
Dec 8, 2015, 10:48:08 AM12/8/15
to DSpace Technical Support, victor....@gmail.com, hel...@centrum.sk
Thank you for your prompt reply! Yes and no.

The blue line has a strong correlation to Garbage Collections but as you can see that represents only a partial amount of CPU used and doesn't occur for every spike. Every 3rd or 4th event has no garbage collection associated. All of the garbage collection over the past hour or so has occurred within the greater high CPU consumption.

Since there are 10 second long periods of high CPU usage with no garbage collection, what could be causing those? The garbage collection was something I also noticed but it looks like new threads keep getting launched every 20 seconds when no one has requested that. If the thousands of threads not been created I also presume there would be no need for garbage collection now.

Have you looked at the "CPU Time" in the image? There is nothing that gets attributed for causing the CPU to run (to my untrained eye) -- is this indicative of a rogue process that just regularly calls `wait(10)`?

Thank you again; I'm happy to try and run or execute anything that might give us more of an idea of what is launching these consumptive threads.

After reading the Etiquette file, I realized I should output the version info:
E:\dspace\bin>dspace version
Using DSpace installation in: E:\dspace
DSpace version:  5.4-SNAPSHOT
  SCM revision:  ${buildNumber}
    SCM branch:  UNKNOWN_BRANCH
            OS:  Windows Server 2012 R2(amd64) version 6.3
  Applications:
     Discovery:  enabled.
           JRE:  Oracle Corporation version 1.8.0_65
   Ant version:  Apache Ant(TM) version 1.9.4 compiled on April 29 2014
 Maven version:  3.3.1
   DSpace home:  E:/dspace

Feed My Lambs Esq.

unread,
Dec 8, 2015, 11:31:17 AM12/8/15
to DSpace Technical Support, victor....@gmail.com, hel...@centrum.sk
Since I have been profiling for 3 days, there are thousands and thousands of instances of MergeThread (org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread) which I presume correlates to the garbage collection events and re-using threads since there's probably a max number of active threads.

Since this is a TEST system, is there ANY reason to have this many threads launched, re-used, and garbage collected? I've run indexing and run searches and opened items but not thousands and thousands :-) With no users hitting the system, I would expect a dozen threads have been needed over the past 3 days but something is launching and launching...

Our version of Tomcat is 8.0.23 and there is an upgrade available to 8.0.30. Is that false hope to upgrade and have a different experience? If DSpace code isn't launching these threads (is that what the screen capture reveals?) then can we focus on Tomcat?

Bonus info...there are also hundreds of instances of java.lang.Thread recorded in the past 3 days which all reportedly lasted for 60,000 milliseconds (60 seconds). Although there is 1 java.lang.Thread has lasted for 258,752,817 ms or 71 hours(!)

Feed My Lambs Esq.

unread,
Dec 8, 2015, 2:49:58 PM12/8/15
to DSpace Technical Support, victor....@gmail.com, hel...@centrum.sk
After removing default filters from the profiling tool, it was revealed that the work that is being done every ~20 seconds and lasting for 8-10 seconds includes the following functions via the updated thread dump:

  • java.io.WinNTFileSystem.canonicalize0 (6-8 seconds)
  • java.io.WinNTFileSystem.getBooleanAttributes (inFile.exists()) (~1 second)
  • StardardRoot.java (~1 second)
Is there a reason DSpace would launch a check like this every ~20 seconds? Or could Tomcat have a reason to launch something to find out if a file exists?

A contributor on Stack Overflow discovered that curated tasks could be run on a schedule -- is there a way to disprove a DSpace scheduled / periodic task is running? When I ran a test curated task it appeared in the dspace.log file but no updates to the logs have happened in the last hour and yet the periodic spikes are still occurring.

(Stack Overflow was updated as well.)

helix84

unread,
Dec 9, 2015, 6:29:10 AM12/9/15
to Feed My Lambs Esq., DSpace Technical Support
On Tue, Dec 8, 2015 at 8:49 PM, Feed My Lambs Esq.
<victor....@gmail.com> wrote:
> java.io.WinNTFileSystem.canonicalize0 (6-8 seconds)
> java.io.WinNTFileSystem.getBooleanAttributes (inFile.exists()) (~1 second)
> StardardRoot.java (~1 second)
>
> Is there a reason DSpace would launch a check like this every ~20 seconds?
> Or could Tomcat have a reason to launch something to find out if a file
> exists?

If you defined your webapp Contexts in
[tomcat]/conf/Catalina/localhost (as opposed to server.xml), Tomcat
should check periodically whether those files have been modified so
that it can load/unload/reload contexts. Whether Tomcat will be
notified of changes or will poll for changes will be platform
specific.

I don't think anything in DSpace does such regular checks as of DSpace
5. DSpace 6 may introduce such checks.


Unfortunately nothing else that could answer your other questions
occurs to me at the moment. Please do share your findings with various
Tomcat / Java versions. It might be useful to someone, especially
because I don't think there are too many people running on Windows in
production.

Feed My Lambs Esq.

unread,
Dec 9, 2015, 3:31:19 PM12/9/15
to DSpace Technical Support, victor....@gmail.com, hel...@centrum.sk
helix84, thank you for this idea. It pushed me towards the potential solution. With the following change to our 3 "contexts" XML files, the CPU spikes suddenly and completely stopped:

* reloadable="false"
* cachingAllowed="true"

(But I suspect 'reloadable' was the main change - more later)

We did have 3 individual Catalina/localhost contexts (ROOT.xml, solr.xml, and oai.xml). Their content before was:
<?xml version='1.0'?>
<Context
    docBase="E:/dspace/..."
    reloadable="true"
    cachingAllowed="false"/>

Changing those flags in-place and saving suddenly stopped the CPU behavior I have documented -- even without stopping/starting Tomcat!

Your warning regarding "as opposed to server.xml" sounded like server.xml was the approved / better place for these contexts (although the documentation here seems to imply 3 separate documents is most proper). I have since moved all 3 into server.xml at your recommendation. Of note, if I leave the settings for reloadable="true", the spiking activity that started this thread happily returned. For our installation, there appears to be no safe place for these contexts to be in development (reloadable='true') mode. Every 20 seconds to be checking for changed files seems egregious until you're actively making changes.

In my catalina (Tomcat) log file, I also found the following warning 3 times with each start-up:
org.apache.tomcat.util.digester.SetPropertiesRule.begin [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property 'cachingAllowed' to 'true' did not find a matching property.

There is not a correlated comment about reloadable so does our application need to create a container somehow for cachingAllowed to make the intended difference? I haven't found anyone else posting about this warning so we may be doing something uniquely wrong. The only place I found a reference to reloadable in our application was in dspace/config/default.context but adding the cachingAllowed property there didn't stop the error message.

Here are the contexts now with 'debug' and 'allowLinking' as holdovers from when we ran DSpace 3 (those two properties also make catalina WARNINGs):

<!-- DEFINE A CONTEXT PATH FOR DSpace JSP User Interface -->
<Context path="" docBase="E:\dspace\webapps\xmlui" debug="0"
reloadable="false" cachingAllowed="true"
allowLinking="true"/>

<!-- DEFINE A CONTEXT PATH FOR DSpace Solr Statistics -->
<Context path="/solr" docBase="E:\dspace\webapps\solr" debug="0"
reloadable="false" cachingAllowed="true"
allowLinking="true"/>
<Context path="/oai" docBase="E:\dspace\webapps\oai" debug="0"
reloadable="false" cachingAllowed="true"
allowLinking="true"/>

* Since the symptoms of our problem have dissipated, are we in the clear?
* Do I need to do something to make sure the cachingAllowed property is getting handled properly with respect to the WARNING messages?

Thank you!

helix84

unread,
Dec 10, 2015, 3:45:34 AM12/10/15
to Feed My Lambs Esq., DSpace Technical Support
On Wed, Dec 9, 2015 at 9:31 PM, Feed My Lambs Esq.
<victor....@gmail.com> wrote:
> Changing those flags in-place and saving suddenly stopped the CPU behavior I
> have documented -- even without stopping/starting Tomcat!

Yes, as expected, it was the reloadable attribute.

> Your warning regarding "as opposed to server.xml" sounded like server.xml
> was the approved / better place for these contexts (although the
> documentation here seems to imply 3 separate documents is most proper). I
> have since moved all 3 into server.xml at your recommendation.

It's better only because it allows to make them separately reloadable.
If you don't want to use that feature, I don't think it matters
whether you use server.xml or context fragments.

The thing is on systems like Linux, Tomcat doesn't poll for changes to
its xml files, it is notified by the OS via subsystems like inotify.
It's easily possible that Windows doesn't have such subsystem or if it
has, Tomcat is not programmed to use it, thus it has to poll for
changes periodically.

> In my catalina (Tomcat) log file, I also found the following warning 3 times
> with each start-up:
> org.apache.tomcat.util.digester.SetPropertiesRule.begin
> [SetPropertiesRule]{Server/Service/Engine/Host/Context} Setting property
> 'cachingAllowed' to 'true' did not find a matching property.
>
> There is not a correlated comment about reloadable so does our application
> need to create a container somehow for cachingAllowed to make the intended
> difference? I haven't found anyone else posting about this warning so we may
> be doing something uniquely wrong. The only place I found a reference to
> reloadable in our application was in dspace/config/default.context but
> adding the cachingAllowed property there didn't stop the error message.
>
> Here are the contexts now with 'debug' and 'allowLinking' as holdovers from
> when we ran DSpace 3 (those two properties also make catalina WARNINGs):

Those attributes were simply removed in newer versions of Tomcats. The
warning just tells you that it doesn't do anything, it's not
dangerous.

Compare:
https://tomcat.apache.org/tomcat-8.0-doc/config/context.html#Attributes
https://tomcat.apache.org/tomcat-6.0-doc/config/context.html#Attributes

Feed My Lambs Esq.

unread,
Dec 10, 2015, 8:31:40 AM12/10/15
to DSpace Technical Support, victor....@gmail.com, hel...@centrum.sk
Thank you helix84, I guess it really does just boil down to the reloadable attribute. We even had that change as part of our upgrade notes -- lesson learned! Now I know where the changes are to be made intimately.

As I mentioned in my last post, though, our live, production server has kept its reloadable attribute set to 'true' for year(s) and does not reflect this behavior which leads me to presume that Tomcat on Windows may have changed. Both Tomcat and Windows and Java are different versions, though, between our old and new systems.

Thanks so much for your help and recommendations. It was very encouraging and appreciated!

To put it all in one place, the problematic configuration was having reloadable set to true for my Tomcat 'contexts' on the following architecture:

Windows Server 2012 R2
Tomcat: 8.0.23
DSpace: 5.4
Java: jre1.8.0_65
Reply all
Reply to author
Forward
0 new messages