Good Afternoon All,
This weekend RunDeck locked up from around noon on Saturday until 9:00 am on Monday, when we had to restart the server to get it processing jobs again.
This has happened before, but usually because of factors outside RunDeck, such as backups failing or restarts not completing. This time nothing on the OS side was out of the ordinary.
The CPU tends to stay at or near 70% and RunDeck just hangs, with no further jobs being processed until the Tomcat server is restarted.
While trying to find out what was happening, I found the following in the Catalina logs. Does anyone know what we should be checking? How can we prevent this from happening, especially overnight and on weekends? I couldn't get a thread dump because the hang happened when nobody was around to observe it, and the web GUI doesn't respond at all once it is hung.
Catalina Logs:
Saturday:
11-Jun-2022 12:54:22.653 WARNING [pool-11-thread-2] com.google.common.cache.LocalCache$Segment$1.run Exception thrown during refresh
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space
...
...
Caused by: java.lang.OutOfMemoryError: Java heap space
And again today:
14-Jun-2022 12:51:44.775 WARNING [Thread-495] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [rundeck] appears to have started a thread named [Thread-7] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
sun.nio.fs.WindowsNativeDispatcher.GetQueuedCompletionStatus0(Native Method)
sun.nio.fs.WindowsNativeDispatcher.GetQueuedCompletionStatus(WindowsNativeDispatcher.java:1007)
sun.nio.fs.WindowsWatchService$Poller.run(WindowsWatchService.java:586)
java.lang.Thread.run(Thread.java:748)
14-Jun-2022 12:34:16.974 WARNING [pool-11-thread-1] com.google.common.cache.LocalCache$Segment$1.run Exception thrown during refresh
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
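In case it helps anyone going through the same kind of logs, here is a rough sketch of how the OutOfMemoryError entries above could be pulled out of a Catalina log automatically. It assumes the default Tomcat timestamp prefix seen in the excerpts (date, time, level), and the function names `find_oom_events` and `summarize` are my own, not part of RunDeck or Tomcat.

```python
import re
from collections import Counter

# Start of a log entry in the format shown above, e.g.
# "11-Jun-2022 12:54:22.653 WARNING [pool-11-thread-2] ..."
# (assumption: the default Tomcat log format is in use).
ENTRY_START = re.compile(
    r"^(\d{2}-[A-Za-z]{3}-\d{4}) (\d{2}:\d{2}:\d{2}\.\d{3}) (\w+) "
)
# Captures the reason after "java.lang.OutOfMemoryError: ".
OOM = re.compile(r"java\.lang\.OutOfMemoryError: (.+?)\s*$")


def find_oom_events(lines):
    """Return one (date, time, reason) tuple per log entry mentioning an OOM."""
    events = []
    current = None    # (date, time) of the entry currently being read
    reported = False  # True once the current entry has produced an event
    for line in lines:
        start = ENTRY_START.match(line)
        if start:
            current = (start.group(1), start.group(2))
            reported = False
        oom = OOM.search(line)
        # A stack trace repeats the error in "Caused by:" lines,
        # so only the first mention per entry is recorded.
        if oom and current and not reported:
            events.append((current[0], current[1], oom.group(1)))
            reported = True
    return events


def summarize(events):
    """Count events per OutOfMemoryError reason."""
    return Counter(reason for _, _, reason in events)
```

To use it, pass an open log file (or any iterable of lines) to `find_oom_events` and feed the result to `summarize`. Run from a scheduled task, something like this could flag the first heap error instead of us finding it days later.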
SETUP:
Server: Windows 2016 (8 core / 35 GB ram)
Tomcat: 9.0.55
JAVA_OPTS=-XX:MaxPermSize=256m -Xmx1024m -Xms512m
RunDeck
APIVERSION : 40
BUILD : 3.4.10-20220118
BUILDGIT : v3.4.10-0-g01b8e84
VERSION : 3.4.10-20220118
ThreadPoolSize:10
RDECK_CLI_OPTS=-Xms1028m -Xmx4096m
JVM
IMPLEMENTATIONVERSION : 25.312-b07
NAME : OpenJDK 64-Bit Server VM
VENDOR : Amazon.com Inc.
VERSION : 1.8.0_312
DB
MySQL MariaDB: 10.6.3