When the issue happened (10:24) wrapper log only shows:
INFO | jvm 3 | 2022/05/18 10:23:44 | May 18, 2022 10:23:44 AM jakarta.mail.Session loadResource
INFO | jvm 3 | 2022/05/18 10:23:44 | WARNING: expected resource not found: /META-INF/javamail.default.address.map
INFO | jvm 3 | 2022/05/18 10:24:06 | May 18, 2022 10:24:05 AM jakarta.mail.Session loadResource
INFO | jvm 3 | 2022/05/18 10:24:06 | WARNING: expected resource not found: /META-INF/javamail.default.address.map
INFO | jvm 3 | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] qtp1718515850-41: Retried waiting for GCLocker too often allocating 256 words
INFO | jvm 3 | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] 105@MessageListener for WorkFinder: Retried waiting for GCLocker too often allocating 386 words
INFO | jvm 3 | 2022/05/18 10:29:42 | May 18, 2022 10:29:42 AM jakarta.mail.Session loadResource
INFO | jvm 3 | 2022/05/18 10:29:42 | WARNING: expected resource not found: /META-INF/javamail.default.address.map
Having a closer look at the log showed that the OOM situation already happened one day earlier than expected. On 05/17 at round about 11:30 the wrapper-log shows the out of memory error messages:
2022-05-17 11:30:39,871 ERROR [104@MessageListener for WorkFinder] JMSMessageListenerAdapter:87 - Exception thrown in message handling by listener com.thoughtworks.go.server.messaging.scheduling.WorkFinder@4cf95d4c
org.springframework.transaction.TransactionSystemException: Could not roll back JDBC transaction; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:329)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:857)
at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:834)
at org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:164)
at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137)
at com.thoughtworks.go.server.transaction.TransactionTemplate.execute(TransactionTemplate.java:28)
at com.thoughtworks.go.server.service.ScheduleService.rescheduleAbandonedBuildIfNecessary(ScheduleService.java:640)
at com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:183)
at com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:165)
at com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:60)
at com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:32)
at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
at org.h2.message.DbException.get(DbException.java:194)
at org.h2.engine.Session.getTransaction(Session.java:1792)
at org.h2.engine.Session.getStatementSavepoint(Session.java:1804)
at org.h2.engine.Session.setSavepoint(Session.java:915)
at org.h2.command.Command.executeUpdate(Command.java:244)
at org.h2.jdbc.JdbcConnection.rollbackInternal(JdbcConnection.java:1530)
at org.h2.jdbc.JdbcConnection.rollback(JdbcConnection.java:561)
at org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781)
at org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781)
at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:326)
... 13 common frames omitted
2022-05-17 11:30:54,167 INFO [WrapperJarAppMain] Jetty9Server:199 - Configuring Jetty using /etc/go/jetty.xml
2022-05-17 11:30:54,236 WARN [WrapperJarAppMain] Server:357 - ErrorPageMapper not supported for Server level Error Handling
2022-05-17 11:30:54,372 WARN [WrapperJarAppMain] AbstractHandler:96 - No Server set for ResourceHandler@51be8b61{STOPPED}
2022-05-17 11:30:58,763 WARN [WrapperJarAppMain] ConnectionManager:117 - The file /etc/go/db.properties specified by `go.db.config` does not exist.
2022-05-17 11:30:59,154 INFO [WrapperJarAppMain] DatabaseMigrator:40 - Upgrading database, this might take a while depending on the size of the database.
2022-05-17 11:30:59,154 INFO [WrapperJarAppMain] DatabaseMigrator:49 - ************************************************************************
2022-05-17 11:30:59,155 INFO [WrapperJarAppMain] DatabaseMigrator:49 - WARNING: Shutting down your server at this point will lead to a database corruption. Please wait until the database upgrade completes.
2022-05-17 11:30:59,155 INFO [WrapperJarAppMain] DatabaseMigrator:49 - ************************************************************************
2022-05-17 11:31:01,356 INFO [WrapperJarAppMain] DatabaseMigrator:57 - Database upgrade completed successfully.
2022-05-17 11:31:01,357 INFO [WrapperJarAppMain] DataMigrationRunner:34 - Running data migrations...
2022-05-17 11:31:01,388 INFO [WrapperJarAppMain] DataMigrationRunner:49 - Data migration took 3 ms
2022-05-17 11:31:01,391 INFO [WrapperJarAppMain] DataMigrationRunner:49 - Data migration took 0 ms
2022-05-17 11:31:01,391 INFO [WrapperJarAppMain] DataMigrationRunner:39 - Data migrations completed.
2022-05-17 11:31:02,206 WARN [WrapperJarAppMain] ConfigurationFactory:136 - No configuration found. Configuring ehcache from ehcache-failsafe.xml found in the classpath: jar:file:/var/lib/go-server/work/jetty-0_0_0_0-8153-cruise_war-_go-any-/webapp/WEB-INF/lib/ehcache-2.10.9.2.jar!/ehcache-failsafe.xml
2022-05-17 11:31:02,383 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.AccessToken]; using defaults.
2022-05-17 11:31:02,436 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.NotificationFilter]; using defaults.
2022-05-17 11:31:02,538 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.User]; using defaults.
2022-05-17 11:31:02,560 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.Plugin]; using defaults.
2022-05-17 11:31:02,570 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.EnvironmentVariable]; using defaults.
2022-05-17 11:31:02,705 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.User.notificationFilters]; using defaults.
2022-05-17 11:31:02,789 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [org.hibernate.cache.UpdateTimestampsCache]; using defaults.
2022-05-17 11:31:02,826 WARN [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [org.hibernate.cache.StandardQueryCache]; using defaults.
2022-05-17 11:31:04,815 WARN [WrapperJarAppMain] BrokerService:2163 - Temporary Store limit is 51200 mb (current store usage is 0 mb). The data directory: /var/lib/go-server only has 45488 mb of usable space. - resetting to maximum available disk space: 45488 mb
2022-05-17 11:31:04,961 INFO [WrapperJarAppMain] ConnectionManager:98 - Done loading query extensions, found com.thoughtworks.go.server.database.h2.H2QueryExtensions@6a25c389
2022-05-17 11:31:08,045 INFO [WrapperJarAppMain] GoConfigMigration:93 - Upgrading config file from version 139 to version 139
2022-05-17 11:31:08,097 INFO [WrapperJarAppMain] GoConfigMigration:101 - Finished upgrading config file
2022-05-17 11:31:08,097 INFO [WrapperJarAppMain] GoConfigMigrator:106 - [Config Save] Starting Config Save post upgrade using FullConfigSaveNormalFlow
2022-05-17 11:31:08,183 INFO [Thread-76] DefaultPluginJarChangeListener:67 - Plugin load starting: /var/lib/go-server/plugins/bundled/gocd-ldap-authentication-plugin.jar
Afterwards it seems the server is running as usual, until the OOM and "locked" issue is occurring 23 hours later. So I can imagine that you are right and it is doing some "unclean restart". I suppose that GoCD is innocent, since you can never plan how to act when out of memory occurs, that's unpredictable. I assume that raising the size will solve the issue.
Thanks again for all the support, this is really a great community.