H2 error: "Database may be already in use" after upgrade


AquilaNiger

May 19, 2022, 3:15:29 AM
to go-cd
Hi everyone,

I really need your help. We were previously using GoCD 20.1.0 and recently upgraded to 22.1.0, including the database migration. Everything went well and the server has been running as usual. However, every now and then (I have no clue how to reproduce it) the database gets locked and errors like these pile up:

Hibernate operation: could not inspect JDBC autocommit mode; ... Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]
Hibernate operation: could not execute query; ... Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]
Could not open JDBC Connection for transaction; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]


Additionally, I can see that two trace files are written in the database directory, the old one and the new one. Is that "works as designed"?

[Attachment: trace.png, a screenshot of the H2 database directory showing the two trace files and a stale cruise.lock.db]

Both traces contain stacktraces like this one:
database: flush
org.h2.message.DbException: General error: "java.lang.IllegalStateException: The file is locked: nio:/var/lib/go-server/db/h2db/cruise.mv.db [1.4.200/7]" [50000-200]...
...

       
What's wrong with the installation? Is there any option or setting in H2 that we could change to fix the issue? Is it correct that two trace files are written? Any idea or help is highly appreciated; otherwise I don't think we can get rid of the issue without a completely new installation.

Thanks in advance!
Julia

Chad Wilson

May 19, 2022, 10:38:55 AM
to go...@googlegroups.com
Hi Julia

This seems odd. My understanding is that by default the H2 database uses a file system based lock to get exclusive access to the file. The default mechanism did change between the H2 versions used by 21.4.0 and 21.5.0.
  • Does the server recover after some time, or you need to restart GoCD or take some other action to fix it?
  • How are you running GoCD? i.e. in which environment? Container? Standard server?
  • Is your DB file on some kind of network mount or something like that?
  • Is there a way to verify there aren't multiple processes/GoCD Instances trying to access the file?
    • When it happens, are you able to use OS-level commands such as lsof to see if other/multiple processes have handles on the DB file? (depends on whether storage is local)
  • It would be good to confirm you don't see GoCD crashing or getting auto-restarted in your logs, to rule out GoCD itself having a different problem and this lock error then being caused by a zombie GoCD process or some kind of stale lock that takes time to expire.
  • Do you have any overrides to DB configuration, e.g a custom config/db.properties file?
To answer your question on the trace files, I think you get two files when the main trace file reaches an H2-configured maximum size. I ask the above question on DB properties because I think GoCD sets that to 16MB by default, whereas yours seems to have grown to 64MB, which is curious.

There is a way to change the locking approach H2 uses (back to the older ;FILE_LOCK=FS, which creates the stale cruise.lock.db you have in your screenshot) if the issue is with the filesystem; however, I imagine you'd want to rule out multiple processes or some other issue first.
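
If it did come to that, the place the override would live is a config/db.properties file alongside your GoCD server config. A very rough sketch, purely for illustration - the default H2 URL, credentials and trace settings here are from memory, so please double-check them against the GoCD database configuration docs before trying anything:

# db.properties - illustrative sketch only, verify against the GoCD DB configuration docs
db.driver=org.h2.Driver
# Point at the existing H2 file; FILE_LOCK=FS switches back to the older lock-file approach.
# TRACE_MAX_FILE_SIZE (in MB) is the trace cap I believe GoCD normally sets to 16.
db.url=jdbc:h2:/var/lib/go-server/db/h2db/cruise;FILE_LOCK=FS;TRACE_MAX_FILE_SIZE=16
# H2's conventional defaults - confirm for your installation
db.user=sa
db.password=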

-Chad


AquilaNiger

May 20, 2022, 3:39:46 AM
to go-cd
Hi Chad,

thanks for your support. 
  • Does the server recover after some time, or you need to restart GoCD or take some other action to fix it?
No, it does not recover; I have to restart GoCD.
  • How are you running GoCD? i.e. in which environment? Container? Standard server?
Standard server on an Ubuntu 18.04.6 LTS virtual machine.
  • Is your DB file on some kind of network mount or something like that?
No, it isn't. 
  • Is there a way to verify there aren't multiple processes/GoCD Instances trying to access the file?
    • When it happens, are you able to use OS-level commands such as lsof to see if other/multiple processes have handles on the DB file? (depends on whether storage is local)
Currently this happens only once in a while (last time there were 6 days between the database issues). lsof is a good idea! I'll try that the next time it happens.  
  • It would be good to confirm you don't see GoCD crashing or getting auto-restarted in your logs, to rule out GoCD itself having a different problem and this lock error then being caused by a zombie GoCD process or some kind of stale lock that takes time to expire.
Actually, we found out from yesterday's go-server.log that the root cause seems to be a Java out-of-memory error:
2022-05-18 10:24:18,741 WARN [105@MessageListener for WorkFinder] BasicDataSource:58 - An internal object pool swallowed an Exception. org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed; SQL statement: ROLLBACK [90098-200]
...
2022-05-18 10:24:18,742 WARN [105@MessageListener for WorkFinder] BasicDataSource:58 - An internal object pool swallowed an Exception. org.h2.jdbc.JdbcSQLNonTransientConnectionException: Out of memory. [90108-200]
...
Caused by: java.lang.OutOfMemoryError: Java heap space

We actually hadn't touched GoCD's heap size up to now. Therefore, we have now increased it in wrapper-properties.conf and hope that the error will be gone. I just hope this doesn't merely defer the error by a few days.
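
For reference, the change is along these lines in wrapper-properties.conf (the values below are only an example of the kind of thing we set, and the property names are the standard Tanuki wrapper ones - best to cross-check them against the wrapper.conf shipped with your GoCD version):

# wrapper-properties.conf - example only; these Tanuki wrapper memory values are in MB
wrapper.java.initmemory=1024
wrapper.java.maxmemory=4096
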
  • Do you have any overrides to DB configuration, e.g a custom config/db.properties file?
No.
 
To answer your question on the trace files, I think you get two files when the main trace file reaches an H2-configured maximum size. I ask the above question on DB properties because I think GoCD sets that to 16MB by default, whereas yours seems to have grown to 64MB, which is curious.
 
Thanks, that explains a lot. You're right, the "old" file contains timestamps from 3:16 to 6:38 and the new one from 6:38 to 8:16.


There is a way to change the locking approach H2 uses (back to the older ;FILE_LOCK=FS, which creates the stale cruise.lock.db you have in your screenshot) if the issue is with the filesystem; however, I imagine you'd want to rule out multiple processes or some other issue first.
 
Thanks for the hint, I'll keep that in mind as "last resort". 

Thank you for your support. I think we'll now wait and see whether the error occurs again.

Julia

Chad Wilson

May 20, 2022, 4:38:37 AM
to go...@googlegroups.com
No problem!

Perhaps what is happening is that after the OOM has occurred, it is being detected by the "wrapper" process shipped with GoCD, and the Java process is being restarted. If so, you should be able to see that auto-restart happening in the logs, including the go-server-wrapper.log.

Normally I'd expect that to ensure the file lock is cleanly released before the new process tries to acquire it, but perhaps it is not shutting down cleanly, or there is something unusual about the file system.

If it keeps happening, then in addition to lsof, perhaps lslocks (or looking at /proc/locks) would be useful to see more explicitly which PIDs have locked which files.
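
For example, something along these lines while the error is actually occurring (the DB path is taken from your error messages; adjust it if your layout differs):

# Which processes have the H2 file open?
lsof /var/lib/go-server/db/h2db/cruise.mv.db
# Which locks are held, and by which PIDs? (the PATH column of lslocks may be blank on some setups)
lslocks | grep -i cruise
# /proc/locks identifies files by device:inode rather than path, so map the file to its inode first
stat -c '%i' /var/lib/go-server/db/h2db/cruise.mv.db
cat /proc/locks
# And confirm only one GoCD server JVM is running
ps -ef | grep go-server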

In any case, fixing the OOMs is probably a good idea regardless of this, so it seems sensible to try with a larger heap to address that issue first.

-Chad



Aravind SV

May 21, 2022, 2:05:10 PM
to go...@googlegroups.com
Apologies if you all see duplicate messages. It's because some of these messages were caught in a Spam filter, and marking these as not spam will likely have re-sent them.

Cheers,
Aravind

On Sat, May 21, 2022 at 7:03 PM AquilaNiger <Aquil...@gmx.de> wrote:
Hi everyone,

I really need your help. We were previously using GoCD 20.1.0 and recently upgraded to 22.1.0, including the database migration. Everything went well and the server has been running as usual. However, every now and then (I have no clue how to reproduce it) the database gets locked and errors like these pile up:

Hibernate operation: could not inspect JDBC autocommit mode; SQL [???]; Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]

Hibernate operation: could not execute query; SQL [SELECT materials.id FROM pipelineMaterialRevisions INNER JOIN pipelines ON pipelineMaterialRevisions.pipelineId = pipelines.id INNER JOIN modifications on modifications.id  = pipelineMaterialRevisions.torevisionId INNER JOIN materials on modifications.materialId = materials.id WHERE materials.id = ? AND pipelineMaterialRevisions.toRevisionId >= ? AND pipelineMaterialRevisions.fromRevisionId <= ? AND pipelines.name = ? GROUP BY materials.id;]; Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]


Could not open JDBC Connection for transaction; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: Database may be already in use: null. Possible solutions: close all other connection(s); use the server mode [90020-200]

Additionally, I can see that two trace files are written in the database directory, the old one and the new one. Is that "works as designed"?

Both traces contain stacktraces like this one:

2022-05-19 06:39:25 database: flush

org.h2.message.DbException: General error: "java.lang.IllegalStateException: The file is locked: nio:/var/lib/go-server/db/h2db/cruise.mv.db [1.4.200/7]" [50000-200]
        at org.h2.message.DbException.get(DbException.java:194)
        at org.h2.message.DbException.convert(DbException.java:347)
        at org.h2.mvstore.db.MVTableEngine$1.uncaughtException(MVTableEngine.java:93)

       ...
       
What's wrong with the installation? Is there any option or setting in H2 that we could change to fix the issue? Is it correct that two trace files are written? Any idea or help is highly appreciated; otherwise I don't think we can get rid of the issue without a completely new installation.

Thanks in advance!
Julia


AquilaNiger

May 23, 2022, 2:45:24 AM
to go-cd
Thanks! I was struggling to get the message posted, as it kept being deleted right after posting. It took me several tries to figure out that the SQL statements in the stack traces were the "evil" part. ;-) So, sorry for the spam.

AquilaNiger

May 23, 2022, 3:06:59 AM
to go-cd
When the issue happened (10:24), the wrapper log only shows:
INFO   | jvm 3    | 2022/05/18 10:23:44 | May 18, 2022 10:23:44 AM jakarta.mail.Session loadResource
INFO   | jvm 3    | 2022/05/18 10:23:44 | WARNING: expected resource not found: /META-INF/javamail.default.address.map
INFO   | jvm 3    | 2022/05/18 10:24:06 | May 18, 2022 10:24:05 AM jakarta.mail.Session loadResource
INFO   | jvm 3    | 2022/05/18 10:24:06 | WARNING: expected resource not found: /META-INF/javamail.default.address.map
INFO   | jvm 3    | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] qtp1718515850-41: Retried waiting for GCLocker too often allocating 256 words
INFO   | jvm 3    | 2022/05/18 10:24:18 | [70452.759s][warning][gc,alloc] 105@MessageListener for WorkFinder: Retried waiting for GCLocker too often allocating 386 words
INFO   | jvm 3    | 2022/05/18 10:29:42 | May 18, 2022 10:29:42 AM jakarta.mail.Session loadResource
INFO   | jvm 3    | 2022/05/18 10:29:42 | WARNING: expected resource not found: /META-INF/javamail.default.address.map

A closer look at the log showed that the OOM situation had already happened one day earlier than expected. On 05/17 at around 11:30 the wrapper log shows the out-of-memory error messages:

2022-05-17 11:30:39,871 ERROR [104@MessageListener for WorkFinder] JMSMessageListenerAdapter:87 - Exception thrown in message handling by listener com.thoughtworks.go.server.messaging.scheduling.WorkFinder@4cf95d4c
org.springframework.transaction.TransactionSystemException: Could not roll back JDBC transaction; nested exception is org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
    at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:329)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.processRollback(AbstractPlatformTransactionManager.java:857)
    at org.springframework.transaction.support.AbstractPlatformTransactionManager.rollback(AbstractPlatformTransactionManager.java:834)
    at org.springframework.transaction.support.TransactionTemplate.rollbackOnException(TransactionTemplate.java:164)
    at org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:137)
    at com.thoughtworks.go.server.transaction.TransactionTemplate.execute(TransactionTemplate.java:28)
    at com.thoughtworks.go.server.service.ScheduleService.rescheduleAbandonedBuildIfNecessary(ScheduleService.java:640)
    at com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:183)
    at com.thoughtworks.go.server.service.BuildAssignmentService.assignWorkToAgent(BuildAssignmentService.java:165)
    at com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:60)
    at com.thoughtworks.go.server.messaging.scheduling.WorkFinder.onMessage(WorkFinder.java:32)
    at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.runImpl(JMSMessageListenerAdapter.java:83)
    at com.thoughtworks.go.server.messaging.activemq.JMSMessageListenerAdapter.run(JMSMessageListenerAdapter.java:63)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.h2.jdbc.JdbcSQLNonTransientConnectionException: The database has been closed [90098-200]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:622)
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
    at org.h2.message.DbException.get(DbException.java:194)
    at org.h2.engine.Session.getTransaction(Session.java:1792)
    at org.h2.engine.Session.getStatementSavepoint(Session.java:1804)
    at org.h2.engine.Session.setSavepoint(Session.java:915)
    at org.h2.command.Command.executeUpdate(Command.java:244)
    at org.h2.jdbc.JdbcConnection.rollbackInternal(JdbcConnection.java:1530)
    at org.h2.jdbc.JdbcConnection.rollback(JdbcConnection.java:561)
    at org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781)
    at org.apache.commons.dbcp2.DelegatingConnection.rollback(DelegatingConnection.java:781)
    at org.springframework.jdbc.datasource.DataSourceTransactionManager.doRollback(DataSourceTransactionManager.java:326)
    ... 13 common frames omitted
2022-05-17 11:30:54,167 INFO  [WrapperJarAppMain] Jetty9Server:199 - Configuring Jetty using /etc/go/jetty.xml
2022-05-17 11:30:54,236 WARN  [WrapperJarAppMain] Server:357 - ErrorPageMapper not supported for Server level Error Handling
2022-05-17 11:30:54,372 WARN  [WrapperJarAppMain] AbstractHandler:96 - No Server set for ResourceHandler@51be8b61{STOPPED}
2022-05-17 11:30:58,763 WARN  [WrapperJarAppMain] ConnectionManager:117 - The file /etc/go/db.properties specified by `go.db.config` does not exist.
2022-05-17 11:30:59,154 INFO  [WrapperJarAppMain] DatabaseMigrator:40 - Upgrading database, this might take a while depending on the size of the database.
2022-05-17 11:30:59,154 INFO  [WrapperJarAppMain] DatabaseMigrator:49 - ************************************************************************
2022-05-17 11:30:59,155 INFO  [WrapperJarAppMain] DatabaseMigrator:49 - WARNING: Shutting down your server at this point will lead to a database corruption. Please wait until the database upgrade completes.
2022-05-17 11:30:59,155 INFO  [WrapperJarAppMain] DatabaseMigrator:49 - ************************************************************************
2022-05-17 11:31:01,356 INFO  [WrapperJarAppMain] DatabaseMigrator:57 - Database upgrade completed successfully.
2022-05-17 11:31:01,357 INFO  [WrapperJarAppMain] DataMigrationRunner:34 - Running data migrations...
2022-05-17 11:31:01,388 INFO  [WrapperJarAppMain] DataMigrationRunner:49 - Data migration took 3 ms
2022-05-17 11:31:01,391 INFO  [WrapperJarAppMain] DataMigrationRunner:49 - Data migration took 0 ms
2022-05-17 11:31:01,391 INFO  [WrapperJarAppMain] DataMigrationRunner:39 - Data migrations completed.
2022-05-17 11:31:02,206 WARN  [WrapperJarAppMain] ConfigurationFactory:136 - No configuration found. Configuring ehcache from ehcache-failsafe.xml  found in the classpath: jar:file:/var/lib/go-server/work/jetty-0_0_0_0-8153-cruise_war-_go-any-/webapp/WEB-INF/lib/ehcache-2.10.9.2.jar!/ehcache-failsafe.xml
2022-05-17 11:31:02,383 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.AccessToken]; using defaults.
2022-05-17 11:31:02,436 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.NotificationFilter]; using defaults.
2022-05-17 11:31:02,538 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.User]; using defaults.
2022-05-17 11:31:02,560 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.Plugin]; using defaults.
2022-05-17 11:31:02,570 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.EnvironmentVariable]; using defaults.
2022-05-17 11:31:02,705 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [com.thoughtworks.go.domain.User.notificationFilters]; using defaults.
2022-05-17 11:31:02,789 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [org.hibernate.cache.UpdateTimestampsCache]; using defaults.
2022-05-17 11:31:02,826 WARN  [WrapperJarAppMain] EhCacheProvider:93 - Could not find configuration [org.hibernate.cache.StandardQueryCache]; using defaults.
2022-05-17 11:31:04,815 WARN  [WrapperJarAppMain] BrokerService:2163 - Temporary Store limit is 51200 mb (current store usage is 0 mb). The data directory: /var/lib/go-server only has 45488 mb of usable space. - resetting to maximum available disk space: 45488 mb
2022-05-17 11:31:04,961 INFO  [WrapperJarAppMain] ConnectionManager:98 - Done loading query extensions, found com.thoughtworks.go.server.database.h2.H2QueryExtensions@6a25c389
2022-05-17 11:31:08,045 INFO  [WrapperJarAppMain] GoConfigMigration:93 - Upgrading config file from version 139 to version 139
2022-05-17 11:31:08,097 INFO  [WrapperJarAppMain] GoConfigMigration:101 - Finished upgrading config file
2022-05-17 11:31:08,097 INFO  [WrapperJarAppMain] GoConfigMigrator:106 - [Config Save] Starting Config Save post upgrade using FullConfigSaveNormalFlow
2022-05-17 11:31:08,183 INFO  [Thread-76] DefaultPluginJarChangeListener:67 - Plugin load starting: /var/lib/go-server/plugins/bundled/gocd-ldap-authentication-plugin.jar


Afterwards the server seems to run as usual, until the OOM and "locked" issue occurs 23 hours later. So I can imagine that you are right and it is doing some kind of "unclean restart". I suppose GoCD itself is innocent, since you can never fully plan how to behave once an out-of-memory error occurs; that's unpredictable. I assume that raising the heap size will solve the issue.

Thanks again for all the support, this is really a great community.

Gulshen Jumayeva

Jul 11, 2022, 2:25:35 PM
to go-cd
Hi all, 

I am facing the same DB issue and am trying to fix it. I tried to find the file in which to set FILE_LOCK=FS for the GoCD DB. Could you help me locate it, please?

Thank you,

Gulshen
