jdbc:h2:%s;PAGE_SIZE=32768;CACHE_SIZE=262144;FILE_LOCK=NO;DB_CLOSE_DELAY=-1;COMPRESS_LOB=LZF;LOG=0;DEFAULT_LOCK_TIMEOUT=999999999;MAX_LENGTH_INPLACE_LOB=4096;MAX_MEMORY_ROWS=500000;LARGE_RESULT_BUFFER_SIZE=65536;MULTI_THREADED=1;DB_CLOSE_ON_EXIT=FALSE;JMX=TRUE
I figured out why I was thinking it wasn't being set. I just set a breakpoint in the Session constructor and when I hit that was inspecting what was being pulled out of Database#findSetting(..). The first time this breakpoint was being hit was trying to construct a session apparently before any of my extra settings were being SET on the database. I guess H2 uses this "primordial" session to do the SET updates? If I waited to activate the breakpoint until right before my query that timed out then set the breakpoint - I got the actual Session used for that statement, and it did pull the correct default lock timeout out of the Database#findSettings(). Sorry for the confusion. Unfortunately it still doesn't explain how this lock timeout occurred, but I have a guess (below).
2014-06-17 10:01:00,821 ERROR ve-pair-profiling-linkloader-1 link.LinkGraphLoader - Problem loading the link; isactive trx? true
org.hibernate.PessimisticLockException: could not extract ResultSet
at org.hibernate.dialect.H2Dialect$2.convert(H2Dialect.java:342)
at org.hibernate.exception.internal.StandardSQLExceptionConverter.convert(StandardSQLExceptionConverter.java:49)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:126)
at org.hibernate.engine.jdbc.spi.SqlExceptionHelper.convert(SqlExceptionHelper.java:112)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:89)
at org.hibernate.loader.Loader.getResultSet(Loader.java:2065)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1862)
at org.hibernate.loader.Loader.executeQueryStatement(Loader.java:1838)
at org.hibernate.loader.Loader.doQuery(Loader.java:909)
at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354)
at org.hibernate.loader.Loader.doList(Loader.java:2553)
at org.hibernate.loader.Loader.doList(Loader.java:2539)
at org.hibernate.loader.Loader.listIgnoreQueryCache(Loader.java:2369)
at org.hibernate.loader.Loader.list(Loader.java:2364)
at org.hibernate.loader.hql.QueryLoader.list(QueryLoader.java:496)
at org.hibernate.hql.internal.ast.QueryTranslatorImpl.list(QueryTranslatorImpl.java:387)
at org.hibernate.engine.query.spi.HQLQueryPlan.performList(HQLQueryPlan.java:231)
at org.hibernate.internal.SessionImpl.list(SessionImpl.java:1264)
at org.hibernate.internal.QueryImpl.list(QueryImpl.java:103)
at org.hibernate.internal.AbstractQueryImpl.uniqueResult(AbstractQueryImpl.java:966)
at com.argodata.match.matching.analysis.link.LinkGraphDao.fetchGroupForEntity(LinkGraphDao.java:114)
at com.argodata.match.matching.analysis.link.LinkGraphLoader.addLinkInTransaction(LinkGraphLoader.java:252)
at com.argodata.match.matching.analysis.link.LinkGraphLoader.addLinkHoldingLocks(LinkGraphLoader.java:216)
at com.argodata.match.matching.analysis.link.LinkGraphLoader.addLink(LinkGraphLoader.java:151)
at com.argodata.match.profiling.profiling.AsyncPairProfilingLinkerCallback.loadLink(AsyncPairProfilingLinkerCallback.java:91)
at com.argodata.match.profiling.profiling.AsyncPairProfilingLinkerCallback.access$3(AsyncPairProfilingLinkerCallback.java:90)
at com.argodata.match.profiling.profiling.AsyncPairProfilingLinkerCallback$1.consume(AsyncPairProfilingLinkerCallback.java:68)
at com.argodata.match.profiling.profiling.AsyncPairProfilingLinkerCallback$1.consume(AsyncPairProfilingLinkerCallback.java:1)
at com.argodata.jcf.commons.concurrent.AdaptiveConsumerExecutor$PoolWorker.consumeWork(AdaptiveConsumerExecutor.java:228)
at com.argodata.jcf.commons.concurrent.AdaptiveConsumerExecutor$PoolWorker.run(AdaptiveConsumerExecutor.java:201)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.h2.jdbc.JdbcSQLException: Timeout trying to lock table "LINKENTITY"; SQL statement:
select linkgroup0_.groupId as groupId1_1_, linkgroup0_.mergedTo as mergedTo2_1_, linkgroup0_.status as status3_1_ from LinkGroup linkgroup0_ cross join LinkEntity linkentity1_ where linkentity1_.groupId=linkgroup0_.groupId and linkentity1_.id=? [50200-170]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:329)
at org.h2.message.DbException.get(DbException.java:169)
at org.h2.message.DbException.get(DbException.java:146)
at org.h2.table.RegularTable.doLock(RegularTable.java:501)
at org.h2.table.RegularTable.lock(RegularTable.java:435)
at org.h2.table.TableFilter.lock(TableFilter.java:143)
at org.h2.command.dml.Select.queryWithoutCache(Select.java:604)
at org.h2.command.dml.Query.query(Query.java:307)
at org.h2.command.dml.Query.query(Query.java:277)
at org.h2.command.dml.Query.query(Query.java:36)
at org.h2.command.CommandContainer.query(CommandContainer.java:86)
at org.h2.command.Command.executeQuery(Command.java:191)
at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:109)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeQuery(NewProxyPreparedStatement.java:482)
at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:80)
... 30 more
I've looked at the doLock code and its so straightforward that there is just no way to explain this stacktrace - unless - somehow the lock timeout wasn't being set in the session correctly or something else was updating it. Like I said, when I break in the debugger while its executing this statement the lock timeout is getting passed in correctly.
It's worth mentioning that this happened on a 32 core server, which had been doing contentious work for over 45 minutes before, which should've hit lock timeouts > 2 seconds before (2-3 threads fighting to insert records in to a few tables that all have secondary indexes).
I haven't dug through all of the code related to parsing the connection URL, but Im guessing that all of the extra settings passed in to the URL are going through the Set update dml code to get them set in the Database#settings map. My app is going to be requesting a lot of connections concurrently when it starts doing work. Is is plausible that somehow the first created connections (with associated Sessions) are racing with each other to call SET for all of the extra settings in the db url? And somehow some connections are getting out (with constructed Sessions) that are seeing the default lock timeout before the SETs have all happened? This would explain the non-determinism and why its hard for me to replicate, and why I can't see any Sessions with incorrect lock timeouts on my machine. It would also explain why it was able to work for so long: if there were 30ish connections in the connection pool and only 1 had this wrong timeout, then probability of getting that one is 1/32 + prob of exceeding 2 second lock wait... maybe thats low enough that it didn't happen in the first 45 mins.
Going off of this wild guess: I added LOCK_TIMEOUT=999999999 to the connection URL (without any valid reason to do so) as well as added a c3p0 connection customizer to actually call SET LOCK_TIMEOUT on the actual connection on checkout before it ever goes to hibernate. With both of these we've been able to run the workload a few times now without any lock timeouts. Not willing to call it solved yet, but interesting. Does that theory have any chance of being correct? I'm on 1.3.170 perhaps I need to upgrade (and just turn off MVStore)