H2 DB corruption

102 views
Skip to first unread message

Meni Hillel

unread,
Jan 24, 2020, 3:45:43 PM1/24/20
to H2 Database
For no evident reason, we're getting a random failure on a simple query reporting a corruption error. Program continues successfully and consecutive transactions works fine. But has been very puzzling and we cannot figure any root cause.


Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: File corrupted in chunk 73350, expected page length 4..256, got 404232216 [1.4.200/6]"; SQL statement:
SELECT COUNT(*) FROM task WHERE job_fk = 694 [50000-200]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:505) ~[factory.jar:2.1]
        at org.h2.message.DbException.getJdbcSQLException(DbException.java:429) ~[factory.jar:2.1]
        at org.h2.message.DbException.get(DbException.java:194) ~[factory.jar:2.1]
        at org.h2.message.DbException.convert(DbException.java:347) ~[factory.jar:2.1]
        at org.h2.command.Command.executeQuery(Command.java:212) ~[factory.jar:2.1]
        at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114) ~[factory.jar:2.1]
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:82) ~[factory.jar:2.1]
        ... 26 more
Caused by: java.lang.IllegalStateException: File corrupted in chunk 73350, expected page length 4..256, got 404232216 [1.4.200/6]
        at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950) ~[factory.jar:2.1]
        at org.h2.mvstore.Chunk.readBufferForPage(Chunk.java:368) ~[factory.jar:2.1]
        at org.h2.mvstore.MVStore.readBufferForPage(MVStore.java:1214) ~[factory.jar:2.1]
        at org.h2.mvstore.MVStore.readPage(MVStore.java:2209) ~[factory.jar:2.1]
        at org.h2.mvstore.MVMap.readPage(MVMap.java:672) ~[factory.jar:2.1]
        at org.h2.mvstore.Page$NonLeaf.getChildPage(Page.java:1043) ~[factory.jar:2.1]
        at org.h2.mvstore.Cursor.hasNext(Cursor.java:53) ~[factory.jar:2.1]
        at org.h2.mvstore.tx.TransactionMap$CommittedIterator.fetchNext(TransactionMap.java:779) ~[factory.jar:2.1]
        at org.h2.mvstore.tx.TransactionMap$TMIterator.next(TransactionMap.java:952) ~[factory.jar:2.1]
        at org.h2.mvstore.db.MVSecondaryIndex$MVStoreCursor.next(MVSecondaryIndex.java:474) ~[factory.jar:2.1]
        at org.h2.index.IndexCursor.next(IndexCursor.java:291) ~[factory.jar:2.1]
        at org.h2.table.TableFilter.next(TableFilter.java:529) ~[factory.jar:2.1]
        at org.h2.command.dml.Select.gatherGroup(Select.java:520) ~[factory.jar:2.1]
        at org.h2.command.dml.Select.queryGroup(Select.java:487) ~[factory.jar:2.1]
        at org.h2.command.dml.Select.queryWithoutCache(Select.java:839) ~[factory.jar:2.1]
        at org.h2.command.dml.Query.queryWithoutCacheLazyCheck(Query.java:201) ~[factory.jar:2.1]
        at org.h2.command.dml.Query.query(Query.java:489) ~[factory.jar:2.1]
        at org.h2.command.dml.Query.query(Query.java:451) ~[factory.jar:2.1]
        at org.h2.command.CommandContainer.query(CommandContainer.java:285) ~[factory.jar:2.1]
        at org.h2.command.Command.executeQuery(Command.java:195) ~[factory.jar:2.1]
        at org.h2.jdbc.JdbcPreparedStatement.executeQuery(JdbcPreparedStatement.java:114) ~[factory.jar:2.1]
        at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:82) ~[factory.jar:2.1]
        ... 26 more



Additional info:

1) Using latest H2 version (1.4.200)
2) JDBC connection URL "jdbc:h2:retry://.dbfile;LOCK_TIMEOUT=10000;AUTO_RECONNECT=TRUE;AUTO_SERVER=TRUE;DB_CLOSE_ON_EXIT=FALSE;";


Meni Hillel

unread,
Jan 24, 2020, 4:03:47 PM1/24/20
to H2 Database
One more piece of information that may be important - we are taking a periodic backup hourly. We do it the proper way, using the "BACKUP TO" command. Don't know if that could be a factor.

Evgenij Ryazanov

unread,
Jan 24, 2020, 8:48:34 PM1/24/20
to H2 Database
Hello.

The BACKUP command most likely is not related (if it isn't invoked by another process with different version of H2). But there are three suspicious options in your connection URL.

1. If you use the retry: subsystem, I assume that you interrupt your threads that perform calls into database. It's a bad practice and usually it leads to corruption of embedded databases (TCP clients aren't affected), but if you can't change your application to avoid it, it's better to use the newer async: file system, the retry: file system has some own problems.

2. DB_CLOSE_ON_EXIT=FALSE means that H2 will be unable to shutdown the database properly automatically during termination of the Java process. If you use this option, you must do it by yourself using the SHUTDOWN command. Your process must wait for its completion and it should not try to reopen the database after it.

3. The AUTO_SERVER=TRUE option is used to connect to the database from a multiple applications without a TCP server. Usually it's better and safer to use a normal server, but if you can't do it, you may use this option. But you should ensure that all applications have exactly the same version of H2. For example, the similar failure was reported multiple time by users of IntellyJ IDEA Database tool, it was shipped with H2 1.4.196 and attempts to mix 1.4.196 and 1.4.200 led to such corruption. Only the TCP server and its clients may have different versions legally. With auto server versions should be the same.

4. If your database was created by a some older version, it's better to export it into an SQL script with a SCRIPT TO 'filename.sql' command, create a new database with 1.4.200, and execute a RUNSCRIPT FROM 'filename.sql' in it to create a database that definitely doesn't inherit any problems from older versions.

Meni Hillel

unread,
Jan 24, 2020, 11:12:29 PM1/24/20
to H2 Google Group
Thank you for the comprehensive and thoughtful response.

1) It is true that we may have the potential to interrupt a thread in a middle of a transaction, but it may happen of a very rare condition which absolutely positively did not take place.

2) That is correct and we indeed do call SHUTDOWN.

3) We do make possible queries from another process when we perform debugging/diagnostics, but we using the exact H2 version, and, this did not take place at all throughout the life cycle of the running process.

4) DB was created with only one version, which is the latest - 1.4.200.


I am curious about the possible circumstances under which this error takes place... Is it related to concurrent I/O?  Multi threading? 


Thanks,
Meni


--
You received this message because you are subscribed to a topic in the Google Groups "H2 Database" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/h2-database/M1jKZD5rXYQ/unsubscribe.
To unsubscribe from this group and all its topics, send an email to h2-database...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/h2-database/5a0bc11a-3fb9-45fd-9883-c2bb05bf00e4%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages