out of memory causes corrupted database (using h2-1.2.144.jar)


mhe123

Oct 22, 2010, 5:18:39 AM
to H2 Database
First of all, congratulations on what has so far been one of the most
fantastic pieces of software I have ever enjoyed.

Now the bad news (the subject line says it all):

Due to a bug that I recently introduced into my application, an
out-of-memory condition occurred and caused the application to
crash. Upon restarting, I get this error:

SEVERE: null
org.h2.jdbc.JdbcSQLException: File corrupted while reading record:
"16754 of 13346". Possible solution: use the recovery tool [90030-144]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:327)
at org.h2.message.DbException.get(DbException.java:167)
at org.h2.message.DbException.get(DbException.java:144)
at org.h2.store.PageStore.readPage(PageStore.java:1256)
at org.h2.store.PageStore.getPage(PageStore.java:695)
at org.h2.index.PageDataIndex.getPage(PageDataIndex.java:220)
at org.h2.index.PageDataNode.getLastKey(PageDataNode.java:208)
at org.h2.index.PageDataIndex.<init>(PageDataIndex.java:76)
at org.h2.table.RegularTable.<init>(RegularTable.java:75)
at org.h2.store.PageStore.addMeta(PageStore.java:1610)
at org.h2.store.PageStore.readMetaData(PageStore.java:1542)
at org.h2.store.PageStore.recover(PageStore.java:1340)
at org.h2.store.PageStore.openExisting(PageStore.java:346)
at org.h2.store.PageStore.open(PageStore.java:275)
at org.h2.engine.Database.getPageStore(Database.java:2030)
at org.h2.engine.Database.open(Database.java:537)
at org.h2.engine.Database.openDatabase(Database.java:208)
at org.h2.engine.Database.<init>(Database.java:203)
at org.h2.engine.Engine.openSession(Engine.java:53)
at org.h2.engine.Engine.openSession(Engine.java:147)
at org.h2.engine.Engine.createSessionAndValidate(Engine.java:126)
at org.h2.engine.Engine.createSession(Engine.java:109)
at org.h2.engine.Engine.createSession(Engine.java:27)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:289)
at org.h2.engine.SessionRemote.createSession(SessionRemote.java:265)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:110)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:94)
at org.h2.Driver.connect(Driver.java:72)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at com.ma1.m2.M2.initConn(M2.java:1717)
at com.ma1.m2.M2.init(M2.java:1674)
at com.ma1.m2.M2.<clinit>(M2.java:230)


So I ran the recovery tool, and then the runscript tool on the script
generated by the recovery tool; however, this didn't work.
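For reference, the sequence I used was along these lines (the database
name and paths here are illustrative; Recover writes a <database>.h2.sql
script into the given directory, which RunScript then replays into a
fresh database):

```shell
# Run the recovery tool against the directory holding the corrupted database
java -cp h2-1.2.144.jar org.h2.tools.Recover -dir ~/data -db test

# Replay the generated script into a new, empty database
java -cp h2-1.2.144.jar org.h2.tools.RunScript \
    -url jdbc:h2:~/data/test2 -user sa -script ~/data/test.h2.sql
```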

I can't afford to use H2 if an accidental bug leading to an
out-of-memory condition (which happens even with the best of products)
could corrupt a database.

I don't fully understand the inner workings of H2 (though I have spent
a fair bit of time studying the source); nevertheless, couldn't the
database store some kind of flag (say, a marker byte or a CRC check)
to determine whether a partially written record is broken? I'm
wondering about the implementation that allows a corrupted database in
the first place.
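For what it's worth, the kind of per-record check I mean could be
sketched like this (a hypothetical illustration using
java.util.zip.CRC32 from the JDK, not H2's actual on-disk format):

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch: append a CRC32 checksum to each record so a
// partially written (torn) record can be detected on read.
public class RecordChecksum {

    // Encode: payload followed by its CRC32 value.
    static byte[] write(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload);
        ByteBuffer buf = ByteBuffer.allocate(payload.length + 8);
        buf.put(payload);
        buf.putLong(crc.getValue());
        return buf.array();
    }

    // Decode: recompute the CRC and compare; a mismatch means the
    // record was torn or corrupted and should be treated as invalid.
    static boolean isValid(byte[] record) {
        if (record.length < 8) {
            return false; // too short to even hold the checksum
        }
        byte[] payload = new byte[record.length - 8];
        ByteBuffer buf = ByteBuffer.wrap(record);
        buf.get(payload);
        long stored = buf.getLong();
        CRC32 crc = new CRC32();
        crc.update(payload);
        return crc.getValue() == stored;
    }

    public static void main(String[] args) {
        byte[] rec = write("some row data".getBytes());
        System.out.println(isValid(rec));  // prints true
        rec[3] ^= 0x55;                    // simulate a torn write
        System.out.println(isValid(rec));  // prints false
    }
}
```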

In the meantime, I have (quite unfortunately) had to completely
suspend usage of H2.



Thomas Mueller

Oct 22, 2010, 11:44:55 AM
to h2-da...@googlegroups.com
Hi,

I am very interested in analyzing and solving this problem. Corruption
problems have top priority for me.

An OutOfMemory error shouldn't corrupt the database. On OOM, the
database file should be closed immediately (there is a test case for
this situation: TestOutOfMemory). And the database should never get
corrupted if the database file is closed at any point. For this
situation (stopping the process at any point) there are a lot of test
cases; from time to time I run all existing test cases in a mode
where, after every x write operations, the current database file is
copied and then reopened.

Maybe the OOM problem occurred at a strange place, or for some reason
the database didn't close itself automatically. Do you still have the
OOM stack trace, or the .trace.db file? If so, could you check whether
an exception is logged there?

I have a few more questions:

- What is your database URL?
- How many connections does your application use concurrently?
- Do you use temporary tables?
- Did you use LOG=0 or LOG=1?
- With which version of H2 was this database created?
You can find it out using:
select * from information_schema.settings where name='CREATE_BUILD'
or have a look in the SQL script created by the recover tool.
- Do you use any settings or special features (for example cache settings,
two phase commit, linked tables)?
- Do you use any H2-specific system properties?
- Is the application multi-threaded?
- What operating system, file system, and virtual machine
(java -version) do you use?
- How did you start the Java process (java -Xmx... and so on)?
- Is it (or was it at some point) a networked file system?
- How big is the database (file sizes)?
- Are there any other exceptions (maybe in the .trace.db file)?
Could you send them please?
- Do you still have any .trace.db files, and if yes could you send them?

Regards,
Thomas

Thomas Mueller

Oct 23, 2010, 7:15:27 AM
to h2-da...@googlegroups.com
(This is a copy of the mail, because unfortunately mhe123 used the
wrong subject in his reply.)

Hi,

> CACHE_TYPE=SOFT_LRU;LOCK_MODE=0;LOG=0;UNDO_LOG=0

Please don't take this as a personal attack, but it's no wonder your
database is corrupt. You have used all kinds of dangerous options,
especially LOG=0. I guess you read about them in the "Fast Database
Import" section at http://h2database.com/html/performance.html#fast_import ?
The documentation there says that this is "not recommended for regular
use", but it doesn't say how dangerous those options are. The danger
is documented elsewhere:

http://h2database.com/html/faq.html#reliable ("Those dangerous
features are: LOG=0, LOCK_MODE=0; Areas considered experimental:
CACHE_TYPE=SOFT_LRU")

http://h2database.com/html/grammar.html#set_log ("It is the fastest
mode, but also the most dangerous: if the process is killed while the
database is open in this mode, the data might be lost.")

The documentation doesn't say the database might become corrupted. So
the problem is mainly that the documentation is not clear about the risks.

For H2 version 1.3, I would like to ensure people are aware of the
risks when using those features. It's not enough to improve the
documentation. Instead, I will add a new connection option,
"DANGEROUS", that is required if you use any of the dangerous
features. That way you can't use experimental or risky features
accidentally. So in your case you would need to use:

jdbc:h2:~/data/test;DANGEROUS=2;CACHE_TYPE=SOFT_LRU;LOCK_MODE=0;LOG=0;UNDO_LOG=0

If you don't use DANGEROUS=2, you can't connect; the database would
throw an exception like the following: "You have tried to use the
dangerous/experimental feature LOG=0. Because the feature is
dangerous, you need to set the option DANGEROUS=2, indicating you know
about the risks."

P.S. Please don't use a different subject when replying to a mail (you
used "detailed reply" here, which starts a new email thread). It's
very confusing for others.

Regards,
Thomas
