Corrupted Database - unable to recover

352 views

Skip to first unread message

beetle

unread,

Aug 20, 2012, 1:41:59 PM8/20/12

to h2-da...@googlegroups.com

I'm using h2-1.3.155 in embedded mode with Spring/Hibernate in a Java/Swing client app running on WinXP SP3 with the startup params : IFEXISTS=TRUE;AUTO_SERVER=TRUE;DB_CLOSE_DELAY=5. I have several corrupted databases on different tablets where I am either no longer able to query against some or all of the database and sometimes cannot access it at all. Looking at the one with partial access, the trace file initially contains the following error:

08-20 04:37:48 database: close

org.h2.message.DbException: IO Exception: "java.io.EOFException"; "C:\journal\H2DB\HNDLDATA.h2.db" [90031-155]

at org.h2.message.DbException.get(DbException.java:156)

at org.h2.message.DbException.convertIOException(DbException.java:313)

at org.h2.store.FileStore.readFully(FileStore.java:287)

at org.h2.store.PageStore.readPage(PageStore.java:1247)

at org.h2.store.PageStore.getPage(PageStore.java:701)

at org.h2.store.PageStreamTrunk$Iterator.next(PageStreamTrunk.java:247)

at org.h2.store.PageLog.free(PageLog.java:210)

at org.h2.store.PageStore.compact(PageStore.java:583)

at org.h2.engine.Database.closeOpenFilesAndUnlock(Database.java:1125)

at org.h2.engine.Database.close(Database.java:1076)

at org.h2.engine.DatabaseCloser.run(DatabaseCloser.java:80)

Caused by: org.h2.jdbc.JdbcSQLException: IO Exception: "java.io.EOFException"; "C:\journal\H2DB\HNDLDATA.h2.db" [90031-155]

at org.h2.message.DbException.getJdbcSQLException(DbException.java:327)

... 11 more

Caused by: java.io.EOFException

at java.io.RandomAccessFile.readFully(Unknown Source)

at org.h2.store.FileStore.readFully(FileStore.java:285)

... 8 more

From this point forward in time, most basic queries in the H2 Console fail against the schema tables fail in one of two ways as seen below:

SELECT * FROM JESIG;

General error: "java.lang.NullPointerException"; SQL statement:
SELECT * FROM JESIG [50000-155] HY000/50000 (Help)
org.h2.jdbc.JdbcSQLException: General error: "java.lang.NullPointerException"; SQL statement:
SELECT * FROM JESIG [50000-155]
    at org.h2.message.DbException.getJdbcSQLException(DbException.java:327)
    at org.h2.message.DbException.get(DbException.java:156)
    at org.h2.message.DbException.convert(DbException.java:279)
    at org.h2.command.Command.executeQuery(Command.java:185)
    at org.h2.jdbc.JdbcStatement.executeInternal(JdbcStatement.java:173)
    at org.h2.jdbc.JdbcStatement.execute(JdbcStatement.java:152)
    at org.h2.server.web.WebApp.getResult(WebApp.java:1304)
    at org.h2.server.web.WebApp.query(WebApp.java:994)
    at org.h2.server.web.WebApp$1.next(WebApp.java:957)
    at org.h2.server.web.WebApp$1.next(WebApp.java:960)
    at org.h2.server.web.WebThread.process(WebThread.java:166)
    at org.h2.server.web.WebThread.run(WebThread.java:93)
    at java.lang.Thread.run(Thread.java:595)
Caused by: java.lang.NullPointerException
    at org.h2.index.PageDataLeaf.getNextPage(PageDataLeaf.java:391)
    at org.h2.index.PageDataCursor.nextRow(PageDataCursor.java:90)
    at org.h2.index.PageDataCursor.next(PageDataCursor.java:49)
    at org.h2.index.IndexCursor.next(IndexCursor.java:235)
    at org.h2.table.TableFilter.next(TableFilter.java:352)
    at org.h2.command.dml.Select.queryFlat(Select.java:512)
    at org.h2.command.dml.Select.queryWithoutCache(Select.java:617)
    at org.h2.command.dml.Query.query(Query.java:290)
    at org.h2.command.dml.Query.query(Query.java:260)
    at org.h2.command.dml.Query.query(Query.java:37)
    at org.h2.command.CommandContainer.query(CommandContainer.java:80)
    at org.h2.command.Command.executeQuery(Command.java:181)
    ... 9 more

SELECT * FROM USER_SESSION;

As you would expect, views against these tables fail in the same way. These two exceptions are also repeated in the trace file. Running the Recovery tool, it appears to generate the SQL to recreate all objects, however it only seems to reclaim < 5% of the data and the recovery file is full of messages like this:

-- page 20: data node parent: 0 table: 18 entries: 295 rowCount: 5058

-- ERROR [20] child[0]: 63 parent: 0

-- ERROR [20] child[1]: 64 parent: 0

...

-- ERROR [20] child[291]: 1335 parent: 0

-- ERROR [20] child[292]: 1340 parent: 0

-- page 21: data overflow (last)

and this:

-- page 1294: data node parent: 17 table: 20 entries: 203 rowCount: -1

-- ERROR [1294] child[0]: 974 parent: 0

-- ERROR [1294] child[1]: 981 parent: 0

I could use some expert assistance in determine if there's any hope of recovering the rest of the data.

As for the cause of the corruption, logging data points to the running of a bat file to move (and subsequently replace) the db using the local Scheduled Task Manager while it was in use that is likely the culprit. This has since been disabled to say the least.

I appreciate any help in trying to recover the data.

Thanks

Thomas Mueller

unread,

Aug 21, 2012, 2:26:17 PM8/21/12

to h2-da...@googlegroups.com

Hi,

Version 1.3.155 is quite old, there were some problems related to corruption fixed in newer version (see the change log for details).

> Running the Recovery tool, ... only seems to reclaim < 5% of the data

That's strange.

> -- page 20: data node parent: 0 table: 18 entries: 295 rowCount: 5058
> -- ERROR [20] child[0]: 63 parent: 0
> -- ERROR [20] child[1]: 64 parent: 0
> ...
> ...
> -- ERROR [20] child[291]: 1335 parent: 0
> -- ERROR [20] child[292]: 1340 parent: 0
> -- page 21: data overflow (last)

Those messages are usually not a problem, as they refer to b-tree nodes and not leaves. All the data is in the leaves (H2 uses b+trees internally).

> -- page 1294: data node parent: 17 table: 20 entries: 203 rowCount: -1

The rowCount: -1 isn't a problem (the row count is set to -1 on an update, this is normal). The counts are not updated (I tried to implement counted b-trees but it is incomplete).

> I could use some expert assistance in determine if there's any hope of recovering the rest of the data.

The Recover tool should be able to retrieve all data. Possibly you need to use a more recent version of the Recover tool, did you try using the newest version? The file format didn't change.

> I appreciate any help in trying to recover the data.

If you want, you can send the .h2.db file to me and I can try.

Regards,
Thomas

beetle

unread,

Aug 23, 2012, 4:04:52 PM8/23/12

to h2-da...@googlegroups.com

ok, thanks for the clarification on the output from the recovery tool. I tried the latest version but there appears to be little difference. our db is small in comparison to what i've seen listed here, but i know i'm missing a large percentage of my data. A comparable db instance contains ~5k records in one of the larger tables and only seeing maybe 50 in the recovery output. If you wouldn't mind taking a crack at it, i'd like to send it to you.