Need advice with embedded H2 losing data due to "revert"


Christian Dirks

May 3, 2023, 10:27:36 AM
to H2 Database
We are running a Java application using embedded H2 databases and ran into a data-loss situation that we cannot explain. It appears as if the database reverted to a previous state, or as if data wasn't written to the file for hours.

Setup:
Java 8 client software running on tablets (Microsoft Surfaces), using a file-based embedded H2 database for local storage (H2 version 1.4.196). The database is started in a cluster with a secondary mirror database (HA-JDBC version 3.0.3).

Situation:
Entries were inserted into the client device's database, updated, and later also sent to the server. After a restart of the client software, that data was suddenly gone from the client. In the cases we analysed, there was an earlier point at which our application was terminated abruptly without the proper shutdown procedure. Sometimes hours passed between the creation of the data, the improper shutdown, and the restart; sometimes the events were only minutes apart.
Our own logging indicated that the missing data was actually saved successfully in the client device's H2. Further, the server's SOAP service had received the data, proving that the data had to exist on the client at one point.
We quickly received the client device's databases from the customer, and they did not contain said data despite our log file indicating that it had been created and even updated afterwards. It almost appears as if the data was never written to the database file. But the fact that it was synchronized to the server proves that at some point the data must have been queryable, because we query the database to determine what to send to our server application.

Notes:
Master and mirror database are verified to be equal in content before the client-device database cluster is started. Should they not be equal, the mirror will not be added to the cluster. Further, the cluster uses no synchronization (passive strategy), so an out-of-date mirror database can't be causing the revert.
The customer has previously had problems with corrupted database files due to power loss and improper shutdown of our application. However, in these data-loss cases the database started as normal and did not complain about corrupted files.
Other customers have not reported similar problems despite running similar setups, e.g. also using Microsoft Surfaces.

Questions:
What could lead to the database seemingly reverting to a previous state? Are there known problems related to running in a HA-JDBC cluster?
Can an unexpected shutdown of the embedded H2 lead to data reverting to a state that is hours old? Are there bugs that have since been fixed, or similar situations, that can lead to something like the described problem?
Is there any kind of long-lived H2 memory cache that doesn't get written to the actual file for quite a while and is lost on power loss? (I assume not, judging by the documentation.)
Is it possible that the database "repaired" itself, somehow losing data in the process, or got corrupted in a way that makes a previous version of the data reappear?

What other steps could we take to further trace where the problem is coming from?


Thanks in advance,
Christian Dirks

Noel Grandin

May 4, 2023, 9:25:29 AM
to h2-da...@googlegroups.com


On 5/3/2023 3:10 PM, Christian Dirks wrote:
> We are running a Java application using embedded H2 databases and ran into a data-loss situation that we cannot explain.
> It appears as if the database reverted to a previous state, or as if data wasn't written to the file for hours.
>

Given the lengths of time involved, my first guess would be some kind of cloud-sync service that is getting confused and
reverting parts of the filesystem to a prior state.

My second guess would be that HA-JDBC is doing something weird like directing writes to one file and not the other, and
then choosing the wrong file the next time the software starts up.
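
One way to look for evidence of the first guess (something outside the application putting an older copy of the file back) is to record the database file's attributes from time to time and flag a modification time that moves backwards. The following is only a minimal sketch under assumptions: the class name `DbFileSnapshot` and its methods are hypothetical, and it uses nothing but standard `java.nio.file` calls. File size is recorded for the log but deliberately not used as a signal, since a database file can legitimately shrink.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper: an immutable record of a database file's size and mtime.
final class DbFileSnapshot {
    final long size;
    final long lastModifiedMillis;

    DbFileSnapshot(long size, long lastModifiedMillis) {
        this.size = size;
        this.lastModifiedMillis = lastModifiedMillis;
    }

    // Reads the current size and last-modified time of the given file.
    static DbFileSnapshot of(Path file) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
        return new DbFileSnapshot(attrs.size(), attrs.lastModifiedTime().toMillis());
    }

    // True if this snapshot's modification time is earlier than that of a
    // snapshot taken before it, which would suggest the file was replaced
    // by an older copy in between.
    boolean looksOlderThan(DbFileSnapshot earlier) {
        return lastModifiedMillis < earlier.lastModifiedMillis;
    }

    @Override
    public String toString() {
        return "size=" + size + " bytes, lastModified=" + lastModifiedMillis;
    }
}
```

Taking a snapshot at shutdown, persisting it outside the database directory, and comparing it with a fresh snapshot at the next start would show whether the file changed while the application was not running.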

Christian Dirks

May 5, 2023, 2:46:49 AM
to H2 Database
Thanks for the reply.

I do not believe that our customer uses the application in a setup with cloud sync. The devices are used in an environment where internet connectivity is not even guaranteed all the time. But I will ask them to make sure.

I do not believe that this can be caused by HA-JDBC choosing the wrong file. As touched on in the "Notes" section, we have custom code that compares both databases' content before we start the cluster. This includes comparing the number of rows in each table as well as the equality of each individual row. If we detected any difference in content, we wouldn't add the mirror database to the cluster and would run with only the master database.
A similar situation would require that HA-JDBC stopped writing to the master database midway during the last run of our application and only wrote to the mirror. But I would assume that a failed write operation would result in a database exception from either the cluster or H2. We have a listener attached to the cluster that should be informed about all exceptions. If an exception happens, this listener will remove the databases from the cluster, check them and only reattach them if they are successfully tested. We do not see such a behavior happening in the logfiles. So this scenario would only be possible if neither HA-JDBC nor H2 realized that the write operation failed or if the exception was caught in a way that neither the transaction nor the listener knew about it.
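
For readers unfamiliar with this kind of pre-start check, the content comparison described above (row counts plus row-by-row equality) could look roughly like the sketch below. This is not our actual code. The class and method names are hypothetical, it uses only standard JDBC interfaces, and it assumes the first column of each table gives a stable sort order and that table names come from a trusted list.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating a master/mirror content comparison.
final class TableComparison {

    // Reads every row of a table, ordered by its first column
    // (assumed here to be the primary key).
    static List<List<Object>> readRows(Connection conn, String table) throws SQLException {
        // The table name is concatenated into the SQL text, so it must not
        // come from untrusted input.
        String sql = "SELECT * FROM " + table + " ORDER BY 1";
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            List<List<Object>> rows = new ArrayList<>();
            while (rs.next()) {
                List<Object> row = new ArrayList<>(columns);
                for (int i = 1; i <= columns; i++) {
                    row.add(rs.getObject(i));
                }
                rows.add(row);
            }
            return rows;
        }
    }

    // True if both tables have the same number of rows and every row is equal.
    // Values are compared with equals(), so binary columns returned as byte[]
    // would need separate handling.
    static boolean sameContent(List<List<Object>> master, List<List<Object>> mirror) {
        if (master.size() != mirror.size()) {
            return false; // cheap row-count check first
        }
        return master.equals(mirror);
    }
}
```

A caller would run `readRows` against the master and the mirror connection for each table and only add the mirror to the cluster if `sameContent` returns true for all of them.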