Data loss with H2 HA / Clustered Mode
Hi,
We are trying out H2 after switching from Apache Derby. The main reason for the switch was H2's clustering support. After a fair amount of testing we have now come across a problem with the clustered mode.
I have configured a system to use H2 (1.3.172) in clustered mode and access the DB using the supplied connection pool. Our application has many threads accessing the DB concurrently (from Web Services and background tasks). The main purpose of the DB is to hold a job queue: a job is added to the table, updated frequently while active and then set to complete. I have found that when the system is under load (many jobs being submitted alongside updates to existing records), job records can "disappear" from the DB; roughly 40 out of 1000 jobs are lost. This doesn't happen when the DB is run in standalone mode.
I don't know how the clustering mode works internally, but I was wondering whether some transactions are not making it to one of the DB instances and that instance is then used for reading. Would this be possible? Could it be something I have set up incorrectly? Does clustering work OK with the connection pool class? We had to switch to sequences for id generation; could I have done something wrong there?
Any help would be much appreciated. I am happy to supply more details if required.
Regards,
Daniel Stone
My connection URL is essentially this: "jdbc:h2:tcp://server1:9092,server2:9092/./folder/dbName"
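For anyone trying to reproduce this, here is a minimal sketch of how one might check what a given connection believes the cluster to be. The query on INFORMATION_SCHEMA.SETTINGS is the one the H2 documentation gives for detecting running cluster nodes (column names as in the 1.3.x line; later versions may differ). The class name, the clusterUrl helper and the "sa" user with an empty password are assumptions for illustration only, and the H2 driver jar must be on the classpath for main to connect.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

public class ClusterCheck {
    // Hypothetical helper: builds a URL of the form jdbc:h2:tcp://host1,host2/path
    static String clusterUrl(List<String> servers, String dbPath) {
        return "jdbc:h2:tcp://" + String.join(",", servers) + "/" + dbPath;
    }

    // Returns the CLUSTER setting as seen by this connection.
    // Per the H2 docs, '' means cluster mode is disabled; otherwise
    // the server list is returned enclosed in single quotes.
    static String clusterSetting(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT VALUE FROM INFORMATION_SCHEMA.SETTINGS WHERE NAME='CLUSTER'")) {
            return rs.next() ? rs.getString(1) : null;
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = clusterUrl(
            Arrays.asList("server1:9092", "server2:9092"), "./folder/dbName");
        // User name and password are assumptions; substitute your own.
        try (Connection conn = DriverManager.getConnection(url, "sa", "")) {
            System.out.println("CLUSTER = " + clusterSetting(conn));
        }
    }
}
```

Running the same check on several pooled connections after an error would show whether they disagree about which servers are in the cluster.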
Please note that 2 applications are accessing the cluster. I have now started to look at the H2 source code, and from what I can see our system experiences transient problems when running updates on server1, which results in those updates only reaching server2. I think another connection (we use a pool) then attempts to read the data from server1, where it doesn't exist. It looks like the cluster list is held per connection; is this correct?
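To make the suspected failure mode concrete, here is a toy model of it. This is not H2's code: Node, Conn and the rule that reads go to the first server in a connection's list are all hypothetical simplifications. It only shows how a server list held per connection could let one pooled connection silently stop writing to server1 while another connection still reads from it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

public class ClusterListSketch {
    // Hypothetical stand-in for one DB instance: just a set of job ids.
    static final class Node {
        final String name;
        final Set<Integer> jobs = new HashSet<>();
        boolean failNextWrite = false; // simulates a transient error
        Node(String name) { this.name = name; }
        void insert(int jobId) {
            if (failNextWrite) {
                failNextWrite = false;
                throw new IllegalStateException("transient error on " + name);
            }
            jobs.add(jobId);
        }
    }

    // Hypothetical pooled connection holding its own private server list.
    static final class Conn {
        final List<Node> servers;
        Conn(List<Node> all) { servers = new ArrayList<>(all); }
        void insert(int jobId) {
            Iterator<Node> it = servers.iterator();
            while (it.hasNext()) {
                Node n = it.next();
                try {
                    n.insert(jobId);
                } catch (IllegalStateException e) {
                    it.remove(); // node dropped for THIS connection only
                }
            }
        }
        // Assumption: reads are served by the first server in the list.
        boolean exists(int jobId) {
            return servers.get(0).jobs.contains(jobId);
        }
    }

    // Returns {seen by writer, seen by other connection, on server1, on server2}.
    static boolean[] demo() {
        Node server1 = new Node("server1");
        Node server2 = new Node("server2");
        List<Node> cluster = Arrays.asList(server1, server2);
        Conn connA = new Conn(cluster);
        Conn connB = new Conn(cluster);

        server1.failNextWrite = true;
        connA.insert(1); // fails on server1, succeeds on server2

        return new boolean[] {
            connA.exists(1), connB.exists(1),
            server1.jobs.contains(1), server2.jobs.contains(1)
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

In this model the job is visible through connA but "disappears" when read through connB, which matches the symptom we see under load.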
Hi Noel,
Many thanks for the update. We are in the process of deciding whether to stay with H2 or not. I have a few questions regarding H2 clustering. Do you know if there is any intention to enhance the H2 cluster mode? Would we be able to enhance this feature and submit back to the H2 community?
Regards,
Dan
--
You received this message because you are subscribed to the Google Groups "H2 Database" group.
Hi,
Thanks for the quick response. We are still discussing whether we will continue to use H2 with some enhancements, but I need to establish how complex the process would be if we do.
I notice in the FAQ that you recommend patches / bug fixes before making any major changes to the source code, which I can fully understand. How would we go about proposing design / implementation changes to the H2 clustering code? Would anyone with the authority be able to work with us for a donation to the project to speed this process up?
Regards,
Dan
Hi Noel & Thomas,
Thank you for the responses. I now have enough information to discuss this with my management. I will most likely post my requirements / problems with the current clustering solution next week, possibly with some ideas on how we believe it could be improved.
We are very pleased with the performance and general DB functionality of H2 but just need the clustering to handle transient problems more gracefully than it does today.
We did also experience many problems with a “missing lob” error which actually triggered the cluster issue in the first place. We have managed to work around this but is this still a bug with H2 (I have seen this problem mentioned on the web)?
Regards,
Dan
Hi Noel & Thomas,
We are going to perform some tests with the H2 clustering using a tool to simulate a troublesome network. Once we have established how H2 behaves exactly I will post up our current problems with H2 (unless we don't manage to break it in the way I believe we can).
I understand that you will not be able to fix/change the clustering implementation but we need to post the problems, discuss changes and then hopefully implement any approved changes.
Regards,
Dan
Hi,
Sorry for the long delay in responding. It does sound like we have similar requirements for H2 and we are certainly interested in supporting more than 2 nodes. We have managed to configure 2 nodes to perform as we expect and have also completed a number of tests running with a troublesome network (lost packets / delays etc). For the most part H2 has behaved as we'd hoped.
I'm not surprised that you have been able to break the consistency of the DBs by modifying records on a single node, as this is similar to the problem that occurred for us when some queries were producing an exception. The instance that had the exception would be removed from the server list for that single connection. Any later writes made through that connection would then never reach the removed instance (we use a connection pool, so the connection is reused).
Because we are very busy and have managed to work around some of the issues, we have not been spending much time on this, but that could change within the next few months. What kind of timescales are you working to?
Dan