Hello,
thank you both for your answers. I appreciate it.
I'm attaching a ZIP with my second attempt at using hanoidb instead of the outdated toke as a larger-than-RAM disk backend.
The CT result with rev 4776 (if I'm not mistaken) is:
TEST COMPLETE, 472 ok, 27 failed, 62 skipped of 561 test cases
62 tests are skipped (11 user/51 auto).
(CT with standard ETS had zero failed tests.)
I see roughly 50 MB of files in the hanoidb store, which is encouraging :-)
I don't understand much about the failing CT tests. The first one where I can see something related to hanoidb is this one:
snapshot_suite.test_rdht_tx_read_validate_should_abort (#495)
*** CT Error Notification 2013-05-17 18:16:08.718 ***
db_hanoidb:get_entry2_ failed on line 125
Reason: function_clause
*** User 2013-05-17 18:16:08.720 ***
####################################################
End snapshot_SUITE:test_rdht_tx_read_validate_should_abort ->
  {error, {function_clause,
    [{db_hanoidb, get_entry2_,
      [{db_3985071319,762318896,{762323088,0,0}},
       80325066489831061459460196859901989661],
      [{file,"src/db_hanoidb.erl"},{line,125}]},
     {db_hanoidb, get_entry_, 2, [{file,"src/db_common.hrl"},{line,41}]},
     {rdht_tx_read, validate, 3,
      [{file,"src/transactions/rdht_tx_read.erl"},{line,259}]},
     {snapshot_SUITE, test_rdht_tx_read_validate_should_abort, 1,
      [{file,"test/snapshot_SUITE.erl"},{line,146}]},
     (snip)
There indeed seems to be a function signature mismatch:
get_entry2_({{DB, _FileName}, _Subscr, _SnapState}, Key)
seems to be called with
{db_3985071319,762318896,{762323088,0,0}}, 80325066489831061459460196859901989661
and the atom db_3985071319 cannot match the tuple {DB, _FileName}.
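To illustrate the mismatch, here is a small standalone module (not Scalaris code, names simplified and made up) showing how such a call raises function_clause:

-module(clause_demo).
-export([demo/0]).

%% Simplified stand-in for db_hanoidb:get_entry2_/2: this clause only
%% matches when the state's first element is a {DB, FileName} tuple.
get_entry2_({{_DB, _FileName}, _Subscr, _SnapState}, Key) ->
    {ok, Key}.

demo() ->
    %% matches: the first element of the state is a 2-tuple
    {ok, some_key} = get_entry2_({{handle, "data.hanoidb"}, no_subscr, {snap, 0, 0}}, some_key),
    %% does not match: the first element is a bare atom, like db_3985071319
    %% in the CT failure, so the call raises function_clause
    catch get_entry2_({db_3985071319, 762318896, {762323088, 0, 0}}, some_key).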
I built the db_hanoidb module from the toke module, taking care to keep the same function signatures/API.
This is one of 27 errors reported by CT.
What can we do about this? I feel both optimistic, as I see Scalaris filling the hanoidb backend (even on the first try), and lost, as I don't understand enough of the surrounding API to avoid easy mistakes.
On the hard property of durability on disk: yes, I had read "Is_the_store_persisted_on_disk". My point was just a modest first step (a rough sketch follows the list below):
- have a use case of time windows where the application can refuse write requests (call it planned maintenance mode, maybe 15 minutes);
- use it to let all write transactions terminate;
- hence have each node and the entire cluster in a consistent read-only state;
- hence make a dump backup;
- allow starting from such a known consistent state rather than from an empty database.
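To make the idea concrete, here is a rough sketch of such a write gate (this is NOT the Scalaris API; the module and function names are made up): a gen_server that refuses write requests while the maintenance window is open, so in-flight writes can drain before the dump.

-module(write_gate).
-behaviour(gen_server).

-export([start_link/0, enter_maintenance/0, leave_maintenance/0, request_write/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

enter_maintenance() -> gen_server:call(?MODULE, enter).
leave_maintenance() -> gen_server:call(?MODULE, leave).

%% Writers ask the gate before starting a transaction; while the window is
%% open they are refused, so the store ends up in a consistent read-only
%% state that can be dumped.
request_write(Fun) -> gen_server:call(?MODULE, {write, Fun}).

init([]) -> {ok, normal}.

handle_call(enter, _From, _State)              -> {reply, ok, maintenance};
handle_call(leave, _From, _State)              -> {reply, ok, normal};
handle_call({write, _Fun}, _From, maintenance) -> {reply, {error, maintenance_mode}, maintenance};
handle_call({write, Fun}, _From, normal)       -> {reply, Fun(), normal}.

handle_cast(_Msg, State) -> {noreply, State}.
handle_info(_Msg, State) -> {noreply, State}.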
You are talking about crash recovery on restart. I see a "snapshot" feature in SVN. I feel Scalaris is getting closer and closer to having larger-than-RAM, backupable storage (at least at first for a simple use case).
Have fun with all these interesting topics
Pierre M.