I’m new to ZODB, but I’m thinking of using it to store state in a testing rig on OS X, where the system could panic during some of the tests. I see that FileStorage.py calls fsync() in an attempt to ensure data is written out to disk in a commit(), and that there have been some discussions in the past on this list concerning that function’s effectiveness and performance. OS X actually offers an fcntl(F_FULLFSYNC) call which goes one better than fsync(), by providing a stronger guarantee that data truly gets written out to the disk platters. I was wondering if there would be support for adding this as a user-selected option, for those folks who are willing to run even *slower* than fsync() in order to get their data more safely onto disk?
The OS X “man” page for fsync() offers this explanation of its drawbacks, relative to fcntl(F_FULLFSYNC) (https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fsync.2.html):
Note that while fsync() will flush all data from the host to the drive (i.e. the "permanent storage device"), the drive itself may not physically write the data to the platters for quite some time and it may be written in an out-of-order sequence.
Specifically, if the drive loses power or the OS crashes, the application may find that only some or none of their data was written. The disk drive may also re-order the data so that later writes may be present, while earlier writes are not.
This is not a theoretical edge case. This scenario is easily reproduced with real world workloads and drive power failures.
For applications that require tighter guarantees about the integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl. The F_FULLFSYNC fcntl asks the drive to flush all buffered data to permanent storage. Applications, such as databases, that require a strict ordering of writes should use F_FULLFSYNC to ensure that their data is written in the order they expect. Please see fcntl(2) for more detail.
The fcntl() man page doesn’t really offer any further detail, TBH, beyond the function signature, and a warning that some drives may ignore the synchronous flush request: https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/fcntl.2.html
I’ve used the function before in other contexts - just flushing text files to disk - and it is several times slower than fsync(). So it’s not for everyone, certainly. But it could be really crucial for some, AFAICS.
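To make the idea concrete, here is a rough sketch in Python of what such an option might do. It is only an illustration, not what FileStorage.py does today: the helper name full_sync() is hypothetical, and I am assuming that falling back to plain fsync() is acceptable wherever F_FULLFSYNC is missing or rejected.

```python
import fcntl
import os
import tempfile


def full_sync(fileobj):
    """Flush fileobj to disk as durably as the platform allows.

    Hypothetical helper: tries fcntl(F_FULLFSYNC) where the fcntl module
    exposes it (OS X), and falls back to os.fsync() everywhere else.
    Returns the name of the mechanism that was used.
    """
    # Push Python's own buffers down to the OS first.
    fileobj.flush()
    fd = fileobj.fileno()

    # F_FULLFSYNC is only defined on platforms that support it.
    full_fsync = getattr(fcntl, "F_FULLFSYNC", None)
    if full_fsync is not None:
        try:
            fcntl.fcntl(fd, full_fsync)
            return "F_FULLFSYNC"
        except OSError:
            # Some filesystems or drives reject the request;
            # fall through to the ordinary fsync() below.
            pass

    os.fsync(fd)
    return "fsync"


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"test rig state")
        print(full_sync(f))
```

Even when the call succeeds, the fcntl() man page's caveat still applies: a drive may ignore the flush request, so this narrows the window for data loss rather than closing it.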
— Jonathan