On Fri, Mar 21, 2025 at 2:36 PM Varun Gandhi <
varung...@gmail.com> wrote:
> Thank you for the response John. 😄
>
> I get your point about most people not using fsync directly, so in that sense they're not affected. However, as I see it, applications which do not use fsync are potentially at a higher risk of bugs.
Surely this depends on the semantics of the particular program. Not
all programs are concerned with ensuring that their data is durable
on stable storage (which is what `fsync` nominally gives you). Note
that, for example,
in the FreeBSD bin/ source tree, the only program that uses `fsync` is
`dd`, and only when given a specific flag (the `fsync` oflag);
`fflush` only appears a handful of times, and mostly for stdout or
stderr.
Moreover, POSIX does not mandate that `fsync` is necessary after e.g.
a `write` or `pwrite`; that is up to the implementation, and one
could imagine an implementation that does not stage IO through a
buffer cache (as Unix traditionally did), and for which `fsync` would
therefore be entirely superfluous.
But note that `open` can take either `O_SYNC` or `O_DSYNC` to sync all
updates or just data updates, so one can have synchronous semantics
without `fsync`.
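A minimal sketch of that (again with a hypothetical path): with
`O_DSYNC`, each `write` does not return until the data has been
transferred to stable storage, so no separate `fsync` call is needed
for data durability:

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	/* O_DSYNC: every write(2) on this descriptor completes only
	 * once the data is on stable storage; O_SYNC would additionally
	 * synchronize file metadata such as timestamps. */
	int fd = open("/tmp/data.log", O_WRONLY | O_CREAT | O_DSYNC, 0644);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	const char *rec = "synchronous record\n";
	if (write(fd, rec, strlen(rec)) < 0) {	/* returns only after data is stable */
		perror("write");
		return 1;
	}
	return close(fd);
}
```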
What this suggests to me is that, for most programs, it's fine to
leave actually sync'ing to the underlying storage device to the
operating system. Those that care can use `fsync`, but most don't.
> For example, nowadays it's increasingly common to use "spot instances" on cloud compute which can be arbitrarily terminated in the middle of operation. If a spot instance is writing some files to a shared disk (and not interfacing through a DB), and then doing some irreversible action later (e.g. sending a network request), it's possible the network request is sent but the corresponding file was not made durable on disk if fsync was not used. This failure mode also applies to local desktop apps storing data to flat files (instead of something like SQLite), and interacting with the network.
I think you are mentioning databases ("...not interfacing through a
DB" and "...instead of something like SQLite") because those,
generally, do the necessary dance with `fsync` or some kind of more
elaborate commit protocol to ensure data is resident on stable storage
before returning "success" for mutating operations. But note that I
only interact with them via the POSIX file API on Unix-like systems
in the crudest of manners: e.g., via a socket to some kind of server
process for an RDBMS, or, in the limit, via a pipe for something like
SQLite (usually that's a library that I just link into my program).
Of course, other libraries like GDBM, BerkeleyDB, or even the
venerable ndbm or dbm libraries behave similarly.
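By way of illustration, the sort of commit dance such a library might
do for a small file update looks roughly like this (a sketch only,
under the assumption of hypothetical paths; real databases use far
more elaborate protocols): write a new version to a temporary file,
`fsync` it, `rename` it over the old file, and `fsync` the containing
directory so the rename itself is durable.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Durably replace "path" with new contents. Paths are hypothetical;
 * error handling is abbreviated. */
static int
commit(const char *dir, const char *tmppath, const char *path,
    const void *buf, size_t len)
{
	/* 1. Write the new version to a temporary file and fsync it. */
	int fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t)len || fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	close(fd);
	/* 2. Atomically replace the old file with the new one. */
	if (rename(tmppath, path) < 0)
		return -1;
	/* 3. fsync the directory so the rename itself is durable. */
	int dfd = open(dir, O_RDONLY | O_DIRECTORY);
	if (dfd < 0)
		return -1;
	int r = fsync(dfd);
	close(dfd);
	return r;
}

int
main(void)
{
	const char msg[] = "balance=42\n";
	if (commit("/tmp", "/tmp/db.tmp", "/tmp/db.dat", msg, sizeof(msg) - 1) < 0) {
		perror("commit");
		return 1;
	}
	return 0;
}
```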
But most programs just don't need those kinds of semantics.
> So in that sense, it feels like the POSIX file API design fails to create a "pit of success" the developer falls into. In Chapter 4, you've written this in the context of Java's buffered streams:
>
> > Providing choice is good, but interfaces should be designed to make the common case as simple as possible (see the formula on page 6). Almost every user of file I/O will want buffering, so it should be provided by default. For those few situations where buffering is not desirable, the library can provide a mechanism to disable it.
>
> This point about thinking carefully about the common case and defaults makes a lot of sense to me. But it feels like the POSIX file API (by making durability opt-in instead of opt-out) fails to satisfy this guideline.
This is predicated on the notion that all programs require the kinds
of durability semantics that are implied by using `fsync`, but not by
using `O_SYNC` or `O_DSYNC`. I don't think there's enough evidence
available to support that conclusion, and there is plenty of
disconfirming evidence. For programs that care, yes, they should use
`fsync` or something like it; for those that do not, there's no need
(and the performance penalty would be significant!).
- Dan C.