I’ve been pondering this for a while, since a lot of interesting enterprise features require a working filesystem change notification mechanism that scales to thousands or even millions of files. (How did we bump into the 32-bit NFS file handle problem at iXsystems? Somebody tried to share more than 4 billion files over NFS. Enterprise folks do some crazy s**t!)
The big question is less whether it’s possible and more what kind of mechanism people will find palatable. The OS X FSEvents mechanism works reasonably well and is used constantly to trigger things like Spotlight search indexing. I was by no means involved in its creation at Apple, so I can only speak peripherally to the implementation, but it seems to have taken a fairly long time to become lightweight enough to use without the overhead being punitive. Any similar mechanism in FreeBSD would also have to go through some evolutionary performance iterations. Do people want it badly enough to invest in it long-term? I don’t know, but I do know that a long-term investment would be necessary to really make it work well and provide all of the appropriate APIs for talking to it.
I think we can probably all agree that Linux inotify wouldn’t be worth the trouble. From the Wikipedia page:
• Inotify does not support recursively watching directories, meaning that a separate inotify watch must be created for every subdirectory.
• Inotify does report some but not all events in sysfs and procfs.
• Notification via inotify requires the kernel to be aware of all relevant filesystem events, which is not always possible for networked filesystems such as NFS where changes made by one client are not immediately broadcast to other clients.
• Rename events are not handled directly; i.e., inotify issues two separate events that must be examined and matched in a context of potential race conditions.
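To make the last point concrete: inotify reports a rename as a MOVED_FROM event and a MOVED_TO event that share a cookie value, and the consumer has to pair them up itself. The sketch below does that pairing over a list of already-decoded events. The `Event` record and `pair_renames` are hypothetical names for illustration, not part of any real API, and nothing here reads from the kernel.

```python
from collections import namedtuple

# Hypothetical, already-decoded event record (an assumption for this sketch).
# Real inotify delivers a packed struct carrying wd, mask, cookie and name.
Event = namedtuple("Event", "kind cookie name")


def pair_renames(events):
    """Match MOVED_FROM/MOVED_TO halves by cookie.

    Returns (renames, unmatched): renames is a list of (old, new) names,
    unmatched holds the halves that never found a partner.
    """
    pending = {}    # cookie -> name taken from the MOVED_FROM half
    renames = []
    unmatched = []
    for ev in events:
        if ev.kind == "MOVED_FROM":
            pending[ev.cookie] = ev.name
        elif ev.kind == "MOVED_TO":
            if ev.cookie in pending:
                renames.append((pending.pop(ev.cookie), ev.name))
            else:
                # No matching first half, e.g. moved in from an unwatched place.
                unmatched.append(ev)
    # First halves still pending, e.g. moved out to an unwatched place.
    unmatched.extend(Event("MOVED_FROM", c, n) for c, n in pending.items())
    return renames, unmatched
```

An unmatched half is ambiguous at the moment it arrives: the partner may still be on its way, or may never come. That is where the race conditions mentioned above creep in.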
I think the first issue alone is a deal killer. Having to walk the filesystem tree posting notifications on every [new] directory just to watch a filesystem in its entirety would be pretty onerous and failure-prone to boot. By contrast: https://en.wikipedia.org/wiki/FSEvents
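For a rough sense of what per-directory watching costs, here is a minimal sketch that just lists every directory a non-recursive watcher would need to register. `directories_to_watch` is a hypothetical helper and it does not install any watches. It only shows that the work scales with the number of directories in the tree.

```python
import os


def directories_to_watch(root):
    """Return every directory under root, root included.

    A non-recursive watcher needs one watch per entry in this list.
    os.walk does not follow symlinked directories by default.
    """
    return [dirpath for dirpath, _dirnames, _filenames in os.walk(root)]
```

Any directory created after the walk has passed its parent is missing from the list until something notices and adds it, which is the failure-prone part.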
This is also not to say that I would expect anything in FreeBSD to be API-compatible (though the upstream clients would probably grumble at yet another notification mechanism API to #ifdef into their code), simply that there are only so many design patterns to follow. A filesystem change is a filesystem change. Everything beyond that is just a glorified pub/sub mechanism.
Assuming there’s interest, I could potentially see throwing some engineering effort into this.
- Jordan
On Sun, Jan 3, 2016, at 15:36, Jordan Hubbard wrote:
>
> I think we can probably all agree that Linux inotify wouldn’t be worth
> the trouble. From the Wikipedia page:
Just talk to Bryan Cantrill if you want to know why we should avoid
inotify at all costs. He had to work on mapping it to FEN on SmartOS and
he discovered a world of hurt in the process. They're allegedly stuck
with the broken implementation of inotify now because Linus doesn't want
KBI breakage. Not to say we couldn't provide a compatibility shim so
inotify things can compile on FreeBSD, but it might be wise to have
something else that works better. Not sure if we really should reinvent
the wheel, but I have zero clue how FSEvents or FEN scale.
>
> Assuming there’s interest, I could potentially see throwing some
> engineering effort into this.
>
> - Jordan
>
I would love to see this happen in the near future. It is *the* reason
Dropbox hasn't released a FreeBSD-native client last I checked. I know
that Plex would use it if it was available. There's a lot of cool things
ripe for porting if we only had a mechanism...
--
Mark Felder
ports-secteam member
fe...@FreeBSD.org
That’s basically how FSEvents works. There’s a fairly straightforward (Mach IPC based) kernel upcall mechanism for communicating the filesystem change events (and control inputs for what to watch) to a daemon, fseventsd. Subscribers talk to that userland daemon, and it figures out how many events to cache, when all subscribers have received the events (or timed out) so that it can re-use the memory, and so on.
The kernel reporting mechanism can be relatively lightweight if you proxy all the subscription and memory management details through a userland daemon, which is why I certainly wouldn’t suggest doing it any other way…
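A toy sketch of that daemon-side bookkeeping, to show the shape of the idea only. It is not how fseventsd is actually implemented. `EventBroker` and its methods are made-up names, `publish` stands in for the kernel upcall, and subscriber timeouts and any cap on the cache are left out.

```python
class EventBroker:
    """Toy userland broker: caches events until every subscriber has read them."""

    def __init__(self):
        self.events = []    # cached events, oldest first
        self.base = 0       # sequence number of events[0]
        self.cursors = {}   # subscriber name -> next sequence number to deliver

    def subscribe(self, name):
        # New subscribers start at the current tail and see only later events.
        self.cursors[name] = self.base + len(self.events)

    def publish(self, event):
        # Stand-in for the kernel upcall delivering one change event.
        self.events.append(event)

    def poll(self, name):
        # Hand the subscriber everything it has not seen yet.
        start = self.cursors[name] - self.base
        batch = self.events[start:]
        self.cursors[name] = self.base + len(self.events)
        self._reclaim()
        return batch

    def _reclaim(self):
        # Drop events that every subscriber has already received.
        if not self.cursors:
            return
        low = min(self.cursors.values())
        drop = low - self.base
        if drop > 0:
            del self.events[:drop]
            self.base = low
```

The kernel side only ever appends events. Who has seen what, and when memory can be reclaimed, is all decided in userland.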
- Jordan