os/fsnotify API draft 3

Nathan Youngman

Feb 1, 2014, 2:07:40 AM
to golan...@googlegroups.com
The API doc for fsnotify is cleaned up a bit from the last draft:
http://goo.gl/MrYxyA

There are three areas that still need attention (volunteers?):

* Finalizing the names and definition of the file operations (Op). I'm sure Russ Cox will want some say once he's back.
* The mechanism for handling errors and digging into the specific error conditions.
* Testing, in particular how to unit test packages that make use of os/fsnotify.

I'm looking forward to reworking go.exp/fsnotify to use the API we have defined thus far. Maybe doing so will reveal some unthought of issues... or possibilities.

Good night, 
Nathan.

Ingo Oeser

Feb 1, 2014, 6:24:13 AM
to golan...@googlegroups.com
Why is event filtering now required to be done in user space by API design?

func (w *Watcher) Add(name string, ignoreEvents EventMask) error

This would allow stopping useless events at the sender instead of filtering them out at the receiver. It also allows a simple call for the most basic case.

Subtree paths can be aggregated later in a smart implementation and even the implementation switched on the fly, if we are brave. So I wouldn't worry too much about those now.

Nathan Youngman

Feb 1, 2014, 10:59:32 PM
to golan...@googlegroups.com

Hi Ingo,

Initially I wanted to configure the watcher with all sorts of additional filters. While working on this API doc, my stance changed thanks to feedback from Russ Cox (rsc) and Shenghou Ma (minux).

The funny thing is, the API sketch that Russ proposed 3 weeks ago <https://codereview.appspot.com/48310043/> had a function:

func (w *Watcher) Add(file string, op Op) error

which is essentially a renaming of what fsnotify currently has:

func (w *Watcher) WatchFlags(path string, flags uint32) error

Take a look at how WatchFlags is implemented. A function called purgeEvents forwards events from an internal channel to the external Event channel. The flags are stored in a map with a mutex and some bookkeeping to apply the same flags to files within a directory watch. There are currently no tests for WatchFlags and probably some subtle bugs.

Rather than all this extra machinery to configure the watcher, we can let it send all events and apply a filter at the point where we receive them.

for event := range watcher.Events {
    if IsOp(event, fsnotify.Create|fsnotify.Write) {
        // do something
    }
}

If you actually need to apply Op filters differently depending on the file/directory watched, it's easy enough to check event.Name here too.
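
A self-contained sketch of filtering at the receiving end, as described above. The `Op` and `Event` types, the constants, and the `isOp` and `filter` helpers are stand-ins defined here for illustration only; they are assumptions, not the draft API.

```go
package main

import "fmt"

// Op is a hypothetical bitmask of file operations (not the draft's definition).
type Op uint32

const (
	Create Op = 1 << iota
	Write
	Remove
	Rename
)

// Event is a hypothetical event carrying a file name and the operations seen.
type Event struct {
	Name string
	Op   Op
}

// isOp reports whether the event includes any of the operations in mask.
func isOp(e Event, mask Op) bool { return e.Op&mask != 0 }

// filter forwards only the events that match mask.
// It closes the returned channel once in has been drained.
func filter(in <-chan Event, mask Op) <-chan Event {
	out := make(chan Event)
	go func() {
		defer close(out)
		for e := range in {
			if isOp(e, mask) {
				out <- e
			}
		}
	}()
	return out
}

func main() {
	events := make(chan Event, 3)
	events <- Event{Name: "a.go", Op: Create}
	events <- Event{Name: "b.go", Op: Remove}
	events <- Event{Name: "c.go", Op: Write | Rename}
	close(events)

	// Only a.go and c.go pass the Create|Write mask.
	for e := range filter(events, Create|Write) {
		fmt.Println(e.Name)
	}
}
```

Because the filter is an ordinary function over a channel, it can be tested without touching the file system.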

The truth of the matter is, filtering based on file operation is just one of many desirable filters. The ones I found useful are:

  • shell pattern matches, such as *.go
  • excluding hidden files and directories
  • throwing out duplicate events on a file (occurring within a second)
  • watching subdirectories as create events happen (not exactly a filter, given the side effect)

There are sure to be other possibilities as well. People have mentioned regular expression matches or excluding folders that begin with an underscore.
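
A rough sketch of how three of the filters listed above (shell patterns, hidden files, duplicate suppression) might be written as small composable functions. The `Event` and `Filter` types and every helper name are hypothetical, chosen only for this example.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
	"time"
)

// Event is a hypothetical event type used only for this sketch.
type Event struct {
	Name string
	Time time.Time
}

// Filter reports whether an event should be kept.
type Filter func(Event) bool

// matchPattern keeps events whose base name matches a shell pattern such as "*.go".
func matchPattern(pattern string) Filter {
	return func(e Event) bool {
		ok, err := filepath.Match(pattern, filepath.Base(e.Name))
		return err == nil && ok
	}
}

// notHidden drops events when any path element starts with a dot.
func notHidden(e Event) bool {
	for _, part := range strings.Split(filepath.ToSlash(e.Name), "/") {
		if strings.HasPrefix(part, ".") && part != "." && part != ".." {
			return false
		}
	}
	return true
}

// dedupe drops an event if the previous event it saw for the same name
// was less than window ago. It keeps state and is not safe for concurrent use.
func dedupe(window time.Duration) Filter {
	last := make(map[string]time.Time)
	return func(e Event) bool {
		prev, seen := last[e.Name]
		last[e.Name] = e.Time
		return !seen || e.Time.Sub(prev) >= window
	}
}

// all combines filters; an event is kept only if every filter keeps it.
func all(filters ...Filter) Filter {
	return func(e Event) bool {
		for _, f := range filters {
			if !f(e) {
				return false
			}
		}
		return true
	}
}

func main() {
	keep := all(notHidden, matchPattern("*.go"), dedupe(time.Second))
	t0 := time.Now()
	events := []Event{
		{"src/main.go", t0},
		{"src/main.go", t0.Add(10 * time.Millisecond)}, // duplicate within a second
		{".hg/store.go", t0},                           // inside a hidden directory
		{"README.md", t0},                              // does not match *.go
	}
	for _, e := range events {
		if keep(e) {
			fmt.Println(e.Name)
		}
	}
}
```

Watching new subdirectories as they are created has a side effect, so it does not fit this pure-function shape and is left out here.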

If it's easy to write and test these filters, I'm sure there will be no shortage of third party libraries, each with a different take on how to best extend os/fsnotify. Perhaps some of this will make it into a higher-level ioutil-like library in the future?

Nathan.

Ingo Oeser

Feb 2, 2014, 5:11:45 AM
to golan...@googlegroups.com
Hi Nathan,

I am not talking about filtering out events in user space, but about the operating system kernel not having to generate them, and not having to wake up this thread/process.

This is especially important for background activity. I guess most users of this API actually are background activities.

I agree that smart filtering belongs in a third-party library. Not bothering the kernel with useless work is an orthogonal matter which the current API is not solving.

Nathan Youngman

Feb 2, 2014, 11:33:08 AM
to golan...@googlegroups.com

Hi Ingo,

Thanks for bringing this up.

Unfortunately my knowledge of kqueue, inotify and ReadDirectoryChangesW is still quite limited.

I know that the current implementation of howeyc/fsnotify doesn't perform this sort of optimization. It just uses fsnFlags to filter the events before returning them.

Guess it's about time I do some research!

Nathan.

Nathan Youngman

Feb 2, 2014, 2:12:53 PM
to golan...@googlegroups.com

I added a section to the document entitled "File operation mapping" so we can see which Ops are available on each OS.

As to why howeyc/fsnotify does user-space filtering, I'm guessing it's related to inconsistent capabilities of the adapters. In particular, Windows differentiates between Create, Remove and Rename on the receiving end but treats them all as the same filter during setup.

Nathan.

Russ Cox

Feb 13, 2014, 3:37:49 PM
to Nathan Youngman, golang-dev
I think we are making good progress on this, but I do not believe we are in a position to commit to an API that we cannot change later. We don't have enough experience and shouldn't be designing against a deadline. I don't believe Go 1.3 should include os/fsnotify.

As an experiment, I started looking at what the go command would need to do its cache. I have built the seed of each interesting piece but have not grown them enough to connect. Perhaps I will get that done for Go 1.3 (with some OS-specific code in what I am calling 'go tool buildcache' instead of importing a portable package named 'os/fsnotify'), but perhaps not. Here is what I learned from the experiment.

The go command needs to watch for changes to source files but also for changes to the directories containing source files, so that it knows when a file has been added or removed, or when a new directory has been created that might shadow another. For example, if we cache the build result for $GOPATH/src/asdf, but then $GOROOT/src/pkg/asdf is created, the build result for $GOPATH/src/asdf must be dropped.

Godoc is a relatively small program. It is built from 102 packages built from 582 source files. We certainly want to be able to build programs larger than godoc.

It appears that Linux inotify will let you watch individual directories for changes within that directory, so for godoc you are looking at a little over 102 inotify watches. For OS X, fsevents will let you watch whole subtrees, so for godoc built from 1 GOROOT and 1 GOPATH entry you are looking at 2 fsevents watches. For Microsoft Windows, FindFirstChangeNotification looks like it might be usable similarly to fsevents. For Solaris, file event notifications (FEN) only apply to individual files or directories, and watching a directory inode does not appear to tell you about modifications made to files in that directory, so for godoc you are looking at almost 700 FEN watches. That might take a little while to set up, but assuming the kernel has no hard limits, it should be fine.

Speaking of limits... For BSD (or OS X if you don't like fsevents), kqueue has the same "enumerate every file or directory" requirement as Solaris FEN, but you give them to the kernel not as file names but as file descriptors. It appears that the file descriptor must remain open while you are watching that file, so the per-process file descriptor limit imposes a limit on the number of things you can watch. Worse, the per-system file descriptor limit imposes a limit on the number of things anyone on the system can watch. The typical kernel limit on a BSD is only on the order of 10,000 file descriptors for the entire machine. Put a few users building significantly larger programs than godoc on the machine - even at different times, assuming the cache sits in the background - and the machine is out of file descriptors.

If we go for the least common denominator, using inotify or fsevents as if it were FEN or kqueue, then we create significant unnecessary work for the system. If we don't, then fsnotify possibly can't be used on FEN or kqueue. Perhaps those systems should be ignored until they provide a more scalable notification system. Plan 9 won't have one (probably ever), so fsnotify is always going to be a best-effort, "might say I can't help you" kind of package.
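
To make the arithmetic above concrete, here is a small sketch that counts how many watches each model would need for a given tree: one per directory (inotify style), one per file and directory (kqueue or FEN style), or one per root (fsevents style). The function names and the in-memory sample tree are assumptions made for this example; nothing here talks to a real notification API.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// countEntries walks fsys and counts directories and non-directory files.
// The root directory itself is included in the directory count.
func countEntries(fsys fs.FS) (dirs, files int, err error) {
	err = fs.WalkDir(fsys, ".", func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			dirs++
		} else {
			files++
		}
		return nil
	})
	return dirs, files, err
}

// sampleTree is a tiny in-memory tree used only for illustration.
func sampleTree() fs.FS {
	return fstest.MapFS{
		"src/a/a.go":      {Data: []byte("package a")},
		"src/a/a_test.go": {Data: []byte("package a")},
		"src/b/b.go":      {Data: []byte("package b")},
	}
}

func main() {
	dirs, files, err := countEntries(sampleTree())
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println("per-directory watches (inotify style):", dirs)
	fmt.Println("per-file-and-directory watches (kqueue/FEN style):", dirs+files)
	fmt.Println("subtree watches (fsevents style):", 1)
}
```

Passing `os.DirFS` of a real GOPATH entry instead of the sample tree would give the corresponding numbers for an actual source tree.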

There is also a question of synchronization. Ideally you want to have a "Sync()" method that guarantees that when it returns, all events for modifications made before the call to Sync have now been delivered. Then a go command can have the buildcache call Sync before starting to give results, and we know it will not have missed recent changes and be delivering stale results. I know how to build a Sync from the fsevents API. I am less sure about the others; probably it is possible but it requires some thought. It also affects the API, because you need to know when the events triggered by the Sync have stopped. If the events are coming over a channel, you need some kind of sentinel event marking the sync position. This is all unclear to me. I think fsevents gives some clear guarantees about delivery that let you build Sync. Windows FindFirstChangeNotification seems not to give any at all. For example, you find out about file changes in a directory by watching the mtime on the files, but the docs say: "The operating system detects a change to the last write-time only when the file is written to the disk. For operating systems that use extensive caching, detection occurs only when the cache is sufficiently flushed." That's clearly useless for interactive go command build result caching.
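
One possible shape for the sentinel idea, sketched with an in-memory channel. The `Event` type with a `Sync` flag and the `drainUntilSync` helper are hypothetical. The hard part discussed above, getting the OS to guarantee that every earlier change has been reported before the sentinel is queued, is simulated here rather than solved.

```go
package main

import "fmt"

// Event is a hypothetical event type. When Sync is true the event is a
// sentinel marking a sync position rather than a file change.
type Event struct {
	Name string
	Sync bool
}

// drainUntilSync reads events until the sentinel arrives and returns the
// names seen before it. ok is false if the channel closed first.
func drainUntilSync(events <-chan Event) (names []string, ok bool) {
	for e := range events {
		if e.Sync {
			return names, true
		}
		names = append(names, e.Name)
	}
	return names, false
}

func main() {
	events := make(chan Event, 4)

	// A real watcher would enqueue the sentinel only after the OS has
	// reported every change made before Sync was called. Here the
	// ordering is simply written out by hand.
	events <- Event{Name: "a.go"}
	events <- Event{Name: "b.go"}
	events <- Event{Sync: true}
	events <- Event{Name: "c.go"} // a change made after the sync point

	names, ok := drainUntilSync(events)
	fmt.Println(names, ok)
}
```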
 
In the non-recursive watch model provided by inotify, the idea of watching for directories that might yet be created is also quite subtle. You need to watch the parents of those directories, but the parents might not exist yet either, so you need to walk up the tree until you find a parent that does exist, and then you need to watch for subdirectories there, and then start watching the children of only the ones you care about. And then maybe one of those directories will be removed, and you need to put the watch back on the parent. I've sat down to write the logic multiple times and backed off each time. I can't quite wrap my head around it. I'm sure it's possible, but it's so subtle it probably needs to be provided by the library, not reimplemented incorrectly by each user. We can't go full-blown recursive by default, because I still believe the cost there is too expensive.
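
The first step described above, walking up the tree until an existing parent is found, might look like the sketch below. `nearestExisting` and `onDisk` are hypothetical helpers. This covers only the walk-up; re-arming watches as directories appear and disappear, and the race between checking and watching, are exactly the subtle parts and are not addressed.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// nearestExisting walks up from path until exists reports true and returns
// that ancestor. It stops at the root, where filepath.Dir returns its argument.
func nearestExisting(path string, exists func(string) bool) string {
	path = filepath.Clean(path)
	for !exists(path) {
		parent := filepath.Dir(path)
		if parent == path {
			break
		}
		path = parent
	}
	return path
}

// onDisk reports whether path currently exists on the local file system.
func onDisk(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

func main() {
	want := filepath.Join(os.TempDir(), "no-such-dir", "src", "asdf")
	fmt.Println("watch", nearestExisting(want, onDisk), "until", want, "appears")
}
```

Taking the existence check as a parameter keeps the walk-up logic testable without creating directories.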

Perhaps the right middle ground API is to allow watching only of directories, and to be told only "something in that directory changed". Fsevents can watch the root, inotify can watch individual directories, FEN can be told about directories and all the files in them. Maybe kqueue can try until it runs the machine out of file descriptors and then back off. Or maybe not.

In all honesty, we might have to decide that systems that don't at least provide directory-based watching just lose out, so that BSD and maybe even Solaris systems just aren't supported. It would be nice to know if BSD or Solaris are thinking about higher-level watching (and in the case of BSD, not tying watches to file descriptors). I did not appreciate the wide variation in APIs and the complexity of turning the APIs into something you can actually program correctly against.

These are the sorts of questions I see about the os/fsnotify API and the reasons I think we can't commit to something in the next two weeks. I do hope we can get something done for the next release, and I encourage work to continue in the go.exp subrepo.

Russ

Nathan Youngman

Feb 13, 2014, 11:27:28 PM
to golan...@googlegroups.com, Nathan Youngman

Giving ourselves more time to research & experiment sounds good to me.

Will dropping kqueue/FEN and focussing on fsevents/inotify/Windows cover the majority of people using Go? Moving the bar up to that level would be quite a nice improvement.

~~

The default number of inotify watches is 8192 (on Ubuntu 12.04). For context, the Go standard library contains 172 directories.

Each used inotify watch takes 1KB on 64-bit systems:
http://askubuntu.com/questions/154255/how-can-i-tell-if-i-am-out-of-inotify-watches

Compared to what you're describing for inotify, the user-space recursive watch I implemented is rather simple:

* filepath.Walk and watch each folder (with a SkipDir for hidden folders like .hg)
* an autowatch function to add another watch for events that are both IN_CREATE and IN_ISDIR

There are probably some subtleties that I missed. Nor have I spent any time with FSEvents or with bWatchSubtree on Windows to ensure consistent behaviour. 
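
The two steps above can be sketched roughly as follows. This is not the code from fsnotify: the `add` callback stands in for whatever registers a watch, `watchTree`, `autoWatch` and `demo` are hypothetical names, and the sketch uses `filepath.WalkDir` (added in Go 1.16) rather than `filepath.Walk`.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// watchTree calls add for root and for every non-hidden directory below it.
func watchTree(root string, add func(dir string) error) error {
	return filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.IsDir() {
			return nil
		}
		if path != root && strings.HasPrefix(d.Name(), ".") {
			return filepath.SkipDir // skip hidden folders such as .hg
		}
		return add(path)
	})
}

// autoWatch is meant to be called for each create event. If the new path is
// a directory it is walked too, which also picks up any subdirectories that
// were created before the watch could be added.
func autoWatch(path string, add func(dir string) error) error {
	info, err := os.Stat(path)
	if err != nil {
		return err
	}
	if !info.IsDir() {
		return nil
	}
	return watchTree(path, add)
}

// demo builds a small temporary tree and returns the directories that would
// be watched, relative to the tree's root.
func demo() ([]string, error) {
	root, err := os.MkdirTemp("", "watchtree")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, dir := range []string{"pkg/sub", ".hg/store"} {
		if err := os.MkdirAll(filepath.Join(root, filepath.FromSlash(dir)), 0o755); err != nil {
			return nil, err
		}
	}
	var watched []string
	err = watchTree(root, func(dir string) error {
		rel, err := filepath.Rel(root, dir)
		if err != nil {
			return err
		}
		watched = append(watched, filepath.ToSlash(rel))
		return nil
	})
	return watched, err
}

func main() {
	watched, err := demo()
	if err != nil {
		fmt.Println("demo failed:", err)
		return
	}
	fmt.Println(watched)
}
```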

A case for supporting a recursive/subtree watch is:

go test ./... --watch

~~

How should we proceed with go.exp?

I like the direction we were headed with the API. Much cleaner. Does it make sense to proceed with that refactoring? (knowing it will change again)

Might it make sense to pull back and work on inotify, winfsnotify as well as add an fsevents package?

My thinking is to explore each of those thoroughly before working out what a common API would look like.

Nathan.

Aram Hăvărneanu

Feb 14, 2014, 7:00:18 AM
to golan...@googlegroups.com, Nathan Youngman
> For Solaris, file event notifications (FEN) only apply to individual
> files or directories, and watching a directory inode does not appear to
> tell you about modifications made to files in that directory

This is true and it's really a shame considering that with DTrace you
can implement recursive watching in a couple of lines with practically
zero overhead.

Nathan Youngman

Feb 15, 2014, 1:20:29 AM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox

This evening I spent some time with kqueue, rewriting a small part of fsnotify and inspecting the raw output under both OS X 10.9 and FreeBSD 9.1.

When watching the path to a directory I found that:

* adding, removing or saving a file resulted in a WRITE note on the directory
* adding or removing a subdirectory resulted in both WRITE|LINK notes on the event
* renaming a subdirectory is just a WRITE
* thankfully, renaming the directory itself is a RENAME (the watch followed the rename! yay!)
* deleting the directory itself is a DELETE (and maybe WRITE and/or LINK if it contains stuff)

There's no additional information, just the Ident for the file descriptor of the directory where some change happened.

~~

Then there's the kFSEventStreamCreateFlagFileEvents option on fsevents:

"Your stream will receive events about individual files in the hierarchy you're watching instead of only receiving directory level notifications. Use this flag with care as it will generate significantly more events than without it."

~~

Looking specifically at cmd/go, I wonder if instead of trying to get kqueue to watch every file, we just watch directories (whether recursively or following import paths)? In which case, for any file modified (*.go or not) we invalidate the cache for that directory/package.

That doesn't solve all the problems, but at least it helps with the file descriptor limits, and I think it would still reduce the amount of stat'ing cmd/go needs to do.

Nathan.

Aram Hăvărneanu

Feb 15, 2014, 5:29:21 AM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox
> Looking specifically at cmd/go, I wonder if instead of trying to
> get kqueue to watch every file, we just watch directories (whether
> recursively or following import paths)? In which case, for any
> file modified (*.go or not) we invalidate the cache for that
> directory/package.

Watching a directory with kqueue won't report when files in that
directory are modified. You have to watch every individual
file, and kqueue requires a fd for every file.

Solaris also requires watching every file, but it doesn't require
fd's.

Russ Cox

Feb 15, 2014, 10:07:13 AM
to Nathan Youngman, golang-dev
On Sat, Feb 15, 2014 at 1:20 AM, Nathan Youngman <n...@nathany.com> wrote:

> This evening I spent some time with kqueue, rewriting a small part of fsnotify and inspecting the raw output under both OS X 10.9 and FreeBSD 9.1.
>
> When watching the path to a directory I found that:
>
> * adding, removing or saving a file resulted in a WRITE note on the directory

What do these things mean exactly? "Save" is not a system call. I think if you work at the system call level you will see that open+write+close of an existing file does not appear as a directory write. A "save" from an editor may well, because most editors write a temporary file next to the real one and then rename the temporary on top of the real one, and those are directory operations.
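
The difference between the two kinds of "save" can be sketched in Go. `atomicSave`, `inPlaceSave` and `demo` are hypothetical names used only for this illustration: the first creates and renames a temporary file, both of which are directory operations, while the second only opens, writes and closes a file that already exists.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// atomicSave writes data to a temporary file next to path and then renames
// it over path, the way many editors save. Both the create and the rename
// modify the containing directory.
func atomicSave(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // cleans up on failure; fails harmlessly after a successful rename
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// inPlaceSave does open+write+close on path. If the file already exists,
// no directory entry is added, removed or renamed.
func inPlaceSave(path string, data []byte) error {
	return os.WriteFile(path, data, 0o644)
}

// demo saves a file both ways in a temporary directory and returns the
// final contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "atomicsave")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "main.go")
	// The first write creates the file, which does change the directory.
	if err := inPlaceSave(path, []byte("v1")); err != nil {
		return "", err
	}
	if err := atomicSave(path, []byte("v2")); err != nil {
		return "", err
	}
	data, err := os.ReadFile(path)
	return string(data), err
}

func main() {
	fmt.Println(demo())
}
```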

Russ

Nathan Youngman

Feb 15, 2014, 12:57:51 PM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox

You are correct. I apologize for not testing that earlier.

To confirm, disabling the "atomic_save" option of my editor or simply touching an existing file gives us nothing. :-(

Now I've seen with my own eyes that kqueue is a poor fit for what cmd/go (and many other tools) wish to do. What do you think of extracting kqueue to a separate low-level package for those who need it? (or go.exp/fskqueue, as it only deals with file watches via EVFILT_VNODE).

More importantly, do you have some code to kickstart our work on FSEvents? (I suspect other priorities will take precedence for the next two weeks though. :-)

Aram, do you know the (default) limits for event ports / FEN on Solaris? (how many files, how much memory per watch)

Long term, if we decide to keep low-level packages for each, it would allow anyone to wrap the parts they want, relieving fsnotify of trying to support every platform and use case. (separate packages certainly aren't without their downsides)

IMO, a subtree watch on Linux, Mac and Windows will cover the 80% (80/20 rule) and make good use of what FSEvents and Windows can provide.

Nathan.

Aram Hăvărneanu

Feb 15, 2014, 1:37:54 PM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox
> Aram, do you know the (default) limits for event ports / FEN on
> Solaris? (how many files, how much memory per watch)

The default limit is 64k event sources per event port per process.
The default limit for the number of event sources per system is
2^31.

The default limit for the number of event ports per process is 8k.
The default limit for the number of event ports per system is 64k.

I don't know how much memory event ports and event sources use,
probably little.

I don't understand why os/fsnotify is not like os. os provides a
unique interface, but the semantics differ between systems. Calls
that always work on one system can always fail on another. That's
ok, the application decides what to do on failure.

In my mind we only need three functions, watch a filename (which
can be a directory), watch a fd (or an os.File rather), and recursively
watch a directory. The semantics of these need not be the same on
every system. Watching a directory might or might not trigger an
event if a file in that directory is modified. That's fine, the
behaviour for each system is documented and the application can deal
with it. Recursively watching a directory can fail for systems
that don't support it. That's perfectly fine, the application sees
the error and can decide what to do next. Perhaps it wants to emulate
recursive watchers, perhaps it switches to a completely different
mechanism, e.g. polling. Perhaps it wants to use (or reimplement)
FAM.

It's ok for us to not provide the same behaviour for all systems. Our
job is to provide the same interface, not emulate other systems'
behaviour. Perhaps it's fine to provide inside the package a recursive
watcher emulation, but we shouldn't silently push it to applications
that didn't require it explicitly. They see the error, they can
decide if the emulation is good enough and then they can use it if
they so please.
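
A minimal sketch of the "same interface, different semantics" idea, assuming a hypothetical `Watcher` interface and `ErrNotSupported` error (neither is part of any proposed API). Watching an os.File is left out for brevity. The point shown is that the recursive call fails visibly and emulation happens only when the application opts in.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotSupported is a hypothetical error for operations a system lacks.
var ErrNotSupported = errors.New("fsnotify: not supported on this system")

// Watcher is a hypothetical interface covering two of the three operations
// suggested above.
type Watcher interface {
	Watch(name string) error
	WatchRecursive(dir string) error
}

// flatWatcher models a system without native recursive watches.
type flatWatcher struct{ watched []string }

func (w *flatWatcher) Watch(name string) error {
	w.watched = append(w.watched, name)
	return nil
}

func (w *flatWatcher) WatchRecursive(dir string) error { return ErrNotSupported }

// watchSubtree tries a native recursive watch. Only if the caller supplies
// an emulate function does it fall back; otherwise the error is returned.
func watchSubtree(w Watcher, dir string, emulate func(Watcher, string) error) error {
	err := w.WatchRecursive(dir)
	if errors.Is(err, ErrNotSupported) && emulate != nil {
		return emulate(w, dir)
	}
	return err
}

func main() {
	w := &flatWatcher{}

	// Without opting in, the application sees the failure and decides.
	fmt.Println(watchSubtree(w, "src", nil))

	// With an explicit emulation (here just a fixed list of directories).
	emulate := func(w Watcher, dir string) error {
		for _, d := range []string{dir, dir + "/a", dir + "/b"} {
			if err := w.Watch(d); err != nil {
				return err
			}
		}
		return nil
	}
	fmt.Println(watchSubtree(w, "src", emulate), w.watched)
}
```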

I think os/fsnotify in the standard library is a fine addition, but
the more I think about using it for cmd/go, the more I believe it's
a mistake.

Nathan Youngman

Feb 22, 2014, 12:58:30 AM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox

Maybe it's just me? If the behaviour is significantly different from one OS to another, I'm having trouble seeing the benefit (to the user) of a single "fsnotify" API vs. just having separate low-level packages? Even the integration tests would be different from one OS to the next.

It's interesting that Alex brings up fanotify on Linux. Maybe it is worth thinking about a SQL-driver-like system that allows FSEvents and/or kqueue to be used on OS X and fanotify and/or inotify on Linux?

I think improving the performance of cmd/go is the main motivation for adding fsnotify to the standard library vs. leaving it off elsewhere? Personally, I'd like to know that fsnotify is extremely solid before relying on it in something as critical as cmd/go, so I am glad if this is all being pushed to Go 1.4.

Nathan.

Nathan Youngman

Feb 22, 2014, 11:06:34 AM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox
Facebook's file watcher "Watchman" sounds very much like it was designed to solve the same problems as the proposed cmd/go improvements. It supports inotify, FSEvents, kqueue and event ports (notably not Windows). The code is open source under an Apache License.


I highly recommend watching Durham Goode's talk "Scaling Source Control at Facebook":


"Some of you may be familiar with existing file monitoring solutions... they are a pain in the butt to use, they are complex, they are not user friendly, and they have a lot of race conditions that you have to be aware of as a developer."

"We've extensively tested it, we rolled it out for several months to all our developers... what's actually on disk, what does Watchman think is on disk... when we found discrepancies we fixed it."

~~

Having os/fsnotify simply expose what each OS provides will still be a "pain in the butt to use". If ease-of-use is not the main goal, then I'm really liking the database/sql driver approach:
  • It sets expectations. The API presented by howeyc/fsnotify is a black box that tries to smooth out OS-specific differences. On the other hand, I wouldn't expect SQLite and Postgres to behave exactly the same, even if the API looks the same. By telling fsnotify which adapter to use, I know that I need to be aware of the features and limitations of that adapter.
  • It makes it possible to support multiple adapters on a single OS. For example, one could use kqueue on OS X for watching individual files or use FSEvents for subtree watches.
  • Not all adapters would need to exist in the standard library, making it possible for third-parties to extend as new file notification systems become available (for example, adding fanotify on Linux).

That said, I wouldn't expect many people to use os/fsnotify directly. There would probably be one or more third-party packages that, like howeyc/fsnotify, try to smooth out the differences so that file notifications can be treated like a black box that "just works." This would be equivalent to a query builder or ORM that generates the appropriate SQL for each database driver.

The nice thing with this approach is that those third-party packages (and the go tool buildcache internals) can make different decisions as to which features and platforms to support. Indeed, those decisions can change over time.
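
A sketch of what a database/sql-style registry for adapters might look like. `Driver`, `Watcher`, `Register`, `Open` and the fake adapter are all hypothetical names modelled loosely on `database/sql.Register` and `sql.Open`; no such fsnotify API exists in this discussion.

```go
package main

import (
	"fmt"
	"sync"
)

// Watcher is a hypothetical minimal watcher interface.
type Watcher interface {
	Add(name string) error
	Close() error
}

// Driver is a hypothetical adapter (inotify, kqueue, FSEvents, ...).
type Driver interface {
	Open() (Watcher, error)
}

var (
	mu      sync.RWMutex
	drivers = make(map[string]Driver)
)

// Register makes an adapter available by name. Like database/sql.Register,
// it panics if the driver is nil or the name is already taken.
func Register(name string, d Driver) {
	mu.Lock()
	defer mu.Unlock()
	if d == nil {
		panic("fsnotify: Register driver is nil")
	}
	if _, dup := drivers[name]; dup {
		panic("fsnotify: Register called twice for driver " + name)
	}
	drivers[name] = d
}

// Open returns a watcher from the named adapter, so the caller states
// explicitly which adapter's features and limitations apply.
func Open(name string) (Watcher, error) {
	mu.RLock()
	d, ok := drivers[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("fsnotify: unknown driver %q (forgotten import?)", name)
	}
	return d.Open()
}

// fakeDriver and fakeWatcher stand in for a real adapter in this sketch.
type fakeDriver struct{}

func (fakeDriver) Open() (Watcher, error) { return &fakeWatcher{}, nil }

type fakeWatcher struct{ names []string }

func (w *fakeWatcher) Add(name string) error { w.names = append(w.names, name); return nil }
func (w *fakeWatcher) Close() error          { return nil }

func main() {
	Register("fake", fakeDriver{})

	w, err := Open("fake")
	if err != nil {
		fmt.Println(err)
		return
	}
	defer w.Close()
	fmt.Println(w.Add("src"))

	_, err = Open("kqueue") // not registered in this sketch
	fmt.Println(err)
}
```

In this shape, a third-party adapter would call `Register` from its `init` function and be selected by name, much as SQL drivers are.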

Even if we all like this approach, there is still the substantial challenge of coming up with a uniform API and implementing/debugging all these adapters.

For my use case, I still just want an efficient subtree watch on Linux, Mac, Windows that can be used for autotesting (--watch), auto-building (like with App Engine's devserver) and perhaps other tasks like asset packaging (CSS/JS). There are many ways to get there.

Nathan.

Nathan Youngman

Apr 1, 2014, 11:33:21 PM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox

Any feedback on the last post?

I'm still of the opinion that we should build out the individual packages for fsevents, inotify, kqueue, Windows, event ports first and then converge on a similar API or a driver-like system.

I'm becoming less convinced that fsnotify itself belongs in the standard library (or at least the os package). It seems like any cross-platform API will need to make some choices for the user of that API, placing it at a higher level layer.

"This is an important principle of API design: handle the difficult, tedious, and tricky parts at the low levels so that everyone gets that right while not ruling out people doing their own things at higher levels."

Nathan.

Russ Cox

Apr 2, 2014, 12:28:52 AM
to Nathan Youngman, golang-dev
I don't know, and preparing the rest of the release has been all-consuming and I haven't had time to think about whether a unification is possible. I think it's fine to have OS-specific packages for now.

Nathan Youngman

May 29, 2014, 12:41:52 AM
to golan...@googlegroups.com, Nathan Youngman, Russ Cox

Once Go 1.3 is out the door, I'd like to look at these things to start:

* remove current implementation of WatchFlags https://codereview.appspot.com/100860043/
* refactor fsnotify towards the API we have in http://goo.gl/MrYxyA (knowing full well that it will change again, but it's a chance to dig into the code and clean it up)
* write and document a standalone fsnotify/kqueue package that follows a similar API but doesn't add anything on top of what the OS provides

Nathan.