I know I'm coming to this conversation very late, but I have been writing a
build tool that automatically tracks dependencies in a similar fashion to that
described by the author of Tup in the linked article. The tool is meant to be
used in "daemon mode" only. It's called "lmake" for "Linux Make", but the
architecture could allow it to be used on other platforms as well.
The architecture is as follows:
There are three components: the "builder", the "watcher" and the "coordinator".
Following the UNIX tradition, each of these components is run as a separate
process, and all communication is via a simple text-based protocol on (piped)
stdin, stdout and stderr.
Builders read on standard input a "build" command (e.g., "build gcc -c hello.c
-o hello.c.o"), write to standard output dependencies ("input hello.c \n output
hello.c.o") and status messages ("success" or "failure"), and write command
output (like compiler errors) to standard error. The default "builder" is
"lmake-builder-ptrace", and, as the name suggests, uses the "ptrace()" system
call to track dependencies. Another builder is "lmake-builder-ldpreload", which
- again as the name suggests - uses LD_PRELOAD to intercept calls.
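To make the builder protocol concrete, here is a minimal Python sketch of how a
coordinator might parse what a builder writes to standard output. It is not
lmake's actual code. It assumes one message per line, using only the keywords
quoted above ("input", "output", "success", "failure"); the names BuildResult
and parse_builder_output are made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BuildResult:
    """Dependencies and status reported by one builder run (hypothetical type)."""
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    succeeded: bool = False


def parse_builder_output(text: str) -> BuildResult:
    """Parse builder stdout, assuming one 'keyword [path]' message per line."""
    result = BuildResult()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # Split on the first space only, so paths may contain spaces.
        keyword, _, rest = line.partition(" ")
        if keyword == "input":
            result.inputs.append(rest.strip())
        elif keyword == "output":
            result.outputs.append(rest.strip())
        elif keyword == "success":
            result.succeeded = True
        elif keyword == "failure":
            result.succeeded = False
        else:
            raise ValueError(f"unrecognised builder line: {line!r}")
    return result
```

Feeding it "input hello.c", "output hello.c.o" and "success" on separate lines
yields one input, one output and a successful status.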
Watchers read on standard input files to watch (e.g., "watch hello.c"), write
to standard output events ("changed hello.c") and write to standard error any
issues that occur (e.g., "Unable to watch hello.c: file does not exist"). The
default watcher is "lmake-watcher-inotify", which uses the "inotify" API on
Linux to monitor the file system. Another watcher is
"lmake-watcher-timer", which periodically scans the file system to see if any
files have changed.
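The timer-based approach is simple enough to sketch. The Python below is only
an illustration of the idea, not the real "lmake-watcher-timer": it assumes
that "changed" means "the modification time differs from the previous scan",
and the function names snapshot and detect_changes are hypothetical.

```python
import os


def snapshot(paths):
    """Map each path to its modification time in ns, or None if it is missing."""
    state = {}
    for path in paths:
        try:
            state[path] = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            state[path] = None
    return state


def detect_changes(previous, current):
    """Return 'changed <path>' lines for paths whose state differs between scans.

    Paths that were not in the previous scan are skipped, so a newly added
    watch does not fire an event by itself.
    """
    events = []
    for path in sorted(current):
        if path in previous and previous[path] != current[path]:
            events.append(f"changed {path}")
    return events
```

A watcher loop would call snapshot() every so often, pass the old and new
results to detect_changes(), and print each returned line to standard output.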
There is only one coordinator ("lmake"), which interfaces between the watcher
and the various builders. That is, when the watcher says "Hey, just so you
know, the file 'hello.c' was just changed" the coordinator can automatically
distribute work to each of the builders in order to compile only what is
necessary, and only when necessary.
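As a rough sketch of that decision (again, not lmake's real logic), the Python
below takes the inputs and outputs that builders reported for each command and
works out which commands to rerun when one file changes. It assumes the
commands are stored in the order they appear in the configuration file, so a
command always comes after the commands that produce its inputs. RECORDS and
commands_to_rerun are hypothetical names, and the recorded dependencies are
simplified (no headers).

```python
# (command, inputs, outputs) as a builder might have reported them,
# kept in configuration-file order.
RECORDS = [
    ("gcc -c foo.c -o foo.c.o", ["foo.c"], ["foo.c.o"]),
    ("gcc -c bar.c -o bar.c.o", ["bar.c"], ["bar.c.o"]),
    ("gcc foo.c.o bar.c.o -o hello-world", ["foo.c.o", "bar.c.o"], ["hello-world"]),
]


def commands_to_rerun(changed, records):
    """Return the commands affected by a change to one file, in run order."""
    dirty = {changed}  # files known to be out of date
    rerun = []
    for command, inputs, outputs in records:
        if dirty.intersection(inputs):
            rerun.append(command)
            # Everything this command writes is now out of date too.
            dirty.update(outputs)
    return rerun
```

With these records, a change to foo.c selects the foo.c compile and the link
step, and leaves the bar.c compile alone.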
Having three separate processes allows flexibility and testability. That is,
you can easily run "echo build gcc -c main.c | lmake-builder-ptrace" (without
the coordinator) to see what files are used by a process. Likewise, you can
easily run "echo watch hello.c | lmake-watcher-inotify" and see what events are
produced. Again, you could have an OS-specific builder for other platforms.
Perhaps fswatch for macOS? Something hacky and horribly slow for Windows?
(Jks).
Three separate processes could also allow distributed builds. E.g., you can
pipe the build commands to builders on other computers. Of course, the builder
protocol would need to be enhanced in order to allow file distribution.
There are two reasons why I started working on "lmake". Firstly, I wanted a
daemon-like build system. That is, as soon as I saved a file, I wanted the
project rebuilt. The "watcher" component solves this first problem.
(Incidentally, one complaint against using compiled languages for web
applications is the long edit/compile/test build cycle, which, when using a
daemon, is effectively reduced to that of interpreted languages). Secondly, I
wanted accurate dependency tracking. I dislike GNU Make dependencies, because
implicit/explicit dependencies are hacky (e.g., the "$<" vs "$^" inputs). Ninja
does this in a much better way using explicit ("$in"), implicit ("|") and
order-only dependencies ("||"), but you still have to manually list the
dependencies. The "builder" component solves this second problem, and also
allows rather minimal configuration files (and ones that can be executed
directly in the shell in case the user doesn't have "lmake" installed). For
example, the configuration file below can be executed by "lmake" or can be
executed directly in the shell by typing "source lmake.config":
    gcc -c foo.c -o foo.c.o
    gcc -c bar.c -o bar.c.o
    gcc foo.c.o bar.c.o -o hello-world
Anyway, "lmake" still needs a lot of work before it's production-ready. I
recently stopped work on it because it was distracting me from getting any
actual work done. Instead I decided to write a small utility "wexec", which
executes a program (e.g., "ninja") after a period of inactivity on standard
input, which, when combined with a tool like "inotifywait", effectively solves
problem 1 above (automatic rebuilds). However, problem 2 above (automatic
dependency tracking) is still not well managed.
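The "run after a period of inactivity" idea can be sketched in a few lines of
Python. This is not the actual "wexec" source, just an illustration of the
technique under some assumptions: it uses select() on a file descriptor (which
works for pipes on Linux and macOS, not on Windows), and the name
run_after_quiet and its parameters are invented for the example.

```python
import os
import select


def run_after_quiet(fd, quiet_seconds, action):
    """Call action() once each time input on fd goes quiet after a burst.

    Returns when fd reaches end of file.
    """
    pending = False  # True when input has arrived since the last run
    while True:
        # Block indefinitely while idle; use the quiet period once input is pending.
        timeout = quiet_seconds if pending else None
        readable, _, _ = select.select([fd], [], [], timeout)
        if not readable:
            # Timed out with input pending: the burst is over, so run the action.
            action()
            pending = False
            continue
        data = os.read(fd, 4096)
        if not data:
            # End of input: flush any pending burst, then stop.
            if pending:
                action()
            return
        pending = True
```

In a real tool the action would launch the build (for instance with
subprocess.run(["ninja"])), and fd would be standard input fed by something
like "inotifywait -m". A burst of several file events then triggers one
rebuild instead of one per event.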
I'd like to make clear that I LOVE ninja. I'm not trying to hijack this thread
and I'm not suggesting a "better tool" (lmake is a rather garbage
implementation of the idea). Among a plethora of goodies, ninja is incredibly
fast, easy to use, and has a very simple configuration file format. Would there
be any interest in implementing a feature to automatically track dependencies?
Among ninja's features, this would be - in my opinion - the best.
(Also, I couldn't find the GitHub discussion and I don't know how much overlap
this has with what others have written.)