Dear redo implementors,
some of you have chosen to “optimize” redo-always so that a target is
not always rebuilt, but built at most once per run. For example, goredo
does this: <http://www.goredo.cypherpunks.ru/FAQ.html>
> By definition, it should be built always, as redo-sh and redo-c
> implementations do. But that ruins the whole redo usability,
> potentially rebuilding everything again and again. apenwarr/redo and
> goredo tries to build always-targets only once per run, as a some kind
> of optimization.
Here are some use cases that get sabotaged by the “optimizations”; all
can be summed up with “not everything out there is a pure function”:
1. Composition with recursion – a major reason to use redo – becomes
hard or impossible to do without implementation-specific workarounds.
The popular apenwarr-redo seems to have this problem in its own test
suite, leading to it having code that messes with its cache during a
test:
https://github.com/apenwarr/redo/blob/main/t/flush-cache.in
This is necessary – because otherwise this test will not pass:
https://github.com/apenwarr/redo/blob/main/t/640-always/all.do
In my opinion, a test reaching into an implementation-specific cache
solely because an implementation's dependency caching is broken … is
bogus: redo-always only functions “as expected” in the test case due
to smoke and mirrors – and the trickery means that it is unportable.
The test case for redo-always in redo-sh does not need this. If some
redo implementation does not pass it, recursive composition is hard,
unportable, or both.
2. A target that is always out of date and has desired side effects
breaks when an implementation erroneously considers it up to date. This
should be obvious: If I have a target that is always out of date and I
want it to generate something and send it to another host over the
network, it is not going to work past the first time if the target is
only built once.
3. Targets that include timestamp information are majorly affected by
this. Imagine a website footer that includes the output of “date” or a
long-running document conversion process that wants to timestamp each
build artifact. If footer.html.do is just “redo-always; LANG=C date”,
that will probably not result in what was intended if footer.html is
built only once. I recently used redo to generate a printed schedule
for an event – since that could change at a moment's notice, it was
very important to the team to have the timestamps on the printouts.
4. Targets that are out of date unless a condition is fulfilled may be
broken. For example, I have targets that use redo-always to be out of
date, but then use redo-stamp to take it back. Without redo-always, such
a target might not be evaluated next time, as the input to redo-stamp
can come from anywhere, e.g. it could be an ETag from an HTTP header.
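
To make use cases 3 and 4 concrete, here is a minimal sketch of the two
dofiles, written out with here-documents so it can be pasted into a
shell. The name upstream.etag.do and the example.org URL are
hypothetical placeholders, not taken from any real project. I assume
that redo-stamp reads the stamp data from standard input and that $3 is
the temporary output file, as in apenwarr/redo.

```sh
#!/bin/sh
# Sketch only: file names and the URL are hypothetical placeholders.

# Use case 3: the footer dofile quoted in the text, split over two lines.
cat >footer.html.do <<'EOF'
redo-always
LANG=C date
EOF

# Use case 4: always out of date, then "taken back" by redo-stamp when
# the ETag (fetched with a HEAD request) is the same as last time.
# Assumption: redo-stamp reads the data to stamp from standard input.
cat >upstream.etag.do <<'EOF'
redo-always
curl -sI https://example.org/data.json | tr -d '\r' | grep -i '^etag:' >"$3"
redo-stamp <"$3"
EOF

# Syntax-check both dofiles without running them (no redo needed).
sh -n footer.html.do && sh -n upstream.etag.do && echo "dofiles parse"
```

A target that runs “redo-ifchange upstream.etag” would then only be
rebuilt when the ETag changes, but only if the implementation actually
re-runs upstream.etag.do every time it is asked for.
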
Here are some reasons to deliver a botched “redo-always” implementation:
1. A naive way of representing dependencies can not represent a target
that is always out of date. An example: Treating targets as out of date
or not out of date as a binary is wrong – a target is always out of date
or not out of date relative to some other thing: In the minimal case, a
target is out of date relative to its build rule. Incidentally, a system
that represents dependencies that way MUST have the concept of a “run”,
as it can not cleanly deal with many states that occur during a build.
2. Several obvious but wrong “optimizations” of dependency checking make
it so that a target can be built at most once during a run. For example,
a build system can aggressively and erroneously cache dependency checks,
so an always-out-of-date target will be found out of date, but after it
is built the updated cache entry reads “this is up to date”. This will
make builds faster and sometimes incorrect, even without redo-always.
3. An implementor who finds it hard to make a recursive top-down build
system fast can implement parts of it in a bottom-up toposort way. This
makes builds faster, easier to parallelize and sometimes incorrect,
even without redo-always.
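
Reason 2 can be illustrated with a toy model in plain sh. This is not
code from goredo, apenwarr/redo or any other implementation; every
function and file name in it is made up for the illustration. It only
shows how a per-run “up to date” cache swallows the second build of a
target, while a variant that never consults the cache for an
always-out-of-date target builds it each time it is requested.

```sh
#!/bin/sh
# Toy model, not taken from any redo implementation.
# All names here are hypothetical.
cache=$(mktemp)    # per-run list of targets already judged "up to date"
builds=$(mktemp)   # log of every build that actually ran

run_dofile() {     # stand-in for executing <target>.do
    echo "$1" >>"$builds"
}

# Over-eager variant: once built, a target counts as up to date for the
# rest of the run, even if it declared itself always out of date.
redo_cached() {
    if grep -qxF "$1" "$cache"; then
        return 0   # cache says "up to date", so the dofile is skipped
    fi
    run_dofile "$1"
    echo "$1" >>"$cache"
}

# Variant that honors redo-always: the cache is never consulted.
redo_always_target() {
    run_dofile "$1"
}

count_builds() {   # how often was this target actually built?
    grep -cxF "$1" "$builds"
}

# Request each target twice within the same "run".
redo_cached footer.html
redo_cached footer.html
redo_always_target stamp.txt
redo_always_target stamp.txt

echo "footer.html built $(count_builds footer.html) time(s)"
echo "stamp.txt built $(count_builds stamp.txt) time(s)"
```

Run as written, this reports one build for footer.html and two for
stamp.txt, although both were requested twice.
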
Based on these three alone, you can sometimes work out motivations an
author had. For example, the “potentially rebuilding everything again
and again” problem of goredo could be a property of an implementation
that is very eager to calculate what is out-of-date, but is unable to
take it back later. As I understand it, redo-sh does not have such an
issue. (Please correct me if I am wrong here – with example dofiles.)
Greetings,
Nils