I probably use Tup a little differently than most people. Instead of using it to build software (which I do sometimes), I use it for repeated data analysis and graph generation. One of my tup rules looks something like
: input.json |> < %f simulator | jq -f extract_something.jq > %o |> output.json
where the simulator produces a lot of output, but in this case I only care about one aspect. Without pipefail, if the simulator fails, jq receives empty input, which it accepts without complaint, and the result is an empty output file. The next rule that uses output.json then fails because the file is empty, even though the real problem was in the previous rule.
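The effect is easy to reproduce outside of tup. This is only a sketch: `false` stands in for a failing simulator and `cat` for jq (both are my stand-ins, not the real commands), and it assumes bash is on the PATH.

```shell
# Stand-ins: `false` plays the failing simulator, `cat` plays jq.

# Without pipefail, the pipeline reports the status of its last command only.
without_pipefail=0
bash -c 'false | cat > /dev/null' || without_pipefail=$?

# With pipefail, a failure anywhere in the pipeline is reported.
with_pipefail=0
bash -o pipefail -c 'false | cat > /dev/null' || with_pipefail=$?

echo "without pipefail: exit $without_pipefail"
echo "with pipefail:    exit $with_pipefail"
```

The first pipeline exits 0 even though its first stage failed; the second exits 1.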
In general, I find it hard to believe that, especially with tup, there's ever a time when you wouldn't want pipefail to be active, but maybe I'm unique in that feeling. One interesting thing that I didn't realize is that sh is run with the -e flag, which means any failed command fails the rule. E.g. currently in tup the rule
: |> false; echo foo |>
will fail, but the rule
: |> false | echo foo |>
will succeed, because without pipefail the exit status of a pipeline is that of its last command, here echo.
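The same two commands can be checked directly with sh, without going through tup. A minimal sketch, assuming only that sh is on the PATH (the variable names are just for illustration):

```shell
# `false; echo foo` under -e: sh exits at `false`, so echo never runs.
seq_status=0
seq_out=$(sh -e -c 'false; echo foo') || seq_status=$?

# `false | echo foo` under -e: only echo's exit status counts.
pipe_status=0
pipe_out=$(sh -e -c 'false | echo foo') || pipe_status=$?

echo "sequence: exit $seq_status, output '$seq_out'"
echo "pipeline: exit $pipe_status, output '$pipe_out'"
```

The sequence exits 1 with no output, while the pipeline exits 0 and prints foo.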
Thanks for pointing me to the place in tup where commands are executed. With that, I was able to write an interposer that does what I want. Essentially, it looks for execle calls of the form '/bin/sh -e -c cmd' and turns them into execve calls of the form '/bin/bash -e -o pipefail -c cmd'.
I like your suggestion about the ^p^ flag, although it seems a little strange that a flag would also change execution to bash. However, it feels like a reasonable way to accomplish this without requiring pipefail to be active for all :-rules. After looking at the code though, another option that seems easy would be to have a tup config option that specifies the command used to execute all :-rules. This may be somewhat frowned upon due to "changing any of these options should not affect the end result of a successful build" (*), but having the default:
updater.command = /bin/sh -e -c
be there, but have the ability to change it to
updater.command = /bin/bash -e -o pipefail -c
would be very convenient, and would provide a lot of flexibility in how tup is run. E.g. a user could just specify bash to get access to the more advanced bash syntax. The downside is that it has the potential to violate the rule (*) for tup config options stated above. A potentially absurd example would be setting
updater.command = /usr/bin/env python -c
which would execute all :-rules in python.
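To make the idea concrete, here is a rough shell sketch of what a configurable command prefix would change. Nothing here is tup code: run_rule is a hypothetical helper, updater.command is only the proposed option name, and sh and bash are looked up on the PATH rather than at fixed /bin paths.

```shell
# run_rule is a hypothetical helper: it runs one rule command under a
# configurable prefix, the way the proposed updater.command option would.
run_rule() {
    prefix=$1
    cmd=$2
    # $prefix is deliberately unquoted so it splits into program + flags.
    $prefix "$cmd"
}

# Current behaviour: the failure of `false` in the pipeline is ignored.
default_status=0
run_rule 'sh -e -c' 'false | echo foo' > /dev/null || default_status=$?

# Proposed alternative prefix: the same rule now fails.
pipefail_status=0
run_rule 'bash -e -o pipefail -c' 'false | echo foo' > /dev/null || pipefail_status=$?

echo "default prefix:  exit $default_status"
echo "pipefail prefix: exit $pipefail_status"
```

The rule text is identical in both calls; only the prefix differs, and the exit status goes from 0 to 1. That is exactly the kind of change in build outcome that the rule (*) is meant to prevent.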
I'm curious about everyone's thoughts about this. I have my hacky solution, which means I'm fine for the time being, but I'd be up for submitting a pull request of a more reasonable implementation of one of these ideas if it seemed worthwhile.
Erik