removing "abandoned outputs" -- revisited


Avinash Baliga

Dec 2, 2013, 3:50:57 AM
to ninja...@googlegroups.com
Hey all,

I wanted to revisit the idea of incrementally removing output files that are no longer generated by the build.
The problem I'm trying to solve is pretty basic: ensure that an incremental build is equivalent to a clean, pristine build.
Example:
  • I pull down a repo with header generation from protobuf, and some C++ files that include the generated header.
  • I build once.  Now the generated headers exist on my disk.
  • I rename the protobuf file, causing the generated header name to be different.  I neglect to update header includes in the C++ files though.
  • I perform an incremental build, and it succeeds.  But it only succeeds because I had old generated headers -- "abandoned outputs" -- lying around on my disk.
  • I upstream my changes, having been satisfied by my local build.
  • The next person to fetch the repo finds it in a broken state.
The fix for this turns out to be really simple in ninja, thanks to the build log.
Just test every entry in the build log (.ninja_log) against the current build graph (build.ninja); if the build log entry is no longer output (or input) by any build command, then delete it.
I have a version of this coded up here: https://github.com/fifoforlifo/ninja/compare/inc_clean
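[For illustration, the core check can be sketched as a standalone Python script outside ninja.  This is a hypothetical sketch, not the actual patch: it assumes the tab-separated v5 .ninja_log format (start, end, mtime, path, hash) and enumerates the current graph with "ninja -t targets all"; the function names are invented.]

```python
import os
import subprocess

def live_targets(build_file="build.ninja"):
    """Every output path known to the current build graph, via
    'ninja -t targets all' (output lines look like 'path: rule')."""
    out = subprocess.check_output(
        ["ninja", "-f", build_file, "-t", "targets", "all"], text=True)
    return {line.split(":", 1)[0] for line in out.splitlines() if ":" in line}

def abandoned_outputs(log_lines, targets):
    """Paths recorded in .ninja_log that no current build edge references.
    Assumes the v5 log format: tab-separated start, end, mtime, path, hash."""
    stale = set()
    for line in log_lines:
        if line.startswith("#"):          # header line, e.g. '# ninja log v5'
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and fields[3] not in targets:
            stale.add(fields[3])
    return stale

def remove_abandoned(log_path=".ninja_log", build_file="build.ninja"):
    """Delete every abandoned output from disk."""
    with open(log_path) as f:
        for path in abandoned_outputs(f, live_targets(build_file)):
            try:
                os.remove(path)
            except FileNotFoundError:
                pass                      # already gone; nothing to do
```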

To be fair, this has been discussed before in https://github.com/martine/ninja/pull/190 , but it didn't make it in.
Previously it looked like folks were concerned about generators that created multiple disconnected build graphs (multiple build.ninja or subninja files) but shared a common .ninja_log.
I think an extra commandline switch to control enablement would solve that.  Does that sound reasonable?

The other aspect is performance.  I haven't tested it, but on the surface it seems minor:
  • In the steady state, we need an additional O(E) node lookups, where E = # of log entries.  No additional file system accesses.
  • After a generator step has been triggered, the first build will take longer due to the (possible) extra file removals.  But if we assume generating is "slow" and "exceptional" compared to a no-op build, then this should be of no concern.
Of course, if the perf is a problem in certain projects, then the commandline switch would allow control there as well.

What do you think?

Thanks,
Avinash Baliga

Nicolas Desprès

Dec 3, 2013, 7:50:28 AM
to Avinash Baliga, ninja-build
It could be a new tool named 'autoclean' or a new option to the 'clean' tool.


--
Nicolas Desprès

Avinash Baliga

Dec 4, 2013, 3:12:39 AM
to ninja...@googlegroups.com, Avinash Baliga
It could be a new tool named 'autoclean' or a new option to the 'clean' tool.

I probably should have mentioned: for this to work, RemoveAbandonedOutputs (RAO) must be called between every re-generate of build.ninja [ninja.RebuildManifest] and the subsequent build [ninja.RunBuild].
That's needed to solve the original problem of old generated headers still on disk.

So even a new tool would not help, unless it performed all the same steps as a normal build.  (and then that tool would replace the normal build)
Pedantic note: I'd prefer not to use 'clean' terminology with RAO.  Clean removes the outputs of a dependency graph within the current build graph, whereas RAO operates between two build graphs (or build states).


More Updates
Today I managed to refine the algorithm by observing that RAO only needs to run after a generate.  So now it only costs 2 extra Stat() calls per null build.
The cost comes from an extra file ".ninja_regen", which must always be newer than "build.ninja".  When ".ninja_regen" is older than "build.ninja", we know that a RebuildManifest() occurred.  Tracking this with a file is required to stay robust against CTRL+C.
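[The timestamp trick can be sketched like this.  It's an illustration in Python of the logic described above, not the actual C++ code; the function names are invented.]

```python
import os

def regen_happened(manifest="build.ninja", marker=".ninja_regen"):
    """True if build.ninja was rewritten since the marker was last touched,
    i.e. a RebuildManifest() occurred and RAO still needs to run."""
    if not os.path.exists(marker):
        return True                      # first build, or marker lost: be safe
    return os.path.getmtime(marker) < os.path.getmtime(manifest)

def mark_rao_done(marker=".ninja_regen"):
    """Touch the marker after RAO, so a null build only pays 2 extra stats."""
    with open(marker, "a"):
        pass
    os.utime(marker, None)
```

Because the marker is only touched after RAO completes, a CTRL+C anywhere in between leaves the marker stale, and the next build re-runs RAO.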

The bulk of the code also refactored nicely into one simple function.  So I'm a lot happier sharing this latest version.

Nico Weber

Dec 4, 2013, 12:00:00 PM
to Avinash Baliga, ninja-build
Hi Avinash,

here are a few points that make me somewhat skeptical about this feature:

* It only cleans up files that ninja has in its .ninja_log, as far as I understand – if you have a build edge that, as far as ninja knows, only writes a stamp file but also creates a bunch of .h files as a side effect, those .h files won't be cleaned up by this feature.

* Many projects have one bot that does continuous full builds (which removes the output directory and deletes all files not known to source control), which already mostly handles this (and other) issues.

* People who like to have a feature like this could write a simple python script walking .ninja_log without having to use code in ninja (sure, this script will break when .ninja_log changes)

* Keeping unused lines in .ninja_log around slows down ninja (https://github.com/martine/ninja/issues/661). I think we might want to garbage-collect these lines somehow eventually, for build performance, which would be much harder to do if this autoclean feature was implemented.

Nico



Sean Silva

Dec 4, 2013, 12:04:31 PM
to Avinash Baliga, ninja...@googlegroups.com
Could this be done in the generator program?

It seems like the natural place to perform this removal of "abandoned outputs" is in the generator program, rather than repeatedly checking per-build.
Or to phrase it another way: in what scenario would an "autoclean" be triggered that is *not* immediately after a build.ninja regeneration? 

-- Sean Silva


Reid Kleckner

Dec 4, 2013, 1:07:21 PM
to Nico Weber, Avinash Baliga, ninja-build
The generator doesn't usually know about the set of outputs it built previously, so I don't think this can be solved without help from ninja.

I think a hybrid approach would be best:
- Add an 'autoclean' tool to ninja which garbage collects the log file *and* deletes the output files on disk
- After build generation, the generator manually runs autoclean

In theory the only time the outputs change is when the generator runs.

However, this still fails to address the stamp file problem, where the stamp file is deleted and the stale .h is left on disk.  If it solves the common case for .pb.h files, it might be worth it.

Allan Odgaard

Dec 4, 2013, 8:36:12 PM
to ninja-build
On 5 Dec 2013, at 1:07, Reid Kleckner wrote:

> The generator doesn't usually know about the set of outputs it built
> previously, so I don't think this can be solved without help from
> ninja.

The generator can already call ninja to ask for all targets (before and
after updating build.ninja).

Previously discussed here:
https://groups.google.com/forum/#!topic/ninja-build/U1MAUvisfR8

The code I am using in my generator to get all targets is the following
(ruby):

def all_targets(buildfile, builddir)
  res = Set.new
  return res unless File.exist? buildfile

  Dir.chdir(File.dirname(buildfile)) do
    targets = %x{ ${TM_NINJA:-ninja} -f #{buildfile.shellescape} -t targets all }
    return nil if $? != 0
    targets.each_line do |line|
      if line =~ /.*(?=:(?! phony))/
        path = $&
        res << path if path =~ /^#{Regexp.escape(builddir)}/ # ignore targets outside builddir
      end
    end
  end

  res
end

Avinash Baliga

Dec 8, 2013, 11:08:19 AM
to ninja...@googlegroups.com
Hi Everyone, thanks for the good discussion.  I think there are a lot of good ideas here, and implementing in the generator is nifty, but ...
I still find there to be several issues with externalizing RAO, as compared to my patch.  Please consider:

#1: If RAO is implemented in the generator, and occurs *after* build.ninja is written out, there is a race condition with CTRL+C.  If I CTRL+C the build immediately after build.ninja is written, but before RAO executes, then on the next incremental build, ninja would consider build.ninja up-to-date, and RAO will be skipped.  This effectively fails to solve my abandoned headers problem.

#2: If the generator implements RAO, it's not removing abandoned output entries from the build log.

#3: If the generator implements RAO *before* build.ninja is written out, race-condition averted.  But now the generator must explicitly track all of its own outputs.  Not a fun problem to have if the generator isn't globally tracking everything already.

#4: Making RAO the generator's responsibility means having to re-implement it N times.  It may be OK for my own personal generator (pynja).  But at my workplace we use CMake, and I'm not interested in figuring out how to implement RAO in the CMake code [even ignoring the other issues].  (Yes, there is an element of human stubbornness here, but I'd rather not have to modify the whole world when it's avoidable :-) )

#5: I could work around all of the above issues by writing a shell script to the effect of: ninja; separate-rao-tool; ninja.  This would give me the correct result and be race-free.  However, in the worst case it would be grossly inefficient.
Imagine I modify a common.h and rename a generated header in the same change, but don't change my #include statements.  The script would fully build (successfully) on the first ninja, since abandoned headers are still on disk; this could take minutes.  Then after RAO, I discover the whole build fails.
In contrast, the proposed patch discovers the build error as quickly as possible.


Also, I'm not averse to adding an extra commandline option to control enablement of this feature; just need to pick name + defaults.  I tend to think most builds would benefit from this, and no correct build would fail from it, so I lean towards 'RAO enabled by default'.


Replying to individual concerns not addressed above:


On Wednesday, December 4, 2013 11:00:00 AM UTC-6, Nico Weber wrote:
Hi Avinash,

here are a few points that make me somewhat skeptical about this feature:

* It only cleans up files that ninja has in its .ninja_log, as far as I understand – if you have a build edge that, as far as ninja knows, only writes a stamp file but also creates a bunch of .h files as a side effect, those .h files won't be cleaned up by this feature.

Agreed.  But I'd argue this is no worse than ninja -t clean.  Both RAO and clean are limited to "well formed" cases where outputs are declared explicitly.
Actually I had an idea for handling "implicit outputs" as you described, but that's a separate topic :)  The basic thought is to have an 'iout' concept similar to 'deps' on every build task.  The 'iout' file would allow a command to list its implicit outputs, which ninja could then use for future dirty checks and for clean (or RAO) functions.  It doesn't solve bootstrap issues, but would be a purely additive feature to ninja.
 
* Many projects have one bot that does continuous full builds (which removes the output directory and deletes all files not known to source control), which already mostly handles this (and other) issues.
 
That's a decent solution, but not very efficient.  At my workplace a 'clean build' could take 15-30 minutes, and there are multiple branches.  So our continuous builds are all incremental.  Typical build time is 1-2 minutes for those.  It's a big time and money saver.

* Keeping unused lines in .ninja_log around slows down ninja (https://github.com/martine/ninja/issues/661). I think we might want to garbage-collect these lines somehow eventually, for build performance, which would be much harder to do if this autoclean feature was implemented.

Thanks for sharing the bug #.  My patch actually addresses this issue!
After performing RAO, I force a build log recompaction that also removes non-existent files (based on a Stat call) from the build log, and then re-open the build log as normal again.
To implement, I added a flag to struct BuildLog called files_must_exist_, which I set to true only after RAO.
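[As a sketch, that recompaction pass amounts to a filter over the log lines: keep the header and any entry whose output still exists on disk.  This is hypothetical Python assuming the v5 tab-separated format; the real change is the files_must_exist_ flag in ninja's C++ BuildLog.]

```python
import os

def recompact(log_lines, exists=os.path.exists):
    """Keep the '# ninja log vN' header and every entry whose output
    (4th tab-separated field) still exists; drop the rest."""
    kept = []
    for line in log_lines:
        if line.startswith("#"):
            kept.append(line)            # preserve the version header
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and exists(fields[3]):
            kept.append(line)
    return kept
```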


On Wednesday, December 4, 2013 11:04:31 AM UTC-6, Sean Silva wrote:
Could this be done in the generator program?

It seems like the natural place to perform this removal of "abandoned outputs" is in the generator program, rather than repeatedly checking per-build.
Or to phrase it another way: in what scenario would an "autoclean" be triggered that is *not* immediately after a build.ninja regeneration? 

-- Sean Silva


The only thing to clarify is that it only needs to Stat the .ninja_regen and build.ninja one extra time per build, to resolve the race against CTRL+C.  Same deal as item #1.  With the patch as written, if you CTRL+C after a regen and then run ninja again, it will realize that the full RAO still needs to be performed.
Other than that, I agree it only needs to be triggered after a build.ninja regeneration.


On Wednesday, December 4, 2013 12:07:21 PM UTC-6, Reid Kleckner wrote:
The generator doesn't usually know about the set of outputs it built previously, so I don't think this can be solved without help from ninja.

I think a hybrid approach would be best:
- Add an 'autoclean' tool to ninja which garbage collects the log file *and* deletes the output files on disk
- After build generation, the generator manually runs autoclean

In theory the only time the outputs change is when the generator runs.

However, this still fails to address the stamp file problem, where the stamp file is deleted and the stale .h is left on disk.  If it solves the common case for .pb.h files, it might be worth it.
 
I kind of like also having the extra tool, but wasn't sure when I'd use it.  The way I have it now, RAO respects 'pretend' mode (ninja -n), which I used as an alternative.
And yup, solving some common cases would be better than none, at least in my case :-)
Thanks for sharing the link to the previous discussion.  This is for sure a nice idea, and I'm impressed you had this working all this time :-)
But please see my concerns for why I still prefer adding this to ninja in some form instead.

Nico Weber

Dec 13, 2013, 11:49:35 AM
to Avinash Baliga, ninja-build
On Sun, Dec 8, 2013 at 8:08 AM, Avinash Baliga <abba...@gmail.com> wrote:
[snip]
Also, I'm not averse to adding an extra commandline option to control enablement of this feature; just need to pick name + defaults.  I tend to think most builds would benefit from this, and no correct build would fail from it, so I lean towards 'RAO enabled by default'.

I know of a build somewhere that basically does:

1. Write a build.ninja
2. Run ninja
3. Write different build.ninja in same place, with different output paths
4. Run ninja

If this is on by default, 4 would delete the build products from 2, right?
 

Avinash Baliga

Dec 15, 2013, 12:23:45 PM
to ninja...@googlegroups.com, Avinash Baliga

I know of a build somewhere that basically does:

1. Write a build.ninja
2. Run ninja
3. Write different build.ninja in same place, with different output paths
4. Run ninja

If this is on by default, 4 would delete the build products from 2, right?

Great point!  Seems like this would happen because the .ninja_log etc. are shared across all *.ninja files in that directory.
So, my solution would be to give each X.ninja a separate .ninja_log-X, .ninja_deps-X, and [new in patch] .ninja_regen-X.
  • If all X.ninja files control non-overlapping sets of outputs, then the above problem is resolved.
  • If multiple X.ninja files produce the same output, I think you could end up in perma-dirty situations when alternating builds between them.  So I'm not sure this is a valid case to consider.
I updated my branch/patch with this, if you're curious to see.

Also, I'd be curious to know which build system does this; I'd guess there are interesting optimizations around this.

Minor side point: Incidentally, this change would also allow multiple .ninja files in one directory to safely be built concurrently.  Currently, appending to .ninja_log uses append-writes via fwrite (using FILE*), but those are buffered within the process and not guaranteed to be atomic.  You could change to using [linux] open(O_APPEND) + write or [windows] CreateFile(FILE_APPEND_DATA) + WriteFile to fix this when building on a local file system; all bets still off on a network FS.
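[On POSIX, the fix amounts to something like the sketch below.  O_APPEND makes the kernel position each write() at the current end of file, so on a local file system concurrent writers don't interleave within a single write; the function name is invented.]

```python
import os

def append_log_entry(path, entry):
    """Append one log line atomically w.r.t. other local writers:
    one write() call per entry, on an O_APPEND descriptor."""
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
    try:
        os.write(fd, entry.encode())     # a single write() per log line
    finally:
        os.close(fd)
```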

Maxim Kalaev

Dec 16, 2013, 11:06:12 AM
to ninja...@googlegroups.com, Avinash Baliga

  • If multiple X.ninja files produce the same output, I think you could end up in perma-dirty situations when alternating builds between them.  So I'm not sure this is a valid case to consider.
For what it's worth, I am using multiple build.<platform>.ninja files to build my project for specific platform configurations.  It works great for avoiding unnecessary rebuilds of, e.g., build tools and auto-generated headers.

Nico Weber

Dec 16, 2013, 1:02:27 PM
to Avinash Baliga, ninja-build
On Sun, Dec 15, 2013 at 9:23 AM, Avinash Baliga <abba...@gmail.com> wrote:

[snip]

Also, I'd be curious to know which build system does this; I'd guess there are interesting optimizations around this.

It's a variant of an existing android ndk build script: https://github.com/awong-dev/ndk/blob/master/build/tools/build-cxx-stl.sh This is called repeatedly, and it overwrites build.ninja each time.

Avinash Baliga

Dec 17, 2013, 3:56:54 AM
to ninja...@googlegroups.com, Avinash Baliga
Maxim: In your case, each build.<platform>.ninja produces unique output files per <platform> right?  That should be all good with RAO.
 
It's a variant of an existing android ndk build script: https://github.com/awong-dev/ndk/blob/master/build/tools/build-cxx-stl.sh This is called repeatedly, and it overwrites build.ninja each time.

Interesting ... It looks like this script would safely create distinct outputs if a different BUILD_DIR (--build-dir) is passed in on each invocation.  Otherwise, thanks to the [ninja] command hash, it would remake all *.o in-place but write distinct outputs to the output directory.  But the script also completely removes (rm -rf) the BUILD_DIR if it had created it, after performing a build, which obliterates the .ninja_log in the process.
So if RAO were introduced and BUILD_DIR already existed before the script were run, the libs would stop being generated.  Otherwise, with a temporary BUILD_DIR, everything would work as normal.

But details aside, strictly speaking, I do see your point that to keep compatibility, RAO would need to be disabled by default.  Just seems like a real bummer :)

Few fallback options then:
  1. Introduce an environment variable like NINJA_RAO, which if set, causes ninja to perform RAO.  Advantage is backwards compat, and transparently working even with "old" generators written before RAO was introduced (e.g. existing CMake versions).
    It's also about a one-minute change (just wrap the RAO call with if (getenv("NINJA_RAO")) {} )
    This is looking like the best option now, IMHO.
  2. Introduce a commandline option to enable RAO, and a similar "rule variable" for generator rules.  Only useful for new generators, plus requires shell scripts that call ninja to pass the extra option.  Might be worth adding for completeness, but not useful for existing generators.
  3. Bump the ninja program version, and start enabling RAO by default if your ninja file says e.g. "ninja_required_version 1.6".  Nicer than #1 in terms of purity (no behavioral dependency on environment), but new generators using latest features would need to take note.
Your thoughts are of course appreciated again.

Avinash Baliga

Dec 17, 2013, 11:02:06 AM
to ninja...@googlegroups.com, Avinash Baliga
Actually, not sure why I thought a commandline option to enable was a "problem".  Posting too late at night I guess.  I'll probably add that when I get the chance.

Maxim Kalaev

Dec 18, 2013, 3:03:27 PM
to ninja...@googlegroups.com, Avinash Baliga


On Tuesday, December 17, 2013 10:56:54 AM UTC+2, Avinash Baliga wrote:
Maxim: In your case, each build.<platform>.ninja produces unique output files per <platform> right?  That should be all good with RAO.
No, the output file names are overlapping; the tool parameters and outputs are different.

Avinash Baliga

Dec 18, 2013, 11:13:40 PM
to ninja...@googlegroups.com, Avinash Baliga
Ah gotcha.  That should still be fine: RAO wouldn't remove any files, since they're all still referenced.
But per above discussion I left the default behavior alone now (RAO disabled).  Patch/pull request is updated.