faster implicit dependencies with depslog

Evan Martin

Jan 6, 2013, 6:54:51 PM
to ninja...@googlegroups.com
I've pushed an experimental branch that adds a new feature to make dependency loading much faster:
  https://github.com/martine/ninja/tree/dep-pipe
I haven't merged it yet since there are a few details to work out, particularly the build-time syntax.

The basic idea is this: rather than have subcommands write out depfiles when they run and load these files on subsequent Ninja runs, instead greedily parse the dependency output as it's written and store it in a compact format for quicker loading on the next Ninja run.

On Unix this means telling gcc to output its dependency information via /dev/fd/$NINJA_DEPS, which is a pipe specially set up in subcommands to be read and parsed by Ninja.  On Windows, we already had code to parse the /showIncludes output from MSVC, so no pipes are needed and we can drop the dependency helper.
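
To make the pipe mechanism concrete, here's a rough Python sketch of the idea (illustrative only; the real implementation is C++ inside Ninja, and the helper below is made up, though the gcc -MD/-MF flags and the /dev/fd trick are the actual mechanism described above):

    # Sketch: run a compile with its depfile routed through a pipe,
    # and consume the dependency text as the compiler writes it.
    import os
    import subprocess

    def run_with_dep_pipe(compile_cmd):
        read_fd, write_fd = os.pipe()
        # Export the fd number so a command could reference /dev/fd/$NINJA_DEPS;
        # here we substitute the number into the flag directly.
        env = dict(os.environ, NINJA_DEPS=str(write_fd))
        child = subprocess.Popen(
            compile_cmd + ['-MD', '-MF', '/dev/fd/%d' % write_fd],
            env=env, pass_fds=(write_fd,))
        os.close(write_fd)                   # parent keeps only the read end
        chunks = []
        while True:
            data = os.read(read_fd, 65536)   # greedily consume as it arrives
            if not data:
                break
            chunks.append(data)
        os.close(read_fd)
        child.wait()
        return b''.join(chunks)              # depfile text, never written to disk

The point is just that the dependency text flows straight into the Ninja process instead of landing in thousands of little .d files.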

Streaming all this information into the Ninja process during the build allows a very compact format, which is quick to load by itself and especially saves time on Windows because all dependency information can be loaded from a single file.  The file format, briefly, is a sequence of records: either strings (paths), or dependency lists, which are sets of integers that refer to the paths.
The format (which is internal to Ninja and only interesting to look at if you'd like to comment on data structures) is documented further here:

I haven't run a Chrome build yet but I estimate it's about 40x less data to load.  I also verified the code to work with a CMake-generated Clang build (where I manually hacked up the resulting build files to trigger the new behavior), where it was faster but not in an important way (on my laptop, no-op Clang builds are a handful of milliseconds either way).
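
As a purely hypothetical illustration of that "path records plus integer-index dependency records" layout (the actual on-disk encoding in the branch may differ; the tag bit and field widths below are invented), loading such a file back could look like this:

    # Hypothetical reader for a record-based deps log: path records are
    # appended to a list, deps records refer to paths by index.
    import struct

    def load_deps_log(filename):
        paths = []   # record index -> path string
        deps = {}    # output path index -> list of input path indices
        with open(filename, 'rb') as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break
                size, = struct.unpack('<I', header)
                is_deps = bool(size >> 31)           # invented type tag
                payload = f.read(size & 0x7FFFFFFF)
                if not is_deps:
                    paths.append(payload.decode())   # a path record
                else:
                    ints = struct.unpack('<%dI' % (len(payload) // 4), payload)
                    deps[ints[0]] = list(ints[1:])   # output index, then inputs
        return paths, deps

One sequential read of a single file like this replaces opening and parsing every individual .d file on startup.
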


There are a few open questions that I'd like feedback on.

One is how to specify in the build file that a command supports this special behavior.  It seems reasonable to me to keep the "old" (depfile-parsing-based) behavior around for projects that have special needs in how their dependencies work.  So now there are three modes for dependencies: depfiles, the pipe used on Unix, and MSVC-specific /showIncludes parsing for Windows.  It seems these could be selected via a per-rule (or per-build) attribute, but I'm not sure what to call it.  "deps=msvc" and "deps=pipe"?  (It seems weird to call the Unix one gcc-specific since it works with other compilers etc.)

Second, right now as designed this only works for commands that have a single output.  That behavior is the reality for 99% of the code Ninja builds (C code), so maybe it's ok to use depfiles for the rest of the cases, but it's also maybe appealing to have a single unified format for all dependencies.


As it currently exists the branch seems to work on Linux and Windows but there are a few important extra pieces (like adding versioning information to the file so that we can change the format later if necessary, or preventing the file size from continually growing across builds) that I haven't written yet.  So if you do play with it, don't rely on it yet.  Otherwise, take a look at the change to configure.py here to see how to use the new mode (note the "special" keyword, subject to change):

Scott Graham

Jan 7, 2013, 11:46:11 AM
to ninja...@googlegroups.com
On Sun, Jan 6, 2013 at 3:54 PM, Evan Martin <mar...@danga.com> wrote:
I've pushed an experimental branch that adds a new feature to make dependency loading much faster:
  https://github.com/martine/ninja/tree/dep-pipe
I haven't merged it yet since there are a few details to work out, particularly the build-time syntax.

The basic idea is this: rather than have subcommands write out depfiles when they run and load these files on subsequent Ninja runs, instead greedily parse the dependency output as it's written and store it in a compact format for quicker loading on the next Ninja run.

On Unix this means telling gcc to output its dependency information via /dev/fd/$NINJA_DEPS, which is a pipe specially set up in subcommands to be read and parsed by Ninja.  On Windows, we already had code to parse the /showIncludes output from MSVC, so no pipes are needed and we can drop the dependency helper.

Streaming all this information into the Ninja process during the build allows a very compact format, which is quick to load by itself and especially saves time on Windows because all dependency information can be loaded from a single file.  The file format, briefly, is a sequence of records: either strings (paths), or dependency lists, which are sets of integers that refer to the paths.
The format (which is internal to Ninja and only interesting to look at if you'd like to comment on data structures) is documented further here:

I haven't run a Chrome build yet but I estimate it's about 40x less data to load.  I also verified the code to work with a CMake-generated Clang build (where I manually hacked up the resulting build files to trigger the new behavior), where it was faster but not in an important way (on my laptop, no-op Clang builds are a handful of milliseconds either way).


That's great. I'll give it a try on Windows Chrome and report back timings.
 

There are a few open questions that I'd like feedback on.

One is how to specify in the build file that a command supports this special behavior.  It seems reasonable to me to keep the "old" (depfile-parsing-based) behavior around for project that have special needs in how their dependencies work.  So now there are three modes for dependencies: depfiles, the pipe used on Unix, and MSVC-specific /showIncludes parsing for Windows.  It seems these could be selected via a per-rule (or per-build) attribute, but I'm not sure what to call it.  "deps=msvc" and "deps=pipe"?  (It seems weird to call the Unix one gcc-specific since it works with other compilers etc.)

Those names seem fine to me. (Other suggestions: deps=showincludes, deps=unix or deps=linux or whatever system it is that offers /dev/fd/...)
 

Second, right now as designed this only works for commands that have a single output.  That behavior is the reality for 99% of the code Ninja builds (C code) so maybe it's ok to use depfiles for the rest of cases, but it's also maybe appealing to have a single unified format for all dependencies.

Assuming you plan to keep support for all 3 styles, it seems fine for depslog to have some limitations. If .d support could be entirely dropped, then that would seem like more motivation to unify.

Frank Miller

Jan 7, 2013, 4:27:09 PM
to ninja...@googlegroups.com
I don't have an opinion on either of the questions you wanted feedback
on. I'll keep them in mind as I keep working but for now, any of the
ideas you suggested seem fine to me.

What I can share is some numbers. I work on a fairly large Linux-based
project and my coworkers typically build off NFS mounts. When the caches
are cold and the network traffic is high, loading the depfiles can take
a significant amount of time. I have seen the 'depfile load' line in the
'-d stats' output go above 40000 (yep, that's 40 seconds). I tried to do a fair
comparison by rebooting and then running no-op builds in two different
build trees. They both have the same source tree, however, so I ran
dep-pipe first to give it a disadvantage. Here are the results.

dep-pipe:

    ninja: no work to do.
    metric                  count   avg (us)        total (ms)
    .ninja parse            2       36260.5         72.5
    canonicalize str        34208   0.3             8.6
    canonicalize path       34208   0.1             4.2
    lookup node             39095   0.2             5.9
    .ninja_log load         1       18496.0         18.5
    .ninja_deps load        1       148398.0        148.4
    node stat               4554    571.9           2604.6

    path->node hash load 0.96 (5882 entries / 6151 buckets)

master:

    ninja: no work to do.
    metric                  count   avg (us)        total (ms)
    .ninja parse            2       37993.0         76.0
    canonicalize str        34168   0.3             8.8
    canonicalize path       114429  0.2             21.5
    lookup node             114429  0.2             27.9
    .ninja_log load         1       40243.0         40.2
    node stat               5544    329.0           1823.8
    depfile load            1592    920.0           1464.7

    path->node hash load 0.95 (5862 entries / 6151 buckets)

So .ninja_deps loads 10 times faster than the depfiles and there are 20%
fewer files to stat. These two steps are by far the bottleneck, so I give
this branch a big two thumbs up.

Thanks for sharing. Your build system has made my work fun again!

Frank

Maxim Kalaev

Jan 7, 2013, 5:12:41 PM
to ninja...@googlegroups.com, mar...@danga.com

On Monday, January 7, 2013 1:54:51 AM UTC+2, Evan Martin wrote:
I've pushed an experimental branch that adds a new feature to make dependency loading much faster:
  https://github.com/martine/ninja/tree/dep-pipe
I haven't merged it yet since there are a few details to work out, particularly the build-time syntax.

A nice idea, and clever log syntax!
Syntax proposals: deps={make,log,bin,ninja,text}

Scott Graham

Jan 7, 2013, 9:21:29 PM
to ninja...@googlegroups.com
On Mon, Jan 7, 2013 at 1:27 PM, Frank Miller <christopher...@gmail.com> wrote:
I don't have an opinion on either of the questions you wanted feedback
on. I'll keep them in mind as I keep working but for now, any of the
ideas you suggested seem fine to me.

What I can share is some numbers. I work on a fairly large linux based
project and my coworkers typically build off nfs mounts. When the caches
are cold and the network traffic is high, loading the depfiles can take
a significant amount of time. I have seen the 'depfile load' line in the
'-d stats' go above 40000 (yep, thats 40 seconds). I tried to do a fair
comparison by rebooting and then running noop builds in two different
build trees. They both have the same source tree however so I ran
dep-pipe first to give it a disadvantage. Here are the results.

dep-pipe:
... 
   node stat               4554    571.9           2604.6 

master:
... 
node stat               5544    329.0           1823.8
...

and there are 20%
less files to stat.


Should there be fewer files to stat? (Shouldn't the nodes already have been unique? Maybe stronger normalization? Or am I being thick?)

Frank Miller

Jan 8, 2013, 10:54:46 AM
to ninja...@googlegroups.com
Should there be less files to stat? (Shouldn't the nodes already have been unique? Maybe stronger normalization? Or am I being thick?)

I thought it was because ninja did not have to stat all the *.d files, but this does not add up. I counted 1592 *.d files when I built with master and 0 when built with dep-pipe. Also, now that I think about it, I'm not sure it makes sense to stat the *.d files anyway. Perhaps I was being thick.

What is "stronger normalization"?

Frank

Maxim Kalaev

Jan 8, 2013, 11:54:04 AM
to ninja...@googlegroups.com

On Tuesday, January 8, 2013 5:54:46 PM UTC+2, Frank Miller wrote:

I thought it was because ninja did not have to stat all the *.d files but this does not add up. I counted 1592 *.d files when I built with master and 0 when built with dep-pipe. Also, now that I think about it, Im not sure it makes sense to stat the *.d files anyways.
In theory, if a depfile is more recent than the target, then the target should be rebuilt for safety. That covers the case where a previous incremental build was terminated, leaving an incomplete .d file (missing relevant dependencies).
But as I remember, ninja only stats depfiles for targets with 'restat' enabled. Is that your case?

The numbers may not add up because it's not always necessary to stat all dependencies to conclude that a target is out of date. For example, if a single input is found to be out of date, it's not necessary to stat the rest of the inputs.

Frank Miller

Jan 8, 2013, 12:06:16 PM
to ninja...@googlegroups.com
In theory, if a depfile is more recent when the target then target should be rebuilt for safety. That covers the case where a previous incremental build has been terminated leaving an incomplete .d file (missing relevant dependencies).
But as I remember ninja only stats depfiles for targets with 'restat' enabled. Is this your case?

I'm using CMake as the generator, with manually edited files to test the new feature. 'restat' is only enabled for the CUSTOM_COMMAND rule, which doesn't use a depfile.
 
The numbers may not add up because it's not always necessary to stat all dependencies to conclude that a target is out of date. Such if a single input is found to be out of date, it's not necessary to stat the rest of inputs.

All of my tests were on no-op builds, so it should have to stat everything to figure out that nothing is out of date.

Evan Martin

Jan 8, 2013, 12:08:03 PM
to ninja...@googlegroups.com
On Mon, Jan 7, 2013 at 1:27 PM, Frank Miller <christopher...@gmail.com> wrote:
dep-pipe:

    ninja: no work to do.
    metric                  count   avg (us)        total (ms)
    .ninja parse            2       36260.5         72.5
    canonicalize str        34208   0.3             8.6
    canonicalize path       34208   0.1             4.2
    lookup node             39095   0.2             5.9
    .ninja_log load         1       18496.0         18.5
    .ninja_deps load        1       148398.0        148.4
    node stat               4554    571.9           2604.6

    path->node hash load 0.96 (5882 entries / 6151 buckets)

Would you mind telling me the size of the $builddir/.ninja_deps file?

NFS causes latency when opening files so I would expect it to be slower, but in your numbers above it's nearly 10x slower than opening the build log, which is also a single file.  I guess the difference must be in the relative sizes of the files?

Thanks for sharing. Your build system has made my work fun again!

I'm glad to hear it!

PS: if you're still using this branch, you might want to keep pulling down new code as I'm still finishing it and fixing bugs.  It also currently does the wrong thing if your build is interrupted at the wrong time, so I wouldn't trust it yet.

Frank Miller

Jan 8, 2013, 1:03:43 PM
to ninja...@googlegroups.com
Would you mind telling me the size of the $builddir/.ninja_deps file?

NFS causes latency when opening files so I would expect it to be slower, but in your numbers above it's nearly 10x slower than opening the build log, wihch is also a single file.  I guess the difference must be in the relative sizes of the files?

.ninja_deps 1001K
.ninja_log 217K

PS: if you're still using this branch, you might want to keep pulling down new code as I'm still finishing it and fixing bugs.  It also currently does the wrong thing if your build is interrupted at the wrong time, so I wouldn't trust it yet.

Will do.

Frank

Evan Martin

Jan 9, 2013, 2:02:52 PM
to ninja...@googlegroups.com
On Tue, Jan 8, 2013 at 10:03 AM, Frank Miller <christopher...@gmail.com> wrote:
Would you mind telling me the size of the $builddir/.ninja_deps file?

NFS causes latency when opening files so I would expect it to be slower, but in your numbers above it's nearly 10x slower than opening the build log, wihch is also a single file.  I guess the difference must be in the relative sizes of the files?

.ninja_deps 1001K
.ninja_log 217K

To summarize, your build is 4.5k files and 1mb deps.  I had estimated that Chrome (~35k files, so 8x larger) should only be about 2mb of deps, so that file still seems a bit big.  I *think* it is this bug I fixed after my announcement:
so it should just fix itself, so I'm not gonna worry about it more than writing this mail where I'm trying to convince myself it'll be fine.   :)

If you wouldn't mind one more stat, could you "time cat path/to/builddir/.ninja_deps > /dev/null" to estimate how long it takes to read the file via NFS?  (NFS has been a recurring nemesis in my career, it's funny to see it turn up here.)

Evan Martin

Jan 10, 2013, 12:36:48 PM
to ninja...@googlegroups.com
On Sun, Jan 6, 2013 at 3:54 PM, Evan Martin <mar...@danga.com> wrote:
On Unix this means telling gcc to output its dependency information via /dev/fd/$NINJA_DEPS, which is a pipe specially set up in subcommands to be read and parsed by Ninja.

I think I've convinced myself this won't work in the presence of distributed builds (distcc or Google's internal one), and probably won't make ccache too happy either.  I need to experiment with how those systems handle the -MF flag to see if a less audacious plan, like having Ninja read the file off disk and immediately delete it, will work.
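
For what it's worth, that less audacious plan is easy to picture; here's an illustrative Python sketch (the helper name is made up and this is not the branch's code):

    # Run the compiler with an ordinary -MF depfile, then consume the
    # .d file and delete it immediately after the command finishes.
    import os
    import subprocess

    def run_and_consume_depfile(compile_cmd, depfile):
        subprocess.check_call(compile_cmd + ['-MD', '-MF', depfile])
        with open(depfile) as f:
            text = f.read()      # record these deps in the compact log
        os.unlink(depfile)       # no loose .d files left behind
        return text

Whether distcc and ccache are happy with that is exactly the thing to experiment with.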

Frank Miller

Jan 10, 2013, 8:02:58 PM
to ninja...@googlegroups.com
On Wed, Jan 9, 2013 at 1:02 PM, Evan Martin <mar...@danga.com> wrote:

To summarize, your build is 4.5k files and 1mb deps.  I had estimated that Chrome (~35k files, so 8x larger) should only be about 2mb of deps, so that file still seems a bit big.  I *think* it is this bug I fixed after my announcement:
so it should just fix itself, so I'm not gonna worry about it more than writing this mail where I'm trying to convince myself it'll be fine.   :)

If you wouldn't mind one more stat, could you "time cat path/to/builddir/.ninja_deps > /dev/null" to estimate how long it takes to read the file via NFS?  (NFS has been a recurring nemesis in my career, it's funny to see it turn up here.)

$ time cat .ninja_deps > /dev/null

real    0m0.301s
user    0m0.000s
sys     0m0.002s

$ time cat .ninja_log > /dev/null

real    0m0.200s
user    0m0.000s
sys     0m0.001s

Warning: I ran the above commands a few minutes apart, and days after my initial timings. (NFS has only just begun to be my nemesis. Sorry to make you think about it.)

I also got the latest and built a brand new tree. The .ninja_deps is now 601K. This still sounds large based on your comparison to Chrome, but it might be right. Our code style makes for long include paths in the source, and it's quite likely that our dependency graph is nastier.

Frank

Frank Miller

Jan 11, 2013, 10:50:22 AM
to ninja...@googlegroups.com
On Mon, Jan 7, 2013 at 8:21 PM, Scott Graham <sco...@chromium.org> wrote:
On Mon, Jan 7, 2013 at 1:27 PM, Frank Miller <christopher...@gmail.com> wrote:
...
dep-pipe:
... 
   node stat               4554    571.9           2604.6 

master:
... 
node stat               5544    329.0           1823.8
...

and there are 20%
less files to stat.


Should there be less files to stat? (Shouldn't the nodes already have been unique? Maybe stronger normalization? Or am I being thick?)


This "20% less files to stat" business may have been (at least partially) an implementation error on my part. When I modified the CMake generated rules, I didn't add a 'special = ...' line. When I do that, incremental builds now actually work, and I stat 5564 files. This is now slightly larger than master so its still not clear what going on but for some reason I feel more comfortable with the situation.

Frank

Maxim Kalaev

Jan 11, 2013, 11:28:49 AM
to ninja...@googlegroups.com
If you are building on Linux, I can suggest running both variants under 'strace -estat64' and comparing the traces.

--

Maxim

Frank Miller

Jan 15, 2013, 9:38:34 PM
to ninja...@googlegroups.com

If you are building on linux, I can suggest running both variants under 'strace -estat64', and compare the traces.

I ran 'strace -estat' and compared the output as you suggested, and the extra files were internal to my CMake installation. So I guess the two trees were configured differently somehow. I repeated the experiment with two brand new build trees and now the number of stats is exactly the same. No issue here.

Thanks for the suggestion. I have been meaning to learn how to use strace for years.

Maxim Kalaev

Jan 17, 2013, 9:17:16 AM
to ninja...@googlegroups.com, mar...@danga.com
You can even keep the depfiles intact, treating your log as a big 'pch'. This could help with debugging the build when something goes wrong.

Evan Martin

Jan 17, 2013, 11:44:34 AM
to ninja...@googlegroups.com
I have been worrying that it could lead to more confusion, in that there would be extra .d files sitting around that aren't actually used -- for example, deleting or modifying them won't affect the build.  (Another alternative, to examine the .d files on startup anyway, would mean Ninja has to stat them all on every build.  This was the problem I had with the otherwise nice patch written earlier that would merge .d files at build time.)

Nico Weber

Feb 13, 2013, 7:06:09 PM
to ninja...@googlegroups.com
On Sun, Jan 6, 2013 at 3:54 PM, Evan Martin <mar...@danga.com> wrote:
I finally had the opportunity to take a quick look at this. On my Mac
Pro, empty build time goes from 1.59s to 0.83s – the first time an
empty build took less than 1 second on this machine. (It took 1.3s not
too long ago, but Chrome has grown. Also, current ninja trunk is 5%
slower than the 1.1 binary – possibly because I'm building with a
newer clang, and that might have worse codegen.) Even better, after
running `purge` (which purges the disk cache), an empty build is now
1.4s instead of > 20s previously. (I didn't measure this with regular
ninja, and switching back from deps-log requires a full rebuild; I'll
measure this later.)

Breakdown:

hummer:src thakis$ time ~/src/ninja/ninja -C out/Release/ chrome -d stats
ninja: Entering directory `out/Release/'
ninja: no work to do.
metric                  count   avg (us)        total (ms)
.ninja parse            736     929.5           684.1
canonicalize str        137629  0.3             38.8
canonicalize path       137629  0.2             22.0
lookup node             172842  0.2             37.6
.ninja_log load         1       61502.0         61.5
.ninja_deps load        1       57712.0         57.7
node stat               42081   4.5             188.2


Also, there's some evidence that the .d files are getting too large
for the default disk cache size for chromium (
https://groups.google.com/a/chromium.org/d/topic/chromium-dev/WXIhb6S9Ly4/discussion
). I wasn't able to reproduce this myself, but I'll take a look at
this on bauerb's box next week.

From my early results, this looks very very promising!

Nico

Evan Martin

Feb 13, 2013, 8:10:45 PM
to Nico Weber, ninja...@googlegroups.com
If you have it handy, can you tell me how large the out/Release/.ninja_deps file is?
Also "strings out/Release/.ninja_deps | wc".
 

Scott Graham

Feb 14, 2013, 1:20:18 PM
to Nico Weber, ninja...@googlegroups.com
Thanks to Nico's kick I finally tried to test this on Windows. Sorry for the delay.

Unfortunately Chrome relies on the "-e" that was in "-t msvc" to set the compiler environment (for different bitnesses in one build), so it can't currently build without some surgery. I'm not sure yet what the best way to fix that problem is, or whether that really belongs in ninja or not.

On Wed, Feb 13, 2013 at 4:06 PM, Nico Weber <tha...@chromium.org> wrote:

Evan Martin

Feb 15, 2013, 11:58:48 AM
to Scott Graham, ninja...@googlegroups.com, Nico Weber

For a quick test you can probably just hack out the showincludes parsing from msvc-helper. If it passes the output through to ninja, then everything else should work. I was hoping you had a bright idea about the env stuff. :) Maybe just integrate it into ninja vars, I guess.

brevity due to phone

Scott Graham

Feb 15, 2013, 2:00:26 PM
to Evan Martin, ninja...@googlegroups.com, Nico Weber
Aha, good idea. Tested on a Z620 (but on a moderate-speed HDD, not an SSD).

.d files version as a baseline:
no-op with dropped cache: 3m57.046s
no-op hot cache: 0m4.078s

dep-pipe branch:
no-op with dropped cache: 1m38.625s
no-op hot cache: 0m1.984s
02/15/2013  10:36 AM         1,003,372 .ninja_deps
02/15/2013  10:36 AM         1,429,014 .ninja_log
c:\src\cr\src>strings out/Release/.ninja_deps | wc
  13487   13491  868513

and the breakdown for hot cache no-op:

c:\src\cr\src>tim d:\src\ninja\ninja.exe -C out\Release chrome -d stats
ninja: Entering directory `out\Release'
ninja: no work to do.
metric                  count   avg (us)        total (ms)
.ninja parse            960     923.2           886.3
canonicalize str        249598  0.0             0.2
canonicalize path       249598  0.0             0.1
lookup node             263084  0.0             0.4
.ninja_log load         1       16595.0         16.6
.ninja_deps load        1       9774.0          9.8
node stat               32264   36.6            1182.0

path->node hash load 0.70 (45589 entries / 65536 buckets)


Overall, looks great: ~50% faster.

Scott Graham

Feb 15, 2013, 2:12:11 PM
to Evan Martin, ninja...@googlegroups.com, Nico Weber
Hmm, sorry, ignore these results. I must have messed up the showincludes hack, as there are very few headers in .ninja_deps.

Scott Graham

Feb 15, 2013, 5:43:08 PM
to Evan Martin, ninja...@googlegroups.com, Nico Weber
With a bit more hacking, correct numbers for dep-pipe:

c:\src\cr\src>killcache c
c:\src\cr\src>tim d:\src\ninja\ninja.exe -C out\Release chrome -d stats
ninja: Entering directory `out\Release'
ninja: no work to do.
metric                  count   avg (us)        total (ms)
.ninja parse            960     12956.4         12438.2
canonicalize str        249598  0.0             0.6
canonicalize path       249598  0.0             0.3
lookup node             277383  0.0             2.9
.ninja_log load         1       308606.0        308.6
.ninja_deps load        1       1845782.0       1845.8
node stat               45605   1868.6          85217.6

path->node hash load 0.90 (58834 entries / 65536 buckets)

real: 1m39.937s

hot cache:
c:\src\cr\src>tim d:\src\ninja\ninja.exe -C out\Release chrome -d stats
ninja: Entering directory `out\Release'
ninja: no work to do.
metric                  count   avg (us)        total (ms)
.ninja parse            960     918.4           881.7
canonicalize str        249598  0.0             0.2
canonicalize path       249598  0.0             0.1
lookup node             277383  0.0             0.8
.ninja_log load         1       16642.0         16.6
.ninja_deps load        1       33110.0         33.1
node stat               45605   35.6            1622.9

path->node hash load 0.90 (58834 entries / 65536 buckets)

real: 0m2.532s

c:\src\cr\src>dir out\release\.ninja*
02/15/2013  12:44 PM         8,821,937 .ninja_deps
02/15/2013  12:44 PM         1,418,184 .ninja_log

c:\src\cr\src>strings out/Release/.ninja_deps | wc
  27786   27790 1678661


So, still a definite improvement over loose .d files.

Scott Graham

Feb 15, 2013, 6:01:05 PM
to Evan Martin, ninja...@googlegroups.com, Nico Weber
Meant to add: the stat time on an SSD tree for a similar number of files is ~400ms on my machine, so just parsing .ninja files will be the majority of the time now! :)

Nico Weber

Feb 18, 2013, 9:25:29 AM
to Scott Graham, Evan Martin, ninja...@googlegroups.com
(It looks like the dep-pipe branch disappeared. Is dep-pipeless the
branch to use instead?)

Evan Martin

Feb 18, 2013, 6:31:31 PM
to Nico Weber, Scott Graham, ninja...@googlegroups.com
Yeah.  Same idea, but without the "tell gcc to write into /dev/fd/3" pipe trickiness.  It turns out removing that removed a lot of the other complexity of the branch.

Nico Weber

Feb 25, 2013, 10:23:31 AM
to Evan Martin, Scott Graham, ninja...@googlegroups.com
More numbers, this time with the deps-pipeless branch:

Warm cache:
$ time ninja -C out/Release/ chrome
ninja: Entering directory `out/Release/'
ninja: no work to do.

real 0m1.138s
user 0m0.803s
sys 0m0.329s
$ time ~/src/ninja/ninja -C out_deps/Release/ chrome
ninja: Entering directory `out_deps/Release/'
ninja: no work to do.

real 0m0.459s
user 0m0.339s
sys 0m0.117s

Cold cache:
$ purge
$ time ninja -C out/Release/ chrome
ninja: Entering directory `out/Release/'
ninja: no work to do.

real 0m56.609s
user 0m1.220s
sys 0m0.956s
$ time ~/src/ninja/ninja -C out_deps/Release/ chrome
ninja: Entering directory `out_deps/Release/'
ninja: no work to do.

real 0m0.836s
user 0m0.375s
sys 0m0.152s

(57s vs 0.8s!)

.ninja_deps size:

$ strings out_deps/Release/.ninja_deps | wc
10590 10593 695026


(However, after the first full build with deps, the next build took one
more iteration to end up in the "nothing to do" state. But after that,
it seemed to behave every time.)

Nico Weber

Feb 25, 2013, 10:24:05 AM
to Evan Martin, Scott Graham, ninja...@googlegroups.com
Forgot to say, this is with today's ninja repo
(6dc4d5d9c10bba5aec1cb8861858779cb096a01c).

Nico Weber

Feb 26, 2013, 4:57:39 AM
to Evan Martin, Scott Graham, ninja...@googlegroups.com
Maxim pointed out that I didn't re-run purge before running
deps-pipeless. With purging, the time becomes 1s-3s (some variance)
instead of 0.8s, compared to 55s-57s with ninja 1.1.0.

Maxim Kalaev

Mar 31, 2013, 2:51:16 AM
to ninja...@googlegroups.com
(For reference.)
I've run some tests on my Linux host with ext4 & a standard HD to confirm that stat-ing files is indeed much faster than reading them when the VFS caches are cold.
The results are consistent with Nico's: 90 sec to read all the dep files cold, 3.3 sec to just stat those files.

-------------
host201: leia> find -type f -name \*.d -o -name \*.dd > lst
host201: leia> wc -l lst
12822 lst

-------------
host201: leia> cat t.py
#!/usr/bin/python
# Usage: t.py {s|r} <file listing one path per line>
#   s: stat() every listed path;  r: open and read every listed path

import os
import sys
for f in file(sys.argv[2]):
    if sys.argv[1] == 's':
        os.stat(f.rstrip())
    elif sys.argv[1] == 'r':
        f = file(f.rstrip())
        f.read()

-------------
host201: leia> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
3
host201: leia> time python t.py r lst

real    1m28.301s
user    0m0.260s
sys     0m0.930s
host201: leia> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
3
host201: leia> time python t.py r lst

real    1m28.860s
user    0m0.360s
sys     0m0.830s

host201: leia> time python t.py r lst

real    0m0.377s
user    0m0.150s
sys     0m0.210s
host201: leia> time python t.py r lst

real    0m0.367s
user    0m0.200s
sys     0m0.160s

-------------

host201: leia> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
3
host201: leia> time python t.py s lst

real    0m3.618s
user    0m0.080s
sys     0m0.160s
host201: leia> sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
3
host201: leia> time python t.py s lst

real    0m3.067s
user    0m0.070s
sys     0m0.160s
host201: leia> time python t.py s lst

real    0m0.095s
user    0m0.080s
sys     0m0.010s
host201: leia> time python t.py s lst

real    0m0.096s
user    0m0.050s
sys     0m0.040s

