Failures when running ninja in parallel


johndo...@gmail.com

Mar 18, 2015, 6:40:04 PM
to ninja...@googlegroups.com
Good afternoon everyone,

A quick question for you: is running multiple instances of ninja, in parallel, using the same build directory, supported?

I ask because I'm seeing behavior that would lead me to believe it is not. Here's my use case:

1) Generate an out-of-source-tree build directory using CMake that contains ~100 targets, nearly all of which are archives, a few of which are shared objects.
2) Spin up N ninja instances (currently 8, but could scale up to 16), with each instance of ninja responsible for compiling a specific archive. Do this until all the archives are compiled.
3) Spin up M ninja instances (currently 3) with each instance responsible for compiling a specific shared object. Each .so pulls in some subset of the compiled archives from Step 2.
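For concreteness, the spin-up in Step 2 looks roughly like the Python sketch below. The target names and the build path are illustrative, and every ninja process shares the same build directory, which is the pattern this thread is about:

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess

ARCHIVES = ["libA.a", "libB.a", "libC.a"]  # hypothetical target names

def build_archive(target, run=subprocess.run):
    # One ninja process per archive; all of them share one build
    # directory (the path below is illustrative).
    return run(["ninja", "-C", "/project/build/el7/allarchives", target],
               check=True)

def build_all(targets, workers=8, run=subprocess.run):
    # N concurrent ninja instances, as described in Step 2.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda t: build_archive(t, run), targets))
```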

When running multiple instances of ninja, I notice the following showing up in my output (emphasis added):

ninja: Entering directory `/project/build/el7/allarchives'
ninja: warning: premature end of file; recovering
[1/1562] /usr/lib64/ccache/c++    -DCODE_INLINE -DCSTDMF_IMPORT -DMF_SERVER -DMF_USE_ASSERTS -DSCRIPT_PYTHON  -DUSE_OPENSSL -D__linux__ -Dlinux=1 -Dunix=1 -std=c++11 -msse3 -Wno-deprecated -fPIC -DHK_CONFIG_SIMD=2 -pipe -O3 -DNDEBUG -I/project/source/pub_headers -I/project/ext_api -MMD -MT libA/CMakeFiles/libA.dir/libAFile1.cpp.o -MF libA/CMakeFiles/libA.dir/libAFile1.cpp.o.d -o libA/CMakeFiles/libA.dir/libAFile1.cpp.o -c /project/source/libraries/libA/libAFile1.cpp
[2/1562] /usr/lib64/ccache/c++    -DCODE_INLINE -DCSTDMF_IMPORT -DMF_SERVER -DMF_USE_ASSERTS -DSCRIPT_PYTHON  -DUSE_OPENSSL -D__linux__ -Dlinux=1 -Dunix=1 -std=c++11 -msse3 -Wno-deprecated -fPIC -DHK_CONFIG_SIMD=2 -pipe -O3 -DNDEBUG -I/project/source/pub_headers -I/project/ext_api -MMD -MT libA/CMakeFiles/libA.dir/libAFile2.cpp.o -MF libA/CMakeFiles/libA.dir/libAFile2.cpp.o.d -o libA/CMakeFiles/libA.dir/libAFile2.cpp.o -c /project/source/libraries/libA/libAFile2.cpp
*snip* *snip*

Compilation seems to continue though, so it doesn't seem to be a fatal situation.

When I go to build the .so files in parallel in Step 3 (each of which depends on a different large subset of the archives built in Step 2), ninja will sometimes completely ignore the already-built archives and rebuild them instead. This, of course, proves problematic because the instances will occasionally try to rebuild the same archive at the same time, leading to build failures.

I spent a few hours reading through the Ninja documentation, using Google (surprisingly, few results show up for practically anything ninja-related!), and eventually grabbing the source from GitHub and reading through that. I suspect this may have something to do with multiple instances of ninja reading from and writing to .ninja_deps? Any ideas on how I can resolve or work around this issue?

Thank you,
John

Evan Martin

Mar 18, 2015, 6:42:32 PM
to johndo...@gmail.com, ninja-build

You are right, multiple instances at the same time aren't supported. Perhaps we should use some lockfile-like strategy to make it more obviously an error.

Is it possible to just pass your entire target list to one instance of ninja? If not, can you elaborate more on what you're trying to achieve?

brevity due to phone

--
You received this message because you are subscribed to the Google Groups "ninja-build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ninja-build...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

johndo...@gmail.com

Mar 18, 2015, 9:48:31 PM
to ninja...@googlegroups.com, johndo...@gmail.com, mar...@danga.com
Thank you for the prompt reply, Evan.

It is indeed possible to pass an entire target list to one instance of ninja, but unfortunately, I won't be able to take advantage of that kind of behavior in our build system. Allow me to elaborate/give you some detail:

We have a build cluster of X machines driven by a job-based build system. At the start of each build, CMake runs on every node in the cluster, generating a complete ninja-based build tree. A pool of about 100 jobs is then created, with each job containing the pre-build, build, and post-build steps necessary to ultimately produce a valid archive. Each of the X nodes in the build cluster then behaves in the following way:

1) Pull down a job from the job pool and execute it. Since each job directly corresponds to building a specific archive, several important hardware utilization metrics will begin increasing (CPU, disk throughput, memory, etc.).
2) Once per second, each node evaluates its hardware utilization. If its hardware utilization is beneath a certain threshold, it pulls down another job and begins executing it.
3) Repeat the above steps until the node crosses a predefined threshold that we consider to be "95-100% utilized".

The above 3 steps are repeated until all ~100 jobs are completed and the job pool is depleted.
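In pseudocode, each node's loop is roughly the following (all names are hypothetical; `utilization` returns a 0.0-1.0 combined figure, and `start_job` launches a job asynchronously and returns a handle):

```python
import time

def node_loop(job_pool, utilization, start_job, threshold=0.95, poll=1.0):
    # Sketch of one build node's scheduling loop from the description above.
    running = []
    started = 0
    while job_pool or running:
        running = [h for h in running if not h.done()]  # reap finished jobs
        if job_pool and utilization() < threshold:
            running.append(start_job(job_pool.pop()))   # pull another job
            started += 1
        time.sleep(poll)  # once per second, re-evaluate utilization
    return started
```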

In any given run, each node in the cluster is executing several jobs in parallel and is thus building several archives in parallel. This is where we're encountering the parallel execution error with ninja.

Passing an entire list of targets to a single instance of ninja would be great, but we would then lose the ability to efficiently and dynamically spread the compilation of these archives across multiple nodes in the cluster. If it helps any, our build infrastructure is built around the following notions:

* Any node in the cluster can execute a job, and they're often not the same ones in successive runs.
* 1 job = 1 archive.
* Some jobs have fairly heavy pre-build, build, or post-build steps that utilize resources differently (CPU-bound, disk-bound, memory-bound).
* Always try to maximally utilize available resources.
* Adding more hardware to the cluster = faster builds.

This build system accommodates cross-platform builds. msbuild is used on Windows while make is used on Linux. We've heard amazing things about ninja, so we're trying to replace make with ninja on Linux.

We definitely welcome any comments or thoughts from you on the matter: we're very new to ninja.

Thank you,
John

johndo...@gmail.com

Mar 19, 2015, 8:17:44 PM
to ninja...@googlegroups.com
Hi Evan,

Given the above description, do you think there's any way we can shoe-horn ninja into a heavily parallelized build process like the one mentioned above? (Note: as great as ninja is, we just may be trying to force too square of a peg into a round hole, which we acknowledge.)

Thank you,
John

Nico Weber

Mar 19, 2015, 8:42:10 PM
to johndo...@gmail.com, ninja-build
As Evan said, this isn't supported. We made ninja not crash or corrupt files if several builds run in the same build directory at the same time, but that's it.

Maybe it's interesting if I explain the current implementation details for this :-)

1.) The way ninja works is that at startup, it creates an in-memory build graph, then computes which parts of the graph are "dirty", and then it never (*) reconsiders what needs to be done and just builds all the dirty edges. So if you kick off two builds at the same time, they'll consider the same things dirty, and both processes will build everything.
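A toy model of that startup scan (not real ninja code) shows why two simultaneous builds duplicate work:

```python
def dirty_edges(graph, mtime):
    # Toy model of ninja's startup dirtiness scan.
    # graph: {output: [inputs]}; mtime: {path: timestamp, or None if missing}.
    # An edge is dirty when its output is missing or older than any input.
    dirty = set()
    for out, ins in graph.items():
        if mtime.get(out) is None or any(mtime[i] > mtime[out] for i in ins):
            dirty.add(out)
    return dirty

graph = {"libA.a": ["a.cpp"], "libB.a": ["b.cpp"]}            # hypothetical
mtime = {"a.cpp": 2, "b.cpp": 2, "libA.a": 1, "libB.a": None}

# Two ninja processes started at the same moment see the same mtimes,
# compute the same dirty set, and each build everything in it.
plan_1 = dirty_edges(graph, mtime)
plan_2 = dirty_edges(graph, mtime)
```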

2.) What's more, ninja rebuilds files if their commandline changes (say if you change a compiler flag), so it needs to track the commandline that was used last time. This is stored in a text file called ".ninja_log" in your build directory. Ninja reads that file at startup, and then streams new commandlines to that file during the build. The file is line-buffered, so if two ninja processes write to that file, it won't be corrupted, but the two ninja processes won't learn about the lines that the other process writes either.
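The line-buffered append behavior can be demonstrated directly. The entries below are made up and do not follow the real .ninja_log field layout; the point is that whole lines survive interleaved appends, while each writer's in-memory view stays stale:

```python
import os, tempfile

log = os.path.join(tempfile.mkdtemp(), "ninja_log")
open(log, "w").close()

w1 = open(log, "a", buffering=1)  # line-buffered append, like ninja
w2 = open(log, "a", buffering=1)
w1.write("0\t10\tlibA.o\tcmdhash1\n")  # hypothetical log entries
w2.write("0\t12\tlibB.o\tcmdhash2\n")
w1.close()
w2.close()

# Every line lands whole -- no corruption -- but process 1 never learned
# about process 2's entry, and vice versa.
lines = open(log).read().splitlines()
```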

3.) Finally, ninja also needs to track .h dependencies. Since reading lots of .d files at startup is slow, ninja reads .d files right after gcc finishes, and stashes them in .ninja_deps in your build directory. This file is then read at startup too. This is a binary file. It has some checksums and whatnot so that ninja can detect if another process wrote to it, but if it detects that, it basically ignores all data past the point where the other process started writing to that file (at least in first approximation). That's the warning you see above.

Things will probably work a little better if you disable deps mode, but you'll still get lots of duplicate compilations because of 1.) above.

What people usually do is to have a single ninja process on a master, and then have build edges ferry work to the build cluster (using e.g. distcc).

Hope this is useful,
Nico

*: (exception: restat rules).


Evan Martin

Mar 20, 2015, 1:18:06 PM
to johndo...@gmail.com, ninja-build
To save you some time, here's another idea that I think *won't* work:
You might think you could have each builder work on a separate copy of
the source tree, and then redistribute copies of the built archives among
the builders to let them begin work on the shared libraries. But I
think this won't work because Ninja won't know where the copied-in
archives came from and it will attempt to rebuild them.

In general it's hard to make distributed systems work with shared
filesystems. Even with a single process builder you have to be
careful about atomically writing outputs -- for example if the
compiler writes half a .o and then the ninja process and compiler both
crash, you need some other mechanism to know that the file needs to be
rebuilt. (I think Ninja handles this case by writing a log line out
after the build succeeds, which means that if the log line is missing
Ninja knows the file is bad. But that is also the reason the
distribution scheme I described above won't work. Another scheme,
used in tools like redo, is to always write outputs into a temporary
path and then rename upon completion, but as far as I know that only
works if commands can only have a single output.)
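The temp-file-plus-rename scheme mentioned above looks like this as a sketch. On POSIX, `os.replace` is an atomic rename, so readers see either the old complete file or the new complete file, never a half-written one:

```python
import os, tempfile

def atomic_write(path, data):
    # Write into a temporary file in the same directory, then atomically
    # rename it over the destination.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes hit disk first
        os.replace(tmp, path)     # atomic: no window with a partial file
    except BaseException:
        os.unlink(tmp)
        raise
```

As noted above, this only helps when a command produces a single output.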

Because of this, the best practice I know of is to have a single
process be in charge of all scheduling and bookkeeping. Tools like
distcc work even without a shared filesystem because they ship all the
relevant files around when needed. (Note that distcc's "pump" mode
helps keep the network traffic down.)


Having written all of that, here's one final idea. Perhaps you could
break your build down into multiple separate build processes -- one
that builds the archives as before, but then a second one that is
unaware of how the archives are built and only knows to assemble them
together. (Effectively, the second step would treat the built
archives as inputs "from the system" and not things that can be built,
like source code.) You also would need to make these separate steps
use separate build directories (that is, you execute Ninja from
different directories or use the -C flag to specify a directory; you
could name these e.g. "stage1/" and "stage2/"). There would then be
no way for these steps to stomp on each other.
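A minimal driver for that two-stage idea might look like this ("stage1"/"stage2" are the hypothetical directory names from above; the runner is a parameter purely for illustration):

```python
import subprocess

def run_stages(stage_dirs=("stage1", "stage2"), run=subprocess.run):
    # Stage 1 builds the archives; stage 2 treats them as plain inputs
    # "from the system". Each stage gets its own build directory via -C.
    for d in stage_dirs:
        run(["ninja", "-C", d], check=True)
```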

Unfortunately, that only solves part of your parallelization problem.
You still wouldn't be able to build multiple archives on multiple
machines in parallel.

PS: if you're on Linux you should look into "thin archives" -- they're
effectively just lists of .o files which means you don't spend build
time packing a bunch of .o files into a .a just to repack them again
into a .so. It also makes the "ar" step quick, which pairs well with
distcc.
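With GNU binutils, a thin archive is created with ar's "T" modifier; the archive records only the paths of the .o files instead of copying their contents. A sketch (file names hypothetical):

```python
import subprocess

def make_thin_archive(archive, objects, run=subprocess.run):
    # GNU ar: c = create, r = insert members, T = thin archive
    # (paths only, no repacking of object file contents).
    run(["ar", "crT", archive, *objects], check=True)
```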
