Dependencies on generated header files

Qingning

Sep 26, 2011, 10:57:25 AM
to ninja-build
When generating ninja files, one does not normally need to worry about
header files included by cpp files, because dependencies of this kind
are automagically discovered from the depfiles. However, if any included
headers are generated during the same ninja build session, IMHO the
dependency must be recorded in the ninja files explicitly to ensure
ninja gets the build order correct. Otherwise, when the .d file is
missing, e.g. for a clean build, ninja does not know that the action
generating the header file must run before the cpp compilation
action.

So the question is how to ensure this is always done correctly. I do
not want to scan all the cpp files for #include at ninja file
generating time because this would be too slow. So I am thinking I can
do a post build check with the following algorithm.

for each $target file that has a depfile (currently cxx rule only):
    for each of its dependency files ($df_depend) discovered from its depfile:
        if $df_depend is a valid ninja target,
            if $target->$df_depend dependency is not set explicitly in the ninja file,
                flag as an error.

This is not terribly difficult to do (with the help of ninja -t query
and ninja -t targets), but I feel it is far from elegant.
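
A minimal sketch of the check above in Python. It assumes the set of ninja targets and the inputs declared for each build edge have already been collected (for instance from the ninja -t targets and ninja -t query output). The function names and the simplified depfile parsing (one target, no escaped spaces or colons in paths) are illustrative assumptions, not an existing ninja API.

```python
def parse_depfile(text):
    """Parse a Makefile-style depfile ("target: dep1 dep2 \\<newline> dep3").

    Simplifying assumption: a single target and no escaped spaces or
    colons in any path.
    """
    # Join backslash-continued lines, then split the target from its deps.
    joined = text.replace("\\\n", " ")
    target, _, deps = joined.partition(":")
    return target.strip(), deps.split()


def undeclared_generated_deps(depfile_text, declared_inputs, ninja_targets):
    """Return (target, deps) where deps are built by ninja but not declared.

    declared_inputs: inputs listed for the target in the .ninja file
                     (explicit, implicit, or order-only).
    ninja_targets:   every output name known to ninja.
    """
    target, deps = parse_depfile(depfile_text)
    missing = [d for d in deps
               if d in ninja_targets and d not in declared_inputs]
    return target, missing
```

Any non-empty list returned for a target would be flagged as an error.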

I'd like to know if anyone else is facing similar issues, and if so,
how you are dealing with it.

Thanks
Qingning

Evan Jones

Sep 26, 2011, 12:15:43 PM
to ninja...@googlegroups.com
On Sep 26, 2011, at 10:57 , Qingning wrote:
> I'd like to know if anyone else is facing similar issues, and if so,
> how you are dealing with it.

I definitely end up with similar issues, due to using headers
generated by Google protocol buffers. I've had to add explicit
dependencies on the .h, just as you mention. It hasn't bothered me
enough to fix it (yet), but I've thought about how I *think* I should
fix this. I've tried to explain this below. You could also look at how
Chrome deals with this via gyp, since they must run into this issue. I
couldn't figure it out from looking at the gyp manual for 5 minutes
right now though.

The way that I was thinking about fixing this: when something depends
on a rule that generates a .h, I need to "propagate" that .h
dependency to any compile rules that depend on it. As a quick example,
defined in a pseudo-ninja format:

# Generate the .cc and .h; compile the .cc into a library.
build libmessage: pb_library message.proto

# Generate a library that depends on message_protocolbuffer
build libsomething: library message_protocolbuffer libsomething.cc other_file.cc

# Create an EXE using libsomething
build someexe: executable libsomething executable.cc some_other_file.cc


In this case, what I *think* should happen is that the compile rules
generated by the "executable someexe" rule should include explicit
dependencies on the generated header file. In other words, the compile
executable.cc rule will depend on message.pb.h. Note: It is important
that this is an "order-only dependency." This has the advantage that
it will ensure the correct build order, but if some_other_file.cc
doesn't actually include the header file, when the header file is
regenerated, it won't trigger a recompile. It will only trigger a
recompile if some_other_file includes the header, since it will then
be "upgraded" to an implicit dependency.
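
The propagation described above could be sketched in a generator script roughly as follows (Python). The dictionaries, the function names, and the cxx rule name are hypothetical; the only ninja syntax relied on is the || separator for order-only dependencies.

```python
def propagated_headers(target, deps, generated):
    """Collect generated headers reachable from `target` through `deps`.

    deps:      maps a target name to the targets it depends on.
    generated: maps a target name to the headers its rules generate.
    """
    seen, stack, headers = set(), [target], []
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        headers.extend(generated.get(name, []))
        stack.extend(deps.get(name, []))
    return sorted(set(headers))


def compile_edge(obj, src, headers):
    """Format a ninja build line with order-only deps after '||'."""
    line = f"build {obj}: cxx {src}"
    if headers:
        line += " || " + " ".join(headers)
    return line
```

With someexe depending on libsomething, which depends on libmessage, every compile edge of someexe would receive message.pb.h as an order-only dependency.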

Evan

--
http://evanjones.ca/

Qingning

Sep 26, 2011, 1:27:59 PM
to ninja-build
> The way that I was thinking about fixing this: when something depends  
> on a rule that generates a .h, I need to "propagate" that .h  
> dependency to any compile rules that depend on it. As a quick example,  
> defined in a pseudo-ninja format:
>
> # Generate the .cc and .h; compile the .cc into a library.
> build libmessage: pb_library message.proto
>
> # Generate a library that depends on message_protocolbuffer
> build libsomething: library message_protocolbuffer libsomething.cc other_file.cc
>
> # Create an EXE using libsomething
> build someexe: executable libsomething executable.cc some_other_file.cc
>

Thanks for your reply. Can I assume that the above is not directly
mappable to ninja rules? It seems to me that this line will be
expanded to a few ninja rules that involve compilation and linking.

> In this case, what I *think* should happen is that the compile rules
> generated by the "executable someexe" rule should include explicit
> dependencies on the generated header file. In other words, the compile
> executable.cc rule will depend on message.pb.h. Note: It is important
> that this is an "order-only dependency." This has the advantage that
> it will ensure the correct build order, but if some_other_file.cc
> doesn't actually include the header file, when the header file is
> regenerated, it won't trigger a recompile. It will only trigger a
> recompile if some_other_file includes the header, since it will then
> be "upgraded" to an implicit dependency.

Glad to see that you agree we need to express the dependency on
generated files explicitly in ninja files. But what do you think about
the consistency checking process I mentioned in my first mail?

Btw, I don't quite follow why an "order-only dependency" is important.
Do you mean you make all cpp files depend on generated header files,
and expect it to do the correct thing by the semantics of "order-only
dependency"? And you are using this method to avoid doing any
consistency checking?

Thanks
Qingning

Evan Jones

Sep 27, 2011, 2:59:17 PM
to Dmitry Sagalovskiy, ninja...@googlegroups.com
On Sep 27, 2011, at 11:54 , Dmitry Sagalovskiy wrote:
> In your example, I think you rely on the fact that the rule that
> generates .h also generates something else, which is listed as an
> explicit dependency for a target of interest (the protobuf library
> in your case). So then you can infer that .h is also a dependency.
> But this assumption won't always hold.

You make a good point. In my case, anything that generates code
generates a .c or a .cc that needs to be linked, so I need an explicit
dependency anyway.

There is a design trade-off here: should the user explicitly define
dependencies, or should they be automatically determined?

Ninja's "philosophy" documentation states "Ninja contains the barest
functionality necessary to describe arbitrary dependency graphs". In
the case of these generated header files, we've both discussed the
fact that it already *can* express the dependencies correctly: either
explicit or order-only dependencies do the "right thing". So in my
opinion, this functionality belongs in the .ninja generator, and not
in Ninja itself.


> It's entirely possible to have rules which just generate .h files,
> and to have source code which includes those files. Leaving it to
> the developer to add an explicit dependency whenever they add
> #include leaves too much room for human error.

However this problem also happens with linking: Whenever you #include
something, you need to figure out what library you need to link, and
manually add a dependency. You could probably make a build system that
automatically figures out what libraries to link: take the errors
generated by the linker, search for the appropriate symbols, and
automatically add the correct linker dependencies.


> Given the line "build foo.o: cc foo.cc": on the first build,
> everything that starts with "build/" and ends with ".pb.h" would be
> brought up to date.

Doesn't this permit circular dependencies? What if in order to build
"build/foo.pb.h" I need to build "protoc" using a "build protoc.o: cc
protoc.cc" rule? At any rate: I believe that rules of this form could
be implemented as a pre-processor that generates the
appropriate .ninja files.

Evan

--
http://evanjones.ca/

Evan Jones

Sep 27, 2011, 3:04:26 PM
to ninja...@googlegroups.com
On Sep 26, 2011, at 13:27 , Qingning wrote:
> Thanks for your reply. Can I assume that the above is not directly
> mappable to ninja rules? It seems to me that this line will be
> expanded to a few ninja rules that involve compilation and linking.

Right. I use a primitive script to generate .ninja files.


> Glad to see that you agree we need to express the dependency on
> generated files explicitly in ninja files. But what do you think
> about the consistency checking process I mentioned in my first mail?

This is actually an interesting idea: It might always be a "bug" if a
rule has an implicit dependency on a generated file. It probably
always should be an explicit or order-only dependency. It
might be possible to add this as a ninja tool, so it can re-use
the .ninja and dependency file parsers?


> Btw, I don't quite follow why an "order-only dependency" is important.
> Do you mean you make all cpp files depend on generated header files,
> and expect it to do the correct thing by the semantics of "order-only
> dependency"? And you are using this method to avoid doing any
> consistency checking?

You don't *need* to use order-only dependencies, but it makes my life
easier: I add order-only dependencies on anything that might depend on
a generated header file. Then, only if it *actually* #includes the
header does it get rebuilt when the header is updated.

Evan

--
http://evanjones.ca/

Qingning

Sep 27, 2011, 4:57:40 PM
to ninja-build
Hi Evan,

Thanks for the clarification.

> This is actually an interesting idea: It might always be a "bug" if a  
> rule has an implicit dependency on a generated file. It probably  
> always should be an explicit or order-only dependency. It
> might be possible to add this as a ninja tool, so it can re-use
> the .ninja and dependency file parsers?

I'd be very keen to get a solution into ninja. This is because (a) I
think it is always wrong not to have an _explicit_ dependency when
including generated files, and (b) ninja has all the required knowledge
to make the decision.

I had a quick look at the Edge::LoadDepFile() function, and it seems
that it can be easily modified to carry out such a check. It might be a
good idea to write a patch for it.

> > Btw, I don't quite follow why an "order-only dependency" is important.
> > Do you mean you make all cpp files depend on generated header files,
> > and expect it to do the correct thing by the semantics of "order-only
> > dependency"? And you are using this method to avoid doing any
> > consistency checking?
>
> You don't *need* to use order-only dependencies, but it makes my life  
> easier: I add order-only dependencies on anything that might depend on  
> a generated header file. Then, only if it *actually* #includes the  
> header does it get rebuilt when the header is updated.

Now I understand your approach. I think it should work, and it could
be an alternative to the dependency check above.

Thanks
Qingning

Peter Collingbourne

Sep 27, 2011, 5:36:58 PM
to ninja...@googlegroups.com
On Mon, Sep 26, 2011 at 07:57:25AM -0700, Qingning wrote:
> When generating ninja files, one does not normally worry about header
> files included by cpp files. Because this kind of dependencies are
> automagically discovered from the depfiles. However, if any included
> headers are generated during the same ninja build session, IMHO, the
> dependency must be recorded in the ninja files explicitly to ensure
> ninja gets the build order correct. Otherwise, when the .d file is
> missing, eg, for a clean build, ninja does not know that the
> generating header file action must be run before the cpp compilation
> action.
>
> So the question is how to ensure this is always done correctly. I do
> not want to scan all the cpp files for #include at ninja file
> generating time because this would be too slow. So I am thinking I can
> do a post build check with the following algorithm.
>
> for each $target file that has a depfile (currently cxx rule only):
> for each of its dependency file ($df_depend) discovered from
> its depfile:
> if $df_depend is a valid ninja target,
> if $target->$df_depend dependency is not set explicitly
> in the ninja file,

This should also include indirect dependencies. To give an example of
this scenario, consider an executable which uses a library. Some of
the executable's source files include the library's generated header
files, said generated header files being declared as order-only
dependencies of some of the library's object files. Since the
executable's source files include the generated header files, there
should be a dependency specified between the executable's object
files and the generated header files. But if the executable's object
files only declare an order-only dependency on the library, there
is no error, because the dependency is specified, albeit indirectly
(via the library and its object files). I have seen this in practice
with some CMake-based build systems.
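
One way to account for indirect dependencies in such a check is a reachability test over the declared graph. A rough Python sketch, where the graph layout and the file names are made up for illustration:

```python
def reaches(graph, start, goal):
    """Return True if `goal` is reachable from `start` by following inputs.

    graph maps each output to all of its declared inputs (explicit,
    implicit, and order-only), as written in the .ninja file.
    """
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return False
```

A depfile entry would then be reported only when reaches(graph, object_file, generated_header) is false.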

Thanks,
--
Peter

Dmitry Sagalovskiy

Sep 27, 2011, 11:54:42 AM
to ninja...@googlegroups.com, ev...@csail.mit.edu
In your example, I think you rely on the fact that the rule that generates .h also generates something else, which is listed as an explicit dependency for a target of interest (the protobuf library in your case). So then you can infer that .h is also a dependency. But this assumption won't always hold.

It's entirely possible to have rules which just generate .h files, and to have source code which includes those files. Leaving it to the developer to add an explicit dependency whenever they add #include leaves too much room for human error.

Here's my attempt to generalize: if a generated target *could* possibly appear in a depfile of rule R, then it should be an order-only dependency for anything built with rule R.

My suggestion for a syntax to express it is to add a pattern-matching variable to go along with depfile:

rule cc
  depfile = $out.d
  dep_pattern = build/.*\.pb\.h
  command = gcc -MMD -MF $out.d [other gcc flags here]

This would filter all targets known to ninja on the dep_pattern regular expression, and make them order-only dependencies of anything built with rule cc. It doesn't involve any stat calls, but creates extra dependency edges among existing nodes.

Given the line "build foo.o: cc foo.cc": on the first build, everything that starts with "build/" and ends with ".pb.h" would be brought up to date. That will allow the compiler to correctly compile and generate the depfile. From then on, if message.pb.h is listed in the depfile, then its dirty flag will propagate to foo.o. If message.pb.h is not listed, then its dirty flag would not propagate to foo.o, but if foo.o needs to be rebuilt for any other reason, then message.pb.h would be brought up to date first.

I think that's the desired behavior, because you don't know ahead of time when a new #include of message.pb.h might be added to foo.cc or to one of its included header files.
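
As a rough illustration of the filtering step only (dep_pattern is a proposed variable, not existing ninja syntax), a generator could do the equivalent in Python. The function name and the choice of a full-string match are assumptions:

```python
import re


def order_only_candidates(dep_pattern, known_targets):
    """Return the known targets whose whole name matches dep_pattern."""
    regex = re.compile(dep_pattern)
    return sorted(t for t in known_targets if regex.fullmatch(t))
```

The returned names would then be appended after || on every build line that uses the rule.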

Dmitry

Dmitry Sagalovskiy

Sep 26, 2011, 11:39:58 AM
to ninja...@googlegroups.com
Yes, we face this issue also. The problem with your solution is that ninja doesn't know that the generated header has to be built *before* the cxx file that depends on it, so you may get a compiler error and never get the depfile generated. Worse, you could have a cxx file compiled with a stale version of the generated header file.

[For an example to keep in mind, suppose bar.h is generated from bar.in, and you just modified foo.cc to include bar.h, and modified bar.in at the same time; and you now try to build foo.o.]

My solution is this. For any rule that generates a header file (or anything which *could* become an implicit dependency), I add that file to a list. Then I do this:

rule touch
   command = touch $out
build all_generated_headers: touch | <the list of all generated headers>

And for every .o file, I add "all_generated_headers" as an order-only dependency. (And still do the usual thing with depfiles, as recommended in the ninja manual.)

Basically, it means that all generated headers are brought up to date whenever you build any .o file. Admittedly, it lacks some elegance also. But I think it's the only simple way to be sure the build is correct.
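
For illustration, a generator script might emit this arrangement as below (Python). The function name and the cxx rule (assumed to be defined elsewhere with a depfile) are hypothetical:

```python
def emit_build_file(generated_headers, objects):
    """Return .ninja text where every object waits on one header stamp.

    generated_headers: outputs of the header-generating rules.
    objects:           (object file, source file) pairs.
    """
    lines = [
        "rule touch",
        "  command = touch $out",
        # The stamp lists every generated header as an implicit input ('|').
        "build all_generated_headers: touch | " + " ".join(generated_headers),
    ]
    for obj, src in objects:
        # '||' makes the stamp an order-only dependency of each object.
        lines.append(f"build {obj}: cxx {src} || all_generated_headers")
    return "\n".join(lines) + "\n"
```

Calling print(emit_build_file(["bar.h"], [("foo.o", "foo.cc")])) shows the resulting build statements.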

I'd love to hear if anyone has better suggestions, or if anyone sees problems that would still be present with this approach.

Dmitry

P.S. Incidentally, scons has had a bug with it forever, even though it scans changed files on every run to build the dependency tree. This stuff is hard to get right, especially when the implicit dependency you are trying to discover doesn't exist yet in the filesystem.

Ami Fischman

Sep 26, 2011, 1:02:35 PM
to ninja...@googlegroups.com
Instead of declaring dependencies on generated headers, declare dependencies on the rules generating those headers, or on the generator's inputs.  E.g. this example from chromium includes the .proto inputs as sources to the libphonenumber target.

Cheers,
-a

Dmitry Sagalovskiy

Sep 27, 2011, 4:54:56 PM
to Evan Jones, ninja...@googlegroups.com
On Tue, Sep 27, 2011 at 2:59 PM, Evan Jones wrote:

>> It's entirely possible to have rules which just generate .h files, and to have source code which includes those files. Leaving it to the developer to add an explicit dependency whenever they add #include leaves too much room for human error.
>
> However this problem also happens with linking: Whenever you #include something, you need to figure out what library you need to link, and manually add a dependency. You could probably make a build system that automatically figures out what libraries to link: take the errors generated by the linker, search for the appropriate symbols, and automatically add the correct linker dependencies.

Well, it's a matter of expectations. Given the support for implicit dependencies on header files, there is an expectation that the build will do the right thing whether the included header is created manually or generated as part of the build. The link stage gets away with it because there isn't such an expectation (and also because there is usually an order of magnitude fewer components).


>> Given the line "build foo.o: cc foo.cc": on the first build, everything that starts with "build/" and ends with ".pb.h" would be brought up to date.
>
> Doesn't this permit circular dependencies? What if in order to build "build/foo.pb.h" I need to build "protoc" using a "build protoc.o: cc protoc.cc" rule? At any rate: I believe that rules of this form could be implemented as a pre-processor that generates the appropriate .ninja files.

True. Makes sense. I would then recommend adding an example to the manual for a recommended way to set up rules with depfiles when some of the headers are generated as part of the build. It's hard to get right, and it depends on the precise behavior of order-only dependencies (which I admit I don't understand in full detail).

Dmitry

Brad King

Sep 26, 2011, 4:51:26 PM
to ninja...@googlegroups.com
Hi Folks,

FYI, I'm a CMake developer and have dealt extensively with all kinds of
dependency problems. The problem discussed in this thread touches on
one of the reasons CMake generates what appear to be recursive
Makefiles. I just added an entry to our FAQ with a justification:

http://www.cmake.org/Wiki/CMake_FAQ#Why_does_CMake_generate_recursive_Makefiles.3F

On 9/26/2011 12:15 PM, Evan Jones wrote:
> The way that I was thinking about fixing this: when something
> depends on a rule that generates a .h, I need to "propagate"
> that .h dependency to any compile rules that depend on it. As a
> quick example, defined in a pseudo-ninja format:
>
> # Generate the .cc and .h; compile the .cc into a library.
> build libmessage: pb_library message.proto
>
> # Generate a library that depends on message_protocolbuffer
> build libsomething: library message_protocolbuffer libsomething.cc other_file.cc
>
> # Create an EXE using libsomething
> build someexe: executable libsomething executable.cc some_other_file.cc
>
>
> In this case, what I *think* should happen is that the compile
> rules generated by the "executable someexe" rule should include
> explicit dependencies on the generated header file. In other
> words, the compile executable.cc rule will depend on
> message.pb.h. Note: It is important that this is an "order-only
> dependency."

I don't think you need to explicitly propagate the dependency on
the generated header to other targets. Your point about order-only
dependencies is the key. All you need to do is ensure that the rules
for libmessage and libsomething have been fully evaluated and are
up to date before you even start to evaluate the rules for someexe.
By the time ninja even considers dependency generation for someexe
the header generated as part of building libmessage should exist.
You don't need any special handling in someexe for it.

CMake is able to handle cases when the rules for generating a header
or source file are specified in the same target in which they are
included or compiled. The reason is that we evaluate the generation
rules before we do dependency scanning. Ninja should be able to do
this without multiple levels of new processes because it can just
load the new dependencies that scanning produces directly into the
running process (since it is not make, after all).

-Brad

Qingning

Sep 28, 2011, 9:45:40 AM
to ninja-build
Peter, you are totally right. I ignored this situation because we do
not use this kind of order-only dependencies. I think it creates a
bottleneck on the library, in that high level cpp files compilation
cannot start until the library is fully built.

Anyway, if/when we get this checking into ninja, we will have to make
it controllable by a command line option, or make it a ninja tool
instead.

Qingning

Qingning

Sep 28, 2011, 9:49:21 AM
to ninja-build
On Sep 26, 6:02 pm, Ami Fischman <fisch...@google.com> wrote:
> Instead of declaring dependencies on generated headers, declare dependencies
> on the rules generating those headers, or on the generator's inputs.
> E.g. this example <http://codesearch.google.com/codesearch#OAMlx_jo-ck/src/third_party/l...> from
> chromium includes the .proto inputs as sources
> to the libphonenumber target.
>

Hi Ami,

I don't understand what you mean by declaring dependencies on the
rules. But I think depending on the generator's inputs cannot be
sufficient, because it will allow the cpp files to be compiled at the
same time as the header file is being generated.

Qingning

Peter Collingbourne

Sep 28, 2011, 10:42:27 AM
to ninja...@googlegroups.com

I agree. Unfortunately, the CMake language does not permit declaring
a library dependency from a target (e.g., an executable or shared
library) without also (implicitly) declaring an order-only dependency
from that target's object files to the library, and a number of
projects are now relying on this behaviour. I believe that one reason
for the implicit order-only dependency is that the more accurate
dependency graph cannot be easily represented using recursive Make,
which CMake must remain compatible with.

One idea is to teach the CMake Ninja generator to emit order-only
dependencies on libraries as order-only dependencies on the header
files generated for the library. But I haven't completely thought this
through, and there is the possibility that it might break some projects.

> Anyway, if/when we get this checking into ninja, we will have to make
> it controllable by a command line option, or make it a ninja tool
> instead.

Right, I think it should certainly be a tool. It would certainly be a
very valuable tool for CMake users. I have already used Ninja to (by
chance) find a couple of dependency bugs in another project I work on.

Thanks,
--
Peter

Ami Fischman

Sep 28, 2011, 11:00:27 AM
to ninja...@googlegroups.com
> I don't understand what you mean by declaring dependencies on the rules.

Ah, I'm leaking a gyp-centric bias :)
In chromium's gyp setup, the standard way to do this is to have a gyp target (usually a library) list its .cc sources, and have the gyp ninja generator emit an order-only dependency for each of those .cc's .o targets on a stamp file emitted by the header-generating action.
This wastes some parallelism, because even .cc's that don't depend on the generated header now need to wait for its generation before they can be compiled.

Cheers,
-a

Brad King

Sep 28, 2011, 1:12:12 PM
to ninja...@googlegroups.com
On 9/28/2011 10:42 AM, Peter Collingbourne wrote:
> I agree. Unfortunately, the CMake language does not permit declaring
> a library dependency from a target (e.g., an executable or shared
> library) without also (implicitly) declaring an order-only dependency
> from that target's object files to the library, and a number of
> projects are now relying on this behaviour.

CMake does this in part to handle the case that the library's build
rules generate files that may then be used by the executable's sources
during their compilation. We also have to make the behavior consistent
across all the target build environments, and this is how Xcode and the
VS IDE build tools work.

-Brad

Evan Martin

Dec 4, 2011, 7:22:16 PM
to ninja...@googlegroups.com
I was going through some old mail and noticed I had marked this thread
as one I ought to respond to. Sorry for my late response!

The way header generation works in gyp is as follows:

gyp has a notion of a "target", which is a named entity that other
targets can depend on. The most obvious kind of target is a static
library. If executable target A depends on library B which depends on
library C, gyp implicitly translates that into the appropriate build
instructions (where library B and C can be built in parallel, and both
are linked into executable A).

If a given target generates headers (aside from building libraries,
targets may also run commands that generate additional files) it must
be marked specially in gyp as a "hard dependency": that indicates that
any other target that depends on the hard dependency target has an
order-only dependency on it.

Translating the above rules into ninja rules is pretty simple, which
maybe explains why the ninja rules behave as they do.


Evan Jones's case of protocol buffers is the most common example of
this in Chrome as well.

At the gyp level, say you have
- target 'message_pb' which runs the generator to generate headers
- target 'libmessage' which builds libmessage.a with the generated source
- target 'foobar' which makes use of the above library

Concretely you build it as something like:

# generate the relevant headers
build message.cc message.h: pb_library message.proto
# build the library
build message.o: ...
build libmessage.a: ...

# build something that uses the library
build foobar.o: cc foobar.cc || message.h
# two options for linking
# 1) if you just put libmessage.a on the link line, explicit dep
build foobar: link foobar.o libmessage.a
# or 2) if you have a more complicated link line, implicit dep
build foobar: link foobar.o | libmessage.a
  extra_link_flags = -lmessage

Those rules are sufficient to make builds both correct and minimal.
Unfortunately, generating them all is difficult if you're not
generating your ninja files, as you need to write down *somewhere*
that foobar.cc depends on message.h.


I'm reluctant to build too much knowledge of how headers work in C
into ninja, as my hope is that ninja remains small (it already feels a
bit large to me). With that said, I'm always interested in ideas
about general rules that can work for many people.

I'll take Dmitry's suggestion and extend the manual to better describe
how this is intended to work.

Qingning Huo

Jan 5, 2012, 5:53:33 PM
to ninja...@googlegroups.com
Thanks for your reply. I know my reply is also very late because I was
busy with something else. Apologies!

At the moment, we have only a few special cases where cpp files depend
on generated header files, so we maintain them in the script that
generates the ninja files. In order to check that all such dependencies
are expressed in the ninja files, I've made a change to ninja to check,
when loading depfiles for a target, that all dependencies discovered
from the depfile are already known to ninja. If a violation is found,
ninja will fail with a fatal message.

If you are interested, I'd be happy to send the patch to you. I believe
it is generally useful for cases where cpp files (or any target files
that use a depfile) may depend on generated files. It has some known
limitations: it is currently not controlled by a command line
option, and it does not yet handle the indirect dependencies that
Peter C has pointed out.

Qingning

Petr Wolf

Oct 4, 2012, 11:50:42 PM
to ninja...@googlegroups.com
Hi all,

please check the following pull request

It adds ninja -d option "depcheck", which looks for unsafe dependencies.

Petr