Discussion: Make data_deps non-blocking

Junji Watanabe

May 21, 2025, 9:09:05 PM

to gn-dev, Andrew Grieve, chrome-build-team

Hi folks,

I work on Chrome build infra, and have filed https://crbug.com/gn/413507213 to use Ninja's validations for `data_deps` instead of `order-only` deps. Andrew and I had different understandings of when `data_deps` should be built, so let me open the discussion here.

According to the documentation, a couple of places mention that `data_deps` will be available at "runtime" as opposed to "build time", and that they can be built in parallel.

https://gn.googlesource.com/gn/+/refs/heads/main/docs/reference.md#var_data_deps

> Specifies dependencies of a target that are not actually linked into the
> current target. Such dependencies will be built and will be available at
> runtime.

https://gn.googlesource.com/gn/+/refs/heads/main/docs/reference.md#actions-and-copies

> The different rules for deps and data_deps are to express build-time (deps)
> vs. run-time (data_deps) outputs. If GN counted all build-time copy steps as
> data dependencies, there would be a lot of extra stuff, and if GN counted all
> run-time dependencies as regular deps, the build's parallelism would be
> unnecessarily constrained.

I thought Ninja's validations would be more appropriate than "order-only" deps, which lose build parallelism. My understanding is that order-only was used just because Ninja validations didn't exist when GN was implemented.
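To make the parallelism argument concrete, here is a minimal Python sketch. It is a toy model, not GN or Ninja code: it assumes unlimited parallel jobs, and the target names (`A`, `B`, `run_A`) and durations are hypothetical. It compares earliest finish times when `A` has an order-only dep on `B` (`||`) against declaring `B` as a Ninja validation (`|@`), where `B` is still built but `A` does not wait for it.

```python
def finish_times(durations, waits_on):
    """Earliest finish time per target, assuming unlimited parallel jobs."""
    memo = {}

    def finish(name):
        if name not in memo:
            # A target starts once everything it must wait on has finished.
            start = max((finish(d) for d in waits_on.get(name, ())), default=0)
            memo[name] = start + durations[name]
        return memo[name]

    return {name: finish(name) for name in durations}


# Hypothetical durations in seconds.
durations = {"B": 300, "A": 10, "run_A": 60}

# Order-only (`build A: link ... || B`): A cannot start until B is done.
order_only = finish_times(durations, {"A": ["B"], "run_A": ["A"]})

# Validation (`build A: link ... |@ B`): B is still built, but A does not wait.
validation = finish_times(durations, {"run_A": ["A"]})

print(order_only["run_A"], max(order_only.values()))  # 370 370
print(validation["run_A"], max(validation.values()))  # 70 300
```

In this model the whole build is bounded by `B` either way, but anything downstream of `A` can finish much earlier in the validation case.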

The problem is that there are some targets (at least in Chromium) that assume `data_deps` are ready when the executables are built, e.g. a target that runs a generated executable.

IMO, those targets should specify the runtime deps as "deps"/"inputs". But I might misunderstand the meaning and the context.

What do GN developers think about this proposal? I think this is also related to bundling and packaging targets.

Thank you,

Junji

Roland McGrath

May 22, 2025, 1:55:20 PM

to Junji Watanabe, gn-dev, Andrew Grieve, chrome-build-team

Unfortunately, it cuts both ways. Because ninja traditionally had no way to express the concept that the GN docs say `data_deps` does, GN always used order-only deps instead. Due to other GN limitations, that has now become load-bearing behavior. Things today can (and do) use `data_deps` to express what actually needs to be order-only deps in GN. This is used for cases where GN is not capable of expressing all the true file dependencies, so things are instead using actions with depfiles to fill in the file-granularity ninja dependencies to ensure correct incremental builds. The order-only behavior of `data_deps` ensures that the inputs GN doesn't see directly are always updated first; they don't become ninja file-granularity deps on the first build, only via depfile for incremental builds.

Unless and until GN gets sufficiently richer functionality to be able to grok all the file-granularity deps in all the situations where this pattern has been used, we need GN to have some way to express ninja's order-only deps concept. Today that way is `data_deps`. There are probably many builds where this isn't an issue, very likely including Chromium. But other major GN users (such as Fuchsia) rely on it extensively. This mostly arises in cases using the `metadata` feature, which is both very powerful and extremely limited. The only way to use metadata is via `generated_file()` file targets and the contents of such files cannot be consumed by GN expression code so as to populate `inputs` fields and the like. The original aspiration for metadata was something much richer, but we never managed to get that hashed out and implemented.

I think it would be great to be able to express the original documented idea behind `data_deps` in GN in some fashion that translates properly down to ninja so that optimal parallelism is achievable. But without much larger feature overhauls in GN, we will still need a way to do what `data_deps` actually does today, which is ninja's order-only deps semantics. If we introduce a third flavor of deps into GN (strawman name `order_deps`), then we'd need a transition period for affected projects to convert `data_deps` uses into `order_deps` when those semantics are important. Perhaps something like a `.gn` flag to say that `data_deps` is actually `order_deps` for compatibility.

On Wed, May 21, 2025 at 7:00 PM 'Junji Watanabe' via gn-dev <gn-...@chromium.org> wrote:

If you think about this at Ninja level, using order-only deps is problematic.

Given the following GN targets:

executable("A") {
  data_deps = [ ":B" ]
  ...
}
action("run_A") {
  script = "run_A.py"
  deps = [ ":A" ]
  outputs = [ "$root_out_dir/run_A.stamp" ]
}

Ninja rules would look like this:

build A: link ... || ./B
build run_A.stamp: ___run_A___build_toolchain_gcc__rule | ../../run_A.py A

According to the Ninja documentation:
> Order-only dependencies, expressed with the syntax || dep1 dep2 on the end of a build line. When these are out of date, the output is not rebuilt until they are built, but changes in order-only dependencies alone do not cause the output to be rebuilt.

The problem here is that if only `B` gets updated, `run_A` wouldn't run.
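The rebuild behavior described above can be sketched with a small Python model. This is a simplification under stated assumptions, not Ninja's actual algorithm: an output is rerun only if one of its explicit or implicit inputs changed, order-only inputs are ignored for that decision, and `a.o` is a hypothetical stand-in for the elided link inputs.

```python
def edges_to_rerun(edges, changed):
    """Toy model of dirtiness: only explicit/implicit inputs propagate."""
    dirty = set(changed)
    progress = True
    while progress:
        progress = False
        for output, (inputs, _order_only) in edges.items():
            # Order-only inputs are deliberately not consulted here.
            if output not in dirty and dirty.intersection(inputs):
                dirty.add(output)
                progress = True
    return dirty - set(changed)


edges = {
    # output: (explicit + implicit inputs, order-only inputs)
    "A": (["a.o"], ["B"]),
    "run_A.stamp": (["run_A.py", "A"], []),
}

print(sorted(edges_to_rerun(edges, {"a.o"})))  # ['A', 'run_A.stamp']
print(sorted(edges_to_rerun(edges, {"B"})))    # []
```

The second call is the problematic case: `B` changed, but neither `A` nor `run_A.stamp` is considered out of date.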

Dirk Pranke

May 22, 2025, 2:41:53 PM

to Junji Watanabe, gn-dev, Andrew Grieve, chrome-build-team, Roland McGrath

I have always understood `data_deps` for X to mean that "these will have been built when you want to run X". Accordingly, since it might not be easy to know in a build graph when you want to run X, the data_deps need to have been built by the time building X is *done* (not started). I believe this matches Andrew's understanding as given in the bug. Ninja has not historically had the ability to express that, and it doesn't look like validations express that, either, so AFAIK it still doesn't have the ability to do what I want.

I think you could get the semantics that I am picturing for data_deps in GN without needing to change Ninja: instead of having GN use order-only deps for data dependencies, rewrite the build graph such that if X has a data dependency on Y, X becomes a group target that (regular-deps) depends on Y and on a new X' target that is the original X target minus the data deps. That would not in general get you as much parallelism as using validations would get you, but it would still be an improvement, and if no target actually depended on X at build time then you'd get the same parallelism that validations would get you.
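A minimal Python sketch of that graph rewrite, as a toy model rather than anything GN does today. The dict representation of targets and the `__impl` suffix for the X' target are assumptions made only for illustration.

```python
def rewrite_data_deps(targets):
    """Turn each target X with data_deps into a group X -> [X', data deps]."""
    rewritten = {}
    for name, target in targets.items():
        data_deps = target.get("data_deps", [])
        if not data_deps:
            rewritten[name] = dict(target)
            continue
        impl = name + "__impl"  # hypothetical name for X'
        # X' is the original target minus its data_deps.
        rewritten[impl] = {"type": target["type"],
                           "deps": list(target.get("deps", []))}
        # X becomes a group with regular deps on X' and the data deps.
        rewritten[name] = {"type": "group", "deps": [impl] + list(data_deps)}
    return rewritten


def transitive_deps(targets, name):
    """All targets reachable from `name` through regular deps."""
    seen = set()
    stack = list(targets[name].get("deps", []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(targets[dep].get("deps", []))
    return seen


targets = {
    "Y": {"type": "action"},
    "X": {"type": "executable", "data_deps": ["Y"]},
    "W": {"type": "action", "deps": ["X"]},
}
new = rewrite_data_deps(targets)

print(sorted(transitive_deps(new, "W")))        # ['X', 'X__impl', 'Y']
print(sorted(transitive_deps(new, "X__impl")))  # []
```

In this model W (which depends on X) now has a real dependency on Y, while linking X' itself no longer waits on Y.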

I do not think we should make a target W that wants to run X at build time have to explicitly add a data dependency on X's data deps. That feels like a layering violation and an abstraction violation, and it seems like it would be easy to get things wrong and end up with broken build graphs. I think in theory you could probably automatically propagate up the dependencies such that W would automatically (regular-deps) depend on X's data deps, but I don't think that would get you anything over the rewriting I described in the prior paragraph, and you'd either have to do that rewriting *anyway* just to make `ninja X` work correctly or *also* add something like the validation logic.

However, as Roland says, we also have the problem that in some places we use data deps to express what should be order_only deps, and I agree with him that we must not break those cases without providing an alternative. I think his suggestion to add something like an `order_deps` and to provide a migration path to that makes sense.

In my ideal world, we'd add the data_deps concept to Ninja and the order_only concept to GN. It's possible that we might also want something like validations in GN, but, at least from Chromium CI's point of view, I'd be reluctant to blur build time and test-time concepts like that. From a non-Chromium point of view (and perhaps also from a Chromium local dev build point of view), having that functionality generically makes sense.

-- Dirk

Dirk Pranke

May 22, 2025, 2:51:32 PM

to Junji Watanabe, gn-dev, Andrew Grieve, chrome-build-team, Roland McGrath

It occurred to me almost immediately after I hit send that you could probably get the rewriting I describe using existing GN functionality to define a new template for `executable`. Perhaps that's worth trying?

-- Dirk

Andrew Grieve

May 22, 2025, 3:05:47 PM

to Dirk Pranke, Junji Watanabe, gn-dev, Roland McGrath

(sending again from my @chromium and dropping internal mailing-list)

Despite my understanding that data_deps are supposed to work as they do now, I admittedly failed to find any spots in chromium where they are used in that way.

The best example I could find was that //base depends on ICU, and so //base has a data_dep on ICU's data file, and any executable that depends on //base might expect ICU's data file to be there in time. However, the host_toolchain tools that depend on //base don't seem likely to use any ICU functionality. It just seems to be a very rare thing that a binary used when building relies on auxiliary files. IMO it seems rare enough that it's not worth implementing a way to propagate them as Dirk describes.

So... I'm basically over my initial reasons for opposing the change, but orderfile + order-only deps usage in Fuchsia seems more concrete to me & reason to not just change it (or to provide an orderonly_deps).

The spot that I think would be most sped up by this proposal is Android static analysis steps, which currently use data_deps because validations are not exposed via GN. Although arguably for this case the best thing to do would be to directly expose "validation_deps". But maybe there are other spots where data_deps are introducing slow-downs.

Aaron Wood

May 22, 2025, 5:52:45 PM

to Andrew Grieve, Dirk Pranke, Junji Watanabe, gn-dev, Roland McGrath

On Thu, May 22, 2025 at 12:05 PM Andrew Grieve <agr...@chromium.org> wrote:

So... I'm basically over my initial reasons for opposing the change, but orderfile + order-only deps usage in Fuchsia seems more concrete to me & reason to not just change it (or to provide an orderonly_deps).

We could probably go through and migrate the Fuchsia build to differentiate our true data_deps from our order_only deps. Most of that usage is buried in templates (for better or worse). A migration path would look (to me) something like:

1. Add order_only_deps field, with a GN option to treat data_deps as order-only deps in ninja (i.e., parity behavior)
2. Migrate our usages as needed (with lots of testing)
3. Flip the flag on our build to get the data-dep behavior.

It would definitely help our build graph to get our true data deps separated from our compilation and order-only deps, but it would definitely be a non-trivial migration effort.

Roland McGrath

May 22, 2025, 6:34:11 PM

to Aaron Wood, Andrew Grieve, Dirk Pranke, Junji Watanabe, gn-dev

On Thu, May 22, 2025 at 2:52 PM Aaron Wood <aaro...@google.com> wrote:

We could probably go through and migrate the Fuchsia build to differentiate our true data_deps from our order_only deps. Most of that usage is buried in templates (for better or worse). A migration path would look (to me) something like:

1. Add order_only_deps field, with gn option to treat data_deps as order-only deps in ninja (ie, parity behavior)
2. Migrate our usages as needed (with lots of testing)
3. Flip the flag on our build to get the data-dep behavior.

Yes, this is exactly the sort of scenario I had in mind. I think it's a desirable migration, but I also think that we have more than enough subtlety in the Fuchsia build using data_deps in lots of ways that it would be a nontrivial and not-very-mechanical effort to untangle the two sorts of cases correctly. Unfortunately it still wouldn't actually be all that optimal in our build I suspect. That's because the things that we'd need to make order_deps are probably quite often today a few steps of deps graph removed from the action that actually needs the ordering, and via a mix of deps and order_deps (nee data_deps) arcs. I'm suspecting that we'll still wind up with more serialization than we actually need for the semantics of our build, but not be able to express that granularity of things to GN or Ninja. I suspect that we could come up with a more complex GN abstraction that could be translated into a nontrivially-synthesized use of Ninja order-only deps that more precisely expresses where serialization is needed. But that's a lot more to figure out and make sense of than "introduce order_deps".

Another somewhat sad aspect is that, at the point of use, `data_deps` really does express the concept people have in mind and makes sense for the target, such as "this executable needs this other executable at runtime". It's just that the executable with that "runtime-only" dependency on the other thing is itself rolling up (through many steps) into a deps graph that is getting collected for packaging an image of all the things that need to go together at runtime (at various granularities)--so the "runtime-only" deps of one subtree of the overall build graph are in fact build-time deps of an ancestor node of that subtree. Conceivably we could introduce a notion of "order-only that catches downstream data_deps too". One notion might be to just build that in as a semantic of `generated_file` targets that collect metadata. Basically, anything that consumes the generated file already has to have direct deps on that target (as with an action target whose output file is an input to your target). The contents of the generated file (the metadata collection) include data contributed by `data_deps` paths as well as `deps` paths (and presumably, new `order_deps` paths), which can reasonably be presumed to include pointers to output files of contributing targets. So anything that contributes metadata contents to a generated file has to effectively be treated as in `order_deps` of anything that takes that generated file as input. I'm fairly confident that this kind of semantics could cover everything going on in the Fuchsia build nicely. (If we had that magic behavior for metadata collection targets, I'm not actually sure of what other use cases we might have for the explicit `order_deps` feature left, though it seems like a generally good thing to be able to express the notion first-class in GN since it exists first-class in Ninja.)

As long as Ninja doesn't have a truly good direct analogue of the "true" data_deps concept (as distinct from order_deps in our strawman universe), then I'm pretty unsure if we want to maintain some much more complicated algorithm in GN to achieve it indirectly through fancier synthesis of a Ninja deps graph. And until we have a new way to drive Ninja for the "true" data_deps cases that achieves a lot more parallelism, it's not clear that untangling the data_deps vs order_deps abstractions in GN merits the work required. That said, I don't understand Ninja's "validation" features yet, so I am going on Dirk's assessment of their aptness as a means to achieve the "true data_deps" semantics.

Junji Watanabe

May 23, 2025, 2:04:55 AM

to Roland McGrath, Aaron Wood, Andrew Grieve, Dirk Pranke, gn-dev

Thank you all for your feedback! That's really helpful for me.

I understand `data_deps` is used for the cases that GN can't express with regular deps, and we need to keep the "order-only" behavior somehow.

Just let me give some reasons why I'm motivated to make the build more parallel.

There was a Chromium CQ build regression caused by `code_cache_generator` having `v8_context_snapshot_generator` as data_deps. At that time, I made a workaround to move `v8_context_snapshot_generator` to the deps of the target that runs `code_cache_generator`.

Also, we still see `v8_context_snapshot_generator` is blocking the entire build (http://b/413494195) by several minutes. Then, I thought it's better to change the data_deps itself, rather than handling it one by one.

I initially planned to use "phony" to change the Ninja build graph (e.g. phony/X has executable X and data_deps of X). But, surprisingly, Chromium builds already worked mostly well with "validations". That's why I was pursuing this proposal.

I now think that fixing the specific `v8_context_snapshot_generator` blocking issue might be an easier way to resolve the CQ build time issue...

> I do not think we should make a target W that wants to run X at build time have to explicitly add a data dependency on X's data deps. That feels like a layering violation and an abstraction violation, and it seems like it would be easy to get things wrong and end up with broken build graphs.

Hmm, I understand the feeling. But order-only deps don't guarantee that W reruns when only X's data_deps change.

Let's say X has Y as data_deps. The following Ninja graph generated by the current GN doesn't rebuild W when only Y changes.

```
build X: link ... || Y
build W: run_executable X
```

Does anyone know how this issue is handled?

> you could probably get the rewriting I describe using existing GN functionality to define a new template for `executable`. Perhaps that's worth trying?

Yeah, I considered a similar idea, but with GN using a phony. A new template is interesting because it would not affect Fuchsia. I will try that.

Dirk Pranke

May 24, 2025, 12:03:40 PM

to Junji Watanabe, Roland McGrath, Aaron Wood, Andrew Grieve, gn-dev

On Thu, May 22, 2025 at 11:04 PM Junji Watanabe <jw...@google.com> wrote:

> I do not think we should make a target W that wants to run X at build time have to explicitly add a data dependency on X's data deps. That feels like a layering violation and an abstraction violation, and it seems like it would be easy to get things wrong and end up with broken build graphs.

Hmm, I understand the feeling. But order-only deps don't guarantee that W reruns when only X's data_deps change.
Let's say X has Y as data_deps. The following Ninja graph generated by the current GN doesn't rebuild W when only Y changes.

True. I'm not the biggest fan of the current implementation.

-- Dirk

Ben Boeckel

May 26, 2025, 2:26:55 AM

to Junji Watanabe, Roland McGrath, Aaron Wood, Andrew Grieve, Dirk Pranke, gn-dev

On Fri, May 23, 2025 at 15:04:22 +0900, 'Junji Watanabe' via gn-dev wrote:
> Hmm, I understand the feeling. But order-only deps don't guarantee that W
> reruns when only X's data_deps change.
> Let's say X has Y as data_deps. The following Ninja graph generated by
> the current GN doesn't rebuild W when only Y changes.
>
> ```
> build X: link ... || Y
> build W: run_executable X
> ```
>
> Does anyone know how this issue is handled?

IMO, this is best handled by *everything* reporting `deps =` information
for their execution. Here, `X` would offer dependency information along
the lines of "this path was read in a way that can materially change my
output" (basically, don't report cache file paths). Then X would report
Y as a runtime dependency that way.

`data_deps` would then be something akin to "usage requirements". I'm
not familiar with GN syntax, but something like:

```
executable X {
data_deps = Y;
}
command W {
command = X;
}
```

becomes:

```
build X: link …
build W: run X || Y
```

where the declared `data_deps` of X become order-only deps of the *use*
of X (if X doesn't report things via `deps =`, Y can be a proper
dependency, not an order-only dep). This at least seems reasonably
possible (in the long run; I can't advise on transition feasibility).
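As a rough illustration of the translation described above, here is a small Python sketch that emits Ninja-style build lines. It is only a toy generator under stated assumptions: the dict layout and the rule names (`gen`, `link`, `run`) are hypothetical, and it handles only direct dependencies. The `data_deps` declared by a target become order-only deps of the targets that use it, not of the target itself.

```python
def emit_ninja(targets):
    """Emit toy build lines; data_deps attach to the *users* of a target."""
    lines = []
    for name, target in targets.items():
        inputs = list(target.get("deps", []))
        order_only = []
        for dep in inputs:
            # Pull in the data_deps declared by each direct dependency.
            for data_dep in targets[dep].get("data_deps", []):
                if data_dep not in order_only:
                    order_only.append(data_dep)
        line = f"build {name}: {target['rule']}"
        if inputs:
            line += " " + " ".join(inputs)
        if order_only:
            line += " || " + " ".join(order_only)
        lines.append(line)
    return lines


targets = {
    "Y": {"rule": "gen"},
    "X": {"rule": "link", "data_deps": ["Y"]},
    "W": {"rule": "run", "deps": ["X"]},
}

for line in emit_ninja(targets):
    print(line)
# build Y: gen
# build X: link
# build W: run X || Y
```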

--Ben

Dirk Pranke

May 28, 2025, 1:09:45 PM

to Matt Stark, Andrew Grieve, Junji Watanabe, gn-dev, Roland McGrath

- chrome-b...@google.com, please don't cross-post to public and private lists.

Thanks, Matt, for the feedback, and welcome to working on GN! :). Comments below ...

On Mon, May 26, 2025 at 11:13 PM Matt Stark <ms...@google.com> wrote:

I'll give my two cents here, as someone newish to the chrome build team who's come from bazel build teams.

Firstly, in my opinion, order-only deps are never useful, as they inherently provide incorrect semantics for incremental builds, and don't work properly with remote execution. I don't think they provide any useful use cases that cannot be dealt with by judicious use of real dependencies.

I've never been a big fan of order-only dependencies either, but I don't think I've ever really fully looked into them to think about when they are needed and how best to handle them. For at least the basic example given in the Ninja manual, I don't believe they have incorrect incremental semantics, but I could certainly see how they might be fragile and incorrect in other situations. I would be interested in others' thoughts on this.

AFAICT, there are a few possible semantics of data_deps we could choose:

1. Targets depending on A must declare dependencies on the data dependencies of A in addition to A
2. All targets depending on A also implicitly declare a dependency on the data dependencies of A
3. All targets attempting to run A also implicitly declare a dependency on the data dependencies of A

FWIW, I agree that the first option seems like a bad idea to me due to the layering violations mentioned earlier in the thread.

To extend the previous example, here we see two ways we attempt to use A. In the first, we attempt to run the binary. This requires the binary and all its runfiles. And in the second we attempt to strip the binary. This requires only the binary itself.

executable("A") {
  data_deps = [ ":B" ]
  ...
}
action("run_A") {
  script = "run_A.py"
  deps = [ ":A" ]
  outputs = [ "$root_out_dir/run_A.stamp" ]
}

action("strip_A") {
  script = "strip_A.py"
  deps = [ ":A" ]
  outputs = [ "$root_out_dir/A.stripped" ]
}

Bazel uses the third set of semantics. Specifically, it distinguishes between a label / target, a binary, and a file. If this were bazel:
The target A would output DefaultInfo(files = ["A"], runfiles = ["B"])
The target run_A would register an action with a dependency on the executable A (ie. A's files and runfiles)
The target run_A would register an action with a dependency on the file A (and thus ignores its runfiles)
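The distinction between running a binary and merely using its file can be sketched in Python. This is a toy model, not Bazel's API: the `TargetInfo` class and `action_inputs` helper are hypothetical names that only mimic the files/runfiles split described above.

```python
from dataclasses import dataclass, field


@dataclass
class TargetInfo:
    """Hypothetical stand-in for a target's outputs and runtime files."""
    files: list
    runfiles: list = field(default_factory=list)


def action_inputs(info, executes):
    """Runfiles only matter when the action actually executes the binary."""
    return info.files + (info.runfiles if executes else [])


a = TargetInfo(files=["A"], runfiles=["B"])

print(action_inputs(a, executes=True))   # ['A', 'B'], the run_A case
print(action_inputs(a, executes=False))  # ['A'], the strip_A case
```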

I'm guessing the third bullet has a typo and you meant to write "strip_A" instead of "run_A"?

In my opinion, GN desperately needs a concept of providers, since it would solve this, and about half the problems GN currently has, but I don't really want to hijack this thread to talk about it.

I'm no expert on Bazel, so I'm not really familiar with providers. From a glance, they look similar to GN's metadata features. I agree with not hijacking threads, so perhaps you could start a new one, but I'm curious to hear more of your thoughts on this and/or whatever problems you think GN has? :).

First option:
I believe that if we choose the first option, there should be no dependency at all, so we should just remove the order-only deps, which

Second option:
If we choose the second option, I think that instead of order-only deps, we should provide a phony, to provide correct semantics while continuing to avoid B triggering a rebuild in A. Something along the lines of:

phony A_exe: A B
cc_binary A: clang -o A ...
cc_binary B: clang -o B ...
# Depends on the phony A_exe. No clue what the ninja syntax is to represent this since I'm not familiar with ninja.
root_out_dir/run_A.stamp: run_A.py

Third option:
While I'd love to do the third option, there's no way to tell just by looking at the command-line whether a binary is being run or just used, so we couldn't do so in a way that's backwards compatible. Additionally, the vast majority of the time a binary is being executed, so I don't think this will meaningfully negatively impact build performance.

I'm not wild about your second option, assuming I'm understanding it correctly. I feel like you should only have to type 'ninja A' to build A, and not 'ninja A_exe'. Nor should most things be trying to distinguish between a dependency on "A_exe" and "A". I think the template idea accomplishes what you mostly want here with a better UI and more safety; the dummy group is similar enough to a phony target, but has real dependency requirements to protect it.

I tend to agree with your thoughts on the third option. It seems like it would make the build a bit harder to maintain (or at least keep accurate) and it's not clear how big a win it would provide. I think it could be hard to know when you have a dependency on A whether or not that requires A's data deps without having to have more knowledge of A than I might tend to want a build graph to have (which is I think me saying the same thing as what I just said about your second option).

-- Dirk


Matt Stark

May 29, 2025, 12:50:53 AM

to Dirk Pranke, Andrew Grieve, Junji Watanabe, gn-dev, Roland McGrath

I've never been a big fan of order-only dependencies either, but I don't think I've ever really fully looked into them to think about when they are needed and how best to handle them. For at least the basic example given in the Ninja manual, I don't believe they have incorrect incremental semantics, but I could certainly see how they might be fragile and incorrect in other situations. I would be interested in others' thoughts on this.

After thinking about it more, I'll amend my earlier statement. Order-only deps are, in my opinion, only useful for dynamic/implicit dependencies. The difference between dynamic and implicit dependencies is that dynamic dependencies are explicit dependencies created during the execution of the build, whereas implicit dependencies are... implicit. Implicit dependencies suck because they don't play nice with RBE, and thus will fail in weird unexpected ways whenever you use RBE, which is why no build system that cares about correctness supports them.

The problem is, ninja does not provide a concept of dynamic dependencies - only implicit dependencies. Implicit dependencies suck because they don't play nice with RBE, which doesn't have access to any files not explicitly provided. Currently, we solve this in siso by attempting to turn those implicit dependencies into explicit, dynamic dependencies by reading depfiles, for example. However, doing this requires knowledge of precisely what format these depfiles use (and data dependencies don't have a depfile).

I'm guessing the third bullet has a typo and you meant to write "strip_A" instead of "run_A"?

Whoops, good catch

I'm no expert on Bazel, so I'm not really familiar with providers. From a glance, they look similar to GN's metadata features. I agree with not hijacking threads, so perhaps you could start a new one, but I'm curious to hear more of your thoughts on this and/or whatever problems you think GN has? :).

From what I could tell from looking at the documentation, metadata has a roughly similar idea, but isn't very helpful because it can only be put into files on disk, rather than read directly in your BUILD.gn. If other rules (which I think are approximately templates in GN?) could read it, it'd be more akin to bazel's providers, and would have an extremely large number of use cases (I could do some very cool things with that).

Second option:
If we choose the second option, I think that instead of order-only deps, we should provide a phony, to provide correct semantics while continuing to avoid B triggering a rebuild in A. Something along the lines of:
phony A_exe: A B
cc_binary A: clang -o A ...
cc_binary B: clang -o B ...
# Depends on the phony A_exe. No clue what the ninja syntax is to represent this since I'm not familiar with ninja.
root_out_dir/run_A.stamp: run_A.py
Third option:
While I'd love to do the third option, there's no way to tell just by looking at the command-line whether a binary is being run or just used, so we couldn't do so in a way that's backwards compatible. Additionally, the vast majority of the time a binary is being executed, so I don't think this will meaningfully negatively impact build performance.
I'm not wild about your second option, assuming I'm understanding it correctly. I feel like you should only have to type 'ninja A' to build A, and not 'ninja A_exe'. Nor should most things be trying to distinguish between a dependency on "A_exe" and "A". I think the template idea accomplishes what you mostly want here with a better UI and more safety; the dummy group is similar enough to a phony target, but has real dependency requirements to protect it.

I don't necessarily mean that precise ninja file, just the general semantics of "any time someone refers to A, they actually refer to this A_exe target". It could be something like the following instead:

phony A: A_internal B
cc_binary A_internal: clang -o A ...
cc_binary B: clang -o B ...
# Depends on the phony A. No clue what the ninja syntax is to represent this since I'm not familiar with ninja.
root_out_dir/run_A.stamp: run_A.py

And I'm not fussed on how we'd implement such semantics - changing the template for executable to something that outputs something equivalent to this would be fine by me, which I'm assuming is what you meant?
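
To make the intended semantics concrete, here is a toy Python model of the idea above. It is only a sketch: it is not Ninja's real algorithm, and the graph layout, the `a.cc`/`b.cc` source names, and the `dirty_targets` helper are all made up for illustration. It compares "A has an order-only dep on B" with "A is a phony over A_internal and B" when only B's source changes.

```python
# Toy model of Ninja-style dirtiness. Hypothetical names throughout.
# "deps" are explicit or implicit inputs (they trigger rebuilds).
# "order_only" inputs are deliberately ignored by dirty_targets():
# they affect build ordering, not whether a target is rebuilt.

ORDER_ONLY_GRAPH = {
    "A": {"deps": ["a.cc"], "order_only": ["B"]},
    "B": {"deps": ["b.cc"], "order_only": []},
    "run_A.stamp": {"deps": ["run_A.py", "A"], "order_only": []},
}

PHONY_GRAPH = {
    "A_internal": {"deps": ["a.cc"], "order_only": []},
    "B": {"deps": ["b.cc"], "order_only": []},
    # Phony alias: "A" means the binary plus its runtime deps.
    "A": {"deps": ["A_internal", "B"], "order_only": []},
    "run_A.stamp": {"deps": ["run_A.py", "A"], "order_only": []},
}


def dirty_targets(graph, changed_sources):
    """Return the set of targets that must rebuild when the sources change."""
    dirty = set()

    def is_dirty(node):
        if node not in graph:  # a source file, not a build target
            return node in changed_sources
        if node in dirty:
            return True
        if any(is_dirty(dep) for dep in graph[node]["deps"]):
            dirty.add(node)
            return True
        return False

    for target in graph:
        is_dirty(target)
    return dirty
```

In this model, `dirty_targets(ORDER_ONLY_GRAPH, {"b.cc"})` only rebuilds B, so the action that runs A is not re-run even though its runtime dependency changed. With `PHONY_GRAPH`, B, the phony A, and `run_A.stamp` are all dirty, while `A_internal` is not relinked.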

--

Thanks, Matt.

Matt Stark

May 29, 2025, 12:57:30 AM

to Dirk Pranke, Andrew Grieve, Junji Watanabe, gn-dev, Roland McGrath

Edit: I've been informed that apparently ninja does have dynamic dependencies. Might be a good idea to just remove all order-only deps and implicit deps and replace them with dynamic deps.

--

Thanks, Matt.

Andrew Grieve

May 29, 2025, 1:42:15 PM

to Matt Stark, Dirk Pranke, Junji Watanabe, gn-dev, Roland McGrath

On Thu, May 29, 2025 at 12:57 AM Matt Stark <ms...@google.com> wrote:

Edit: I've been informed that apparently ninja does have dynamic dependencies. Might be a good idea to just remove all order-only deps and implicit deps and replace them with dynamic deps.

On Thu, May 29, 2025 at 2:50 PM Matt Stark <ms...@google.com> wrote:
I've never been a big fan of order-only dependencies either, but I don't think I've ever really fully looked into them to think about when they are needed and how best to handle them. For at least the basic example given in the Ninja manual, I don't believe they have incorrect incremental semantics, but I could certainly see how they might be fragile and incorrect in other situations. I would be interested in others' thoughts on this.

After thinking about it more, I'll amend my earlier statement. Order-only deps are, in my opinion, only useful for dynamic/implicit dependencies. The difference between dynamic and implicit dependencies is that dynamic dependencies are explicit dependencies created during the execution of the build, whereas implicit dependencies are... implicit. Implicit dependencies suck because they don't play nice with RBE, and thus will fail in weird, unexpected ways whenever you use RBE, which is why no build system that cares about correctness supports them.

The problem is, ninja does not provide a concept of dynamic dependencies - only implicit dependencies. Implicit dependencies suck because they don't play nice with RBE, which doesn't have access to any files not explicitly provided. Currently, we solve this in siso by attempting to turn those implicit dependencies into explicit, dynamic dependencies, for example by reading depfiles. However, doing this requires knowledge of precisely what format these depfiles use (and data dependencies don't have a depfile).
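
For readers unfamiliar with depfiles, the "reading depfiles" step can be sketched roughly like this. This is not siso's implementation; `parse_depfile` is a hypothetical helper that assumes the common Makefile-style format emitted by compilers (`output: input input ...` with backslash continuations), one output per rule, and it does not handle every Makefile escape.

```python
import re


def parse_depfile(text):
    """Parse a Makefile-style depfile into {output: [inputs]}.

    Handles backslash-newline continuations and backslash-escaped
    spaces. Assumes a single output per rule.
    """
    # Join continuation lines into one logical line.
    text = re.sub(r"\\\r?\n", " ", text)
    deps_by_output = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = re.match(r"(.+?):(?:\s+(.*))?$", line)
        if match is None:
            continue  # not a rule line; ignored in this sketch
        output, rest = match.group(1), match.group(2) or ""
        # Split on whitespace that is not escaped with a backslash.
        tokens = [t for t in re.split(r"(?<!\\)\s+", rest) if t]
        inputs = [t.replace("\\ ", " ") for t in tokens]
        deps_by_output.setdefault(output, []).extend(inputs)
    return deps_by_output
```

The returned inputs are the files a remote executor would need to upload alongside the explicitly declared ones. The point about format knowledge shows up directly: this parser only works because it hardcodes one particular depfile dialect.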

I'm guessing the third bullet has a typo and you meant to write "strip_A" instead of "run_A"?

Whoops, good catch

I'm no expert on Bazel, so I'm not really familiar with providers. From a glance, they look similar to GN's metadata features. I agree with not hijacking threads, so perhaps you could start a new one, but I'm curious to hear more of your thoughts on this and/or whatever problems you think GN has? :).

From what I could tell from looking at the documentation, metadata has a roughly similar idea, but isn't very helpful because it can only be put into files on disk, rather than read directly in your BUILD.gn. If other rules (which I think are approximately templates in GN?) could read it, it'd be more akin to bazel's providers, and would have an extremely large number of use cases (I could do some very cool things with that).

Probably something along the lines of https://groups.google.com/a/chromium.org/g/gn-dev/c/sZ3XlDd5BKU/m/TApnYDJvBQAJ

Some helpful context also in: https://groups.google.com/a/chromium.org/g/gn-dev/c/qVktGj6t2_M/m/UAP4XFXjBAAJ

David Turner

May 30, 2025, 4:06:32 AM

to Matt Stark, Dirk Pranke, Andrew Grieve, Junji Watanabe, gn-dev, Roland McGrath

On Thu, May 29, 2025 at 6:50 AM 'Matt Stark' via gn-dev <gn-...@chromium.org> wrote:

The problem is, ninja does not provide a concept of dynamic dependencies - only implicit dependencies. Implicit dependencies suck because they don't play nice with RBE, which doesn't have access to any files not explicitly provided. Currently, we solve this in siso by attempting to turn those implicit dependencies into explicit, dynamic dependencies, for example by reading depfiles. However, doing this requires knowledge of precisely what format these depfiles use (and data dependencies don't have a depfile).

Just as a point of clarification, Ninja does support dynamic dependencies since 1.10, but GN does not emit anything like that, and doesn't have any syntax or semantics to support them.
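
As an illustration of what a dyndep file looks like, here is a small Python sketch that renders one. The `format_dyndep` and `escape_path` helpers and the file names are hypothetical, and nothing here reflects what GN emits (it emits no dyndep information today). The output follows the dyndep file format from the Ninja manual: a `ninja_dyndep_version = 1` line followed by `build <output>: dyndep | <implicit inputs>` statements.

```python
def escape_path(path):
    """Escape a path for use in a Ninja build statement."""
    return path.replace("$", "$$").replace(" ", "$ ").replace(":", "$:")


def format_dyndep(extra_inputs_by_output):
    """Render a dyndep file adding implicit inputs to existing outputs.

    The outputs must already be declared in build.ninja by a build
    statement that names this file in its `dyndep` binding, e.g.:

        build out.stamp: some_rule manifest.json || out.stamp.dd
          dyndep = out.stamp.dd
    """
    lines = ["ninja_dyndep_version = 1"]
    for output, inputs in sorted(extra_inputs_by_output.items()):
        statement = f"build {escape_path(output)}: dyndep"
        if inputs:
            statement += " | " + " ".join(escape_path(p) for p in inputs)
        lines.append(statement)
    return "\n".join(lines) + "\n"
```

For example, `format_dyndep({"out.stamp": ["a.json", "dir/b c.txt"]})` yields a two-line file whose second line is `build out.stamp: dyndep | a.json dir/b$ c.txt`. Ninja loads such a file during the build, once the action that generates it has run.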

For Fuchsia we came up with an `action()` wrapper in BUILDCONFIG.gn that allows these targets to provide another file, script or even action target to generate a list of extra inputs, which are then used to run the parent action remotely with the right set of files. This is useful, for example, when a tool parses a top-level JSON manifest file and then reads the other files listed in it.

There are also support templates like hermetic_inputs_action(). Here's an example usage for building Go packages.

We use the generated hermetic_inputs files for three things:

- To verify hermeticity locally with fsatrace (our build can trace each command to verify it only reads and writes its declared inputs and outputs).
- To run the parent action remotely, because the hermetic_inputs_file provides the list of all required extra implicit inputs.
- To generate a Ninja depfile, so that the action script / tool doesn't need to.
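
The first and third uses in the list above can be sketched in a few lines of Python. This is not Fuchsia's code: `depfile_from_inputs` and `undeclared_reads` are hypothetical helpers, and the sketch assumes the hermetic inputs file is plain text with one path per line and that the traced reads are already available as a list of paths.

```python
def read_input_list(hermetic_inputs_text):
    """Return the non-empty, stripped lines of a one-path-per-line file."""
    return [ln.strip() for ln in hermetic_inputs_text.splitlines() if ln.strip()]


def depfile_from_inputs(output, hermetic_inputs_text):
    """Render a Makefile-style depfile line for `output`."""
    paths = read_input_list(hermetic_inputs_text)
    # Spaces are backslash-escaped, as in compiler-generated depfiles.
    words = [output.replace(" ", "\\ ") + ":"]
    words += [p.replace(" ", "\\ ") for p in paths]
    return " ".join(words) + "\n"


def undeclared_reads(traced_reads, declared_inputs):
    """Return traced reads missing from the declared inputs, sorted."""
    return sorted(set(traced_reads) - set(declared_inputs))
```

A non-empty result from `undeclared_reads` would mean the action read a file it never declared, which is exactly the case that breaks remote execution.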

This only works for action() targets though; for C++ and Rust, we have to use other ugly tricks.

It would be nice to have something like that in GN, but we're happy with what we have.

From what I could tell from looking at the documentation, metadata has a roughly similar idea, but isn't very helpful because it can only be put into files on disk, rather than read directly in your BUILD.gn. If other rules (which I think are approximately templates in GN?) could read it, it'd be more akin to bazel's providers, and would have an extremely large number of use cases (I could do some very cool things with that).

FWIW, I have worked a lot on GN internals in the past years and looked into this specific problem. The summary is that adding that type of dependency -> dependent propagation would require a major refactoring of GN's internal design, which is optimized to evaluate BUILD.gn and .gni files in parallel and in any order, and may even make `gn gen` significantly slower. Tools like Bazel or BUCK can deal with that because they do not have to parse the entire build graph on each invocation. I'd be happy to discuss technical details in another thread though.

--
Thanks, Matt.
