Concepts/cookbook ideas for asset conversion


Oliver Smith

Jan 22, 2025, 10:31:56 PM
to ninja-build
For the last few weeks I've been trying to familiarize myself with Ninja, either as a holy grail or at least a rosetta stone, for a significant refactoring of part of our engine's asset conversion pipeline.

On some levels it seems very close to being a great use-case for Ninja, bar a couple of dynamic twists that I haven't sunk my teeth into Ninja deeply enough to reason thru.

I'm hoping for some research pointers or references to extant examples that might help me figure out whether this is workable or not.

In a sense our project is like a multi-language build, you just need Maya instead of an IDE for most of the translation units :)

Our users would be going thru our tooling in a "cmake --build" like scenario, but our no-op turnaround needs to be super fast (<1s), and I have an inkling that while Ninja supports generator-triggering, it doesn't really "integrate" with the generator.

Win/Lin/Mac.

4 main inflection points:

1. DSL files for associating assets with tooling, probably not that different from being a cmake generator.
2. optional config files: EFINAE: existence failure is not an error, but a change in existence may affect compiler flags.
3. dynamic target addition (two levels).
4. dynamic command argument generation.

For the first, I suspect I should generate some small CMake Ninja builds and look at those, but I'd like our tool to be able to do something along the lines of:

while running_ninja() is not ok:
  generate_build()

I don't mind if it takes a little time to reach a steady state; our users run the tooling a lot with only minor local changes to a very large asset tree most of the time, but they also run it without any changes most of the time :)

#1 I suspect I just need to look closely at cmake's approach around --build.

#2 If "foo.cpp" will build as "-std=c++98" unless you add a "foo.ini" with "-std=c++26". If foo.ini comes into existence later, I have to regenerate the build line for foo.cpp. Unsure how to reference the existence of a file in ninja rules.

#3 and #4: ".mesh" -> ".shadegraph" -> texture.

This is hard to explain: a mesh may add several build edges, shadegraphs, which in turn may add several build edges, textures. The properties of the mesh may affect flags used to process shadegraphs, and the shadegraphs referenced by a mesh may collectively affect flags used to process the mesh:

```
mesh_attrs, shaders = mesh_tool(mode="analyze", input=mesh_file)
mesh_shader_args = get_shader_args_for(mesh_attrs)

all_shader_features, textures = set(), set()
for shader in shaders:
    features, shader_textures = shader_tool(mode="analyze", input=shader, extra_args=mesh_shader_args)
    all_shader_features |= features
    textures |= shader_textures

add_build_edge(mesh_tool, mesh_artifact, mesh_file, enable_features=all_shader_features)

for shader in shaders:
    add_build_edge(shader_tool, artifact_for(shader), shader, extra_args=mesh_shader_args)

for texture in textures:
    add_build_edge(texture_tool, artifact_for(texture), texture, soft_depends=texture.with_extension(".ini"))
```

It seems like this is a bootstrap issue, where on first run I'd generate my build.ninja without these shader/texture references, just the mesh analysis & convert builds.

I thought perhaps they could generate response files, but they need to actually inject shader analysis and shader convert rules into the build so those assets get processed, and then those need to inject the texture rules into the build.

I suppose we could add generator rules but analyzing all meshes or all shaders would be expensive, while the generator overhead to retrigger it per mesh and/or shader would be heinous. 

Which leads me to a sense that I've either missed something ridiculously basic or I've reached the Ouroboros point where I'm just going to have to continue doing our own cache/state management for the mesh/shader/texture debacle.

I did consider treating each mesh as a mini ninja-build of its own, but that turned a 180ms no-op for the project into 3 minutes (Windows file-handle operations, not file I/O, are terrible: https://github.com/kfsone/filebench, and we have more than 2 meshes).

Appreciate any guidance/pointers.

-Oliver

David Turner

Jan 23, 2025, 12:52:23 PM
to Oliver Smith, ninja-build
It is hard to get a clear picture of what you need from the description you gave, but Ninja is designed around a number of constraints that seem to conflict with your requirements:
  • No dynamic command arguments: All commands executed by Ninja come from expanding a rule with fixed arguments that themselves come from the build statement that invoked it.
  • No dynamic command creation: While depfiles and dyndeps allow adding new paths to the build graph, these must always be connected to existing build nodes (commands) from the static build plan.
  • No "existence" support: A missing input file (i.e. a path that appears in the Ninja build graph that is not generated by a build rule) is an error, period.
All these are pretty fundamental to the way Ninja works, and changing any of these points would be detrimental to its speed, or would break some workflows.
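
For concreteness, here is a minimal sketch of the second point (tool and file names are hypothetical). A dyndep file can attach newly discovered inputs or outputs, but only to a build edge that already exists in build.ninja:

```
# build.ninja
rule scan
  command = scanner $in > $out

rule convert
  command = converter $in -o $out

# the edge for out.bin must already be declared; the dyndep file can only
# decorate it with extra implicit inputs/outputs discovered at build time
build out.dd: scan asset.src
build out.bin: convert asset.src || out.dd
  dyndep = out.dd
```

and the dyndep file produced by the scanner can only say things like:

```
ninja_dyndep_version = 1
build out.bin: dyndep | discovered_input.tex
```

It cannot introduce a brand-new build edge for discovered_input.tex; that still requires regenerating build.ninja.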

You probably need a wrapper tool that can handle these points, and only regenerates a build plan when necessary before invoking Ninja. In other words, invoking Ninja directly cannot work without appropriate pre-checks for the problem you have, on each incremental build.



Evan Martin

Jan 23, 2025, 3:25:51 PM
to ninja-build
At a high level, anything that changes the specific commands you want Ninja to execute or their relative order will require updating the build.ninja file.

So I think your problems all come down to making it efficient when you want to regenerate the file again.

I didn't follow your pseudocode but it sounds like any changes to a mesh file will need to trigger things.
It might help if you try to think about the problem of updates more incrementally.  For example, you might have:
1) for each mesh, a matching build step analyzes it and dumps out a metadata file containing the information relevant to generating the build file
2) a single build step that collects those metadata files and regenerates the main build.ninja
After a full build, the next time you edit a single mesh file, part 1 only executes a single time for that file, and part 2 joins all the metadata files together.
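
A rough sketch of what that shape could look like in build.ninja (the tool names, flags, mesh files, and generate.py script here are hypothetical):

```
rule analyze_mesh
  command = mesh_tool --mode=analyze --input $in --output $out

rule regen
  command = python generate.py --metadata-dir meta --out $out
  generator = 1

# phase 1: one cheap analysis step per mesh, producing a small metadata file
build meta/robot.mesh.meta: analyze_mesh assets/robot.mesh
build meta/crate.mesh.meta: analyze_mesh assets/crate.mesh

# phase 2: a single step that joins the metadata and rewrites build.ninja
build build.ninja: regen meta/robot.mesh.meta meta/crate.mesh.meta
```

Since build.ninja is itself an output of a build edge, Ninja rebuilds it first when it is out of date and reloads it before building anything else; `generator = 1` additionally keeps the regenerated manifest from being removed by `ninja -t clean`. This is the same mechanism CMake-generated builds rely on.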


Evan Martin

Jan 23, 2025, 3:29:11 PM
to ninja-build
On Thursday, January 23, 2025 at 9:52:23 AM UTC-8 di...@google.com wrote:
  • No "existence" support: A missing input file (i.e. a path that appears in the Ninja build graph that is not generated by a build rule) is an error, period.
I'm not sure this is true.  I believe you can use 'phony' rules to mark files as optional.  It's in the manual here, see "dummy targets":
https://ninja-build.org/manual.html#_the_literal_phony_literal_rule

I tried this:

rule wc
    command = wc -l $in > $out

build b: phony  # marks b as phony
build demo: wc a b

This fails to build if the file 'a' is missing, but if you 'touch a' then run you'll see it executes the 'wc' command (which then fails because 'b' is missing).

Evan Martin

Jan 23, 2025, 3:30:02 PM
to ninja-build
build b: phony  # marks b as phony

Errr, I meant "marks b as optional".

(Also, I am not certain this behavior works as I describe, please correct me if I'm wrong!) 

Oliver Smith

Jan 23, 2025, 4:11:27 PM
to ninja-build
On Thursday, January 23, 2025 at 9:52:23 AM UTC-8 David Turner wrote:
It is hard to get a clear picture of what you need from the description you gave, but Ninja is designed around a number of constraints that seem to conflict with your requirements:
  • No dynamic command arguments: All commands executed by Ninja come from expanding a rule with fixed arguments that themselves come from the build statement that invoked it.
  • No dynamic command creation: While depfiles and dyndeps allow adding new paths to the build graph, these must always be connected to existing build nodes (commands) from the static build plan.
  • No "existence" support: A missing input file (i.e. a path that appears in the Ninja build graph that is not generated by a build rule) is an error, period.
All these are pretty fundamental to the way Ninja works, and changing any of these points would be detrimental to its speed, or would break some workflows.

You probably need a wrapper tool that can handle these points, and only regenerates a build plan when necessary before invoking Ninja. In other words, invoking Ninja directly cannot work without appropriate pre-checks for the problem you have, on each incremental build.

Absolutely a given, hence my line: "Our users would be going thru our tooling in a "cmake --build" like scenario, but our no-op turnaround needs to be super fast (<1s), and I have an inkling that while Ninja supports generator-triggering, it doesn't really "integrate" with the generator."

Rather, I'm hoping for guidance or references on possible techniques that make it practical for a generator to iterate on a build script. To my mind, bootstrap is just incrementing from 0, but what I don't want to have to write is a complex, complete, comprehensive counter-Ninja that reverse engineers the product of executing my current build script so that it can deduce what additions to make.

We're going to have to always regenerate at least some portion of the build script anyway - since our top-level DSL uses wildstar globs for asset references. Which puts me on the cusp of having to re-run all those pesky inspection sub-processes (which aren't cheap) every run.

I'm not asking "what flag do I set so ninja does this for me" but for generator-authoring guidance: look at how cmake does X, use this technique of chaining three rules to signal this factoid to yourself, ...


-Oliver

Oliver Smith

Jan 23, 2025, 4:54:16 PM
to ninja-build
On Thursday, January 23, 2025 at 12:25:51 PM UTC-8 Evan Martin wrote:
At a high level, anything that changes the specific commands you want Ninja to execute or their relative order will require updating the build.ninja file.

So I think your problems all come down to making it efficient when you want to regenerate the file again.

Aye: the two extremes I'd like to avoid are (a) just always regenerating the entire script, including all those dep-generation steps, and (b) mimicking Ninja's logic to identify or reverse engineer why Ninja is saying the build is stale.
 
I didn't follow your pseudocode but it sounds like any changes to a mesh file will need to trigger things.

All else aside, just attempting to hand-craft a ninja build to mimic a single run of the pipeline was stunningly helpful in detecting ancient, vestigial, and sometimes baffling twists in the pipeline. Discovering the path to our tooling making sense is part of this journey for me.

It might help if you try to think about the problem of updates more incrementally.  For example, you might have:
1) for each mesh, a matching build step analyzes it and dumps out a metadata file containing the information relevant to generating the build file
2) a single build step that collects those metadata files and regenerates the main build.ninja
After a full build, the next time you edit a single mesh file, part 1 only executes a single time for that file, and part 2 joins all the metadata files together.

This was definitely in the collection of possibilities, though I wasn't sure which were best to pursue. I would like to have my cake and eat it: where I give Ninja a predicate for correctness/staleness, I'd like to have access to Ninja's findings/conclusions beyond a trivial bool.

In ~14 days learning/experimenting with Ninja, I've not found a safe-looking route to that sort of incremental or self-assembling approach (maybe I don't understand restat/generator features sufficiently yet, or features the manual mentions in passing but doesn't actually document) so I'm hoping for research/investigation recommendations, really. 

-Oliver

Eli Schwartz

Jan 23, 2025, 5:39:49 PM
to ninja...@googlegroups.com
On 1/23/25 3:29 PM, Evan Martin wrote:
> On Thursday, January 23, 2025 at 9:52:23 AM UTC-8 di...@google.com wrote:
>
>
> - *No "existence" support*: A missing input file (i.e. a path that
> appears in the Ninja build graph that is not generated by a build rule) is
> an error, period.
>
> I'm not sure this is true. I believe you can use 'phony' rules to mark
> files as optional. It's in the manual here, see "dummy targets":
> https://ninja-build.org/manual.html#_the_literal_phony_literal_rule
>
> I tried this:
>
> rule wc
> command = wc -l $in > $out
>
> build b: phony # marks b as phony
> build demo: wc a b
>
> This fails to build if the file 'a' is missing, but if you 'touch a' then
> run you'll see it executes the 'wc' command (which then fails because 'b'
> is missing).


That is not what is being asked about. Also it doesn't really work as
expected. Since b is a phony rule, wc will be rerun every time you
invoke ninja.

Going back to the original request...

> 2. *optional* config files: EFINAE: existence failure is not an error,
> but a change in existence may affect compiler flags,


The idea is that when you first run the command, it doesn't matter
whether b exists, the command runs successfully (unlike wc, it doesn't
consider the file not existing to be a problem). And if you run ninja
again, it will say everything is up to date.

If you *create* b, then now demo is stale and will be rebuilt. And then
if you delete b again, then demo is stale again and will be rebuilt again.

For a better idea of how and why this is actually supposed to operate,
consider this case. It's a contrived case and I'm not convinced it makes
sense to support, but there is at least one dependency graph runner (not
ninja, not make) that considers it vital and also thinks both make and
ninja suck for not supporting it.

Here's a build.ninja:

rule c_COMPILER
command = gcc $ARGS -MD -MQ $out -MF $DEPFILE -o $out -c $in
deps = gcc
depfile = $DEPFILE_UNQUOTED
description = Compiling C object $out

rule c_LINKER
command = gcc $ARGS -o $out $in $LINK_ARGS
description = Linking target $out


build prog: c_LINKER prog.p/prog.c.o
LINK_ARGS = -Wl,--as-needed -lmylib

build prog.p/prog.c.o: c_COMPILER ../prog.c
DEPFILE = prog.p/prog.c.o.d
DEPFILE_UNQUOTED = prog.p/prog.c.o.d
ARGS = -I. -I../


Now, prog.c exists in your source tree, and it does #include "mylib.h"
which is installed on your system in /usr/include (and -lmylib is
unsurprisingly in /usr/lib64, linked to libmylib.so.1, ABI version 1). If
you compile:

ninja prog.p/prog.c.o

You'll get an object file that is based on the header functionality
defined in /usr/include/mylib.h

Plot twist. Don't link prog yet. First, go download a new version of
"mylib", and sudo make install it into /usr/local.


*Now* go ahead and finish building your project. Your object file still
uses the types from /usr/include/mylib.h, but it links to
/usr/local/lib64/libmylib.so -> libmylib.so.2


The universe now explodes. Or at least your binary does. If you do a
clean, then recompile, the object file would use
/usr/local/include/mylib.h and all would be well.


The theory here is that a build system which supports existence checks
can take a rule like this:


build prog.p/prog.c.o: c_COMPILER ../prog.c

and declare, possibly via a depfile, that it has a header dependency
which looks like this:


prog.p/prog.c.o: /usr/include/mylib.h !/usr/local/include/mylib.h


Really, it would define a "not" rule for every possible search path
permutation of any header file it utilizes. Then if you ever create any
of those files, ninja can say "well, obviously the build is now stale,
because if I recompile I'll get something different from last time".

It is technically true that there is a difference here. In practice
that's not how people put together software (and that's why effectively
nobody uses the dependency graph runner in question). Of course, people
can come up with lots of reasons why existence checks should maybe
matter that don't have to do with changing your dependency installation
locations, but I'm not sure I'm convinced any of them are really any
better. In order for "existence checks" to be useful, you have to know
in advance you're going to create a file in a specific location in the
future...


--
Eli Schwartz

David Turner

Jan 24, 2025, 10:38:38 AM
to Evan Martin, ninja-build

Oh yes, you're right, thanks for mentioning this.

(technically Ninja considers b an output of a phony node, and not a pure input, but that's an implementation detail :-))


Evan Martin

Jan 24, 2025, 2:19:52 PM
to Eli Schwartz, ninja...@googlegroups.com
On Thu, Jan 23, 2025 at 2:39 PM Eli Schwartz <eschw...@gmail.com> wrote:
> 2. *optional* config files: EFINAE: existence failure is not an error,
> but a change in existence may affect compiler flags,

The idea is that when you first run the command, it doesn't matter
whether b exists, the command runs successfully (unlike wc, it doesn't
consider the file not existing to be a problem). And if you run ninja
again, it will say everything is up to date.

If you *create* b, then now demo is stale and will be rebuilt. And then
if you delete b again, then demo is stale again and will be rebuilt again.

Ah yes, I misunderstood.  Thinking about it more, I think the underlying issue is that if we had some mechanism for saying "the output X might use the input Y, which is allowed to be missing", then in between Ninja invocations you need some way of representing the difference between:

1) X is present, Y is missing, but you built X in the past and that was fine and you shouldn't rebuild anymore
2) X is present, Y is missing, but you built X in the past using Y and you should now rebuild X to reflect the removal of Y

This is related to what depfiles capture ("which files were actually used to build X?") but they have different behavior around missing files.

This is a pretty interesting case, I will think about it more!


Back to Oliver's original problem (foo.cc's build flags change depending on optionally present foo.ini), I think cases like this can sometimes be worked around via additional levels of intermediate build steps.

For example maybe something like:
  build foo.ini: phony
  build foo.cc.flags: check-if-exists foo.ini
  build build.ninja: ... foo.cc.flags
where the "check-for" rule actually looks on disk for whether foo.ini is present.

Or even:
  rule do-find
     command = find $srcdir -name "*.ini" > $out
  build all-ini-files: do-find
  build build.ninja: ... all-ini-files

These will unfortunately be invoked on every build, but per your requirements ultimately something needs to do that check on every build.  Depending on your project it may not be too costly to do it as an external program.
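
One refinement worth trying (a sketch, POSIX-shell flavored and untested): if the check command only rewrites its output when the answer actually changes, adding `restat = 1` should keep the expensive downstream regeneration from re-running on every build, even though the check itself still runs each time:

```
rule check-if-exists
  # write the answer to a temp file and only replace $out when it differs,
  # leaving $out's mtime untouched when nothing changed
  command = ( [ -e "$in" ] && echo present || echo absent ) > ${out}.tmp; cmp -s ${out}.tmp $out && rm ${out}.tmp || mv ${out}.tmp $out
  restat = 1
```

With restat, Ninja re-stats the output after the command runs and, if its mtime did not change, treats the downstream edges as still up to date.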


[Eli's example elided]
 
Really, it would define a "not" rule for every possible search path
permutation of any header file it utilizes. Then if you ever create any
of those files, ninja can say "well, obviously the build is now stale,
because if I recompile I'll get something different from last time".

A simpler demonstration of what I think is the same situation is if you have a C file containing
#include "foo.h"
and you
1) build once, so your depfile now specifies which foo.h you used
2) create another 'foo.h' that appears earlier in your search path, e.g. your project has multiple -Idirectory_a -Idirectory_b search paths
then Ninja will fail to rebuild the next time you invoke it (unaware that you now want a different foo.h) until you build clean.

Unfortunately if you look at the output
$ touch foo.cc; clang -v foo.cc
you'll see even in a project without its own -I paths there are a bunch of directories that a compiler may search.
And further it wouldn't even be sufficient to just monitor those directories for changes, as you can #include "subdir/path.h" too.

It is technically true that there is a difference here. In practice
that's not how people put together software (and that's why effectively
nobody uses the dependency graph runner in question). 
 
In the limit tools can do all sorts of wacky traversals of the file system looking for inputs, and I think the only total solution is sandboxing as found in bazel and tup.  As you observe though, there's unfortunately a tradeoff between correctness and a build tool that people are willing to adopt.

Eli Schwartz

Jan 24, 2025, 3:18:18 PM
to Evan Martin, ninja...@googlegroups.com
On 1/24/25 2:19 PM, Evan Martin wrote:
> [Eli's example elided]
>
>
>> Really, it would define a "not" rule for every possible search path
>> permutation of any header file it utilizes. Then if you ever create any
>> of those files, ninja can say "well, obviously the build is now stale,
>> because if I recompile I'll get something different from last time".
>>
>
> A simpler demonstration of what I think is the same situation is if you
> have a C file containing
> #include "foo.h"
> and you
> 1) build once, so your depfile now specifies which foo.h you used
> 2) create another 'foo.h' that appears earlier in your search path, e.g.
> your project has multiple -Idirectory_a -Idirectory_b search paths
> then Ninja will fail to rebuild the next time you invoke it (unaware that
> you now want a different foo.h) until you build clean.


In "probably most" cases, I would say that this only happens when the
old foo.h also gets deleted, which means that rebuild *is* triggered
(because the old foo.h is an explicit depfile dependency).


> Unfortunately if you look at the output
> $ touch foo.cc; clang -v foo.cc
> you'll see even in a project without its own -I paths there are a bunch of
> directories that a compiler may search.
> And further it wouldn't even be sufficient to just monitor those
> directories for changes, as you can #include "subdir/path.h" too.


Yup, my feeling is that essentially everyone already does this.


> It is technically true that there is a difference here. In practice
>> that's not how people put together software (and that's why effectively
>> nobody uses the dependency graph runner in question).
>>
>
> In the limit tools can do all sorts of wacky traversals of the file system
> looking for inputs, and I think the only total solution is sandboxing as
> found in bazel and tup. As you observe though, there's unfortunately a
> tradeoff between correctness and a build tool that people are willing to
> adopt.


My understanding of bazel is it solves this by requiring you to specify
exactly which files you depend on without any use of depfiles or dyndeps
or "existence deps" and then enforcing that via sandboxing. Since you
can also specify exactly which files you depend on by namespacing your
source root and specifying exact `#include "subdir/foo.h"` style
includes, the tradeoff here is essentially that bazel:


- can prove via contract that no wacky traversals are allowed to exist

- requires you to define your dependencies in two places, once in the
*.cpp files and once in the bazel files


With Ninja (and Make) you only need to specify dependencies once, in the source files, and if
you want to know that no wacky traversals exist, "simply use the
superior include style", combined with "and don't ever use /usr/local".


--
Eli Schwartz

Oliver Smith

Jan 25, 2025, 5:20:06 AM
to ninja-build
On Thursday, January 23, 2025 at 2:39:49 PM UTC-8 Eli Schwartz wrote:
For a better idea of how and why this is actually supposed to operate,
consider this case. It's a contrived case and I'm not convinced it makes
sense to support, but there is at least one dependency graph runner (not
ninja, not make) that considers it vital and also thinks both make and
ninja suck for not supporting it.
 
Or, perhaps equally, not something ninja should support:

```
#if __cplusplus >= 201703L && __has_include(<opengl/gl.h>)
# include <opengl/gl.h>
# ...enable opengl stuff ... 
#else
# ... don't ...
#endif
```


Oliver Smith

Jan 25, 2025, 5:52:48 AM
to ninja-build
On Friday, January 24, 2025 at 11:19:52 AM UTC-8 Evan Martin wrote:
On Thu, Jan 23, 2025 at 2:39 PM Eli Schwartz <eschw...@gmail.com> wrote:
> 2. *optional* config files: EFINAE: existence failure is not an error,
> but a change in existence may affect compiler flags,

The idea is that when you first run the command, it doesn't matter
whether b exists, the command runs successfully (unlike wc, it doesn't
consider the file not existing to be a problem). And if you run ninja
again, it will say everything is up to date.

If you *create* b, then now demo is stale and will be rebuilt. And then
if you delete b again, then demo is stale again and will be rebuilt again.

Ah yes, I misunderstood.  Thinking about it more, I think the underlying issue is that if we had some mechanism for saying "the output X might use the input Y, which is allowed to be missing", then in between Ninja invocations you need some way of representing the difference between:

1) X is present, Y is missing, but you built X in the past and that was fine and you shouldn't rebuild anymore
2) X is present, Y is missing, but you built X in the past using Y and you should now rebuild X to reflect the removal of Y

This is related to what depfiles capture ("which files were actually used to build X?") but they have different behavior around missing files.

This is a pretty interesting case, I will think about it more!
 
I do anticipate it being something my generator/tooling would have to tackle; maybe I've played too much Factorio/ONI, but it still feels like something that a person could coerce into the ninja model to some degree. Given that my current codebase/would-be generator is in Python, the more staleness detection I can farm out to Ninja, even if it's a bit circuitous and stat()y, the better.

I also want to gauge how much state-tracking will need to be replicated, essentially, by the tool itself. My mental model for how I would implement this from the ground up if I were writing it in C, C++, or Go, would be to roll out the maximal amount of the build-graph (or build script) that I have data for, then resolve the graph. The first resolve would ast-gen the dsl files; second would match the wildstar patterns to the filesystem, generate the top-level 'convert' edges and the 2nd-level 'analyze' edges, and then continue resolving until stability. 

From the perspective of writing a ninja-driver+generator, that means instrumenting some way for my build-script to advise the generator what edges/nodes it needs to regenerate for (again, I'm not trying to write a general purpose multi-system driver).

Back to Oliver's original problem (foo.cc's build flags change depending on optionally present foo.ini), I think cases like this can sometimes be worked around via additional levels of intermediate build steps.

For example maybe something like:
  build foo.ini: phony
  build foo.cc.flags: check-if-exists foo.ini
  build build.ninja: ... foo.cc.flags
where the "check-for" rule actually looks on disk for whether foo.ini is present.

Or even:
  rule do-find
     command = find $srcdir -name "*.ini" > $out
  build all-ini-files: do-find
  build build.ninja: ... all-ini-files
 
This is the sort of thing I've imagined. I'd like our tool to be able to run ninja early, so I only set about looking for problems from Python once Ninja has speedily detected staleness.

But how far to take that, then? E.g. I could just generate a file (or hash) of all the filenames/paths so I know when to re-match the wildstar globs in the DSL, and also to know when I need to go re-check my INIs. Then I circle back to having a 1s no-op budget, and realize I might not want to blanket capture & test all that data as part of the pre-ninja workload...

-Oliver