[ANN] bang: A ninja file generator scriptable in Lua

Christophe D.

Oct 1, 2023, 11:42:59 PM
to ninja-build
Hello,

Bang[1] is a ninja file generator scriptable in LuaX[2], a Lua interpreter with
a bunch of useful modules (file management, functional programming module, basic
cryptography, ...). It takes a build description (a LuaX script) and generates
a Ninja file.

Bang provides functions to generate ninja primitives (variables, rules, build
statements, ...) and some extra features:

- rule/build statement pairs described in a single function call
- file listing and filenames list management using LuaX modules (e.g. F[3] and
  fs[4])
- pipe simulation using rule composition
- "clean", "install" and "help" targets

Bang comes with an example[6] that shows how to use bang and LuaX functions to:

- discover source files actually present in the repository: no redundant hard
  coded file lists (redundancy means painful maintenance)
- cross-compile the same sources for multiple platforms: compilation for several
  platforms without any dirty copy/paste
- describe static libraries: in the `lib` directory, each sub-directory is
  a library compiled and archived in its own `.a` file
- describe executables: in the `bin` directory, each C source file is the main
  file of a binary containing this C file as well as libraries from the `lib`
  directory.
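
A minimal sketch of the first two points (source discovery and one set of build statements per platform). It is written in Python as a stand-in and is not bang's actual LuaX API; the function name `ninja_for_sources`, the `targets` mapping, the `build/<target>/` output layout and the compiler names are all assumptions made for illustration.

```python
from pathlib import Path


def ninja_for_sources(src_dir, targets):
    """Return ninja text compiling every .c file under src_dir once per target.

    `targets` maps a (hypothetical) platform name to the compiler to use.
    Paths are emitted as-is: spaces and colons are NOT escaped in this sketch.
    """
    # Discover the sources that are actually present instead of listing them.
    sources = sorted(Path(src_dir).rglob("*.c"))

    # One shared rule; each build statement overrides $cc for its platform.
    lines = ["rule cc", "  command = $cc -c $in -o $out", ""]
    for target, compiler in sorted(targets.items()):
        for src in sources:
            rel = src.relative_to(src_dir).with_suffix(".o")
            obj = Path("build") / target / rel
            lines.append(f"build {obj.as_posix()}: cc {src.as_posix()}")
            lines.append(f"  cc = {compiler}")
    return "\n".join(lines) + "\n"
```

Calling `ninja_for_sources("lib", {"linux": "gcc", "windows": "x86_64-w64-mingw32-gcc"})` yields one `build` statement per source file and per platform, so adding a file to `lib` only requires regenerating the ninja file.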

I'm currently using bang to build bang itself but also LuaX[2] and some projects
available on my GitHub[5].

[1]: https://github.com/CDSoft/bang
[2]: https://github.com/CDSoft/luax
[3]: https://github.com/CDSoft/luax/blob/master/doc/F.md
[4]: https://github.com/CDSoft/luax/blob/master/doc/fs.md
[5]: https://github.com/CDSoft
[6]: https://github.com/CDSoft/bang/tree/master/example

Regards,
Christophe.

Ben Boeckel

Oct 5, 2023, 12:45:43 PM
to Christophe D., ninja-build
On Sun, Oct 01, 2023 at 02:04:02 -0700, Christophe D. wrote:
> Bang[1] is a ninja file generator scriptable in LuaX[2], a Lua interpreter
> with a bunch of useful modules (file management, functional programming
> module, basic cryptography, ...). It takes a build description (a LuaX
> script) and generates a Ninja file.

Neat.

> Bang provides functions to generate ninja primitives (variables, rules,
> build statements, ...) and some extra features:
>
> - rule/build statement pairs described in a single function call
> - file listing and filenames list management using LuaX modules (e.g. F[3]
>   and fs[4])
> - pipe simulation using rule composition
> - "clean", "install" and "help" targets

I've liked the rule/build pattern that `ninja` has gone with. I've
thought that it would be a good pattern to use for a build system. I'm
glad to see it finally be made :) .

> Bang comes with an example[6] that shows how to use bang and LuaX functions
> to:
>
> - discover source files actually present in the repository: no redundant
>   hard coded file lists (redundancy means painful maintenance)

Just to note, I personally find globbing to be more harmful than the
minor pain involved (adding/removing files is a relatively rare
occurrence IME). I've posted about it on CMake's Discourse here:

https://discourse.cmake.org/t/is-glob-still-considered-harmful-with-configure-depends/808/2

Summary:

- Git will drop files that match trivial globs during a conflict
- `git diff` doesn't show new files introduced into someone's builds
- helper files need some special consideration (e.g., if `bin/compat.c`
  were to be useful for some platform compatibility shims)
- your `build.ninja` generation now depends on (or, worse, needs to be
  manually rerun after) filesystem state changes

I also recently ended up posting this in response to a request for
build-time globbing:

https://www.reddit.com/r/cpp/comments/16y9qv2/cmake_c_modules_support_in_328/k3iq5zi/

Summary:

- there's no clear time to perform the glob at build time unless you
  enforce that all outputs are known at configure time (like `tup` does)
- there is no "the build has no work to do" state because the glob
  always needs to be resolved to search for new or removed files
- per-source flags that apply to files discovered at build time mean
  that enough information needs to be shuttled to build time to resolve them
- related, scanning, export configuration, and install rules may also
  need to know about discovered files
- removing stale files needs to be looked at (e.g., `protoc` dropping a
  generated file and then its source is removed: what knows to remove
  the output?)

--Ben

Christophe Delord

Oct 11, 2023, 12:48:47 PM
to Ben Boeckel, ninja-build

On 05/10/2023 at 18:45, Ben Boeckel wrote:
> I've liked the rule/build pattern that `ninja` has gone with. I've
> thought that it would be a good pattern to use for a build system. I'm
> glad to see it finally be made :) .
Bang is actually just a thin layer above ninja that adds scripting
capabilities to ninja.
> Just to note, I personally find globbing to be more harmful than the
> minor pain involved (adding/removing files is a relatively rare
> occurrence IME). I've posted about it on CMake's Discourse here:
>
> https://discourse.cmake.org/t/is-glob-still-considered-harmful-with-configure-depends/808/2
>
> Summary:
>
> - Git will drop files that match trivial globs during a conflict
> - `git diff` doesn't show new files introduced into someone's builds
> - helper files need some special consideration (e.g., if `bin/compat.c`
>   were to be useful for some platform compatibility shims)
> - your `build.ninja` generation now depends on (or, worse, needs to be
>   manually rerun after) filesystem state changes

Right. When the file system changes I need to run bang. My experience
may be different. Some projects have grown in such a way that the
Makefile has become huge and is difficult to maintain (lots of
redundancies, a team that is becoming too large, ...). In some cases
a well organized source tree can contain interesting information
for the build system.

But I agree globbing on generated sources may be tricky. In this case
my solution is a two-pass compilation process. This implies the code
(re)generation is rare, which is true when writing/debugging
hand-written code. Anyway I'm not fully satisfied with this process.

>
> I also recently ended up posting this in response to a request for
> build-time globbing:
>
> https://www.reddit.com/r/cpp/comments/16y9qv2/cmake_c_modules_support_in_328/k3iq5zi/
>
> Summary:
>
> - there's no clear time to perform the glob at build time unless you
>   enforce that all outputs are known at configure time (like `tup` does)
> - there is no "the build has no work to do" state because the glob
>   always needs to be resolved to search for new or removed files

This is true with a one-pass build process. With a two-pass build process
the second pass may have no work to do. But we have to know when the
first pass needs to be run again.

> - per-source flags that apply to files discovered at build time mean
>   that enough information needs to be shuttled to build time to resolve them
> - related, scanning, export configuration, and install rules may also
>   need to know about discovered files
> - removing stale files needs to be looked at (e.g., `protoc` dropping a
>   generated file and then its source is removed: what knows to remove
>   the output?)
When the file system changes, cleaning the build directory and regenerating
the ninja file may be required. But if this happens too often, it can
become painful.
>
> --Ben
>
Thanks for sharing your thoughts and experiences.


--

Christophe

Mathias Stearn

Oct 12, 2023, 5:24:48 AM
to Ben Boeckel, Christophe D., ninja-build
On Thu, Oct 5, 2023 at 6:45 PM 'Ben Boeckel' via ninja-build <ninja...@googlegroups.com> wrote:
> - there is no "the build has no work to do" state because the glob
>   always needs to be resolved to search for new or removed files

While that may be true of how CMake writes globbing rules today, I don't think that is an inherent issue. You could add all directories (yes the directories themselves) that the glob covers as inputs to the globbing build. You could even make them runtime deps and have the glob tool just announce every directory that it opens*. That would allow you to have real no-op builds if nothing has changed. Of course, since many editors use the atomic write-to-temp-file-and-rename trick rather than overwriting the single file, you will still need to reglob every time someone edits a file in one of those directories. But those probably wouldn't be no-op builds anyway.

* my favorite way of doing this is using `deps=msvc` + `msvc_deps_prefix=Opening file: `. Then you can just have your tool print("Opening file: {path}") every time it opens a file, rather than teaching every tool to emit a well-formed Makefile fragment.
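
A minimal sketch of the footnoted trick, assuming a Python glob tool: it announces every directory it opens using the `Opening file: ` prefix, so that a rule declared with `deps = msvc` and a matching `msvc_deps_prefix` can record those directories as discovered dependencies. The names `glob_tree`, `GLOB_RULE` and the script name `glob_tool.py` are hypothetical, and the rule assumes the tool writes its result to `$out` itself.

```python
import fnmatch
import os

# Prefix taken from the message above; ninja is told about it in the rule.
DEPS_PREFIX = "Opening file: "


def glob_tree(root, pattern, announce=print):
    """Return files under root whose name matches pattern.

    Every directory that is scanned is announced with DEPS_PREFIX.
    """
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        announce(DEPS_PREFIX + dirpath)
        for name in sorted(filenames):
            if fnmatch.fnmatch(name, pattern):
                matches.append(os.path.join(dirpath, name))
    return matches


# Hypothetical rule text a generator could emit for this tool.
GLOB_RULE = (
    "rule glob\n"
    "  command = python3 glob_tool.py $root $pattern $out\n"
    "  deps = msvc\n"
    f"  msvc_deps_prefix = {DEPS_PREFIX}\n"
)
```

The point of the design is that the tool only needs a `print` per directory; it does not have to produce a depfile in Makefile syntax.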

Ben Boeckel

Oct 12, 2023, 7:59:28 AM
to Mathias Stearn, Christophe D., ninja-build
On Wed, Oct 11, 2023 at 22:57:01 +0200, Mathias Stearn wrote:
> While that may be true of how CMake writes globbing rules today, I don't
> think that is an inherent issue. You could add all directories (yes the
> directories themselves) that the glob covers as inputs to the globbing
> build. You could even make them runtime deps and have the glob tool just
> announce every directory that it opens*. That would allow you to have real
> no-op builds if nothing has changed. Of course, since many editors use the
> atomic write-to-temp-file-and-rename trick rather than overwriting the
> single file, you will still need to reglob every time someone edits a file
> in one of those directories. But those probably wouldn't be no-op builds
> anyway.

Ok, so this *could* work for in-source globs with out-of-source builds,
but if the build is writing to the directory that is being globbed,
you're SOL unless you list the directory as an output of…some rule that
depends on all rules that may write there (as you cannot specify it as
an output for more than one rule). Otherwise you may glob "too early"
and miss files that are intended to be caught. Even if you don't write
files that are globbed, you have to rerun `ninja` until all of the
write-glob cycles are resolved before you have a "no work to do" build.

--Ben

David Turner

Oct 12, 2023, 8:24:37 AM
to Mathias Stearn, Ben Boeckel, Christophe D., ninja-build
On Thu, Oct 12, 2023 at 11:24 AM 'Mathias Stearn' via ninja-build <ninja...@googlegroups.com> wrote:
> On Thu, Oct 5, 2023 at 6:45 PM 'Ben Boeckel' via ninja-build <ninja...@googlegroups.com> wrote:
> > - there is no "the build has no work to do" state because the glob
> >   always needs to be resolved to search for new or removed files
>
> While that may be true of how CMake writes globbing rules today, I don't think that is an inherent issue.

I believe it is an inherent issue if you care about correctness: if your Ninja generator performs globbing, this means that _any_ time a file is added to or removed from one of the globbed directories, the project's final build graph changes (inputs and/or outputs are added or removed). Thus the Ninja build plan becomes _immediately_ _stale_ and should always be regenerated. Otherwise your next Ninja incremental build is not going to reflect the real state of your project, and this introduces all kinds of really flaky and hard-to-debug correctness issues that will drive you mad.

Which implies that every Ninja incremental build would need to re-run the generator to perform the globs again, and no-op builds are impossible, unless you really really like building from stale Ninja build plans. 

> You could add all directories (yes the directories themselves) that the glob covers as inputs to the globbing build. You could even make them runtime deps and have the glob tool just announce every directory that it opens*. That would allow you to have real no-op builds if nothing has changed. Of course, since many editors use the atomic write-to-temp-file-and-rename trick rather than overwriting the single file, you will still need to reglob every time someone edits a file in one of those directories. But those probably wouldn't be no-op builds anyway.
>
> * my favorite way of doing this is using `deps=msvc` + `msvc_deps_prefix=Opening file: `. Then you can just have your tool print("Opening file: {path}") every time it opens a file, rather than teaching every tool to emit a well-formed Makefile fragment.


Mathias Stearn

Oct 12, 2023, 8:35:45 AM
to David Turner, Ben Boeckel, Christophe D., ninja-build
On Thu, Oct 12, 2023 at 2:24 PM David Turner <di...@google.com> wrote:
> On Thu, Oct 12, 2023 at 11:24 AM 'Mathias Stearn' via ninja-build <ninja...@googlegroups.com> wrote:
> > On Thu, Oct 5, 2023 at 6:45 PM 'Ben Boeckel' via ninja-build <ninja...@googlegroups.com> wrote:
> > > - there is no "the build has no work to do" state because the glob
> > >   always needs to be resolved to search for new or removed files
> >
> > While that may be true of how CMake writes globbing rules today, I don't think that is an inherent issue.
>
> I believe it is an inherent issue if you care about correctness: if your Ninja generator performs globbing, this means that _any_ time a file is added to or removed from one of the globbed directories, the project's final build graph changes (inputs and/or outputs are added or removed). Thus the Ninja build plan becomes _immediately_ _stale_ and should always be regenerated. Otherwise your next Ninja incremental build is not going to reflect the real state of your project, and this introduces all kinds of really flaky and hard-to-debug correctness issues that will drive you mad.
>
> Which implies that every Ninja incremental build would need to re-run the generator to perform the globs again, and no-op builds are impossible, unless you really really like building from stale Ninja build plans.

I don't think you read the rest of my message. I was suggesting having the directories that are covered by the globs be inputs to the task running the glob. At least on Linux, whenever you add a file to or remove a file from a directory, that bumps the mtime of the directory, which will cause the glob-evaluating task to be considered dirty and rerun. To make this more explicit, if the glob task is also restat=1 and avoids touching its output if it doesn't change, then you now have precisely evaluated globs that are correctly updated only as needed, and downstream tasks (including rerunning your generator) can depend on them. This is nice because I basically always run `ninja blah && run_my_program` so that I'm always running my tests with an up-to-date build. So it is pretty common to run ninja back-to-back even if no source files have been touched, and I like no-op builds to be as close to instant as possible.

That said, as Ben correctly pointed out, this only covers the cases where the globbed directories are not modified during the build. I think this covers the majority of use cases of globbing your source directories (which should (almost) never be modified during a build), but it does not support globbing on outputs of your build. This seems less important IMO since most of the time, the generator needs to know what it is outputting up front, but there are a handful of exceptions.
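
A minimal sketch of the "avoid touching the output if it doesn't change" half of this scheme, assuming the glob task is a Python script; the helper name `write_if_changed` is hypothetical. With `restat = 1` on the rule, ninja re-checks the output's mtime after the command runs, so leaving the file untouched is what lets downstream tasks stay clean.

```python
import os


def write_if_changed(path, text):
    """Write text to path only when the content differs.

    Returns True if the file was (re)written, False if it was left untouched,
    in which case its mtime is preserved.
    """
    try:
        with open(path, encoding="utf-8") as f:
            if f.read() == text:
                return False  # same glob result: do not bump the mtime
    except FileNotFoundError:
        pass  # first run: the output does not exist yet

    # Write to a temporary file and rename, so readers never see a partial file.
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        f.write(text)
    os.replace(tmp, path)
    return True
```

A glob task would build its sorted list of matches, join it into one string and pass it to `write_if_changed` as its last step.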

David Turner

Oct 12, 2023, 9:51:12 AM
to Mathias Stearn, Ben Boeckel, Christophe D., ninja-build
On Thu, Oct 12, 2023 at 2:35 PM Mathias Stearn <mat...@mongodb.com> wrote:
> On Thu, Oct 12, 2023 at 2:24 PM David Turner <di...@google.com> wrote:
> > On Thu, Oct 12, 2023 at 11:24 AM 'Mathias Stearn' via ninja-build <ninja...@googlegroups.com> wrote:
> > > On Thu, Oct 5, 2023 at 6:45 PM 'Ben Boeckel' via ninja-build <ninja...@googlegroups.com> wrote:
> > > > - there is no "the build has no work to do" state because the glob
> > > >   always needs to be resolved to search for new or removed files
> > >
> > > While that may be true of how CMake writes globbing rules today, I don't think that is an inherent issue.
> >
> > I believe it is an inherent issue if you care about correctness: if your Ninja generator performs globbing, this means that _any_ time a file is added to or removed from one of the globbed directories, the project's final build graph changes (inputs and/or outputs are added or removed). Thus the Ninja build plan becomes _immediately_ _stale_ and should always be regenerated. Otherwise your next Ninja incremental build is not going to reflect the real state of your project, and this introduces all kinds of really flaky and hard-to-debug correctness issues that will drive you mad.
> >
> > Which implies that every Ninja incremental build would need to re-run the generator to perform the globs again, and no-op builds are impossible, unless you really really like building from stale Ninja build plans.
>
> I don't think you read the rest of my message. I was suggesting having the directories that are covered by the globs being input edges to the task running the glob. At least on Linux, whenever you add or remove a file to a directory, that bumps the mtime for the directory, which will cause the glob evaluating task to be considered dirty and rerun. To make this more explicit, if the glob task is also restat=1 and avoids touching its output if it doesn't change, then you now have precisely evaluated globs that are correctly updated only as needed, and downstream tasks (including rerunning your generator) can depend on them. This is nice because I basically always run `ninja blah && run_my_program` so that I'm always running my tests with an up-to-date build. So it is pretty common to run ninja back-to-back even if no source files have been touched, and I like no-op builds to be as close to instant as possible.
>
> That said, as Ben correctly pointed out this only covers the cases where your globs are not modified during the build. I think this covers the majority of use cases of globbing your source directories (which should (almost) never be modified during a build), it does not support globbing on outputs of your build. This seems less important IMO since most of the time, the generator needs to know what it is outputting up front, but there are a handful of exceptions.

I stand corrected, this is a very interesting idea!  If I understand correctly:

- The generator performs the initial globs to generate the first Ninja build plan.
- The generator also includes in the build plan specific Ninja tasks to perform the _same_ globs (and store their result in an output file that is only modified if the result changes, to get `restat = 1` working). These take a directory path as a Ninja input.
- The "regen" task in the build plan depends on these special globbing-tasks.

So this ensures that:

- The generator is always run again if there is a directory change that would result in a different glob result, ensuring correctness.
- Directory changes that do not affect the glob result do not make the generator run again, and do not force a rebuild of all downstream dependencies.
- If there are no directory changes at all (only changing the content of files within them), Ninja no-op builds actually work.

This is really clever and should work well, at least on Linux. (We had issues in the past with macOS and Windows, where the kernel would update directory timestamps with a very coarse granularity even though the filesystems support high-resolution timestamps, i.e. I am not talking about exFAT or HFS+ here. I am not sure whether these issues still exist.)
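
A minimal sketch of what the generator side of this summary could emit, written as a Python function that returns a ninja fragment. `restat`, `generator` and the `|` implicit-dependency syntax are ninja features; everything else (the function name `glob_regen_fragment`, the `glob_tool.py` and `generate.py` scripts, the `globs/*.list` naming) is hypothetical.

```python
def glob_regen_fragment(glob_dirs, generator_cmd="python3 generate.py"):
    """Return ninja text wiring per-directory glob tasks into the regen step."""
    lines = [
        "rule glob",
        # Hypothetical tool: lists $in and rewrites $out only if the list changed.
        "  command = python3 glob_tool.py $in $out",
        "  restat = 1",
        "",
        "rule regen",
        f"  command = {generator_cmd}",
        "  generator = 1",
        "",
    ]
    results = []
    for directory in glob_dirs:
        # The directory itself is the input, so its mtime dirties the glob task.
        result = "globs/" + directory.replace("/", "_") + ".list"
        results.append(result)
        lines.append(f"build {result}: glob {directory}")

    # Regeneration depends on the glob results, not on the directories directly.
    regen = "build build.ninja: regen"
    if results:
        regen += " | " + " ".join(results)
    lines.append(regen)
    return "\n".join(lines) + "\n"
```

For example, `glob_regen_fragment(["src", "lib/foo"])` produces two `glob` build statements and a `build.ninja` statement that lists both result files as implicit inputs.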

Rex Roni

Oct 20, 2023, 10:45:06 AM
to 'David Turner' via ninja-build
On Thu, Oct 12, 2023 at 03:50:59PM +0200, 'David Turner' via ninja-build
wrote:
>   I stand corrected, this is a very interesting idea!  If I understand
> correctly:
>
> - The generator performs the initial globs to generate the first Ninja build
> plan.
> - The generator also includes in the build plan specific Ninja tasks to perform
> the _same_ globs (and store their result in an output file that is only
> modified if the result changes, to get `restat = 1` working). These take a
> directory path as a Ninja input.
> - The "regen" task in the build plan depends on these special globbing-tasks.
>
> So this ensures that:
>
> - The generator is always run again if there is a directory change that would
> result in a different glob result, ensuring correctness.
> - Directory changes that do not affect the glob result, do not make the
> generator run again, and do not force a rebuild of all downstream dependencies.
> - If there are no directory changes at all (only changing the content of files
> within them), Ninja no-op builds actually work.
>
> This is really clever and should work well, at least on Linux (we had issues in
> the past with MacOS and Windows where the kernel would use a very low
> granularity for directory timestamp, even though they support high accuracy
> ranges, i.e. not talking about ExFAT or HFS+ here, but I am not sure
> if these issues still exist though).

I wrote a helper tool called `findglob` to do basically exactly this, as
part of my own ninja frontend [1] which I never totally finished.

findglob is written in C, it's faster than `find` by quite a bit
(especially when you use patterns), it uses glob patterns instead of
find's insane CLI syntax, and it's tested on Linux, Windows, and macOS.

The idea is that you call findglob with every run of ninja, and it
generates a sorted list of paths which is piped to another binary
(manifest). manifest writes that path list to an output file if the
output file doesn't exist, if the paths differ from what is currently in
the output file, or if any of the paths on the filesystem are newer than
the output file.
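
A minimal Python sketch of the manifest step as described here; it is not the actual mkninja code. The function name `update_manifest` is hypothetical, and the timestamp condition is read as "some listed path was modified after the output file was written", which is an assumption.

```python
import os


def update_manifest(output, paths):
    """Rewrite output with the sorted path list when it is missing or stale.

    Returns True if the manifest was written, False if it was left untouched.
    """
    new_text = "".join(p + "\n" for p in sorted(paths))
    try:
        with open(output, encoding="utf-8") as f:
            old_text = f.read()
        out_mtime = os.stat(output).st_mtime_ns
    except FileNotFoundError:
        old_text, out_mtime = None, None

    stale = (
        old_text is None                # no manifest yet
        or old_text != new_text         # files were added or removed
        # Assumption: a path modified after the manifest also counts as stale.
        or any(os.stat(p).st_mtime_ns > out_mtime for p in paths)
    )
    if stale:
        with open(output, "w", encoding="utf-8") as f:
            f.write(new_text)
    return stale
```

Because the manifest is rewritten whenever a listed file is modified, anything that depends on it is rebuilt after edits as well as after additions and removals.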

It does break the "no work to do" feature as somebody mentioned, but I
don't see that as any kind of problem. I find that when I need
globbing, no approximation of globbing really meets the need. In my day
job we have a build system that tries to glue golang and python and
typescript and some custom code generators into one coherent build.
Globbing is the only sane way to do that.

[1] https://github.com/rexroni/mkninja