Build Configurations Per Target vs. User Defined Transitions

Mark Schulte

Feb 9, 2022, 1:11:53 PM
to bazel-...@googlegroups.com
Hi All,

I'm using Bazel to build code for a large firmware codebase. (I've asked a question in this mailing list before, see here: https://groups.google.com/g/bazel-discuss/c/1n-4b1BSwUI/m/hHlpgNLTAgAJ).

In the firmware world, build-time configuration is important due to strict limits on code size and RAM usage. Many firmware libraries have lots of configuration options, e.g. https://www.freertos.org/a00110.html, https://tls.mbed.org/api/config_8h.html, https://github.com/littlefs-project/littlefs/blob/master/lfs_util.h, or https://sourceware.org/newlib/README.

In my previous question, I asked about platforms/constraints vs build_settings (again, here: https://groups.google.com/g/bazel-discuss/c/1n-4b1BSwUI/m/hHlpgNLTAgAJ). The guidance was to use platforms/constraints for hardware configurations, and build_settings for software configurations. This makes sense to me as a good split.

However, adding new software configurations to our code base via build_settings is tedious. If a new software configuration is needed, we need to go modify a transition rule to allow that build_setting to be configured in our build.
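
For context, a minimal sketch of the kind of transition rule this refers to. It assumes `//freertos:stack_depth` is an integer build setting (for example an `int_flag` from bazel_skylib); the file name, the `firmware_binary` rule, and its attributes are hypothetical and not taken from our actual code base.

```starlark
# firmware_transition.bzl (hypothetical file name)

def _firmware_transition_impl(settings, attr):
    # Map attributes of the wrapping target onto build settings.
    # Every new software configuration needs a new entry here ...
    return {
        "//command_line_option:platforms": attr.platform,
        "//freertos:stack_depth": attr.stack_depth,
    }

firmware_transition = transition(
    implementation = _firmware_transition_impl,
    inputs = [],
    # ... and a matching entry in this list of outputs.
    outputs = [
        "//command_line_option:platforms",
        "//freertos:stack_depth",
    ],
)

def _firmware_binary_impl(ctx):
    # Forward the files of the wrapped binary, now built in the new configuration.
    return [DefaultInfo(files = depset(ctx.files.binary))]

firmware_binary = rule(
    implementation = _firmware_binary_impl,
    attrs = {
        # The wrapped target is built under the transitioned configuration.
        "binary": attr.label(cfg = firmware_transition),
        # Platform label given as a string, e.g. "//:myplatform".
        "platform": attr.string(mandatory = True),
        "stack_depth": attr.int(default = 50),
        # Required for Starlark transitions in the Bazel versions current at the time of writing.
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```

Exposing one more build setting means touching the returned dict, the `outputs` list, and the rule's `attrs`, which is the tedium described above.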

I'd like to understand why Bazel went this design route, and whether there is interest in allowing configurations directly on rules in the future, without requiring special transitions? e.g.:

cc_binary(
    name = "my_binary",
    cfg = {"//command_line_option:platforms": "//:myplatform", "//freertos:stack_depth": 50},
)

I'm happy to read-up on thoughts here, or any background. I spent some time looking, as I remember there being some design docs on the topic, but could not find them.

I have read https://docs.bazel.build/versions/main/skylark/config.html#memory-and-performance-considerations, which makes sense! On the flip side, these configurations/needs do exist in firmware-style builds, and the current transition-based flows make Bazel less appealing to firmware developers due to the friction and learning curve of transitions.

Thank you! I'm hoping that Bazel can be used as a tool to push code re-use and sharing in the embedded/firmware world, and I appreciate all the design and thought that has gone into Bazel so far!

Mark

John Cater

Feb 17, 2022, 10:13:04 AM
to Mark Schulte, bazel-discuss
Hi Mark, let me add a few thoughts:

On Wed, Feb 9, 2022 at 1:11 PM Mark Schulte <schul...@gmail.com> wrote:
> I'd like to understand why Bazel went this design route, and if there's an interest in the future of allowing configurations directly on rules and not requiring special transitions? e.g.:
>
> cc_binary(
>     name = "my_binary",
>     cfg = {"//command_line_option:platforms": "//:myplatform", "//freertos:stack_depth": 50},
> )

I see that you read https://docs.bazel.build/versions/main/skylark/config.html#memory-and-performance-considerations, which lays out most of this. The main reason we haven't taken this approach is that build performance suffers as you add more configurations, and the cost grows at an exponential rate with the number of different configurations.

This example is fairly simple, but if you had multiple cc_binary targets, each depending on the same core libraries but using different configurations, you'll quickly begin to see the problems:
1) Bazel's memory usage will skyrocket: each ConfiguredTarget (a specific target with a specific configuration) takes up memory, and the entire transitive dependency tree of each binary becomes a separate set of ConfiguredTarget instances.
2) Action caching and reuse go down, so Bazel ends up executing more build actions. We attempt to merge identical actions where possible, even if they come from different ConfiguredTargets, but configuration changes do tend to cause the actions to be different, so you end up with the same source file being built several times with several slightly different command lines.
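
To make the two points concrete, here is a small BUILD sketch. The `firmware_binary` rule and the `.bzl` file it is loaded from are hypothetical; they stand in for any rule that applies a transition setting `//freertos:stack_depth`. The target and source file names are illustrative as well.

```starlark
# BUILD (illustrative only)
load("//:firmware_transition.bzl", "firmware_binary")  # hypothetical rule

cc_library(
    name = "core",
    srcs = ["core.c"],
)

cc_binary(
    name = "app_a_bin",
    srcs = ["a.c"],
    deps = [":core"],
)

cc_binary(
    name = "app_b_bin",
    srcs = ["b.c"],
    deps = [":core"],
)

# Two wrappers requesting two different configurations.
firmware_binary(
    name = "app_a",
    binary = ":app_a_bin",
    platform = "//:myplatform",
    stack_depth = 50,
)

firmware_binary(
    name = "app_b",
    binary = ":app_b_bin",
    platform = "//:myplatform",
    stack_depth = 100,
)

# Building :app_a and :app_b together analyzes :core once per configuration,
# so it exists as two ConfiguredTargets, and core.c is typically compiled
# twice, even if core.c never looks at the stack depth.
```

With 20 such configurations over a large shared library tree, every library in the transitive closure is multiplied in the same way.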

This is a problem of scale, and hits the largest Bazel users the worst. Inside Google, we spend a lot of time and effort on monitoring memory usage and action cache rates to try and stay ahead of this, and so we've had to opt to make defensive choices to try and head off the worst of the bloat.

We do have several long-term projects to try and address this, and if we can we'd love to have some kind of facility like you propose. I am very much looking forward to being able to add a `platform` attribute to ~every binary rule, and stop using the `--platforms` flag altogether unless it's as an override, but we're not yet in a place where that makes sense, unfortunately.

John C

Fabian Meumertzheim

Feb 17, 2022, 1:06:11 PM
to John Cater, Mark Schulte, bazel-discuss
Hi Mark,

Keeping all the caveats John mentioned in mind, you can use some of
the more recent additions to Starlark in Bazel 5 (nested functions and
anonymous rules) to at least reduce the amount of boilerplate needed
to set up transitions.
https://github.com/bazelbuild/bazel/issues/14673 has more context on
that. First implementations of that idea (which are very much in alpha
state) are available at
https://github.com/fmeum/rules_meta/issues/1#issuecomment-1029729765
and https://github.com/fmeum/rules_meta/blob/main/tests/use_meta.bzl.

But all this syntactic sugar won't change the fact that you would
essentially define one rule per target, which strays quite far from
Bazel best practices and the performance guarantees they provide.

Fabian

Mark Schulte

Feb 18, 2022, 12:34:40 PM
to Fabian Meumertzheim, John Cater, bazel-discuss
Thank you John for the explanation and Fabian for the potential workaround! As always, very much appreciated.

> This example is fairly simple, but if you had multiple cc_binary targets, each depending on the same core libraries but using different configurations, you'll quickly begin to see the problems:

This is exactly our use case :). And a fairly common use case in the embedded world. In our repo, we've been doing this for a while, and have about 20 different configurations that build some subset of the same libraries. The explosion of actions doesn't impact developers too much because they're usually building just one target. Our CI machines are fairly beefy though, as they need to build :all.

Thank you also for the context on not quite being there as a build system yet. I'm happy to sit patiently and wait, and happy to provide more insight into use cases in the embedded space. In my mind, part of what makes firmware development difficult is the lack of open source libraries for use in development. And part of that is driven by existing build systems' inability to express the amount of configuration needed for a library to be successful in the embedded world. I think Bazel may be able to get us there.

Thank you!
Mark


cns...@gmail.com

Jul 17, 2023, 9:22:40 AM
to bazel-discuss
I currently build a target (py_binary) with `bazel build train_main --//:accelerator=tpu`. I want to add the target as a `data` dependency of a script (another py_binary) so that the script can call it. But the script itself cannot be built with `--//:accelerator=tpu`.
It would be nice if I could add the flag to the `train_main` rule rather than on the command line, which affects both `train_main` and `script`.
py_binary(
  name = "train_main",
  ... = "//:accelerator=tpu",
)
Is this possible? Would this be easier than what is discussed here? In my case, `train_main` can always be built with `--//:accelerator=tpu` (turning the flag off for debugging is nice but not required) and `script` will never have the flag.

I'm new to platforms, configs, etc. and still trying to wrap my head around them. Any pointers or examples would be much appreciated.

cns...@gmail.com

Jul 17, 2023, 9:35:11 AM
to bazel-discuss
PS: To be exact, in case it matters, the `script` does not depend on `train_main` directly, but just runs its zip:
```
filegroup(
  name = "train_main_zip",
  srcs = [":train_main"],
  output_group = "python_zip_file",
)
```
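
One possible direction, along the lines of the transitions discussed earlier in this thread, is a small wrapper rule that pins the flag for whatever it wraps. This is a sketch only: it assumes `//:accelerator` is a string build setting, and the file name, the `tpu_files` rule, and the `train_main_tpu` target are hypothetical names made up for illustration.

```starlark
# tpu_transition.bzl (hypothetical file name)

def _tpu_transition_impl(settings, attr):
    # Always set the accelerator flag to "tpu" for the wrapped targets.
    return {"//:accelerator": "tpu"}

tpu_transition = transition(
    implementation = _tpu_transition_impl,
    inputs = [],
    outputs = ["//:accelerator"],
)

def _tpu_files_impl(ctx):
    # Forward only the default output files of the wrapped targets.
    return [DefaultInfo(files = depset(ctx.files.srcs))]

tpu_files = rule(
    implementation = _tpu_files_impl,
    attrs = {
        # Everything listed in srcs is built with --//:accelerator=tpu.
        "srcs": attr.label_list(cfg = tpu_transition),
        # Starlark transitions have required this attribute in many Bazel versions.
        "_allowlist_function_transition": attr.label(
            default = "@bazel_tools//tools/allowlists/function_transition_allowlist",
        ),
    },
)
```

In the BUILD file, the wrapper would sit between the zip and the script, so only the zip side sees the flag:

```starlark
load("//:tpu_transition.bzl", "tpu_files")  # hypothetical

tpu_files(
    name = "train_main_tpu",
    srcs = [":train_main_zip"],
)

py_binary(
    name = "script",
    srcs = ["script.py"],
    data = [":train_main_tpu"],
)
```

This sketch forwards files only, not runfiles or other providers, which may be enough when the script just runs the zip.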
