Testing modules in BCR presubmit


Yun Peng

Dec 21, 2021, 12:10:01 PM
to external-deps, Alex Eagle, Xudong Yang

So far, we have been testing Bazel modules in BCR presubmit as a dependency instead of as the root module. See here to check how we create the test repo.
However, this is causing problems for running tests that require extra dev dependencies, because dev dependencies don't propagate to dependents.

To improve the situation, we can make some changes to how we run test targets in the presubmit.yml file:

- All build targets are still built with the test repo as a dependency.
- We copy out the workspace root of "module_X" from $output_base/external/module_X, which was generated by the previous build step, and make it our new working directory.
- If a MODULE.bazel file is there (extracted from the source archive), we use that; otherwise, we copy the MODULE.bazel file checked into the BCR to the working directory.
- Then we run the test targets as the main module (bazel test //:test_bar), and use --check_direct_dependencies=error to make sure fetching dev dependencies doesn't change other direct build dependencies.

This way, we can:
- Still verify that all build targets build correctly without fetching dev dependencies.
- Run tests that need extra dev dependencies.
- Still allow *_override in your MODULE.bazel file, or even local_repository in the WORKSPACE file inside your source archive, which can make running tests easier.

Risks:
- Fetching dev dependencies can change other transitive dependencies, which means the dependency graph may not be the same for testing. But this can be mitigated by --check_direct_dependencies=error.
- The checked-in MODULE.bazel file can be different from the MODULE.bazel file in the source archive. Normally this is fine, since the one in the source archive should only contain extra dependencies for testing. But to be sure, we can automatically generate a diff of the two for review (in the BCR presubmit pipeline). We are also considering allowing *_override as dev dependencies; then we could always use the checked-in MODULE.bazel file.
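For illustration, here is a rough sketch of what the MODULE.bazel in a source archive could look like for a made-up module rules_foo (all names and versions are placeholders); the copy checked into the BCR would simply omit the dev_dependency lines:

```starlark
# Sketch of the MODULE.bazel inside the source archive of a made-up rules_foo.
module(name = "rules_foo", version = "1.2.3")

# Regular build dependency; this line is also in the MODULE.bazel checked into the BCR.
bazel_dep(name = "bazel_skylib", version = "1.0.3")

# Test-only addition that the checked-in MODULE.bazel would not contain.
# It is only fetched when rules_foo is the root module, i.e. during presubmit tests.
bazel_dep(name = "rules_cc", version = "0.0.1", dev_dependency = True)
```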

How does this sound to everyone? Ideas are welcome!

Cheers,
Yun

Fabian Meumertzheim

Dec 22, 2021, 3:34:38 AM
to external-deps, Yun Peng, al...@aspect.dev, Xudong Yang
I like this approach. It won't require reimplementing any Bazel features in the BCR presubmit logic, but allows for quite realistic tests.

I see one additional risk though: certain features in Bazel depend on whether a repository is the main repository or not (e.g. using -I flags in cc_library), and some outright depend on the canonical internal name of a repository (e.g. runfiles lookups via the runfiles libraries' rlocation functions). With the proposed setup, modules would only ever be tested as the root module, which would mask these types of bugs. I am not sure whether they would be feasible to implement, but the following hypothetical Bazel flags would help:
1. "--treat_root_module_as_external": If true load the root module (including dev dependencies and overrides) as if it were a non-root module, that is, with its repository under `external/...` and appended with a version suffix.
2. "--add_repository_name_grease": A string that would be appended to all internal repository names coming from modules (modelled after TLS GREASE: https://datatracker.ietf.org/doc/html/rfc8701). This could help detect hardcoded repository names.

> The checked-in MODULE.bazel file can be different from the MODULE.bazel file in the source archive. Normally this is fine, since the one in the source archive should only contain extra dependencies for testing. But to be sure, we can automatically generate a diff of the two for review (in the BCR presubmit pipeline). We are also considering allowing *_override as dev dependencies; then we could always use the checked-in MODULE.bazel file.

Allowing overrides on dev dependencies for checked-in modules would make it possible to require the checked-in MODULE.bazel to agree with the one from the source archive. This would be my favorite approach, as differing MODULE.bazel files seem like a fruitful source of bugs.

Fabian

Alex Eagle

Dec 22, 2021, 9:48:38 AM
to Fabian Meumertzheim, external-deps, Yun Peng, Xudong Yang
Can we sort out what the desired scope of BCR presubmit is? I am still not convinced that BCR should duplicate the testing already done by the ruleset. Dev dependencies should only be needed within the CI the ruleset uses to verify PRs and the main branch. If the ruleset has already pushed a broken release, we are pretty late to reject mirroring that release to BCR based on re-running its tests. Shouldn't that problem have been solved sooner within the ruleset? Also, shouldn't we encourage rulesets to stop shipping their internal tests in their release artifact anyhow?

IMO the scope of BCR presubmit ought to be only interop between rules, as that's the graph expressed in the MODULE.bazel files in the BCR repo. This should be possible entirely black-box, holding the rules the way a user would. Of course, there does need to be some "smoke test", either shipped in the ruleset artifact or included in the BCR repo, that indicates the desired end-user behavior and exercises codepaths that require all the dependencies and their extensions to be present.

Should we have more nuance about what "smoke test" lives in BCR vs. re-running the ruleset's internal tests? I realize it's simple and convenient to re-use the latter, except that it requires ugly workarounds.
-Alex


Yun Peng

Dec 22, 2021, 10:31:26 AM
to Alex Eagle, Fabian Meumertzheim, external-deps, Xudong Yang

>  I am not sure whether they would be feasible to implement, but the following hypothetical Bazel flags would help:

Both changes sound pretty intrusive to me; they won't be easy to implement. And as Alex said, we don't want you to rely on BCR presubmit to run most of your test suite. Those problems should be solved by the project's own test infrastructure.

> Allowing overrides on dev dependencies

We'll probably implement this, but allowing *_override in the checked-in MODULE.bazel files is a breaking change for Bazel 5.0. Once we check in such MODULE.bazel files, Bazel 5.0 won't be able to use the latest BCR content. We'll have to either branch out a BCR for 5.0 or cherry-pick the new feature into 5.x.

@Alex 

I think we agree with each other. We never advocated putting your entire test suite in BCR. The presubmit.yml should only contain a few "smoke tests" that cover the core functionalities of your module. This is explained in the README.md file, the Bzlmod User Guide, and the Bazel Central Registry doc.
However, even those test targets could require additional dev dependencies, so we still face this problem, right?

Kyle Cordes

Dec 22, 2021, 10:54:50 AM
to Alex Eagle, Fabian Meumertzheim, Yun Peng, Xudong Yang, external-deps
To test rules as a user would, and to get more than a trivial-case smoke test, you need an example chunk of code that exercises the feature set of the rules fairly fully. I wonder how a ruleset could supply such an example. (Which might be different from the typical granular examples inside a ruleset's test suite.)

Also, as I look at the BCR gdoc, there appears to be some conversation that didn't get resolved back in August about combinatorial testing to look for conflicts. I think it is realistic to expect such conflicts to occur, as they generally do in other mechanisms of this general nature (adding a modularity layer on a system with legacy global state). So I wonder whether the presubmit testing should be isolation-only, or should also ask "does this new module X version N work when sharing a workspace with various combinations of published modules?", versus later (post-publish) testing or waiting for a human to notice.

Lastly, I might've missed something, but I haven't seen multiple platforms mentioned. Even if only advisory (because the ecosystem has varied levels of support for Windows, especially), it would be great to see the automatic testing warn submitters as early as possible that their module does not work on all platforms. A particular pet peeve here: even if a ruleset is unable to support a certain platform, it would be very helpful if the testing ensured the ruleset can be present in workspaces and BUILD files on that platform even if the targets cannot be successfully built.

—Kyle

Fabian Meumertzheim

Dec 23, 2021, 4:39:51 AM
to Kyle Cordes, Alex Eagle, Yun Peng, Xudong Yang, external-deps
@Alex, @Yun: I fully agree with you that presubmit should only run a
small subset of meaningful tests. It's just that the rulesets I
maintain are so narrow in scope that there is no essential difference
between "running all tests" and "running a meaningful subset of tests
to verify the common use cases".

My ideas are based on the following understanding of the scope of the
presubmit checks. As Alex has said, we should probably focus on
agreeing on a scope first before we discuss the setup further. Since
this got a bit longer, I numbered the individual points to make
references easier. Please let me know what you think about them.

I think that presubmit tests *should*:

1. make sure that simple, typical example usages of the rules and
targets offered by the ruleset work.

For example, if I were to submit rules_foo to the BCR, I would at
least include an example with a `foo_test` depending on a
`foo_library` and having a `foo_binary` as a `data` dependency.
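As a rough sketch (rules_foo, its load path and file extensions are all made up), such an example BUILD file could look like this:

```starlark
# examples/BUILD.bazel of a hypothetical rules_foo module
load("@rules_foo//foo:defs.bzl", "foo_binary", "foo_library", "foo_test")

foo_library(
    name = "lib",
    srcs = ["lib.foo"],
)

foo_binary(
    name = "generator",
    srcs = ["generator.foo"],
)

foo_test(
    name = "lib_test",
    srcs = ["lib_test.foo"],
    deps = [":lib"],
    # Depend on a foo_binary as data to exercise runfiles handling.
    data = [":generator"],
)
```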

2. be easy to set up.

Looking at the current state of tests in the registry, only a single
module (re2) even has test targets configured. That may just be
because the existing modules were mostly registered to make a module
build of Bazel itself possible and additional tests were not a
priority, but it may also be in part because these rulesets would
require more dependencies to run their tests.

If we want the BCR to grow organically, we have to ensure that anyone
can quickly submit their favourite useful ruleset. If writing tests
requires more intricate changes because dependencies are not easily
available and their absence needs to be worked around, most tests
simply won’t be written.

3. be quick to run.

This is evidently necessary to make presubmit and compatibility test
runs scale as the BCR grows. The tests shouldn’t do much work after
the analysis phase. Luckily, the analysis phase is where most
breakages related to module interactions would happen, so that is not
much of a restriction.

4. have a specific focus on the delta between MODULE.bazel and
WORKSPACE usage of the ruleset.

At least initially (before Bazel 6/7), MODULE.bazel will likely play a
somewhat niche role compared to the existing WORKSPACE-based way of
managing dependencies. The participants of this email thread,
including myself, are of course very eager to explore and set up
modules, but most BCR maintainers, who are not even necessarily
ruleset maintainers, will probably not have much time to spare to
learn the concepts and intricacies of modules in detail.

Since the design is so elegant, most things of course work out of the
box assuming dependencies are already available as modules, but the
extensive use of repo mappings does introduce new complexities. See
https://github.com/bazelbuild/bazel/issues/14259,
https://github.com/bazelbuild/bazel/issues/14279 as well as
https://github.com/fmeum/rules_runfiles, which I created specifically
out of a need to make runfiles work reliably with modules. I am also
not yet sure that I have grasped all the consequences for transitions
on label_flags, but at the very least the `str(Label("//…"))` pattern
will be needed in a number of places. Remember that until a few days
ago, the Bazel documentation advised against using Label for anything
other than rule attribute defaults, so these usages will look pretty
arcane to most people.

Ideally, the presubmit tests would be able to catch these subtle
issues before they cause weird errors when end users migrate to
modules. This doesn’t require running extensive integration tests, but
it does require a realistic test setup. I agree that the hypothetical
Bazel flags I proposed above are not simple additions and I am not
suggesting that we have to implement them. I only want to highlight
what we could do to make very basic module tests more effective at
catching the subtle, module-specific bugs. If we only run tests with a
single module (necessarily then the root module) and the examples are
part of the same module, we will not be able to catch the bugs that
are most likely to appear.

5. allow for meaningful tests without upstream changes.

The BCR rightly doesn’t assume that module maintainers and ruleset
maintainers are the same set of people. Even if an upstream repo is no
longer actively maintained, it may still be useful and it should still be
possible to include it in the BCR. However, such a submission may have to be
entirely patch-based without a chance to submit changes upstream in a
timely manner. This severely limits the ability to do even basic
module-based integration tests upstream, which makes the presubmit
tests focusing on module interactions even more important - they may
very well be the *only* tests that test a repository as a module.

I think that presubmit tests *should not*:

6. test the details of the inner logic of the ruleset.

These details are best tested in the upstream repositories, with
arbitrarily elaborate setups. They are also unlikely to break due to
Bazel or dependency updates.

7. blindly duplicate the existing upstream tests.

rules_nodejs has very sophisticated and powerful integration testing
and I am pretty sure that this will be extended to cover module usage
of the ruleset. In such a case, I don’t really see a need for any BCR
presubmit tests, and I fully understand why Alex finds them mostly
redundant.

Inline answers to some more points below:

> Do we have a compelling example of why a black box smoke test could depend on dev dependencies?

> Fabian, is it possible to not introduce any dev dependencies for the test you check in BCR?

In fact, I would go as far as saying that *every* meaningful black box
smoke test that is not made redundant by the project's own testing
infrastructure requires a dev dependency: A separate module containing
the examples. If we want presubmit to verify that basic usages of the
module work, how could we do that without having a separate module
depend on it? That is the main reason why I am using dev
dependencies: The examples/tests are part of another module, whose
source is contained in a subdirectory of the source repository (and
the release archive) and which is loaded with a local_path_override.

Even if I were to inline the tests into the main module, I would still
need some extra dependencies to test the core use cases of the module.
In this particular case, I could easily make them true dependencies
(rules_cc and rules_java aren’t particularly large) or drop them
(junit is convenient to have for every java_test, but certainly not
necessary for simple tests), but that may not easily be possible for
larger rulesets, or would at least pollute the dependency tree.

> We'll probably implement this, but allowing *_override in the checked-in MODULE.bazel files is a breaking change for Bazel 5.0. Once we check in those MODULE.bazel, Bazel 5.0 won't be able to use the latest BCR content. We'll have to either branch out a BCR for 5.0, or cherry pick the new feature in 5.x.

I didn’t think of the consequences for Bazel 5.0 here. An alternative
approach that would preserve backwards compatibility would be to just
include these overrides in the release archive MODULE.bazel and then
have some script in the BCR to verify that the two MODULE.bazel files
only differ in dev dependencies. In fact, the checked-in MODULE.bazel could
probably be required to be a true prefix of the release archive
MODULE.bazel, with the latter only containing additional lines with
dev dependencies.

Yun Peng

Dec 23, 2021, 9:23:09 AM
to Fabian Meumertzheim, Kyle Cordes, Alex Eagle, Xudong Yang, external-deps
@Fabian 

Thanks for your excellent analysis, this is very valuable to us!

I think a great point you made is that we should have easy-to-set-up smoke tests that test the module as a dependency.

> The examples/tests are part of another module, whose
> source is contained in a subdirectory of the source repository (and
> the release archive) and which is loaded with a local_path_override.

And it sounds like you are also offering the solution ;)

Here is my new proposal:

When publishing a Bazel module:
- Every module can have a subdirectory in the source archive named "bcr_smoke_tests" where module authors write specific smoke tests for BCR.
- This directory serves as a separate testing module, which contains its own MODULE.bazel file.
- It depends on the target module with a local_path_override(path = "../"), and of course it can introduce extra testing dependencies (see the sketch after this list).
- For non-Bazel modules like "zlib", this can be patched in.
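Something like the following is what I have in mind for bcr_smoke_tests/MODULE.bazel; the module name rules_foo and the dependency versions are just placeholders:

```starlark
# bcr_smoke_tests/MODULE.bazel (hypothetical sketch)
module(name = "rules_foo_bcr_smoke_tests", version = "0.0.0")

# Depend on the module under test and resolve it from the parent directory.
bazel_dep(name = "rules_foo", version = "1.2.3")
local_path_override(module_name = "rules_foo", path = "../")

# Extra testing-only dependencies can be added freely here.
bazel_dep(name = "bazel_skylib", version = "1.0.3")
```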

In BCR presubmit:
- We still require the presubmit.yml file, but only require "build_targets" to be specified.
- After we verify the build targets work correctly, we prepare the source tree of the module as I described previously, with the MODULE.bazel file overridden by the checked-in one.
- We switch the working directory into bcr_smoke_tests, if it exists, and run all tests available there.

I think this solution checks most of your boxes. The only downside is that it puts a bit more burden on module authors to provide such smoke tests, but in the long term, I think it's beneficial for the ecosystem.

@Fabian @Alex, please tell me what you think. ;)

> have some script in the BCR to verify that the two MODULE.bazel only
> differ in dev dependencies.

This looks more and more important; it's on my TODO list now. In most cases we can require the checked-in MODULE.bazel file to be a prefix of the original one. But not when the test requires extra toolchain registration, which can only be done in the "module" directive. Anyhow, it's good to start with a diff for review.

-------------------


> So I wonder if the pre-submit testing should be isolation-only, vs "does this new module X version N work when sharing a workspace with various combinations of published modules”?

I'm still not convinced this could work. The total number of combinations could be O(V^N), where V is the average number of versions of a module and N is the total number of modules. This could be very large: how do you decide which combinations to test, and how do you make sure they actually cover real-world use cases? I'm planning to implement a downstream pipeline to test whether a new module version would break downstream modules that depend on it, which I think could better help detect unexpected breakages.

> Lastly, I might've missed something but I haven't seen multiple platforms mentioned

In the presubmit.yml, the currently available platforms are (centos7, debian10, ubuntu2004, windows, macos). If you add a new module with the helper script, those platforms will be enabled by default. We have actually already detected unsupported platforms for modules; see here. Later, we could extract information from this file into some dashboard to better present which modules support which platforms.

Kyle Cordes

Dec 23, 2021, 11:59:28 AM
to Yun Peng, Fabian Meumertzheim, Alex Eagle, Xudong Yang, external-deps
Certainly it is infeasible to test every combination - the numbers get big fast. But could testing a few combinations (chosen intentionally - perhaps a few example projects that use many rulesets together) catch some failures up front, and improve the general quality of the ecosystem? I really like the idea of downstream combinatorial testing to get a broader swath and greater confidence as well.

Kyle

Fabian Meumertzheim

Dec 24, 2021, 12:18:29 PM
to Yun Peng, Kyle Cordes, Alex Eagle, Xudong Yang, external-deps
There is one bzlmod-internal implementation detail that makes module tests using overrides (local_path or any other kind) unrealistic: With an override, the canonical name of a repository is always the module name, whereas when loaded from the BCR the name is module_name.version (see https://github.com/bazelbuild/bazel/blob/46ed5d55def27d124afcdeaccd17f3cd16070816/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/ModuleKey.java#L65 and https://github.com/bazelbuild/bazel/blob/46ed5d55def27d124afcdeaccd17f3cd16070816/src/main/java/com/google/devtools/build/lib/bazel/bzlmod/Module.java#L59). As a consequence, the most common case of bugs related to hardcoded repository names (e.g. in runfiles or C/C++ include paths) will not be caught by these tests.

Would it be possible to make the canonical name of the repository of a module with an override something like module_name._override instead? Given that the current documentation calls these names implementation details, this would not be a backwards-incompatible change and would allow for more realistic tests.

Additionally, it might be better to make the smoke test directory name configurable. With the new proposal for the test setup, tests could double as realistic examples for end users of the module. But for that purpose, rulesets might prefer names such as "bcr_examples" or "examples/module". But this point is much less important than the other one.

Yun Peng

Jan 3, 2022, 9:44:33 AM
to Fabian Meumertzheim, Kyle Cordes, Alex Eagle, Xudong Yang, external-deps

> As a consequence, the most common case of bugs related to hardcoded repository names (e.g. in runfiles or C/C++ include paths) will not be caught by these tests.

I understand the problem. But I actually haven't seen many bugs that are caused by this. Do you have one or two examples?

> Would it be possible to make the canonical name of the repository of a module with an override something like module_name._override instead?

@Xudong Yang What do you think? 

> Additionally, it might be better to make the smoke test directory name configurable.

That makes sense. Then here's my newest proposal:

I think we could allow two additional attributes "bcr_test_dir" and "bcr_test_targets" in the presubmit.yml file:

e.g.:
platforms:
  centos7:
    build_targets:
    - '@bazel_skylib//...'
    bcr_test_dir: "examples"
    bcr_test_targets:
    - '//...'
  debian10:
    build_targets:
    - '@bazel_skylib//...'
    bcr_test_dir: "examples"
    bcr_test_targets:
    - '//...'
  ...

After building the normal build_targets, we switch into "bcr_test_dir" in the archive (with patches) and run "bcr_test_targets" there. WDYT?

Fabian Meumertzheim

Jan 19, 2022, 5:50:09 AM
to Yun Peng, Kyle Cordes, Alex Eagle, Xudong Yang, external-deps
(I just noticed that I failed to send my reply to the entire mailing list)

> After building the normal build_targets, we switch into "bcr_test_dir" in the archive (with patches) and run "bcr_test_targets" there. WDYT?

This is also what I had in mind. The presubmit.yml is the ideal place
for this kind of configuration.

> I understand the problem. But I actually haven't seen many bugs that are caused by this. Do you have one or two examples?

I encountered the following two issues while porting my own rulesets
over to modules, but there may be more:

1. Runfiles lookups require specifying the full path to a runfile from
the execroot, which includes the internal repository name. For
example, when using @bazel_tools//tools/cpp/runfiles in a cc_binary,
something like runfiles->Rlocation("rules_foo/tools/builder") would
work both without bzlmod and with a local_path_override on the module
rules_foo, but would break when actually used as a module since the
correct path would be something like "rules_foo.1.2.3/tools/builder".
2. Specifying a non-propagating include path on a cc_library requires
adding a copt of the form
`-Iexternal/internal_repo_name/include/path`. There are established
idioms to do this without hardcoding the repository name (see
https://github.com/bazelbuild/bazel/issues/4463, although I am not
100% certain how native.repository_name() behaves with repo mappings),
but the fact that repository names have been "well-known" so far and
repo mappings haven't seen much use means that these idioms weren't
necessary to use.
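One possible shape of that idiom, as a sketch only (the library is made up, and I haven't verified how repository_name() interacts with repo mappings; it is assumed to return "@" in the main repository and the canonical "@name" elsewhere):

```starlark
# BUILD.bazel at the root of a hypothetical repository.
# Derive the -I prefix from repository_name() instead of hardcoding the
# canonical repository name, so the path stays correct whether the repo is
# the main repository, "rules_foo" (override) or "rules_foo.1.2.3" (BCR).
_REPO_PREFIX = "" if repository_name() == "@" else "external/" + repository_name()[1:] + "/"

cc_library(
    name = "mylib",
    srcs = ["src/impl.cc"],
    hdrs = glob(["include/**/*.h"]),
    # Non-propagating include path, relative to the execroot.
    copts = ["-I" + _REPO_PREFIX + "include"],
)
```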

Yun Peng

Jan 19, 2022, 5:57:23 AM
to Fabian Meumertzheim, Kyle Cordes, Alex Eagle, Xudong Yang, external-deps
> I encountered the following two issues while porting my own rulesets
> over to modules, but there may be more:

Oh, sorry for not being responsive on this issue.
@Xudong Yang, do you think we could make the canonical repo name for overridden modules more special (e.g. "foo._override" instead of just "foo")? This will probably help catch those issues.

Xudong Yang

Jan 19, 2022, 12:16:19 PM
to Yun Peng, Fabian Meumertzheim, Kyle Cordes, Alex Eagle, external-deps
I can't immediately think of any issues with doing that (IIRC the reason for choosing "foo" instead of "foo." was purely aesthetic); I'm going to try it and see if anything breaks.

"  YÁNG Xùdōng (杨旭东)
#  SURNAME Givenname
=  Software Engineer
@  Google Munich

Yun Peng

Jan 20, 2022, 9:14:13 AM
to Xudong Yang, Fabian Meumertzheim, Kyle Cordes, Alex Eagle, external-deps
> I can't immediately think of any issues with doing that (IIRC the reason for choosing "foo" instead of "foo." was purely aesthetical); I'm going to try it and see if anything breaks.

Fabian Meumertzheim

Jan 20, 2022, 10:42:05 AM
to Yun Peng, Xudong Yang, Kyle Cordes, Alex Eagle, external-deps
Looks great, thanks. Could we get this on the cherry-pick list for
5.0.1 or 5.1.0, whatever you feel is right?

Yun Peng

Jan 20, 2022, 11:05:07 AM
to Fabian Meumertzheim, Xudong Yang, Kyle Cordes, Alex Eagle, external-deps
It could probably make it into 5.1 ;)