@Alex, @Yun: I fully agree with you that presubmit should only run a
small subset of meaningful tests. It's just that the rulesets I
maintain are so narrow in scope that there is no essential difference
between "running all tests" and "running a meaningful subset of tests
to verify the common use cases".
My ideas are based on the following understanding of the scope of the
presubmit checks. As Alex has said, we should probably focus on
agreeing on a scope first before we discuss the setup further. Since
this got a bit longer, I numbered the individual points to make
references easier. Please let me know what you think about them.
I think that presubmit tests *should*:
1. make sure that simple, typical example usages of the rules and
targets offered by the ruleset work.
For example, if I were to submit rules_foo to the BCR, I would at
least include an example with a `foo_test` depending on a
`foo_library` and having a `foo_binary` as a `data` dependency.
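For illustration, the BUILD file of such an example could look roughly
like this (the rule names come from the sentence above, everything
else is made up):

    load("@rules_foo//foo:defs.bzl", "foo_binary", "foo_library", "foo_test")

    foo_library(
        name = "lib",
        srcs = ["lib.foo"],
    )

    foo_binary(
        name = "tool",
        srcs = ["tool.foo"],
    )

    foo_test(
        name = "lib_test",
        srcs = ["lib_test.foo"],
        deps = [":lib"],
        # The test runs the binary at runtime, hence the data dependency.
        data = [":tool"],
    )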
2. be easy to set up.
Looking at the current state of tests in the registry, only a single
module (re2) even has test targets configured. That may just be
because the existing modules were mostly registered to make a module
build of Bazel itself possible and additional tests were not a
priority, but it may also be in part because these rulesets would
require more dependencies to run their tests.
If we want the BCR to grow organically, we have to ensure that anyone
can quickly submit their favourite useful ruleset. If writing tests
requires more intricate changes because dependencies are not easily
available and their absence needs to be worked around, most tests
simply won’t be written.
3. be quick to run.
This is evidently necessary to make presubmit and compatibility test
runs scale as the BCR grows. The tests shouldn’t do much work after
the analysis phase. Luckily, the analysis phase is where most
breakages related to module interactions would happen, so that is not
much of a restriction.
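To sketch what "quick" could mean in practice: something as simple as
bazel_skylib's build_test is often enough to assert that the example
targets still analyze and build, without executing any heavy test
actions (the target names are the made-up ones from the sketch under
point 1):

    load("@bazel_skylib//rules:build_test.bzl", "build_test")

    build_test(
        name = "examples_build_test",
        # Only requires that these targets build successfully; no test
        # actions of the examples themselves are run.
        targets = [
            ":lib",
            ":tool",
        ],
    )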
4. have a specific focus on the delta between MODULE.bazel and
WORKSPACE usage of the ruleset.
At least initially (before Bazel 6/7), MODULE.bazel will likely play a
somewhat niche role compared to the existing WORKSPACE-based way of
managing dependencies. The participants of this email thread, myself
included, are of course very eager to explore and set up modules, but
most BCR maintainers, who are not necessarily even ruleset
maintainers, will probably not have much time to spare to learn the
concepts and intricacies of modules in detail.
Since the design is so elegant, most things of course work out of the
box assuming dependencies are already available as modules, but the
extensive use of repo mappings does introduce new complexities. See
https://github.com/bazelbuild/bazel/issues/14259,
https://github.com/bazelbuild/bazel/issues/14279 as well as
https://github.com/fmeum/rules_runfiles, which I created specifically
out of a need to make runfiles work reliably with modules. I am also
not yet sure that I have grasped all the consequences for transitions
on label_flags, but at the very least the `str(Label("//…"))` pattern
will be needed in a number of places. Remember that until a few days
ago, the Bazel documentation advised against using Label for anything
other than rule attribute defaults, so these usages will look pretty
arcane to most people.
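To give an idea of where this shows up: a macro inside a ruleset may
have to resolve labels relative to its own .bzl file instead of
hard-coding the repository name (all names below are made up):

    # defs.bzl inside the hypothetical rules_foo:
    def foo_integration_test(name, **kwargs):
        native.sh_test(
            name = name,
            # A hard-coded "@rules_foo//foo:runner.sh" only works if the
            # consumer happens to know the repository under exactly that
            # name; resolving the label relative to this .bzl file keeps
            # working under bzlmod's repo mapping.
            srcs = [str(Label("//foo:runner.sh"))],
            **kwargs
        )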
Ideally, the presubmit tests would be able to catch these subtle
issues before they cause weird errors when end users migrate to
modules. This doesn’t require running extensive integration tests, but
it does require a realistic test setup. I agree that the hypothetical
Bazel flags I proposed above are not simple additions and I am not
suggesting that we have to implement them. I only want to highlight
what we could do to make very basic module tests more effective at
catching the subtle, module-specific bugs. If we only run tests with a
single module (which is then necessarily the root module) and the
examples are part of that same module, we will not be able to catch
the bugs that are most likely to appear: the repo mapping issues
mentioned above only really surface when the ruleset is consumed as a
dependency of another module rather than as the root module.
5. allow for meaningful tests without upstream changes.
The BCR rightly doesn’t assume that module maintainers and ruleset
maintainers are the same set of people. Even if an upstream repo is no
longer actively maintained, it may still be useful and it should still
be possible to include it in the BCR. However, such a submission may
have to be entirely patch-based, without a chance to get changes
submitted upstream in a timely manner. This severely limits the
ability to add even basic module-based integration tests upstream,
which makes presubmit tests focusing on module interactions even more
important: they may very well be the *only* tests that exercise a
repository as a module.
I think that presubmit tests *should not*:
6. test the details of the inner logic of the ruleset.
These details are best tested in the upstream repositories, with
arbitrarily elaborate setups. They are also unlikely to break due to
Bazel or dependency updates.
7. blindly duplicate the existing upstream tests.
rules_nodejs has very sophisticated and powerful integration testing
and I am pretty sure that this will be extended to cover module usage
of the ruleset. In such a case, I don’t really see a need for any BCR
presubmit tests and I fully understand why Alex finds them mostly
redundant.
Inline answers to some more points below:
> Do we have a compelling example of why a black box smoke test could depend on dev dependencies?
> Fabian, is it possible to not introduce any dev dependencies for the test you check in BCR?
In fact, I would go as far as saying that *every* meaningful black box
smoke test that is not made redundant by the project's own testing
infrastructure requires a dev dependency: A separate module containing
the examples. If we want presubmit to verify that basic usages of the
module work, how could we do that without having a separate module
depend on it? That is the main reason why I am using dev
dependencies: The examples/tests are part of another module, whose
source is contained in a subdirectory of the source repository (and
the release archive) and which is loaded with a local_path_override.
Even if I were to inline the tests into the main module, I would still
need some extra dependencies to test the core use cases of the module.
In this particular case, I could easily make them true dependencies
(rules_cc and rules_java aren’t particularly large) or drop them
(junit is convenient to have for every java_test, but certainly not
necessary for simple tests), but that may not be as easy for larger
rulesets, or would at least pollute the dependency tree.
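To make the setup I described above concrete, it looks roughly like
this (module names, versions and paths are just placeholders):

    # MODULE.bazel at the repository root:
    module(name = "rules_foo", version = "0.1.0")

    # The examples live in their own module so that they consume
    # rules_foo the same way an end user would.
    bazel_dep(name = "rules_foo_examples", version = "0.0.1",
              dev_dependency = True)
    local_path_override(
        module_name = "rules_foo_examples",
        path = "examples",
    )

    # examples/MODULE.bazel:
    module(name = "rules_foo_examples", version = "0.0.1")

    bazel_dep(name = "rules_foo", version = "0.1.0")
    local_path_override(
        module_name = "rules_foo",
        path = "..",
    )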
> We'll probably implement this, but allowing *_override in the checked-in MODULE.bazel files is a breaking change for Bazel 5.0. Once we check in those MODULE.bazel, Bazel 5.0 won't be able to use the latest BCR content. We'll have to either branch out a BCR for 5.0, or cherry pick the new feature in 5.x.
I didn’t think of the consequences for Bazel 5.0 here. An alternative
approach that would preserve backwards compatibility would be to just
include these overrides in the release archive’s MODULE.bazel and then
have some script in the BCR verify that the two MODULE.bazel files
only differ in dev dependencies. In fact, the checked-in MODULE.bazel
could probably be required to be a true prefix of the release
archive’s MODULE.bazel, with the latter only containing additional
lines with dev dependencies.
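To illustrate the prefix idea with made-up contents, the checked-in
MODULE.bazel could be

    module(name = "rules_foo", version = "0.1.0")
    bazel_dep(name = "platforms", version = "0.0.4")

while the release archive would ship the same lines plus a dev-only
tail:

    module(name = "rules_foo", version = "0.1.0")
    bazel_dep(name = "platforms", version = "0.0.4")

    bazel_dep(name = "rules_foo_examples", version = "0.0.1",
              dev_dependency = True)
    local_path_override(
        module_name = "rules_foo_examples",
        path = "examples",
    )

A BCR-side script would then only have to check the prefix property.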