Using BzlMod with "prebuilt" repositories


David Turner

Jul 1, 2022, 9:42:42 AM
to bazel-...@googlegroups.com
Hello,

I am trying to use the new BzlMod feature to manage external repositories whose content is already extracted in the project's source tree. The idea is that all downloads of third-party code should be performed up-front using a tool like repo, gclient or git sub-modules. The build itself would then not be allowed to download anything else from the network, and should avoid extracting archives into the Bazel output root unless necessary. I believe this corresponds to the "enterprise use-case" and "offline build" in the original BzlMod design doc.

I have experimented with different ways to achieve that, and uploaded the results at https://github.com/digit-google/bazel-experiments/tree/main/bzlmod-prebuilt-modules for others to see. I have a few questions for the Bazel team:

1) When do we expect native.local_repository() to work in module extensions?

2) Any chance to have the --registry option support (relative) file paths, in addition to URLs, to access a local registry?

3) Any chance we can have a local registry that contains source.json files that points to pre-extracted directories, instead of tarball archives? Possibly using relative file paths? If so, what would be a good way to check for integrity?

For the record, this is for the Fuchsia platform build (which is not using Bazel at the moment), where we expect to see a few dozen external repositories. Without BzlMod, our alternative is to manually patch all WORKSPACE files from these dependencies, which is simply not scalable, pleasant or maintainable. That is why I am enthusiastic and grateful for BzlMod to exist :-)

For the sake of completeness, here are the results of my experiments (which will explain my questions):

What works:

local-module-directories

A top-level MODULE.bazel file that uses bazel_dep() and local_path_override() to override the path of each repository. This forces the project's MODULE.bazel file to know the exact location of each versioned external workspace. I verified that the final output base did contain symlinks to these workspaces rather than copies, so everything is fine, except that we need to maintain the content of this MODULE.bazel file manually to keep it in sync with the other repo/gclient/submodules manifest, which is annoying.
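For readers who have not seen this setup, here is a minimal sketch of what such a top-level MODULE.bazel might look like. The project name and the `third_party/...` paths are assumptions made up for illustration; only the module names and versions come from this thread.

```starlark
# MODULE.bazel of the top-level project.
# "my_project" and the third_party/ paths are hypothetical.
module(name = "my_project", version = "0.0.0")

# Declare the dependency, then point Bazel at the already-extracted
# directory in the source tree instead of a registry.
bazel_dep(name = "platforms", version = "0.0.4")
local_path_override(
    module_name = "platforms",
    path = "third_party/platforms",
)

bazel_dep(name = "bazel_skylib", version = "1.0.3")
local_path_override(
    module_name = "bazel_skylib",
    path = "third_party/bazel_skylib",
)
```

Each overridden directory is expected to contain its own MODULE.bazel file, and every new dependency needs one more bazel_dep()/local_path_override() pair here, which is the manual maintenance cost mentioned above.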

local-register-with-urls
This uses a local registry whose source.json points to compressed archives of the external repo with file:/// URLs. However, because file:// URLs can only be absolute, this requires generating the content of the source.json file as well as the .bazelrc file (which contains `common --registry file:///absolute/path/to/registry`) to adapt them to the checkout's final absolute location. This also still requires compressed archives, which sometimes need to be patched to support BzlMod (for example, bazel-skylib-1.0.3.tar.gz does not include a WORKSPACE file and is thus unusable as-is with BzlMod).

What does not work:

local-module-extensions
Trying to use an intermediate module that provides an extension that creates all the remote repositories through a Starlark function. Ideally, this would allow reading the repo manifest and generating the local_path_override() objects automatically from it. Unfortunately, this does not work because native.local_repository() does not work in module extensions. This is actually documented as footnote 2 in the BzlMod Migration Guide, so there is hope this will be solved at some point.

local-registry-with-files
Trying to set up a local registry that contains source.json files whose "url" entry points to the prebuilt directory (e.g. "url": "file:///....../prebuilt/platforms-0.0.4"). This does not work because, judging from the current code, Bazel really expects an archive file path instead of a directory path. An "integrity" value is also needed, and I didn't find a way to create one from a prebuilt directory.

Also note that the --registry option expects a URL. When using a file path instead (e.g. --registry path/to/registry), Bazel crashes with an exception (see this README.md for details).

It works if a file:// URL is used, but these are always absolute paths (the URL spec does not support relative file paths). This forces the .bazelrc file that contains the option to be auto-generated to match the absolute path of the checkout directory, which is slightly annoying.

Thanks for BzlMod,

- Digit

Fuchsia Build Team

Xudong Yang

Jul 1, 2022, 10:19:05 AM
to David Turner, bazel-discuss
Hi David! Great to see more adoption of Bzlmod :)

> 1) When do we expect native.local_repository() to work in module extensions?

Please follow https://github.com/bazelbuild/bazel/issues/15412 for updates. This is actually surprisingly hard to fix since it involves a huge code refactor, and I have a pending changelist sitting around that I haven't touched in months. If this is urgently needed, you could work around the limitation by writing a Starlark repo rule that replicates the behavior of local_repository (just `ctx.symlink`). If I have time left and/or if a lot of other people request the fix, I can try to get it in before 6.0.
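As a rough illustration of the workaround suggested above, a Starlark repo rule that symlinks a local directory could look like the sketch below. The file name, rule name and `path` attribute are hypothetical, and the sketch assumes a Bazel version in which `repository_ctx.workspace_root` is available.

```starlark
# local_repo.bzl (hypothetical file name; the rule below is not part of Bazel)

def _symlinked_local_repository_impl(repository_ctx):
    path = repository_ctx.attr.path
    if not path.startswith("/"):
        # Assumption: repository_ctx.workspace_root exists in the Bazel
        # version in use; it is the root of the main workspace.
        path = str(repository_ctx.workspace_root) + "/" + path
    source_dir = repository_ctx.path(path)

    # Symlink every top-level entry of the source directory into the
    # repository directory, so nothing is copied or extracted.
    for entry in source_dir.readdir():
        repository_ctx.symlink(entry, entry.basename)

symlinked_local_repository = repository_rule(
    implementation = _symlinked_local_repository_impl,
    attrs = {
        # Directory to expose; relative paths are resolved against the
        # main workspace root.
        "path": attr.string(mandatory = True),
    },
    # The rule only reads the local file system.
    local = True,
)
```

Unlike native.local_repository(), a rule defined this way is an ordinary Starlark repo rule, so it can be invoked from a module extension.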

> 2) Any chance to have the --registry option support (relative) file paths, in addition to URLs, to access a local registry?

This is already possible via an (undocumented?) trick: you can use file:///%workspace%/some/relative/path as the registry URL. We should document that somewhere...

> 3) Any chance we can have a local registry that contains source.json files that points to pre-extracted directories, instead of tarball archives? Possibly using relative file paths? If so, what would be a good way to check for integrity?

This is not possible right now, although we could look into implementing it. I didn't quite understand why the local-module-directories setup is not ideal for you; if you didn't have all the relative paths to other modules as `local_path_overrides` in your MODULE.bazel file, the same information would need to be present in the hypothetical local-registry-with-files, right? Unless you mean people are working from different modules, so the top-level MODULE.bazel file with tons of overrides needs to be in every single module?


David Turner

Jul 1, 2022, 10:37:39 AM
to Xudong Yang, bazel-discuss
On Fri, Jul 1, 2022 at 16:19, Xudong Yang <w...@bazel.build> wrote:
Hi David! Great to see more adoption of Bzlmod :)

> 1) When do we expect native.local_repository() to work in module extensions?

Please follow https://github.com/bazelbuild/bazel/issues/15412 for updates. This is actually surprisingly hard to fix since it involves a huge code refactor, and I have a pending changelist sitting around that I haven't touched in months. If this is urgently needed, you could work around the limitation by writing a Starlark repo rule that replicates the behavior of local_repository (just `ctx.symlink`). If I have time left and/or if a lot of other people request the fix, I can try to get it in before 6.0.

Thanks, I will do it. This is not urgent at all, I can start by auto-generating a lot of the Bazel stuff for now, but a cleaner way to handle this in the future would be fantastic :)
 
> 2) Any chance to have the --registry option support (relative) file paths, in addition to URLs, to access a local registry?

This is already possible via an (undocumented?) trick: you can use file:///%workspace%/some/relative/path as the registry URL. We should document that somewhere...

Oh nice, I'll try that.
 
> 3) Any chance we can have a local registry that contains source.json files that points to pre-extracted directories, instead of tarball archives? Possibly using relative file paths? If so, what would be a good way to check for integrity?

This is not possible right now, although we could look into implementing it. I didn't quite understand why the local-module-directories setup is not ideal for you; if you didn't have all the relative paths to other modules as `local_path_overrides` in your MODULE.bazel file, the same information would need to be present in the hypothetical local-registry-with-files, right? Unless you mean people are working from different modules, so the top-level MODULE.bazel file with tons of overrides needs to be in every single module?

I was just experimenting with different approaches. A local registry seemed the most logical way to abstract the location of external repositories from the rest of the project, so I started with that. I ended up with local-module-directories later, when I discovered a registry could not really work (so far).
I can perfectly well live without a registry for my project, but wanted to test a bit of everything, just in case.

Btw, may I suggest improving the documentation for local_repository() to suggest that it returns an actual object that can be used in module extensions? I was a little bit puzzled by that because the BzlMod user guide only mentions a "hypothetical repo rule" in its example.
 
Thanks,

- Digit

David Turner

Jul 1, 2022, 10:39:55 AM
to Xudong Yang, bazel-discuss
Another related question: repository rules like new_local_repository() allow you to generate a WORKSPACE and BUILD file. Can we have additional arguments to generate a MODULE.bazel file as well?

Xudong Yang

Jul 1, 2022, 10:47:45 AM
to David Turner, bazel-discuss
> Btw, may I suggest improving the documentation for local_repository() to suggest that it returns an actual object that can be used in module extensions?
I don't think that works? I believe it just returns `None`.

> repository rules like new_local_repository() allow you to generate a WORKSPACE and BUILD file, can we have additional arguments to generate a MODULE.bazel file as well?

David Turner

Jul 1, 2022, 11:12:23 AM
to Xudong Yang, bazel-discuss
On Fri, Jul 1, 2022 at 16:47, Xudong Yang <w...@bazel.build> wrote:
> Btw, may I suggest improving the documentation for local_repository() to suggest that it returns an actual object that can be used in module extensions?
I don't think that works? I believe it just returns `None`.

It is true that it currently doesn't work, though the error is that native.local_repository() is not resolved yet in module extensions (see error message here).
However, if I understand correctly, once issue 15412 is resolved, the call should return such an object, to make this function work correctly.
Or maybe I didn't understand how repositories created in module extensions work?
 

> repository rules like new_local_repository() allow you to generate a WORKSPACE and BUILD file, can we have additional arguments to generate a MODULE.bazel file as well?

Filed https://github.com/bazelbuild/bazel/issues/15786 :)
I had a quick look at the Bazel sources. It looks like one file to modify would be https://cs.opensource.google/bazel/bazel/+/master:src/main/java/com/google/devtools/build/lib/rules/repository/NewLocalRepositoryRule.java but I am a bit lost on the use of the setWorkspaceOnly() method on the RuleClass class and whether this is needed here.
I'll probably try again in a few weeks, after my vacation, but thanks for the pointers.

Xudong Yang

Jul 1, 2022, 11:16:30 AM
to David Turner, bazel-discuss
> However, if I understand correctly, once issue 15412 is resolved, the call should return such an object, to make this function work correctly.
> Or maybe I didn't understand how repositories created in module extensions work?

You actually don't need to return anything from the module extension implementation function; just invoking the repo rules instantiates them (just like how in BUILD files, you just call 'cc_library' and that target is created). (With #15412 fixed, that is.)
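To make that point concrete, here is a small sketch of a module extension whose implementation function only calls a repo rule and returns nothing. Every name in it (`stub_repo`, `prebuilt_repos`, the `repo` tag and its `repo_name` attribute) is made up for illustration, and the stand-in repo rule just writes an empty BUILD file.

```starlark
# extensions.bzl (hypothetical file; all names below are illustrative)

def _stub_repo_impl(repository_ctx):
    # Stand-in repo rule: creates a repository holding one empty BUILD file.
    repository_ctx.file("BUILD.bazel", "")

stub_repo = repository_rule(implementation = _stub_repo_impl)

def _prebuilt_repos_impl(module_ctx):
    # Calling the repo rule is what declares each repository, in the same
    # way that calling cc_library() in a BUILD file declares a target.
    for mod in module_ctx.modules:
        for tag in mod.tags.repo:
            stub_repo(name = tag.repo_name)
    # Nothing is returned from this function.

prebuilt_repos = module_extension(
    implementation = _prebuilt_repos_impl,
    tag_classes = {
        "repo": tag_class(attrs = {
            "repo_name": attr.string(mandatory = True),
        }),
    },
)

# Possible usage from MODULE.bazel (also illustrative):
#   ext = use_extension("//:extensions.bzl", "prebuilt_repos")
#   ext.repo(repo_name = "my_prebuilt")
#   use_repo(ext, "my_prebuilt")
```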

David Turner

Jul 1, 2022, 11:21:22 AM
to Xudong Yang, bazel-discuss
On Fri, Jul 1, 2022 at 17:16, Xudong Yang <w...@bazel.build> wrote:
> However, if I understand correctly, once issue 15412 is resolved, the call should return such an object, to make this function work correctly.
> Or maybe I didn't understand how repositories created in module extensions work?

You actually don't need to return anything from the module extension implementation function; just invoking the repo rules instantiates them (just like how in BUILD files, you just call 'cc_library' and that target is created). (With #15412 fixed, that is.)

Oh, I thought the example at the end of the BzlMod user guide was returning a list (like rule implementation functions return a list of providers), but it is just a list comprehension that acts as a simple loop.
Thanks for the clarification. I stand corrected: no change is needed here, and I'll fix my example :)