Hi everyone,

While Xudong, Philipp and I are still working on the design doc, I want to share the basic ideas of our solution for improving the external dependency management experience in Bazel and ask for your opinions on some important design decisions.

In short, we want to introduce a new way to declare external dependencies and move the responsibility for managing dependencies from Bazel to a new dependency management tool. The design has the following parts:
- Bazel module and MODULE.bazel: A Bazel module is a collection of available versions of a Bazel project. The project declares its dependencies on other Bazel modules in a MODULE.bazel file. Unlike with the WORKSPACE file, users only need to declare their direct dependencies.
- The Bazel dependency management tool: You can use this tool to add, remove, upgrade, and query your dependencies. It resolves your dependencies transitively by reading the MODULE.bazel files and makes the required external sources available to the Bazel build.
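To illustrate what "users only declare direct dependencies, the tool resolves the rest" could look like, here is a minimal Python sketch. It is not the actual tool: the `MODULE_FILES` dictionary is a hypothetical stand-in for the contents of each module's MODULE.bazel file, and the module names and versions are made up.

```python
from collections import deque

# Hypothetical stand-in for the MODULE.bazel file of each module version:
# (name, version) -> direct dependencies as (name, version) pairs.
MODULE_FILES = {
    ("my_app", "1.0"): [("rules_cc", "0.1"), ("lib_a", "2.0")],
    ("lib_a", "2.0"): [("lib_b", "1.3")],
    ("lib_b", "1.3"): [("rules_cc", "0.1")],
    ("rules_cc", "0.1"): [],
}


def resolve_transitive(root):
    """Return every module version reachable from root, breadth-first."""
    seen = {root}
    order = []
    queue = deque([root])
    while queue:
        module = queue.popleft()
        order.append(module)
        # Only the direct dependencies are declared; following them
        # repeatedly yields the full transitive set.
        for dep in MODULE_FILES[module]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


closure = resolve_transitive(("my_app", "1.0"))
```

Here `my_app` lists only `rules_cc` and `lib_a`, yet `closure` also contains `lib_b`, which is pulled in through `lib_a`.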
There are a lot of design details; we'll share the doc as soon as it's in shape.

However, the obvious question here is how a user publishes their project as a Bazel module. In our design, we have the concept of a Bazel registry. It is basically an index of Bazel modules hosted somewhere on the internet in a form that the dependency management tool understands.

We have the following ideas for what Bazel registries should look like in the new world, but we think the community's opinions are very important for making the decision.
- Bazel Central Registry
Like the Maven Central Repo or crates.io, we create a central registry for hosting Bazel modules. This is where all users should publish their projects in order to make them available to others. While this is the main source of Bazel external dependencies, third-party Bazel registries will also be supported for cases where a project cannot be in the central registry (e.g. publishing internal libraries inside a company). But in most cases, users just have to specify the module name and version of their direct dependencies, and our tool will know how to pull them from the central registry.
Pros
- It's easy for users to find and declare a dependency: module name + version, that's it.
- In the central registry, we can store patch files that cannot be upstreamed for some reason (e.g. patches that add BUILD files to a non-Bazel project), and these can be shared with all Bazel users.
- The Bazel modules are reviewed before being checked into the registry, which ensures their license validity and security.
- It's possible to calculate the dependents of a module, so compatibility checks are easier when a new version comes out.
- No module name conflicts, because the same module name can only appear once in the registry.
- The transitive dependency closure of any given module can be precomputed, saving a lot of HTTP downloads at dependency resolution time.
Cons
- Users probably have to figure out a way to get their dependencies into the central registry in the first place, especially in the initial phase.
- Very likely a huge maintenance cost that's nearly impossible for a three-person team to deal with. Mitigation: the community can join in and help with the governance.
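Two of the pros above (unique module names and a precomputable transitive closure) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `CentralRegistry` class, its methods, and the module names are hypothetical and do not describe a real registry format.

```python
class CentralRegistry:
    """Toy in-memory registry; a hypothetical model, not a real format."""

    def __init__(self):
        self._modules = {}   # name -> {version: [(dep_name, dep_version), ...]}
        self._closures = {}  # (name, version) -> frozenset of transitive deps

    def publish(self, name, version, deps):
        versions = self._modules.setdefault(name, {})
        # A single registry can reject duplicates, so a name/version pair
        # always refers to exactly one module.
        if version in versions:
            raise ValueError(f"{name}@{version} is already published")
        for dep in deps:
            if dep not in self._closures:
                raise KeyError(f"unknown dependency {dep[0]}@{dep[1]}")
        versions[version] = list(deps)
        # Dependencies are already in the registry, so the closure can be
        # computed once at publish time instead of at resolution time.
        closure = set(deps)
        for dep in deps:
            closure |= self._closures[dep]
        self._closures[(name, version)] = frozenset(closure)

    def closure(self, name, version):
        return self._closures[(name, version)]


registry = CentralRegistry()
registry.publish("rules_cc", "0.1", [])
registry.publish("lib_b", "1.3", [("rules_cc", "0.1")])
registry.publish("lib_a", "2.0", [("lib_b", "1.3")])
lib_a_closure = registry.closure("lib_a", "2.0")
```

A client asking for `lib_a` 2.0 can read `lib_a_closure` in one lookup rather than fetching each module's metadata in turn.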
- Bazel Official Registry + Third Party Registries
The Bazel team will host a registry for official Bazel rules, Starlark libraries, and other important Bazel-related projects (kind of like the Bazel Federation). Other interest groups can host their own Bazel registries. For example, the Bazel C++ community can host a third-party registry for releasing C++ projects as Bazel modules. Note that a Bazel module in one registry may have to depend on a module in another registry. For example, a library in the C++ Bazel registry may have to depend on rules_cc in the official Bazel registry. With this approach, users have to specify not only the module name and version of their direct dependencies, but also a list of registries that provide all the Bazel modules in their transitive dependencies. For some use cases, we could support a git repo as a mini registry that contains only one module.
Pros
- The first three pros of the Bazel Central Registry solution.
- Maintenance cost is spread across the community.
- Each interest group can have full control of their registry.
Cons
- The first con of the Bazel Central Registry solution.
- The same module name might be used in multiple registries, which could cause a conflict. Mitigation: we can require users to use a reversed internet domain as the module name (this is already recommended for repo names).
- When adding a new dependency, users have to make sure they also add its required registries. This list can grow as the number of registries in the ecosystem grows.
- For some multi-language projects, it's not clear which registry they should go into.
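The name-conflict con and its reversed-domain mitigation can be sketched as a lookup across the user's registry list. This is a rough Python illustration; the registry names, module names, and the `locate` helper are all hypothetical.

```python
def find_providers(module_name, registries):
    """Return the names of all listed registries that provide module_name."""
    return [reg for reg, modules in registries.items() if module_name in modules]


def locate(module_name, registries):
    """Return the single registry providing module_name, or raise."""
    providers = find_providers(module_name, registries)
    if not providers:
        raise LookupError(f"no listed registry provides {module_name}")
    if len(providers) > 1:
        raise LookupError(f"{module_name} is ambiguous: provided by {providers}")
    return providers[0]


# Hypothetical registries: both happen to publish a module called "utils".
short_names = {
    "official": {"rules_cc", "utils"},
    "cpp_community": {"cpp_lib", "utils"},
}

# The same modules named with a reversed internet domain no longer collide.
reversed_domain_names = {
    "official": {"rules_cc", "build.bazel.utils"},
    "cpp_community": {"cpp_lib", "org.example.cpp.utils"},
}

rules_cc_home = locate("rules_cc", short_names)
utils_home = locate("org.example.cpp.utils", reversed_domain_names)
```

Calling `locate("utils", short_names)` raises `LookupError`, because two registries provide that name.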
- Decentralized
In a decentralized world, we think the best way is to distribute Bazel modules as git repositories with version tags. We can still have Bazel registries, but they will not be the main source for pulling Bazel dependencies. When users declare dependencies, the source (a git repo or a Bazel registry) of a module should also be specified along with the module name and version.
Pros
- Low maintenance cost for the Bazel registry, because even if one exists, it should be very small.
- Easier for users to "publish" their projects: just make a new version tag.
Cons
- If one git repo changes, it could transitively break many downstream projects. Mitigation: we can use a mirror to ensure that what was available stays available and unchanged.
- There is a much higher chance of module name conflicts, e.g. 1) different projects accidentally use the same module name, or 2) the same module is hosted in different git repos (perhaps due to cloning). In the first case, we can distinguish modules by URL and use repo_remapping to mitigate, but in the second case, there could still be conflicts at link time.
- For projects that don't already use Bazel, the corresponding Bazel module (with Bazel BUILD files) has to be created and hosted by a third party.
- Compared to the solutions that use registries as the main source, this approach offers fewer security guarantees.
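One way to detect the "version tag was changed" failure mode in the decentralized model is to pin a content digest next to the source in the dependency declaration. The thread does not propose this; it is an assumption made for illustration, and the field names and URL in this Python sketch are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_source(declared: dict, fetched: bytes) -> bool:
    """True if the fetched archive matches the digest pinned in the declaration."""
    return sha256_hex(fetched) == declared["sha256"]


original = b"contents of v1.2.0 as first published"

# Hypothetical declaration: name and version plus the source and a digest.
dep = {
    "name": "lib_a",
    "version": "1.2.0",
    "source": "https://git.example.com/lib_a.git",
    "sha256": sha256_hex(original),
}

unchanged_ok = verify_source(dep, original)
retagged_ok = verify_source(dep, b"different contents behind the same tag")
```

With such a pin, a mirror or a client can tell that the bytes behind a tag are no longer the ones that were originally depended on.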
As you can see, each solution has its pros and cons. Please tell us what you think is the best approach. You can of course reply to this thread directly or provide us with more detailed information about your use case and opinions by filling out this form.

Cheers,
Yun Peng
On 18 Sep 2020, at 16:56, 'Yun Peng' via external-deps <extern...@bazel.build> wrote:

> A central registry must have reliability and security guarantees which would certainly require special launch effort and ongoing serving costs. This requires a plan to permanently fund it.

The current idea is to implement the central registry as a GitHub repo for storing metadata (name, version, dependencies, URLs of source blobs, etc.) plus a service for mirroring the source blobs (like the Bazel mirror, this can simply be a GCS bucket or something). I think it will be simpler than hosting a running HTTP service, but there will still be a lot of maintenance work.

> Why would downstream break? If everyone depends on versioned artifacts, a repo changing at head will not impact anyone.

I meant if a git repo is suddenly unavailable, or it's moved somewhere else, or the version tag is modified.

> Since we are moving to reconfigure and expand the role of the Bazel Federation, we could host patches and BUILD files there.

I don't quite understand: why should the Bazel Federation host patches for unrelated projects? Maybe we should sync a bit on your plan for the Bazel Federation.

> Only if you presume the registries are promising centralized security. This comes at a setup and support cost. In the central model we pay it all in one place, in the Bazel+third_party model then many registries have to incur that cost. My hunch is that the third_party ones will simply be github published, so that degenerates to the same as decentralized.

Yes, that's definitely a valid concern.

James,

Thanks for the feedback!

> Unfortunately this is a bit over simplistic. There are projects which have optional dependencies whose use can depend on external factors such as licensing restrictions. There is also the notion that package managers such as yum / apt use of 'virtual' packages i.e. dependencies that can be swapped out to a given standardised interface e.g. BLAS or MPI to name a couple of examples.

This is very interesting; to be honest, we haven't thought about it. Does this have to be at the package manager level instead of at the BUILD file level?

> I like how the renovate tool currently works in this regard - automatically updates the WORKSPACE when it can deduce that there is a new version of a repository available. I'd hope that this tool works like this but also atomically updates the transitive dependencies when this occurs.
> IMO the tool will also need to be aware of licensing constraints; optional dependencies may require compliance to given licensing terms (commercial or otherwise) and so the user needs to be able to specify these restrictions before the dependency solver can do its thing.
> SPACK as a dependency manager does this using a constraint solver. See the recent FOSDEM talk for their approach: https://www.youtube.com/watch?v=xBhpfW5cZ-w

Yes, we intend to make the tool able to upgrade your transitive dependencies. For version resolution, the current plan is to use MVS.

> For languages that have existing registries such as pypi, npm to name a few additional ones are you proposing that the bazel registry would have to mirror the dependencies contained in these repositories to be usable? Where binary components are required at install time then bazel will potentially be responsible for ensuring these are available to the language specific package manager; for example a BLAS package needs to be available before installing scipy in Python. I have yet to see a set of rules that provides a way of injecting built binary deps into the step that installs from a language specific repository. I guess this is partly due to the bazel design phases as the language package manager doesn't necessarily have a way of separating fetching of the dependencies from building/installing them.

We thought about mirroring existing third-party registries (Maven, PyPI, Cargo, etc.), but decided not to do so. We have another solution for integrating with them instead of mirroring them, and will share it once the design doc is ready. I know this is another big topic, but I want to focus on discussing the role of Bazel registries in this thread.
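For readers unfamiliar with MVS, the following Python sketch shows the core idea, assuming MVS here refers to Minimal Version Selection as used by Go modules: walk every requirement reachable from the root and, for each module, keep the highest version that anyone asked for. The `requirements` graph and module names are hypothetical, and versions are assumed to be dot-separated integers.

```python
def parse(version):
    """Turn a dotted version such as "1.4" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def minimal_version_selection(root, requirements):
    """Pick, per module, the highest version required anywhere in the graph."""
    selected = {}
    visited = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        name, version = node
        if name not in selected or parse(version) > parse(selected[name]):
            selected[name] = version
        # Follow the requirements declared by this exact module version.
        stack.extend(requirements.get(node, []))
    return selected


# Hypothetical graph: a needs c >= 1.2, b needs c >= 1.4.
requirements = {
    ("app", "1.0"): [("a", "1.1"), ("b", "1.0")],
    ("a", "1.1"): [("c", "1.2")],
    ("b", "1.0"): [("c", "1.4")],
    ("c", "1.2"): [],
    ("c", "1.4"): [],
}

build_list = minimal_version_selection(("app", "1.0"), requirements)
```

The result selects `c` at 1.4, the highest version that was actually requested. A newer `c` that exists somewhere but is not required by anyone is never chosen, which keeps resolution deterministic without a constraint solver.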
> This is probably actually a con for all the registry options - or it simply encourages people to run forks of the central registry which fragments the community and creates a decentralized registry.
> Con: The user of the package may want different visibility of the build targets to those that are in the BUILD file in the registry

Do you mean a project may want to limit who can depend on it? I don't know how this could be achieved in the open source world.

> So I guess the other option is centralized definition but decentralised hosting?

Does the plan mentioned in the reply to Tony fit your description? (Git repo for module metadata + a mirror of source blobs)

> Also I would expect an official registry to be run by much more than a three person team; in fact I'd likely want to say that it should definitely be formed by a team that includes people from outside of Google.

Yes, if there is a central Bazel registry, we would really like to work with people outside of Google.