Proposal: Canonical label literals

Xudong Yang

May 16, 2022, 10:36:16 AM
to bazel-dev, bazel-discuss
Hi all,

I'm proposing to add a new label syntax to Bazel ("@@foo//:bar" -- note the extra "@"), which allows the label to bypass repo mapping. This will solve some pain points with the upcoming Bzlmod. Please feel free to leave comments!

Thanks,
Xudong

Lukács T. Berki

May 17, 2022, 7:49:51 AM
to Xudong Yang, bazel-dev, bazel-discuss
Thanks for writing this up!

It's a pretty big change and as such, I'm worried that some issue will slip through, regardless of how many smart people read the design doc, however carefully. Do you have any alternatives in mind that are less intrusive?

--
You received this message because you are subscribed to the Google Groups "bazel-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-dev/CADMn-5aNVqOMx82vP%2BMAoLvVeUfths1YuU0OQ3obJziROUxJqw%40mail.gmail.com.


--
Lukács T. Berki | Software Engineer | lbe...@google.com | 

Google Germany GmbH | Erika-Mann-Str. 33  | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891

Gregg Reynolds

May 17, 2022, 6:00:42 PM
to Xudong Yang, bazel-dev, bazel-discuss
"The result of str(Label("@foo//:target")) cannot be safely passed to a Label constructor, because the result is actually "@foo.1.0.2//:target" -- foo.1.0.2 is a canonical repo name, but not usually a valid apparent repo name in the current repo (i.e. missing from its repo mapping)."

This looks to me like a (major) design flaw. Adding bzlmod should not change the meaning of the Label function. IMO the Label function should construct labels from strings, period. If you want to then obtain the mapping from that label to (the label of) a canonical repo name, that's a separate operation.

More generally, the canonical label is an implementation detail that should not be exposed to Starlark code (IMO). But maybe you could get the same result by adding another level of scoping. Repository mapping already in fact does this - a `repo_mapping` is scoped to the repo in which it occurs, no? Can you just make this explicit? Maybe something as simple as @foo@bar//pkg:target, meaning @bar//pkg:target as mapped by @foo. Then you could also have e.g. @baz@bar//pkg:target, mapping to a different version of @bar.

-Gregg



--

Xudong Yang

May 18, 2022, 7:24:36 AM
to Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss
> Adding bzlmod should not change the meaning of the Label function.  IMO the Label function should construct labels from strings, period.

This is not new behavior introduced by Bzlmod; today, the Label function already passes the argument through the current repo's repo mapping. Imagine that you say repo A should map "foo" to "bar", and A creates a label using `Label("@foo//:target")`. IMO it would be completely surprising if that didn't get mapped to "@bar//:target"; what would A having a repo mapping even do, then?
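
To make the point concrete, here is a minimal sketch in plain Python (not Starlark, and not Bazel's actual implementation). The names `REPO_MAPPINGS` and `resolve_repo` are hypothetical and exist only for this illustration; the sketch assumes labels of the form "@repo//pkg:target".

```python
# Hypothetical model: each repo owns a mapping from the names it writes
# (apparent names) to the repos those names actually refer to.
REPO_MAPPINGS = {
    "A": {"foo": "bar"},  # repo A says: "foo" should map to "bar"
}

def resolve_repo(current_repo, label):
    """Rewrite the repo part of '@repo//pkg:target' using current_repo's mapping."""
    if not label.startswith("@"):
        return label  # no explicit repo part, nothing to map
    repo, sep, rest = label[1:].partition("//")
    mapping = REPO_MAPPINGS.get(current_repo, {})
    # Names without an entry in the mapping are left unchanged.
    return "@" + mapping.get(repo, repo) + sep + rest

print(resolve_repo("A", "@foo//:target"))  # mapped through A's repo mapping
print(resolve_repo("B", "@foo//:target"))  # B has no mapping, so unchanged
```

In this model the same string means different things depending on which repo evaluates it, which is exactly what a repo mapping is for.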

===============

> At the moment, `Label` objects carry
> additional information beyond what is reflected in their
> stringification. If we want to allow repository rules to generate
> BUILD files that reference targets passed in via labels (we most
> certainly do, since this is a common use case), we need *some* way to
> serialize all the information attached to a `Label` object into a
> string that can later be reified into a `Label` object.

This is very well put, thank you Fabian :)


Fabian Meumertzheim

May 18, 2022, 11:53:32 PM
to Gregg Reynolds, Xudong Yang, bazel-dev, bazel-discuss
On Wed, May 18, 2022 at 12:00 AM Gregg Reynolds <d...@mobileink.com> wrote:
>
> IMO the Label function should construct labels from strings, period. If you want to then obtain the mapping from that label to (the label of) a canonical repo name, that's a separate operation.

With the proposal, the only way to get your hands on such a new label
literal in the first place would be to call `str` on a `Label` object,
which has been a relatively uncommon operation so far. For all label
literals that are valid today, the result of applying `Label` to them
wouldn't change. Would you prefer a new function or struct field (e.g.
`to_canonical_label()`) to be added to `Label` objects instead of
modifying their stringification behavior?

> But maybe you could get the same result by adding another level of scoping. Repository mapping already in fact does this - a `repo_mapping` is scoped to the repo in which it occurs, no? Can you just make this explicit? Maybe something as simple as @foo@bar//pkg:target, meaning @bar//pkg:target as mapped by @foo. Then you could also have e.g. @baz@bar//pkg:target, mapping to a different version of @bar.

It's important to keep in mind that, without any further context, any
unambiguous reference to a target has to include at least one
canonical repository name: "@foo@bar//pkg:target" is meaningless on
its own if "foo" is just an apparent repository name (borrowing
language from the proposal). To be unambiguous in general, only
"@foo.1.0.2@bar//pkg:target" has enough information to 1. look up the
repo mapping to apply from "foo.1.0.2"'s perspective and 2. apply it
to the apparent repo name "bar" to get the canonical repo name, say
"bar.2.0.4". But then the result is just a more complicated version of
the syntax proposed by Xudong, with which this label could instead be
written as "@@bar.2.0.4//pkg:target".

> More generally, the canonical label is an implementation detail that should not be exposed to Starlark code (IMO).

To be clear, I don't intend to pick on the particular label syntax
"@foo@bar//pkg:target", but rather point out the more general
complexity of the situation: At the moment, `Label` objects carry
additional information beyond what is reflected in their
stringification. If we want to allow repository rules to generate
BUILD files that reference targets passed in via labels (we most
certainly do, since this is a common use case), we need *some* way to
serialize all the information attached to a `Label` object into a
string that can later be reified into a `Label` object. Such a
serialization format would probably only be good for this purpose, but
it definitely has to be exposed to Starlark (in the form of BUILD
files) to fulfill it. Whether that is in the form of
"@@foo.1.0.3//pkg:target" (new syntax) or
"@_never_even_think_of_writing_this_yourself___foo.1.0.3//pkg:target"
(a long prefix to prevent this from colliding with existing labels) or
`labels.from_canonical_only_use_in_generated_code("@foo.1.0.3//pkg:target")`
(new function instead of new syntax) is something to be discussed and,
ultimately, shouldn't matter too much.
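
A rough sketch of the round-trip requirement, written in plain Python rather than Starlark. `Label`, `parse` and `serialize` here are hypothetical stand-ins, not Bazel APIs, and the "@@" handling simply follows the syntax from the proposal.

```python
from collections import namedtuple

# Hypothetical stand-in for a Label object; `repo` holds the canonical repo name.
Label = namedtuple("Label", ["repo", "pkg", "name"])

def parse(text, repo_mapping):
    """Turn a label string into a Label, from the view of one repo's mapping."""
    if text.startswith("@@"):
        # Canonical literal: bypass the repo mapping entirely.
        repo, rest = text[2:].split("//", 1)
    elif text.startswith("@"):
        apparent, rest = text[1:].split("//", 1)
        repo = repo_mapping[apparent]  # KeyError if the name is not visible
    else:
        raise ValueError("this sketch only handles labels with a repo part")
    pkg, name = rest.split(":", 1)
    return Label(repo, pkg, name)

def serialize(label):
    """Emit a string that parses back to the same Label under any repo mapping."""
    return "@@{}//{}:{}".format(label.repo, label.pkg, label.name)

mapping = {"foo": "foo.1.0.3"}  # made-up mapping of the current repo
original = parse("@foo//pkg:target", mapping)
text = serialize(original)
# The generated BUILD file may be evaluated under a different mapping
# (here: an empty one), yet the label survives the round trip.
assert parse(text, {}) == original
```

With a single "@" the serialized string would be looked up in the reader's repo mapping again, and "foo.1.0.3" is normally not an apparent name there, so the lookup would fail.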

Fabian


Lukács T. Berki

May 19, 2022, 6:40:16 AM
to Xudong Yang, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss
On Wed, May 18, 2022 at 1:24 PM 'Xudong Yang' via bazel-discuss <bazel-...@googlegroups.com> wrote:
>> Adding bzlmod should not change the meaning of the Label function.  IMO the Label function should construct labels from strings, period.
>
> This is not new behavior introduced by Bzlmod; today, the Label function already passes the argument through the current repo's repo mapping. Imagine that you say repo A should map "foo" to "bar", and A creates a label using `Label("@foo//:target")`. IMO it would be completely surprising if that didn't get mapped to "@bar//:target"; what would A having a repo mapping even do, then?

Whereas you're correct that this is not new behavior bzlmod introduces, there is a difference between repo remapping being an obscure, hardly ever used feature and it being core functionality used in every single build.

I think the main issue is that with repo remapping, a "Label" can be two things: the Label as written in the BUILD file and a Label after it has gone through repo remapping. I think there are enough use cases for both that we can't just say that we support only one of them (I'd welcome arguments against this statement!). There is some precedent for a Label changing after stringification because package-relative labels like ":foo" get stringified in their full form (e.g. "//bar:foo"), but my gut feeling is that we are better off if e.g. "bazel query" reported the form of Label before repository remapping. The rollout of bzlmod is a big change already and so it's probably better not to change "bazel query" and the like with a flag flip.

All of the above sounds like an argument for Fabian's proposed "Label.to_canonical()" function, doesn't it? (if it is not impossible to implement)

(I wish repository rules worked in a way that's different from generating BUILD files in text form because then we could think about alternatives to transmitting the necessary information in string form, but alas, I don't think such a change is feasible, at least not within the scope of bzlmod)



Xudong Yang

May 19, 2022, 7:28:03 AM
to Lukács T. Berki, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss
> my gut feeling is that we are better off if e.g. "bazel query" reported the form of Label before repository remapping.

What does this mean exactly? Say you have a dependency on @foo, and @foo has a dependency on @bar. (You have no visibility into @bar.) You do `bazel query deps(@foo//:target)`, and it spits out:
    :other_target
    //my_pkg:target
    @bar//:something
Because these are the strings written in the BUILD file of @foo's root package, for the target called "target".

Is that what you'd want? To me, that seems incredibly confusing and *very* error prone. I'd much rather it output something like the following:
    @foo//:other_target
    @foo//my_pkg:target
    @@bar.1.0.2//:something
So that I know the first two deps I'm getting are actually in @foo, not in my main repo, and that the third thing is something that I don't normally have access to.
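
One way to picture that output rule, as a plain-Python sketch. The function `display` and the version "foo.1.2.3" are made up for illustration; this is not how `bazel query` is implemented.

```python
def display(canonical_repo, pkg, name, main_repo_mapping):
    """Render a label for the user running the query from the main repo.

    If the main repo has an apparent name for the target's repo, use it;
    otherwise fall back to the canonical literal with the "@@" prefix.
    """
    for apparent, canonical in main_repo_mapping.items():
        if canonical == canonical_repo:
            return "@{}//{}:{}".format(apparent, pkg, name)
    return "@@{}//{}:{}".format(canonical_repo, pkg, name)

# Made-up data: the main repo can see @foo, but has no name for bar.1.0.2.
main_repo_mapping = {"foo": "foo.1.2.3"}
deps = [
    ("foo.1.2.3", "", "other_target"),
    ("foo.1.2.3", "my_pkg", "target"),
    ("bar.1.0.2", "", "something"),
]
for repo, pkg, name in deps:
    print(display(repo, pkg, name, main_repo_mapping))
```

Every printed string can be pasted back into a command run from the main repo, whether or not the main repo has a name for the dependency.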


> All of the above sounds like an argument for Fabian's proposed "Label.to_canonical()" function, doesn't it? (if it is not impossible to implement)

It's definitely not impossible to implement, but again, what do you expect `str(Label("@bar//:something"))` to return when it's used from within @foo? Right now, it returns "@bar.1.0.2//:something", which is very much not ideal because this string cannot actually be used anywhere. The proposal changes it to "@@bar.1.0.2//:something". Is your expectation that it returns "@bar//:something" (which would be impossible to implement without storing an "original string" with every Label object -- and IMO a bad idea to implement in the first place), or something else?

Lukács T. Berki

May 19, 2022, 8:28:01 AM
to Xudong Yang, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
I think we can agree that @@bar.1.0.2//:something is an improvement over @bar.1.0.2//:something :)

My uncertainty comes from the fact that I don't know what the output of "bazel query" is used for in general; I *personally* use it a lot to expand macros and the like, for which use case being as close to the original string the label was parsed from is unquestionably the best approach, but I also use it to descend down dependency graphs when debugging, for which use case resolving the label is better because that's how one can find dependencies.

On the Starlark API side, it feels wrong to build what is essentially a mini analysis phase to resolve cross-repository dependencies into the constructor of Label, but when I do the mental exercise of replacing just the name of Label.parseAbsolute() with Label.resolve(), it sounds much better; if we could separate the "parsing" part from the "resolving" part, it would be much better. Maybe we can call new Label() a legacy API and provide e.g. Label.parse() and Label.resolve() to Starlark? This would also address the well-founded concerns Gregg voiced above.

Ivo, Jon, WDYT?

Xudong Yang

May 19, 2022, 8:35:52 AM
to Lukács T. Berki, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
> I *personally* use it a lot to expand macros and the like, for which use case being as close to the original string the label was parsed from is unquestionably the best approach

Could you give an example for "expand macros"? I feel like resolving the label is never *not* a good approach. Also, I can't stress this enough -- even today, the label is already always "resolved", even in `query` -- if you write "//pkg:target" in @foo, it's always understood as "@foo//pkg:target".


> Maybe we can call new Label() a legacy API and provide e.g. Label.parse() and Label.resolve() to Starlark?

Sorry for being insistently pedantic, but what do you envision these to actually do? I have a hard time evaluating many of these arguments since they all appear very vague to me.

In the context of the repo @foo, what does `Label.parse("@bar//pkg:target")` return? What's its type? And what does `Label.resolve("@bar//pkg:target")` return? What's its type?

Xudong Yang

May 19, 2022, 8:46:23 AM
to Gunnar Wagenknecht, Lukács T. Berki, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
@Gunnar, that comment was in the context of using the output of `bazel query` to expand macros. The example you gave is not one I was specifically asking for, but nonetheless pertinent to the discussion -- it demonstrates how we already need to be cognizant of when a string is turned into a Label (being "resolved" in the process).


Gunnar Wagenknecht

May 19, 2022, 8:56:05 AM
to Xudong Yang, Lukács T. Berki, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein


On May 19, 2022, at 05:35, 'Xudong Yang' via bazel-dev <baze...@googlegroups.com> wrote:

>> I *personally* use it a lot to expand macros and the like, for which use case being as close to the original string the label was parsed from is unquestionably the best approach
>
> Could you give an example for "expand macros"? I feel like resolving the label is never *not* a good approach. Also, I can't stress this enough -- even today, the label is already always "resolved", even in `query` -- if you write "//pkg:target" in @foo, it's always understood as "@foo//pkg:target".

Fabian Meumertzheim

May 19, 2022, 9:16:32 AM
to Xudong Yang, Lukács T. Berki, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
On Thu, May 19, 2022 at 2:35 PM Xudong Yang <w...@bazel.build> wrote:
> In the context of the repo @foo, what does `Label.parse("@bar//pkg:target")` return? What's its type? And what does `Label.resolve("@bar//pkg:target")` return? What's its type?

Not Lukács, but I could imagine the following:
* Both Label.parse and Label.resolve return a Label or similar struct-like Starlark object with `workspace_name` (which, as far as I understand the terminology, would make more sense as `repository_name`), `package` and `name` fields.
* The field values of Label.parse("@bar//pkg:target") would be `"bar"`, `"pkg"` and `"target"` (parsing of the label string with no further context)
* The field values of Label.resolve("@bar//pkg:target") would be `"@bar.1.0.2"`, `"pkg"` and `"target"` (parsing + resolution relative to the current package & repo mapping)
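
A rough model of that split in plain Python. `parse_label`, `resolve_label` and `ParsedLabel` are hypothetical names, none of this exists in Bazel, and for simplicity the sketch stores repo names without a leading "@".

```python
from collections import namedtuple

# Hypothetical struct-like result shared by both functions.
ParsedLabel = namedtuple("ParsedLabel", ["repo", "package", "name"])

def parse_label(text):
    """Purely syntactic split; no repo mapping or package context is consulted."""
    repo, package = None, None
    if text.startswith("@"):
        repo, rest = text[1:].split("//", 1)
        text = "//" + rest
    if text.startswith("//"):
        package, _, name = text[2:].partition(":")
    else:
        name = text.lstrip(":")  # package-relative form such as ":target"
    return ParsedLabel(repo, package, name)

def resolve_label(text, current_repo, current_package, repo_mapping):
    """Parse, then fill in the missing parts and apply the repo mapping."""
    parsed = parse_label(text)
    if parsed.repo is None:
        repo = current_repo  # label without a repo part stays in the current repo
    else:
        repo = repo_mapping[parsed.repo]  # apparent name -> canonical name
    package = current_package if parsed.package is None else parsed.package
    return ParsedLabel(repo, package, parsed.name)

# Made-up context: we are inside repo "foo.1.2.3", which maps "bar" to "bar.1.0.2".
print(parse_label("@bar//pkg:target"))
print(resolve_label("@bar//pkg:target", "foo.1.2.3", "", {"bar": "bar.1.0.2"}))
```

The design choice being illustrated is that parsing needs only the string, while resolving needs the context in which the string was written.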

> I feel like resolving the label is never *not* a good approach.

I don't know of many use cases for Label.parse, but generating "stable" code from a label literal is one. Label.parse would have been very useful for rules_runfiles (see https://github.com/fmeum/rules_runfiles/blob/cd0c47bcf1fcfcb2d53e5deb42101def4cf69044/runfiles/internal/common.bzl#L50).

Gregg Reynolds

May 19, 2022, 12:00:56 PM
to Xudong Yang, bazel-dev, bazel-discuss
On Mon, May 16, 2022 at 9:36 AM 'Xudong Yang' via bazel-dev <baze...@googlegroups.com> wrote:
> Hi all,
>
> I'm proposing to add a new label syntax to Bazel ("@@foo//:bar" -- note the extra "@"), which allows the label to bypass repo mapping. This will solve some pain points with the upcoming Bzlmod. Please feel free to leave comments!

What's the timeframe for this?  Personally I strongly prefer a solution that does not expose the version string, but maybe your proposal is the best we can do.  In any case, I'm a little nervous as a rules author (OBazl) and I'd like a little more time to study this. (I confess I have not actually used either repo_mapping or bzlmod yet).  FWIW OCaml has a similar problem, a flat namespace and ensuing name clashes. The solution was to allow (OCaml) module aliasing and leave it to the build tools to make it all work.

Aside from the pros and cons of your proposal the documentation could use some work.  I suspect this is one of those areas where the documentation is harder to write than the code. Some observations and suggestions:

* "Dependency" and its cognates are overloaded and confusing (to me).  Suggest "module dep" (mdep?) and "target dep" (tdep) where appropriate.

* "Repository name". We have repo names, module names, name and
  repo_name attributes, module aliases (mappings), version strings,
  "the directory name the repository lives in", etc. It's all very
  confusing.

* Repository mapping v. (bzlmod) indexing

"Repository mapping" is a legacy featurebug. It is expressed via
attribute "repo_mapping", which is defined for native repo rules
"local_repository" and "new_local_repository", and is implicitly
defined for user-defined repo rules.

Semantically, a repo mapping is an unconstrained alias. It allows the
user to map a repo name (not label) to any other valid repo name.  Really should be renamed "module_alias" but I suppose that's a non-starter.

Bzlmod introduces a "bazel_dep" rule, which indirectly serves a
similar purpose. It (implicitly) declares a "canonical repository name" from the
bazel_dep attributes. But the documentation is not clear. It gives a
template "module_name.version"; but where does the module_name come
from? "bazel_dep" has a "name" attribute, but it also has a
"repo_name" attribute - which one is used? Or does this refer to the
"name" and "version" attributes of the "module" rule? What happens if the module is aliased? Either way, why
call this "canonical repository name" instead of "canonical module
name"?

More generally: if we're switching from WORKSPACE.bazel and repos to
MODULE.bazel and modules, then the documentation should be consistent:
modules not repositories.

From https://bazel.build/docs/bzlmod#repository-names:

"Canonical repository name: The globally unique repository name for
each repository. This will be the directory name the repository lives
in." I don't know what this means. Maybe it's just a mistake; don't
repository (module?) names come from attributes?

"Local repository name: The repository [module?] name to be used in the BUILD
and .bzl files within a repo. The same dependency could have different
local names for different repos." Don't know what that last sentence
means. Does it mean "different repos can use different names for the
same dependency"?

"For Bazel module repos: module_name by default, or the name specified
by the repo_name attribute in bazel_dep." Which is it? And
again, where does module_name come from? What if it's aliased?

An attempt to articulate the problem(s) a little more clearly:

* Modules not repositories. At the very least we need clear, simple,
  unambiguous etc. definitions of both terms.
* "Name" should mean name, not name plus version. @foo.1.2.3 should be called an "indexed module name", not a module name (see below).
* "Module" can be thought of as a "family of modules indexed by
  version string". A member of that family (version of the module) is
  identified by an "Indexed Module Name" or IMN, which is
  name plus version string. So @foo names a "module" (= family of modules) whose
  members are @foo.1, @foo.2, etc.  To be clear: @foo.1 and @foo.2 are separate modules (in the @foo family).
* Indexing a family of modules is expressed via the "bazel_dep" rule with name and version attributes.
* In general this indexing is an implementation detail that need not (and
  should not IMO) be exposed to starlark code.
* Module aliases are supported via the 'repo_mapping' attribute; such
  aliases are scoped to the module in which they are declared.
* The "bazel_dep" rule also supports aliasing via the "repo_name" attribute. (?)
* Indexing and aliasing are mutually orthogonal.
* Aliasing is scoped; it follows that module names are scoped. A Fully
  Qualified Module Name would include the context module name.
  Possible syntax: "@ctx@mod//pkg:tgt". (This is a label containing an FQMN)
* An "Indexed FQMN" would include the version string but as noted
  above this should never be needed in user code.
* Scopes (contexts) nest so we could have e.g. @a@b@c//pkg:tgt
* In principle the FQMN should be sufficient to unambiguously identify
  a unique indexed FQMN. So generated BUILD files could use FQMNs.
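
A sketch of how nested scopes like "@a@b@c" could be resolved, in plain Python. Everything here is hypothetical: the table `MODULE_DEPS`, the `"<main>"` root key, the version numbers and the function `resolve_fqmn` are invented to illustrate the idea, and the sketch assumes Bazel would keep such a per-module table available.

```python
# Hypothetical data: for each indexed module name, the names it declares
# and the indexed module names they refer to. "<main>" is the root module.
MODULE_DEPS = {
    "<main>": {"a": "a.1.0"},
    "a.1.0": {"b": "b.2.0"},
    "b.2.0": {"c": "c.3.1"},
}

def resolve_fqmn(fqmn, root="<main>"):
    """Resolve '@a@b@c' by looking up each name in the scope of the previous one."""
    current = root
    for name in fqmn.lstrip("@").split("@"):
        current = MODULE_DEPS[current][name]  # KeyError if not declared in that scope
    return current

print(resolve_fqmn("@a"))      # declared directly by the root module
print(resolve_fqmn("@a@b@c"))  # c as seen by b, as seen by a, as seen by the root
```

The walk only terminates in a unique answer because it starts from a known root, so an FQMN is unambiguous relative to the module in which it is written.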
 
Your question from another msg: "Say you have a dependency on @foo, and @foo has a dependency on @bar. (You have no visibility into @bar.) You do `bazel query deps(@foo//:target)`."

Your preference is that it emit:
    @foo//:other_target
    @foo//my_pkg:target
    @@bar.1.0.2//:something

But I think that should be
    @@foo.1.2.3//:other_target
    @@foo.1.2.3//my_pkg:target
    @@bar.1.0.2//:something

and then @@ does not add information.  With namespacing:

    @foo//:other_target
    @foo//my_pkg:target
    @foo@bar//:something

Here @foo@bar tells us that @bar is declared in @foo; @foo tells us that @foo is declared in the main (unnamed) repo (equivalent to @@foo). Of course this is premised on the assumption that Bazel retains enough information to resolve FQMNs to indexed FQMNs when it needs to.  I've no idea if that is feasible, but looking at it from the outside it seems like it ought to be.

Cheers,

Gregg



Gregg Reynolds

May 19, 2022, 12:06:54 PM
to Xudong Yang, Lukács T. Berki, Fabian Meumertzheim, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
On Thu, May 19, 2022 at 7:35 AM Xudong Yang <w...@bazel.build> wrote:
>> I *personally* use it a lot to expand macros and the like, for which use case being as close to the original string the label was parsed from is unquestionably the best approach
>
> Could you give an example for "expand macros"? I feel like resolving the label is never *not* a good approach. Also, I can't stress this enough -- even today, the label is already always "resolved", even in `query` -- if you write "//pkg:target" in @foo, it's always understood as "@foo//pkg:target".

I'd make a distinction between label expansion and module name resolution.

Label expansion:  :tgt =>  @foo//pkg:tgt
Module name resolution:  @foo =>  @foo.1.2.3  (if using bzlmod) or @foo => @bar  (if using repo_mapping with or without bzlmod)

Gregg

 

Gregg Reynolds

May 19, 2022, 12:20:57 PM
to Fabian Meumertzheim, Xudong Yang, bazel-dev, bazel-discuss
On Wed, May 18, 2022 at 2:00 AM Fabian Meumertzheim <fab...@meumertzhe.im> wrote:
> On Wed, May 18, 2022 at 12:00 AM Gregg Reynolds <d...@mobileink.com> wrote:
>>
>> IMO the Label function should construct labels from strings, period. If you want to then obtain the mapping from that label to (the label of) a canonical repo name, that's a separate operation.
>
> With the proposal, the only way to get your hands on such a new label
> literal in the first place would be to call `str` on a `Label` object,
> which has been a relatively uncommon operation so far. For all label
> literals that are valid today, the result of applying `Label` to them
> wouldn't change. Would you prefer a new function or struct field (e.g.
> `to_canonical_label()`) to be added to `Label` objects instead of
> modifying their stringification behavior?

That might work, I'd have to think about it some more.
 

> But maybe you could get the same result by adding another level of scoping. Repository mapping already in fact does this - a `repo_mapping` is scoped to the repo in which it occurs, no? Can you just make this explicit? Maybe something as simple as @foo@bar//pkg:target, meaning @bar//pkg:target as mapped by @foo. Then you could also have e.g. @baz@bar//pkg:target, mapping to a different version of @bar.

It's important to keep in mind that, without any further context, any
unambiguous reference to a target has to include at least one
canonical repository name: "@foo@bar//pkg:target" is meaningless on
its own if "foo" is just an apparent repository name (borrowing
language from the proposal).

Yes, namespacing would have to be inductive - it has to "bottom out" so that Bazel can know where to look for aliases and version indexing. But I think the context should provide that. E.g. if @foo@bar//pkg:tgt occurs in module @buz, then the latter must declare @foo. Or something like that; to be honest, I have not worked it out in detail.
 

stringification. If we want to allow repository rules to generate
BUILD files that reference targets passed in via labels (we most
certainly do, since this is a common use case), we need *some* way to
serialize all the information attached to a `Label` object into a
string that can later be reified into a `Label` object. Such a
serialization format would probably only be good for this purpose, but
it definitely has to be exposed to Starlark (in the form of BUILD
files) to fulfill it. Whether that is in the form of
"@@foo.1.0.3//pkg:target" (new syntax) or
"@_never_even_think_of_writing_this_yourself___foo.1.0.3//pkg:target"
(a long prefix to prevent this from colliding with existing labels) or
`labels.from_canonical_only_use_in_generated_code("@foo.1.0.3//pkg:target")`
(new function instead of new syntax) is something to be discussed and,
ultimately, shouldn't matter too much.
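
The round trip described above can be sketched in Python, assuming the "@@" variant of the serialization format. The `Label` class and `parse_label` function here are hypothetical stand-ins rather than Bazel's own code, and treating an unmapped apparent name as an error is a simplification made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical model of a resolved label."""
    canonical_repo: str
    pkg: str
    target: str

    def __str__(self) -> str:
        # Serialize with "@@" so the string can be parsed without a mapping.
        return f"@@{self.canonical_repo}//{self.pkg}:{self.target}"


def parse_label(text: str, repo_mapping: dict) -> Label:
    """Parse "@repo//pkg:target" (mapped) or "@@repo//pkg:target" (unmapped)."""
    if text.startswith("@@"):
        canonical, _, rest = text[2:].partition("//")   # bypass repo mapping
    elif text.startswith("@"):
        apparent, _, rest = text[1:].partition("//")
        canonical = repo_mapping[apparent]   # KeyError if the name is not visible
    else:
        raise ValueError(f"expected a label starting with '@': {text!r}")
    pkg, _, target = rest.partition(":")
    return Label(canonical, pkg, target)
```

With a mapping of `{"foo": "foo.1.0.3"}`, parsing "@foo//pkg:target" and stringifying the result yields "@@foo.1.0.3//pkg:target", which parses back to an equal `Label`. The single-"@" spelling "@foo.1.0.3//pkg:target" fails in this sketch, because "foo.1.0.3" is not an apparent name in the mapping.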

Yeah, could be there's no way to avoid the version string, but it sure doesn't smell very good.

G

Lukács T. Berki

May 20, 2022, 3:58:10 AM5/20/22
to Xudong Yang, Fabian Meumertzheim, Gregg Reynolds, bazel-dev, bazel-discuss, Ivo Ristovski List, Jon Brandvein
On Thu, May 19, 2022 at 2:35 PM Xudong Yang <w...@bazel.build> wrote:
> I *personally* use it a lot to expand macros and the like, for which use case being as close to the original string the label was parsed from is unquestionably the best approach

Could you give an example for "expand macros"? I feel like resolving the label is never *not* a good approach. Also, I can't stress this enough -- even today, the label is already always "resolved", even in `query` -- if you write "//pkg:target" in @foo, it's always understood as "@foo//pkg:target".
 I meant "debugging Starlark code". Sometimes I find it easier to look at the output of a Starlark macro to understand what it's doing. I do realize that package-relative labels are resolved there; that makes "query --output=build" less useful but you're right in that it establishes a precedent for processing labels between parsing and query.



> Maybe we can call new Label() a legacy API and provide e.g. Label.parse() and Label.resolve() to Starlark? 

Sorry for being insistently pedantic, but what do you envision these to actually do? I have a hard time evaluating many of these arguments since they all appear very vague to me.

In the context of the repo @foo, what does `Label.parse("@bar//pkg:target")` return? What's its type? And what does `Label.resolve("@bar//pkg:target")` return? What's its type?
See what Fabian said below; parse() would return a Label whose repository part is "@bar" and resolve() would return one with "@@bar.1.0.2" as the repository part.

Given the general uncertainty about this, I recommend two changes that would constrain the "blast radius" of this change:

A. Only allow @@ labels when --experimental_enable_bzlmod is in effect. This would let us work out the ramifications experimentally without committing to it.

B. Do not allow @@ labels within hand-written BUILD files. As per our previous discussion, I do realize that the line between the two is not very bright due to "generated" BUILD files that are then checked in, but a simple rule like only allowing @@ labels in BUILD files that are written by a repository rule would already help. I wish there was a way to only allow @@ labels that come from the resolution of a "regular" label, but the fact that they must be reified in text makes that difficult. I could imagine a number of workarounds:
  1.  str(Label) returning a "secret token" (but then how does one deal with server restarts and cached repositories?)
  2. Your "__do_not_ever_think_of_writing_this_yourself" idea 
  3. Something like deps=[ResolvedLabel("@@repo//pkg:name")] where we upgrade LabelConverter to understand the return value of ResolvedLabel and the stringification of ResolvedLabel() would not be parseable as a "regular" label
but none of these looks very satisfying.

Brian Silverman

May 20, 2022, 6:47:04 AM5/20/22
to Lukács T. Berki, Xudong Yang, bazel-dev, bazel-discuss
On Tue, May 17, 2022 at 4:49 AM 'Lukács T. Berki' via bazel-dev <baze...@googlegroups.com> wrote:
It's a pretty big change and as such, I'm worried that some issue will slip through, regardless of how many smart people read the design doc, however carefully. Do you have any alternatives in mind that are less intrusive?

Instead, what about just allowing @foo.1.0.2//:target everywhere labels are currently accepted? Effectively, make repo mapping a NOP for a name that's a value in the map. Equivalently, if a repo mapping includes foo => foo.1.0.2, implicitly include foo.1.0.2 => foo.1.0.2 in the mapping too. That avoids new label syntax, and localizes the change in behavior to repo mapping instead.
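
As a rough illustration of this idea, the implicit entries could be derived from an existing mapping as below. This is a Python sketch with a hypothetical function name, and it assumes that an explicit entry wins if a canonical name happens to collide with an apparent name already in the mapping.

```python
def with_canonical_identities(repo_mapping: dict) -> dict:
    """Return a copy of the mapping where every canonical name maps to itself."""
    extended = dict(repo_mapping)
    for canonical in repo_mapping.values():
        # setdefault keeps any explicit entry that already uses this key.
        extended.setdefault(canonical, canonical)
    return extended
```

For `{"foo": "foo.1.0.2"}` this adds `"foo.1.0.2": "foo.1.0.2"`, so both @foo//:target and @foo.1.0.2//:target resolve to the same repository.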

The main con that comes to mind is that people can now write @foo.1.0.2 directly. That has fairly obvious consequences though, and fixing the problem after noticing it while upgrading to @foo.1.0.3 is straightforward. It might also be useful as an escape hatch while prototyping or debugging.

query will show labels that were never written, but it already does that: if you write //target, query shows //target:target. I think @foo being replaced with @foo.1.0.2 is a fairly obvious behavior, in that it's created exactly how you'd think.

Lukács T. Berki

May 27, 2022, 8:55:09 AM5/27/22
to Xudong Yang, bazel-dev, bazel-discuss, Jon Brandvein
After pondering this a bit, given that:
  1. There must be a way to make changes like these happen in Bazel
  2. No one has identified a clear show-stopper
  3. bzlmod solves an existing, serious usability issue in Bazel (the complexity and error-proneness of WORKSPACE files) and it seems to require something like this
  4. There is some precedent for the stringification of Labels returning something different from the string the Label was parsed from (package-relative labels)
  5. I don't think anyone can be confident that this change is OK simply by thinking about it hard enough
I propose that canonical labels be tested alongside bzlmod by gating them behind bzlmod's --experimental_* flag, as is in the current version of the design doc. The current plan is to stop gating bzlmod behind an --experimental_* flag in a few months, which should be enough to weed out any issues.

So, please go ahead while keeping this functionality behind the --experimental_* flag of bzlmod. @Jon: if you find any serious issues, please do speak up!




Adam Porich

Jun 9, 2022, 9:48:56 AM6/9/22
to bazel-dev
I feel that `bzlmod` is introducing a change in how Bazel handles external repositories, and after that change it will be possible, from a build-wide perspective, for multiple versions of a dependency to be in play. I don't think anyone wants that situation in their codebase but sometimes that's not the biggest problem :) . I haven't deployed any `bzlmod` yet but I think as a solution to this difficult problem it's currently a frontrunner for me :+1:

That said, I think the issue here is we are trying to squeeze this behaviour into the current external workspace handling systems. To handle this effectively we should expand the syntax in a meaningful fashion to allow clients to understand these new definitions. The resolved repository name will become increasingly important in many cases such as IDE or static analysis. 

I'm not familiar with the exact definition of a valid external repository name but I would suggest we use a currently excluded character to separate `@<name>` and `<version>`. Personally I have never seen a `/` in a repository name, so I am assuming that it's reclaimable. How about if `@<repo>/<version>//...` is the canonical version but `@<repo>//...` is the common version? That way we can explain to developers what a canonical reference is if they have to deal with a mixed build (hopefully not the common case) but also allow linting as appropriate with a simple regex.
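
A lint check along these lines might look like the Python sketch below. The pattern and the `find_canonical_refs` name are illustrative only, and the character classes for repository names and versions are assumptions, since the exact rules for valid names are not pinned down here.

```python
import re

# Assumed character sets: letters, digits, "_", ".", "-" for the repo name,
# plus "+" for the version. "/" is excluded from both so it can act as the separator.
CANONICAL_REF = re.compile(
    r"@(?P<repo>[A-Za-z0-9_.-]+)/(?P<version>[A-Za-z0-9_.+-]+)//"
)


def find_canonical_refs(text: str) -> list:
    """Return (repo, version) pairs for each @<repo>/<version>// reference."""
    return [(m.group("repo"), m.group("version"))
            for m in CANONICAL_REF.finditer(text)]
```

Run over a BUILD file's contents, this would flag "@foo/1.0.2//pkg:target" while leaving the common form "@bar//pkg:other" alone.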

Thanks,
Adam