1. Ruleset authors may not be familiar with the behavior of Python's OS-specific functions (e.g. which path separators are supported by which function on which platform?), leading to potential edge case bugs.
2. The actual Python runfiles library would subtly differ from a reference implementation written in Python (the former doesn't need to locate runfiles relative to argv[0]), which could lead to confusion.
3. If we restrict ourselves to a subset of Python that we truly expect everyone to be familiar with, we are probably close to writing Starlark anyway ;-)
Since I don't want to block anyone but realistically won't get to writing a proper spec soon, I'm going to collect some references that should help folks write their own compliant runfiles libraries.
Depending on the language you are most familiar with, I would recommend studying one of these existing runfiles libraries, all of which are compatible with Bzlmod:
- Go (https://github.com/bazelbuild/rules_go/blob/master/go/runfiles/runfiles.go)
- Python (https://github.com/bazelbuild/rules_python/blob/main/python/runfiles/runfiles.py)
- Java (https://github.com/bazelbuild/bazel/blob/master/tools/java/runfiles/Runfiles.java)
All of these provide the Rlocation and EnvVars functions that comprise the core API of a runfiles library as described in https://docs.google.com/document/d/e/2PACX-1vSDIrFnFvEYhKsCMdGdD40wZRBX3m3aZ5HhVj4CtHPmiXKDCxioTUbYsDydjKtFDAzER5eg7OjJWs3V/pub.
Since Java and Python can rely on launchers setting certain environment variables, their runfiles libraries don't contain the logic that derives the location of the runfiles directory and manifest from argv[0]. A Python implementation of this part is available as the code deleted in this commit: https://github.com/bazelbuild/rules_python/commit/86b01a3a1b30b2244b86c80181f42492abaa09a1
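To give a rough idea of what that argv[0]-based logic looks like, here is a minimal sketch in Python. It is not the code from the commit above. The function name find_runfiles is made up for illustration, and the environment variable names (RUNFILES_MANIFEST_FILE, RUNFILES_DIR) and the on-disk names (argv0 + ".runfiles", argv0 + ".runfiles_manifest", MANIFEST) reflect the conventions as I understand them, so please check them against one of the libraries linked above.

```python
import os
from typing import Dict, Optional, Tuple


def find_runfiles(argv0: str, env: Dict[str, str]) -> Tuple[Optional[str], Optional[str]]:
    """Returns (manifest_path, runfiles_dir); either entry may be None.

    Hypothetical helper, not part of any existing runfiles library.
    """
    # Values set by a launcher or a parent process take precedence.
    manifest = env.get("RUNFILES_MANIFEST_FILE")
    directory = env.get("RUNFILES_DIR")
    if manifest or directory:
        return manifest or None, directory or None

    # Otherwise look next to the binary itself (assumed naming convention).
    candidate_dir = argv0 + ".runfiles"
    manifest_candidates = [
        candidate_dir + "_manifest",
        os.path.join(candidate_dir, "MANIFEST"),
    ]
    manifest = next((c for c in manifest_candidates if os.path.isfile(c)), None)
    directory = candidate_dir if os.path.isdir(candidate_dir) else None
    return manifest, directory
```

A caller would typically invoke this once at startup as find_runfiles(sys.argv[0], dict(os.environ)) and fail with a clear error if both entries come back as None.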
The tricky part of making a runfiles library compatible with Bzlmod is that runfiles lookups require the canonical repository name of the Bazel target containing the source file that calls Rlocation as context. This information either has to be injected at compile time (for compiled languages) or determined at runtime by parsing source file paths. It is used to resolve apparent repository names to canonical ones (e.g. turn rules_go into rules_go~0.38.1). The languages with Bzlmod support have each adopted their own approach to implementing this CurrentRepository function:
* The C++ rules add a BAZEL_CURRENT_REPOSITORY define containing the repository name to all compilation actions. Users are expected to pass this value in when creating a Runfiles instance (or alternatively to Rlocation).
* Java offers the AutoBazelRepository annotation that causes a class to be generated that has the repository name in a static constant. Users are expected to pass this value in when creating a Runfiles instance.
* Go relies on runtime.Caller to get the execpath of a source file and parses the repository name out of it, relying on the fact that the canonical repository name of the main repository is always the empty string. Users don't have to manually pass in the name.
* Python relies on sys._getframe and inspect.getfile and parses the repository name out of the runfiles path, relying on the fact that with Bzlmod the name of the runfiles directory of the main repository is always _main. Users don't have to manually pass in the name.
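To make the runtime approach a bit more concrete, here is a minimal Python sketch of the two steps involved: deriving the current repository from the path of a source file inside the runfiles directory, and then using it to turn an apparent repository name into a canonical one. Both function names and the in-memory repo_mapping dictionary are hypothetical and only serve as illustration; a real library has to obtain the mapping from Bazel and covers more edge cases.

```python
from typing import Dict, Tuple


def current_repository(caller_path: str, runfiles_dir: str) -> str:
    """Canonical name of the repository containing caller_path.

    Returns "" for the main repository. Hypothetical helper.
    """
    root = runfiles_dir.replace("\\", "/").rstrip("/") + "/"
    path = caller_path.replace("\\", "/")
    if not path.startswith(root):
        raise ValueError(f"{caller_path} is not under {runfiles_dir}")
    # The first path segment below the runfiles root names the repository.
    first_segment = path[len(root):].split("/", 1)[0]
    # With Bzlmod, the main repository's runfiles directory is "_main".
    return "" if first_segment == "_main" else first_segment


def resolve_rlocation_path(
    path: str,
    source_repo: str,
    repo_mapping: Dict[Tuple[str, str], str],
) -> str:
    """Rewrites the leading apparent repository name of path to a canonical one.

    repo_mapping is keyed by (canonical name of the calling repository,
    apparent name); this shape is an assumption made for the example.
    """
    apparent, sep, rest = path.partition("/")
    canonical = repo_mapping.get((source_repo, apparent))
    if canonical is None:
        # No entry: assume the path already starts with a canonical name.
        return path
    return canonical + sep + rest


# Illustrative data only, reusing the rules_go example from above.
mapping = {("", "rules_go"): "rules_go~0.38.1"}
repo = current_repository("/out/bin.runfiles/_main/pkg/lib.py", "/out/bin.runfiles")
resolved = resolve_rlocation_path("rules_go/go/runfiles/runfiles.go", repo, mapping)
```

Here repo is the empty string (the caller lives in the main repository), so the lookup uses the main repository's view of the mapping and resolved starts with rules_go~0.38.1. The same apparent name could resolve differently when the caller lives in another repository, which is why the current repository is needed as context in the first place.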
If you have any questions, feel free to reach out to me. I am still planning to produce a comprehensive spec at some point.
Fabian