repo rule that depends on a tool built by another repo?

Gregg Reynolds

Apr 17, 2021, 9:31:44 PM
to bazel-discuss
I've got one repo with a rule that builds a tool that I use to configure another repo.  (Both use custom repository rules.) Currently I build the tool by hand.  I'd like to have the second repo depend on the tool somehow, but I have not been able to find a way to do it.  I can put an executable label attribute on the second repo rule, but repository_ctx does not support the '.executable' field.

Any suggestions? The best I've been able to come up with is a shell script that runs bazel twice.

Thanks,

Gregg

John Cater

Apr 19, 2021, 8:24:57 AM
to Gregg Reynolds, bazel-discuss
You can do it, but it needs to be slightly indirect.

The repo that creates the tool is (relatively) easy: you have a repo1_rule (called from WORKSPACE) that writes a file @repo1//:BUILD.bazel that includes a target @repo1//:helper (a binary rule of whatever type you want). To make this even easier, you can give repo1_rule an attribute to name the target in the BUILD.bazel file, so your workspace would be:
```
repo1_rule(
    name = "repo1",
    helper = "helper",
    ...
)
```

and @repo1//:BUILD.bazel (written by repo1_rule) looks like:
```
foo_binary(
    name = "helper",
    srcs = ...,
    ...,
)
```

Now, to get it into the second repo, you have a repo2_rule with a _string_ typed helper attribute, but you pass in a label:
```
repo2_rule(
    name = "repo2",
    helper = "@repo1//:helper",
)
```

repo2_rule would write a BUILD.bazel file that uses the helper, and _this_ is the target that checks whether the label is valid and points to something executable:
```
bar_library(
    name = "...",
    helper = "@repo1//:helper",
    ...
)
```

So, yes, the workspace rules can only see strings, but the BUILD rules can check that the things written into them are valid executables, or perform any other checks.
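A minimal sketch of what such a repo2_rule might look like, assuming the generated package has a single target. The file name `repo2.bzl`, the `@rules_bar//bar:defs.bzl` load, and the target name `lib` are hypothetical; `bar_library` stands in for whatever BUILD rule actually consumes the tool.

```starlark
# repo2.bzl (hypothetical file name)

def _repo2_rule_impl(repository_ctx):
    # Here the helper is only a string. Nothing is resolved or built
    # while the repository rule runs.
    helper = repository_ctx.attr.helper

    build_lines = [
        # Hypothetical load statement for the consuming rule.
        'load("@rules_bar//bar:defs.bzl", "bar_library")',
        "",
        "bar_library(",
        '    name = "lib",',
        # The string becomes a real label only when this BUILD file is loaded.
        '    helper = "%s",' % helper,
        ")",
        "",
    ]
    repository_ctx.file("BUILD.bazel", "\n".join(build_lines))

repo2_rule = repository_rule(
    implementation = _repo2_rule_impl,
    attrs = {
        # Deliberately a string, not a label: see the explanation above.
        "helper": attr.string(mandatory = True),
    },
)
```

With this shape, a typo in the helper string does not surface until something builds a target in @repo2, which is the error-messaging cost described in the next paragraph.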

The main problem with this is in error messaging: anyone who mis-configured their WORKSPACE will need to know a lot about Bazel, repository rules, and these specific implementations to debug the problem if there is an issue, but it should set up the plumbing you want.

Hope this helps!
John C


--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/CAO40MikpYcQFNyAtgQTnhK%2B%3DKQyez2Fpzpnu1Q-y6edg0Jcsig%40mail.gmail.com.

Gregg Reynolds

Apr 26, 2021, 1:25:51 PM
to John Cater, bazel-discuss
Thanks, but I don't see how to make that work. I did not state the problem very clearly. Here's some more detail.

I have a cc_binary rule that builds the tool (opam_bootstrap) I need. It lives in repo @tools_obazl, which is installed using `http_archive`. To build it:  `bazel build  @tools_obazl//bootstrap:opam_bootstrap`.

The job of the `opam_bootstrap` tool is to configure repository @opam.  It writes some BUILD.bazel files into the repo (i.e. `external/opam`).  To run it, I have a custom repository rule, `opam_configuration`.  It uses `repo_ctx.execute` to run the tool.

(In principle the logic of `opam_bootstrap` could be implemented in starlark, but it parses config files and I don't know of a starlark lex/yacc library. Actually I haven't even looked, I just used re2c and lemon since they are fast and portable.)

So after `opam_configuration` is run, we have a collection of targets like @opam//lib/zarith, @opam//lib/foo, etc.  These are used by the OCaml rules, e.g. `ocaml_module(name = "foo", deps = ["@opam//lib/zarith"], ...)`

So when an OCaml build target that depends on an @opam target is built, the `opam_configuration` rule is run (and cached).  For it to succeed, the `opam_bootstrap` executable must be in the path, so that `repo_ctx.execute` can find it.

So the following procedure works just fine:

1.  $ bazel build @tools_obazl//bootstrap:opam_bootstrap
2.   sudo cp bazel-bin/external/tools_obazl/bootstrap/opam_bootstrap
3.  $ bazel build src/some/ocaml:target  ## target that depends on @opam//lib/foo

After step 2, the `opam_bootstrap` executable is in the path, so `repo_ctx.execute` will find it and run it.

I want to automate steps 1 and 2, so that the user need only use step 3.  I can get part way there by using a custom env variable, OBAZL_BOOTSTRAP, setting it to `bazel-bin/external/tools_obazl/bootstrap/opam_bootstrap`, and passing it to `opam_configuration`, which adds it to the path in the env passed to `repo_ctx.execute`.
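A rough sketch of the environment-variable variant, under some assumptions: it runs the binary named by OBAZL_BOOTSTRAP directly instead of adding it to the path, it expects an absolute path, and it invokes the tool with no arguments because the real command line is not shown in this thread.

```starlark
def _opam_configuration_impl(repository_ctx):
    # Read the variable from the environment Bazel was started in.
    tool = repository_ctx.os.environ.get("OBAZL_BOOTSTRAP")
    if not tool:
        fail("OBAZL_BOOTSTRAP must be set to the absolute path of opam_bootstrap")

    # Assumption: no arguments. The tool is expected to write the
    # BUILD.bazel files into the repository directory it runs in.
    result = repository_ctx.execute([tool])
    if result.return_code != 0:
        fail("opam_bootstrap failed:\n" + result.stderr)

opam_configuration = repository_rule(
    implementation = _opam_configuration_impl,
    # Declare the variable so the rule is re-evaluated when it changes.
    environ = ["OBAZL_BOOTSTRAP"],
)
```

This only removes the copy in step 2. The tool still has to exist before the repository rule runs, so step 1 remains manual.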

What I have not been able to figure out is how to get the opam_configuration rule to trigger a build of the opam_bootstrap tool, and then run it.

I wonder if there is some way to make this work by putting opam_bootstrap in a toolchain.

Note that I have two external repos installed using http_archive:  tools_obazl, and rules_ocaml.  The latter contains the `opam_configuration` rule that creates @opam (by running tool `opam_bootstrap` contained in the former).

Thanks,

Gregg

Gregg Reynolds

Apr 26, 2021, 1:28:38 PM
to John Cater, bazel-discuss

 ...snip...
 
So after `opam_configuration` is run, we have a collection of targets like @opam//lib/zarith, @opam//lib/foo, etc.  These are used by the OCaml rules, e.g. `ocaml_module(name = "foo", deps = ["@opam//lib/zarith"], ...)`

Just to be clear, the BUILD.bazel files that define the @opam// targets are generated by the opam_bootstrap tool.

Gregg Reynolds

Apr 26, 2021, 1:30:40 PM
to John Cater, bazel-discuss
2.   sudo cp bazel-bin/external/tools_obazl/bootstrap/opam_bootstrap

Sorry, that should be: 

2.   sudo cp bazel-bin/external/tools_obazl/bootstrap/opam_bootstrap ~/bin  ## put the tool somewhere in $PATH

Herrmann, Andreas

Apr 27, 2021, 4:23:13 AM
to Gregg Reynolds, bazel-discuss
Hi Gregg,

As I understand the question this is not possible with Bazel - at least not directly. @tools_obazl//bootstrap:opam_bootstrap is a binary that is built by Bazel, i.e. the binary is a generated file. As such it will only be created during the execution phase. However, the opam_configuration repository rule is executed at an earlier stage in the loading phase. As the docs on repository_rule state "a repository rule cannot depend on a generated artifact" (https://docs.bazel.build/versions/master/skylark/lib/globals.html#repository_rule).

That said, rules_go is facing a very similar issue with Gazelle. The go_repository repository rule fetches Go sources and then runs Gazelle to generate Bazel BUILD files for it. In this use-case Gazelle cannot be built by Bazel for the same reasons as above. The go_repository rule works around the issue by building Gazelle directly with the Go compiler during the loading phase: https://github.com/bazelbuild/bazel-gazelle/blob/5b8616dbb7dad825c61ffbb12ea9f75622568657/internal/go_repository_tools.bzl#L140-L143 . An alternative to this approach is to check in a pre-built binary or provide binary releases of opam_bootstrap and download them with http_archive/file.
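The pre-built binary alternative could be sketched roughly as follows. The attribute names `bootstrap_url` and `bootstrap_sha256` are hypothetical, and the tool is again run with no arguments because its real command line is not given in this thread.

```starlark
def _opam_configuration_impl(repository_ctx):
    # Fetch a released binary during the loading phase. No Bazel build
    # of the tool is involved.
    repository_ctx.download(
        url = repository_ctx.attr.bootstrap_url,
        output = "opam_bootstrap",
        sha256 = repository_ctx.attr.bootstrap_sha256,
        executable = True,
    )

    # Use an absolute path so the call does not depend on PATH.
    tool = str(repository_ctx.path("opam_bootstrap"))
    result = repository_ctx.execute([tool])
    if result.return_code != 0:
        fail("opam_bootstrap failed:\n" + result.stderr)

opam_configuration = repository_rule(
    implementation = _opam_configuration_impl,
    attrs = {
        # Hypothetical attributes pointing at a binary release.
        "bootstrap_url": attr.string(mandatory = True),
        "bootstrap_sha256": attr.string(mandatory = True),
    },
)
```

A real rule would probably need one URL per platform, selected with something like `repository_ctx.os.name`.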

Best, Andreas


Gregg Reynolds

Apr 27, 2021, 1:15:01 PM
to Herrmann, Andreas, bazel-discuss
Hi Andreas,

On Tue, Apr 27, 2021 at 3:23 AM Herrmann, Andreas <andreas....@tweag.io> wrote:
Hi Gregg,

As I understand the question this is not possible with Bazel - at least not directly. @tools_obazl//bootstrap:opam_bootstrap is a binary that is built by Bazel, i.e. the binary is a generated file. As such it will only be created during the execution phase. However, the opam_configuration repository rule is executed at an earlier stage in the loading phase. As the docs on repository_rule state "a repository rule cannot depend on a generated artifact" (https://docs.bazel.build/versions/master/skylark/lib/globals.html#repository_rule).

That's what I figured, but I was hoping to find a dirty trick to get around it.
 
That said, rules_go is facing a very similar issue with Gazelle. The go_repository repository rule fetches Go sources and then runs Gazelle to generate Bazel BUILD files for it. In this use-case Gazelle cannot be built by Bazel for the same reasons as above. The go_repository rule works around the issue by building Gazelle directly with the Go compiler during the loading phase: https://github.com/bazelbuild/bazel-gazelle/blob/5b8616dbb7dad825c61ffbb12ea9f75622568657/internal/go_repository_tools.bzl#L140-L143 . An alternative to this approach is to check in a pre-built binary or provide binary releases of opam_bootstrap and download them with http_archive/file.
 
A classic bootstrapping situation.  I guess the dirty trick is just to write a shell script or makefile to build opam_bootstrap outside of Bazel, and run it with repo_ctx.execute.  I originally wanted to avoid that, but building the tool is pretty simple so I think it will work out nicely.  Repo functions like download_and_extract, file, etc. make it pretty easy.  And after all, if I'm going to go outside of Bazel to generate BUILD.bazel files, I might as well do the same to build the tool that does it.
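A sketch of that approach, assuming the tool can be compiled from a single C file with the host compiler. The attribute names `src_url` and `src_sha256`, the output directory `bootstrap_src`, and the source file name are hypothetical. A real build of opam_bootstrap would also involve the re2c and lemon steps mentioned earlier, which are not shown.

```starlark
def _opam_configuration_impl(repository_ctx):
    # Fetch the tool's sources into the repository directory.
    repository_ctx.download_and_extract(
        url = repository_ctx.attr.src_url,
        sha256 = repository_ctx.attr.src_sha256,
        output = "bootstrap_src",
    )

    # Locate a host C compiler. which() returns None when nothing is found.
    cc = repository_ctx.which("cc")
    if cc == None:
        fail("no C compiler named 'cc' found on PATH")

    # Build the tool outside of Bazel's execution phase.
    # Hypothetical single-file build; the source name is an assumption.
    build = repository_ctx.execute(
        [str(cc), "-O2", "-o", "opam_bootstrap", "bootstrap_src/opam_bootstrap.c"],
    )
    if build.return_code != 0:
        fail("building opam_bootstrap failed:\n" + build.stderr)

    # Run the freshly built tool (assumed to take no arguments) so it
    # can generate the BUILD.bazel files.
    tool = str(repository_ctx.path("opam_bootstrap"))
    run = repository_ctx.execute([tool])
    if run.return_code != 0:
        fail("opam_bootstrap failed:\n" + run.stderr)

opam_configuration = repository_rule(
    implementation = _opam_configuration_impl,
    attrs = {
        # Hypothetical attributes pointing at a source archive.
        "src_url": attr.string(mandatory = True),
        "src_sha256": attr.string(mandatory = True),
    },
)
```

The trade-off is that the tool is built with whatever compiler is on the host, not with the toolchain Bazel would have chosen.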

(FWIW I've learned a lot from studying rules_go, and also rules_rust and cargo_raze.)

Thanks,

Gregg