Iterating over files located in a subpackage


DC

Jul 27, 2023, 12:18:52 PM
to bazel-discuss
Hi,

I have a macro that needs to iterate over files which are located in a subpackage of the package that is calling the macro.

I've tried using exports_files and filegroup in the subpackage and referencing them in the macro, but I can't seem to extract the list of files from them.

Help would be appreciated.

Thanks

Chuck Grindel

Jul 27, 2023, 5:28:13 PM
to bazel-discuss
I presume that you are doing a glob to get the files. The glob in Bazel will not include files from other Bazel packages. In other words, if the subdirectory has a BUILD or BUILD.bazel file, the glob in the parent directory will not see them.
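
To illustrate the package boundary, here is a minimal sketch. The layout, file names, and target name are hypothetical and chosen only for the example:

```starlark
# Hypothetical layout:
#   a/BUILD          <- defines package //a
#   a/top.txt
#   a/data/x.txt     <- plain subdirectory, still part of //a
#   a/sub/BUILD      <- makes //a/sub a separate package
#   a/sub/y.txt

# In a/BUILD:
filegroup(
    name = "texts",
    # Matches top.txt and data/x.txt. sub/y.txt is skipped because
    # a/sub has its own BUILD file and is therefore a different package.
    srcs = glob(["**/*.txt"]),
)
```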

DC

Jul 28, 2023, 3:46:30 AM
to bazel-discuss
I'm aware that glob can't get files from other packages.
As I mentioned, I'm using exports_files / filegroup to expose the list of files from the subpackage.
The issue is how to iterate over these files when referencing them from a different BUILD file.

Alexandre Rostovtsev

Jul 28, 2023, 10:57:04 AM
to DC, bazel-discuss
Bazel runs the loading phases for different packages independently in separate threads. Macro evaluation and globbing both happen in the loading phase. Bazel doesn't allow a macro in package A to iterate over the contents of a glob from package B because there is no guarantee about the order in which different packages' loading phases complete, and adding such an ordering guarantee would limit parallelization. This design decision allows `bazel query` to perform well on very large repos.

You have some alternative options:
* Instead of a macro, use a rule: pass the rule a filegroup wrapping the glob, and access its contents only at analysis time - in the rule implementation function via ctx.files or ctx.attr;
* Change your macro's logic and/or your package hierarchy so that everything you need to glob for a single macro invocation lives in one package. A package can have subdirectories, as long as the subdirectories aren't themselves packages.
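
The first option can be sketched roughly as follows. This is only an illustration under assumed names: the file `files_consumer.bzl`, the rule `list_files`, and the packages `//a` and `//a/sub` are all hypothetical. The rule writes the paths of the files it receives into a text file, which shows that the file list is available inside the implementation function:

```starlark
# a/files_consumer.bzl (hypothetical file)

def _list_files_impl(ctx):
    # ctx.files.srcs is the flattened list of File objects provided by
    # every target listed in the srcs attribute. It is only available
    # here, at analysis time, not during macro evaluation.
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.write(
        output = out,
        content = "\n".join([f.short_path for f in ctx.files.srcs]) + "\n",
    )
    return [DefaultInfo(files = depset([out]))]

list_files = rule(
    implementation = _list_files_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = True),
    },
)
```

A possible usage, again with made-up target names:

```starlark
# In a/sub/BUILD: wrap the glob in a filegroup visible to the parent package.
filegroup(
    name = "all_txt",
    srcs = glob(["*.txt"]),
    visibility = ["//a:__pkg__"],
)

# In a/BUILD: pass the filegroup to the rule instead of globbing across packages.
load(":files_consumer.bzl", "list_files")

list_files(
    name = "listing",
    srcs = ["//a/sub:all_txt"],
)
```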


DC

Jul 28, 2023, 1:39:52 PM
to bazel-discuss
I don't think a rule would be suitable here, because I'm using the macro to iterate over the files and then call an existing rule for each one. I'm not sure that would work from inside a rule.

What about using a genrule to copy the files from the subpackage to the current directory? Could that work?

Alexandre Rostovtsev

Jul 28, 2023, 2:13:32 PM
to DC, bazel-discuss
It won't help; you cannot access the attributes of a target (in this case, the list of outputs of a copy_files rule) at macro evaluation time when the target is defined in another package. This is for the same reason that you cannot access another package's globs: each package is loaded in a separate thread, and the order in which these threads run is undefined.

DC

Jul 28, 2023, 2:27:19 PM
to bazel-discuss
I see.
So if writing a rule is my only option here, can a rule wrap another rule within it? (That's what I'm doing with the macro.)

Alexandre Rostovtsev

Jul 28, 2023, 2:50:54 PM
to DC, Ivo Ristovski List, bazel-discuss
Unfortunately, not yet. It is a deficiency in the rule API. +Ivo Ristovski List has some proposals currently in review for fixing it.

For now, maybe you can factor out a shared part of the rule implementation function into a helper function that can also be used by your new rule's implementation function. (Similarly, you can factor out a shared attrs dict to use in multiple rule definitions.)
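
A rough sketch of that factoring, assuming you do control the rule's source. Every name here (`shared.bzl`, `COMMON_ATTRS`, `process_files`, `single`, `multi`, `extra_srcs`) is hypothetical, and the copy action merely stands in for whatever the real rule does:

```starlark
# shared.bzl (hypothetical file; all names are illustrative)

# Attributes shared by both rule definitions.
COMMON_ATTRS = {
    "srcs": attr.label_list(allow_files = True),
}

def process_files(ctx, files):
    """Shared logic: declares one output per input file and copies it."""
    outputs = []
    for f in files:
        # Assumes input basenames are unique; colliding basenames would
        # declare the same output twice and fail.
        out = ctx.actions.declare_file(ctx.label.name + "/" + f.basename + ".out")
        ctx.actions.run_shell(
            inputs = [f],
            outputs = [out],
            command = "cp \"$1\" \"$2\"",
            arguments = [f.path, out.path],
        )
        outputs.append(out)
    return outputs

def _single_impl(ctx):
    return [DefaultInfo(files = depset(process_files(ctx, ctx.files.srcs)))]

def _multi_impl(ctx):
    # Same helper, fed with files from an additional attribute that can
    # point at filegroups in other packages.
    files = ctx.files.srcs + ctx.files.extra_srcs
    return [DefaultInfo(files = depset(process_files(ctx, files)))]

single = rule(implementation = _single_impl, attrs = COMMON_ATTRS)

multi = rule(
    implementation = _multi_impl,
    attrs = dict(COMMON_ATTRS, extra_srcs = attr.label_list(allow_files = True)),
)
```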

Alternatively, if you do not control the rule implementation, maybe you can change your package hierarchy to have everything that needs to be globbed in one package.

Or maybe you can invoke your macro in each package separately, add all instantiated rules in the macro to a filegroup (one per package), and then define another filegroup to combine all those per-package filegroups.

DC

Jul 28, 2023, 3:20:59 PM
to bazel-discuss
I do not control the rule implementation and I can't change the package hierarchy.
Can you please elaborate on your last suggestion?

Alexandre Rostovtsev

Jul 28, 2023, 3:55:18 PM
to DC, bazel-discuss
It would be easier to answer if you provided more details about what exactly you are trying to do (i.e. what is the rule, what your macro is doing, and how the sources are organized).

But guessing based on your original question, it sounded to me that your macro is doing something like this:

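def map_reduce(name, srcs, **kwargs):
    map_target_names = []
    for src in srcs:
        # One map target per source file, named after the macro and the file.
        map_target_name = "%s_%s" % (name, src)
        map_target_names.append(map_target_name)
        map_rule(name = map_target_name, src = src, **kwargs)
    # The reduce target consumes every map target created above.
    reduce_rule(name = name, srcs = map_target_names, **kwargs)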

And that you wanted srcs to come from globs from multiple packages. If that is the case, then my suggestion would be to do something like this instead:

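def map_group(name, srcs, **kwargs):
    map_target_names = []
    for src in srcs:
        # One map target per source file in the calling package.
        map_target_name = "%s_%s" % (name, src)
        map_target_names.append(map_target_name)
        map_rule(name = map_target_name, src = src, **kwargs)
    # Collect this package's map targets so other packages can depend on them.
    native.filegroup(name = name, srcs = map_target_names)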

Then in a/b/BUILD and a/c/BUILD:
...
map_group(
    name = "mapped_group",
    srcs = glob(["*.txt"]),
)
...

and in a/BUILD:
...
reduce_rule(
    name = "reduce_everything",
    srcs = [
        "//a/b:mapped_group",
        "//a/c:mapped_group",
    ],
)
...

