Obtaining input and output locations in run_binary() rules

Ajith Ramanathan

May 26, 2021, 12:31:32 PM

to bazel-discuss

Hi.

I'm porting a build to Bazel that must run on both Windows and Linux. One step in the build (that is currently done manually, and that I wish to automate) is to generate some C++ code from some specifications. Specifically, we have several specification files and for each we produce a C++ header and source pair using a custom tool. One thing to note is that I have a naming convention such that foo.data produces foo.generated.hpp and foo.generated.cpp.

In my build, I have a cc_binary() rule that builds the tool, and I use run_binary() to execute it. The *.data files are specified in a filegroup(). My first attempt at the build was this:

    cc_binary(name = "tool", ... )

    filegroup(
        name = "data",
        # Let's say we have foo.data and bar.data.
        srcs = glob(["*.data"]),
    )

    run_binary(
        name = "generate_code_from_data",
        srcs = [":data"],
        outs = [
            "foo.generated.hpp",
            "foo.generated.cpp",
            "bar.generated.hpp",
            "bar.generated.cpp",
        ],
        args = ["$(locations :data)"],
        # run_binary() takes a single label here, not a list.
        tool = ":tool",
    )

This didn't quite work. The tool ran fine, but it placed the outputs in locations that Bazel was not expecting, and I ended up with error messages like "output 'path/to/data/foo.generated.hpp' was not created".

My next attempt changed the args of the generate_code_from_data rule to pass explicit input and output paths:

    run_binary(
        name = "generate_code_from_data",
        srcs = [":data"],
        outs = [
            "foo.generated.hpp",
            "foo.generated.cpp",
            "bar.generated.hpp",
            "bar.generated.cpp",
        ],
        args = [
            "--in=$(locations :data)",
            "--out=$(location foo.generated.hpp)",
            "--out=$(location foo.generated.cpp)",
            "--out=$(location bar.generated.hpp)",
            "--out=$(location bar.generated.cpp)",
        ],
        # run_binary() takes a single label here, not a list.
        tool = ":tool",
    )

The tool then relies on the naming convention to match inputs to outputs. It feels a little awkward because:

1) there is an asymmetry in the way inputs and outputs are defined. I tried using the make variables that genrule() understands but they don't seem to work or I have the wrong syntax (for example --out=$OUTS and --out=$(location OUTS) didn't seem to work).

2) I have to do some param matching to pair input and output paths.

So I changed it to a set of rules, one for each data file:

    filegroup(name = "foo_data", srcs = ["foo.data"])

    run_binary(
        name = "generate_foo_code_from_foo_data",
        srcs = [":foo_data"],
        outs = [
            "foo.generated.hpp",
            "foo.generated.cpp",
        ],
        args = [
            "--in=$(locations :foo_data)",
            "--hppout=$(location foo.generated.hpp)",
            "--cppout=$(location foo.generated.cpp)",
        ],
        # run_binary() takes a single label here, not a list.
        tool = ":tool",
    )

and similarly for bar.data. I suppose I could simplify with list comprehensions:

    [filegroup(name = "%s_data" % b,
               srcs = [":%s.data" % b]) for b in ["foo", "bar"]]

and so on.
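[Editor's sketch] Spelled out in full, the list-comprehension variant might look like the following. It reuses the --in, --hppout, and --cppout flags from the post and assumes a target named :tool in the same package. For brevity it passes each .data file directly in srcs instead of going through a per-file filegroup(), which is a simplification and not something the post does.

```starlark
load("@bazel_skylib//rules:run_binary.bzl", "run_binary")

# Stems of the specification files; foo and bar are the examples from the post.
DATA_STEMS = ["foo", "bar"]

# One run_binary() target per stem, following the naming convention
# foo.data -> foo.generated.hpp + foo.generated.cpp.
[
    run_binary(
        name = "generate_%s_code_from_%s_data" % (stem, stem),
        srcs = ["%s.data" % stem],
        outs = [
            "%s.generated.hpp" % stem,
            "%s.generated.cpp" % stem,
        ],
        args = [
            "--in=$(location %s.data)" % stem,
            "--hppout=$(location %s.generated.hpp)" % stem,
            "--cppout=$(location %s.generated.cpp)" % stem,
        ],
        # run_binary() takes a single label for the tool, not a list.
        tool = ":tool",
    )
    for stem in DATA_STEMS
]
```

If hard-coding the stems is undesirable, they could probably be derived with glob(["*.data"]) and a slice that drops the ".data" suffix; that variation is untested here.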

I have two questions:

1) Is the approach I'm taking (with or without the list comprehensions) the most natural way to express this build step in Bazel? My first attempt feels pretty unnatural.

2) One thing I need to extract is the path to the generated hpp relative to the workspace root. $(location foo.generated.hpp) produces something like bazel-out/k8-fastbuild/bin/path/to/data/foo.generated.hpp. Is there some make variable that run_binary() understands that I could pass in, or should I just bake the path in manually (either in code or in the rule) as I know the directory structure?

Herrmann, Andreas

May 27, 2021, 4:25:06 AM

to Ajith Ramanathan, bazel-discuss

> Is there some make variable that run_binary() understands that I could pass in

Looking at the implementation of run_binary, it doesn't expand make variables, only locations: https://github.com/bazelbuild/bazel-skylib/blob/c6f6b5425b232baf5caecc3aae31d49d63ddec03/rules/run_binary.bzl#L29-L30

So, make variables are not available.

> One thing I need to extract is the path to the generated hpp relative to the workspace root.

Bazel provides different forms of location expansion: $(execpath ) and $(rootpath ). The latter should give you this path relative to the workspace root.

https://docs.bazel.build/versions/master/be/make-variables.html#predefined_label_variables
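[Editor's sketch] As an illustration of the difference, here is the per-file rule from the original post with one extra argument. The --include-path flag is hypothetical (the thread does not name a flag for this purpose), and whether $(rootpath ...) is actually expanded depends on the run_binary version in use, since the implementation linked above only documents $(location ...) expansion. Treat this as something to verify, not a confirmed recipe.

```starlark
load("@bazel_skylib//rules:run_binary.bzl", "run_binary")

run_binary(
    name = "generate_foo_code_from_foo_data",
    srcs = ["foo.data"],
    outs = [
        "foo.generated.hpp",
        "foo.generated.cpp",
    ],
    args = [
        "--in=$(location foo.data)",
        # For a generated file, $(location ...) yields a path that includes
        # the output directory, for example
        # bazel-out/k8-fastbuild/bin/path/to/data/foo.generated.hpp
        "--hppout=$(location foo.generated.hpp)",
        "--cppout=$(location foo.generated.cpp)",
        # $(rootpath ...) omits the output-directory prefix, for example
        # path/to/data/foo.generated.hpp
        # --include-path is a hypothetical flag, not one named in the thread.
        "--include-path=$(rootpath foo.generated.hpp)",
    ],
    tool = ":tool",
)
```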

If this gets too unwieldy, it may be worth writing a custom rule for this task instead of using an existing one like run_binary.

Inside a rule implementation you have access to input and output files as File objects. The short_path attribute should give you this path relative to the workspace root.

https://docs.bazel.build/versions/master/skylark/lib/File.html#short_path
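[Editor's sketch] A minimal custom rule along these lines could look like the following. The file name codegen.bzl, the rule name generate_code, and the --hpp-include-path flag are all hypothetical; the --in, --hppout, and --cppout flags are taken from the original post. It assumes each .data file lives in the same package as the target that processes it.

```starlark
"""codegen.bzl (hypothetical): one rule instance per .data file."""

def _generate_code_impl(ctx):
    src = ctx.file.src

    # Naming convention from the thread: foo.data -> foo.generated.{hpp,cpp}.
    stem = src.basename[:-len(".data")]
    hpp = ctx.actions.declare_file(stem + ".generated.hpp")
    cpp = ctx.actions.declare_file(stem + ".generated.cpp")

    args = ctx.actions.args()

    # File.path is the path the action sees, including the bazel-out prefix
    # for generated files.
    args.add("--in=" + src.path)
    args.add("--hppout=" + hpp.path)
    args.add("--cppout=" + cpp.path)

    # File.short_path drops the output-directory prefix, for example
    # path/to/data/foo.generated.hpp. The flag name is hypothetical.
    args.add("--hpp-include-path=" + hpp.short_path)

    ctx.actions.run(
        executable = ctx.executable.tool,
        arguments = [args],
        inputs = [src],
        outputs = [hpp, cpp],
        mnemonic = "GenerateCode",
    )
    return [DefaultInfo(files = depset([hpp, cpp]))]

generate_code = rule(
    implementation = _generate_code_impl,
    attrs = {
        "src": attr.label(allow_single_file = [".data"], mandatory = True),
        "tool": attr.label(executable = True, cfg = "exec", mandatory = True),
    },
)
```

In a BUILD file this would be loaded with load(":codegen.bzl", "generate_code") and instantiated as generate_code(name = "foo_generated", src = "foo.data", tool = ":tool"). Because the rule declares its outputs itself, the BUILD file no longer has to list them in outs.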

Best, Andreas


Ajith Ramanathan

May 27, 2021, 8:02:07 PM

to Herrmann, Andreas, bazel-discuss

Thank you for the very helpful response.

This isn't a task that occurs frequently enough for me to think about
a custom rule, but I'll keep that advice in mind!
