Hi.
I'm porting a build to Bazel that must run on both Windows and Linux. One step in the build (currently done by hand, and which I'd like to automate) is generating some C++ code from specifications. Specifically, we have several specification files, and for each one we produce a C++ header and source pair using a custom tool. One thing to note is that we follow a naming convention in which foo.data produces foo.generated.hpp and foo.generated.cpp.
In my build, I have a cc_binary() rule that builds the tool, and I then use run_binary() to execute it. The *.data files are listed in a filegroup(). My first attempt looked like this:
cc_binary(name = "tool", ...)
filegroup(
    name = "data",
    # Let's say we have foo.data and bar.data.
    srcs = glob(["*.data"]),
)
run_binary(
    name = "generate_code_from_data",
    srcs = [":data"],
    outs = [
        "foo.generated.hpp",
        "foo.generated.cpp",
        "bar.generated.hpp",
        "bar.generated.cpp",
    ],
    args = ["$(locations :data)"],
    tool = ":tool",
)
This didn't quite work. The tool ran fine, but it placed the outputs in locations Bazel was not expecting, and I ended up with error messages like "output 'path/to/data/foo.generated.hpp' was not created".
My next attempt changed the generate_code_from_data rule as follows (the only change is to args):
run_binary(
    name = "generate_code_from_data",
    srcs = [":data"],
    outs = [
        "foo.generated.hpp",
        "foo.generated.cpp",
        "bar.generated.hpp",
        "bar.generated.cpp",
    ],
    args = [
        "--in=$(locations :data)",
        "--out=$(location foo.generated.hpp)",
        "--out=$(location foo.generated.cpp)",
        "--out=$(location bar.generated.hpp)",
        "--out=$(location bar.generated.cpp)",
    ],
    tool = ":tool",
)
and then relying on the naming convention to match inputs to outputs. It works, but it feels a little awkward because:
1) There is an asymmetry in how inputs and outputs are passed. I tried the "Make" variables that genrule() understands (see the genrule() sketch after this list), but either they don't work in run_binary() or I have the wrong syntax; for example, --out=$OUTS and --out=$(location OUTS) didn't seem to work.
2) I have to do some parameter matching to pair up input and output paths.
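For reference, this is roughly the genrule() usage whose $(SRCS)/$(OUTS) expansion I was trying to reproduce (an untested sketch, using the same --in/--out flags as above):
genrule(
    name = "generate_code_from_data_via_genrule",
    srcs = [":data"],
    outs = [
        "foo.generated.hpp",
        "foo.generated.cpp",
        "bar.generated.hpp",
        "bar.generated.cpp",
    ],
    # $(SRCS) and $(OUTS) expand to space-separated lists of input and output
    # paths; $(location :tool) expands to the tool's executable path.
    # Note that cmd runs through a shell, unlike run_binary().
    cmd = "$(location :tool) --in=$(SRCS) --out=$(OUTS)",
    tools = [":tool"],
)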
So I changed it to a set of rules, one for each data file:
filegroup(name = "foo_data", srcs = ["foo.data"])
run_binary(
    name = "generate_foo_code_from_foo_data",
    srcs = [":foo_data"],
    outs = [
        "foo.generated.hpp",
        "foo.generated.cpp",
    ],
    args = [
        "--in=$(locations :foo_data)",
        "--hppout=$(location foo.generated.hpp)",
        "--cppout=$(location foo.generated.cpp)",
    ],
    tool = ":tool",
)
and similarly for bar.data. I suppose I could simplify with list comprehensions:
[filegroup(
    name = "%s_data" % b,
    srcs = [":%s.data" % b],
) for b in ["foo", "bar"]]
and so on.
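Spelled out, the matching run_binary() half of that comprehension would look roughly like this (an untested sketch, just the per-file rule above parameterized over the base name):
[run_binary(
    name = "generate_%s_code_from_%s_data" % (b, b),
    srcs = [":%s_data" % b],
    outs = [
        "%s.generated.hpp" % b,
        "%s.generated.cpp" % b,
    ],
    args = [
        "--in=$(locations :%s_data)" % b,
        "--hppout=$(location %s.generated.hpp)" % b,
        "--cppout=$(location %s.generated.cpp)" % b,
    ],
    tool = ":tool",
) for b in ["foo", "bar"]]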
I have two questions:
1) Is the approach I'm taking (with or without the list comprehensions) the most natural way to express this build step in Bazel? My first attempt feels pretty unnatural.
2) One thing I need to extract is the path to the generated hpp relative to the workspace root. $(location foo.generated.hpp) produces something like bazel-out/k8-fastbuild/bin/path/to/data/foo.generated.hpp. Is there some make variable that run_binary() understands that I could pass in, or should I just bake the path in manually (either in code or in the rule) as I know the directory structure?