diff (expect) tests under Bazel?


Gregg Reynolds

Oct 26, 2022, 9:23:54 AM
to bazel-discuss
I'm converting OCaml builds to Bazel.  Lots of them run diff tests, which usually involve building and running an executable, capturing its output in a file, and then running a diff tool to compare the output to a static file in the source tree (e.g. comparing generated 'test.output' to 'test.expected').
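For readers who have not met the pattern, the diff-test workflow described above looks roughly like this outside of Bazel. This is only a minimal sketch: the helper name `expect_test` and the file names are hypothetical, and `printf` stands in for the real test executable.

```shell
# Hypothetical helper: run a command, capture its stdout in a temp file,
# and diff that file against a checked-in expected file.
# Returns diff's exit status (0 = identical, 1 = different).
expect_test() {
    expected="$1"; shift
    actual="$(mktemp)"
    # Run the executable under test; only stdout is captured.
    "$@" > "$actual"
    diff -u "$expected" "$actual"
    status=$?
    rm -f "$actual"
    return "$status"
}

# Demo with a stand-in "executable" (printf) instead of a real test.exe.
workdir="$(mktemp -d)"
printf 'hello\n' > "$workdir/test.expected"

# Output matches the expected file.
pass_status=0
expect_test "$workdir/test.expected" printf 'hello\n' || pass_status=$?

# Output differs; the unified diff itself is discarded here.
fail_status=0
expect_test "$workdir/test.expected" printf 'goodbye\n' > /dev/null || fail_status=$?

echo "matching run: $pass_status, differing run: $fail_status"
```

The rest of the thread is about getting Bazel to perform the "run and capture" step in a way that still lets the executable find its data files.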

Sometimes the executables have hard-coded paths for inputs and/or outputs. I handle this by putting those in the 'data' attribute, and I have an attribute 'data_prefix_map' that allows me to set the path to whatever the executable is expecting. For example, a target in 'test/pretty':

ocaml_binary(
    name            = "test.exe",
    main            = ":Test",
    data            = ["sample.json"],
    data_prefix_map = {"test/pretty/": ""},
    visibility      = ["//visibility:public"],
)

This works great for running test.exe. In this case, "sample.json" is hardcoded in test.exe, and it writes its output to stdout.

Now the problem is to capture that output to a file and then run a 'diff' under a Bazel test target.

I've used the diff_test rule from skylib, but it needs file inputs.  I tried using a genrule to run test.exe and redirect its output to a file, for input to diff_test.  However, since the genrule uses the shell to execute test.exe, it ignores the runfiles stuff, so it does not work (remember that "sample.json" is hardcoded in test.exe).  But this works:

genrule(
    outs  = ["test.output"],
    name  = "__test.output__",
    srcs  = [":sample.json"],
    cmd   = " ".join([
        "cp $(location sample.json) .;",   ## <= ghastly hack
        "$(execpath test.exe)",
        "> $@"
    ]),
    exec_tools = ["test.exe"]
)

This works, but it's hardly user-friendly and it doesn't scale very well. You have to know quite a lot about Bazel to understand it, I think, and let's face it, most programmers dread having to learn the gory details of yet another build system. So I'm looking for a more elegant and user-friendly solution.

How do rule authors handle this sort of thing?  Diff tests are pretty common so I'm sure this problem has been solved in various ways; I'm just not sure what the optimal Bazel way is.

I have an ocaml_test rule that I could modify to accept a diff tool etc. It would also need an attribute for expected files. But that seems a little fragile.  An alternative would be to write a dedicated ocaml_diff_test rule.  Or maybe it would be better to call it ocaml_expect_test.  I could also write a rule (or macro) whose sole purpose is to do what the above genrule does (run something and redirect stdout/stderr to a file).  FWIW I'd like to avoid depending on third-party libs.

Advice?

Thanks,

Gregg


David Turner

Oct 26, 2022, 10:03:34 AM
to Gregg Reynolds, bazel-discuss
I don't see any data_prefix_map in either https://github.com/jin/rules_ocaml or https://github.com/obazl/rules_ocaml so I really wonder what set of OCaml bazel rules you are using. Did you modify them to add a data_prefix_map attribute yourself?

exec_tools dependencies in a genrule() should pick up the target executable's runfiles and you shouldn't need to copy anything.
But to achieve this Bazel relies on the DefaultInfo.runfiles information provided by the configured target for your ocaml_binary() definition.
I suspect there is a bug in the implementation function for ocaml_binary(), but it is really hard to tell without more details.

Do you have a simple reproducible test case?


Gregg Reynolds

Oct 26, 2022, 11:02:42 AM
to David Turner, bazel-discuss
On Wed, Oct 26, 2022 at 9:03 AM David Turner <di...@google.com> wrote:
I don't see any data_prefix_map in either https://github.com/jin/rules_ocaml or https://github.com/obazl/rules_ocaml so I really wonder what set of OCaml bazel rules you are using. Did you modify them to add a data_prefix_map attribute yourself?

Yes, still under development (I'm the OBazl author). I'm developing and testing a dune->obazl conversion tool, which has exposed some weaknesses in OBazl's testing support.  See the dev branch of obazl rules_ocaml.

exec_tools dependencies in a genrule() should pick up the target executable's runfiles and you shouldn't need to copy anything.
But to achieve this Bazel relies on the DefaultInfo.runfiles information provided by the configured target for your ocaml_binary() definition.
I suspect there is a bug in the implementation function for ocaml_binary(), but it is really hard to tell without more details.

You're saying that genrule should configure the runfiles for the tool, relative to "$(execpath test.exe)"?  Maybe the symlink stuff is the problem. Here's the relevant snippet:

In impl_binary.bzl:

    rfsymlinks = {}
    for f in ctx.files.data:
        added = False
        for (k,v) in ctx.attr.data_prefix_map.items():
            if f.path.startswith(k):
                rf = v + f.path.removeprefix(k)
                rfsymlinks.update({rf: f})
                added = True
                break
        if not added:
            rfsymlinks.update({f: f})

    if ctx.attr.data_prefix_map:
        myrunfiles = ctx.runfiles(
            files = rfiles,
            symlinks = rfsymlinks,
            root_symlinks = rfsymlinks
        )
    else:
        myrunfiles = ctx.runfiles(
            files = rfiles
        )
    defaultInfo = DefaultInfo(
        executable=out_exe,
        runfiles = myrunfiles
    )



Do you have a simple reproducible test case?

See https://github.com/obazl-repository/yojson/blob/dev/bazel/test/pretty/BUILD.bazel.  Note that you probably won't be able to build that unless you set up some local repos, since I'm in the middle of developing stuff.  At a minimum, `--override_repository=rules_ocaml=/path/to-rules/obazl/rules_ocaml` needs to point to the dev branch. I'm shooting to cut a new public alpha soon.

Thanks,

G

David Turner

Oct 26, 2022, 12:09:39 PM
to Gregg Reynolds, bazel-discuss
On Wed, Oct 26, 2022 at 5:02 PM Gregg Reynolds <d...@mobileink.com> wrote:


You're saying that genrule should configure the runfiles for the tool, relative to "$(execpath test.exe)"?  Maybe the symlink stuff is the problem. Here's the relevant snippet:

No, the genrule should not do anything. If you use `exec_tools = [ ":some_tool" ]` in your genrule() invocation, Bazel will ensure that the runfiles of `some_tool` are copied to the execution sandbox at the proper location.
In this case, you need to ensure that the DefaultInfo.runfiles value returned for the configured ocaml_binary() target is correct. Try to use a cquery to see what the providers of your configured target are, e.g. `bazel cquery --output=starlark --starlark:expr="providers(target)" //path/to/your/ocaml_binary_target`.
Note that the output might be a little surprising because it dumps Bazel internal details that do not appear in Starlark (e.g. DefaultInfo is really a collection of different sub-providers, what a terrible mess).

Also have a look at the action commands generated by your genrule() (e.g. with `bazel aquery //test/pretty:genrule_target`). I suspect your executable is called from a working directory or with a path that is not what you expect.

In impl_binary.bzl:

    rfsymlinks = {}
    for f in ctx.files.data:
        added = False
        for (k,v) in ctx.attr.data_prefix_map.items():
            if f.path.startswith(k):
                rf = v + f.path.removeprefix(k)
                rfsymlinks.update({rf: f})
                added = True
                break
        if not added:
            rfsymlinks.update({f: f})

    if ctx.attr.data_prefix_map:
        myrunfiles = ctx.runfiles(
            files = rfiles,
            symlinks = rfsymlinks,
            root_symlinks = rfsymlinks
 
NOTE: Using the same values for symlinks and root_symlinks seems to be incorrect, see https://bazel.build/extending/rules#runfiles_symlinks 

        )
    else:
        myrunfiles = ctx.runfiles(
            files = rfiles
        )
    defaultInfo = DefaultInfo(
        executable=out_exe,
        runfiles = myrunfiles
    )



Do you have a simple reproducible test case?

See https://github.com/obazl-repository/yojson/blob/dev/bazel/test/pretty/BUILD.bazel.  Note that you probably won't be able to build that unless you set up some local repos, since I'm in the middle of developing stuff.  At a minimum, `--override_repository=rules_ocaml=/path/to-rules/obazl/rules_ocaml` needs to point to the dev branch. I'm shooting to cut a new public alpha soon.

Sorry but these are not reproduction steps at all. I hope you will find your bug with the queries above though.


Gregg Reynolds

Oct 26, 2022, 1:49:44 PM
to David Turner, bazel-discuss


On Wed, Oct 26, 2022 at 11:09 AM David Turner <di...@google.com> wrote:


No, the genrule should not do anything. If you use `exec_tools = [ ":some_tool" ]` in your genrule() invocation, Bazel will ensure that the runfiles of `some_tool` are copied to the execution sandbox at the proper location.
In this case, you need to ensure that the DefaultInfo.runfiles value returned for the configured ocaml_binary() target is correct. Try to use a cquery to see what the providers of your configured target are, e.g. `bazel cquery --output=starlark --starlark:expr="providers(target)" //path/to/your/ocaml_binary_target`.

Tried that, it does not show DefaultInfo providers. Tried it on a simple ocaml_module rule, same. 

Just ran the build with --sandbox_debug, then `find -L <sandbox path> -name sample.json`, and it looks like it is indeed exactly where it should be:

<sandbox>/bazel-out/darwin-opt-exec-2B5CBBC6/bin/test/pretty/test.exe.runfiles/__main__/test/pretty/sample.json
<sandbox>/bazel-out/darwin-opt-exec-2B5CBBC6/bin/test/pretty/test.exe.runfiles/__main__/sample.json
<sandbox>/bazel-out/darwin-opt-exec-2B5CBBC6/bin/test/pretty/test.exe.runfiles/sample.json

And if I replace the hardcoded "sample.json" with "bazel-out/darwin-opt-exec-2B5CBBC6/bin/test/pretty/test.exe.runfiles/sample.json", it works.
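The three paths in the listing appear to line up with the three arguments passed to `ctx.runfiles` in the snippet earlier in the thread. Going by the Bazel runfiles documentation, `files` land under the workspace directory at their package path, `symlinks` are placed relative to the workspace directory, and `root_symlinks` relative to the runfiles root. The sketch below rebuilds that layout by hand in a scratch directory, assuming that reading is right; the scratch paths are illustrative and Bazel creates the real tree itself.

```shell
# Rebuild the runfiles layout from the listing in a scratch directory.
scratch="$(mktemp -d)"
src="$scratch/src/test/pretty"
rf="$scratch/test.exe.runfiles"
mkdir -p "$src" "$rf/__main__/test/pretty"
echo '{"sample": true}' > "$src/sample.json"

# files = [...]          -> <runfiles>/__main__/<package path>/sample.json
ln -s "$src/sample.json" "$rf/__main__/test/pretty/sample.json"
# symlinks = {...}       -> key is relative to <runfiles>/__main__/
ln -s "$src/sample.json" "$rf/__main__/sample.json"
# root_symlinks = {...}  -> key is relative to <runfiles>/ itself
ln -s "$src/sample.json" "$rf/sample.json"

# Same check as in the thread: follow symlinks and list every copy.
found="$(cd "$rf" && find -L . -name sample.json | sort)"
echo "$found"
```

Seen this way, passing the same dictionary as both `symlinks` and `root_symlinks` simply produces two of the three entries, which is consistent with the NOTE about using identical values for both.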

OTOH, if I add sample.json to srcs and add `cp $(location sample.json) .;` to the cmd, I get (in addition to the three just listed):

<sandbox>/test/pretty/sample.json

Which suggests to me that there is some discombobulation going on here.  The genrule cmd is:

/bin/bash \
    -c \
    'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/darwin-opt-exec-2B5CBBC6/bin/test/pretty/test.exe > bazel-out/darwin-fastbuild/bin/test/pretty/test.output.json'

Maybe genrule-setup.sh is running test.exe from the sandbox root instead of the runfiles-appropriate location?

 
NOTE: Using the same values for symlinks and root_symlinks seems to be incorrect, see https://bazel.build/extending/rules#runfiles_symlinks 

Thanks
 

Sorry but these are not reproduction steps at all. I hope you will find your bug with the queries above though.

Sorry about that.  I added some quickstart info to the readme and changed the workspace rules to use git_repository pointing to the appropriate dev branches, so you might now be able to build those targets if you don't mind installing and configuring OCaml etc.  But it's only tested on Mac, will run it on Linux shortly.

Thanks for the suggestions,

Gregg

Gregg Reynolds

Oct 26, 2022, 2:03:07 PM
to David Turner, bazel-discuss
On Wed, Oct 26, 2022 at 12:49 PM Gregg Reynolds <d...@mobileink.com> wrote:


Maybe genrule-setup.sh is running test.exe from the sandbox root instead of the runfiles-appropriate location?

Well, I guess it's because it's run under bash, so the current dir is the launchdir, which is the sandbox root? But runfiles are not designed for that, it seems.
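
The guess above can be reproduced without Bazel. In this minimal sketch a hypothetical stand-in script plays the role of test.exe: a relative path such as "sample.json" is resolved against the current working directory, so the program only finds the file when it is launched from the directory that contains it (here, the runfiles root).

```shell
scratch="$(mktemp -d)"
rf="$scratch/test.exe.runfiles"
mkdir -p "$rf"
echo '{"sample": true}' > "$rf/sample.json"

# Stand-in for test.exe: reads a hard-coded relative path.
cat > "$scratch/test.exe" <<'EOF'
cat sample.json
EOF

# Launched from the "sandbox root": sample.json is not in the cwd, so cat fails.
root_status=0
(cd "$scratch" && sh ./test.exe) > /dev/null 2>&1 || root_status=$?

# Launched from inside the runfiles directory: the relative path resolves.
rf_status=0
rf_output="$(cd "$rf" && sh ../test.exe)" || rf_status=$?

echo "from sandbox root: exit $root_status"
echo "from runfiles dir: exit $rf_status, output: $rf_output"
```

In genrule terms, this would presumably mean changing into `$(execpath test.exe).runfiles` before invoking the tool, after first making the output path absolute, since `$@` is relative to the directory the command starts in. Whether that is any nicer than the `cp` hack is a judgment call.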