How to run Python script as part of the build without making py_binary first?

Konstantin Erman

Feb 4, 2020, 11:55:15 AM
to bazel-discuss
Currently I invoke a Python script as part of the build by first building a py_binary and then invoking it through a genrule or run_binary. The py_binary uses a py_runtime that points to a downloaded (hermetic) interpreter.
It works, but py_binary does a lot of unnecessary work, including zipping the whole interpreter (tens of megabytes). I only need to run a script; there is no need to package it for standalone execution.
Is there a better way to just run the script, without making a binary out of it and packaging a huge interpreter with it?
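
For reference, the setup described here might look roughly like the sketch below. The target and file names (gen, gen.py, out.txt) are hypothetical, and the script is assumed to take the output path as its only argument.

```starlark
# Hypothetical names throughout; a sketch of the "py_binary first" approach.
py_binary(
    name = "gen",
    srcs = ["gen.py"],
)

genrule(
    name = "generated",
    outs = ["out.txt"],
    # The py_binary is built for the execution platform and run as a tool.
    tools = [":gen"],
    # Assumes gen.py writes its result to the path given as its first argument.
    cmd = "$(location :gen) $@",
)
```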

Thank you!
Konstantin

Sarah Maven

Feb 4, 2020, 10:20:02 PM
to bazel-discuss
Yes, it is a lot of work, but the script is only built once and the result is cached for future builds. If you really want to do this in a non-Bazel-y way, consider using a genrule to call python.

Example:
genrule(
    name = "pleasedontdothis",
    srcs = [],
    outs = ["a.txt"],
    cmd = """python3 -c 'x=open("$@", "w"); x.write("my location is $@"); x.close()'""",
)

Obviously, you would then list the script in srcs and use $(location //your/python:script.py) instead of -c.
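
A rough sketch of that variation follows. The label //your/python:script.py and the target name are placeholders, the script would need to be visible to this package (for example via exports_files), and python3 is assumed to be on the PATH of the machine running the action, so this is not hermetic.

```starlark
genrule(
    name = "run_script",  # hypothetical target name
    # The script is declared as an input so Bazel tracks changes to it.
    srcs = ["//your/python:script.py"],
    outs = ["a.txt"],
    # Assumes script.py writes to the output path passed as its first argument.
    cmd = "python3 $(location //your/python:script.py) $@",
)
```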

But again, I would advise against doing this. I'm just letting you know how in case you feel there's no other way; it will lead to pain and you should not do it.

Konstantin Erman

Feb 4, 2020, 10:33:56 PM
to bazel-discuss
Just today I accidentally discovered that when the EXE (and .ZIP) created by py_binary is executed, it actually uses the system Python interpreter to bootstrap and then switches to the packaged hermetic interpreter. Am I missing something? What if there is no system interpreter on the machine? This is a huge argument against using py_binary.

What if, alternatively, I write something like the following?

def _impl(ctx):
    py_runtime_info = ctx.attr._python_toolchain[PyRuntimeInfo]

    # File for the hermetic interpreter, e.g. external/python_windows/python.exe.
    # Passing the File (rather than its path string) makes it an action input.
    interpreter = py_runtime_info.interpreter
    inputs = [ctx.file.main] + ctx.files.srcs

    bindir = ctx.var["BINDIR"]  # e.g. bazel-out/x64_windows-fastbuild/bin

    # Command line: <interpreter> <main script> <bindir> <user args...>
    args = ctx.actions.args()
    args.add(ctx.file.main)
    args.add(bindir)
    args.add_all(ctx.attr.args)

    ctx.actions.run(
        executable = interpreter,
        inputs = inputs,
        arguments = [args],
        outputs = ctx.outputs.outs,
        # Supporting files of the runtime (standard library and so on).
        tools = py_runtime_info.files,
    )

    return DefaultInfo(
        files = depset(ctx.outputs.outs),
        runfiles = ctx.runfiles(files = ctx.outputs.outs),
    )
 
run_python = rule(
    implementation = _impl,
    attrs = {
        "main": attr.label(
            allow_single_file = True,
        ),
        "srcs": attr.label_list(
            allow_files = True,
            doc = "Additional inputs of the action.<br/><br/>These labels are available for" +
                  " <code>$(location)</code> expansion in <code>args</code> and <code>env</code>.",
        ),
        "args": attr.string_list(
            doc = "Command line arguments of the binary.<br/><br/>Subject to " +
                  "<code><a href=\"https://docs.bazel.build/versions/master/be/make-variables.html#location\">$(location)</a></code>" +
                  " expansion.",
        ),
        "outs": attr.output_list(
            mandatory = True,
            doc = "Output files generated by the action.<br/><br/>These labels are available for" +
                  " <code>$(location)</code> expansion in <code>args</code> and <code>env</code>.",
        ),
        "_python_toolchain": attr.label(default = "@tab_toolchains//bazel/toolchains/python:tab_python_runtime"),
    },
)
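
For illustration, a BUILD file might use such a rule roughly as follows. The .bzl location, target name, and file names are all hypothetical. Given the implementation above, the script receives BINDIR as its first argument followed by the entries of args, so it would need to write each declared output under that directory at the package-relative path.

```starlark
# Hypothetical location of the rule definition.
load("//tools:run_python.bzl", "run_python")

run_python(
    name = "generate_report",
    main = "generate_report.py",  # script executed by the hermetic interpreter
    srcs = ["data.json"],  # extra inputs made available to the action
    args = ["--verbose"],  # appended after the BINDIR argument
    outs = ["report.txt"],  # the script must create this file
)
```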


What do you think?

Sarah Maven

Feb 4, 2020, 11:05:00 PM
to bazel-discuss
Aside from the question of whether you integrate with py_library, it looks like you have an answer. For the standard Bazel user I would still suggest py_binary, because of Bazel's caching and because it is the general practice. For you, if you want to use your run_python, go ahead. It looks like you know what you're doing, you document your code, I see no reason why this would break cross-platform, and you can switch back to py_binary if need be.

Konstantin Erman

Feb 5, 2020, 12:04:24 AM
to bazel-discuss
Thank you, Sarah! As a matter of fact, I am not quite sure what I am doing yet, but you sound encouraging. 😁

Jon Brandvein

Feb 5, 2020, 9:08:25 AM
to bazel-discuss
The Python stub script does require a Python interpreter to be installed on the target platform for bootstrapping. See #8685. We should make the specific way it finds this first-stage interpreter customizable, but a longer-term solution would probably be to rewrite the stub script as a native launcher.

Conversely, if you don't want to build/unpack a Python interpreter just to run Python scripts, you can just have your toolchain / py_runtime point to the non-hermetic system interpreter.
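
A minimal sketch of such a non-hermetic setup, assuming the py_runtime_pair toolchain mechanism shipped with Bazel at the time of this thread, might look like this. The target names and the interpreter path /usr/bin/python3 are assumptions and would need adjusting per machine.

```starlark
load("@bazel_tools//tools/python:toolchain.bzl", "py_runtime_pair")

py_runtime(
    name = "system_py3_runtime",
    # Absolute path to an interpreter already installed on the machine
    # (assumed location); nothing is downloaded or packaged.
    interpreter_path = "/usr/bin/python3",
    python_version = "PY3",
)

py_runtime_pair(
    name = "system_py_runtime_pair",
    py3_runtime = ":system_py3_runtime",
)

toolchain(
    name = "system_py_toolchain",
    toolchain = ":system_py_runtime_pair",
    toolchain_type = "@bazel_tools//tools/python:toolchain_type",
)
```

The toolchain would then be registered, for example with register_toolchains in the WORKSPACE file or with the --extra_toolchains flag.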