The failing test //tools:par_test in rules_python

Lukács T. Berki

Mar 23, 2018, 9:22:43 AM
to Marcel Hlopko, Jakob Buchgraber, Rosica Dejanovska, Matthew Moore, bazel-si...@googlegroups.com
Hey there,

It turns out it's failing because Bazel at HEAD builds different PAR files than the last release does. Therefore, there isn't really a way to fix this test: it can pass with the released Bazel or at HEAD, but not with both.

Thus, I propose to delete that test. I haven't looked at what the difference is, but keeping this test would mean promising that we never change Bazel in a way that makes .par files build differently, which isn't such a great idea. (The alternative would be to keep the test but not run it on the CI, but let's not do that...)

--
Lukács T. Berki | Software Engineer | lbe...@google.com | 

Google Germany GmbH | Erika-Mann-Str. 33  | 80636 München | Germany | Geschäftsführer: Paul Manicle, Halimah DeLaine Prado | Registergericht und -nummer: Hamburg, HRB 86891

Matthew Moore

Mar 23, 2018, 10:06:31 AM
to Lukács T. Berki, hlo...@google.com, Jakob Buchgraber, ros...@google.com, bazel-si...@googlegroups.com
FWIW, this is born of Bazel's inability to build tool dependencies for WORKSPACE evaluation. Let's fix that; everything else is a workaround.

In the short term, if you'd like to propose an alternative that makes sure the checked-in PAR file is properly updated and gives the reviewer a way of validating that the PAR (effectively an opaque binary) was updated correctly (by an untrusted third party), then I'd love to hear it. This test basically builds on the premise that Bazel's promise of perfect reproducibility (maybe just within one version of Bazel) is also a promise of perfect verifiability. The approach we've adopted elsewhere (rules_docker) effectively requires splitting the repo and cutting releases, which is an incredibly tedious and heavy process.
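
For concreteness, the heart of such a test is presumably a rebuild-and-compare step along these lines (a sketch only; the file names are hypothetical, not the actual rules_python targets):

# Sketch of the comparison at the heart of a test like this: rebuild the
# .par from source and require it to be byte-identical to the checked-in
# copy. File names are illustrative, not the real rules_python paths.
import sys

def main(checked_in_par, freshly_built_par):
    with open(checked_in_par, "rb") as f:
        golden = f.read()
    with open(freshly_built_par, "rb") as f:
        rebuilt = f.read()
    if golden != rebuilt:
        sys.exit("checked-in .par does not match the .par built at HEAD")

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])

Any change in how Bazel packages .par files makes this comparison fail, which is exactly the coupling being debated here.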

tl;dr please don't just disable this.
-M

--
Matthew Moore
Container Development Uber-TL
Developer Infrastructure @ Google

Lukács T. Berki

Mar 23, 2018, 10:34:46 AM
to Matthew Moore, Marcel Hlopko, Jakob Buchgraber, Rosica Dejanovska, bazel-si...@googlegroups.com
On Fri, 23 Mar 2018 at 15:06, Matthew Moore <matt...@google.com> wrote:
> FWIW, this is born of Bazel's inability to build tool dependencies for WORKSPACE evaluation. Let's fix that; everything else is a workaround.

Unfortunately, that is a HUGE problem, so I don't think it's feasible to make fixing it a prerequisite for getting our CI green.

> In the short term, if you'd like to propose an alternative that makes sure the checked-in PAR file is properly updated and gives the reviewer a way of validating that the PAR (effectively an opaque binary) was updated correctly (by an untrusted third party), then I'd love to hear it. This test basically builds on the premise that Bazel's promise of perfect reproducibility (maybe just within one version of Bazel) is also a promise of perfect verifiability. The approach we've adopted elsewhere (rules_docker) effectively requires splitting the repo and cutting releases, which is an incredibly tedious and heavy process.

Do we have untrusted third parties pushing code to rules_python? In that case, they can also push source code, can't they?

Mind you, I'm all for verifying that opaque binaries are what we think they are, but a Bazel test seems like the wrong place for it due to the aforementioned dependency on Bazel versions. It essentially guarantees that our CI is red around a release, at least for a short time.

Evan Jones

Mar 23, 2018, 10:57:45 AM
to Lukács T. Berki, Matthew Moore, Marcel Hlopko, Jakob Buchgraber, Rosica Dejanovska, Bazel/Python Special Interest Group
On Fri, Mar 23, 2018 at 10:34 AM, 'Lukács T. Berki' via Bazel/Python Special Interest Group <bazel-si...@googlegroups.com> wrote:
> Do we have untrusted third parties pushing code to rules_python? In that case, they can also push source code, can't they?
>
> Mind you, I'm all for verifying that opaque binaries are what we think they are, but a Bazel test seems like the wrong place for it due to the aforementioned dependency on Bazel versions. It essentially guarantees that our CI is red around a release, at least for a short time.

In this particular case: I believe the opaque binary is actually a zip containing Python sources. We could write a smarter differ that looks only at the contents and ignores the parts that are known to change (like the Python bootstrap piece)? That doesn't *entirely* let you blindly accept a PR from any untrusted person, since they could still insert malicious code into the bootstrap script, but it might be closer?
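
A minimal sketch of such a content-aware differ, assuming the .par is a zip archive that Python's zipfile can open (zipfile locates the archive from the end of the file, so the bootstrap prefix doesn't get in the way); the entry names to ignore below are hypothetical:

# Content-aware comparison of two .par files. Comparing zf.read() per
# entry also ignores zip metadata such as timestamps, which helps with
# "parts that are known to change".
import zipfile

# Entries expected to differ across Bazel versions; names are hypothetical.
IGNORED_ENTRIES = {"__main__.py"}

def payload(par_path):
    """Map of entry name to contents, minus the ignored entries."""
    with zipfile.ZipFile(par_path) as zf:
        return {
            name: zf.read(name)
            for name in zf.namelist()
            if name not in IGNORED_ENTRIES
        }

def same_payload(checked_in_par, rebuilt_par):
    return payload(checked_in_par) == payload(rebuilt_par)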

Lukács T. Berki

Mar 23, 2018, 11:00:41 AM
to Evan Jones, Matthew Moore, Marcel Hlopko, Jakob Buchgraber, Rosica Dejanovska, bazel-si...@googlegroups.com
The problem is, as long as you don't control who can push code to the repository, they can always push untrusted .par files and just ignore the red tests. What's the threat model here?

Matthew Moore

Mar 23, 2018, 11:05:39 AM
to Lukács T. Berki, hlo...@google.com, Jakob Buchgraber, ros...@google.com, bazel-si...@googlegroups.com
> Do we have untrusted third parties pushing code to rules_python? In that case, they can also push source code, can't they?

Yes. Code can be reviewed, but binaries cannot.

> a Bazel test seems like the wrong place for it due to the aforementioned dependency on Bazel versions

Technically the dependency is on the version of the Python rules, which are co-versioned with Bazel.

E.g., if I were checking in Go binaries, they wouldn't change unless I revved rules_go.

Bazel core "seems like the wrong place for" the python rules, but rewriting those in skylark is also "a HUGE problem". :)

In fact, given that this repo is where the skylark python rules would live, I don't see anything wrong with a test like this (other than that we need to check in a binary in the first place).



Ultimately I think the path forward that makes the most sense is for whoever owns the core Bazel python rules to take ownership of this repo (and probably google/subpar), and own the solution.

The main piece I wanted to contribute to this thread was the verification constraint for whoever solves this problem differently. I don't have the bandwidth to change this, and probably don't have the bandwidth to own this repo anymore anyway.

thanks,
-M

jo...@google.com

Mar 23, 2018, 7:30:12 PM
to Bazel/Python Special Interest Group
I have some bandwidth and have been experiencing frustration trying to use Bazel with Python, so I might be able to help. I've been hesitant to jump in because I just started using Bazel, and I'm unclear on the planned direction for rules_python. It seems like the plan is to move the functionality of py_binary and friends out of Bazel itself and implement it in rules_python? Could someone give me an idea of what I might be signing up for? What are the major difficulties? E.g., why hasn't it happened already, besides lack of time?

-james

Doug Greiman

Mar 23, 2018, 7:52:02 PM
to Bazel/Python Special Interest Group
I think the use of .par files can be removed in this case: https://github.com/bazelbuild/rules_python/pull/81

I need to do some more testing though.

P. Oscar Boykin

Mar 23, 2018, 7:52:59 PM
to Bazel/Python Special Interest Group
Note: using rules_nixpkgs could work here:

https://github.com/tweag/rules_nixpkgs

So we would write a Nix build of the binary we need, then use rules_nixpkgs to reproducibly build that binary; see the sketch below.
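
A rough WORKSPACE sketch of that idea, based on the rules_nixpkgs README of the time (the attribute path and target name below are illustrative, not a tested setup):

# WORKSPACE sketch: fetch a pinned nixpkgs and build one package from it.
# Check the rules_nixpkgs README for the current API.
load(
    "@io_tweag_rules_nixpkgs//nixpkgs:nixpkgs.bzl",
    "nixpkgs_git_repository",
    "nixpkgs_package",
)

nixpkgs_git_repository(
    name = "nixpkgs",
    revision = "17.09",  # pin nixpkgs for reproducibility
)

nixpkgs_package(
    name = "par_tool",  # hypothetical name for the tool we need
    attribute_path = "hello",  # illustrative nixpkgs attribute
    repository = "@nixpkgs",
)

The trade-off is that Nix must be installed on every machine that evaluates the WORKSPACE.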

It would be nice if we could build tools in Bazel for WORKSPACE usage, but I haven't even heard of a design for how that would work, so I'm afraid that realistically it is years out.

Lukács T. Berki

Mar 26, 2018, 3:55:58 AM
to Matthew Moore, Marcel Hlopko, Jakob Buchgraber, Rosica Dejanovska, bazel-si...@googlegroups.com
Well, I can see two approaches that would work:
  1. Mandate that .par files can only be checked in by repository owners whom we trust not to be naughty, and don't have these tests at all.
  2. Only run these tests against the latest Bazel release (not at HEAD), and rebuild the .par files on each Bazel release.
I prefer (1) because it doesn't add any more ceremony to our releases. It does make accepting changes harder, though.

@Oscar: I'd much rather not add a dependency on Nix here. The simpler our world, the better.