using vpython3 in the build instead of python3?

373 views
Skip to first unread message

Dirk Pranke

unread,
Feb 22, 2023, 7:35:02 PM2/22/23
to build, Brian Ryner, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Hi all,

For quite a long time, we had a prohibition against using `vpython` in the build, because it was believed to be too slow. In fact, as far as I know we're supposed to still have that prohibition.

The bug in question we were using to track this -- crbug.com/948846 -- got dup'ed into crbug.com/1152879 (on my suggestion), but I think that ended up being a mistake because in the latter case we were only looking into general infra vpython performance,
and not something as performance sensitive as using it thousands of times in the build.

So, I think we should re-evaluate things and decide whether we can now allow it. 

IIRC, there was one case where we knew performance would be unacceptably slow, and that was when we were using vpython as the script_executable (i.e., the default for all python invocations), and when we used a python wrapper script around clang on the code coverage bots (the latter meaning we'd invoke python on every compile action as well as every script action). I have a vague memory that we did something to address this, but I don't remember what it was, so I'm not sure if this is still an issue.

It seems like we could have several different possible options

1) vpython is not allowed at all, for anything (status quo AFAIK)
2) vpython is allowed as needed
3) vpython is allowed (and is in fact mandated) for everything except for the compile wrapper, and we somehow figure out how to work around it for the wrapper.
4) vpython is allowed/mandated for everything.

I think generally we want to switch to vpython where we can, because vpython's package management practices are better than our ad-hoc "pull code down. check it into third_party, and do sys.path hacking" approach that we've been using forever.

I am, however, not clear on how much of a performance hit we're willing to take for this.

Anyone have any sense of what they think might be acceptable? I'm thinking we'd want something like a <= 1% performance hit on the bots like linux-rel that have coveraged enabled, i.e., case 4 above. 

@Gary Tong Do you have any sense of what sort of a build time increase we could tolerate on the fleet (bearing in mind that I'm asking you this on a public mailing list :)?

@Prakhar Asthana Do you remember the coverage issue I'm talking about and whether we did anything about it?

@Others - do other people have thoughts on this? I'll mention for example that we don't have to worry about whether or not people have vpython, because we run all of our gclient hooks using vpython now.

I tripped over this today because I wanted to switch some tools over to vpython but I discovered that one of them was being used in the build and hence triggered this case :(.

-- Dirk





Chenlin Fan

unread,
Feb 22, 2023, 8:15:10 PM2/22/23
to Dirk Pranke, build, Brian Ryner, Takuto Ikuta, Prakhar Asthana, Gary Tong
For the script_executable, maybe you are mentioning crbug.com/1345490? I did some exploration but vpython's performance on windows may not be fixable : (

There are some benchmarks in crrev.com/c/3764910. It's more like vpython is ~500% slower than python on windows.

Dirk Pranke

unread,
Feb 22, 2023, 8:44:44 PM2/22/23
to Chenlin Fan, build, Brian Ryner, Takuto Ikuta, Prakhar Asthana, Gary Tong
On Wed, Feb 22, 2023 at 5:15 PM Chenlin Fan <fa...@google.com> wrote:
For the script_executable, maybe you are mentioning crbug.com/1345490? I did some exploration but vpython's performance on windows may not be fixable : (

Ah, yes, that bug describes the problem.

That rules out #4 and possibly rules out #3 (although I think it might make sense to check what the actual wall clock increase is instead of measuring work time). 

It still leaves option #2 (vpython only as needed) as possible, as long as the overhead isn't too high.

Arguably that makes things more complicated (having a mix of python and vpython) but that may be an acceptable tradeoff to be able to
take advantage of CIPD where possible and reduce the number of third_party directories directly checked in src/. 

Alternatively, maybe we could pursue your idea to run GN inside the vpython environment such that we'd get vpython's hermetic copy of
python instead of relying on python being in the $PATH.

There are some benchmarks in crrev.com/c/3764910. It's more like vpython is ~500% slower than python on windows.

Yes, that's an interesting and relevant discussion as well.

Thanks for the pointers!

-- Dirk

Fumitoshi Ukai (鵜飼文敏)

unread,
Feb 23, 2023, 8:52:25 PM2/23/23
to Dirk Pranke, build, Brian Ryner, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
On Thu, Feb 23, 2023 at 9:35 AM 'Dirk Pranke' via build <bu...@chromium.org> wrote:
Hi all,

For quite a long time, we had a prohibition against using `vpython` in the build, because it was believed to be too slow. In fact, as far as I know we're supposed to still have that prohibition.

The bug in question we were using to track this -- crbug.com/948846 -- got dup'ed into crbug.com/1152879 (on my suggestion), but I think that ended up being a mistake because in the latter case we were only looking into general infra vpython performance,
and not something as performance sensitive as using it thousands of times in the build.

So, I think we should re-evaluate things and decide whether we can now allow it. 

IIRC, there was one case where we knew performance would be unacceptably slow, and that was when we were using vpython as the script_executable (i.e., the default for all python invocations), and when we used a python wrapper script around clang on the code coverage bots (the latter meaning we'd invoke python on every compile action as well as every script action). I have a vague memory that we did something to address this, but I don't remember what it was, so I'm not sure if this is still an issue.

It seems like we could have several different possible options

1) vpython is not allowed at all, for anything (status quo AFAIK)
2) vpython is allowed as needed
3) vpython is allowed (and is in fact mandated) for everything except for the compile wrapper, and we somehow figure out how to work around it for the wrapper.
4) vpython is allowed/mandated for everything.

fyi. vpython3 is used for now.

note that vpython3 would not be remote execution friendly, so I think it's better not allow vpython3 in build.



I think generally we want to switch to vpython where we can, because vpython's package management practices are better than our ad-hoc "pull code down. check it into third_party, and do sys.path hacking" approach that we've been using forever.

I am, however, not clear on how much of a performance hit we're willing to take for this.

Anyone have any sense of what they think might be acceptable? I'm thinking we'd want something like a <= 1% performance hit on the bots like linux-rel that have coveraged enabled, i.e., case 4 above. 

@Gary Tong Do you have any sense of what sort of a build time increase we could tolerate on the fleet (bearing in mind that I'm asking you this on a public mailing list :)?

@Prakhar Asthana Do you remember the coverage issue I'm talking about and whether we did anything about it?

@Others - do other people have thoughts on this? I'll mention for example that we don't have to worry about whether or not people have vpython, because we run all of our gclient hooks using vpython now.

I tripped over this today because I wanted to switch some tools over to vpython but I discovered that one of them was being used in the build and hence triggered this case :(.

-- Dirk





--
You received this message because you are subscribed to the Google Groups "build" group.
To unsubscribe from this group and stop receiving emails from it, send an email to build+un...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/build/CAEoffTBsAeOJHcDwrX6dO2JuPOWLZkywb%2BCfMXTJ25T-U5F79w%40mail.gmail.com.

Takuto Ikuta

unread,
Feb 24, 2023, 4:02:31 AM2/24/23
to Fumitoshi Ukai (鵜飼文敏), Dirk Pranke, build, Brian Ryner, Chenlin Fan, Prakhar Asthana, Gary Tong
I think option 2 is fine. IIRC, we already have a script invoking vpython in chromium build.

By the way, I'm curious what python package do you want to use from build scripts?

Brian Ryner

unread,
Feb 26, 2023, 6:31:38 PM2/26/23
to Fumitoshi Ukai (鵜飼文敏), Dirk Pranke, build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Hi,

Fumitoshi, could you elaborate on vpython3 not being remote-execution friendly? Are you referring to the performance issues, or something else?

Thanks,


--
-Brian

Fumitoshi Ukai (鵜飼文敏)

unread,
Feb 26, 2023, 8:29:57 PM2/26/23
to Brian Ryner, Dirk Pranke, build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
On Mon, Feb 27, 2023 at 8:31 AM Brian Ryner <bry...@google.com> wrote:
Hi,

Fumitoshi, could you elaborate on vpython3 not being remote-execution friendly? Are you referring to the performance issues, or something else?

We need to list up all files needed to run the command.  Maybe, we need to parse .vpython3 spec and script-specific/embedded spec?
Otherwise, vpython3 needs to fetch cipd for each vpython3 invocation (on RBE worker) and not be reused with other vpython3 invocations.  It would be a performance issue.
We also want to disable the network on RBE worker for action harmicity, then we must prepare vpython wheel before executing vpython3 commands and add them in action inputs.

Erik Staab

unread,
Feb 27, 2023, 1:59:05 PM2/27/23
to Fumitoshi Ukai (鵜飼文敏), Brian Ryner, Dirk Pranke, build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
I don't think we should allow vpython in the build for the reason ukai@ mentioned, and not just for remote execution but also for not introducing build flakiness due to added network access at build time.

If all dependencies could be pulled from CIPD or wherever in advance of starting the build I think that would be a better approach and also more aligned with the py_binary() py_test() model in bazel, which I think we should be moving toward anyway.



Dirk Pranke

unread,
Feb 27, 2023, 3:32:23 PM2/27/23
to Erik Staab, Fumitoshi Ukai (鵜飼文敏), Brian Ryner, build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Generally speaking, one is supposed to add one's vpython packages to //.vpython3. We have a gclient hook that fetches everything for that vpython-spec, so that we can be guaranteed to have what we need at test time. We don't have tooling that enforces this and I doubt that it's even a very broadly understood policy, but it hasn't been an issue thus far.

If we tweaked that so that the venv was also in the checkout (I don't think it is today?), RBE should be able to fetch it as easily as anything else. Put differently, if we decide we should allow vpython to be used at build time, we should simply make that work with RBE. An even different way of putting this is that we should be using vpython for as many things as we can rather than checking them in to //third_party.

Perhaps obviously, I was not aware that we were already using vpython in the build. We should not have allowed that, since the policy from thakis@ and I has been to not allow vpython at all during the build. I've commented on crbug.com/1309771 to raise that as an issue with that build step.

But, as shown by this thread, I'm open to changing the policy. I think option 2, where vpython is only used rarely, is probably fine from a performance point of view.

I think thakis@ and I have been the main build OWNERS involved in this discussion, and so I'd be reluctant to try and change the policy without his buy-in. Perhaps we should pause this discussion until he becomes available, though I'd still like to hear the answers from gatong@ and pasthana@.

-- Dirk


Brian Ryner

unread,
Feb 27, 2023, 9:35:04 PM2/27/23
to Dirk Pranke, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
I'd think this could generally work in Bazel by having the venv be an implicit build target, which can be built once ("vpython -vpython-tool install", essentially), cached and then used as an input to other build targets that rely on vpython?

It's not the standard way we recommend, but yes it would also be possible to have the venv constructed at submit time (via a similar method) and committed to git.

--
-Brian

Dirk Pranke

unread,
Feb 28, 2023, 1:23:59 PM2/28/23
to Brian Ryner, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
I'm a bit confused as to why you're bringing up Bazel. Is it because that's how the remote build execution API was mostly targeted at? Or are you thinking we don't want to do anything that can't be done in Bazel?

-- Dirk

Brian Ryner

unread,
Feb 28, 2023, 6:43:24 PM2/28/23
to Dirk Pranke, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
I was following on to @Erik Staab 's comment about Bazel, but I think my reply is generalizable to any kind of remote build.

--
-Brian

Dirk Pranke

unread,
Feb 28, 2023, 7:02:32 PM2/28/23
to Brian Ryner, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Got it. 

I don't think we'd want to download code (the venv packages) during the build itself. We currently download everything we might need for a build during `gclient sync` and a build does not need to talk to the network (ignoring goma which is special). I don't think we'd want to give up that property.

But, yeah, ignoring that, I agree that you could probably do it at build time :).

-- Dirk

Brian Ryner

unread,
Feb 28, 2023, 7:35:53 PM2/28/23
to Dirk Pranke, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Right, ok -- so possibly 'gclient sync' could run vpython to build the venv (once)?


--
-Brian

Dirk Pranke

unread,
Mar 1, 2023, 12:39:07 PM3/1/23
to Brian Ryner, Erik Staab, Fumitoshi Ukai (鵜飼文敏), build, Chenlin Fan, Takuto Ikuta, Prakhar Asthana, Gary Tong
Right. That's what I'm picturing (and what the code already does).

-- Dirk
Reply all
Reply to author
Forward
0 new messages