
Windows remote cache issues


Yngve N. Pettersen

May 7, 2024, 5:30:32 AM
to reclien...@chromium.org
Hi,

As just mentioned, we recently deployed a general Reclient system at Vivaldi
for building our Chromium (124) based code, and we also updated an existing
one. I am noticing some oddities, though, especially on Windows. (Windows is
cross-compiled on Linux.)


What I seem to be seeing, especially on Windows, is that the remote cache
does not map from checkout to checkout, even on the same computer, much
less across different computers, and such mapping is IMO much of the point
of remote caching and compilation. So, aside from compile assistance,
reclient does not seem to help when building a different checkout from
scratch.

I conducted a couple of tests of building with reclient in an updated
checkout on Windows that had not been built using Reclient before; a second
checkout had been built fully the day before (that was my test environment,
on a different drive, but changes were minimal from that state). Despite
the cache having been populated by that earlier build, the test build
(which was only using -j 32, since I wanted to simulate the other, existing
system) took 160 minutes when building from scratch. Once the cache had
been seeded, a rebuild took 15 minutes for 80K items.

A second test in a third work dir had the same timing results.

To some extent it seems like some of the cache is used, but not all of it
(if it had been, the third dir should have built in 15 minutes the first
time, not 160 minutes).

At the very least, on the (dev) Linux x64 builders (which are using a
remote-cache-only system) the full cache seems to be used. That also seems
to be the case with the Android builders, but not with the Linux Arm64
builders, although that may have improved.

The dev Windows Arm64 builder also seems to be using the cache, but it is
also only running on a single worker.

Thinking about it, I see some potential reasons:

- build path
- environment or GN variables
- some kind of variability in the sequence of command parameters in
ninja that differs between paths but remains deterministic within a given
path (an unlikely possibility IMO)
- some differences in central generated include files (the mojom files
strike me as a possible candidate), e.g. content that changes based on
the absolute path location
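
To make the build-path hypothesis concrete, here is a minimal sketch, assuming a cache key derived from the command line and the input file digests. It is a toy model, not reclient's actual digest computation; the function names and the example paths are hypothetical. It only shows that any absolute checkout path leaking into the hashed command produces a different key per checkout, while a relativized command does not.

```python
import hashlib


def toy_action_key(args, input_digests):
    """Toy stand-in for a remote-cache key (NOT reclient's real algorithm).

    Hashes the command-line arguments plus a mapping of
    relative input path -> content digest.
    """
    h = hashlib.sha256()
    for arg in args:
        h.update(arg.encode("utf-8"))
        h.update(b"\0")  # separator so ["ab", "c"] != ["a", "bc"]
    for path in sorted(input_digests):
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(input_digests[path].encode("utf-8"))
        h.update(b"\0")
    return h.hexdigest()


def relativize(args, checkout_root):
    """Replace the absolute checkout root with '.' in every argument."""
    return [arg.replace(checkout_root, ".") for arg in args]


if __name__ == "__main__":
    # Hypothetical checkout locations and a made-up compile command.
    root_a, root_b = "C:/src/checkout_a", "D:/work/checkout_b"
    inputs = {"base/foo.cc": "digest-of-foo"}  # identical sources in both
    cmd_a = ["clang-cl", "-I" + root_a + "/gen", "-c", root_a + "/base/foo.cc"]
    cmd_b = ["clang-cl", "-I" + root_b + "/gen", "-c", root_b + "/base/foo.cc"]

    # Absolute paths in the command: the keys differ between checkouts.
    print(toy_action_key(cmd_a, inputs) == toy_action_key(cmd_b, inputs))
    # Relativized commands: the keys match.
    print(toy_action_key(relativize(cmd_a, root_a), inputs)
          == toy_action_key(relativize(cmd_b, root_b), inputs))
```

If the real key calculation behaves like the first case for any argument or input, that alone would explain a cache that works within one checkout but not across checkouts.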

I have not yet done a deep dive to figure out why this is happening (I will
actually need to use my reclient test env to do that, and the next couple
of weeks will be spent updating our code base to Chromium 126), but I did
notice that the only files containing the absolute path are the ninja
files, which have the full path in the exec_root parameter to the rewrapper
and in the PDB path in the LD flags. Neither of these is (or should be)
included in the calculation of the CAS addresses IMO (I don't know if they
are). The .h files seemed clean (and a sha256sum of header files didn't
turn up any active files as mismatches).
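
The check described above (hashing headers in two checkouts and looking for absolute paths) could be scripted roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, the glob patterns are only defaults, and it compares files by relative path under two directory roots that you supply.

```python
import hashlib
from pathlib import Path


def hash_tree(root, pattern="*.h"):
    """Map relative path -> SHA-256 hex digest for files under root."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_trees(root_a, root_b, pattern="*.h"):
    """Report files whose content differs or that exist on one side only."""
    a = hash_tree(root_a, pattern)
    b = hash_tree(root_b, pattern)
    common = a.keys() & b.keys()
    return {
        "mismatched": sorted(p for p in common if a[p] != b[p]),
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
    }


def files_mentioning(root, needle, pattern="*.ninja"):
    """List files under root whose bytes contain needle (e.g. the checkout path)."""
    root = Path(root)
    needle_bytes = needle.encode("utf-8")
    hits = []
    for path in sorted(root.rglob(pattern)):
        if path.is_file() and needle_bytes in path.read_bytes():
            hits.append(path.relative_to(root).as_posix())
    return hits
```

Running `diff_trees` on the generated-file directories of two checkouts, and `files_mentioning` with each checkout's absolute root, would narrow down which files can differ purely because of location. On Windows the path may appear with either slash direction, so both spellings are worth searching for.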

Do you have any ideas about what is going on?

--
Sincerely,
Yngve N. Pettersen
Vivaldi Technologies AS