What does it mean when an action is reported as "local, remote-cache"?


Konstantin

Jul 21, 2021, 12:11:49 AM
to bazel-discuss
During build execution Bazel prints progress messages on the console, which look like this, for instance:
    Compiling modules/platform/tabdatacollections/main/db/BinaryTupleReader.cpp; 12s local, remote-cache

I wonder what exactly is the meaning of "local, remote-cache" added to each build progress message. 

I understand that it must be connected to where (or whether) the action is executed, but I struggle to grasp the exact meaning. For instance, I see exactly the same "local, remote-cache" both when the action is actually executed (clean build with no cache) and when the build is fully cached and the output is populated from the cache. That confuses me; I expected to see different things.

Could somebody please shed some light on it?

Thank you!
Konstantin

Lars Clausen

Jul 21, 2021, 12:53:53 PM
to Konstantin, bazel-discuss
That part lists the strategies used. When you use dynamic execution, there's one strategy used locally ('local') and one used remotely ('remote-cache'), and the first to finish cancels the other one. You'll see them both in that message regardless of how cached the build is. Once one of them finishes, you don't see the message any more, so you never see which one actually won.

-Lars



Konstantin

Jul 21, 2021, 2:08:42 PM
to bazel-discuss
Hey Lars, thank you for your response! It begins to make sense now.
I know about dynamic execution, but I was under the impression that it must be turned on explicitly. I did not know it is now on by default.

While from the description dynamic execution looks like the best of both worlds (or strategies) I wonder if it may have some negative perf consequences. 
Here is our specific scenario: a large C++ build with thousands of targets, a single build machine, and a "local remote" cache (AKA --disk_cache) on the local SSD.

From a clean state the build takes 40 minutes to finish. At the end it shows INFO: 11477 processes: 1586 internal, 9891 local.
Then I run "bazel clean" to wipe the output folder and build again without any changes, expecting it to just copy everything over from --disk_cache.
At the end it reports INFO: 11744 processes: 10032 remote cache hit, 1712 internal. (numbers are approximate), which seems to say that it did indeed copy everything from the cache, except for the "internal" actions; I don't know what those are.
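
The two summary lines quoted above are easier to compare once they are split into counts. Here is a minimal sketch, assuming the line always has the shape "N processes: <count> <label>, ..." as in the messages above; `parse_process_summary` is a hypothetical helper written for this post, not something Bazel provides.

```python
import re


def parse_process_summary(line):
    """Split a 'N processes: a x, b y.' summary into (total, {label: count}).

    Hypothetical helper: the format is inferred only from the two INFO
    lines quoted in this thread, not from any Bazel specification.
    """
    match = re.search(r"(\d+) processes: (.+?)\.?\s*$", line)
    if match is None:
        raise ValueError("not a process summary line: %r" % line)
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        # Each part looks like '10032 remote cache hit': a number, then a label.
        number, label = part.strip().split(" ", 1)
        counts[label] = int(number)
    return total, counts


if __name__ == "__main__":
    clean = "INFO: 11477 processes: 1586 internal, 9891 local."
    cached = "INFO: 11744 processes: 10032 remote cache hit, 1712 internal."
    for line in (clean, cached):
        total, counts = parse_process_summary(line)
        print(total, counts)
```

In the second line the counts for "remote cache hit" and "internal" add up to the total, so by this summary no action was actually executed in the cached build.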
THE PROBLEM: the second build, which only copies files from one SSD to another, still takes about 10 minutes! I tried copying the same volume of data manually and it took 30 seconds. Also, during the fully cached build the CPU is at 100% all the time, which can probably be explained by dynamic execution, but I don't see any compiler processes running.
It does not feel right that the fully cached build takes that long and I am looking for the culprit.
Could dynamic execution be at fault? How do I turn it off for the experiment?

Also, it is kind of a pity that Bazel treats --disk_cache as a remote cache when physically it is local. Dynamic execution makes a lot of sense when distributed execution is enabled, and probably when a REAL remote cache is in play, but a local cache would be faster in 99.9% of cases, so there is no point in incurring the dynamic execution overhead.

Could you shed some light please?

Konstantin

Jared Neil

Jul 21, 2021, 9:10:13 PM
to bazel-discuss
There is a chance the disk cache perf issues you are experiencing are similar to those described in this issue: https://github.com/bazelbuild/bazel/issues/10875
When there are lots of action inputs, simply calculating the cache keys can become the bottleneck.

Konstantin

Jul 21, 2021, 10:35:04 PM
to bazel-discuss
Jared, first of all, thank you for pointing me at that discussion! I was not aware of it and indeed it looks relevant. 

-- The average duration for MerkleTree.build went from 94ms to 0.15ms (625x speedup) and getInputMapping went from 104ms to 0.037ms (2810x speedup).

I wonder how you got those averages? When I look at the profile loaded into Chrome I can see the timing for each individual event, but how do you get the averages for all events of the same type?
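
One way to get such per-event-type averages outside Chrome is to aggregate the profile file directly. This is a sketch under assumptions, not necessarily how the numbers above were produced: it assumes the profile was written with `--profile=<file>` as JSON in the Chrome trace event format, where complete events have `"ph": "X"` and a duration `"dur"` in microseconds. The function names `load_profile` and `average_durations` are made up for illustration.

```python
import gzip
import json
from collections import defaultdict


def load_profile(path):
    """Load a JSON trace profile; a '.gz' suffix is assumed to mean gzip."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def average_durations(trace):
    """Return {event name: (count, mean duration in ms)} for complete events."""
    # The trace may be an object with a 'traceEvents' list or a bare list.
    events = trace["traceEvents"] if isinstance(trace, dict) else trace
    totals = defaultdict(lambda: [0, 0.0])  # name -> [count, total microseconds]
    for event in events:
        # Skip metadata and other event kinds that carry no duration.
        if event.get("ph") != "X" or "dur" not in event:
            continue
        entry = totals[event["name"]]
        entry[0] += 1
        entry[1] += event["dur"]
    return {
        name: (count, total_us / count / 1000.0)
        for name, (count, total_us) in totals.items()
    }


if __name__ == "__main__":
    import sys

    averages = average_durations(load_profile(sys.argv[1]))
    # Print the event types with the largest mean duration first.
    for name, (count, mean_ms) in sorted(
        averages.items(), key=lambda item: item[1][1], reverse=True
    ):
        print(f"{mean_ms:10.3f} ms  x{count:<8d} {name}")
```

Running the script with the profile path as its only argument prints one line per event name; which names appear depends on the Bazel version and on what the build did.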

I like your idea of bundling frequently reused sets of dependencies into single files. We work with C++ (at this time), not Node, so your recipe does not apply directly, but I am thinking about replacing the toolchain files (the compiler and everything that comes with it) with TreeArtifacts to avoid unnecessarily re-hashing them for each compile action. Do you think it is worth trying?

Lars Clausen

Jul 22, 2021, 9:17:50 AM
to Konstantin, bazel-discuss
Dynamic execution is not the default; something you use must be setting it. You can use `--announce_rc` to see what gets set in which bazelrc files. Or you can run with `--bazelrc=/dev/null` to reset your flags.

I don't know if your slowdown is due to this, but if you're using dynamic execution with 'local' as the `--dynamic_local_strategy` and _without_ `--experimental_local_lockfree_output`, local execution may indeed block remote execution and cause a slowdown.

-Lars


Konstantin

Jul 22, 2021, 3:48:20 PM
to bazel-discuss
Lars, I am pretty sure we don't have dynamic execution enabled. I have checked with --announce_rc and no flags like "--experimental_spawn_scheduler" or "--internal_spawn_scheduler" are present.

While looking for that, I discovered that we have "--spawn_strategy=local". We only use the local machine so far. As an experiment I commented out that flag and, to my surprise, the fully cached build now runs much faster! I wonder how that can be explained.

Another thing that still confuses me is that even without "--spawn_strategy=local" I see all actions reporting "local, remote-cache", i.e. exactly the same thing as before. To make it worse, on another machine that uses exactly the same settings the actions report "remote-cache, local", i.e. the same thing but in a different order. I feel lost. :-(

Lars Clausen

Jul 23, 2021, 7:53:31 AM
to Konstantin, bazel-discuss
The order of the output is arbitrary; don't let that worry you.

I didn't realize other Bazel features also use that output form - I have not yet played around with remote caches. I expect that output form indicates that it uses remote cache with fallback to local execution on cache miss.

-Lars

