rules_go: How to reduce downloads of the Go toolchain?


tom....@centralway.com

Aug 28, 2017, 9:59:28 AM
to bazel-discuss
Hi,

rules_go downloads the Go toolchain as part of the build process via repository_ctx.download_and_extract.
For example, on Linux/amd64 for Go 1.8.3 it downloads a tarball that is 86MB in size.

Several developers, including myself, have observed that this download occurs more often than expected. Go 1.8.3 was released three months ago, the tarball has not changed since, and yet we see Bazel re-downloading it fairly often (say, a few times per week). For people on slow network connections, notably our people working remotely via VPN, these re-downloads significantly slow down builds, easily adding several unnecessary minutes to each one. It is not clear what changes trigger a re-download.

Barring "bazel clean", what would cause Bazel to decide that it needs to re-download the tarball and how can we avoid this?

Many thanks,
Tom


Random factoids that might assist diagnosis:
- This is all on macOS and we're generally tracking the latest release of Bazel (0.5.2, 0.5.3, 0.5.4...).
- We're using rules_go at tag 0.5.3.
- The VCS is git, and developers are regularly switching branches, some of which might have different WORKSPACE contents.
- The top-level WORKSPACE file changes approximately weekly, usually in response to changes in rules_go (e.g. to use a newer tag or take advantage of a new feature in rules_go).
- https://bazel.build/designs/2016/10/18/repository-invalidation.html states "To avoid unnecessary re-download of artifacts, a content-addressable cache has been developed for downloads (and thus not discuted here)." However, the re-downloads are clearly occurring.
- We have a local copy of rules_go in our repository so we don't need GitHub to be up when doing a clean build.
- Our WORKSPACE file contains:

local_repository(
    name = "io_bazel_rules_go",
    path = "bazel-rules/github.com/bazelbuild/rules_go",
)

load("@io_bazel_rules_go//go:def.bzl", "go_repositories")

go_repositories(go_version="1.8.3")



Damien Martin-Guillerez

Aug 28, 2017, 10:01:25 AM
to tom....@centralway.com, bazel-discuss
Hi Tom,

You might want to use --experimental_repository_cache to cache those artifacts. The rules might actually re-download the tarball even though its SHA sum has not changed; that is working as intended, and our long-term solution is --experimental_repository_cache.


Ian Cottrell

Aug 28, 2017, 10:10:15 AM
to Damien Martin-Guillerez, tom....@centralway.com, bazel-discuss
As Damien says, use --experimental_repository_cache; we all do.
The SDK repository rules add special build files to the Go SDK after extraction. For this to work correctly, they must effectively be re-run whenever rules_go (or Bazel) changes. They do not, however, need to download again, just re-extract and process. That is where --experimental_repository_cache comes in: it caches the downloaded archive externally so that it can be re-extracted into the workspace.
Alternatives are to use a caching proxy (essentially a harder-to-set-up but more general solution, with no practical difference from that command-line flag) or to change the repository rules to refer to a local copy of the archive (maybe in a network folder or something), but that makes your workspace less general.


tom....@centralway.com

Aug 28, 2017, 12:05:30 PM
to bazel-discuss, dmar...@google.com, tom....@centralway.com
Thanks Damien and Ian. I've added the option and will report back.

For those following at home, the --experimental_repository_cache option requires an argument specifying the absolute path of the cache. It cannot be a relative path.
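
A minimal sketch of how this might be set up, assuming the flag is placed in a per-user ~/.bazelrc so that it applies to every build. The cache directory shown is a hypothetical placeholder, not a path from this thread; it has to be written out as an absolute path:

# ~/.bazelrc
# Hypothetical cache location; replace with an absolute path on your machine.
build --experimental_repository_cache=/Users/yourname/.cache/bazel-repository-cache

The same flag can presumably also be passed directly on the command line, e.g. bazel build --experimental_repository_cache=<absolute path> <targets>.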

Thanks for the quick response,
Tom