Examples on hermetic x86 cc_toolchains?


Carlos Galvez

Mar 1, 2022, 9:43:12 AM
to bazel-discuss
Hi,

Are there examples or tutorials on how to set up a fully hermetic x86 cc_toolchain? By "fully hermetic" I mean "not using the host compiler under /usr/bin, and instead downloading it via http_archive or similar". It should use the hermetic toolchain both at build time and at runtime, e.g. use the http_archive-provided libstdc++.so instead of the host libstdc++.so.

Thanks!

Herrmann, Andreas

Mar 1, 2022, 9:52:19 AM
to Carlos Galvez, bazel-discuss
Hi Carlos,

One option that may be worth considering is to use the Nix package manager [1] to provide a fully hermetic cc toolchain and import it into Bazel using rules_nixpkgs [2]. The repository contains examples [3,4] that illustrate how to use it.

Best, Andreas


--
You received this message because you are subscribed to the Google Groups "bazel-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bazel-discus...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/bazel-discuss/4a28ee71-bac9-4207-a804-eed4ff2d0c47n%40googlegroups.com.

Brian Silverman

Mar 1, 2022, 3:16:50 PM
to Herrmann, Andreas, Carlos Galvez, bazel-discuss
Another option: https://github.com/grailbio/bazel-toolchain downloads Clang, and will use the bundled static libc++. I've got some changes at https://github.com/frc971/971-Robot-Code/tree/master/third_party/bazel-toolchain which support static or dynamic libc++ or libstdc++ from a sysroot.

Using the downloaded libstdc++.so when running tools on the host as part of the build is tricky. I haven't seen a good solution for that. I typically require the host to have a compatible version installed and just use that; C++ standard library bugs are fairly uncommon and the ABI compatibility ranges are usually workable.

I'm hoping to merge my changes there back to the original when I find time. `third_party/diff-from-upstream.sh third_party/bazel-toolchain/` is an easy way to see the diff (the script is in the repository; you have to `git clone` it for it to work).

Carlos Galvez

Mar 2, 2022, 4:57:31 AM
to bazel-discuss
Thanks for the inputs!

> Using the downloaded libstdc++.so when running tools on the host as part of the build is tricky.

Yes, this is exactly what I'm struggling with at the moment. Basically I want to use GCC trunk, but the hosts are running Ubuntu 18 or 20, so their libstdc++.so might be too old. Docker is an option, but nothing beats the path of least resistance of running binaries directly via ./bazel-bin/.../my_binary. With Docker it's also more cumbersome to update the compiler (which we want to do often to get the latest bug fixes and optimizations).

I've been exploring the "runtime_library_search_directories" field of CcToolchainConfig. The cc_toolchain.bzl file uses that variable and appends it to the RPATH, so it looks like exactly what's needed. BUT I can't find a way to actually set its value! It's a private field, and I don't know how it gets populated.

/Carlos
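
For readers unfamiliar with how such a variable ends up in the RPATH, here is a minimal sketch in Python (not Starlark). It assumes, without confirmation from this thread, that each search directory is expanded into one `-Wl,-rpath,$ORIGIN/<dir>` linker flag. The function name `rpath_flags` and the example directories are hypothetical, and the sketch does not show how Bazel populates the variable.

```python
def rpath_flags(search_dirs, origin="$ORIGIN"):
    """Turn directories (relative to the binary) into -rpath linker flags.

    Assumption: one flag per directory, each anchored at $ORIGIN, which the
    dynamic linker replaces with the directory containing the binary.
    """
    flags = []
    for directory in search_dirs:
        flags.append("-Wl,-rpath,{}/{}".format(origin, directory))
    return flags


if __name__ == "__main__":
    # Hypothetical directories, for illustration only.
    for flag in rpath_flags(["../toolchain/lib64", "lib"]):
        print(flag)
```

The resulting flags would be passed to the compiler driver at link time; the single quotes or escaping needed to keep `$ORIGIN` literal in a shell are left out here.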

Carlos Galvez

Mar 2, 2022, 10:13:17 AM
to bazel-discuss
It's also not just libstdc++. There's also libgcc, libubsan, libasan, gcov, etc., which need to match the compiler at runtime. Those might not be as stable as libstdc++ and could cause trouble if there's any mismatch.
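
One way to see which of these runtime libraries a given binary actually depends on is to inspect its dynamic section. The following is a rough sketch that parses the text output of `readelf -d`; the helper name `parse_dynamic_section` and the sample text are illustrative assumptions, not output from any binary discussed in this thread.

```python
import re

# Matches lines such as:
#  0x0000000000000001 (NEEDED)   Shared library: [libstdc++.so.6]
#  0x000000000000001d (RUNPATH)  Library runpath: [$ORIGIN/../lib]
_DYN_RE = re.compile(r"\((NEEDED|RPATH|RUNPATH)\)\s+.*?\[(.*?)\]")


def parse_dynamic_section(text):
    """Collect NEEDED libraries and RPATH/RUNPATH entries from `readelf -d` text."""
    result = {"NEEDED": [], "RPATH": [], "RUNPATH": []}
    for line in text.splitlines():
        match = _DYN_RE.search(line)
        if not match:
            continue
        tag, value = match.groups()
        if tag == "NEEDED":
            result[tag].append(value)
        else:
            # Search paths are colon-separated inside the brackets.
            result[tag].extend(p for p in value.split(":") if p)
    return result


# Illustrative sample input, written by hand for this sketch.
SAMPLE = """\
 0x0000000000000001 (NEEDED)             Shared library: [libstdc++.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/../lib:$ORIGIN/lib64]
"""

if __name__ == "__main__":
    print(parse_dynamic_section(SAMPLE))
```

In practice the text would come from running `readelf -d ./bazel-bin/.../my_binary`; any NEEDED entry that is not covered by a toolchain-provided search path would be resolved from the host.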




Brandon Adams

Mar 2, 2022, 10:33:30 AM
to Carlos Galvez, bazel-discuss

The way I’ve handled this internally is to create host packages for our compilers, with matching compiler runtime packages that can be installed into a known location. We build inside containers that have these compilers+runtimes pre-installed. This isn’t fully hermetic, but it allows our custom toolchain to RPATH a known location and our production machines to simply depend on our internal compiler runtime packages (installable via our yum repo, since we’re on a CentOS 7 base). This is needed because our compiler and C++ stdlib are much newer than what the host defaults to.

We plan to become fully hermetic by having the toolchain consume our compiler+runtime packages directly: use rpm2cpio+cpio to download and extract our RPMs into a sysroot, then RPATH against those libraries relative to the binary using runfiles propagation. However, our prototyping runs into issues with the runtime dynamic linker refusing to process relative RPATHs when the binary is run as root or has certain capabilities set. One alternative is to statically link (and possibly LTO) our libstdc++, but we haven’t tried it over our full codebase yet. We have some amount of dynamic code loading, so I suspect that we can’t go fully `-static-libstdc++` and will still need compromises. Another possibility is hacking the runtime dynamic linker to ignore the security restrictions, but this seems like a last resort even if our production environments are tightly sealed off from the rest of the world.

I’m curious what others are doing. If you can accept the limitations of statically linking libstdc++, that seems like an obviously easy path to a hermetic build. Maybe other people don’t need to set capabilities on their C++ binaries and can accept the relative RPATH hack (with the corresponding runfiles copying that needs to occur). Has anyone hacked ld.so to be more friendly to capabilities (I’m well aware of the possible exploits in doing so)? Most companies I’ve talked to still rely on a container runtime environment to build their code under remote exec/CI scenarios. Is there something I missed?
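
The relative-RPATH idea described above can be sketched as a path computation: given where the binary lives and where the extracted runtime libraries land in its runfiles tree, derive an `$ORIGIN`-relative entry. This is only an illustration in Python; the function name `origin_relative_rpath` and the example paths (including the `sysroot` directory name) are hypothetical and not taken from any real toolchain.

```python
import posixpath


def origin_relative_rpath(binary_path, lib_dir):
    """Return an $ORIGIN-relative RPATH entry from the binary's directory to lib_dir.

    Both paths are assumed to be relative to the same root (for example the
    bazel-bin output tree).
    """
    binary_dir = posixpath.dirname(binary_path)
    relative = posixpath.relpath(lib_dir, start=binary_dir)
    return "$ORIGIN/" + relative


if __name__ == "__main__":
    # Hypothetical layout: libraries propagated into the binary's runfiles.
    print(origin_relative_rpath(
        "app/my_binary",
        "app/my_binary.runfiles/sysroot/usr/lib64",
    ))
```

Because the entry is resolved relative to the binary at load time, the libraries have to be present in that runfiles location wherever the binary is run, and the restriction mentioned above still applies: the dynamic linker may ignore such entries for binaries with elevated privileges.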


Carlos Galvez

Mar 3, 2022, 8:44:15 AM
to Brandon Adams, bazel-discuss
> rpath against those relative to the binary using runfiles propagation

How can this be achieved in practice? Could you provide an example or some pseudocode?

Thanks!