I needed a way to create reproducible builds, regardless of the dev environment the user has. Luckily, Go provides almost everything needed for that out of the box, and there is a great blog post by Filippo on the topic:
https://blog.filippo.io/reproducing-go-binaries-byte-by-byte. If we have the same Go version and the same set of dependencies (which is easy with the vendor/ approach), the only remaining problem is the absolute path of the working directory. In other words, the same code built in the same dev environment in `GOPATH/src/project1` and in `GOPATH/src/project2` will yield different binaries. There is an open issue for that in Go, and it will hopefully be addressed in Go 1.12 (https://github.com/status-im/status-react/issues/5587).
For now, the easy approach is, of course, to use Docker for the build, but that feels too heavy just to ensure the same directory. Spoofing the directory with LD_PRELOAD hacks or using a `chroot` also has obvious drawbacks: the need for a C toolchain and root access, respectively.
For a quick recap, every Go package or binary is stamped with a buildid value, which is essentially four hashes joined by slashes:
actionID(binary)/actionID(main.a)/contentID(main.a)/contentID(binary)
where:
- actionID means a unique identifier of the inputs (sources, file names, go version, etc)
- contentID means a unique identifier of the outputs (actual content output by compiler/linker)
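To make the layout concrete, here is a minimal Go sketch that splits a buildid string into its four parts. The `BuildID` type, the `ParseBuildID` function, and the `aaaa/bbbb/cccc/dddd` value are hypothetical names and placeholder data of my own, not anything provided by the Go toolchain.

```go
package main

import (
	"fmt"
	"strings"
)

// BuildID holds the four slash-separated parts of a buildid,
// in the order described above. (Hypothetical type for illustration.)
type BuildID struct {
	ActionBinary  string // actionID(binary)
	ActionMain    string // actionID(main.a)
	ContentMain   string // contentID(main.a)
	ContentBinary string // contentID(binary)
}

// ParseBuildID splits a string such as the output of `go tool buildid`
// into its four parts and rejects anything with a different shape.
func ParseBuildID(s string) (BuildID, error) {
	parts := strings.Split(strings.TrimSpace(s), "/")
	if len(parts) != 4 {
		return BuildID{}, fmt.Errorf("expected 4 buildid parts, got %d", len(parts))
	}
	return BuildID{parts[0], parts[1], parts[2], parts[3]}, nil
}

func main() {
	// Placeholder value; a real buildid contains much longer hashes.
	id, err := ParseBuildID("aaaa/bbbb/cccc/dddd\n")
	if err != nil {
		panic(err)
	}
	fmt.Println("contentID(main.a):", id.ContentMain)
	fmt.Println("contentID(binary):", id.ContentBinary)
}
```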
So my thinking went in the following direction: I don't care if the actionID (inputs) is different, but I do care if the contentID (outputs) is different.
If the contentIDs are equal, I can just overwrite the actionID with the "expected" one and get the same binary byte-by-byte. This can be fully automated in a Makefile or script. So the steps for the reproducible build are the following:
- build binary - `go build -ldflags "-s -w" -asmflags=-trimpath="$(pwd)" -gcflags=-trimpath="$(pwd)"`
- extract buildid - `go tool buildid myapp`
- compare the buildid's contentID value to the known one - `diff <(go tool buildid ./myapp | cut -d'/' -f3) <(cut -d'/' -f3 release.buildid.txt)`
- if they're equal, assume that the build is the same, and just rewrite the buildid value inside the binary - `objcopy --update-section .note.go.buildid=release.buildid.bin ./myapp` for ELF
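As a rough idea of how the compare-and-patch steps above could be automated, here is a Go sketch. The helper names (`contentID`, `sameContent`, `run`) and the three-argument command line are my own assumptions; the program only shells out to the same `go tool buildid` and `objcopy` commands listed above, compares the same third field that `cut -f3` selects, and handles ELF only.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// contentID returns the third slash-separated field of a buildid,
// mirroring `cut -d'/' -f3` from the steps above.
func contentID(buildID string) (string, error) {
	parts := strings.Split(strings.TrimSpace(buildID), "/")
	if len(parts) != 4 {
		return "", fmt.Errorf("expected 4 buildid parts, got %d", len(parts))
	}
	return parts[2], nil
}

// sameContent reports whether two buildids share the same contentID field.
func sameContent(built, release string) (bool, error) {
	a, err := contentID(built)
	if err != nil {
		return false, err
	}
	b, err := contentID(release)
	if err != nil {
		return false, err
	}
	return a == b, nil
}

// run extracts the buildid, compares it to the release one, and, only on a
// match, rewrites the note section with objcopy (ELF binaries only).
func run(binary, releaseTxt, releaseBin string) error {
	out, err := exec.Command("go", "tool", "buildid", binary).Output()
	if err != nil {
		return fmt.Errorf("go tool buildid: %w", err)
	}
	want, err := os.ReadFile(releaseTxt)
	if err != nil {
		return err
	}
	ok, err := sameContent(string(out), string(want))
	if err != nil {
		return err
	}
	if !ok {
		return fmt.Errorf("contentID mismatch: refusing to patch %s", binary)
	}
	cmd := exec.Command("objcopy", "--update-section",
		".note.go.buildid="+releaseBin, binary)
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	if len(os.Args) != 4 {
		fmt.Fprintln(os.Stderr, "usage: <binary> <release.buildid.txt> <release.buildid.bin>")
		return
	}
	if err := run(os.Args[1], os.Args[2], os.Args[3]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Refusing to patch on a mismatch is the important part: the rewrite is only safe if the outputs really are the same.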
In my tests, that results in byte-by-byte equal binaries.
I have two concerns with this approach:
1) I might be missing some corner cases, especially when hacking binaries of different formats. What are the perils of patching the binary here?
Any thoughts on that? What else am I missing? Would this be a viable workaround for having reproducible builds until #5587 is solved?