
Reproducible builds

David Bruant

Jul 17, 2016, 12:38:32 PM
Hi,

Two recent comments on the Linux reproducible build bug thread [1] suggest that the bug has no clear end goal.

In this email, I'll try to describe what I understand of the problem and discuss the outline of a possible end goal.
I felt that the topic spans a wide enough range of responsibilities to post to dev-platform. If there is a better forum, please tell me which one, and sorry for the noise.

# Context

I believe Brendan Eich's post captures the threat fairly well [2]:

> Every major browser today is distributed by an organization within reach of surveillance laws. As the Lavabit case suggests, the government may request that browser vendors secretly inject surveillance code into the browsers they distribute to users. We have no information that any browser vendor has ever received such a directive. However, if that were to happen, the public would likely not find out due to gag orders.

# End goal

Brendan Eich suggests:

> Through international collaboration of independent entities we can give users the confidence that Firefox cannot be subverted without the world noticing, and offer a browser that verifiably meets users’ privacy expectations.

The end goal would be that someone (the EFF or an equivalent organization, security researchers, etc.) downloads a build from https://www.mozilla.org/en-US/firefox/all/ and can either build a bit-identical copy from source, or build a version that allows them to verify that the downloaded build has not been altered with a backdoor or anything else.

The verification could be assisted by a tool that looks at the various files and verifies that the content of the important files is bit-identical.


# Answers to some points

https://bugzilla.mozilla.org/show_bug.cgi?id=885777#c21

> 1) The NSS .chk files are always different (see #c8).
> 2) Timestamps of the files inside the .tar.bz2 package will differ, but untarring them and using a recursive diff will reveal no differences (except for the aforementioned .chk files)

The second point sort of solves them both. As part of making things verifiable, Mozilla could publish a program that does a byte-by-byte comparison of only the files that matter after unpacking the archive. If they're not that important, .chk files could be ignored (blacklisted from the comparison). Same for file timestamps.
That would be acceptable IMHO since a backdoor cannot be hidden in .chk files or file timestamps (right?).

That could be called "comparable builds" and seems more reachable than actual bit-for-bit equality.
This shifts the problem a bit, because another program now verifies Firefox. However, this verifying program is a combination of archive extraction + directory traversal + byte comparison, and seems simple enough to audit that tampering with it would be hard to hide.
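
To give a rough idea of how small such a comparator could be, here is a minimal sketch in Python. It assumes both archives have already been unpacked into two directories. The ignore list (`IGNORED_PATTERNS`) and the function names are hypothetical, and comparing SHA-256 digests stands in for a raw byte-by-byte comparison. File timestamps are ignored simply because only file contents are read.

```python
import hashlib
import os
from fnmatch import fnmatch

# Hypothetical ignore list; which files are safe to skip is an assumption.
IGNORED_PATTERNS = ["*.chk"]


def file_digests(root, ignored=IGNORED_PATTERNS):
    """Map each relative file path under root to the SHA-256 of its bytes."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch(name, pattern) for pattern in ignored):
                continue
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as handle:
                digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare_trees(official_root, rebuilt_root):
    """Return sorted relative paths that are missing on one side or differ."""
    official = file_digests(official_root)
    rebuilt = file_digests(rebuilt_root)
    return sorted(
        rel
        for rel in official.keys() | rebuilt.keys()
        if official.get(rel) != rebuilt.get(rel)
    )
```

An empty result from `compare_trees("official/", "rebuilt/")` would mean the two trees match, modulo the ignored files.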

Out of curiosity, how has the Tor team handled points 1 and 2?


https://bugzilla.mozilla.org/show_bug.cgi?id=885777#c22

> Docker images are notoriously not very reproducible (because `yum update/install` installs the latest version of packages advertised on servers and that can change over time).

Does this affect a bit-for-bit file comparison between what can be downloaded from https://www.mozilla.org/en-US/firefox/all/ and what can be built from source?

> However, paranoid people will want to reproduce those independently. It's turtles all the way down of course. The question is how far do we want to go.

In my opinion, enabling independent organizations to determine whether what can be downloaded from https://www.mozilla.org/en-US/firefox/all/ has been altered would be an amazing first milestone.
I wouldn't worry too much about "paranoid people" for now.

> We /could/ publish the Docker images Mozilla uses (they are probably already public for all I know).

Publishing the images used by Mozilla would probably be enough for now IMHO. People can always audit the image by traversing the image file system to see whether they find something fishy.

Does a comparable build seem like a good end goal?

David

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=885777#c21
[2] https://brendaneich.com/2014/01/trust-but-verify/

Kurt Roeckx

Jul 18, 2016, 4:27:15 AM
On 2016-07-17 18:38, David Bruant wrote:
>
> 2) Timestamps of the files inside the .tar.bz2 package will differ, but untarring them and using a recursive diff will reveal no differences (except for the aforementioned .chk files)
>
> The second point sort of solves them both. As part of making things verifiable, Mozilla could publish a program that makes byte by byte comparison only on files that matter after unzip. If they're not that important, .chk files could be ignored (blacklisted from the comparison). Same for file timestamps.
> That would be acceptable IMHO since a backdoor cannot be hidden in .chk files or file timestamps (right?).
>
> That could be called "comparable builds" and seems closer to something reachable than actually bit-equality.
> This shifts the problem a bit because another program verifies Firefox. However, this verifying program is a combination of gunzip + directory traversal + bit comparison and seems simple enough that it cannot be the target of being alterable.
>
> Out of curiosity, how has the TOR team handled points 1 and 2?

In Debian we have a similar problem with the timestamps in .deb files.
We're actually making the .deb files bit identical. See
https://reproducible-builds.org/specs/source-date-epoch/ for how that's
done.

A different problem you might have is that the order in .tar.bz2 files
might not be identical, see:
https://wiki.debian.org/ReproducibleBuilds/FileOrderInTarballs
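
As an illustration of both points (pinned timestamps and a fixed file order), here is a minimal sketch using Python's standard `tarfile` module. It is not how dpkg or the Firefox packaging does it. The function name `deterministic_tar` and the fallback epoch of 0 are assumptions; `SOURCE_DATE_EPOCH` is the variable from the spec linked above. File modes are taken from the filesystem as-is, so they would still need to match between builds.

```python
import os
import tarfile


def deterministic_tar(source_dir, output_path, default_epoch=0):
    """Write source_dir to a .tar.bz2 with sorted entries and pinned metadata."""
    # Take the timestamp from SOURCE_DATE_EPOCH when it is set.
    mtime = int(os.environ.get("SOURCE_DATE_EPOCH", default_epoch))

    def normalize(info):
        # Replace per-machine metadata with fixed values.
        info.mtime = mtime
        info.uid = info.gid = 0
        info.uname = info.gname = ""
        return info

    entries = []
    for dirpath, dirnames, filenames in os.walk(source_dir):
        for name in dirnames + filenames:
            entries.append(os.path.join(dirpath, name))
    entries.sort()  # fixed order, independent of the filesystem's listing order

    with tarfile.open(output_path, "w:bz2") as archive:
        for path in entries:
            arcname = os.path.relpath(path, source_dir)
            archive.add(path, arcname=arcname, recursive=False, filter=normalize)
```

Running this twice over the same tree with the same `SOURCE_DATE_EPOCH` should produce byte-identical archives, even if the on-disk timestamps changed in between.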

Note that Firefox in Debian can be built reproducibly; I'm just not sure
what all is needed for that.


Kurt

Gregory Szorc

Jul 18, 2016, 2:57:12 PM
to David Bruant, dev-platform
We already have deterministic packaging in some parts of Firefox (notably
most XPIs and omni.ja files). We've done this by implementing our own
jar/zip archiving layer (
https://dxr.mozilla.org/mozilla-central/source/python/mozbuild/mozpack/mozjar.py)
which pins times, sorts files before writing, etc. We just haven't applied
this to all parts of packaging yet. We know what we have to do here.
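
For readers unfamiliar with the idea, here is a minimal sketch of "pin times, sort files before writing" using Python's standard `zipfile` module. It is not the mozjar.py implementation. The function name, the pinned date, and the fixed permission bits are all assumptions made for illustration.

```python
import os
import zipfile

# Arbitrary pinned timestamp (an assumption); ZIP cannot store dates before 1980.
PINNED_DATE_TIME = (2010, 1, 1, 0, 0, 0)


def deterministic_zip(source_dir, output_path):
    """Zip source_dir with sorted entry names and one pinned timestamp."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        for name in filenames:
            paths.append(os.path.join(dirpath, name))

    with zipfile.ZipFile(output_path, "w") as archive:
        for path in sorted(paths):
            arcname = os.path.relpath(path, source_dir).replace(os.sep, "/")
            info = zipfile.ZipInfo(arcname, date_time=PINNED_DATE_TIME)
            info.compress_type = zipfile.ZIP_DEFLATED
            info.external_attr = 0o644 << 16  # fixed permissions for every entry
            with open(path, "rb") as handle:
                archive.writestr(info, handle.read())
```

Building the archive through explicit `ZipInfo` objects keeps the on-disk modification times and permissions out of the output entirely.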


>
>
> https://bugzilla.mozilla.org/show_bug.cgi?id=885777#c22
>
> > Docker images are notoriously not very reproducible (because `yum
> update/install` installs the latest version of packages advertised on
> servers and that can change over time).
>
> Does this affect file bit-to-bit comparison between what can be downloaded
> from https://www.mozilla.org/en-US/firefox/all/ and what can be built
> from source?
>
> > However, paranoid people will want to reproduce those independently.
> It's turtles all the way down of course. The question is how far do we want
> to go.
>
> In my opinion, enabling independent organization to point out whether what
> can be downloaded from https://www.mozilla.org/en-US/firefox/all/ has
> been altered would be an amazing first milestone.
> I wouldn't worry too much about "paranoid people" for now.
>
> > We /could/ publish the Docker images Mozilla uses (they are probably
> already public for all I know).
>
> Publishing the images used by Mozilla would probably be enough for now
> IMHO. People can always audit the image by traversing the image file system
> to see whether they find something fishy.
>
> Does a comparable build seem like a good end goal?
>

A significant obstacle to even comparable builds is "private" data embedded
within Firefox. e.g. Google API Keys. I /think/ we're also shipping some
DRM blobs. Then of course there is build signing, which takes a private key
and cryptographically signs builds/installers. With these in play, there is
no way for anybody not Mozilla to do a bit-for-bit reproduction of most
(all?) of the Firefox distributions at
https://www.mozilla.org/en-US/firefox/all/. The best we can do is ask you
to extract the packaged files and compare them - modulo pieces
like the Google API Key - to what a 3rd party entity has produced.
Unfortunately, I'm not sure that will be trivial, as I believe these
private blobs of data are embedded within libxul. So your comparison tool
would have to know how to read library headers and possibly even assembly
code. At some point, the ability to audit a Firefox distribution is
undermined enough that a security professional may not feel comfortable
saying it looks good.

So when I asked what the end goal is, I'm really asking non-Mozilla groups
what an acceptable level of reproducibility is. While I agree
deterministic, reproducible builds have a number of positive traits
(including caching for the build system to make builds faster!), I see
enough obstacles for satisfying the intent of security-minded groups that I
question whether it is worth doing for those groups alone. Right now, Tor
and Debian can both build Firefox in a mostly reproducible manner. And our
policy on the build system is that any patches they write to achieve that
goal will (likely) be taken upstream. I just don't know if it is worth our
effort to work on additional reproducibility efforts given the limitations
in reproducing what *Mozilla* actually ships.

Chris Peterson

Jul 18, 2016, 3:05:03 PM
On 7/18/16 11:56 AM, Gregory Szorc wrote:
> A significant obstacle to even comparable builds is "private" data embedded
> within Firefox. e.g. Google API Keys. I /think/ we're also shipping some
> DRM blobs.

Mozilla does not ship any DRM blobs with Firefox. The Adobe Primetime
and Google Widevine CDMs (DRM DLLs) are downloaded from Adobe and Google
servers on Firefox first-run. Similarly, Cisco's OpenH264 codec is
downloaded from a Cisco server on Firefox first-run.

Bobby Holley

Jul 18, 2016, 3:07:36 PM
to Chris Peterson, dev-pl...@lists.mozilla.org
Yes. Moreover, they are sandboxed at runtime. So modulo bugs in the
sandboxing layer, we can treat those blobs as adversarial and the integrity
of Firefox shouldn't depend on the integrity of those blobs.

David Bruant

Jul 18, 2016, 4:18:15 PM
Le lundi 18 juillet 2016 20:57:12 UTC+2, Gregory Szorc a écrit :
> On Sun, Jul 17, 2016 at 9:38 AM, David Bruant <brua...@gmail.com> wrote:
>
> We already have deterministic packaging in some parts of Firefox (notably
> most XPIs and omni.ja files). We've done this by implementing our own
> jar/zip archiving layer (
> https://dxr.mozilla.org/mozilla-central/source/python/mozbuild/mozpack/mozjar.py)
> which pins times, sorts files before writing, etc. We just haven't applied
> this to all parts of packaging yet. We know what we have to do here.

Out of curiosity, do you know off the top of your head a bug number tracking this work?


> A significant obstacle to even comparable builds is "private" data embedded
> within Firefox. e.g. Google API Keys. I /think/ we're also shipping some
> DRM blobs. Then of course there is build signing, which takes a private key
> and cryptographically signs builds/installers. With these in play, there is
> no way for anybody not Mozilla to do a bit-for-bit reproduction of most
> (all?) of the Firefox distributions at
> https://www.mozilla.org/en-US/firefox/all/. The best we can do is ask you
> to compare the extracted/packaged files and compare them - modulo pieces
> like the Google API Key - to what a 3rd party entity has produced.
> Unfortunately, I'm not sure that will be trivial, as I believe these
> private blobs of data are embedded within libxul. So your comparison tool
> would have to know how to read library headers and possibly even assembly
> code. At some point, the ability to audit a Firefox distribution is
> undermined enough that a security professional may not feel comfortable
> saying it looks good.

Blah, anything that's more than unzip + file traversal (with blacklist) + byte comparison seems too complicated to audit to be worth it.

I'm delighted to read the followup answers explaining some things are downloaded on Firefox first run.
For the private data, I'm tempted to ask whether these could be in a separate file (which the comparator could safely ignore) and loaded dynamically, but I guess there is a trade-off against Mozilla's wish to keep them "private".

In any case, thank you for your answer!

David

Ehsan Akhgari

Jul 18, 2016, 5:38:18 PM
to Gregory Szorc, David Bruant, dev-platform
On 2016-07-18 2:56 PM, Gregory Szorc wrote:
> A significant obstacle to even comparable builds is "private" data embedded
> within Firefox. e.g. Google API Keys. I /think/ we're also shipping some
> DRM blobs. Then of course there is build signing, which takes a private key
> and cryptographically signs builds/installers. With these in play, there is
> no way for anybody not Mozilla to do a bit-for-bit reproduction of most
> (all?) of the Firefox distributions at
> https://www.mozilla.org/en-US/firefox/all/. The best we can do is ask you
> to compare the extracted/packaged files and compare them - modulo pieces
> like the Google API Key - to what a 3rd party entity has produced.
> Unfortunately, I'm not sure that will be trivial, as I believe these
> private blobs of data are embedded within libxul. So your comparison tool
> would have to know how to read library headers and possibly even assembly
> code. At some point, the ability to audit a Firefox distribution is
> undermined enough that a security professional may not feel comfortable
> saying it looks good.

These API keys are all written into nsURLFormatter.js:
<http://searchfox.org/mozilla-central/source/toolkit/components/urlformatter/nsURLFormatter.js#117>.
AFAIK none of these keys are written into libxul.

But at any rate, since these keys are not secrets (as we distribute them
inside the builds!) you can always pass the identical keys in to the
build system and should be able to get a bit-identical build out, right?

Mike Hommey

Jul 18, 2016, 6:50:16 PM
to David Bruant, dev-pl...@lists.mozilla.org
On Sun, Jul 17, 2016 at 09:38:31AM -0700, David Bruant wrote:
> Out of curiosity, how has the TOR team handled points 1 and 2?

I cannot answer for Tor, but I can answer for Debian, which also does
reproducible builds of Firefox.

1) is not addressed at all, and while the Firefox package is marked as
being reproducible, it's only because the chk files are not in the
Firefox package, but in the NSS package, which is separate, and is not
reproducible because of the .chk files.

2) Debian doesn't ship .tar.bz2 files, but .deb files, and the tools
that create those files deal with the reproducibility.

That being said, the packages that do reach Debian users are *not*
currently reproducible. Many of the required tools to make it happen are
not used to build normal packages. They are only used in a separate CI
that does two builds with a special toolchain and checks the results
are identical. (At least, that's my understanding of the current status)

Also note that Debian builds are not PGOed (as is the case with most
distros afaik), so that leaves that problem out of the equation.

For what it's worth, Debian uses a recursive comparison tool to check
for the differences: https://diffoscope.org/

I've actually used that tool to compare our (Mozilla's) Firefox builds
on buildbot vs. the same builds on taskcluster a few months ago. That
allowed us to find a bunch of differences in the build environments, which
were subsequently fixed (or ignored when appropriate). (And yes, the
tool can show differences in binary files. It allowed me to identify the
missing API keys that Greg was talking about, but reading the diff can
be tedious.)

Mike

Justin Dolske

Jul 18, 2016, 6:50:50 PM
to David Bruant, dev-pl...@lists.mozilla.org
On Sun, Jul 17, 2016 at 9:38 AM, David Bruant <brua...@gmail.com> wrote:

>
> The second point sort of solves them both. As part of making things
> verifiable, Mozilla could publish a program that makes byte by byte
> comparison only on files that matter after unzip. If they're not that
> important, .chk files could be ignored (blacklisted from the comparison).
> Same for file timestamps.
> That would be acceptable IMHO since a backdoor cannot be hidden in .chk
> files or file timestamps (right?).
>

It's not unreasonable, but I'd be wary of having to have an asterisk with
caveats explaining that you should trust us that the non-reproducible bits
don't actually matter. Reproducibility shouldn't depend on having to do a
code audit to understand the impact of excluded things.

That said, my understanding of .CHK files is that they're just library
checksums required for FIPS140 certification (iirc intended to guard
against accidentally corrupted code emitting broken crypto). I think we
generally no longer care about FIPS certification of Firefox, and so should
consider just nuking this stuff in Firefox. We've certainly talked about
doing this before, because it's caused pain in other cases. (Judging from
1181814 NSS itself still cares about this for use in other products.)

Justin

Mike Hommey

Jul 18, 2016, 7:30:19 PM
to Justin Dolske, David Bruant, dev-pl...@lists.mozilla.org
On Mon, Jul 18, 2016 at 03:50:38PM -0700, Justin Dolske wrote:
> On Sun, Jul 17, 2016 at 9:38 AM, David Bruant <brua...@gmail.com> wrote:
>
> >
> > The second point sort of solves them both. As part of making things
> > verifiable, Mozilla could publish a program that makes byte by byte
> > comparison only on files that matter after unzip. If they're not that
> > important, .chk files could be ignored (blacklisted from the comparison).
> > Same for file timestamps.
> > That would be acceptable IMHO since a backdoor cannot be hidden in .chk
> > files or file timestamps (right?).
> >
>
> It's not unreasonable, but I'd be wary of having to have an asterisk with
> caveats explaining that you should trust us that the non-reproducible bits
> don't actually matter. Reproducibility shouldn't depend on having to do a
> code audit to understand the impact of excluded things.
>
> That said, my understanding of .CHK files is that they're just library
> checksums required for FIPS140 certification (iirc intended to guard
> against accidentally corrupted code emitting broken crypto). I think we
> generally no longer care about FIPS certification of Firefox, and so should
> consider just nuking this stuff in Firefox. We've certainly talked about
> doing this before, because it's caused pain in other cases. (Judging from
> 1181814 NSS itself still cares about this for use in other products.)

AFAIK, the issue is that for people who do care about FIPS
certification, the .CHK files provided along with Firefox are the wrong
thing to begin with. What people who do care about FIPS certification
really need is to download the certified NSS FIPS binaries and *replace*
Firefox's softoken with the one from there.

Because, even if you can enable FIPS mode in Firefox, the softoken
shipped with Firefox is not FIPS certified.

Mike

Kurt Roeckx

Jul 19, 2016, 4:08:00 AM
On 2016-07-19 00:49, Mike Hommey wrote:
> On Sun, Jul 17, 2016 at 09:38:31AM -0700, David Bruant wrote:
>> Out of curiosity, how has the TOR team handled points 1 and 2?
>
> I cannot answer for TOR, but I can answer for Debian, who also does
> reproducible builds of Firefox.
>
> 1) is not addressed at all, and while the Firefox package is marked as
> being reproducible, it's only because the chk files are not in the
> Firefox package, but in the NSS package, which is separate, and is not
> reproducible because of the .chk files.
>
> 2) Debian doesn't ship .tar.bz2 files, but .deb files, and the tools
> that create those files deal with the reproducibility.
>
> That being said, the packages that do reach Debian users are *not*
> currently reproducible. Many of the required tools to make it happen are
> not used to build normal packages. They are only used in a separate CI
> that does two builds with a special toolchain and checks the results
> are identical. (At least, that's my understanding of the current status)

It is at least the intention that all those toolchain changes end up in
Debian itself and that packages can be built reproducibly in Debian
itself. I know that at least dpkg recently added support for
SOURCE_DATE_EPOCH, so we're making progress; I just don't know what the
current state of everything is.

There was a talk at DebConf about it; I haven't had time to watch it yet:
http://meetings-archive.debian.net/pub/debian-meetings/2016/debconf16/Reproducible_Builds_status_update.webm


Kurt

Kurt Roeckx

Jul 19, 2016, 4:11:23 AM
On 2016-07-18 20:56, Gregory Szorc wrote:
>
> Then of course there is build signing, which takes a private key
> and cryptographically signs builds/installers. With these in play, there is
> no way for anybody not Mozilla to do a bit-for-bit reproduction of most
> (all?) of the Firefox distributions at
> https://www.mozilla.org/en-US/firefox/all/.

There is at least a section about this here:
https://reproducible-builds.org/docs/embedded-signatures/


Kurt

Chris AtLee

Jul 19, 2016, 11:23:25 AM
to dev-platform
Regarding timestamps in tarballs, using tar's --mtime option to force
timestamps to MOZ_BUILD_DATE (or a derivative thereof) could work.
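
One possible "derivative" is sketched below in Python. It assumes MOZ_BUILD_DATE is a 14-digit `YYYYMMDDHHMMSS` string and that it should be read as UTC; both are assumptions, not something stated in this thread. The helper names are hypothetical. The result is formatted in the `@SECONDS` form that GNU tar accepts for `--mtime`.

```python
from datetime import datetime, timezone


def build_date_to_epoch(build_date):
    """Convert a YYYYMMDDHHMMSS string to seconds since the Unix epoch (UTC)."""
    parsed = datetime.strptime(build_date, "%Y%m%d%H%M%S")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


def tar_mtime_argument(build_date):
    """Format the value as GNU tar's --mtime=@SECONDS form."""
    return "--mtime=@{}".format(build_date_to_epoch(build_date))
```

For example, `tar_mtime_argument("20160718000000")` yields `--mtime=@1468800000`.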

Georg Koppen

Jul 20, 2016, 8:46:52 AM
to dev-pl...@lists.mozilla.org
Hi,

David Bruant:

[snip]

> Out of curiosity, how has the TOR team handled points 1 and 2?

1) We remove the .chk files before generating the final package.
2) We have deterministic tar/zip wrappers we deploy, e.g.:

https://gitweb.torproject.org/builders/tor-browser-bundle.git/tree/gitian/build-helpers/dtar.sh

Georg

[snip]
