(trying to avoid) unpacking before checking signatures


Jean-Philippe Ouellet

Nov 8, 2017, 10:51:47 PM
to qubes-devel
Hello,

The way some things are distributed on kernel.org (e.g. util-linux
[1], cryptsetup [2], etc.) is such that the authors upload .tar and
.tar.sign files, and then the kernel.org infrastructure compresses
those (creating .tar.gz & .tar.xz) and signs all resulting files
(creating sha256sums.asc) using its own key. More info here [3].

Kernel.org does not make the original .tar files available, which
means there is no file available for which a signature directly from
the developers is also available. In order to check the developer's
provided signature, you must first unpack the file. I consider
unpackers to be of sufficient complexity that I would rather not run
them on arbitrary attacker-provided input.

I could of course verify the signature of the auto-generated
sha256sums.asc file which covers all the files (including compressed
ones), but that means trusting kernel.org infrastructure - which was
compromised in 2011 and may well be compromised again in the future...

If I want to follow qubes packaging best practices [4] and ensure that
no untrusted code gets processed (including unpacked) by the builder,
it seems my best option is to manually download the .tar.gz, verify
the kernel.org sig, unpack it (possibly in a DispVM), verify the
developer's sig, and then pin the sha512 of the original file for
qubes-builder's verify-sources.

To be extra sure I can also re-compress and reproduce (almost) the
original .tar.gz file from the verified .tar file with `gzip --no-name
--best`, and then verify that only the 4 bytes for the timestamp [5]
are different.
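As a sanity check of the format detail: the MTIME field is bytes 4-7 of the gzip header (RFC 1952). A small Python sketch of the "only the timestamp bytes differ" comparison, using a stand-in byte string rather than the real tarball and an arbitrary timestamp:

```python
import gzip, io

data = b"stand-in for util-linux-2.31.tar"  # not the real tarball

def gz(mtime):
    """Compress `data`, controlling the 4-byte MTIME header field."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", compresslevel=9, mtime=mtime) as f:
        f.write(data)
    return buf.getvalue()

served = gz(mtime=1234567890)  # arbitrary timestamp, as the served copy has
rebuilt = gz(mtime=0)          # `gzip --no-name` writes MTIME = 0

# Same length, and every differing byte sits in the MTIME field (offsets 4-7)
diff = [i for i, (a, b) in enumerate(zip(served, rebuilt)) if a != b]
assert len(served) == len(rebuilt)
assert diff and all(4 <= i <= 7 for i in diff)
```

The same byte-range check is what `cmp -l` on the two .tar.gz files would show.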

Thoughts?

Regards,
Jean-Philippe

[1]: https://www.kernel.org/pub/linux/utils/util-linux/v2.31/
[2]: https://www.kernel.org/pub/linux/utils/cryptsetup/v1.7/
[3]: https://www.kernel.org/signature.html#kernel-org-checksum-autosigner-and-sha256sums-asc
[4]: https://www.qubes-os.org/news/2016/05/30/build-security/
[5]: http://www.gzip.org/zlib/rfc-gzip.html#file-format

Jean-Philippe Ouellet

Nov 8, 2017, 10:55:51 PM
to qubes-devel
On Wed, Nov 8, 2017 at 10:51 PM, Jean-Philippe Ouellet <j...@vt.edu> wrote:
> Hello,
>
> The way some things are distributed on kernel.org (e.g. util-linux
> [1], cryptsetup [2], etc.) is such that the authors upload .tar and
> .tar.sign files, and then the kernel.org infrastructure compresses
> those (creating .tar.gz & .tar.xz) and signs all resulting files
> (creating sha256sums.asc) using its own key. More info here [3]
>
> Kernel.org does not make the original .tar files available, which
> means there is no file available for which a signature directly from
> the developers is also available. In order to check the developer's
> provided signature, you must first unpack the file. I consider
> unpackers to be of sufficient complexity that I would rather not run
> them on arbitrary attacker-provided input.

Specifically: I would rather not run unpackers on unverified
attacker-controlled input, unsandboxed, in a trusted part of the
builder.

Normally I'm more pragmatic and just don't care. Hooray for DispVMs :)

Chris Laprise

Nov 11, 2017, 5:55:03 PM
to Jean-Philippe Ouellet, qubes-devel
...manually download the .tar.gz, verify kernel.org sig, send to dispVM
for unpacking, qvm-copy unpacked files to parent VM, verify developer's
sig...

>>
>> To be extra sure I can also re-compress and reproduce (almost) the
>> original .tar.gz file from the verified .tar file with `gzip --no-name
>> --best`, and then verify that only the 4 bytes for the timestamp [5]
>> are different.

This part sounds unreliable.

--

Chris Laprise, tas...@posteo.net
https://github.com/tasket
https://twitter.com/ttaskett
PGP: BEE2 20C5 356E 764A 73EB 4AB3 1DC4 D106 F07F 1886

Jean-Philippe Ouellet

Nov 11, 2017, 6:11:54 PM
to Chris Laprise, qubes-devel
Normally I'd agree, however I am not aware of anything else in the
builder (besides the top-level "build full template in DispVM")
depending on the ability to start DispVMs. I suspect this was an
intentional choice to allow building of Qubes packages on non-Qubes
machines (e.g. qubes-builder docs say: "In order to use it one should
use an rpm-based distro, like Fedora :) and should ensure the
following packages are installed:" [1]).

The ability to do so becomes somewhat more relevant in the context of
aiming for deterministic builds in the long run: it may be quite
desirable to build qubes things on non-qubes machines and ensure the
same binaries are produced.

Thoughts?

[1]: https://www.qubes-os.org/doc/qubes-builder/

>>>
>>> To be extra sure I can also re-compress and reproduce (almost) the
>>> original .tar.gz file from the verified .tar file with `gzip --no-name
>>> --best`, and then verify that only the 4 bytes for the timestamp [5]
>>> are different.
>
>
> This part sounds unreliable.

Perhaps. The idea is I would do that manually once and then pin the
hash of the .tar.gz for the verify-sources target. This would only
need to happen once per version bump.
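The pinning itself then reduces to a one-time digest comparison. A minimal sketch (Python, with made-up bytes standing in for the downloaded .tar.gz; this mirrors what a verify-sources check does, not its actual implementation):

```python
import hashlib

download = b"stand-in for util-linux-2.31.tar.gz"  # made-up bytes

# One-time manual step, after the out-of-band signature checks:
pinned_sha512 = hashlib.sha512(download).hexdigest()

# What an automated verify-sources step then does on every build:
def verify(blob):
    """Accept the source archive only if it matches the pinned digest."""
    return hashlib.sha512(blob).hexdigest() == pinned_sha512

assert verify(download)
assert not verify(download + b" tampered")
```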

Chris Laprise

Nov 11, 2017, 6:57:08 PM
to Jean-Philippe Ouellet, qubes-devel
It sounds like an easy diversion to make: Test for Qubes and use dispvm
for unpacking in that case. Builds can still be done on different
distros, although with a higher assurance of integrity if done on Qubes.

>
> Thoughts?
>
> [1]: https://www.qubes-os.org/doc/qubes-builder/
>
>>>> To be extra sure I can also re-compress and reproduce (almost) the
>>>> original .tar.gz file from the verified .tar file with `gzip --no-name
>>>> --best`, and then verify that only the 4 bytes for the timestamp [5]
>>>> are different.
>>
>> This part sounds unreliable.
> Perhaps. The idea is I would do that manually once and then pin the
> hash of the .tar.gz for the verify-sources target. This would only
> need to happen once per version bump.
>


Konstantin Ryabitsev

Nov 13, 2017, 10:28:19 AM
to Jean-Philippe Ouellet, qubes-devel
On 08/11/17 10:51 PM, Jean-Philippe Ouellet wrote:
> I could of course verify the signature of the auto-generated
> sha256sums.asc file which covers all the files (including compressed
> ones), but that means trusting kernel.org infrastructure - which was
> compromised in 2011 and may well be compromised again in the future...

Not to argue anything in particular, but sometimes I do have moments
where I wonder if moving our ultimate trust to developer workstations
(where the signature is generated) was such a great idea. I can assure
you that kernel.org infrastructure is, in the majority of cases, vastly
better protected than developer workstations -- together with the
signing keys that live on them and into which we put the ultimate trust.

I don't have any answers to this, just wanted to share my inner conflict
with others. :)

Best,
--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation

Peter Todd

Nov 13, 2017, 12:21:39 PM
to Konstantin Ryabitsev, Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 10:28:15AM -0500, Konstantin Ryabitsev wrote:
> On 08/11/17 10:51 PM, Jean-Philippe Ouellet wrote:
> >I could of course verify the signature of the auto-generated
> >sha256sums.asc file which covers all the files (including compressed
> >ones), but that means trusting kernel.org infrastructure - which was
> >compromised in 2011 and may well be compromised again in the future...
>
> Not to argue anything in particular, but sometimes I do have moments where I
> wonder if moving our ultimate trust to developer workstations (where the
> signature is generated) was such a great idea. I can assure you that
> kernel.org infrastructure is, in the majority of cases, vastly better
> protected than developer workstations -- together with the signing keys that
> live on them and into which we put the ultimate trust.

The thing is developer workstations are *already* trusted, because if those
workstations are compromised they can compromise the code itself, which in
practice is quite difficult to effectively audit. So why add an additional
trusted entity when you already have one?

That said, with deterministic builds you can get the best of both worlds: build
on both kernel.org infrastructure and dev workstations and only release if the
builds match 100%

--
https://petertodd.org 'peter'[:-1]@petertodd.org

Konstantin Ryabitsev

Nov 13, 2017, 1:29:21 PM
to Peter Todd, Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 12:21:31PM -0500, Peter Todd wrote:
>> Not to argue anything in particular, but sometimes I do have moments
>> where I
>> wonder if moving our ultimate trust to developer workstations (where the
>> signature is generated) was such a great idea. I can assure you that
>> kernel.org infrastructure is, in the majority of cases, vastly better
>> protected than developer workstations -- together with the signing keys that
>> live on them and into which we put the ultimate trust.
>
>The thing is developer workstations are *already* trusted, because if those
>workstations are compromised they can compromise the code itself, which in
>practice is quite difficult to effectively audit. So why add an additional
>trusted entity when you already have one?

I would argue the point you're making here. Compromising code that is
headed for git repositories is significantly more difficult than code
that is going into a tarball. A malicious commit going into a git
repository at least has the potential to be reviewed and analyzed (and
will almost certainly be noticed if it lands in the Linux kernel
repository). However, a 2-line change somewhere deep in the released
tarball will likely never be noticed without someone actually checking
for such sneaky changes.

If I were an attacker in control of a developer workstation, I would
target specifically tarballs -- I would make use of the race condition
between when "git archive" completes and when the gpg sign command is
invoked to substitute the generated tarball for a malicious one.

>That said, with deterministic builds you can get the best of both worlds: build
>on both kernel.org infrastructure and dev workstations and only release if the
>builds match 100%

It's not quite what we do, but the process of releasing a kernel does
not rely on us accepting tarballs from Linus or Greg. Instead, we take
the git tree reference, the tarball prefix, and the gpg signature -- and
then generate the tarball on our end. If the signature we received
successfully verifies the tarball we generated on our end, that means
that our git tree is identical to Linus's (or Greg's) tree, and
therefore we can trust our generated tarball.

This was written as a convenience, not as a security check (Linus
oftentimes goes diving to remote parts of the world and does not want to
spend most of his day uploading 600-megabyte tarballs over bad
connections). However, it also acts as a useful double-check to verify
both that git repositories on our end do not differ from the ones on
Linus's workstation, and to make sure the "use a race condition to sneak
in a trojaned tarball" attack cannot succeed.

It is my intent to deprecate the ability to upload tarballs entirely and
force everyone to use the same process Linus and Greg use to generate
tarballs on our end and then use the provided signature to verify it.

(If you're interested in how the kernel release process works, you can
see my talk from last year: https://www.youtube.com/watch?v=9ZIu3a3gocM)

-K


Konstantin Ryabitsev

Nov 13, 2017, 1:33:43 PM
to Peter Todd, Jean-Philippe Ouellet, qubes-devel
On 13 November 2017 at 13:29, Konstantin Ryabitsev
<konst...@linuxfoundation.org> wrote:
> (If you're interested in how the kernel release process works, you can see my talk from last year: https://www.youtube.com/watch?v=9ZIu3a3gocM)

Looks like I gave the wrong link. Here is the direct one:
https://www.youtube.com/watch?v=vohrz14S6JE

--
Konstantin Ryabitsev
Director, IT Infrastructure Security
The Linux Foundation
Montréal, Québec
+1-971-258-2363

Peter Todd

Nov 13, 2017, 1:46:34 PM
to Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 01:29:15PM -0500, Konstantin Ryabitsev wrote:
> >The thing is developer workstations are *already* trusted, because if those
> >workstations are compromised they can compromise the code itself, which in
> >practice is quite difficult to effectively audit. So why add an additional
> >trusted entity when you already have one?
>
> I would argue the point you're making here. Compromising code that is headed
> for git repositories is significantly more difficult than code that is going
> into a tarball. A malicious commit going into a git repository at least has
> the potential to be reviewed and analyzed (and will almost certainly be
> noticed if it lands in the Linux kernel repository). However, a 2-line
> change somewhere deep in the released tarball will likely never be noticed
> without someone actually checking for such sneaky changes.

I do agree that there's more potential for a change in git to be noticed, but I
have to wonder how much that's actually true in practice? A backdoor can easily
be a one character code change, and that's rather difficult to spot.

> If I were an attacker in control of a developer workstation, I would target
> specifically tarballs -- I would make use of the race condition between when
> "git archive" completes and when the gpg sign command is invoked to
> substitute the generated tarball for a malicious one.
>
> >That said, with deterministic builds you can get the best of both worlds: build
> >on both kernel.org infrastructure and dev workstations and only release if the
> >builds match 100%
>
> It's not quite what we do, but the process of releasing a kernel does not
> rely on us accepting tarballs from Linus or Greg. Instead, we take the git
> tree reference, the tarball prefix, and the gpg signature -- and then
> generate the tarball on our end. If the signature we received successfully
> verifies the tarball we generated on our end, that means that our git tree
> is identical to Linus's (or Greg's) tree, and therefore we can trust our
> generated tarball.

When you say "GPG signature", do you mean the GPG signature on the git commit?
Or is this a GPG signature on a tarball?

> This was written as a convenience, not as a security check (Linus oftentimes
> goes diving to remote parts of the world and does not want to spend most of
> his day uploading 600-megabyte tarballs over bad connections). However, it
> also acts as a useful double-check to verify both that git repositories on
> our end do not differ from the ones on Linus's workstation, and to make sure
> "use a race condition to sneak in a trojaned tarball" attack cannot succeed.
>
> It is my intent to deprecate the ability to upload tarballs entirely and
> force everyone to use the same process Linus and Greg use to generate
> tarballs on our end and then use the provided signature to verify it.

That process to generate that tarball is deterministic, right?


I might be misunderstanding how all this works, but maybe what we actually need
here is a new type of tarball signature verifier, that takes the GPG signature
from the signed git commit, and verifies that the tarball matches the tree that
the git commit signed. It should be possible to do this with just the tarball
and part of the git commit if the tarball is in fact deterministically
generated from the Git commit.

Jean-Philippe Ouellet

Nov 13, 2017, 2:00:10 PM
to Peter Todd, qubes-devel
What is there to be gained by doing that instead of just verifying
signatures on git tags/commits?

Konstantin Ryabitsev

Nov 13, 2017, 2:01:29 PM
to Peter Todd, Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 01:46:27PM -0500, Peter Todd wrote:
>> I would argue the point you're making here. Compromising code that is
>> headed
>> for git repositories is significantly more difficult than code that is going
>> into a tarball. A malicious commit going into a git repository at least has
>> the potential to be reviewed and analyzed (and will almost certainly be
>> noticed if it lands in the Linux kernel repository). However, a 2-line
>> change somewhere deep in the released tarball will likely never be noticed
>> without someone actually checking for such sneaky changes.
>
>I do agree that theres more potential for a change in git to be noticed, but I
>have to wonder how much that's actually true in practice? A backdoor can easily
>be a one character code change, and that's rather difficult to spot.

It isn't nearly as difficult to spot when it's part of a git commit,
because it will stand out so much more in a diff than it would inside a
600MB tarball. :)

>> It's not quite what we do, but the process of releasing a kernel does
>> not
>> rely on us accepting tarballs from Linus or Greg. Instead, we take the git
>> tree reference, the tarball prefix, and the gpg signature -- and then
>> generate the tarball on our end. If the signature we received successfully
>> verifies the tarball we generated on our end, that means that our git tree
>> is identical to Linus's (or Greg's) tree, and therefore we can trust our
>> generated tarball.
>
>When you say "GPG signature", do you mean the GPG signature on the git commit?
>Or is this a GPG signature on a tarball?

On the tarball. We do not do anything with gpg signatures on git trees
because we need to provide a sig file someone can run against the
tarball, not git.

>> It is my intent to deprecate the ability to upload tarballs entirely
>> and
>> force everyone to use the same process Linus and Greg use to generate
>> tarballs on our end and then use the provided signature to verify it.
>
>That process to generate that tarball is deterministic right?

Yes, "git archive" will generate deterministic tarballs -- at least if
git versions are fairly close together. This has broken a couple of
times in the past, but in both cases the changes in git were rolled back
because it was judged that having a mechanism to have deterministic
tarball generation was more valuable than having "more correct"
tarballs.

>I might be misunderstanding how all this works, but maybe what we actually need
>here is a new type of tarball signature verifier, that takes the GPG signature
>from the signed git commit, and verifies that the tarball matches the tree that
>the git commit signed. It should be possible to do this with just the tarball
>and part of the git commit if the tarball is in fact deterministically
>generated from the Git commit.

You can do this right now -- it would be a 5-line script, honestly, but
would still require unpacking the tarball and ultimately you will not
really gain anything that you don't already have right now by comparing
to the .sig file provided by Linus or Greg. It already assures you that
the tarball you've downloaded was generated directly from a git tree
that is identical to the git tree on the developer's workstation.
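For what it's worth, the comparison is indeed simple. A sketch of the idea in Python (longer than 5 lines without git: a plain directory stands in for the `git archive` output, and every name here is made up):

```python
import hashlib, os, tarfile, tempfile

def tree_digests(root):
    """SHA-256 of every file under root, keyed by relative path."""
    out = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                out[rel] = hashlib.sha256(f.read()).hexdigest()
    return out

def tarball_digests(tar_path):
    """SHA-256 of every regular file in the tarball, prefix stripped."""
    out = {}
    with tarfile.open(tar_path) as tf:
        for m in tf.getmembers():
            if m.isfile():
                rel = m.name.split("/", 1)[1]  # drop the "pkg-1.0/" prefix
                out[rel] = hashlib.sha256(tf.extractfile(m).read()).hexdigest()
    return out

# Demo on a throwaway tree standing in for a verified checkout
with tempfile.TemporaryDirectory() as d:
    src = os.path.join(d, "src")
    os.makedirs(src)
    with open(os.path.join(src, "Makefile"), "w") as f:
        f.write("all:\n")
    tar_path = os.path.join(d, "pkg-1.0.tar")
    with tarfile.open(tar_path, "w") as tf:
        tf.add(src, arcname="pkg-1.0")
    match = tarball_digests(tar_path) == tree_digests(src)

assert match
```

As noted above, this still means unpacking the tarball before you learn anything.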

-K


Peter Todd

Nov 13, 2017, 2:07:28 PM
to Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 01:59:41PM -0500, Jean-Philippe Ouellet wrote:
> I might be misunderstanding how all this works, but maybe what we actually need
> > here is a new type of tarball signature verifier, that takes the GPG signature
> > from the signed git commit, and verifies that the tarball matches the tree that
> > the git commit signed. It should be possible to do this with just the tarball
> > and part of the git commit if the tarball is in fact deterministically
> > generated from the Git commit.
>
> What is there to be gained by doing that instead of just verifying
> signatures on git tags/commits?

For some workflows you might not want to have to get the whole git history...
but come to think of it, Git supports shallow checkouts with the git clone
--depth option, and you can verify the PGP signatures on those shallow
checkouts. So I think everything I said above can already be done with git with
some minor changes to workflows.

Peter Todd

Nov 13, 2017, 2:15:19 PM
to Jean-Philippe Ouellet, qubes-devel
On Mon, Nov 13, 2017 at 02:01:24PM -0500, Konstantin Ryabitsev wrote:
> >I do agree that theres more potential for a change in git to be noticed, but I
> >have to wonder how much that's actually true in practice? A backdoor can easily
> >be a one character code change, and that's rather difficult to spot.
>
> It isn't nearly as difficult to spot when it's a part of a git commit,
> because it will stand so much more in a diff than it would inside a 600MB
> tarball. :)

Yes, we agree one has a 0% chance of being noticed; the other has a 0% + epsilon
chance. :)

> Yes, "git archive" will generate deterministic tarballs -- at least if git
> versions are fairly close together. This has broken a couple of times in the
> past, but in both cases the changes in git were rolled back because it was
> judged that having a mechanism to have deterministic tarball generation was
> more valuable than having "more correct" tarballs.

Good to hear!

> >I might be misunderstanding how all this works, but maybe what we actually need
> >here is a new type of tarball signature verifier, that takes the GPG signature
> >from the signed git commit, and verifies that the tarball matches the tree that
> >the git commit signed. It should be possible to do this with just the tarball
> >and part of the git commit if the tarball is in fact deterministically
> >generated from the Git commit.
>
> You can do this right now -- it would be a 5-line script, honestly, but
> would still require unpacking the tarball and ultimately you will not really
> gain anything that you don't already have right now by comparing to the .sig
> file provided by Linus or Greg. It already assures you that the tarball
> you've downloaded was generated directly from a git tree that is identical
> to the git tree on the developer's workstation.

Alright, sounds like I had a misunderstanding of how this all worked.

If I understand it correctly, you're able to provide a signature on a tarball
that's signed directly by the developers themselves, from a tarball they
personally generated. We also have a mechanism to deterministically regenerate
those tarballs from the git commits they were supposed to have been generated
from.

In this scenario, once we get past the problem of uncompressing the tarball -
which I think is adequately addressed by doing that in a DispVM as the OP
suggested - we do in fact have a signature directly from the developer on that
tarball with no additional trust dependencies, and we can deterministically
check it against the git commit it was supposed to have come from as well to
make sure it matches the reviewed code in git.

Sounds good to me! :)

HW42

Nov 13, 2017, 2:38:08 PM
to Peter Todd, Jean-Philippe Ouellet, qubes-devel
Peter Todd:
> On Mon, Nov 13, 2017 at 02:01:24PM -0500, Konstantin Ryabitsev wrote:
>>> I do agree that theres more potential for a change in git to be noticed, but I
>>> have to wonder how much that's actually true in practice? A backdoor can easily
>>> be a one character code change, and that's rather difficult to spot.
>>
>> It isn't nearly as difficult to spot when it's a part of a git commit,
>> because it will stand so much more in a diff than it would inside a 600MB
>> tarball. :)
>
> Yes, we agree one has a 0% chance of being noticed; the other has a 0% + epsilon
> chance. :)

Could you explain why you think so? As discussed below, the tarball is
generated (almost) deterministically, so spotting a tarball which does
not match the git commit is trivial (and even possible to automate). So
why is spotting a manipulated tarball harder?

>> Yes, "git archive" will generate deterministic tarballs -- at least if git
>> versions are fairly close together. This has broken a couple of times in the
>> past, but in both cases the changes in git were rolled back because it was
>> judged that having a mechanism to have deterministic tarball generation was
>> more valuable than having "more correct" tarballs.
>
> Good to hear!
>
>>> I might be misunderstanding how all this works, but maybe what we actually need
>>> here is a new type of tarball signature verifier, that takes the GPG signature
>>> from the signed git commit, and verifies that the tarball matches the tree that
>>> the git commit signed. It should be possible to do this with just the tarball
>>> and part of the git commit if the tarball is in fact deterministically
>>> generated from the Git commit.
>>
>> You can do this right now -- it would be a 5-line script, honestly, but
>> would still require unpacking the tarball and ultimately you will not really
>> gain anything that you don't already have right now by comparing to the .sig
>> file provided by Linus or Greg. It already assures you that the tarball
>> you've downloaded was generated directly from a git tree that is identical
>> to the git tree on the developer's workstation.
>
> Alright, sounds like I had a misunderstanding of how this all worked.
>
> If I understand it correctly, you're able to provide a signature on a tarball
> that's signed directly from the developer themselves, from a tarball they
> personally generated. We also have a mechanism to deterministically regenerate
> those tarballs from the git commits they were supposed to have been generated.

Jean-Philippe Ouellet

Nov 13, 2017, 2:54:59 PM
to HW42, Peter Todd, qubes-devel
On Mon, Nov 13, 2017 at 2:38 PM, HW42 <hw...@ipsumj.de> wrote:
> Peter Todd:
>> On Mon, Nov 13, 2017 at 02:01:24PM -0500, Konstantin Ryabitsev wrote:
>>>> I do agree that theres more potential for a change in git to be noticed, but I
>>>> have to wonder how much that's actually true in practice? A backdoor can easily
>>>> be a one character code change, and that's rather difficult to spot.
>>>
>>> It isn't nearly as difficult to spot when it's a part of a git commit,
>>> because it will stand so much more in a diff than it would inside a 600MB
>>> tarball. :)
>>
>> Yes, we agree one has a 0% chance of being noticed; the other has a 0% + epsilon
>> chance. :)
>
> Could you explain why you think so? As discussed below the tarball is
> generated (almost) deterministically so spotting a tarball which does
> not match the git commit is trivial (and even possible to automate). So
> why is spotting a manipulated tarball harder?

Because in practice people do not do that. Even I did not consider
doing that in the context of this post.

My assumption was that "if the tarball signed by the developer is
somehow evil, then the developer is either compromised or themselves
malicious, and in both cases we're screwed anyway so there's no point
checking against git". Perhaps this is flawed reasoning, perhaps not,
but it does allow for the possibility of unnoticed changes in the
tarball not present in git.

HW42

Nov 13, 2017, 8:05:08 PM
to Peter Todd, Jean-Philippe Ouellet, qubes-devel
Konstantin Ryabitsev:
What's the reason for not signing the compressed files in the first
place? Compressing the files two times and then uploading three
signatures (for .tar, .tar.xz and .tar.gz) doesn't sound so bad.

>> That process to generate that tarball is deterministic right?
>
> Yes, "git archive" will generate deterministic tarballs -- at least if
> git versions are fairly close together. This has broken a couple of
> times in the past, but in both cases the changes in git were rolled
> back because it was judged that having a mechanism to have
> deterministic tarball generation was more valuable than having "more
> correct" tarballs.

The compression process is also reproducible. After a bit of testing I
found the parameters that seem to be currently used (see attached
script). For the last few mainline/stable/longterm releases this works
fine. But it seems the parameters changed not long ago (for example,
the 4.11 tar.xz used different parameters).

If you want to play around with generating the archives from git, you
can run the attached script in a checked-out linux repository. It will
verify the git tag, so you need the matching key in your gpg keyring.
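The property the script relies on is that xz output is a deterministic function of its input and parameters. That much can be sanity-checked with Python's lzma module (the presets below are placeholders, not the actual kernel.org settings):

```python
import lzma

data = b"stand-in for linux-4.14.tar " * 1000

# Same input + same parameters -> byte-identical .xz, which is what makes
# "regenerate and compare" workable at all.
a = lzma.compress(data, preset=9)
b = lzma.compress(data, preset=9)
assert a == b
assert lzma.decompress(a) == data

# Different parameters -> different bytes, which is why pinning down the
# exact settings kernel.org uses matters for reproducing their .tar.xz.
c = lzma.compress(data, preset=0)
assert a != c
```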

--

df8a8178824e5e8b96c0ec6e7f4ab9d020d66286965ae053f6709c9daad01fe2 *gen-linux-tar

(Google Groups breaks PGP/MIME, so I need to use inline signatures,
which don't cover attachments)
gen-linux-tar

HW42

Nov 13, 2017, 8:18:22 PM
to Jean-Philippe Ouellet, qubes-devel
Jean-Philippe Ouellet:
> On Wed, Nov 8, 2017 at 10:51 PM, Jean-Philippe Ouellet <j...@vt.edu> wrote:
>> Hello,
>>
>> The way some things are distributed on kernel.org (e.g. util-linux
>> [1], cryptsetup [2], etc.) is such that the authors upload .tar and
>> .tar.sign files, and then the kernel.org infrastructure compresses
>> those (creating .tar.gz & .tar.xz) and signs all resulting files
>> (creating sha256sums.asc) using its own key. More info here [3]
>>
>> Kernel.org does not make the original .tar files available, which
>> means there is no file available for which a signature directly from
>> the developers is also available. In order to check the developer's
>> provided signature, you must first unpack the file. I consider
>> unpackers to be of sufficient complexity that I would rather not run
>> them on arbitrary attacker-provided input.
>
> Specifically: I would rather not run unpackers on unverified
> attacker-controlled input, unsandboxed, in a trusted part of the
> builder.

I think it's good to improve the situation here. But keep in mind that
the bigger problem, IMO, is git: we already do a lot more processing of
unverified input there, and I don't see a solution for that one
currently. (And of course, ideally we would use a simpler signature
verification tool than GnuPG.)

> Normally I'm more pragmatic and just don't care. Hooray for DispVMs :)
>
>> I could of course verify the signature of the auto-generated
>> sha256sums.asc file which covers all the files (including compressed
>> ones), but that means trusting kernel.org infrastructure - which was
>> compromised in 2011 and may well be compromised again in the future...
>>
>> If I want to follow qubes packaging best practices [4] and ensure that
>> no untrusted code gets processed (including unpacked) by the builder,
>> it seems my best option is to manually download the .tar.gz, verify
>> the kernel.org sig, unpack it (possibly in a DispVM), verify the
>> developer's sig, and then pin the sha512 of the original file for
>> qubes-builder's verify-sources.

Pinning the hash is probably the less intrusive way. Another option
would be to use a shallow git-clone.

I think it's important to keep qubes-builder working on a non-Qubes
system. But with a fallback the DispVM solution would also be an option.

I personally would prefer pinning the hash.
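The pinning approach could be sketched as follows (a minimal sketch; the helper names `pin_hash` and `verify_pinned` are hypothetical, not qubes-builder's actual verify-sources interface):

```shell
# Minimal sketch of hash pinning. pin_hash/verify_pinned are hypothetical
# helpers, not qubes-builder's real verify-sources interface.

pin_hash() {
    # Record the sha512 of a file that was already verified out-of-band
    # (kernel.org sig checked, developer sig checked after unpacking in
    # a DispVM).
    sha512sum "$1" | cut -d' ' -f1
}

verify_pinned() {
    # Succeed only if the downloaded file matches the previously pinned hash.
    local file="$1" pinned="$2"
    [ "$(sha512sum "$file" | cut -d' ' -f1)" = "$pinned" ]
}
```

The out-of-band verification happens once, by hand; after that the builder only ever compares the pinned hash and never has to touch an unpacker.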

>> To be extra sure I can also re-compress and reproduce (almost) the
>> original .tar.gz file from the verified .tar file with `gzip --no-name
>> --best`, and then verify that only the 4 bytes for the timestamp [5]
>> are different.

The .xz is reproducible, see the other mail.
-----BEGIN PGP SIGNATURE-----

iQJDBAEBCgAtFiEEqieyzvOmi9FGaQcT5KzJJ4pkaBYFAloKRFsPHGh3NDJAaXBz
dW1qLmRlAAoJEOSsySeKZGgWsMUP/1tsiUzzE7L4RPqsJ0c5KMKJ3nyI5O11ifor
am+7DMAm8Effl5LbvISalrxcDltGJjYmvzx/a5zPegTr1Jrt5s7WVAjHPHuJwhSQ
8M8sqfqJ1fYJ1jyc0VH1fSraVlLGqVMypzEMboOfJO6nuaKVV6F1kzlg56pNAglk
YnxAyoXSnp++SoHLjhmp+GivDTZFj0a+3epAX5HemdJpTyLZkgMAEg8IvFGS4ln6
gdEoqXT6rEaN6qvG0XHZbHb6GZ8gAI+hZboqycN/puWxUXvK5u8KojEclrH3x5Qf
lY+XBzPkho7wqRdF9iJgbn5+TZXYSjFVBLFHUMvIxYlycrYBaCZuv7XWzA70BwoX
lb5JwwehVy/sLSmA20IduTl+mtoksaBm9/YVN7SOIFGg7zetPCz/+g4FGQVrFqBR
+ZD7DVpd2PLF+Bc8R1xHjQyuJZQz/mcDz/oRWW72M2xIJCG+0+ht8nqZDwMFolRi
JL0jOTJ2uaxSxNCdsYtuE4agi1swN5iKLsR5xsLcABQag4ByHmbnf9+uWgwW2v+N
YAoBq18ermEqcixfn9f0t2RdE3qngXFj1Bk07xmmI6RiGIG8RJvPgS5OmjRB5chS
k5FjNQSGk0yqXrPdH3bUf+TTENdpzht6RtKgtGtHvgIyM3ZwjkxenTbQl+trQ9zb
z8fiDcAm
=1VTB
-----END PGP SIGNATURE-----

Marek Marczykowski-Górecki

unread,
Nov 14, 2017, 8:21:17 AM
to HW42, Jean-Philippe Ouellet, qubes-devel
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
+1

> I personally would prefer pinning the hash.

That sounds reasonable.

> >> To be extra sure I can also re-compress and reproduce (almost) the
> >> original .tar.gz file from the verified .tar file with `gzip --no-name
> >> --best`, and then verify that only the 4 bytes for the timestamp [5]
> >> are different.
>
> The .xz is reproducible, see the other mail.

This could be done before pinning to a specific hash.
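A sketch of that recompress-and-compare check (an assumption-laden sketch: it assumes GNU gzip with the same settings kernel.org used, and that only the 4-byte MTIME field of the gzip header may differ — bytes 5-8 in the 1-based offsets that `cmp -l` prints):

```shell
# Sketch: recompress a verified .tar and check that the upstream .tar.gz
# differs only in the gzip header's 4-byte MTIME field (bytes 5-8 in the
# 1-based offsets printed by `cmp -l`). A different gzip version than the
# one kernel.org used may produce a different stream and make this check
# fail spuriously.
check_gz_reproducible() {
    local tar="$1" gz="$2" recompressed
    recompressed=$(mktemp)
    gzip --no-name --best --stdout "$tar" > "$recompressed"
    # Every differing byte offset must fall inside the MTIME field.
    for off in $(cmp -l "$gz" "$recompressed" 2>/dev/null | awk '{print $1}'); do
        [ "$off" -ge 5 ] && [ "$off" -le 8 ] || return 1
    done
}
```

If the check fails, the two files differ outside the timestamp, and the archive should not be trusted on the strength of the recompression argument alone.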
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQEcBAEBCAAGBQJaB60aAAoJENuP0xzK19csd7gH/121Lv7A648T+XW+zapmqQBI
j6yQ3zaqmHPpWT8S+RHC03OkwmewUj3Uu6E8ay1zLaLrnz6D+MohmmYV4AMBlseC
n+jRFuKLOTS0NR4BSBZIQ+og+NMysWzl9eNelEKyT6dUgDwvhayYMgvBQkOda3fm
aZZB3H6miVSxBGxqjcJTO7qb2T3a5L+/fhdbjke1GAqhcoYgkTX1Da2MfWg8ubRD
7GU7K8hYtDgdjHl2vOjIraUBxld6TMZFmJBAngKfghYRtOZ3GQjw1u/GDaakA9+W
++yU7AY86FTmZlsI57eL3k0ixzKogig8b6VZLEmz1Q3qrllLj6NZ2aax2i4Xro0=
=F4sc
-----END PGP SIGNATURE-----

Konstantin Ryabitsev

unread,
Nov 14, 2017, 10:33:58 AM11/14/17
to HW42, Peter Todd, Jean-Philippe Ouellet, qubes-devel
On Tue, Nov 14, 2017 at 01:05:00AM +0000, HW42 wrote:
>>>> It is my intent to deprecate the ability to upload tarballs
>>>> entirely and
>>>> force everyone to use the same process Linus and Greg use to generate
>>>> tarballs on our end and then use the provided signature to verify it.
>
>What's the reason for not signing the compressed files in the first
>place? Compressing the files two times and then uploading three
>signatures (for .tar, .tar.xz and .tar.gz) doesn't sound so bad.

Yes it is. Compressing the Linux tarball with xz -9 takes about 20
minutes on a laptop. Greg, who does 3-4 releases at the same time, is
not willing to wait 1.5 hours just to generate signatures, especially if
he's running on battery.

>The compression process is also reproducible. After a bit of testing I
>found the parameters which seem to be currently used (see attached
>script). For the last few mainline/stable/longterm releases this works
>fine. But it seems that not long ago the parameters changed (for example
>the 4.11 tar.xz used other parameters).

It's not at all guaranteed to be reproducible. The parameters changed
because we upgraded the OS and switched to using pixz for better
parallelized compressing. We may furthermore lazy-recompress older .gz
archives with better tools than vanilla gzip (e.g. zopfli), because the
savings may be worth it. Putting the signature on the .tar archive
allows us to do it and does not tie us to any particular compression
format.

>If someone wants to play around with generating the archives from git,
>you can run the attached script in a checked-out linux repository. It
>will verify the git tag, so you need the matching key in your gpg
>keyring.
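The core of such a regeneration could be sketched like this (a rough sketch, not HW42's attached gen-linux-tar script). Since the developer's .tar.sign covers the uncompressed tar, reproducing the .tar with `git archive` is enough and sidesteps the compression-parameter question entirely; the caller is expected to run `git verify-tag` on the tag first, which requires the maintainer's key in the local keyring:

```shell
# Rough sketch (not HW42's actual attached script) of regenerating a
# release .tar from a git tag. Run `git verify-tag "$tag"` beforehand;
# whether the output matches the uploaded tarball byte-for-byte depends
# on upstream having used the same `git archive` invocation.
regen_tar() {
    local tag="$1" prefix="$2"
    git archive --format=tar --prefix="$prefix/" "$tag" > "$prefix.tar"
}
```

For Linux the tag would be e.g. v4.14 with prefix linux-4.14, after which the developer's detached .tar.sign can be checked against the regenerated linux-4.14.tar.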

I am kind of surprised that you are going the route of using git if the
initial conversation started out with not trusting compression software.
Wouldn't git have much larger complexity than unxz, and therefore using
it be a net loss in security?

-K