Securing package distribution

Chris Done

Jan 25, 2015, 6:36:53 PM

to commerci...@googlegroups.com

Hi all!

I’d like to pose a problem, propose a possible solution and then
finally ask some questions at the end.

A remaining problem with the distribution of Haskell packages is that
the process from end to end is generally unsecured and lacks any
notion of quality control. This is a problem for a few types of people
who fit into our primary goals:

* Community tooling enhancement: General community Haskell users who
  want to be safe and have some measure of quality.
* Adoption: Newcomers or possible adopters of Haskell who may
  perceive the lack of an end-to-end security story to be immaturity
  and/or even incompetence.
* Commercial use: Businesses (like ourselves) who not only desire to
  be stringent but have requirements from clients from the private
  and public sector that mandate a proper process.

A lot of good will currently goes into trusting Hackage (1) as a
way of uploading, (2) as an archive, (3) as quality control (none) and
(4) as a download process. None of these are secured to any standard.

This is up for discussion. Previous discussion about this issue has
led to such ideas as:

* Make Hackage a CA and make Hackage sign packages.
* Add TLS support to Hackage for upload/download.

In fact, the above approaches are in the process of development, but a
couple of years have gone past with no discernible progress. I argue that
this is good; these solutions (as is) are not a good idea because:

* We do not want to have to trust Hackage. If Hackage is compromised,
  its signatures are worthless. A false sense of security is more
  harmful than a keen awareness of no security.
* TLS support is always good. But implementing it solo would make
  many users feel that the entire problem is now solved. We’ve got
  TLS, surely they must have it under control. This would be a case
  of a poor solution preventing further interest in a good solution.
* I’ve seen more than enough worrying discussion about using X and Y
  crypto library to implement such and such cryptographic function
  which we should not be doing. It is both a waste of effort and a
  violation of the golden rule: do not implement security software
  yourself.

It seems much safer and more practical to go with package
signing. Michael and I have discussed the following solution of
“reviewing”. Hopefully it will be entirely unsurprising and
obvious. It goes roughly like this:

1. When an author creates a package, they make a review of the package
   which consists of some claim about the state of the package and a
   public-key signature of the .tar.gz of that package. (Probably
   both should be signed together.)
2. The author uploads this review to a separate archive, decoupled
   from the package, such as to a specially made web service or simply
   to a Github repository of package signatures.
3. When a user of a package downloads this package or comes to use it
   in some way, they first acquire that author’s public key from a key
   server, and then use the appropriate signature verification
   software to check the package.
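
The "claim plus tarball, signed together" idea in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the JSON layout, the field names and the function names `make_review` and `review_matches` are hypothetical and not part of the proposal. The bytes returned by `make_review` are what a reviewer would hand to their signing tool (for example as the input to a detached OpenPGP signature); no signing is done here.

```python
import hashlib
import json


def make_review(tarball_bytes: bytes, package: str, version: str, claim: str) -> bytes:
    """Return the canonical bytes a reviewer would sign (hypothetical format)."""
    review = {
        "package": package,
        "version": version,
        "claim": claim,
        # Binding the claim to a digest of the exact .tar.gz means one
        # signature covers both the statement and the archive.
        "sha256": hashlib.sha256(tarball_bytes).hexdigest(),
    }
    # Sorted keys and fixed separators make the encoding deterministic,
    # so signer and verifier see byte-identical input.
    return json.dumps(review, sort_keys=True, separators=(",", ":")).encode("utf-8")


def review_matches(review_bytes: bytes, tarball_bytes: bytes) -> bool:
    """Check that a review (whose signature was verified elsewhere) refers to this tarball."""
    review = json.loads(review_bytes)
    return review["sha256"] == hashlib.sha256(tarball_bytes).hexdigest()
```

Because the review is a separate small file, it can live in any archive or mirror without the package itself being stored alongside it.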

The nature of a signature depends on the review and the reviewer; this
is why I use the term “review” and not simply signing:

* As an author, I sign that I wrote the package and I am a trustworthy
  author.
* As an interested user or collaborator who reviews the diff of each
  new release, I verify that, indeed, the package is of a particular
  standard of quality.
* As an automated process like Stackage or Nix, I verify that this
  package builds (for a given package set) and passes test suites, but
  make no comment on its security.
* As a business, I might perform an audit on a package to verify that
  it is up to a certain standard of quality.

Within this process the following points hold:

* Any subsequent releases of the package by the author should be
  reviewed (signed) again. Fortunately, reviews of updates can be much
  easier than an initial audit.
* In fact, not just the author, but anyone can review a package and
  upload their signature to the signature archive.
* The signature archive itself does not have to be secure. There can
  be any number of mirrors.
* The package archive (e.g. Hackage, Stackage or a mirror) does not
  have to be secure.
* The transport layer from end-to-end of the packages does not have to
  be secure.
* The ability for anyone to sign a release and state the kind of
  signature they’re giving means that one doesn’t necessarily have to
  just trust the author, but can rely on other/additional reviews.

Note: we’re not talking about source control here; signing Git commits
or tags is a misleading notion. We are specifically referring to
signing the .tar.gz archive itself.

To implement this solution, the software already exists:

* PGP and GnuPG come with signing, verification and a notion of web of
  trust out of the box.
* Key servers for storing and discovery already exist,
  e.g. keys.gnupg.net, keyserver.ubuntu.com, pgp.mit.edu, etc.

What, I believe, we would need to implement this, would be:

* Some very clear education to users on the signing process; how to
  acquire and verify people’s keys and build a web of trust, etc.
* Perhaps a small bit of tooling to sign a package that would just
  call PGP and upload the signature to our TBD review server(s).
* Any support necessary to ensure that this process is as smooth on
  Windows and OS X as it is on Linux.
* Possibly a web site to host reviews.

What Github and Git teach us is that if you make the steps as simple
as git init; git remote add …; git push; then any developer will
happily follow this process. Make it hard and nobody will do it.

Some example work in this area from 2013:
https://github.com/chrisdone/cabal-sign This serves as a
demonstration, but it couples the signature with the package. I don’t
recommend this software or propose it, but the README is rather
illustrative of how simple the process can be. To be clear, use of
standard key servers with education on how to maintain a web of trust
would be the preferred approach here.

Debian’s package manager:
http://manpages.ubuntu.com/manpages/gutsy/man1/dpkg-sig.1.html

The questions I’d like to ask are:

1. Are you in agreement that this is a problem to the degree I wrote
   above?
2. Does this actually make sense to you as a process from end to end?
3. Is this something that you would contribute to in terms of
   development?
4. Is this something you would use and contribute to (i.e. signing and
   verifying) as a business?
5. Any concerns?

Cheers!

Arnaud Bailly

Jan 25, 2015, 7:37:11 PM

to commerci...@googlegroups.com

Hello,

1. Yes.
2. Yes. I would add that this is something that exists in the Java world, with Sonatype offering the community a service to upload signed packages to "maven central": http://central.sonatype.org/pages/ossrh-guide.html Note that the process is not, in this case, completely straightforward and that it involves Jira to manage upload requests and Nexus to manage artifact uploads. But on the upside, it ensures everything is signed, has received a reasonable amount of formal reviewing to ensure minimal requirements are met, and only allows identified users.
3. To the best of my time and ability, certainly.
4. Yes.
5. Not that I see yet.

Regards,

--
You received this message because you are subscribed to the Google Groups "Commercial Haskell" group.
To unsubscribe from this group and stop receiving emails from it, send an email to commercialhask...@googlegroups.com.
To post to this group, send email to commerci...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/commercialhaskell/CAAJHNPBP6u3AZTGOPNdUkFG7k7WtuAuQPREKhR6quNc72%3DGjxA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

--

Arnaud Bailly

CTO | Capital Match

www.capital-match.com - 71 Ayer Rajah Crescent | #06-16 | Singapore 139951

(FR) +33 617 121 978 / (SG) +65 84 08 79 73 | arn...@capital-match.com

Disclaimer:

Capital Match Holdings Pte. Ltd. (the "Company") registered in Singapore (Co. Reg. No. 201418682W) engages in factoring and lending activity only to corporations and limited liability partnerships. The Company does not offer services to other persons or entities, nor engages in deposit taking. The Company operates as an "excluded moneylender" and does not fall under the purview of the Moneylenders Act.

Nikita Karetnikov

Jan 26, 2015, 4:12:57 AM

to chri...@fpcomplete.com, commerci...@googlegroups.com

I’ve been concerned about it for a very long time. I tried to address
some of the problems you mention by adding OpenPGP support to both
hackage-server [1] and cabal-install [2].

This version of hackage-server allows an ASCII-armored OpenPGP
signature, which is optional, to be uploaded along with a package or a
package candidate. (If the signature is present, the download link will be
shown in the “Downloads” list.) And my version of cabal-install only
allows each key to be marked as trusted or untrusted but doesn’t provide a
way to verify a signature [3].

I didn’t add the main feature because I didn’t want to give a false
sense of security. The OpenPGP spec requires only an 8-octet key id to
be included in a signature, which I don’t consider reliable since it doesn’t
prevent collisions [4].
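
To make the key id concern concrete: for version 4 OpenPGP keys the 8-octet key id is simply the low-order 64 bits of the 160-bit fingerprint, so two different keys can share a key id. The sketch below illustrates that relationship; the function names are hypothetical and the two fingerprints in the usage note are made-up values, not real keys.

```python
def normalise(fingerprint_hex: str) -> str:
    """Strip spaces and upper-case a hex fingerprint, checking its length."""
    fp = fingerprint_hex.replace(" ", "").upper()
    if len(fp) != 40:
        raise ValueError("expected a 160-bit (40 hex digit) V4 fingerprint")
    int(fp, 16)  # raises ValueError if the string is not valid hex
    return fp


def key_id(fingerprint_hex: str) -> str:
    """Return the 8-octet key id: the last 16 hex digits of the fingerprint."""
    return normalise(fingerprint_hex)[-16:]


def same_key(fp_a: str, fp_b: str) -> bool:
    """Compare keys by full fingerprint rather than by key id."""
    return normalise(fp_a) == normalise(fp_b)
```

With the made-up fingerprints `0123456789ABCDEF0123456789ABCDEF01234567` and `FFFFFFFFFFFFFFFFFFFFFFFF89ABCDEF01234567`, `key_id` returns the same value for both while `same_key` returns `False`, which is why a tool that looks keys up by key id alone cannot tell them apart.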

I stopped working on this project due to the lack of time and funding,
so it’s still at the preliminary stage. But to keep this on topic: when
I was looking for a way to fund this project, I was told that industrial
users have other priorities, that is, no one wants to fund secure
package distribution.

Related reading [5,6].

Finally, since I’m not happy about the current state of things, I’m in
the process of switching to Nix/NixOS where each package comes with a
hash. This is not ideal, but I can at least be sure that I have the
same package as others.
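
The "each package comes with a hash" check can be sketched in a few lines. This is a generic illustration, assuming a hex-encoded SHA-256 digest; it is not Nix's actual hash encoding or code, and the function names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pinned(data: bytes, expected_hex: str) -> bool:
    """Return True only if the downloaded bytes match the pinned digest."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(sha256_hex(data), expected_hex.lower())
```

A matching hash shows that everyone who pinned the same digest received the same bytes; it says nothing about who produced them, which is the gap signatures are meant to fill.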

[1] https://gitorious.org/hackage-server/hackage-server/commits/openpgp
[2] https://gitorious.org/cabal/cabal/commits/openpgp
[3] http://article.gmane.org/gmane.comp.lang.haskell.cabal.devel/9987
[4] http://article.gmane.org/gmane.comp.lang.haskell.cabal.devel/10009
[5] http://article.gmane.org/gmane.comp.lang.haskell.cabal.devel/9988
[6] http://article.gmane.org/gmane.comp.lang.haskell.cabal.devel/9895

Duncan Coutts

Jan 26, 2015, 5:16:29 PM

to commerci...@googlegroups.com, chri...@fpcomplete.com

On Sunday, January 25, 2015 at 11:36:53 PM UTC, Christopher Done wrote:

> Hi all!
>
> I’d like to pose a problem, propose a possible solution and then
> finally ask some questions at the end.

I should note that Well-Typed and the IHG are currently implementing one solution in this area.

Our goal was that we should start with something that is implementable in a reasonable amount of time and effort and that gives a significant improvement in security.

Our analysis was that the biggest improvement we could make with reasonable effort would be to take a "wide" but not a "deep" approach. Wide in the sense of covering every package, and covering all platforms and not requiring any author or user intervention. It is not deep in the sense that it is not the gold standard of end to end author (or personal) signing.

So the approach we are taking is indeed based on server-based signing of the package index. We're following the design of "The update framework" http://theupdateframework.com/ but adapted to the existing hackage data formats.

I want to emphasise that this approach is completely complementary with an end-to-end gpg-signing based approach. My view is that ultimately we want to do both. We're never going to get every author to sign every package, and achieving fully automatic checking by all package clients is quite hard. Having some important packages signed is much more realistic. I'm not quite sure if you're proposing that cabal (or another tool) always check the gpg signatures or if it's just some tools like distros and some users who are especially concerned? The tricky part of course is deciding automatically if you want to trust the signatures and interpreting what they mean. If the checking doesn't need to be general and automatic then it's a lot easier.

I should also point out that server-based index signing isn't quite as bad as all that when it comes to the threat of server compromise. See the description of the update framework for details. The key points though are that things can be arranged so that multiple machines must be compromised to be able to forge signatures, including non-public facing machines. The root key is stored offline so the system can be restored in the event of a key compromise. Secretly subverting existing packages can be partially mitigated against by distributed checking of the package checksums. Corruption of newly uploaded packages can also be partially mitigated against by a more complicated form of distributed checksum checking. So the point is, server compromise is a real issue, but things can be arranged to make it hard to achieve and so that any compromise is detected promptly.

A major plus for server based signing is that it will allow us to have untrusted mirrors. Having mirrors routinely used by `cabal` would be a significant reliability improvement for most users. It also avoids the difficulties of using TLS. It's also worth noting that this approach is about as good as what you get with other language package systems like python.

As I say, for deeper security of particular packages, a gpg signing approach is sensible. And I agree that it makes sense to separate that signature from the package, and indeed for 3rd parties to be able to attest something about a package (like your example of automated test systems declaring that the thing at least builds and doesn't attempt to rm -rf / in the sandbox environment etc).

Some of Well-Typed's clients are interested in this and so we are interested in developing the idea further, and making sure it complements the server-based signing that is already in progress.

Duncan

Greg Weber

Jan 30, 2015, 1:17:09 PM

to Duncan Coutts, commerci...@googlegroups.com, Christopher Done

> I should also point out that server-based index signing isn't quite as bad all that when it comes to the threat of server compromise. See the description of the update framework for details. The key points though are that things can be arranged so that multiple machines must be compromised to be able to forge signatures, including non-public facing machines. The root key is stored offline so the system can be restored in the event of a key compromise. Secretly subverting existing packages can be partially mitigated against by distributed checking of the package checksums. Corruption of newly uploaded packages can also be partially mitigated against by a more complicated form of distributed checksum checking. So the point is, sever compromise is a real issue, but things can be arranged to make it hard to achieve and so that any compromise is detected promptly.

I am having trouble understanding every part of this explanation. Is there an approach here to prevent (or make more difficult) a MITM attack during the package upload step?

Simon Marlow

Feb 2, 2015, 4:54:19 AM

to Duncan Coutts, commerci...@googlegroups.com, chri...@fpcomplete.com

On 26/01/2015 22:16, Duncan Coutts wrote:
> So the approach we are taking is indeed based on server-based signing of
> the package index. We're following the design of "The update framework"
> http://theupdateframework.com/ but adapted to the existing hackage data
> formats.

I skimmed their spec, it looks very nice and general. Is there some
reason we can't use that library directly? If we included hashes of
package tarballs in the Hackage index, then used TUF to cover the
Hackage index, then we get a verified up to date set of packages, no?
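
The chain of trust in that question (a verified index that contains tarball hashes) can be sketched as below. This is an illustration only: the JSON index layout and the function names are hypothetical, no TUF library is called, and `trusted_index_digest` stands in for whatever signed metadata would vouch for the index.

```python
import hashlib
import json


def build_index(tarballs: dict) -> bytes:
    """Serialise a mapping of tarball name to SHA-256 digest (hypothetical index format)."""
    entries = {name: hashlib.sha256(data).hexdigest() for name, data in tarballs.items()}
    return json.dumps(entries, sort_keys=True).encode("utf-8")


def verify_tarball(index_bytes: bytes, trusted_index_digest: str, name: str, data: bytes) -> bool:
    """Accept a tarball only if the index is the trusted one and lists this exact content."""
    # Step 1: the index itself must match the digest vouched for by signed metadata.
    if hashlib.sha256(index_bytes).hexdigest() != trusted_index_digest:
        return False
    # Step 2: the tarball must match the hash recorded in that index.
    entries = json.loads(index_bytes)
    return entries.get(name) == hashlib.sha256(data).hexdigest()
```

With this shape, only the index needs a signature; mirrors can serve both the index and the tarballs without being trusted.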

Cheers,
Simon


Chris Done

Feb 5, 2015, 8:09:39 AM

to Arnaud Bailly, commerci...@googlegroups.com

On 26 January 2015 at 01:37, Arnaud Bailly <arn...@capital-match.com> wrote:
> I would add this is something that exists in the Java world, with
> Sonatype offering community a service to upload signed packages to "maven
> central": http://central.sonatype.org/pages/ossrh-guide.html Note that the
> process is not, in this case, completely straightforward and that it
> involves Jira to manage upload requests and Nexus to manage artifacts
> uploads. But on the upside, it ensures everything is signed, has received a
> reasonable amount of formal reviewing to ensure minimal requirements are
> met, and only allows identified users

Thanks, that makes an interesting implementation to compare against
and learn from.

Chris Done

Feb 5, 2015, 8:15:38 AM

to Nikita Karetnikov, commerci...@googlegroups.com

On 26 January 2015 at 09:12, Nikita Karetnikov <nik...@karetnikov.org> wrote:
> This version of hackage-server allows to upload an ASCII-armored OpenPGP
> signature, which is optional, while uploading a package or a package
> candidate. (If the signature is present, the download link will be
> shown in the “Downloads” list.) And my version of cabal-install only
> allows to mark each key as trusted or untrusted but doesn’t provide a
> way to verify a signature [3].

Presumably you didn't just use `gpg --verify`, which would use the
user's existing trusted key infrastructure, in the interest of
portability?

> I stopped working on this project due to the lack of time and funding,
> so it’s still at the preliminary stage. But to keep this on topic: when
> I was looking for a way to fund this project, I was told that industrial
> users have other priorities, that is, no one wants to fund secure
> package distribution.

We (FP Complete) are interested, but with a strict focus on using
existing tooling infrastructure, as opposed to re-implementing
anything in Haskell.

Chris Done

Feb 5, 2015, 9:00:04 AM

to Duncan Coutts, commerci...@googlegroups.com

On 26 January 2015 at 23:16, Duncan Coutts <dun...@well-typed.com> wrote:
> I want to emphasise that this approach is completely complementary
> with an end-to-end gpg-signing based approach. My view is that
> ultimately we want to do both. We're never going to get every author
> to sign every package, and achieving fully automatic checking by all
> package clients is quite hard. Having some important packages signed
> is much more realistic.

I'd agree, although without educating users on the process and seeing
the uptake it's difficult to judge.

> I'm not quite sure if you're proposing that cabal (or another tool)
> always check the gpg signatures or if it's just some tools like
> distros and some users who are especially concerned?

Right, I think there're two unrelated benefits. The signature itself can
be verified by e.g. simply `gpg --verify`, used by users directly or by
sysadmins preparing development environments, etc. Given a collection of
signatures, there is then the possibility to start work on automation
tools, like Cabal.
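
As a rough idea of how small a wrapper around `gpg --verify` could be, here is a sketch that shells out to an installed `gpg` for a detached signature. The function names and file paths are hypothetical. Note the limits: a zero exit status means the signature checked out against a key in the local keyring; whether that key should be trusted for this package is a separate question that this sketch does not answer.

```python
import subprocess


def gpg_verify_command(signature_path: str, tarball_path: str) -> list:
    """Build the argv for verifying a detached signature over a tarball."""
    return ["gpg", "--verify", signature_path, tarball_path]


def verify_detached(signature_path: str, tarball_path: str) -> bool:
    """Run gpg and report whether it accepted the signature."""
    result = subprocess.run(
        gpg_verify_command(signature_path, tarball_path),
        capture_output=True,  # keep gpg's diagnostics out of the caller's terminal
    )
    return result.returncode == 0
```

A sysadmin could call `verify_detached("foo-1.0.tar.gz.asc", "foo-1.0.tar.gz")` when preparing an environment and refuse to proceed on `False`.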

For automated verification, the likely scenario is for a signer to
specify in the review metadata a free text field indicating the nature
of their review. The user can then decide what to do with a signed
package based on the review. This could evolve into a sum type of
standard review types.
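
The "sum type of standard review types" might look something like the sketch below. The four kinds are taken from the reviewer roles listed in the original proposal; the names `ReviewKind` and `acceptable`, and the idea of a per-user set of required kinds, are hypothetical. The reviews passed in are assumed to have had their signatures verified already.

```python
from enum import Enum


class ReviewKind(Enum):
    AUTHOR = "author"                      # "I wrote this package"
    DIFF_REVIEW = "diff-review"            # reviewed the diff of the release
    BUILDS_AND_TESTS = "builds-and-tests"  # automated build and test pass
    AUDIT = "audit"                        # business-level audit


def acceptable(reviews, required) -> bool:
    """Return True if every required kind appears among the verified reviews.

    `reviews` is an iterable of (signer, ReviewKind) pairs;
    `required` is a set of ReviewKind values chosen by the user.
    """
    kinds = {kind for _signer, kind in reviews}
    return set(required) <= kinds
```

A fixed set of kinds is what would let a tool apply a policy automatically, instead of asking the user to read a free-text field for every package.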

> I should also point out that server-based index signing isn't quite as
> bad all that when it comes to the threat of server compromise. See the
> description of the update framework for details. The key points though
> are that things can be arranged so that multiple machines must be
> compromised to be able to forge signatures, including non-public
> facing machines. The root key is stored offline so the system can be
> restored in the event of a key compromise. Secretly subverting
> existing packages can be partially mitigated against by distributed
> checking of the package checksums. Corruption of newly uploaded
> packages can also be partially mitigated against by a more complicated
> form of distributed checksum checking. So the point is, sever
> compromise is a real issue, but things can be arranged to make it hard
> to achieve and so that any compromise is detected promptly.

That's reasonable with the caveat of trusting that it has been
implemented properly.

> Some of Well-Typed's clients are interested in this and so we are interested
> in developing the idea further, and making sure it complements the
> server-based signing that is already in progress.

Sounds good. :-)

Chris Done

Feb 5, 2015, 9:09:45 AM

to commerci...@googlegroups.com

We're going to move ahead with this proposal, barring any serious objections.

I've moved this proposal to the wiki:
https://github.com/commercialhaskell/commercialhaskell/wiki/Package-signing-proposal

We'll flesh it out more as areas solidify until it evolves from a
proposal to a description of what has been implemented.

The next steps:

1. I will do some research regarding Windows and OS X tooling, and
write a document detailing for end users the end-to-end process of
using standard tools to sign a package and share the signature, trust
someone's key and maintain a web of trust, and verify a signed
package. I'll put this on the wiki and notify the group when it's
available. I will probably request that some of you try this out and
then discuss it.
2. After that, it will be a case of deciding where to put the keys. This
will require discussion.
3. Coming up with a story for automation. This will require discussion.

Ciao!

Nikita Karetnikov

Feb 5, 2015, 10:03:54 AM

to chri...@fpcomplete.com, commerci...@googlegroups.com

> Presumably you didn't just use `gpg --verify`, which would use the
> user's existing trusted key infrastructure, in the interest of
> portability?

Yes, see
http://article.gmane.org/gmane.comp.lang.haskell.cabal.devel/9891

Duncan Coutts

Feb 6, 2015, 12:23:38 PM

to Greg Weber, commerci...@googlegroups.com, Christopher Done

In the most straightforward implementation, no, there's no MITM
protection on the upload side, just the download side.

To do better on that side, there are a couple of things we can do:

We already use digest auth for upload, but don't currently use digest
auth's support for protecting the integrity of the body. Doing that
would mean that a MITM would have to compromise the user's password to
succeed.
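
For reference, this is how body integrity enters the digest calculation in HTTP digest authentication as specified in RFC 2617 (`qop=auth-int`). The sketch assumes the MD5-based scheme from that RFC; the thread does not say how Hackage is configured, and the function names are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def digest_response_auth_int(username: str, realm: str, password: str,
                             method: str, uri: str, nonce: str,
                             nc: str, cnonce: str, body: bytes) -> str:
    """Compute the RFC 2617 request digest for qop=auth-int."""
    ha1 = md5_hex(f"{username}:{realm}:{password}".encode("utf-8"))
    # With auth-int, a hash of the entity body is folded into HA2, so
    # changing the uploaded bytes changes the expected response.
    ha2 = md5_hex(f"{method}:{uri}:{md5_hex(body)}".encode("utf-8"))
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:auth-int:{ha2}".encode("utf-8"))
```

Since the response depends on both the password and the body, someone in the middle who alters the upload cannot produce a matching response without knowing the password.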

What I was referring to above with distributed checksums was actually
aimed not at MITM for upload, but with the case where the server itself
is compromised, and detecting that the new owner of the server is
modifying new packages. The idea there would be that the cabal upload
tool would upload to the main server and additionally upload the
expected checksum to a number of other monitoring servers. That way, if
the main server is compromised but the other monitoring servers are not,
then any modifications performed by the main server would be detectable.
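
The monitoring idea can be sketched as a comparison between what the main server serves and what the uploader reported to the monitors at upload time. The monitor names, the shape of `monitor_reports` and the function name are hypothetical; how reports are collected is left out.

```python
import hashlib


def disagreeing_monitors(served_tarball: bytes, monitor_reports: dict) -> list:
    """Return the monitors whose recorded checksum differs from the served tarball.

    `monitor_reports` maps a monitor name to the hex SHA-256 digest that the
    upload tool sent to that monitor when the package was uploaded.
    """
    served_digest = hashlib.sha256(served_tarball).hexdigest()
    return sorted(
        name for name, digest in monitor_reports.items() if digest != served_digest
    )
```

An empty result means every monitor agrees with the main server; any non-empty result is evidence that the package changed after upload.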

Neither of these things is suggested by the update framework. Austin
and I went through their papers and docs again at the weekend and
noticed something we had not fully appreciated before which is about the
role of the "target" key, and delegated target keys. The target key is
for signing the content of packages, but it's only useful in an
automated setting if it does not live on the same server as the snapshot
or timestamp keys, otherwise they all get compromised at the same time.
(In some systems, like distros they can have private internal servers
where they do package signing, so it can make sense in some systems).
The more interesting thing is delegated target keys. This is where the
main target key delegates to a set of other keys for particular sets of
packages. It's best illustrated by their experimental integration with
Python's pip system, which is very much like hackage. In their design
for pip, they have a target key per developer and delegate from the
central target key to each developer key for signing specific packages.
The main target key is kept offline, or at least not on the main server.
So then this amounts to end to end signing of the individual packages,
along with keeping track of who is allowed to sign which package. The
central snapshot key that signs the index is still important for
consistency and freshness (preventing mix and match, and replay or
freeze attacks), but if the server is compromised then the integrity of
individual packages is at least still maintained.

So that is a direction that we may wish to pursue after the basic index
signing. And that direction would actually overlap significantly with
the gpg-based signing that Chris is discussing.

--
Duncan Coutts, Haskell Consultant
Well-Typed LLP, http://www.well-typed.com/

Duncan Coutts

Feb 6, 2015, 12:53:05 PM

to chri...@fpcomplete.com, commerci...@googlegroups.com

This is not an objection, but a suggestion for a plausible alternative
that might be easier to implement and cover most of the same use cases.

Austin and I were reviewing the documentation and papers for [The Update
Framework] in preparation for implementing hackage index signing and we
noticed a feature of "TUF" that we had not previously fully appreciated.

[The Update Framework]: http://theupdateframework.com/

TUF can be used with a so-called "target" role, which can delegate to
many individual target keys for package signing. It's best illustrated
with their example of applying TUF to python's package system (which is
very similar to ours). In their use of TUF, they have a delegated target
key for each developer, and the developers use them to sign their
packages before uploading to the central server. The central target key
is used to manage the delegation, ie which dev's keys are allowed to
sign which packages. All the signatures are checked on the client, using
the same signature algorithms etc as is used for the signed index. So
overall this gives end to end protection for individual packages.

It's worth understanding the role of the various keys in TUF, and what
attacks they prevent:

* delegated target keys: end-to-end protection of the integrity of
individual packages. Prevents packages being modified between
the author and user.
* central target key: who is allowed to sign individual packages,
ie authorisation to upload. Prevents signed packages being
uploaded by the wrong person.
* snapshot/timestamp key: consistency and freshness of the whole
collection of packages. Prevents an attacker substituting old
known-vulnerable (but nevertheless signed) packages for new,
preventing updates for one or more specific packages (so called
mix and match attacks), and overall freshness of the info
(preventing users from getting the latest versions, forcing them
to continue to use old known-vulnerable versions).
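
The freshness and consistency role of the snapshot/timestamp key in the list above boils down to two client-side checks on metadata whose signature has already been verified. The sketch below is a simplification with hypothetical field names; it is not TUF's metadata format, and times are plain Unix timestamps.

```python
def accept_snapshot(new_version: int, expires_at: int,
                    last_seen_version: int, now: int) -> bool:
    """Decide whether already-verified snapshot metadata is fresh enough to use."""
    # Expired metadata may be an old copy replayed by an attacker to keep
    # the client on outdated package versions (a freeze attack).
    if now >= expires_at:
        return False
    # A version lower than one already seen indicates a rollback or replay.
    if new_version < last_seen_version:
        return False
    return True
```

The client needs to remember only the highest version it has accepted, which is what makes this check cheap to automate.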

It's also worth noting that gpg-signing of individual packages only
gives the first bullet point above. Also, on the topic of the second
bullet point, I have not seen any discussion so far in the gpg-based
proposal about how keys used to sign packages should be interpreted, and
if this can be done automatically or not. We can delegate everything to
the end user's gpg keychain, but this doesn't help automate anything,
and doesn't match up with the known information about who is allowed to
upload which packages. On the other hand, the gpg-based approach lets
you express much more than "this user is authorised to upload this
package", with things like "I as a 3rd party attest that this package
doesn't obviously rm -rf on my VM". But that extra expressiveness also
highlights the challenge of automating the verification step.

If we were to do a full implementation of the TUF approach (using the
target key) we would get the end-to-end signing that we all want, and we
would get a clear story on which keys are authorised to do what, and a
clear story on the automatic verification by the client. We would also
get the general advantages of TUF, that it is a model that a bunch of
academics and other experts have spent considerable time researching,
with a threat model and a number of different attacks considered. With
the gpg-based signing as proposed we miss out on protection for a number
of attacks.

What we would miss out on compared to the gpg-based approach under
discussion is this ability for 3rd parties to add signatures indicating
extra properties other than authorship. While that's nice, it's unclear
to me that many people in this discussion rate that feature as critical.

So more concretely, my suggestion is that in addition to the index
signing parts of TUF that Well-Typed are already working on for the IHG,
that we (the wider we) consider making use of the "target" role in TUF
and extend the implementation to do full end to end signing.

Duncan Coutts

Feb 6, 2015, 1:10:24 PM

to chri...@fpcomplete.com, commerci...@googlegroups.com

On Thu, 2015-02-05 at 14:59 +0100, Chris Done wrote:
> On 26 January 2015 at 23:16, Duncan Coutts <dun...@well-typed.com> wrote:

> > I'm not quite sure if you're proposing that cabal (or another tool)
> > always check the gpg signatures or if it's just some tools like
> > distros and some users who are especially concerned?
>
> Right, I think there're two unrelated benefits. The signature itself can
> be verified by e.g. simply `gpg --verify`, used by users directly or by
> sysadmins preparing development environments, etc. Given a collection of
> signatures, there is then the possibility to start work on automation
> tools, like Cabal.
>
> For automated verification, the likely scenario is for a signer to
> specify in the review metadata a free text field indicating the nature
> of their review. The user can then decide what to do with a signed
> package based on the review. This could evolve into a sum type of
> standard review types.

If I as an end user have to read and check each one, then it doesn't
sound like useful automation for me as an end user. If I were a sysadmin
preparing the release of an important product, then maybe, but as an end
user I'm just going to be trained to say 'y', to everything.

Consider that an attacker can also sign packages and legitimately upload
them to hackage, and then when they're feeling malicious (and manage to
bypass the server's checks on who's allowed to upload what, e.g. by
compromising the server, or conducting a MITM attack between the server
and end-user), they can include their signed version of 'containers' or
whatever, and unless you're really vigilant you're not going to notice
that this package was signed by someone who's not really authorised to
do so, and you'll hit 'y', like all the others.

The end user does not have available to them at their fingertips the
info about who is allowed to upload each package, so asking them to
confirm that isn't very helpful. An extremely dedicated user like you or
Michael can (and I know Michael used to) keep track of who is allowed to
upload each package for a limited set of curated packages, and so you
could automate a degree further. But for most users unless this
information is maintained, distributed and used automatically it's not
going to help all that much.

The TUF approach does allow that to be fully automatic, because the
central target key is used to sign delegation info, which says which
keys are valid for which packages.
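
The automatic check that delegation info enables can be sketched as a simple lookup. The table layout, the key names and the function name are hypothetical; in a TUF-style system the delegation table would itself be signed by the central target key, which this sketch assumes has already been verified.

```python
def authorised(delegations: dict, package: str, key_fingerprint: str) -> bool:
    """Return True if the key is delegated to sign releases of the package."""
    return key_fingerprint in delegations.get(package, frozenset())


# Hypothetical delegation table: package name -> set of allowed key fingerprints.
DELEGATIONS = {
    "containers": frozenset({"KEY-MAINTAINER"}),
    "some-other-package": frozenset({"KEY-ATTACKER"}),
}
```

In the scenario described earlier in this message, `authorised(DELEGATIONS, "containers", "KEY-ATTACKER")` is `False` even though the attacker's key is a perfectly valid uploader key for another package, so the client can reject the release without asking the user anything.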

All that said, I do appreciate that for extra 3rd party reviews, such a
system would be nice. I just think it's a lot less helpful than it could
be for the basic task of package authenticity, which can and should be
fully automated for end users.

Greg Weber

Feb 7, 2015, 2:20:43 PM2/7/15

to Duncan Coutts, Christopher Done, commerci...@googlegroups.com

Hi all,

I added a proposal for improving package security by integrating with SCMs

https://github.com/commercialhaskell/commercialhaskell/blob/master/proposal/cabal-scm-integration.md

Note that instead of using the wiki, I created a proposal/ directory and added a markdown file there.

Because github supports markdown editing in the browser, one gets nearly all the benefits of the wiki.

The problem with wikis is that they don't provide a good way to discuss changes.

In the case of a proposal, most will hesitate to edit it (except for the author) and instead will attempt to contact the author about changes without the ease of a pull request workflow.

In short, I recommend *not* using the wiki for proposals and using markdown files instead and sending pull requests for non-minor changes.

This does not preclude discussing the proposal on the mailing list. I am including the full text of the proposal below.

Greg Weber

Proposal: cabal SCM integration

I like how we have 2 proposals that are attempting to provide end-to-end package security.

Vincent and I were brainstorming about what benefits are possible from integrating with widely used source code management systems (SCM).

SCM integration has the potential to

* provide a direct and secure package transport mechanism

* add additional security to signed packages

* add end-to-end security to unsigned packages

* make local forks practical to deal with in a secure way and easier to install

This ends up having a lot in common with the package signing proposal in terms of decentralization, but also some orthogonal and complementary capabilities.

One way to think about package distribution is by asking: what happens if we get rid of a package intermediary like Hackage?

# Completely De-centralized

I search for a package on google, find its Github page, and look at the releases.

I can use git to securely checkout the repo at the release tag and vendor the sources.

Unfortunately I need to manually resolve dependencies and version issues.

# Secure transport

But let's note that I have accomplished secure transport via git.

Other programming language installers such as `go get` and Ruby's `Gemfile` have SCM integration.

When you use `go get`, it will demand that you install `git` or `hg`, etc, so the user is responsible for having a tool installed that accomplishes secure transport.

# An essential workflow

The SCM workflow is already used today for local forks, except that we often find the SCM url on hackage and may have problems determining the SCM commit of the release version. This situation cannot change under the existing proposals because there is no guarantee that there will be a commit matching the tarball. So the only secure local workflow under the signing proposals is to not clone from the SCM but to unpack a tarball. This creates a lot of extra overhead to contribute back to the original repository.

# A trusted workflow

When an author puts the url of the package on Hackage it is an implicit statement of trust in Github knowing that users may create a local fork from it. The package author already trusts Github for their normal development workflow of pushing their repository. It is not completely blind trust: the distributed nature of git means there are some checks against tampering and the collaborative nature of github means there is visibility into some forms of third party attacks.

# Bring back centralization!

We have some big problems with finding and resolving dependencies in the SCM workflow.

So we need central locations where package information can be published.

The published SCM information (in addition to what is required in a cabal file) is

* URI: location to fetch package source

* SCM (git, mercurial, darcs): how to fetch package and move to the commit

* release commit: combine with URI to obtain the source code

* Optional: subdirectory to find package within the repo
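The published fields listed above could be modelled as a small record. This is only a sketch: the type and field names are hypothetical, and the git argument lists are one possible way to "fetch and move to the commit" (only git is covered; `pkg-src` is an arbitrary target directory).

```haskell
-- | SCMs named in the proposal.
data Scm = Git | Mercurial | Darcs
  deriving (Show, Eq)

-- | SCM information published in addition to the cabal file.
-- Field names are hypothetical; the proposal only lists the concepts.
data ScmInfo = ScmInfo
  { scmUri        :: String          -- ^ location to fetch package source
  , scmKind       :: Scm             -- ^ how to fetch and move to the commit
  , releaseCommit :: String          -- ^ release commit (or tag)
  , scmSubdir     :: Maybe FilePath  -- ^ optional subdirectory of the package
  } deriving (Show, Eq)

-- | Argument lists for two `git` invocations that would obtain the
-- release source. Other SCMs are not sketched and yield Nothing.
gitFetchArgs :: ScmInfo -> Maybe [[String]]
gitFetchArgs info
  | scmKind info == Git =
      Just [ ["clone", scmUri info, "pkg-src"]
           , ["-C", "pkg-src", "checkout", releaseCommit info]
           ]
  | otherwise = Nothing
```

Keeping the arguments as lists rather than one shell string avoids quoting problems if they are later passed to a process-spawning function.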

# Signing proposal weaknesses

## Security - compromised private key

Done properly, signing leaves one major attack point: the private key of the key pair. Adding the SCM attributes to the package data makes attacks much harder. We can verify the package contents against what is in the SCM. The attacker must also compromise something at the package SCM url. In the case of Github, Github is unlikely to be completely compromised. A more likely scenario is for an attacker to gain privilege to commit to a repo. In that case, the attacking commit will be very visible.

## Usage or lack thereof

What if package authors don't want to bother signing their package? They don't want to install PGP, but they most certainly use an SCM. Assuming we can auto-detect the SCM, the sdist step is now just:

cabal sdist v0.4.3

The command will verify that the tag exists in the local repo and in the project URI specified in the SCM information.
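One way the tag check could work for git, sketched under assumptions: the tool runs `git ls-remote --tags` once against the local repo and once against the project URI, and compares what the tag points to. The function names are hypothetical, and only the parsing and comparison are shown, not the process invocation.

```haskell
import Data.Maybe (listToMaybe)

-- | Find what a tag points to in `git ls-remote --tags` output,
-- where each line has the form "<sha>\trefs/tags/<name>".
lookupTag :: String -> String -> Maybe String
lookupTag tag out = listToMaybe
  [ sha
  | l <- lines out
  , (sha, '\t' : ref) <- [break (== '\t') l]  -- malformed lines are skipped
  , ref == "refs/tags/" ++ tag
  ]

-- | Accept the release only if the tag exists on both sides and
-- both sides agree on the object it names.
tagMatches :: String -> String -> String -> Bool
tagMatches tag localOut remoteOut =
  case (lookupTag tag localOut, lookupTag tag remoteOut) of
    (Just a, Just b) -> a == b
    _                -> False
```

A missing tag on either side, or a tag that names different objects locally and remotely, makes `tagMatches` return `False`, so the release would be refused.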

If we can get the author to securely log in to Hackage and enter the project URI, things become more difficult to hack because we can verify information against what is in the SCM. If the SCM or package directory information is changed, the SCM fetch is not going to work. Changing the SCM commit and the cabal version gives an attacker only one easy attack: trying to convince us to use an older version of the package, not one that they created (because we can verify that against what is in the SCM).

So we need a secure client to tell Hackage about a URI. We all have one: a browser. Creating a new package means logging in, writing down a package name, an SCM, and a URI. Releasing a package is a later event that requires giving Hackage a cabal version and an SCM commit revision. Even if this meta-data release is insecure we can check the data against the SCM.

# Security weaknesses

A big weakness of the SCM approach is that we must trust the URI. If Github is hacked, we are hacked. SCM mirrors, a sha of package contents, and the difficulty of faking a git sha can prevent tampering with *previous* versions. Publishing *new* versions from a compromised SCM repo can be protected against by

* authenticating the submission of a cabal version and a commit revision

* specifying multiple SCM urls that must all be pushed to (this will require every url to be compromised with write access).

## Centralization

Another weakness is the centralized distribution of package information: Hackage could be compromised. We can solve this by having multiple Hackages. But I don't really mean Hackage, I just mean a simple meta-data repository with API access. These servers can be running in multiple locations. When you release a package, you push your information to multiple servers (potentially insecurely, although securely giving the URI for a package would be required). When you download your package, you check package information from multiple servers and verify that they are all the same.
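The "check multiple servers and verify that they are all the same" step might look like the following sketch. The `Release` record and its fields are hypothetical stand-ins for whatever the meta-data servers return. Requiring unanimity is taken from the text above; a quorum rule would be a different design.

```haskell
-- | Release metadata as reported by one server (hypothetical fields).
data Release = Release
  { relVersion :: String
  , relCommit  :: String
  } deriving (Show, Eq)

-- | Accept metadata only if at least one server answered and every
-- answer is identical; otherwise report disagreement with Nothing.
agreed :: Eq a => [a] -> Maybe a
agreed []       = Nothing
agreed (x : xs)
  | all (== x) xs = Just x
  | otherwise     = Nothing
```

With unanimity, a single compromised server can block an install (by disagreeing) but cannot substitute its own commit.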

# Distribution

The signing proposals assume centralized package distribution (with mirroring). The SCM provides a secure direct transport mechanism. However, we are relying on Github being up.

## Local SCM verification

A company can operate a local Hackage by downloading the package meta-data and cloning all the SCM repos (that it uses). This local Hackage could be a server that polls for updated meta-data and gathers the latest SCM updates. A persistent update process helps avoid SCM downtime issues. In addition, this process would automatically verify that the SCM contents match the meta-data.

# SCM benefits unrelated to security

Cabal already has a command to fetch the source. If it also knew how to check out a specific release, that would make it easier to hack on an open source library you are depending on.

There is a problem with having a local copy of an open source library though. You should bump versions in your local copy, but that will just confuse cabal because it also finds version info on Hackage. I think there is an opportunity here to integrate SCM commit hash information to avoid confusion.

# Conclusion

Adding package signing is the clearest way to secure the basic package installation workflow, if we can get everyone to do it. SCM integration can complement package signing security.

SCM integration has the potential to add real end-to-end security by itself. However, the security has weaknesses unless a user can securely transmit some release metadata or change their workflow to push code to multiple locations.

SCM integration also offers

* local forks that are practical to deal with in a secure way and easier to install

* a direct package transport mechanism that is secure

Michael Snoyman

Feb 8, 2015, 3:38:40 AM2/8/15

to Greg Weber, Duncan Coutts, Christopher Done, commerci...@googlegroups.com

I think if someone has their GPG private key compromised, it's pretty likely that they could have their SSH private key compromised as well. I don't see this as a serious win for the SCM proposal.

> # Distribution
>
> The signing proposals assume centralized package distribution (with mirroring). The SCM provides a secure direct transport mechanism. However, we are relying on Github being up.

I think we're relying on a lot of other things, like everyone agreeing on a few core SCMs that support signing, and hosting their code on trusted servers. It seems like we're drastically increasing the attack vectors with this approach.

Maybe I'm simply not "getting it," but adding SCM now seems like a major hurdle of technologies to get working correctly (Git, Mercurial, and Darcs support at the minimum). I'd rather focus on a more standard solution, which is what I think Chris's approach is taking. (I discussed his proposal with him before he made it, but I'll admit that I haven't been following this discussion closely since then due to recent travel.)

Michael

Greg Weber

Feb 8, 2015, 2:36:49 PM2/8/15

to Michael Snoyman, Duncan Coutts, Christopher Done, commerci...@googlegroups.com

On the whole, I tend to agree that signing is better. This is more of a brainstorm. But I don't know why you would think SCM integration is difficult. There is already a `cabal get` command that does the fetching of a repo part. Vincent and I will probably start looking into adding some more SCM integration functionality to cabal since most aspects are fairly trivial.

Michael

Greg Weber

Feb 10, 2015, 12:54:04 PM2/10/15

to Michael Snoyman, Duncan Coutts, Christopher Done, commerci...@googlegroups.com

On Sun, Feb 8, 2015 at 12:38 AM, Michael Snoyman <mic...@snoyman.com> wrote:

> On Sat Feb 07 2015 at 9:20:45 PM Greg Weber <gr...@gregweber.info> wrote:
>
>> # Signing proposal weaknesses
>>
>> ## Security - compromised private key
>>
>> Done properly, signing leaves one major attack point: the private key of the key pair. Adding the SCM attributes to the package data makes attacks much harder. We can verify the package contents against what is in the SCM. The attacker must also compromise something at the package SCM url. In the case of Github, Github is unlikely to be completely compromised. A more likely scenario is for an attacker to gain privilege to commit to a repo. In that case, the attacking commit will be very visible.
>
> I think if someone has their GPG private key compromised, it's pretty likely that they could have their SSH private key compromised as well. I don't see this as a serious win for the SCM proposal.

That is a good point, but let's think about the issues in this area a little more.

The first question is: do you have a revocation key for your PGP key? If not, SSH keys are always revocable.

Another aspect is the visibility of github commits. We could probably achieve something similar in the PGP system by sending notifications to the author's e-mail address whenever a package (metadata) upload occurs.

Another aspect is not a stolen key, but imitating someone (the web of trust issue). With SCM integration we have the option of using github to help validate trust.

> I think we're relying on a lot of other things, like everyone agreeing on a few core SCMs that support signing, and hosting their code on trusted servers. It seems like we're drastically increasing the attack vectors with this approach.

In general I agree, although I want to be clear that there is no reliance on an SCM for signing. An SCM like git with a sha as opposed to an SVN auto-increment is definitely useful though.
