Code Signing: Establishing Trust

Code Signing: Establishing Trust hyperthunk 3/11/12 4:02 PM
Something occurred to me as I was pondering code signing yesterday. We are assuming that the user is going to add their public key to the index at some point, and each time they publish a new artefact, add folders and config/metadata files to the index pointing to the artefact download URL. This is a problem.

Let's assume that we have an Erlware index published on github under the Erlware github organisation (or user). How do we authenticate a user who is trying to add things to the Erlware index? I would assume that the only sane way to do this would be based on their github username (and either password or ssh key) but this isn't of much practical use at all. In order to verify that the person trying to write to the index is really 'hyperthunk@github' we would have to either

1. add 'hyperthunk' to the list of collaborators for the repo (or organisation)
2. have some kind of private authentication data stored somewhere that the client application can check against

Neither of those is a good idea IMO. Option (2) sounds unworkable in practice and option (1) requires us to have some kind of registration mechanism. I think this kind of 'manual process' is what puts most people off using CEAN and we should avoid it. But without a manual process, there is no practical way for a person to actually register themselves (as a valid committer) or submit their public key! Even if we could overcome that old chestnut, there's also the problem that using github to authenticate a user would effectively give them commit rights to the whole index (git repository) which is not 'trustworthy' behaviour from the other user's point of view. AFAICT this model is broken.

Instead, I think the assertion Eric has made about there being 'many indexes' is absolutely bang on. We provide a command line API to create a new index for yourself and associate it with your github account. It sets up the git repository locally with all the right stuff, uses the github API - 'POST /user/repos' (or 'POST /orgs/:org/repos' for an organisation) - to create the remote repository and configure it against the local one, and then you run local commands against your repository and sync when you're done. Assuming the 'base' application from which the others can be called is named after the Pointy Haired Boss in Dilbert (this is a joke BTW), we get something like

$ phb index-create --org=nebularis --user=hyperthunk
Please enter your github password and press enter: ***
Local index [org=nebularis] created successfully!

I'm assuming we need to store (locally) some configuration so we know where the private key for your repo resides on this machine, so this next command configures a given key for a particular organisation.

$ phb pki-configure --org=nebularis --keyfile=/usr/local/protected/keys/mykey --id=mykey
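As a sketch of what that local configuration might look like (the `~/.phb/config.json` path and the JSON layout are assumptions for illustration, not a proposed format):

```python
import json
from pathlib import Path

# Hypothetical local config store mapping an organisation to its signing key.
CONFIG_PATH = Path.home() / ".phb" / "config.json"

def configure_key(org, keyfile, key_id, path=CONFIG_PATH):
    """Record which private key signs artefacts for a given organisation."""
    path.parent.mkdir(parents=True, exist_ok=True)
    config = json.loads(path.read_text()) if path.exists() else {}
    config.setdefault("pki", {})[org] = {"keyfile": str(keyfile), "id": key_id}
    path.write_text(json.dumps(config, indent=2))
    return config

def lookup_key(org, path=CONFIG_PATH):
    """Find the key entry for an organisation, or None if unconfigured."""
    if not path.exists():
        return None
    return json.loads(path.read_text()).get("pki", {}).get(org)
```

The packaging and publish commands could then look the key up by organisation instead of taking it on every invocation.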

This one simply adds the public key:

$ phb index-add --org=nebularis --artefact-type=public_key --artefact-location=file:///usr/local/protected/keys/mykey.pub

Assuming we've already built and assembled an artefact, this next command uses the id from earlier to find the right key to do the signing - I'm assuming the packager does the signing because the assembly needs to be finalised before you've got a consistent file system structure upon which to build the hash that will be getting signed. In this command, 'artefact-type' refers to the packaging type:

$ phb artefact-package /path/to/project --artefact-type=application --code-signing=mykey

I'm assuming that the packaging step will hash all the relevant inputs (i.e., the contents of the file system for the assembled project) and sign it with the private key, then put that into the relevant place (e.g., somewhere below ./priv maybe) before zipping everything up. And assuming that we've got all the info we need in the output .zip file from the previous step:
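The deterministic hashing step that paragraph assumes might be sketched like this (the walk order and hash construction are illustrative; signing the resulting digest with the configured private key is elided):

```python
import hashlib
from pathlib import Path

def tree_digest(root):
    """Hash every file under `root` in a stable (sorted) order so the same
    assembled artefact always yields the same digest. This digest is what
    would then be signed with the private key (the asymmetric signing step
    itself is elided from this sketch)."""
    h = hashlib.sha256()
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Bind the relative file name into the hash as well as its
            # contents, so renames also change the digest.
            h.update(path.relative_to(root).as_posix().encode("utf-8"))
            h.update(path.read_bytes())
    return h.hexdigest()
```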

$ phb publish artefact --org=nebularis --file=file:///path/to/project/target/myproject.zip

And then we can synchronise this local repository, pushing the changes to github so that everyone else can access the index. You'll note that instead of putting --artefact-location=<url> I just added the file to the index. This was quite deliberate.

Thinking about it, this kind of nice simple repository management API means that it's easy to set up your own organisation/company index, add your PKI stuff and publish metadata to it locally (meaning it will resolve locally whilst you're in 'development' mode on your local machine!) and then synchronise with github when you're ready to actually publish all your changes. You can also easily pull other people's changes, either by doing a 'git pull' and merging by hand, or just relying on the tool to get the merging done for you. Based on this, I would expect to see basho, rabbitmq, yaws, erlware, etc - all organisations who are using the tool for dependency management - having their own index. This makes perfect sense:

- because it's your index, you can read + write to it without any authentication problems
- everybody knows they can trust it, because it's your repository and github will only let you (and your selected collaborators) write to it
- therefore everybody who trusts you (i.e., your code) will trust your index

The first time someone wants to install riak, they simply add the basho index to their whitelist and they're ready to roll. The tool downloads the basho index and uses it (along with any other supported indexes) to resolve dependencies. This approach also makes third party signing easy:

1. you build someone else's stuff
2. you package and sign it yourself
3. you publish it to *your* index, with some additional metadata (i.e., in a special 'third-party' sub directory) to indicate it's originally from another organisation

And now for my final point. If you're with me so far, I think this might go down quite comfortably - it's been a really long weekend though so bear with me....

Based on my assumptions that

- nobody will trust our indexes unless we're able to guarantee who is writing to them, therefore
- we're not in a position to let arbitrary users write to our (erlware or nebularis) indexes, but
- it should be really simple for users to manage and publish their own index if it's based on git

Now everyone who is publishing packages has to write data back to github for their index. A whole team can get added to a single repo (or organisation/team) on github so this should work brilliantly. But then why should there be some other, additional way to manage uploading the physical artefacts themselves? Surely it would make a lot more sense if, when you did 'phb publish artefact ...', the tool simply copied the binary into place next to the index. Now the index has gone back to being a repository, but instead of a large repository pointing to lots of little downloads, you've got lots of smaller self-contained repositories.
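A sketch of that publish step, assuming an /indexes/<org>/artefacts/<name>/<version>/ layout (the directory convention is an assumption at this point):

```python
import shutil
from pathlib import Path

def publish_artefact(indexes_root, org, name, version, artefact):
    """Copy a packaged artefact into place next to the organisation's
    index, so publishing remotely is nothing more than syncing the repo.
    The artefacts/<name>/<version>/ convention is an assumption here."""
    artefact = Path(artefact)
    dest = Path(indexes_root) / org / "artefacts" / name / version
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / artefact.name
    shutil.copy2(artefact, target)  # the index repo now carries the binary
    return target
```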

I think this has a couple of advantages:

1. if you trust the github user who published the code (which you do if you're using their project sources anyway) then you'll trust their repository/index.
2. there is no need for any 'local repository' - more on this in a moment
3. publication is a lot simpler, both for implementors and those who want to understand the process
4. checking to see if my local copy of the basho-public-releases index is up to date is as simple as checking (via the github API) for the most recent commit SHA
5. if you *want* to manually manage your own indexes (or the artefacts section) then the git workflow provides you with all the power you could possibly want
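Point (4) could be as small as this (the commits endpoint is the standard GitHub v3 API; the injectable fetch is only there to keep the sketch testable offline):

```python
import json
from urllib.request import Request, urlopen

def latest_commit_url(owner, repo):
    # Standard GitHub v3 endpoint: newest commit first, one result is enough.
    return "https://api.github.com/repos/%s/%s/commits?per_page=1" % (owner, repo)

def index_is_stale(local_sha, owner, repo, fetch=None):
    """Compare the SHA recorded at the last index sync against the most
    recent commit on the remote repository."""
    if fetch is None:
        def fetch(url):
            req = Request(url, headers={"Accept": "application/vnd.github.v3+json"})
            with urlopen(req) as resp:
                return json.load(resp)
    commits = fetch(latest_commit_url(owner, repo))
    return commits[0]["sha"] != local_sha
```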

On point (2), what I'm thinking is that the index should just be a flat file. When a user adds a repository/index to the (local) whitelist, the tool simply fetches the index (file) locally and puts it into a new folder based on the organisation name. You get something like:

/indexes
  /erlware/index.meta
  /basho/index.meta
  /nebularis/index.meta
  /rabbitmq/index.meta
  /esl/index.meta
 
Of course it is easy to build a large top level index.meta if you want that kind of optimisation, so as to avoid merging all the metadata each time the tool runs. This can be stored in the root of ./indexes and rebuilt whenever the whitelist changes. Now how does this avoid having a local repository for the binaries? Well I think that when you choose to download an artefact, the full path of the artefact in the remote (git repository) should be inserted into the local copy of the index:
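A sketch of rebuilding that top-level index.meta (treating each per-org index.meta as opaque text, since the format is still open; the '# org:' separators are an assumption):

```python
from pathlib import Path

def rebuild_master_index(indexes_root):
    """Concatenate every whitelisted organisation's index.meta into a
    single top-level file so the tool reads one file per run."""
    root = Path(indexes_root)
    parts = []
    for meta in sorted(root.glob("*/index.meta")):
        parts.append("# org: %s\n" % meta.parent.name)
        text = meta.read_text()
        parts.append(text if text.endswith("\n") else text + "\n")
    merged = "".join(parts)
    (root / "index.meta").write_text(merged)
    return merged
```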

/indexes
  /erlware
    index.meta
    /artefacts
      /erlware_commons
        /1.0.2/erlware_commons-1.0.2.ez
  /basho/index.meta
  etc....

This is pretty useful from a discovery perspective, but it's also nice to have that mirror consistency because it means the same logical functionality will work for local publishing and remote fetching/installing. Remote publishing is simply the act of synchronising your repository, and this leads me onto what I think is another good aspect of this approach.

Your own personal local repository is handled in exactly the same way as every other whitelisted (remote) one. Its index.meta resides in the same place, its artefacts follow the same directory structure conventions - the only real difference is that for your own local copy of a published repository, you have *all* the artefacts 'in place' already, so no fetching is required - nor should fetching be possible in this case, as the correct response to 'phb index-update --org=nebularis' would be to do a 'git pull' or some such. This will actually be a boon for teams who're working on specific bound versions of applications (and all their dependencies) because they'll already have everything installed in the right place and, with the right build tool integration, a successful local build can republish (or publish via a build increment when binding with something like '>= 1.0' so that 1.0.1 and 1.0.2 are both valid and the highest is picked) to the local repository and see changes in place immediately. I might end up with something like

/indexes
  /erlware/index.meta
  /basho/index.meta
  /nebularis
    index.meta
    /artefacts
      /third-party
        /esl/parse_trans/1.0/parse_trans-1.0.ez
      /osenv
        /0.0.1/osenv-0.0.1.ez
      /erlxsl
        /0.5.9
          /64bit
            /x86
              /darwin/erlxsl-0.5.9.ez
              /linux/erlxsl-0.5.9.ez
            /ia64
              /darwin/erlxsl-0.5.9.ez
              /linux/erlxsl-0.5.9.ez
          /32bit
            /x86
              /darwin/erlxsl-0.5.9.ez
              /linux/erlxsl-0.5.9.ez
            /ia64
              /darwin/erlxsl-0.5.9.ez
              /linux/erlxsl-0.5.9.ez
  /rabbitmq/index.meta
  /esl/index.meta
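Resolving the right erlxsl binary from a layout like that could be a straightforward path construction (the mapping from platform.machine() values to the x86/ia64 directory names is an assumption for illustration):

```python
import platform

def platform_artefact_path(name, version):
    """Build the repository path for the current machine's flavour of a
    platform-specific artefact, following the <bits>/<arch>/<os> layout
    above. The machine-to-directory-name mapping is an assumption."""
    bits = "64bit" if platform.architecture()[0] == "64bit" else "32bit"
    arch = "ia64" if "ia64" in platform.machine().lower() else "x86"
    os_dir = "darwin" if platform.system() == "Darwin" else "linux"
    return "%s/%s/%s/%s/%s/%s-%s.ez" % (name, version, bits, arch, os_dir, name, version)
```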

--------------

So I would really like to get your thoughts on this. These are amongst the last outstanding questions in my mind really: establishing trust and whether the index and artefacts should really be separate at all.

Cheers,

Tim

 

Re: Code Signing: Establishing Trust Eric Merritt 3/12/12 10:03 AM
On Sun, Mar 11, 2012 at 6:02 PM, Tim Watson <watson....@gmail.com> wrote:
> Something occurred to me as I was pondering code signing yesterday. We are assuming that the user is going to add their public key to the index at some point, and each time they publish a new artefact, add folders and config/metadata files to the index pointing to the artefact download URL. This is a problem.
>
> Let's assume that we have an Erlware index published on github under the Erlware github organisation (or user). How do we authenticate a user who is trying to add things to the Erlware index? I would assume that the only sane way to do this would be based on their github username (and either password or ssh key) but this isn't much practical use at all. In order to verify that the person trying to write to the index is really 'hyperhunk@github' we would have to either

I have always assumed that one of the collaborators on the project is
going to have to review and do the merge. Well, probably at some point
it will be automated in such a way that when someone requests a pull we
will verify the signing and then automerge it.

>
> 1. add 'hyperthunk' to the list of collaborators for the repo (or organisation)
> 2. have some kind of private authentication data stored somewhere that the client application can check against

Well, I assume that the change would be signed and that's enough to do
the merge really. I don't see any reason to add them to the
collaborators or have something more complex.

> Neither of those is a good idea IMO. Option (2) sounds unworkable in practise and option (1) requires us to have some kind of registration mechanism. I think this kind of 'manual process' is what puts most people off using CEAN and we should avoid it. But without a manual process, there is no practical way for a person to actually register themselves (as a valid committer) or submit their public key! Even if we could overcome that old chestnut, there's also the problem that using github to authenticate a user would effectively give them commit rights to the whole index (git repository) which is not 'trustworthy' behaviour from the other user's point of view. AFAICT this model is broken.

Well, it starts as a manual process on the part of the collaborators
but there is no reason it would stay that way.

> Instead, I think the assertion Eric has made about their being 'many indexes' is absolutely bang on. We provide a command line API to create a new index for yourself and associate it with your github account. It sets up the git repository locally with all the right stuff, uses the github API - 'POST /<username>/repos' - to create the remote repository and configure it against the local one, and then you run local commands against your repository and sync when you're done. Assuming the 'base' application from which the others can be called is named after the Pointy Haired Boss in Dilbert (this is a joke BTW), we get something like
>
> $ phb index-create --org=nebularis --user=hyperthunk
> Please enter your github password and press enter: ***
> Local index [org=nebularis] created successfully!
>
> I'm assuming we need to store (locally) some configuration so we know where the private key for your repo resides on this machine, so this next command configures a given key for a particular organisation.
>
> $ phb pki-configure --org=nebularis --keyfile=/usr/local/protected/keys/mykey --id=mykey

Even given my statements above I think this is a good idea.

> This one simply adds the public key:
>
> $ phb index-add --org=nebularis --artefact-type=public_key --artefact-location=file:///usr/local/protected/keys/mykey.pub
>
> Assuming we've already built and assembled an artefact, this next command uses the id from earlier to find the right key to do the signing - I'm assuming the packager does the signing because the assembly needs to be finalised before you've got a consistent file system structure upon which to build the hash that will be getting signed. In this command, 'artefact-type' refers to the packaging type:
>
> $ phb artefact-package /path/to/project --artefact-type=application --code-signing=mykey
>
> I'm assuming that the packaging step will hash all the relevant inputs (i.e., the contents of the file system for the assembled project) and sign it with the private key, then put that into the relevant place (e.g., somewhere below ./priv maybe) before zipping everything up. And assuming that we've got all the info we need in the output .zip file from the previous step:

Yes, that should not be a manual thing.

> $ phb publish artefact --org=nebularis --file=file:///path/to/project/target/myproject.zip
>
> And then we can synchronise this local repository, pushing the changes to github so that everyone else can access the index. You'll note that instead of putting --artefact-location=<url> I just added the file to the index. This was quite deliberate.
>
> Thinking about it, this kind of nice simple repository management API means that it's easy to set up your own organisation/company index, add your PKI stuff and publish metadata to it locally (meaning it will resolve locally whilst you're in 'development' mode on your local machine!) and then synchronising with github when you're ready to actually publish all your changes. You can also easily pull other people's changes, either by doing a 'git pull' and merging by hand, or just relying on the tool to get the merging done for you. Based on this, I would expect to see basho, rabbitmq, yaws, erlware, etc - all organisations who are using the tool for dependency management - having their own index. This makes perfect sense:

It makes sense to me too.

> - because it's your index, you can read + write to it without any authentication problems
> - everybody knows they can trust it, because it's your repository and github will only let you (and your selected collaborators) write to it
> - therefore everybody who trusts you (i.e., your code) will trust your index
>
> The first time someone wants to install riak, they simply add the basho index to their whitelist and they're ready to roll. The tool downloads the basho index and uses it (along with any other supported indexes) to resolve dependencies. This approach also makes third party signing easy:
>
> 1. you build someone else's stuff
> 2. you package and sign it yourself
> 3. you publish it to *your* index, with some additional metadata (i.e., in a special 'third-party' sub directory) to indicate it's originally from another organisation
>
> And now for my final point. If you're with me so far, I think this might go down quite comfortably - it's been a really long weekend though so bear with me....
>
> Based on my assumptions that
>
> - nobody will trust our indexes unless we're able to guarantee who is writing to them, therefore
> - we're not in a position to let arbitrary users write to our (erlware or nebularis) indexes, but

Well, this is true, but there is also the mantra that every change is
additive and every change is signed. Not only the individual change:
the repo owner could also easily sign the repo after each change.

> - it should be really simple for users to manage and publish their own index if it's based on git

This should be the case no matter what.

> Now everyone who is publishing packages has to write data back to github for their index. A whole team can get added to a single repo (or organisation/team) on github so this should work brilliantly. But now, why should there be some other, additional way, to manage uploading the physical artefacts themselves? Surely it would make a lot more sense if when you did `phb publish artefact ...' that the tool simply copied the binary into place next to the index. Now the index has gone back to being a repository, but instead of a large repository pointing to lots of little downloads, you've got lots of smaller self contained repositories.
>
> I think this has a couple of advantages:
>
> 1. if you trust the github user of the code (which you do if you're using their project sources anyway) then you'll trust their repository/index.
> 2. there is no need for any 'local repository' - more on this in a moment
> 3. publication is a lot simpler, both for implementors and those who want to understand the process
> 4. checking to see if my local copy of the basho-public-releases index is up to date, is as simple as checking (via the github API) for the most recent commit SHA
> 5. if you *want* to manually manage your own indexes (or the artefacts section) then the git workflow provides you with all the power you could possibly want

The one big problem with this is that if the binary is in the repo, I
can't pull down just the binaries I want anymore. I have to sync the
repo which includes all binaries. I actually think that's a huge
problem.

> On point (2), what I'm thinking is that the index should just be a flat file. When a users adds a repository/index to the (local) whitelist, the tool simply fetches the index (file) locally and puts it into a new folder based on the organisation name. You get something like:
>
> /indexes
>  /erlware/index.meta
>  /basho/index.meta
>  /nebularis/index.meta
>  /rabbitmq/index.meta
>  /esl/index.meta
>
> Of course it is easy to build a large top level index.meta if you want that kind of optimisation, so as to avoid conjoining all the metadata each time the tool runs. This can be stored in the root of ./indexes and rebuilt whenever the whitelist changes. Now how does this avoid having a local repository for the binaries? Well I think that when you choose to download an artefact, the full path of the artefact in the remote (git repository) should be inserted into the local copy of the index:
>
> /indexes
>  /erlware
>    index.meta
>    /artefacts
>      /erlware_commons
>        /1.0.2/erlware_commons-1.0.2.ez
>  /basho/index.meta
>  etc....
>
> This is pretty useful from a discovery perspective, but it's also nice to have that mirror consistency because it means the same logical functionality will work for local publishing and remote fetching/installing. Remote publishing is simply the act of synchronising your repository, and this leads me onto what I think is another good aspect of this approach. Your own personal local repository is handled in exactly the same way as every other whitelisted (remote) one. It's index.meta resides in the same place, its artefacts follow the same directory structure conventions - the only real difference is that for your own local copy of a published repository, you have *all* the artefacts 'in place' already, so no fetching is required - nor should fetching be possible in this case, as the correct response to 'phb index-update --org=nebularis' would be to do a 'git pull' or some such. This will actually be a boon for teams who're working on specific bound versions of applications (and all their dependencies) because they'll already have everything installed in the right place and with the right build tool integration, a successful local build can republish (or publish via a build increment when binding with something like '>= 1.0' so that 1.0.1 and 1.0.2 are both valid and the highest is picked) to the local repository and see changes in place immediately.  I might end up with something like

I like these properties very much.

Overall I like it, with a couple of caveats.

1) I may be missing something with regards to the binaries themselves.
I got the impression from reading that they were in the repo but then
later on I got the impression that they were not (but ended up there
in the local version).

2) I am a little fuzzy on how you handle binaries published for other
organizations

> Cheers,
>
> Tim
>
>
>
> --
> You received this message because you are subscribed to the Google Groups "erlware-questions" group.
> To post to this group, send email to erlware-...@googlegroups.com.
> To unsubscribe from this group, send email to erlware-questi...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/erlware-questions?hl=en.
>

Re: Code Signing: Establishing Trust hyperthunk 3/12/12 11:14 AM
On 12 Mar 2012, at 17:03, Eric Merritt wrote:

> On Sun, Mar 11, 2012 at 6:02 PM, Tim Watson <watson....@gmail.com> wrote:
>> [...] How do we authenticate a user who is trying to add things to the Erlware index? [...]
>
> I have always assumed that one of the collaborators on the project is
> going to have to review and do the merge. Well, probably at some point
> it will be automated in such a way that when someone requests a pull we
> will verify the signing and then automerge it.

This requires having infrastructure, or being manually involved in the process all the time. I think it's a bad move.



>> 1. add 'hyperthunk' to the list of collaborators for the repo (or organisation)
>> 2. have some kind of private authentication data stored somewhere that the client application can check against
>
> Well, I assume that the change would be signed and that's enough to do
> the merge really. I don't see any reason to add them to the
> collaborators or have something more complex.

Erm no that's not true. How can we verify that they've signed something until we have their public key, and how can we verify their public key without trust?


>> [...] Even if we could overcome that old chestnut, there's also the problem that using github to authenticate a user would effectively give them commit rights to the whole index (git repository) which is not 'trustworthy' behaviour from the other user's point of view. AFAICT this model is broken.
>
> Well, it starts as a manual process on the part of the collaborators
> but there is no reason it would stay that way.


Again, the way you're describing it assumes we would have some level of infrastructure, which is something I think we're better avoiding - you said this yourself early on IIRC.

>> [...]
>>
>> I'm assuming we need to store (locally) some configuration so we know where the private key for your repo resides on this machine, so this next command configures a given key for a particular organisation.
>>
>> $ phb pki-configure --org=nebularis --keyfile=/usr/local/protected/keys/mykey --id=mykey
>
> Even given my statements above I think this is a good idea.

>> [...]
>>
>> I'm assuming that the packaging step will hash all the relevant inputs (i.e., the contents of the file system for the assembled project) and sign it with the private key, then put that into the relevant place (e.g., somewhere below ./priv maybe) before zipping everything up. [...]
>
> Yes, that should not be a manual thing.

Yes - manual as in the tool does it on the local build environment against the manually specified public key. 


>> [...] Based on this, I would expect to see basho, rabbitmq, yaws, erlware, etc - all organisations who are using the tool for dependency management - having their own index. This makes perfect sense:
>
> It makes sense to me too.

Cool.
> The one big problem with this is that if the binary is in the repo, I
> can't pull down just the binaries I want anymore. I have to sync the
> repo which includes all binaries. I actually think that's a huge
> problem.

That's not true. All blobs in a github repo are accessible using the github data API and furthermore can be downloaded via a direct HTTP link. I already do this in https://github.com/hyperthunk/remote_plugin_loader/blob/master/src/remote_plugin_loader.erl#L159, although that does assume that github won't change their API for reading raw files in the future. Either way, the github data API supports getting them by reading the tree with '/repos/:user/:repo/git/trees/:sha' (see http://developer.github.com/v3/git/trees/) and passing the SHA for the path you want to obtain to '/repos/:user/:repo/git/blobs/:sha' (see http://developer.github.com/v3/git/blobs/), so it will work fine.
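A sketch of the two calls described above (tree lookup, then blob fetch; the blobs endpoint returns base64-encoded content, and the injectable fetch is only there to keep the sketch testable offline):

```python
import base64
import json
from urllib.request import urlopen

def fetch_json(url):
    # Plain HTTP GET returning parsed JSON; both endpoints below serve JSON.
    with urlopen(url) as resp:
        return json.load(resp)

def blob_sha_for(user, repo, tree_sha, filename, fetch=fetch_json):
    """Read one level of the tree and return the blob SHA for `filename`."""
    url = "https://api.github.com/repos/%s/%s/git/trees/%s" % (user, repo, tree_sha)
    for entry in fetch(url)["tree"]:
        if entry["path"] == filename and entry["type"] == "blob":
            return entry["sha"]
    return None

def blob_bytes(user, repo, blob_sha, fetch=fetch_json):
    """Fetch a blob by SHA; the blobs API returns base64-encoded content."""
    url = "https://api.github.com/repos/%s/%s/git/blobs/%s" % (user, repo, blob_sha)
    return base64.b64decode(fetch(url)["content"])
```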
The idea is that the binaries are there in a published (remote) repo, but not initially downloaded in the local copy. Once you resolve a specific binary, it goes into the local (copy) of that organisation's repo, but not via git/sync but rather via an HTTP request. I guess the process looks something like this:

resolve d
    if d ∈ ./indexes/*.meta then    # we've resolved the org/app/version in one of our indexes 
        return local_path(d, U) if it exists
        otherwise fetch remote_path(d, U) from the internet and return local_path(d, U)

So actually we're not really using git so much for reading the data from the remotes - we use HTTP to do this for the most part. What we do use git for is the publication (i.e., writing back to remotes) and this can be done either with the github API (which completely ties us into github) or just using git. 
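The resolve step sketched above might look like this (the local layout and the remote_url_for helper are assumptions; the URL would really come from the index metadata):

```python
import urllib.request
from pathlib import Path

def resolve(org, name, version, indexes_root, remote_url_for):
    """Return the artefact from the local copy of the organisation's index
    if it's already in place; otherwise fetch it over HTTP into the same
    spot first, so local and remote copies stay mirror-consistent."""
    local = (Path(indexes_root) / org / "artefacts" / name / version
             / ("%s-%s.ez" % (name, version)))
    if local.exists():
        return local                      # already resolved locally
    local.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(remote_url_for(org, name, version)) as resp:
        local.write_bytes(resp.read())    # fetched via HTTP, not git
    return local
```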

> 2) I am a little fuzzy on how you handle binaries published for other
> organizations


Here are my thoughts at the moment:

1. you can only add third party artefacts if you've signed them - there is no third party 'unsigning' as it were
2. the binaries for third-party signed artefacts go into a sub-directory: third-party/<organisation-name>/...
3. apart from the 'third-party' and organisation-name prefix, the directory structure for third-party signed binaries is exactly the same
4. in the index.meta, you explicitly specify which artefacts you've published are third-party signed
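Point (4) might look like this as an index.meta record (field names and structure are pure illustration - the real format is still undecided):

```python
def third_party_entry(origin_org, name, version, signer_key_id):
    """Record for a third-party signed artefact: the path carries the
    third-party/<origin-org> prefix and the record flags the signing
    explicitly, per points (1)-(4) above."""
    return {
        "artefact": name,
        "version": version,
        "path": "artefacts/third-party/%s/%s/%s/%s-%s.ez"
                % (origin_org, name, version, name, version),
        "third_party": True,
        "origin": origin_org,
        "signed_by": signer_key_id,
    }
```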

Cheers,

Tim





Re: Code Signing: Establishing Trust Eric Merritt 3/13/12 9:11 AM
On Mon, Mar 12, 2012 at 1:14 PM, Tim Watson <watson....@gmail.com> wrote:
>> I have always assumed that one of the collaborators on the project is
>> going to have to review and do the merge. Well probably at some point
>> it will be automated in such a way that when someone requests a pull we
>> will verify the signing and then automerge it.
>
>
> This requires having infrastructure, or being manually involved in the
> process all the time. I think it's a bad move.
>

 I tend to agree with this for all the reasons we talked about.


>> 1. add 'hyperthunk' to the list of collaborators for the repo (or
>> organisation)
>>
>> 2. have some kind of private authentication data stored somewhere that the
>> client application can check against
>>
>>
>> Well, I assume that the change would be signed and thats enough to do
>> the merge really. I dont see any reason to add them to the
>> collaborators or have something more complex.
>
>
> Erm, no, that's not true. How can we verify that they've signed something
> until we have their public key, and how can we verify their public key
> without trust?

We don't do trust, we just provide validation of signatures. It's up
to the user to decide who to trust or not trust. In this case, once
someone makes their public key available in the repo it should always
be there. Then all we do is make sure that the user claiming to be
'foo' is using the same private key as the user who published their
public key under the name 'foo'. There is no trust involved, only
signature validation.
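In other words it's first-use key pinning plus signature checking. A toy sketch of that rule, where the `verify` callable stands in for whatever real public-key signature scheme ends up being used - only the pinning logic here is concrete:

```python
# Toy sketch of "no trust, only signature validation": the first key
# published under a name is pinned; later artefacts under that name
# must verify against that same key. `verify(pubkey, data, sig)` is a
# stand-in for a real public-key check (e.g. RSA via a crypto library).
class KeyRegistry:
    def __init__(self):
        self._keys = {}  # publisher name -> pinned public key

    def publish_key(self, name, pubkey):
        # A key, once published under a name, should always stay there.
        if name in self._keys and self._keys[name] != pubkey:
            raise ValueError(f"key for {name!r} already pinned")
        self._keys[name] = pubkey

    def validate(self, name, artefact, signature, verify):
        """True iff `signature` over `artefact` checks out against
        the key pinned for `name`."""
        pubkey = self._keys.get(name)
        return pubkey is not None and verify(pubkey, artefact, signature)
```

Note there's no judgement of *who* 'foo' is - only that today's 'foo' holds the same private key as the 'foo' who first published.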


>> Well it starts as a manual process on the part of the collaborators
>> but there is no reason it would stay that way.
>>
>
> Again, the way you're describing it assumes we would have some level of
> infrastructure, which is something I think we're better avoiding - you said
> this yourself early on IIRC.

I am still of that belief and would love not to have infrastructure if
we can get around it.

>> I'm assuming that the packaging step will hash all the relevant inputs
>> (i.e., the contents of the file system for the assembled project) and sign
>> it with the private key, then put that into the relevant place (e.g.,
>> somewhere below ./priv maybe) before zipping everything up. And assuming
>> that we've got all the info we need in the output .zip file from the
>> previous step:
>>
>>
>> yes that should not be a manual thing.
>
>
> Yes - manual as in the tool does it on the local build environment against
> the manually specified public key.

ok fair enough.
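The packaging step described in the quoted text - hash all the relevant inputs for the assembled project, then sign the digest with the private key - might be sketched like this; the walk order and digest choice are assumptions for illustration, and the actual signing call is left out:

```python
# Rough sketch of the packaging hash: one SHA-256 digest over every
# file's relative path and contents, in a stable order, so the same
# assembled tree always hashes the same. The resulting digest is what
# would then be signed with the publisher's private key.
import hashlib
import os

def project_digest(root):
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()                      # deterministic traversal
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            h.update(rel.encode())           # path is part of the input
            with open(path, "rb") as f:
                h.update(f.read())
    return h.hexdigest()
```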


[much snipping]


>
>
> That's not true. All blobs in a github repo are accessible using the github
> data API and, furthermore, can be downloaded via a direct HTTP link. I
> already do this
> in https://github.com/hyperthunk/remote_plugin_loader/blob/master/src/remote_plugin_loader.erl#L159,
> although that does assume that github won't change their API for reading raw
> files in the future. Either way, the github data API supports getting them:
> read the tree with '/repos/:user/:repo/git/trees/:sha'
> (see http://developer.github.com/v3/git/trees/) and pass the SHA for the
> path you want to obtain to '/repos/:user/:repo/git/blobs/:sha'
> (see http://developer.github.com/v3/git/blobs/), so it will work fine.

Not only that but I think we need to try to stick with basic git where
possible. Maybe not initially but in the long run. In that case, any
time you clone or fetch you pull the entire repo. I don't know if there
is a way to avoid that.

[more snipping ...]

> The idea is that the binaries are there in a published (remote) repo, but
> not initially downloaded in the local copy. Once you resolve a specific
> binary, it goes into the local (copy) of that organisation's repo, but not
> via git/sync but rather via an HTTP request. I guess the process looks
> something like this:
>
> resolve d
>     if d ∈ ./indexes/*.meta then    # we've resolved the org/app/version in
> one of our indexes
>         return local_path(d, U) if it exists
>         otherwise fetch remote_path(d, U) from the internet and return
> local_path(d, U)
>
> So actually we're not really using git so much for reading the data from the
> remotes - we use HTTP to do this for the most part. What we do use git for
> is the publication (i.e., writing back to remotes) and this can be done
> either with the github API (which completely ties us into github) or just
> using git.

As long as there is some option to do this with pure git it's
doable. The upside is mainly that it is simpler to publish?

>>
>> 2) I am a little fuzzy on how you handle binaries published for other
>> organizations
>>
>>
> Here are my thoughts at the moment:
>
> 1. you can only add third party artefacts if you've signed them - there is
> no third party 'unsigning' as it were

that's true for any change I suspect.

> 2. the binaries for third-party signed artefacts go into a sub-directory:
> third-party/<organisation-name>/...
> 3. apart from the 'third-party' and organisation name prefix, the directory
> structure for third-party signed binaries is exactly the same

I don't really like having a separate area for the third-party
stuff. The current format we have come up with should be able to
handle that actually. As long as the binaries are signed we know their
origin, so it shouldn't be a problem for them to go into the normal
/<organisation-name>/... area. We might want to come up with a naming
convention for binary packages that helps avoid conflicts there. We
could just borrow Debian's <name>-<vsn>-<publisher> or something along
those lines.
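That convention is trivial either way - a sketch, assuming version strings and publisher names themselves contain no hyphens (the app name may):

```python
# Toy sketch of the Debian-style <name>-<vsn>-<publisher> convention,
# so the same app signed by two publishers can't collide in a shared
# area. Assumes vsn and publisher are hyphen-free; the app name needn't be.
def package_name(name, vsn, publisher):
    return f"{name}-{vsn}-{publisher}"

def parse_package_name(pkg):
    # split from the right so hyphens inside the app name survive
    name, vsn, publisher = pkg.rsplit("-", 2)
    return name, vsn, publisher
```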


> 4. in the index.meta, you explicitly specify which artefacts you've
> published are 3rd party signed

Why? I am not really opposed to it, I just don't see the need.

I think I can get on board with this approach fairly easily, as long
as we can solve the repo syncing issue (pulling down all binaries). I
really like the idea of not having infrastructure and leaving it up to
folks to manage things themselves. This is very similar (though better
I think) to the whole Ubuntu PPA thing, which seems to work.