This was what I was hinting at with my security definitions. I don't think you can prevent anything without control over the merge process, unless you go completely to external tools for merging of pull requests. But it suffices to be auditable at the point of installation. Since there's interest I'll go through and build up some flow diagrams and fault trees and things to try to inform the next step.
Thanks,
Lachlan
Ok, I have updated the document to talk a bit more about attack scenarios:
From a cryptographic perspective, all the signatures tell you is that a piece of code has not been changed since it was created, and so I have left out questions of auditing and trust, and just touched on dataflow. The conclusion is that if you want gatekeepers to METADATA.jl to have any cryptographically-enforceable control, they need to publish signatures somewhere, which can't be done with the pull-request interface unless they are willing to give standing permission before seeing the code.
This standing permission can be as granular as you like, but it has to be enforceable by whoever is verifying the signature. This could be as simple as a list of directories that the author is trusted to modify, or as complicated as requiring two signatures from three designated reviewers plus verification by a static-analysis tool that the code cannot read or write external files. The latter is pushing the bounds of realism, but I want to stress that you can put quite a bit of flexibility into the certificate system if need be, though obviously one is bounded by the KISS principle.
Thanks,
Lachlan
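To make the two ends of that "standing permission" range concrete, here is a minimal sketch in Python. It is not part of the proposal document. The `StandingPermission` record, the function names, and the example paths are all hypothetical, and the sketch assumes the signature on the permission itself has already been verified elsewhere.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class StandingPermission:
    """Hypothetical certificate body: which directories an author may modify."""
    author: str
    allowed_dirs: tuple


def is_within(path, directory):
    """Lexical check that `path` lies inside `directory`."""
    p = PurePosixPath(path)
    d = PurePosixPath(directory)
    # Refuse ".." components so "Foo/../Bar" cannot pass as being inside "Foo".
    if ".." in p.parts:
        return False
    return p == d or d in p.parents


def change_permitted(perm, author, changed_paths):
    """Simple end of the range: every changed path must fall under a
    directory the (already authenticated) author is trusted to modify."""
    if author != perm.author:
        return False
    return all(
        any(is_within(path, d) for d in perm.allowed_dirs)
        for path in changed_paths
    )


def quorum_met(approvals, reviewers, threshold=2):
    """More elaborate end: at least `threshold` distinct designated
    reviewers (e.g. two of three) have signed off."""
    return len(set(approvals) & set(reviewers)) >= threshold
```

The point of the sketch is only that both rules are mechanical checks the verifier can run without a gatekeeper seeing the code first; the static-analysis requirement is left out because it does not reduce to a few lines.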
On Thursday, 5 November 2015 18:20:50 UTC+1, Stefan Karpinski wrote:
I think the most helpful thing right now might be to try to list attack models and then we can figure out if or how we are going to try to prevent or mitigate them. It's one thing to prevent someone from trying to alter the code of a released package version after the fact; it's a totally different thing to have some trust mechanism for deciding about the security of released code in the first place. But if you can't trust the released version, does it really matter if someone can change it? Do we want to take an approach that tries to prevent attacks, or do we want to make it so that it's highly auditable after the fact if an attack is discovered? In a lot of ways the latter is more useful (and prevents attacks for fear of discovery), but you kind of need to be able to trace things back to an actual person for it to have any efficacy. Of course, this is very much at odds with the semi-anonymous nature of open source dev – a lot of people don't even want to put their real picture on GitHub, let alone reveal who they are IRL. But without that, you can't really prevent or uncover anonymous attacks.
I can imagine a scheme where people who are known quantities can release code and people can use it just by virtue of trusting who they are via a crypto signature; other less known people can release code too but trust would have to be established by an external audit of that code rather than by knowing who they are.
I think people would have to make trust decisions for VERY many people without the necessary information at hand to make a good decision. A typical user might have to make a trust decision for potentially hundreds of authors, if he/she uses a fair number of packages. But I don’t know any of them, so as a new user I really wouldn’t have the info to make these decisions.
I think if this “make trust decisions about authors” is really the goal, there would have to be some sort of network of trust: if some core users trust someone (Stefan etc.), then I’d be happy to automatically trust those authors as well.
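As a rough sketch of what such a network of trust could look like (in Python, purely illustrative): start from a set of root users and follow endorsements outward up to a depth limit. The `endorsements` mapping, the function name, and the depth limit are assumptions made for this example, not anything specified in the thread.

```python
from collections import deque


def trusted_authors(endorsements, roots, max_depth=1):
    """Return everyone reachable from `roots` via at most `max_depth`
    endorsement steps. `endorsements` maps a person to the people
    they vouch for (a hypothetical data source)."""
    trusted = set(roots)
    queue = deque((root, 0) for root in roots)
    while queue:
        person, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow endorsements beyond the limit
        for endorsed in endorsements.get(person, ()):
            if endorsed not in trusted:
                trusted.add(endorsed)
                queue.append((endorsed, depth + 1))
    return trusted
```

With `max_depth=1` this is the case described above: authors directly trusted by a core user are trusted automatically. Larger values extend trust transitively, which is a policy choice rather than something the mechanism decides.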
But, quite frankly, the whole thing to me seems overkill, plus it seems to me that there are much more pressing issues in julia-land than this.
Cheers,
David
Yep, of course. Particularly with me being a Windows user I'm quite aware of that :)
I just trust myself to write that command line more clearly than I could the English equivalent.
It will be worth seeing what we can do with existing dependencies; as I understand it, Python went with X.509 rather than PGP to a great extent because they preferred to use OpenSSL over GnuPG. Though GnuTLS, I think, has the ability to do some PGP, from memory, so who knows at this stage.
Thanks,
Lachlan