inter-repo deps


Fredrik Ekholdt

Feb 24, 2013, 6:07:06 PM
to adep...@googlegroups.com

Has anyone given any thought to inter-repo dependencies?

A simple approach would be to include a URL to the other repository when declaring the artifact.

This means that a repository would depend on another repository, which would have to be pulled as well in order to have all the metadata offline.

The issue is that we would be doing the same thing with metadata that Ivy does with dependencies.


What I am doing now is hashing org, name, version, and the contents of each artifact to create a (hopefully) unique id.
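A minimal sketch of that id scheme, assuming SHA-256 as the hash and a hypothetical helper name `artifact_id` (the thread does not say which hash function or field encoding adept actually uses):

```python
import hashlib
from typing import Iterable


def artifact_id(org: str, name: str, version: str,
                artifact_contents: Iterable[bytes]) -> str:
    """Derive a content-based id from the coordinates plus the artifact bytes."""
    h = hashlib.sha256()
    for field in (org, name, version):
        data = field.encode("utf-8")
        # Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    # Hash each artifact on its own and sort the digests, so the id does not
    # depend on the order in which the artifacts are listed.
    for digest in sorted(hashlib.sha256(c).digest() for c in artifact_contents):
        h.update(digest)
    return h.hexdigest()
```

Changing any coordinate or any byte of any artifact yields a different id, which is what makes the id usable as an address.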

When adding an artifact to your local repository, you would add it with the hash along with its dependencies. The dependencies could now be artifacts/modules which you have from other repositories. These dependencies would then be put into your local repository as well.

When you push back to the remote repository, you push the artifact and its dependencies.
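To illustrate what "the artifact and its dependencies" amounts to, here is a rough sketch that collects the set of ids to push. The function name `closure_to_push` and the `deps_of` mapping (artifact id to list of dependency ids) are assumptions made for the example, not adept's data model:

```python
from typing import Dict, List


def closure_to_push(root: str, deps_of: Dict[str, List[str]]) -> List[str]:
    """Return the root id plus every transitive dependency, each id once."""
    ordered: List[str] = []
    visited = set()
    stack = [root]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # already collected; also guards against cycles
        visited.add(current)
        ordered.append(current)
        stack.extend(deps_of.get(current, []))
    return ordered
```

Since the ids are content hashes, a remote that already holds some of these entries could skip them.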

The pro is that once you have the repository where you found the artifact you wanted, you have all the metadata that you need. The con is that you actually have to include artifacts from other repositories.


Any thoughts?

eugene yokota

Feb 24, 2013, 7:37:06 PM
to adep...@googlegroups.com
On Sun, Feb 24, 2013 at 6:07 PM, Fredrik Ekholdt <fre...@gmail.com> wrote:

> Has anyone given any thought to inter-repo dependencies?

 
I was thinking in terms of [proxying support for team development][1]:
Whether the metadata, auth, and file are centralized or de-centralized,
it'd be nice to let one instance of adept act as the metadata/auth/file server for the others (like git).

This allows locking down network access in firewalled environment,
or locking down SNAPSHOT to a specific artifact in a single location.

To this, Mark wrote:
> At the most basic, you'd have a tool that merges and splits repositories.

So, in a way, this could be considered inter-repo deps.

Mark also wrote:
> For git repositories, it is mostly straightforward to merge/split. If you want things to stay signed, the merge tool has to verify the commits being merged are signed and then sign the merge commit. If a human does the merge, they can sign it. Automated signing by a machine is a bit more complicated, I think.

So, this is an interesting point. 

+-------+     +-------------+   ||    +-----------+    +-----+
| Alice | =>  | Team Proxy  |  =||=>  | Adept One | <= | Bob |
+-------+     +-------------+   ||    +-----------+    +-----+

Suppose Alice can only hit Team Proxy, and Team Proxy can only hit Adept One,
and that she wants to get Bob's metadata/artifact published in Adept One.
Let's also say Alice needs to verify the authenticity of Bob's metadata and artifact.

There are two solutions to this:
1. Adept One publishes not only files and metadata, but also PGP pubkeys.
2. Upon accepting Bob's metadata/artifact, Adept One re-signs them with its own key, and advertises its own pubkey.

The second method is more elegant, since the advertisement and management of PGP keys
can now be an implementation detail of each adept server, while the adept-to-adept interaction can be standardized.
# The two methods I mentioned in some other thread were using a hard-coded GitHub repo to advertise the key,
and making it available online.

Today, to sign a jar with PGP, we create a foo.jar.asc file.
To sign this automatically, an adept server can verify the signature and add a secondary signature, foo.jar.467cc13.asc, upon merge,
where the 467cc13 part is a unique id per adept instance, such as Adept One.
Team Proxy can find out Adept One's id, so it can go straight to verifying foo.jar.467cc13.asc,
and not bother with Bob's signature.
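A small sketch of the naming convention and of how Team Proxy might choose which signature file to check. The helper names are hypothetical, and the fallback to the original foo.jar.asc is my assumption for the case where no re-signature exists:

```python
from typing import Iterable, Optional


def secondary_signature_name(artifact: str, instance_id: str) -> str:
    """Name of the signature added by the adept instance with this id."""
    return f"{artifact}.{instance_id}.asc"


def pick_signature(artifact: str, available_files: Iterable[str],
                   trusted_instance_id: str) -> Optional[str]:
    """Prefer the trusted instance's re-signature over the author's own."""
    files = set(available_files)
    preferred = secondary_signature_name(artifact, trusted_instance_id)
    if preferred in files:
        return preferred
    # Assumption: fall back to the original author's signature if present.
    original = f"{artifact}.asc"
    return original if original in files else None
```

With both foo.jar.asc and foo.jar.467cc13.asc present, `pick_signature("foo.jar", files, "467cc13")` selects the latter, so only Adept One's pubkey is needed for verification.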

> A simple approach would be to have a URL to the other repository when declaring the artifact.

Does this mean we would have separate namespace of artifacts for each metadata repository?
Or are you thinking more like resolver += "adept.foo.com"?

> This means that a repository would depend on another repository, which would have to be pulled as well to be able to have all the metadata offline.

Similar to git, if the local repository worked as a server (a remote repo) for others, it would at least be able to grab all known artifacts.
It would also be nice to proxy pulling and pushing too.

> When adding an artifact to your local repository you would add it with the hash along with the dependencies. The dependencies could now be artifacts/modules which you have from other repositories. These dependencies would then be put into your local repository as well.
>
> When you push back to the remote repository, you push the artifact and its dependencies.

Why are the deps pushed back to the remote repo?

> The pro is that from the time you have the repository where you found the artifact you wanted, you have all the metadata that you need. The con is that you actually have to include artifacts from other repositories.

Since files are addressed by hash, hopefully we'll see fewer cache corruption issues.
I don't see the need to move deps files around.

-eugene



Josh Suereth

Feb 24, 2013, 10:12:30 PM
to adept-dev

Note: why not keep a public key ring associated with the repository and just add Bob's key directly to it? Then users can opt in to finer-grained security later if they want.

When merging two repos, you need to merge the key store as well (or just keep it as flat keys named by PGP id).
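As a rough sketch of the "flat keys named by PGP id" idea, a key store can be modeled as a mapping from key id to key material, and merging becomes a dictionary union that reports ids whose contents disagree. The function name and the in-memory representation are assumptions for illustration only:

```python
from typing import Dict, List, Tuple


def merge_key_stores(ours: Dict[str, bytes],
                     theirs: Dict[str, bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Union two flat key stores; report ids that map to different keys."""
    merged = dict(ours)
    conflicts: List[str] = []
    for key_id, key_data in theirs.items():
        if key_id in merged and merged[key_id] != key_data:
            # Same id, different key material: keep ours and flag it for review.
            conflicts.append(key_id)
        else:
            merged[key_id] = key_data
    return merged, sorted(conflicts)
```

A conflict here would deserve human attention, since two different keys under one id means at least one of the repos is wrong.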

> # The two methods I mentioned in some other thread were using a hard-coded GitHub repo to advertise the key,
> and making it available online.
>
> Today, to sign a jar with PGP we create foo.jar.asc file.
> To sign this automatically, an adept server can verify the signature, and add a secondary signature foo.jar.467cc13.asc upon merge
> where 467cc13 part is a unique id per adept instance like Adept One.
> Team Proxy can find out Adept One's id, so it can go straight to verifying foo.jar.467cc13.asc,
> and not bother with Bob's signature.

Actually, I believe we can sign the signature, even more meta and crazy...

>> A simple approach would be to have a url to the other repository when declaring the artifact. 
>
> Does this mean we would have separate namespace of artifacts for each metadata repository?
> Or are you thinking more like resolver += "adept.foo.com"?
>>
>> This means that a repository would depend on another repository, which would have to be pulled as well to be able to have all the metadata offline.
>
> Similar to git, if the local repository worked as a server (a remote repo) for others, it would at least be able to grab all known artifacts.
> It would also be nice to proxy pulling and pushing too.
>>
>> When adding an artifact to your local repository you would add it with the hash along with the dependencies. The dependencies could now be artifacts/modules which you have from other repositories. These dependencies would then be put into your local repository as well.
>>
>> When you push back to the remote repository, you push the artifact and its dependencies.
>
> Why are the deps pushed back to the remote repo?
>
>> The pro is that from the time you have the repository where you found the artifact you wanted, you have all the metadata that you need. The con is that you actually have to include artifacts from other repositories.
>
> Since files are addressed by hash, hopefully we'll see fewer cache corruption issues.
> I don't see the need to move deps files around.
>

Yeah, the cache can even be self-cleaning. We do this for Scala now. No major reported issues in a while.

I think the only corruption/issues we have to worry about are the metadata repositories and any local minded/db we make on that...

eugene yokota

Feb 25, 2013, 12:22:56 AM
to adep...@googlegroups.com

On Sun, Feb 24, 2013 at 10:12 PM, Josh Suereth <joshua....@gmail.com> wrote:

> Note: why not keep a public key ring associated with the repository and just add Bob's key directly to it? Then users can opt in to finer-grained security later if they want.

That's the first of the two solutions:

>> There are two solutions to this:
>> 1. Adept One publishes not only files and metadata, but also PGP pubkeys.
>> 2. Upon accepting Bob's metadata/artifact, Adept One re-signs them with its own key, and advertises its own pubkey.

With this approach an adept repo tracks metadata, files, and pubkeys.
During the repo merge, if a repo wants to verify the authenticity, it would verify the keys from the primary source,
or like you said a trusted authority like Adept One can sign the key file.

Even with this solution, wouldn't it be more efficient if Bob declared his pubkey id with the signature, as in foo.jar.5232f2.asc?
Otherwise, every user who wants to verify authenticity needs to keep every programmer's pubkey in the ring.
(I am hoping PGP has a library call that says "verify this sig with this pubkey".)
This key signing could be cascaded, similar to HTTPS's CAs.
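A toy sketch of the cascading idea: a key counts as trusted if following "signed by" links eventually reaches a key the user already trusts. It only walks key ids and does not perform any cryptographic verification; the `signed_by` mapping and the function name are assumptions for illustration:

```python
from typing import Dict, Optional, Set


def is_trusted(key_id: str, signed_by: Dict[str, Optional[str]],
               roots: Set[str]) -> bool:
    """Follow signer links from key_id until a trusted root is reached."""
    seen: Set[str] = set()
    current: Optional[str] = key_id
    while current is not None and current not in seen:
        if current in roots:
            return True
        seen.add(current)  # remember visited keys so cycles terminate
        current = signed_by.get(current)
    return False
```

For example, if Bob's key 5232f2 is signed by Adept One's key 467cc13, a user who trusts only 467cc13 would still accept signatures made with 5232f2.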

I know we mostly assume the authenticity of jars just because they are on some server,
but we could enforce more secure behavior by default.

-eugene

Josh Suereth

Feb 25, 2013, 12:43:22 AM
to adept-dev

Yeah, I see what you're saying.  Check out the lib I have in the sbt-pgp plugin and the check-pgp-signatures implementation.

If we sign other users' keys, we still require those keys to be available when verifying, I think.

Let me try some of this out and get back to you.

The real question is: what do we want for the UX of security? What use cases and scenarios are we trying to solve? I can document the ones I think are high priorities.

eugene yokota

Feb 25, 2013, 1:40:55 AM
to adep...@googlegroups.com
Here are some of the scenarios: https://github.com/sbt/adept/wiki/Security-Scenarios
The UX should be seamless from the build user's point of view.

-eugene

Mark Harrah

Mar 2, 2013, 7:50:26 PM
to adep...@googlegroups.com
On Mon, 25 Feb 2013 01:40:55 -0500
eugene yokota <eed3...@gmail.com> wrote:

> Here are some of the scenarios:
> https://github.com/sbt/adept/wiki/Security-Scenarios
> The ux should be seamless from the build users point of view.

One comment: if the metadata is signed, the jars don't need to be. The metadata contains a strong hash of the jars.
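A minimal sketch of the check this implies, assuming the metadata's own signature has already been verified elsewhere and that it records a SHA-256 digest under a hypothetical `sha256` field:

```python
import hashlib
from typing import Dict


def jar_matches_metadata(jar_bytes: bytes, metadata: Dict[str, str]) -> bool:
    """Check a downloaded jar against the hash recorded in trusted metadata."""
    # Assumption: the metadata stores a hex-encoded SHA-256 digest of the jar.
    expected = metadata["sha256"]
    return hashlib.sha256(jar_bytes).hexdigest() == expected
```

If the metadata is trusted and the digest matches, the jar is the one the metadata describes, no matter which server delivered it.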

-Mark

Josh Suereth

Mar 2, 2013, 8:42:18 PM
to adept-dev
On Sat, Mar 2, 2013 at 7:50 PM, Mark Harrah <dmha...@gmail.com> wrote:
> On Mon, 25 Feb 2013 01:40:55 -0500
> eugene yokota <eed3...@gmail.com> wrote:
>
>> Here are some of the scenarios:
>> https://github.com/sbt/adept/wiki/Security-Scenarios
>> The ux should be seamless from the build users point of view.
>
> One comment: if the metadata is signed, the jars don't need to be. The metadata contains a strong hash of the jars.

That's absolutely not true. Signing is not just about identification of the jar, but also about who created it. If the metadata can be fixed later, you could lose the "author" of the JAR. Metadata and JAR signatures imply different things: who added the JAR vs. who described the JAR. I'd rather enforce most of my security concerns on the former, because it can help prevent pulling in unwanted jars if someone messes up the metadata.


- Josh

Mark Harrah

Mar 2, 2013, 9:46:56 PM
to adep...@googlegroups.com
On Sat, 2 Mar 2013 20:42:18 -0500
Josh Suereth <joshua....@gmail.com> wrote:

> On Sat, Mar 2, 2013 at 7:50 PM, Mark Harrah <dmha...@gmail.com> wrote:
>
> > On Mon, 25 Feb 2013 01:40:55 -0500
> > eugene yokota <eed3...@gmail.com> wrote:
> >
> > > Here are some of the scenarios:
> > > https://github.com/sbt/adept/wiki/Security-Scenarios
> > > The ux should be seamless from the build users point of view.
> >
> > One comment: if the metadata is signed, the jars don't need to be. The
> > metadata contains a strong hash of the jars.
> >
>
> That's absolutely not true. Signing is not just about identification of
> the jar, but who created it.

I can get a jar that I didn't create, sign it, and post it. It isn't about creation. It is saying "I vouch that jar X does Y" (or "doesn't do Z"), which is identification/description.

> If the metadata can be fixed later, you
> could lose the "author" of the JAR.

Who originally posts a jar to the repository isn't important. When you post a jar to a hash-based repository, all you say is what hash it has, which can be easily verified without trusting anyone.

What matters is when someone publishes metadata that says "the artifact for project A is the jar with hash ABC". If the metadata is changed later to hash DEF, the person that changes it vouches that DEF is now correct.

If you need a definition of "author", you could say the author is the first person in a repository to associate a hash with a project. By this definition, the original author is not lost because a commit includes all previous commits.
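That definition can be sketched as a scan over the commit history, oldest first. The commit shape used here (a `signer` plus a `changes` mapping from project to jar hash) is purely an assumption for illustration:

```python
from typing import Dict, List, Optional


def original_author(commits: List[Dict], project: str,
                    jar_hash: str) -> Optional[str]:
    """Return the signer of the earliest commit that maps project to jar_hash."""
    for commit in commits:  # assumed to be ordered oldest first
        if commit["changes"].get(project) == jar_hash:
            return commit["signer"]
    return None  # the hash was never associated with this project
```

Because later commits build on earlier ones, a later change of project A from hash ABC to DEF does not erase who first introduced ABC.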

> Metadata vs. JAR signature implies
> different things. Who added the JAR vs. Who described the JAR. I'd
> rather enforce most of my security concerns on the former, because it can
> help prevent pulling in unwanted jars if someone messes up the metadata.

What defines an unwanted jar? It is a jar that doesn't do what the metadata claimed. If the metadata is messed up, the jar is unwanted whether or not the jar itself is signed.

-Mark

> - Josh