Adept mark ii


Fredrik Ekholdt

Sep 20, 2013, 12:19:06 PM
to adep...@googlegroups.com
Hi everybody!
Lately, Mark, Josh and I have been thinking about how to improve the model for Adept so that it can better support variation between, for example, Scala binary versions (but also others such as Play or Android versions, or even C++ versions).

I have tried, for once, to write a spec to better explain the model we are looking at, why we want it, and how it can be used to solve some actual use cases.


The benefit is that we can do actual resolution on pretty much any set of constraints; see the spec to learn more.

This spec does not try to explain how we solve everything - yet! That is perhaps where you come in!

What I would love is for anybody who is interested to either post comments in the spec (if we are wrong) or add some comments to this mail thread.


Enjoy your weekends!

Cheers,
Fredrik

Havoc Pennington

Sep 20, 2013, 4:02:11 PM
to adep...@googlegroups.com
Hi,

Great spec! It was really clear and makes it easy to understand Adept plans. Or I guess you can decide based on this email what I understand ;-)

Here are some things I thought about reading it, fwiw

1.

The proposed sbt syntax seems a little too hard for the common case. I understand you're trying to illustrate the general mechanism, agreed that should exist, but could the common case be kind of like:

    dependencyRepositories := Seq("git://blahblah/foo/3.0")
   
    dependencies := Seq("mygroup/mylib", "othergroup/otherlib" constrainedByVersion "1.2.1")

  - in the long term (if adept becomes the default way of doing things) will having the word "adept" in there seem odd?
  - sugar for constrainedBy("version" -> "1.2.1") seems warranted since this has to be the most common constraint. Autocomplete doesn't work on strings, only types and methods, so "version" as a string isn't as discoverable.
  - could do things like, in sbt automatically constrain by the scalaVersion that's set for the project
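
To make the sugar idea concrete, here is a rough sketch of what I mean (all names invented, not tied to whatever the real Adept/sbt API ends up being):

    import scala.language.implicitConversions

    // Hypothetical types, just to show the surface syntax
    case class Constraint(name: String, value: String)
    case class Dependency(id: String, constraints: Seq[Constraint] = Seq.empty) {
      def constrainedBy(attr: (String, String)): Dependency =
        copy(constraints = constraints :+ Constraint(attr._1, attr._2))
      // type-safe shortcut for the most common constraint, so autocomplete can find it
      def constrainedByVersion(v: String): Dependency = constrainedBy("version" -> v)
    }
    // lets a bare "group/name" string act as an unconstrained dependency
    implicit def stringToDependency(id: String): Dependency = Dependency(id)

    val deps = Seq[Dependency]("mygroup/mylib", "othergroup/otherlib" constrainedByVersion "1.2.1")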

I know this seems sort of like a surface/refinement issue but I think it'd be worth looking at early on, to be sure the implementation supports the simplest surface.

2.

I'm not sure if this is already the plan or not based on the document, but to me *resolution* should be done explicitly only by people hacking on the module, and the results *checked into git*.

That is, split "sbt update" into two separate things: "resolve" which determines the artifacts to use; and "download-artifacts" which sucks them onto the local machine. *Most builds* only need to (only *should*) "download-artifacts".

If I just download some random project source from github and type "sbt compile" or even "publish", to me that should just yank down a bunch of artifacts identified by sha hash, and that's it. No constraint-solving.

Resolution can ONLY introduce bugs. If I'm a hacker on a module, and I resolve on my workstation, then I want to upload those resolution results, keep them in git, have Jenkins reproduce them *exactly*, and anyone hacking on the module who types "publish" should be using the exact same artifacts... if the results of resolving change, then I want it to be visible - it's probably some kind of problem! I want to be able to reproduce builds later, see changes to resolution in pull requests, watch when things changed in my git history.

If there are new versions available, then I should type "resolve" (or whatever) and it will go see if the resolution results have changed. If they have, then my local git-managed file listing artifacts will be modified, and then I check it in. This means that upgrading to the latest version is *visible* rather than silent.

As a nice side effect, this means SPEED - not waiting on the constraint solver ;-) All we have to do is 1 stat per entry in the artifacts list to see if that file exists already in cache.
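
For instance, the up-to-date check could be as dumb as one file-existence check per hash; a rough sketch (the cache layout here is made up):

    import java.io.File

    // Sketch: one stat per artifact hash, no constraint solving involved.
    // Assumes a cache that stores each artifact under its hash, e.g. <cacheDir>/<hash>.jar (invented layout).
    def missingArtifacts(cacheDir: File, hashes: Seq[String]): Seq[String] =
      hashes.filterNot(hash => new File(cacheDir, hash + ".jar").isFile)

    // whatever comes back still needs downloading; everything else is already local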

The resolution file in git would be the full transitive resolution, I think (resolution for a module is global for that module and all its deps). So if module A depends on module B depends on module C, then A may not end up using the same C that B was published against. But that's fine as long as all constraints are met. What would be true is that the hashes used for both B and C would be checked in to the A git repo.

If you had an "artifacts.txt" kind of file with the artifacts list, it could include human-readable comments just for convenience:

0c857914ca893ce09378fd4ffa42aa13363ea466  # com.typesafe.play/play 2.1.1
8ae9a903ce90a6be0fa3a7dbfcbd02dca97357b0 # org.junit/junit 1.4

That makes git diff more useful.

If I have any funky local changes that might affect resolution (proxies, global sbt config, local metadata server, whatever) then it would show up when I try to push my PR. Or even if Adept itself changes its resolution algorithm and two people use different adept versions, then that would show up.

Another thing this permits is that build tools only have to understand the already-resolved artifacts file potentially - resolution could be a separate command line tool if desired, not part of any build tool...

Havoc

eugene yokota

Sep 20, 2013, 4:33:48 PM
to adept-dev
On Fri, Sep 20, 2013 at 4:02 PM, Havoc Pennington <h...@pobox.com> wrote:
Great spec! It was really clear and makes it easy to understand Adept plans. Or I guess you can decide based on this email what I understand ;-)

ditto.
 
 - sugar for constrainedBy("version" -> "1.2.1") seems warranted since this has to be the most common constraint. Autocomplete doesn't work on strings, only types and methods, so "version" as a string isn't as discoverable.

If Adept is saying it's not going to auto-evict like Ivy does, should the version really be the default constraint?
I think "binary-version" should be mandated as metadata or at least be the default.
Related to that, a version algebra should be defined so one can reliably compare one version to another. (The Ivy spec says something like: use PHP to compare [1].)

[1]: http://ant.apache.org/ivy/history/latest-milestone/concept.html

-eugene


Havoc Pennington

Sep 20, 2013, 5:05:05 PM
to adep...@googlegroups.com
You may be right that with the Adept design there's rarely a need to specify version (or anything else).

BTW another very small idea, along the lines of designing "artifacts.txt" for git-diffing, would be to always sort it by module name, so that the old and new hash for a given module would be adjacent in the diff (even though the module name would just be a comment, not important to using the file).
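
Roughly (format and names invented):

    // Sketch: render the artifacts file sorted by module name so diffs line up per module.
    // Each entry is a (hash, "group/name version") pair; the module part is only a comment.
    def render(artifacts: Seq[(String, String)]): String =
      artifacts.sortBy { case (_, module) => module }
        .map { case (hash, module) => s"$hash  # $module" }
        .mkString("\n")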

Havoc

Mark Harrah

Sep 20, 2013, 5:19:50 PM
to adep...@googlegroups.com
On Fri, 20 Sep 2013 16:33:48 -0400
eugene yokota <eed3...@gmail.com> wrote:

> On Fri, Sep 20, 2013 at 4:02 PM, Havoc Pennington <h...@pobox.com> wrote:
>
> > Great spec! It was really clear and makes it easy to understand Adept
> > plans. Or I guess you can decide based on this email what I understand ;-)
> >
>
> ditto.
>
>
> > - sugar for constrainedBy("version" -> "1.2.1") seems warranted since
> > this has to be the most common constraint. Autocomplete doesn't work on
> > strings, only types and methods, so "version" as a string isn't as
> > discoverable.
> >
>
> If Adept is saying it's not going to auto-evict like Ivy does, should the
> version really the be the default constraint?
> I think "binary-version" should be mandated as metadata or at least be the
> default.

Right. My thoughts here are that this is at a higher level than the core. It is probably a set of conventions followed by a build system for a particular domain.

For example, sbt might translate its normal syntax a % b % c to something like:

group=a, name=b, sourceVersion=majorMinorOnly(c), binaryVersion=majorMinorOnly(c)

Here, sourceVersion probably isn't gaining anything over binaryVersion, but I'm thinking it might in practice. The main point is that there is a version the user specifies that gets translated into the right constraints. This is only the default and the user can take more control if desired. Something similar might happen for version := Y when publishing.
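
A rough sketch of that translation (majorMinorOnly and the constraint shape here are just invented for illustration):

    // Invented helper: keep only the major.minor part of a version, e.g. "2.10.2" -> "2.10"
    def majorMinorOnly(version: String): String =
      version.split('.').take(2).mkString(".")

    case class Constraints(group: String, name: String, binaryVersion: String, sourceVersion: String)

    // sbt's a % b % c translated into default constraints
    def fromSbt(a: String, b: String, c: String): Constraints =
      Constraints(group = a, name = b,
        binaryVersion = majorMinorOnly(c), sourceVersion = majorMinorOnly(c))

    // fromSbt("org.scala-lang", "scala-library", "2.10.2")
    //   == Constraints("org.scala-lang", "scala-library", "2.10", "2.10")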

> Related to that, version algebra should be defined so one can reliably
> compare one version to the other. (Ivy spec something like use php to
> compare[1])

I agree that if it is necessary to compare versions, this should be specified. I personally haven't seen use cases where this is necessary in the core resolution engine, although auxiliary tools, like "find me the most recent version of X", might. I'm not sure yet. Fredrik has done a good job collecting use cases and describing how adept handles them. Something we'd like to see more of are more use cases, such as ones that require versions to be compared.

-Mark

> -eugene
>
> [1]: http://ant.apache.org/ivy/history/latest-milestone/concept.html

Mark Harrah

Sep 20, 2013, 5:21:15 PM
to adep...@googlegroups.com
One thing to consider is that the order in this file will determine the order on the classpath. It may or may not be desirable to depart from a topological sort. Alphabetical by module name is stable, so it might be fine.

I agree with the general desire for a human readable and diffable artifacts.txt. I'll reread your first email and comment there though.

-Mark

> Havoc

Mark Harrah

Sep 20, 2013, 5:47:44 PM
to adep...@googlegroups.com
I know Fredrik supports the idea of an artifacts.txt and splitting resolution (and I agree as well). There was some discussion on this earlier and the open questions are around the ideas you've mentioned. My opinion is that there is no fundamental obstacle to properly caching resolution automatically because all of the metadata is local and the artifacts are cached by hash. Therefore, you don't gain much of a speed advantage from caching except for the first time. I think you want to do this up-to-date checking and re-resolve automatically if there aren't problems. You should be able to configure it to fail if the resolved artifacts are different. I think this has been referred to as "locking" resolution. I think we agree at least that is desirable.

We agree resolution could be a separate command line tool, except that there are several places where dependency resolution happens automatically in a normal build. For example, sbt itself is pulled via dependency resolution. The build definition is a project where dependency resolution happens automatically. In Maven, plugins get pulled by dependency resolution. I think in practice it would be integrated into the build tool, but it would be properly decoupled compared to the way things are now and you'd be able to lock down resolution.

-Mark

> Havoc
>

Havoc Pennington

Sep 20, 2013, 8:53:30 PM
to adep...@googlegroups.com
This is way-premature detail I suppose, but while I'm thinking about it...

It could be that to configure classpath order you just reorder the 'artifacts' file, and the resolve operation tries its best to keep the same order you had.

Otherwise, if there are known classpath constraints configured outside the file, an idea from an unrelated context:
 https://git.gnome.org/browse/metacity/tree/src/core/stack.c#n383

There you have the user's window stack in arbitrary order, which you want to mess with as little as possible while still applying constraints (which are of the form A-above-B).  I'm sure there's a better algorithm but this is one.

Maybe classpath is similar: sometimes you have a constraint like A-before-B-in-classpath but for the most part you want to keep the order the file is already in.
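
For the classpath case, one way to do it could be a stable topological sort: walk the existing file order and only move an entry when a before-constraint forces it. A rough sketch (invented names, certainly not the only algorithm):

    import scala.collection.mutable

    // Sketch: order entries so every (before, after) constraint holds while keeping
    // the existing order as much as possible. Returns None if the constraints are cyclic.
    def stableOrder(current: Seq[String], constraints: Seq[(String, String)]): Option[Seq[String]] = {
      val relevant = constraints.filter { case (b, a) => current.contains(b) && current.contains(a) }
      val successors = relevant.groupBy(_._1).map { case (k, v) => k -> v.map(_._2) }
      val inDegree = mutable.Map(current.map(_ -> 0): _*)
      relevant.foreach { case (_, after) => inDegree(after) += 1 }

      val remaining = mutable.LinkedHashSet(current: _*) // preserves the original order
      val result = mutable.ArrayBuffer.empty[String]
      while (remaining.nonEmpty) {
        // take the earliest remaining entry whose "must come before me" entries are all placed
        remaining.find(entry => inDegree(entry) == 0) match {
          case None => return None // a cycle in the constraints
          case Some(next) =>
            remaining -= next
            result += next
            successors.getOrElse(next, Seq.empty).foreach(s => inDegree(s) -= 1)
        }
      }
      Some(result.toList)
    }

Picking the earliest original entry at every step keeps things where they were unless a constraint says otherwise.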

An implication of keeping the module order while modifying the file is that the file has to be (module-name,hash) tuples, not just a list of hashes.

But to make the file human-readable it really has to have module names anyway, and to make diffs human-readable the order really has to be stable.

Just thinking out loud about minor stuff, I'm sure it can be worked out.

Havoc

Havoc Pennington

Sep 20, 2013, 9:01:55 PM
to adep...@googlegroups.com
On Fri, Sep 20, 2013 at 5:47 PM, Mark Harrah <dmha...@gmail.com> wrote:
I know Fredrik supports the idea of an artifacts.txt and splitting resolution (and I agree as well).  There was some discussion on this earlier and the open questions are around the ideas you've mentioned.  My opinion is that there is no fundamental obstacle to properly caching resolution automatically because all of the metadata is local and the artifacts are cached by hash.  Therefore, you don't gain much of a speed advantage from caching except for the first time.  I think you want to do this uptodate checking and reresolve automatically if there aren't problems.  You should be able to configure it to fail if the resolved artifacts are different.  I think this has been referred to as "locking" resolution.  I think we agree at least that is desirable.

Cool. Yeah, if resolution is instant then it's harmless to do automatically. I guess the main point for me is that anytime an artifact changes in my entire stack, I'd like to manually approve and record it (my version control system being the natural way to do so). This gives 100% reproducible builds and avoids weird mystery situations. Exactly how it works is sort of up to whoever is coding this thing and working out the details.
 
 I think in practice it would be integrated into the build tool, but it would be properly decoupled compared to the way things are now and you'd be able to lock down resolution.

Sounds good. I didn't mean to necessarily say "it should be separate" but more just "it could be, if that's useful."

Havoc
 

Fredrik Ekholdt

Sep 21, 2013, 5:16:26 AM
to adep...@googlegroups.com


On Friday, 20 September 2013 22:02:11 UTC+2, Havoc Pennington wrote:
Hi,

Great spec! It was really clear and makes it easy to understand Adept plans. Or I guess you can decide based on this email what I understand ;-)
Thanks! :)  

Here are some things I thought about reading it, fwiw

1.

The proposed sbt syntax seems a little too hard for the common case. I understand you're trying to illustrate the general mechanism, agreed that should exist, but could the common case be kind of like:

    dependencyRepositories := Seq("git://blahblah/foo/3.0")
   
    dependencies := Seq("mygroup/mylib", "othergroup/otherlib" constrainedByVersion "1.2.1")

  - in the long term (if adept becomes the default way of doing things) will having the word "adept" in there seem odd?
  - sugar for constrainedBy("version" -> "1.2.1") seems warranted since this has to be the most common constraint. Autocomplete doesn't work on strings, only types and methods, so "version" as a string isn't as discoverable.
  - could do things like, in sbt automatically constrain by the scalaVersion that's set for the project

I know this seems sort of like a surface/refinement issue but I think it'd be worth looking at early on, to be sure the implementation supports the simplest surface.
Yes, I agree with all of these. As you say though, the sbt examples are there to get a clearer picture of what the build tool would do, not necessarily how the sbt plugin will look. I think the input is good though, and I think you are right when it comes to version and scalaVersion; in fact I was thinking of having a part of the API (or another library) which has all of the common attributes. It would make it easier to have a convention for attribute names (version, configurations, binaryVersion, sourceVersion, group, name, ...) and it would make it easier to signal whenever (if) the convention changes.

2.

I'm not sure if this is already the plan or not based on the document, but to me *resolution* should be done explicitly only by people hacking on the module, and the results *checked into git*.

That is, split "sbt update" into two separate things: "resolve" which determines the artifacts to use; and "download-artifacts" which sucks them onto the local machine. *Most builds* only need to (only *should*) "download-artifacts".

If I just download some random project source from github and type "sbt compile" or even "publish", to me that should just yank down a bunch of artifacts identified by sha hash, and that's it. No constraint-solving.

Resolution can ONLY introduce bugs. If I'm a hacker on a module, and I resolve on my workstation, then I want to upload those resolution results, keep them in git, have Jenkins reproduce them *exactly*, and anyone hacking on the module who types "publish" should be using the exact same artifacts... if the results of resolving change, then I want it to be visible - it's probably some kind of problem! I want to be able to reproduce builds later, see changes to resolution in pull requests, watch when things changed in my git history.

If there are new versions available, then I should type "resolve" (or whatever) and it will go see if the resolution results have changed. If they have, then my local git-managed file listing artifacts will be modified, and then I check it in. This means that upgrading to the latest version is *visible* rather than silent.

As a nice side effect, this means SPEED - not waiting on the constraint solver ;-) All we have to do is 1 stat per entry in the artifacts list to see if that file exists already in cache.

The resolution file in git would be the full transitive resolution, I think (resolution for a module is global for that module and all its deps). So if module A depends on module B depends on module C, then A may not end up using the same C that B was published against. But that's fine as long as all constraints are met. What would be true is that the hashes used for both B and C would be checked in to the A git repo.

If you had an "artifacts.txt" kind of file with the artifacts list, it could include human-readable comments just for convenience:

0c857914ca893ce09378fd4ffa42aa13363ea466  # com.typesafe.play/play 2.1.1
8ae9a903ce90a6be0fa3a7dbfcbd02dca97357b0 # org.junit/junit 1.4

That makes git diff more useful.

If I have any funky local changes that might affect resolution (proxies, global sbt config, local metadata server, whatever) then it would show up when I try to push my PR. Or even if Adept itself changes its resolution algorithm and two people use different adept versions, then that would show up.

Another thing this permits is that build tools only have to understand the already-resolved artifacts file potentially - resolution could be a separate command line tool if desired, not part of any build tool...
Yep - I agree. Will follow up on Mark's comments. 

Havoc

 

Fredrik Ekholdt

Sep 21, 2013, 5:29:43 AM
to adep...@googlegroups.com


On Friday, 20 September 2013 23:19:50 UTC+2, Mark Harrah wrote:
On Fri, 20 Sep 2013 16:33:48 -0400
eugene yokota <eed3...@gmail.com> wrote:

> On Fri, Sep 20, 2013 at 4:02 PM, Havoc Pennington <h...@pobox.com> wrote:
>
> > Great spec! It was really clear and makes it easy to understand Adept
> > plans. Or I guess you can decide based on this email what I understand ;-)
> >
>
> ditto.
>
>
> >  - sugar for constrainedBy("version" -> "1.2.1") seems warranted since
> > this has to be the most common constraint. Autocomplete doesn't work on
> > strings, only types and methods, so "version" as a string isn't as
> > discoverable.
> >
>
> If Adept is saying it's not going to auto-evict like Ivy does, should the
> version really the be the default constraint?
> I think "binary-version" should be mandated as metadata or at least be the
> default.

Right.  My thoughts here are that this is at a higher level than the core.  It is probably a set of conventions followed by a build system for a particular domain.

For example, sbt might translate its normal syntax a % b % c to something like:

  group=a, name=b, sourceVersion=majorMinorOnly(c), binaryVersion=majorMinorOnly(c)

Here, sourceVersion probably isn't gaining anything over binaryVersion, but I'm thinking it might in practice.  The main point is that there is a version the user specifies that gets translated into the right constraints.  This is only the default and the user can take more control if desired.  Something similar might happen for version := Y when publishing.
Yep, I agree. binaryVersion is what most modules probably want to constrain on - not the exact version. 

> Related to that, version algebra should be defined so one can reliably
> compare one version to the other. (Ivy spec something like use php to
> compare[1])

I agree that if it is necessary to compare versions, this should be specified.  I personally haven't seen use cases where this is necessary in the core resolution engine, although auxiliary tools, like "find me the most recent version of X", might.  I'm not sure yet.  Fredrik has done a good job collecting use cases and describing how adept handles them.  Something we'd like to see more of are more use cases, such as ones that require versions to be compared.
Yep, I also think this is something we can handle in a separate tool outside of the core. I will add a use case describing this, so we can discuss whether an auxiliary tool is sufficient.
One way the build tool could handle this, as an example, is to sort the versions on a resolution failure that is under-constrained, where there are multiple versions among the variants found.
BTW: I already happen to have a version of the PHP compare in Scala in the *former* POC of Adept (I needed it earlier to be compatible with Ivy, and Ivy doesn't export its version comparison): https://github.com/adept-dm/adept/blob/mark-i/adept-core/src/main/scala/adept/core/operations/ConflictResolver.scala#L23

Fredrik Ekholdt

Sep 21, 2013, 6:15:38 AM
to adep...@googlegroups.com


On Saturday, 21 September 2013 03:01:55 UTC+2, Havoc Pennington wrote:
On Fri, Sep 20, 2013 at 5:47 PM, Mark Harrah <dmha...@gmail.com> wrote:
I know Fredrik supports the idea of an artifacts.txt and splitting resolution (and I agree as well).  There was some discussion on this earlier and the open questions are around the ideas you've mentioned.  My opinion is that there is no fundamental obstacle to properly caching resolution automatically because all of the metadata is local and the artifacts are cached by hash.  Therefore, you don't gain much of a speed advantage from caching except for the first time.  I think you want to do this uptodate checking and reresolve automatically if there aren't problems.  You should be able to configure it to fail if the resolved artifacts are different.  I think this has been referred to as "locking" resolution.  I think we agree at least that is desirable.

Cool. Yeah, if resolution is instant then it's harmless to do automatically. I guess the main point for me is that anytime an artifact changes in my entire stack, I'd like to manually approve and record it (my version control system being the natural way to do so). This gives 100% reproducible builds and avoids weird mystery situations. Exactly how it works is sort of up to whoever is coding this thing and working out the details.
The advantage of the offline metadata is that it is easy to cache the resolution safely, so speed should not be too much of a problem if you do not change anything. I really want to focus on making everything that can be fast as fast as possible - even Ivy, which is slow, would be acceptably slow if it weren't for its caching issues. Adept should be faster (~500 ms for "quite large" projects such as play-ebean).
I agree that locking down the resolution and having visibility into exactly which artifacts you are using is a very good thing (in fact this was my starting point for Adept). I guess having to sort the classpath manually means something is fishy in your dependencies, but since we are living in a world where fishy is sometimes what you have to live with, I agree that this is likely to be required from time to time.
I think that the build tool should be able to easily generate these files, so it should be outside of Adept's core (in some aux tools). A challenge with these files is to find a way to do this so that a user who really doesn't care is not forced to make a decision. It also has to be hard to use them in a bad way, i.e. update the artifacts file but not declare/update the dependencies in the build file. I am going to add it as a use case in the spec so we can clearly see how it is going to work once we agree on the overall approach. How about this:
A user declares her/his deps in the build file. During compile the build tool simply generates this file, uses it to download the artifacts (if needed), and generates the classpath based on the files corresponding to the artifacts. The build tool would have a setting (in sbt's case) where you can set lock := true. When the build is locked, it would fail if the generated file contains different artifacts than the checked-in file (it can just check that all artifacts are accounted for and not check order, so that you can reorder the classpath if you want). On a build server, you would typically enable this lock (sbt -Dadept.lock=true). That means that users have to check in a new version of the artifacts file if they make changes; if not, the build will fail. When this file is checked in it would be easy to see which artifacts have changed during reviews. If you do not care about this, you simply do not add the lock in the build file and all of this stays transparent. WDYT?
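
To make the lock check concrete, a minimal sketch (the file format and names here are only examples, nothing is decided):

    import java.io.File
    import scala.io.Source

    // Sketch: read the checked-in artifacts file, ignoring the human-readable
    // "# group/name version" comments, and compare the *set* of hashes with what
    // the build just resolved. Order is deliberately not checked.
    def readHashes(artifactsFile: File): Set[String] = {
      val source = Source.fromFile(artifactsFile)
      try source.getLines().map(_.takeWhile(_ != '#').trim).filter(_.nonEmpty).toSet
      finally source.close()
    }

    def checkLock(lockEnabled: Boolean, artifactsFile: File, resolvedHashes: Set[String]): Unit =
      if (lockEnabled && readHashes(artifactsFile) != resolvedHashes)
        sys.error("Resolved artifacts differ from " + artifactsFile.getName +
          " - re-run resolution and check in the updated file")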

Fredrik Ekholdt

Sep 21, 2013, 6:44:18 AM
to adep...@googlegroups.com
The beauty of the model is really how simple it is and how easy it is to implement. 

I have tried to prove this and hacked together a (naive) implementation of the resolution engine the way I see it now: https://github.com/adept-dm/adept/blob/master/src/main/scala/adept/core/resolution/Resolver.scala
The actual algorithm is only ~30 lines of code.

I also have a test DSL so that it is easy to create small and easy-to-read test cases. I have made some example unit tests at the following link, which further demonstrate how resolution works: https://github.com/adept-dm/adept/blob/master/src/test/scala/adept/core/resolution/ConstraintsTest.scala

I think it is important to have a testing framework where it is easy to test small specific chunks of functionality - if you have any input on the way the DSL looks it is most welcome. 
Also the test cases are a good place to start if you want to understand the model better.
You are also most welcome to add test cases and to try to break it or use the test dsl to find ways where we cannot express a use case using this model! :)

If we can solve all use cases and prove that this implementation works, I think it is just a matter of adding the tooling around it and (as we go further) making the implementation faster and safer (if you look at the impl you will see why I mention this - currently it is optimised for readability only :).
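
For those who do not want to click through, the core idea is roughly this (a stripped-down sketch of the model, not the actual Resolver code):

    // Sketch: a variant of a module has attributes, and constraints filter the variants per id.
    // Exactly one match = resolved, none = over-constrained, several = under-constrained.
    case class Attr(name: String, values: Set[String])
    case class Variant(id: String, attributes: Set[Attr])

    def matches(variant: Variant, constraints: Set[Attr]): Boolean =
      constraints.forall { c =>
        variant.attributes.exists(a => a.name == c.name && c.values.subsetOf(a.values))
      }

    def resolveOne(id: String, constraints: Set[Attr],
                   allVariants: Map[String, Seq[Variant]]): Either[String, Variant] =
      allVariants.getOrElse(id, Seq.empty).filter(matches(_, constraints)) match {
        case Seq(single) => Right(single)
        case Seq()       => Left(s"$id is over-constrained")
        case several     => Left(s"$id is under-constrained (${several.size} variants left)")
      }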

Josh Suereth

Sep 22, 2013, 3:44:54 PM
to adept-dev
On Sat, Sep 21, 2013 at 6:44 AM, Fredrik Ekholdt <fre...@gmail.com> wrote:
The beauty of the model is really how simple it is and how easy it is to implement. 

I have tried to prove this and hacked together a (naive) implementation of the resolution engine the way I see it now: 
The actual algorithm is only ~ 30 lines of code.


Wow, that is quite small.
 
I have also a test dsl so that it easy to create small and easy-to-read test cases. I have made some example unit tests in the link that follows which further demonstrates how resolution works: https://github.com/adept-dm/adept/blob/master/src/test/scala/adept/core/resolution/ConstraintsTest.scala

 
I think it is important to have a testing framework where it is easy to test small specific chunks of functionality - if you have any input on the way the DSL looks it is most welcome. 
Also the test cases are a good place to start if you want to understand the model better.
You are also most welcome to add test cases and to try to break it or use the test dsl to find ways where we cannot express a use case using this model! :)


Looks like a good start so far!  I couldn't tell from the two locations you list, but what kind of information is reported upon resolution failure?    I'd say trying to get a robust error message on "tricksy" failure would be a good next step.   A resolution engine that works in the happy case is great.  A resolution engine that is *informative* on the failure case is pretty much a promised land of goodness and unicorns.   What do you think an elegant way would be to test error messages or error information?

 
If we can solve all use cases and prove that this implementation works, I think it is just a matter of adding the tooling around it and (as we are going further) and make the implementation faster and safer (if you look at the impl you see why I mention this - currently it is optimised for readability only :).


Yeah, it's amazing to see how fast this project is moving.  Great work Fred!!

Fredrik Ekholdt

Sep 22, 2013, 6:44:02 PM
to adep...@googlegroups.com


On Sep 22, 2013 8:45 PM, "Josh Suereth" <joshua....@gmail.com> wrote:
>
>
>
>
> On Sat, Sep 21, 2013 at 6:44 AM, Fredrik Ekholdt <fre...@gmail.com> wrote:
>>
>> The beauty of the model is really how simple it is and how easy it is to implement. 
>>
>> I have tried to prove this and hacked together a (naive) implementation of the resolution engine the way I see it now: 
>> https://github.com/adept-dm/adept/blob/master/src/main/scala/adept/core/resolution/Resolver.scala
>> The actual algorithm is only ~ 30 lines of code.
>>
>
> Wow, that is quite small.

Yep, I think it might grow a bit, but if the core resolution algorithm is succinct it will make things that much easier (of course)

>  
>>
>> I have also a test dsl so that it easy to create small and easy-to-read test cases. I have made some example unit tests in the link that follows which further demonstrates how resolution works: https://github.com/adept-dm/adept/blob/master/src/test/scala/adept/core/resolution/ConstraintsTest.scala
>>
>  
>>
>> I think it is important to have a testing framework where it is easy to test small specific chunks of functionality - if you have any input on the way the DSL looks it is most welcome. 
>> Also the test cases are a good place to start if you want to understand the model better.
>> You are also most welcome to add test cases and to try to break it or use the test dsl to find ways where we cannot express a use case using this model! :)
>>
>
> Looks like a good start so far!  I couldn't tell from the two locations you list, but what kind of information is reported upon resolution failure?    I'd say trying to get a robust error message on "tricksy" failure would be a good next step.   A resolution engine that works in the happy case is great.  A resolution engine that is *informative* on the failure case is pretty much a promised land of goodness and unicorns.   What do you think an elegant way would be to test error messages or error information?

Yeah, I agree that good error messages are very (* 10) important. What you can get after resolution now is:
- The graph containing module ids and their children
- The unresolved ids
- The resolved ids
- The constraints it found for each id
- The variants for each id it had when resolution ended.

If you have more than one variant per id it is under constrained, 0 variants for an id means it is over constrained.
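
As a rough illustration of how a build tool could turn that information into messages (invented names, not the actual API):

    // Sketch: classify each id by how many variants were left when resolution ended.
    case class ResolveInfo(variantsLeft: Map[String, Int], constraints: Map[String, Seq[String]])

    def report(info: ResolveInfo): Seq[String] =
      info.variantsLeft.toSeq.sortBy(_._1).collect {
        case (id, 0) =>
          s"$id is over-constrained: try loosening one of [${info.constraints.getOrElse(id, Nil).mkString(", ")}]"
        case (id, n) if n > 1 =>
          s"$id is under-constrained ($n candidate variants): add a constraint, e.g. a version"
      }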

With this info you can basically create a nice graph and show which constraints it found. As we move along we should save where we found the constraints as well so that can be part of the graph.
Even with what we have now, though, it is possible to prompt the user to be more specific if it is under-constrained (more than one variant for an id). I have the version-ordering algorithm that Ivy, Maven and PHP use, so this could be used to suggest a likely version if a version is the issue.
If it is over-constrained you can currently print out the constraints and the id and prompt the user to loosen the constraints. If there is a conflict, i.e. 2 constraints that want different things (e.g. 2 different versions of the same id), you can also detect this. In the case where it is over-constrained on a dependency that the user defined, it would be enough to ask the user to loosen the constraint. If it is on a module you did not define, you have to override that variant - overrides are not available yet though. Telling the user how to fix the issue could also be part of the error message. As we implement search we could also do a fuzzy query and get a did-you-mean in this case. I think that would be very cool.

Currently you can also print the (partial) graph it found in case it is over/under-constrained. This is nice for debugging. I know I hate it when Ivy fails deep down on a transitive dep and you cannot see why that dep is there before you fix it.

For error messages we could create some, but the design is simple enough to communicate the issues so that build tools can create them themselves - at least if we get the API right. The ideal error message for me is one that shows you the problem very clearly *and* also gives you a tip on how to fix it. As we move forward we could also have a command that fixes it for you, or at least suggests exactly what you have to do to fix it. It could be up to the build tool to choose the best approach: automatically fix or explain the issue.


>
>  
>>
>> If we can solve all use cases and prove that this implementation works, I think it is just a matter of adding the tooling around it and (as we are going further) and make the implementation faster and safer (if you look at the impl you see why I mention this - currently it is optimised for readability only :).
>>
>>
> Yeah, it's amazing to see how fast this project is moving.  Great work Fred!!

Thx:) it is really just because the design is so simple. Makes me think we might be on to something ;)
I have started on overrides and exclusions helpers now, but I will be busy with other things the first part of this week. If I am lucky I *hope* to still finish those by the end of this week. With those in place and validated, I think importing data from Maven or Ivy would be the next step. I think it is important not only for the functionality but to validate that we can solve the same use cases.
I will also update the spec with some ideas on the artifacts files that we discussed earlier if nobody else feels like they want to do it first...

Mark Harrah

Sep 22, 2013, 8:13:24 PM
to adep...@googlegroups.com
I wrote a test case that fails, but I'm not sure if I've encoded it right. Here's the graph I tried to encode:

A 1.0 -> B, C, D, E

D 1.0 -> C 3.0
D 2.0 -> C 2.0
B 1.0 -> C 2.0
B 2.0 -> C 3.0
E 1.0 -> D 1.0, B


with the expected results:

A 1.0
B 2.0
C 3.0
D 1.0
E 1.0

This is the test case I tried:

test("solving") {
val resolver = load(useTestData(
R("A")("v" -> "1.0")( V("B")()(), V("C")()(), V("D")()(), V("E")()() ),
V("D")("v" -> "1.0")(
V("C")("v" -> "3.0")()
),
V("D")("v" -> "2.0")(
V("C")("v" -> "2.0")()
),
V("B")("v" -> "1.0")(
X("C")("v" -> "2.0")
),
V("B")("v" -> "2.0")(
X("C")("v" -> "3.0")
),
V("E")("v" -> "1.0")(
X("D")("v" -> "1.0"),
X("B")()
)
))

checkUnresolved(resolver, Set())
checkResolved(resolver, Set("A", "B", "C", "D", "E"))
}

Fredrik Ekholdt

Sep 23, 2013, 2:27:07 AM
to adep...@googlegroups.com, Mark Harrah
Cool :) I will have a look at it this evening :)
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Fredrik Ekholdt

Sep 23, 2013, 3:36:31 AM
to adep...@googlegroups.com, Mark Harrah
I think I see the issue. The test case is almost correct: you have some Vs in the definition of A that should be Xs :) It doesn't really matter though, because I think you found a valid issue :)
The algo is currently super naive and stops if something is unresolved. So this is expected (though I can't try it now), but it is something we need to fix.

I'll post some examples to help debugging as well. Not sure I am able today.

Tobias Roeser

Sep 24, 2013, 8:02:35 AM
to adep...@googlegroups.com
Nice to have a specification. :-)

Here are some thoughts.


==Artifacts==

From Definitions: "An artifact hash is linked to a set of location providers,
where the artifact can be found"

Will it be possible to download artifacts from their original source? This
is, for me, an important goal to reach and is essentially what I hear when you
say "separated metadata".

Will artifacts inside other resources/artifacts be possible, e.g. a JAR in a
ZIP on the tool's home page? This is of course not a major use case, but some
dependencies are sometimes only available inside a ZIP file from the original
tool/library provider (e.g. JUnit, though I haven't checked that in the last 12
months).


==Attributes==

When defining an attribute (in the global config), how do we deal with modules
that do not set this attribute? Will these be excluded from the tree or
included by default?

Also, I think it is essential to have some predefined common-sense attributes,
to avoid a cluttered, inhomogeneous attribute landscape where nobody knows
which attribute means what.


==Hashes==

As discussed earlier on this ML, I argue that hashes are not that readable by
humans and do not provide a natural order like e.g. incremented revision
numbers. When thinking about package versions, you have to deal with three
kinds of metadata evolution:

1. new version of the package
2. new version of the metadata (e.g. because of newly known incompatibilities)
3. refactorings of the metadata itself (e.g. because of new features of adept)
which do not affect the effective artifacts for the end user

In my view, the third category should not bump the artifact revision/hash as
there are no effects for the package consumer. But this is not possible if the
hashes are driven by the underlying storage mechanism (currently git). If we
used a more maintainer-friendly "hash", e.g. r0, r1, then this would be
possible and the package consumer could easily grasp which revision (aka
hash) is newer. Of course, some rules have to be applied about the cases in
which it is allowed not to bump a revision.

Also, ordered revision numbers could be included in version ranges, whereas
unordered hashes cannot be (at least not without some additional knowledge).

But these are just some thoughts based on the use of Jackage and almost ten
years of Gentoo portage, which both have such a revision mechanism. Besides the
inconvenience, no show-stopper for me.


==Pre-Resolving aka having an "artifacts.txt"==

I believe a non-transient, explicit classpath is what most mature projects
need and want. Whereas in new, quick-and-dirty, test, name-it-what-you-want
projects you want a fast start, and automatic transitive dependency resolution
is desirable.

I very much like the idea of resolving a dependency graph based on metadata
and committing the result to the project repo. Nobody else (who wants to build
the project) should be required to re-resolve it. But everybody should be
able to re-resolve to the same result, which is the key to reproducible
setups.

A typical workflow, utilizing a package manager in a "passive way", could be
like this:
- Dev adds some deps to project
- Dev asks package manager/build system to suggest some additional/missing
transitive deps
- Dev picks the ones he wants and makes them persistent
- Later the dev modifies, adds, removes, bumps some deps
- Package manager/build system re-analyzes the classpath and detects
missing/conflicting packages and makes suggestions

Therefore, in SBuild, we do not depend on a package manager or on the concept
of automatic managed dependencies at all. But of course, we support it.

So, the SBuild-Adept integration could be just an analysis, verification
and suggestion step. No automatism in the build chain, but lots of help in
assembling the build chain. Any hard decision between dependencies is
supervised by the developer. And I would like to not have an "artifacts.txt"
file but integrate it into SBuild's DSL; that should be an implementation
detail, IMHO.


==Dependency resolver, conflict resolution, stable packages==

Just some pointers here to avoid NIH syndrome.

In OSGi land, there is the OSGi bundle repository specification RFC and some
implementations, most notably Apache Felix OBR
(http://felix.apache.org/site/apache-felix-osgi-bundle-repository.html), which
support dependency resolution based on a very generic capabilities model. This
feels almost as powerful as the constraint-based approach of
Adept Mark II.

Also the package manager of Gentoo Linux called portage has conceptually very
much in common with the aims of adept. Separate metadata, keywords, use-
flags...

And speaking about portage, it brings the concept of stable vs. unstable
packages, so you can declare a package as unstable as long as you are testing
it. After some time without any change and negative feedback, you can make it
stable. If somebody uses an unstable package he knows that it might blow up and
ideally knows how to report issues.


Sorry, for the longish post.

Best regards,
Tobias

Fredrik Ekholdt

Sep 24, 2013, 3:13:50 PM
to adep...@googlegroups.com
On Sep 24, 2013, at 1:02 PM, Tobias Roeser wrote:

Nice to have a specification. :-)

Here are some thoughts.
Cool - thanks! :) Lots of great comments I see! See inline for more



==Artifacts==

From Definitions: "An artifact hash is linked to a set of location providers,
where the artifact can be found"

Will it be possible to download artiftacts from their original source? This
is, for me, an important goal to reach and is essentially what I hear when you
tell separated metadata.
I am not very specific about what location providers are, but my current thinking is that it is just a URI. It can be any type of file as well.
We can add our own protocols if we need something more complicated as we go. We could also have some properties (host(s)) that are used, so you would be able to switch out the hosts easily. Just ideas for now though.
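
For example, a location could be as simple as a URI template plus overridable properties (just an idea, names invented):

    // Just an idea: a location is a URI template plus properties that can be overridden,
    // e.g. to switch to a mirror without touching the rest of the metadata.
    case class ArtifactLocation(uriTemplate: String, properties: Map[String, String]) {
      def uri: String =
        properties.foldLeft(uriTemplate) { case (acc, (key, value)) =>
          acc.replace("${" + key + "}", value)
        }
    }

    val location = ArtifactLocation(
      "https://${host}/maven2/junit/junit/4.11/junit-4.11.jar",
      Map("host" -> "repo1.maven.org"))
    // location.copy(properties = Map("host" -> "some.mirror.example.org")).uri switches the host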


Will artifacts inside other resources/artifacts be possible, e.g. a JAR in a
ZIP on the tools home page? This is of course no major use case, but some
dependencies are sometimes only available inside a ZIP file from the original
tool/library provider (e.g. JUnit, but I didn't check since in the last 12
month, though).


==Attributes==

When defining an attribute (in the global config), how to deal with modules,
that do not set this attribute. Will these be excluded from the tree or
included by default?

Also, I think, it is essential to have some predefined common-sense attributes,
to avoid a cluttered inhomogenous attribute landscape, where nobody knows
which attribute means what.
Yes, I think so as well. In terms of implementation I was planning on creating a separate module which includes the names and common use cases, so that it would break on compile if they were to change. Actually, having a good starting point and a close relationship with the build tool owners is extremely important to avoid too many issues as things move forward.



==Hashes==

As discussed earlier in that ML, I argue, that hashes are not that readable by
human and do not provided a natural order like e.g. incremented revision
numbers. When thinking about package versions, you have to deal with three
kind of metadata evolutions:

1. new version of the package
2. new version of the metadata (e.g. because of newly known incompatibilities)
3. refactorings of the metadata itself (e.g. because of new features of adept)
which do not affect the effective artifacts for the end user

In my view, the third category should not bump the artifact revision/hash as
there are no effects for the package consumer. But this is not possible if the
hashes are driven by the underlying storage mechanism (currently git). If we
would use a more maintainer-friendly "hash", e.g. r0, r1 than this would be
possible and the package consumer could easily grasp, which revision (aka
hash) is newer. Of course, some rules have to be applied, in which cases it is
allowed to not-bump a revision.
Maybe there is a misunderstanding when it comes to hashes here? The only hashes which are left in the model are the hashes that uniquely identify _artifacts_. The reason I am asking if there is a misunderstanding is that the way I am reading this is that hashes somehow change when metadata is changed.
The artifact hashes are meant to be the SHA-1 (or 256) of the actual artifact itself, so they should definitely be independent of the metadata.
So only in 1) would you have a new artifact hash; 2) just changes the metadata, and if the artifacts are the same the hashes do not change.
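
In other words, something like this (just a sketch):

    import java.io.{File, FileInputStream}
    import java.security.MessageDigest

    // Sketch: the artifact hash is computed from the artifact bytes only,
    // so editing metadata never changes it.
    def artifactHash(file: File): String = {
      val digest = MessageDigest.getInstance("SHA-1")
      val in = new FileInputStream(file)
      try {
        val buffer = new Array[Byte](8192)
        var read = in.read(buffer)
        while (read != -1) {
          digest.update(buffer, 0, read)
          read = in.read(buffer)
        }
      } finally in.close()
      digest.digest().map("%02x".format(_)).mkString
    }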

Sorry if I am misunderstanding (I have been holding a course, so I am a bit tired today).

Also, ordered revision nr. could be included into versions ranges, whereas
unordered hashed do not support to be included in version ranges (at least
not, without having some additional knowledge).

But these are just some thought based on the use of Jackage and almost ten
years of Gentoo portage, which both have such a revision mechanism. Besides
inconvenience, no show stopper for me.


==Pre-Resolving aka having an "artifacts.txt"==

I believe, a non transient explicit classpath is what most mature projects
need and want. Whereass in new, quick-and-dirty, test, name-it-what-you-want
projects you want a fast start and automatic transitive dependency resolution
is desireable.

I very much like the idea of resolving a dependency graph based on metadata
and committing the result to the project repo. Nobody else (who wants to build
the project) should be required to re-resolve them. But, everybody should be
able to re-resolve to the same result, which is the key for reproducable
setups.
Ok, that is good :) I feel that this pattern fits quite well for both use cases: never-changing builds with hard-core verification of what is used, vs. developer friendliness.


A typical workflow, utilizing a package manager in a "passive way", could be
like this:
- Dev adds some deps to project
- Dev asks package manager/build system to suggest some additional/missing
transitive deps
- Dev picks the one he wants and makes them persistent
- Later the dev modifies, adds, removes, bumps some deps
- Package manager/build system re-analyzes the classpath and detects
missing/conflicting packages and makes suggestions

Therefore, in SBuild, we do not depend on a package manager or on the concept
of automatic managed dependencies at all. But of course, we support it.

So, the SBuild-Adept intergration could be just some analyzing, verification
and suggestion step. No automatism in the build chain, but lots of help in
assembling the build chain. Any hard decision between dependencies is
supervised by the developer. And, I would like to not have a "artifacts.txt"
file but integrate it into SBuild's DSL, but that should be an implementation
detail, IMHO.
Yep, I was also thinking along those lines. 



==Dependency resolver, conflict resolution, stable packages==

Just some pointers here to avoid NIH syndrom.

In OSGi land, there is the OSGi bundle repository specification RFC and some
implementations, most notably Apache Felix OBR
(http://felix.apache.org/site/apache-felix-osgi-bundle-repository.html), which
support dependency resolution based on a very generic capabilities model. This
fells like almost the same (mightiness) as the constraint-based approach of
Adpet Mark II.
This is a great comment! 

Honestly I haven't been using OBR (or OSGi) in any project I have worked on.

Anybody else who has some more informed opinions about it?

This is what I could gather from looking into it now:
I see that it seems to build on the same model, like you say, but it also has queries (filters) with ranges that you can use (so it is more powerful that way).
It *looks* like the main difference in its design is that the repositories are server-based?
I think versioned offline metadata is key to Adept. It makes it easier to share and contribute metadata, it makes it more reliable (no need for a repository manager to get around this), it makes tooling easier to build, ...

I wanted to say that being part of OSGi is not necessarily a bad thing, but I think when it comes to a dependency manager it is a great liability.
Even if you can use it independently (though it seems to be quite integrated when you look at their use of Manifests, etc.), I think it is a problem for any project that doesn't want or need to be part of OSGi, even if that is not strictly a technical issue.


I guess OBR is the closest "competitor" to Adept, and if there is a 100% overlap in features and capabilities I guess Adept would not be needed. That being said, it is alarming that this problem is so present, but nobody seems to be using OBR (in 2013) except Eclipse, even though it has been under development for years (http://www.youtube.com/watch?v=hemY-6dfPnw). Are they too tied to OSGi? Did they have a poor migration model?
For all I know it might be a community issue that holds them back? They seem to have a strict specification process.
Did they solve a problem ahead of their time?
I feel I sound very biased here (and I think I am :) just so I have said it.

Would be interesting to hear others' opinions. Either way it definitely deserves more attention.


Also the package manager of Gentoo Linux called portage has conceptually very
much in common with the aims of adept. Separate metadata, keywords, use-
flags...
Another good one: 
I know portage from the days when I was using Gentoo (a while back), but I haven't looked closely enough at their model to be honest. 
Their model seems tailored to handling packages with their own build tool to figure out the best version ("decision making" is what they call it, it seems).

As for differences I guess they are operating in another space, and, again, they also integrate the dependency manager with the build tool...

Is it really too much to ask to have a simple dependency management system that is not part of any larger system/build tool? It seems like a perfectly normal abstraction layer to me... Imagine how cool it would be to have the same dependency ecosystem from C to Objective-C all the way to the JVM, especially in this polyglot age.


And speaking about portage, it brings the concept of stable vs. unstable
packages, so you can declare a package as unstable as long as you are testing
it. After some time without any change and negative feedback, you can make it
stable. If somebody used a unstable package he knows that it might blow and
idealy knows how to report issues.
Hmmm. Yeah, I can see how that comes in handy. It could of course be expressed as a constraint - maybe this should be something defined by the common set of attributes.


Sorry, for the longish post.
Not at all - I am really glad to get feedback!

Tobias Roeser

Sep 24, 2013, 5:00:35 PM
to adep...@googlegroups.com
On Tuesday, 24 September 2013, at 21:13:50, Fredrik Ekholdt wrote:
| On Sep 24, 2013, at 1:02 PM, Tobias Roeser wrote:
| > Nice to have a specification. :-)
| >
| > Here are some thoughts.
|
| Cool - thanks! :) Lots of great comments I see! See inline for more
|
| > ==Artifacts==
| >
| > From Definitions: "An artifact hash is linked to a set of location
| > providers, where the artifact can be found"
| >
| > Will it be possible to download artiftacts from their original source?
| > This is, for me, an important goal to reach and is essentially what I
| > hear when you tell separated metadata.
|
| I am not very specific about what location providers are, but my current
| thinking that it is just a URI. It can be any type of file as well. We
| can add our own protocols if we need something more complicated as we go.
| We could also have some properties (host(s)) that is used, so you would be
| able to switch out the hosts easily. Just ideas for now though.

That sounds good. The properties idea is good and makes particular sense for
mirrored resources like Maven repos, Eclipse, Sourceforge, etc. In Portage,
they have such a mechanism e.g. for Sourceforge, so instead of
http://sourceforge.net/files/a/b/c you just use http://sourceforge/a/b/c.

| > Will artifacts inside other resources/artifacts be possible, e.g. a JAR
| > in a ZIP on the tools home page? This is of course no major use case,
| > but some dependencies are sometimes only available inside a ZIP file
| > from the original tool/library provider (e.g. JUnit, but I didn't check
| > since in the last 12 month, though).
| >
| >
| > ==Attributes==
[..]
Indeed. My mental model of Adept's core might be wrong. Let me try to explain.
Since Adept is all about metadata, the Adept repo only contains metadata.
Let's call the metadata "packages" for now. So, imagine you evolve a package
because you have to fix the dependencies (e.g. to not depend on >=lib.a-1.0.0
but on >=lib.a.1.0.7 because of some newly revealed incompatibilities); then
you have to create a new git commit for that package, which of course still
points to the same artifact. So, I agree, the artifact hash stays the same;
let's call the artifact hashes "checksums". But, if I understand it correctly,
you also want to refer to the exact same metadata by referring to that git
commit hash, so let's call the git hashes "revisions".

I was referring to revisions in the sense that a developer might want to say:
"I want package lib.b.1.0.0 with revision 3, as I know that since that revision the
transitive dependencies work for me, but with revision 2 I had major issues."
What he really wants is to use the metadata of revision 3 for the dependency
tree computation. Of course, when somebody else just wants version 1.0.0 of
lib.b, the package he typically wants is the one with the highest revision.
If this revision is a git hash, then there would be no easy way to
tell if package lib.b.1.0.0-ab1075f is newer than lib.b.1.0.0-ef91d7. It is
not exactly clear to me if Adept can work that way. Maybe it's just how I
want it to be. If you could follow my mental model somehow, could you explain
the difference to the real model, please? ;-)

| Sorry if I am misunderstanding (I am been holding a course, so I am bit
| tired today)
|
| > Also, ordered revision nr. could be included into versions ranges,
| > whereas unordered hashed do not support to be included in version ranges
| > (at least not, without having some additional knowledge).
| >
| > But these are just some thought based on the use of Jackage and almost
| > ten years of Gentoo portage, which both have such a revision mechanism.
| > Besides inconvenience, no show stopper for me.
| >
| >
| > ==Pre-Resolving aka having an "artifacts.txt"==
[..]
|
| > ==Dependency resolver, conflict resolution, stable packages==
| >
| > Just some pointers here to avoid NIH syndrom.
| >
| > In OSGi land, there is the OSGi bundle repository specification RFC and
| > some implementations, most notably Apache Felix OBR
| > (http://felix.apache.org/site/apache-felix-osgi-bundle-repository.html),
| > which support dependency resolution based on a very generic capabilities
| > model. This fells like almost the same (mightiness) as the
| > constraint-based approach of Adpet Mark II.
|
| This is a great comment!
|
| Honestly I haven't been using OBR (and OSGI) in any project I have worked
| on.
|
| Anybody else who have some more informed opinions about it?
|
| This is what I could gather from looking into it now:
| I see that it is seems builds on the same model like you say, but it also
| has queries (filter) with ranges that you can use (so it is more powerful
| that way). It *looks* like the main differences in its design is that the
| repositories are server-based? I think versioned offline metadata is key
| to Adept. It makes it possible to share and contribute metadata easier, it
| makes it more reliable (no need for repository manager to get around
| this), it makes tooling easier to build, ...

AFAIK, there is no requirement to run a server. In fact, an OBR is just a
bunch of metadata (XML) gathered by scanning the bundles with a tool, e.g.
bindex. The dependency computation is then based on that metamodel.

My main intention was not primarily the potential code reuse but reuse of
concepts, terminology, algorithms, problem awareness, and so on. The same holds
true for Gentoo portage, which is implemented in Python, btw. When I read
about attributes and constraints, my first thought was that this might be a
solved problem, just with another terminology and another (but not so different)
domain. Also, I had the feeling that the metadata format of Adept is rather
hard to read compared to an ebuild (Portage's metadata). But please take this
with a grain of salt, it's a very personal feeling! What I want to point out is:
their solutions might be good resources of knowledge, and, as an example,
getting error and conflict reporting right can be a long process, which we
could shorten by looking beyond our own domain.

| I wanted to say that being part of OSGi is not necessarily a bad thing, but
| I think when it comes to a dependency manager it is a great liability. Even
| if you can use it independently (though it seems to be quite integrated
| when you look at their use of Manifests, etc.), I think it is a problem for
| any project that doesn't want or need to be part of OSGi, even if it is not
| strictly a technical issue.
|
|
| I guess OBR is the closest "competitor" to Adept, and if there is a 100%
| overlap in features and capabilities I guess Adept would not be needed.

Keep in mind that an OBR primarily operates on a versioned package level
(Java packages, the ones you can import), plus versioned bundles, plus
transitivity in terms of uses constraints. These calculations might already be
more than what is needed to set up a build tool. E.g. a compiler only supports
a flat classpath, but the OSGi runtime provides real modules and isolation, as
each bundle has its own classpath. So the implementation as such might not be
a good fit for our use case.

| That being said, it is alarming that this problem is so widespread, yet
| nobody seems to be using OBR (in 2013) except Eclipse, even though it has
| been under development for years
| (http://www.youtube.com/watch?v=hemY-6dfPnw).
| Are they too tied to OSGi? Did they have a poor migration model? For all I
| know it might be a community issue that holds them back? They seem to have
| a strict specification process. Did they solve a problem ahead of their
| time?
| I feel I sound very biased here (and I think I am :) just so I have said it.
|
| Would be interesting to hear others' opinions? Either way it definitely
| deserves more attention.

Sorry, I did not follow your links, but maybe you got a wrong impression.
Of course, the RFC never made it to final, which means in OSGi land you cannot
claim your implementation to be final and stable. But besides that, there are
a lot of tools using OBRs, e.g. Apache Karaf or Eclipse bndtools. AFAIK the
official Eclipse herd goes in another direction with p2, which is a
complicated beast (from what I hear) but, besides that, yet another dependency
manager. ;-)

| > Also, the package manager of Gentoo Linux, called Portage, has
| > conceptually very much in common with the aims of Adept: separate
| > metadata, keywords, use-flags...
|
| Another good one:
| I know portage from the days when I was using Gentoo (a while back), but I
| haven't looked closely enough at their model to be honest. Their model
| seems tailored to handling packages with their own build tool to figure out
| the best version ("decision making" is what they call it, it seems).
|
| As for differences, I guess they are operating in another space, and, again,
| they also integrate the dependency manager with the build tool...
| Is it really too much to ask to have a simple dependency management
| system that is not part of any larger system/build tool? It seems like a
| perfectly normal abstraction layer to me... Imagine how cool it would be
| to have the same dependency ecosystem from C to Objective-C all the way to
| the JVM, especially in this polyglot age.

I'm with you. One tool for one purpose. I believe the absence of (a sense for)
a dependency management system led to Maven, which is neither ideal as a
dependency manager nor as a build tool.

| > And speaking about portage, it brings the concept of stable vs. unstable
| > packages, so you can declare a package as unstable as long as you are
| > testing it. After some time without any changes or negative feedback,
| > you can make it stable. If somebody uses an unstable package he knows
| > that it might blow up and ideally knows how to report issues.
|
| Hmmm. Yeah, I can see how that comes in handy. It could of course be
| expressed as a constraint - maybe this should be something defined by the
| common set of attributes.

Yeah, that might work. But depending on my (correct or incorrect)
understanding of how metadata revisions are handled, I do not understand how
you want to make a package stable without changing its revision.

Best regards,
Tobias


| > Sorry, for the longish post.
|
| Not at all - I am really glad to get feedback!
|
| > Best regards,
| > Tobias
| >
| > Am Freitag, 20. September 2013, 18:19:06 schrieb Fredrik Ekholdt:
| > | Hi everybody!
| > | Lately, me, Mark and Josh have been thinking about how to improve the
| > | model for Adept so that it can better support variances between for
| > | example scala binary versions (but also other ones such play or
| > | android versions or even C++ versions).
| > |
| > | I have tried to write a spec for once to better explain the model we
| > | are looking at, why we want it and how it can be used to solve some
| > | actual use cases.
| > |
| > | The spec can be found here:
| > | https://docs.google.com/document/d/1xU9m2zxva2eKhiXVYYqjmZieaWPJY0mDbEm
| > | Z_pE 5P5c/edit?usp=sharing

Josh Suereth

unread,
Sep 24, 2013, 6:01:52 PM9/24/13
to adept-dev
This notion of "latest revision in the series" isn't quite "branch" friendly. I think the "ideal" solution for such a thing is very similar to what GitHub does: you host some external "branch" identifier that points to a specific version, and this "branch" can be updated to point to another specific version, similar to how GitHub branches actually work.

How exactly this manifests in Adept I haven't quite figured out, but there is one lazy option:

currentRevisionSeries=1.0.x
latestRelease=true

If you wish to change what the latest is, then when you push the next version you move the "latestRelease" attribute from the previous module variant to the latest one. This is effectively your way of "tagging", or of having redirectable "branches" to specific versions (module variants).

I agree with you that you need this notion of "latest in a branch/series", but you don't want to change the core metadata of the artifact, nor do you want to alter the hashes. What you want is some "external" index-like thing that points to "latest" for that branch. This way, you can maintain more than one series, e.g. "integration releases from branch XYZ in git" can have their own set of releases which don't interfere with the notion of "latest" from master, or latest from a 1.x branch.
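To make that concrete, here is a minimal, self-contained Scala sketch of the idea (all names and hashes are made up; this is not the actual Adept code or format): the series index lives outside the variant metadata, so re-pointing it never changes any variant hashes.

  // Hypothetical sketch of an external "branch"/series index: it points at a
  // variant by its hash and can be re-pointed without touching the variant
  // metadata or the hashes themselves.
  object SeriesIndexSketch extends App {
    case class VariantRef(moduleId: String, hash: String)

    // series name (e.g. "1.0.x" or "integration-xyz") -> current latest variant
    val before: Map[String, VariantRef] =
      Map("1.0.x" -> VariantRef("lib.b", "ab1075f"))

    // Publishing a new release only rewrites the index entry; old variants stay intact.
    val after = before.updated("1.0.x", VariantRef("lib.b", "ef91d7"))

    println(after("1.0.x")) // VariantRef(lib.b,ef91d7)
  }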


Fredrik Ekholdt

unread,
Sep 27, 2013, 11:10:54 AM9/27/13
to adep...@googlegroups.com, Mark Harrah
Mark: this issue should be solved now (see latest commit).  I am not sure it works if the constraints are deeper in the transitive graph. Need more complicated test cases but I do not have time for it today.

Fredrik Ekholdt

unread,
Sep 27, 2013, 1:58:07 PM9/27/13
to adep...@googlegroups.com, Mark Harrah
I have given it some more thought and am actually pretty sure that, although all the tests pass, I am doing it wrong, so it will have to be continued :)

Fredrik Ekholdt

unread,
Oct 1, 2013, 4:03:25 PM10/1/13
to adep...@googlegroups.com, Mark Harrah
Alrighty! Now it works the way I think it is supposed to!
Check it out here: https://github.com/adept-dm/adept

Maybe you are thinking: what does this mean in practical terms (and you do not have time to look into the tests)?
Imagine I want the modules play, slick and therefore play-slick (Play's Slick integration module).
In this simple world there is only one version of play 2.2.x (2.2.0). play-slick has only one version that works with play 2.2.x, and it requires slick 1.0.1. Any other dependencies are perfectly constrained.

In this case I only need to constrain play to minor version 2.2; everything else will be figured out by Adept. Since there is only one variant of play-slick, and it requires one specific variant of play and slick (which in turn constrain akka-actor and so on), everything resolves. If a new version of play or play-slick shows up it will fail, and then you can either constrain it yourself or have a build tool that does it for you.
If a Scala version (2.11) is released and play 2.2 is built for it, you only have to change your Scala dependency and it will resolve just as well (if there are variants that work with it).
Pretty cool?
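To illustrate, here is a rough, self-contained Scala sketch of that scenario (the types, attribute names and the play-slick version are made up; this is not Adept's real data model, and it does not traverse the requirements, it only shows that exactly one variant per module remains admissible once the user constrains play):

  object PlaySlickSketch extends App {
    // A variant: a module with attributes and requirements on other modules.
    case class Variant(module: String, attrs: Map[String, String],
                       requires: Map[String, Map[String, String]] = Map.empty)

    val variants = Seq(
      Variant("play",       Map("version" -> "2.2.0", "binary-version" -> "2.2")),
      Variant("slick",      Map("version" -> "1.0.1")),
      Variant("play-slick", Map("version" -> "0.5.0"), // play-slick version is made up
        Map("play" -> Map("binary-version" -> "2.2"), "slick" -> Map("version" -> "1.0.1")))
    )

    // The user's only constraint: play's minor/binary version.
    val userConstraints = Map("play" -> Map("binary-version" -> "2.2"))

    // A variant is admissible if it matches every constraint placed on its module.
    def admissible(constraints: Map[String, Map[String, String]]): Seq[Variant] =
      variants.filter { v =>
        constraints.getOrElse(v.module, Map.empty[String, String])
          .forall { case (k, value) => v.attrs.get(k).contains(value) }
      }

    // Exactly one variant per module survives, so resolution succeeds.
    println(admissible(userConstraints).map(_.module)) // List(play, slick, play-slick)
  }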

In addition I can print out graphs and cyclic dependencies are handled.
For this test data:

     R("A")("v" -> "1.0")( //I want A v 1.0

        X("B")(), //it depends on some variant of B (do not care which)

        X("C")(),  //and it depends on some variant of C (do not care which), etc etc

        X("D")(),

        X("E")()),

      V("E")("v" -> "1.0")( //there is only one version of E v 1.0

        X("D")("v" -> "1.0")), //and it requires D 1.0

 //there are 2 versions of C:

      V("C")("v" -> "2.0")(),

      V("C")("v" -> "3.0")(), //since we want D 1.0, we and it depends on C 3.0, we must use C 3.0

      V("D")("v" -> "2.0")( 

        X("C")("v" -> "2.0")),

      V("D")("v" -> "1.0")(

        X("C")("v" -> "3.0")), //<-- depends on C 3.0

//there are also 2 version of B

      V("B")("v" -> "1.0")(

        X("C")("v" -> "2.0"),

        X("F")()),

      V("B")("v" -> "2.0")(

        X("C")("v" -> "3.0"), //but we must use 2.0 because of our requirement on C 3.0

        X("F")()),

//2 variants of F again

      V("F")("v" -> "1.0")(

        X("C")("v" -> "2.0")),

      V("F")("v" -> "2.0")(

        X("C")("v" -> "3.0")) //same thing as for B

You get this:

- A [v=(1.0)]
 - B [v=(2.0)]
  - C <defined>
  - F [v=(2.0)]
   - C <defined>
 - C [v=(3.0)]
 - D [v=(1.0)]
  - C <defined>
 - E [v=(1.0)]
  - D <defined>


A note on the implementation
The implementation is lacking in quality (at least in my opinion), although the idea is simple enough. If it can resolve (only one variant for each module) it returns immediately; if it is over-constrained (too many constraints for a module) it returns immediately; if it is under-constrained it will try out combinations of variants (different versions, if you will).

Complexity comes from having to figure out the right set of combinations: Adept has to find a *unique* set of combinations based on a *minimal* set of constrained variants (it starts by constraining the first one, then the next, etc., then it continues with combinations of all of them) in order to resolve.

The problem is that it can be slow if there are a lot (actually it doesn't need to be that many) of unresolved modules (especially if they are cyclic). I have parallelised it to make it faster, but I am pretty sure there is a bug somewhere and that it is trying combinations that will never work; it requires more thought than I can put into it tonight :(
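For the curious, here is a rough, self-contained Scala sketch of that loop (hypothetical types, no transitive requirements or cycle handling, and it just returns the first working combination rather than the unique/minimal set the real implementation looks for):

  object ResolverSketch extends App {
    case class Variant(module: String, attrs: Map[String, String])

    // Some(variant per module) if resolvable, None if over-constrained or no combination works.
    def resolve(universe: Seq[Variant],
                constraints: Map[String, Map[String, String]]): Option[Map[String, Variant]] = {
      val admissible = universe.groupBy(_.module).map { case (m, vs) =>
        m -> vs.filter { v =>
          constraints.getOrElse(m, Map.empty[String, String])
            .forall { case (k, value) => v.attrs.get(k).contains(value) }
        }
      }
      if (admissible.exists(_._2.isEmpty)) None                // over-constrained
      else if (admissible.forall(_._2.size == 1))              // resolved
        Some(admissible.map { case (m, vs) => m -> vs.head })
      else {                                                   // under-constrained:
        val (module, candidates) = admissible.find(_._2.size > 1).get
        candidates.view.flatMap { pick =>                      // try each candidate in turn
          resolve(universe.filterNot(v => v.module == module && v != pick), constraints)
        }.headOption
      }
    }

    val universe = Seq(
      Variant("A", Map("v" -> "1.0")),
      Variant("C", Map("v" -> "2.0")),
      Variant("C", Map("v" -> "3.0"))
    )
    // C is under-constrained here, so the sketch tries candidates and returns the first that works.
    println(resolve(universe, Map("A" -> Map("v" -> "1.0"))))
  }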


@tobias: I have tried to put all my energy into this - I was travelling last week, so I have been struggling to stay focused enough to fix this hard (for me) problem. Will get back to you!

Fredrik Ekholdt

unread,
Oct 11, 2013, 7:06:23 AM10/11/13
to adep...@googlegroups.com


On Tuesday, 24 September 2013 23:00:35 UTC+2, Tobias Roeser wrote:
Am Dienstag, 24. September 2013, 21:13:50 schrieb Fredrik Ekholdt:
| On Sep 24, 2013, at 1:02 PM, Tobias Roeser wrote:
| > Nice to have a specification. :-)
| >
| > Here are some thoughts.
|
| Cool - thanks! :) Lots of great comments I see! See inline for more
|
| > ==Artifacts==
| >
| > From Definitions: "An artifact hash is linked to a set of location
| > providers, where the artifact can be found"
| >
| > Will it be possible to download artifacts from their original source?
| > This is, for me, an important goal to reach and is essentially what I
| > hear when you say separated metadata.
|
| I am not very specific about what location providers are, but my current
| thinking is that it is just a URI. It can be any type of file as well. We
| can add our own protocols if we need something more complicated as we go.
| We could also have some properties (host(s)) that are used, so you would be
| able to switch out the hosts easily. Just ideas for now though.

That sounds good. The properties idea is good and makes particular sense for
mirrored resources like Maven repos, Eclipse, Sourceforge, etc. In Portage,
they have such a mechanism, e.g. for Sourceforge, so instead of
http://sourceforge.net/files/a/b/c you just use http://sourceforge/a/b/c.
Yep :) 
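To make that a bit more concrete, here is a small, self-contained Scala sketch (the format, the host placeholder syntax and the hash are all made up; this is not Adept's real metadata): an artifact hash maps to location templates, and a host property can be swapped per mirror.

  object LocationSketch extends App {
    // One artifact hash -> a set of location templates with a host placeholder.
    val locations: Map[String, Seq[String]] = Map(
      "3f786850e387550f" -> Seq("https://{sourceforge}/files/a/b/c.jar") // hash is made up
    )

    // The user's (or repo's) mapping of placeholder -> concrete mirror host.
    val hosts = Map("sourceforge" -> "downloads.sourceforge.net")

    def expand(template: String): String =
      hosts.foldLeft(template) { case (uri, (key, host)) => uri.replace(s"{$key}", host) }

    println(locations.values.flatten.map(expand))
    // List(https://downloads.sourceforge.net/files/a/b/c.jar)
  }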
Right, you are touching on something interesting here, the way I read it: you are talking about how dependencies across repositories will be handled. lib.b 1.0.0 is a dependency of your build in a _different_ repository, right? You are also assuming that each module will have its own repository?

This is interesting to me because it is something I haven't specced yet. I think your mental model fits with what I want to do though :)

So over to the current question and how I am thinking about it:
I should start by saying I agree that it must be possible to constrain on a git URI and hash/"revision"; I am just adding a bit more context to how I see repositories interoperating.

The way I am thinking about repositories now is that we have to make sure repositories are smaller than for Maven/Ivy (no single central repo). The reason they have to be small is to make it easier to publish and to keep the amount of metadata to download as small as possible. Even though I argued for a central repo earlier (because we should be able to handle it), I now think we just need a place where multiple repos can easily be located by a user and where you can search across all of them.

The way I have been thinking about this (so I am open to suggestions) is that each repository is its own island, so to speak: each repository contains all the variants it needs.
The reason I think that is the only way to go is that you do not want to manage the "repo" dependencies as well. With repo "dependencies" (one repo links to a set of other repos), consumers end up spending a lot of time checking out repositories, and you might get conflicts (repo1 commit aef123 vs repo1 commit baba456). Islands also make the publishing process much less complicated. The con is that there will be more redundant data, but that is not a problem.
By having everything you need in the same repository you solve the first problem (less metadata to download), but not the second (you might still have conflicts: repo1 commit aef123 might have a different set of metadata for the same module/variants than repo1 commit baba456).

I think it still fits nicely with the model though, because Adept simply fails on resolve if it finds more than one variant for a module id. The URI and the git hash could be like any other attribute. Since these attributes have special meaning to Adept, I guess they should be their own properties though. A user might also want to pull the latest version of a dependent repository, so these properties should be saved in a lookup file which can be used by an "update" command that updates all dependent repositories.

So for an author in your use case: the author decides that we are no longer compatible with lib.b version 1.0.0 in revision 2, so the author pulls the new repo of lib.b and the job is done. The URI and hash will be saved automatically and the new variants replace the old ones.

The problem now is: what if it was the user of the module that depends on lib.b who detected the issue with 1.0.0 revision 2 and below, and all repos are islands? This is the reason we want a user to be able to "update" or sync his dependencies.
The user now has access to version 1.0.0 revision 3. In this case Adept would fail (as it should) because it is under-constrained (it finds variants 1.0.0 revision 2 and revision 3). The user/build tool overrides the module that depends on lib.b (as usual) and creates a constraint on revision 3. Since the constraints are global, any other module that used the old revision of 1.0.0 will follow automatically. Just as with any override/exclude, the user could choose to create a pull request, and the author will be notified of what is happening so he can fix it.
If the lib.b dependency was strictly pinned to another revision somewhere else, you would be over-constrained and you/the build tool would have to resolve it (again) by overriding.
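A tiny, self-contained Scala sketch of that scenario (attribute names are illustrative only, not Adept's real ones): after an update both revisions of lib.b 1.0.0 are visible, which is under-constrained, and the user's extra revision constraint collapses it to one variant.

  object OverrideSketch extends App {
    case class Variant(module: String, attrs: Map[String, String])

    // After "update", both revisions of lib.b 1.0.0 are in the metadata.
    val afterUpdate = Seq(
      Variant("lib.b", Map("version" -> "1.0.0", "revision" -> "2")),
      Variant("lib.b", Map("version" -> "1.0.0", "revision" -> "3"))
    )

    def admissible(constraints: Map[String, String]): Seq[Variant] =
      afterUpdate.filter(v => constraints.forall { case (k, value) => v.attrs.get(k).contains(value) })

    // Only the version is constrained: two variants left, so Adept would fail (under-constrained).
    println(admissible(Map("version" -> "1.0.0")).size)                    // 2
    // The user overrides with a revision constraint: one variant left, so it resolves.
    println(admissible(Map("version" -> "1.0.0", "revision" -> "3")).size) // 1
  }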

Does this make sense?

Sorry for the long post :) (I am taking revenge :P ) 
Right :) Well, it is good to question if we really need this :) 
Also, I had the feeling that the metadata format of Adept is rather
hard to read compared to an ebuild (Portage's metadata).
Yeah, I think the metadata format deserves a new round. Right now it is just JSON, because it is so easy and fast to write and read _programmatically_, and not horrible for a human to read.
But please take this
with a grain of salt, it's a very personal feeling! What I want to point out
is: their solutions might be good sources of knowledge, and, as an example,
getting error and conflict reporting right can be a long process, which we
could cut short by looking beyond our own project.
Yeah, I think that is a good point.   

| I wanted to say that being part of OSGi is not necessarily a bad thing, but
| I think when it comes to a dependency manager it is a great liability. Even
| if you can use it independently (though it seems to be quite integrated
| when you look at their use of Manifests, etc.), I think it is a problem for
| any project that doesn't want or need to be part of OSGi, even if it is not
| strictly a technical issue.
|
|
| I guess OBR is the closest "competitor" to Adept, and if there is a 100%
| overlap in features and capabilities I guess Adept would not be needed.

Keep in mind that an OBR primarily operates on a versioned package level
(Java packages, the ones you can import), plus versioned bundles, plus
transitivity in terms of uses constraints. These calculations might already be
more than what is needed to set up a build tool. E.g. a compiler only supports
a flat classpath, but the OSGi runtime provides real modules and isolation, as
each bundle has its own classpath. So the implementation as such might not be
a good fit for our use case.
Yeah, OSGi is more than what most people need, but really, if there is no overhead in using it, it is an alternative.

| That being said, it is alarming that this problem is so widespread, yet
| nobody seems to be using OBR (in 2013) except Eclipse, even though it has
| been under development for years
| (http://www.youtube.com/watch?v=hemY-6dfPnw).
| Are they too tied to OSGi? Did they have a poor migration model? For all I
| know it might be a community issue that holds them back? They seem to have
| a strict specification process. Did they solve a problem ahead of their
| time?
| I feel I sound very biased here (and I think I am :) just so I have said it.
|
| Would be interesting to hear others' opinions? Either way it definitely
| deserves more attention.

Sorry, I did not follow your links, but maybe you got a wrong impression.
Of course, the RFC never made it to final, which means in OSGi land you cannot
claim your implementation to be final and stable. But besides that, there are
a lot of tools using OBRs, e.g. Apache Karaf or Eclipse bndtools. AFAIK the
official Eclipse herd goes in another direction with p2, which is a
complicated beast (from what I hear) but, besides that, yet another dependency
manager. ;-)
Yeah, I saw them, but, other than Eclipse, there are not a lot of projects that I have seen using it.
Right, but there is no problem in changing the revision, right? The user wants to update (and is notified that he can), updates the revision, and presto: there is a new variant that replaces the former one. BTW: I am planning to make it so that a build tool always sees which revision of the repos it is on. When somebody updates the revision, it is possible to log and store this. This makes it possible to have reliable builds, even though you have snapshots.
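As an illustration of that last point, here is a minimal Scala sketch (hypothetical file shape and URIs, not an Adept format) of such a per-build log of repository revisions; diffs of it would show exactly when and how the metadata changed:

  object RepoPinSketch extends App {
    // Which commit of each dependent metadata repository this build resolved against.
    case class RepoPin(uri: String, commit: String)

    val pinned = Seq(
      RepoPin("git://example.org/metadata/lib.a.git", "baba456"),
      RepoPin("git://example.org/metadata/lib.b.git", "ab1075f")
    )

    // Checking this into the project makes the resolution reproducible:
    // "update" rewrites the commits, and the change shows up in version control.
    pinned.foreach(p => println(s"${p.uri} ${p.commit}"))
  }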