How to use adept API to resolve a dependency


Tobias Roeser

Aug 29, 2013, 11:16:57 AM
to adep...@googlegroups.com
Hello all,

I like the idea of Adept, especially the separation of metadata and
released artifacts. I'd like to integrate Adept with SBuild, a Scala
based build system (http://sbuild.tototec.de).

Therefore, I'd like to know if there are already any API usage examples I
could look at to understand how I should properly use the Adept API to
resolve/download dependencies. If there are no such examples, could you please
provide a plain (non-sbt) example of how to initialize Adept and resolve a simple
dependency?

Any help is much appreciated.

Kind regards,

Tobias Roeser

Fredrik Ekholdt

Sep 2, 2013, 1:47:18 AM
to adep...@googlegroups.com
Hi Tobias!
I hope you saw my tweets earlier: again, it is really cool that you are looking into Adept! I will do whatever I can to help you out :)

I have started on a tutorial here: https://github.com/adept-dm/adept/wiki/Tutorial

CAREFUL: right now I haven't checked whether it works or not. I also see there are some changes I would like to make before you start using it.
I will ping you again later today when it is finished, if that is ok?

Also a warning: you will be the first one using the API besides myself - I am open to changing things around to make it more logical based on your input. That being said, the API doesn't give you all the knobs as it is now, which I think is good. Because of this, I think it should stabilise quickly as soon as the names of the methods are solid.

Cheers,
Fredrik

Tobias Roeser

Sep 2, 2013, 3:15:15 AM
to adep...@googlegroups.com
Hi Fredrik,

thanks for your example code. (I could not see any tweets from you; either I am
looking at the wrong account or the account is private.)

Before I run your example, I have a little question regarding the role of
Module. I don't understand what the role of the Module in this example is,
and why it is necessary at all. Given that I want to resolve the dependency
"commons-codec:commons-codec:2.1", why do I need to provide a Module to
build a dependency tree? The tree should always be the same for the same set
of dependencies and configuration, right? What am I missing here? I ask
because, the way I plan to integrate Adept into SBuild (as a SchemeHandler), it
is not always necessary to specify the coordinates of the current project.
This means that, to use Adept, I (SBuild) would have to guess or generate the
Module coordinates. This would be no problem, but if this step can be avoided,
I would rather do that. Maybe it's an indicator for a potentially slimmer
API.

Looking forward to your updates.

Best regards,
Tobias

Fredrik Ekholdt

Sep 2, 2013, 10:33:19 AM
to adep...@googlegroups.com


On Monday, 2 September 2013 09:15:15 UTC+2, Tobias Roeser wrote:
Hi Fredrik,

thanks for your example code. (I could not see any tweets from you; either I am
looking at the wrong account or the account is private.)
Hmm.. Strange... I have been traveling so I have been on and off on the mails. 


Before I run your example, I have a little question regarding the role of
Module. I don't understand what the role of the Module in this example is,
and why it is necessary at all? Given, I want to resolve the Dependency
"commons-codec:commons-codec:2.1", why do I need to provide the Module to
build a dependency tree. The tree should be always the same for the same set
of dependencies and configuration, right?
What am I missing here? I ask,
because the way I plan to integrate Adept into SBuild (as a SchemeHandler), it
is not always necessary to specify the coordinates of the current project.
This means, to use Adept, I (SBuild) would have to guess or generate the
Module coordinates. This would be no problem, but if this step can be avoided,
I'd would rather do that. Maybe, its an indicator for a potentially slimmer
API.  
Yes - I think you are right. The reason I went for a module is honestly because of how the sbt plugin works. This is why it is so great (!) that you are interested, because this is the feedback we need! Our ambition is to get started on as many build tools as possible, so we can find a nice-looking API for all of them. I am changing it so the call works like this: `Adept.resolve(repositories, dependencies, configuration, configurationMapping)`.
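For what it's worth, here is a toy sketch of what a module-free resolve could look like from a build tool's side. All types, names and the metadata map below are invented stand-ins, not the real Adept API; the sketch only illustrates that resolution can be a pure function of the requested dependencies, with no "current module" involved:

```scala
// Hypothetical stand-ins for Adept's types -- illustration only.
case class Coordinates(org: String, name: String, version: String)
case class Dependency(coords: Coordinates)

object ResolveSketch {
  // Toy metadata: direct dependencies per coordinate.
  val metadata: Map[Coordinates, Seq[Coordinates]] = Map(
    Coordinates("commons-codec", "commons-codec", "1.8") -> Seq.empty
  )

  // No module parameter: the caller's own coordinates are never needed.
  def resolve(dependencies: Seq[Dependency]): Seq[Coordinates] = {
    def walk(c: Coordinates): Seq[Coordinates] =
      c +: metadata.getOrElse(c, Seq.empty).flatMap(walk)
    dependencies.flatMap(d => walk(d.coords)).distinct
  }
}
```

The point being that the result depends only on the requested dependencies (plus repositories and configuration), never on the coordinates of the consuming project.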

Tobias Roeser

Sep 3, 2013, 8:00:45 AM
to adep...@googlegroups.com
On Monday, 2 September 2013 at 16:33:19, Fredrik Ekholdt wrote:
| On Monday, 2 September 2013 09:15:15 UTC+2, Tobias Roeser wrote:
| > Hi Fredrik,
| >
| > thanks for you example code. (I could not see any tweets of you, either
| > am I
| > looking at the wrong account or it's because the account is private.)
|
| Hmm.. Strange... I have been traveling so I have been on and off on the
| mails.

Meanwhile, I see them. :-)

| > Before I run your example, I have a little question regarding the role of
| > Module. I don't understand what the role of the Module in this example
| > is, and why it is necessary at all? Given, I want to resolve the
| > Dependency "commons-codec:commons-codec:2.1", why do I need to provide
| > the Module to build a dependency tree. The tree should be always the
| > same for the same set
| > of dependencies and configuration, right?
|
| What am I missing here? I ask,
|
| > because the way I plan to integrate Adept into SBuild (as a
| > SchemeHandler), it
| > is not always necessary to specify the coordinates of the current
| > project. This means, to use Adept, I (SBuild) would have to guess or
| > generate the Module coordinates. This would be no problem, but if this
| > step can be avoided,
| > I'd would rather do that. Maybe, its an indicator for a potentially
| > slimmer
| > API.
|
| Yes - I think you are right. The reason I went for a module is honestly
| because of how the sbt plugin works. This is why it is so great (!) that
| you are interested because this is the feedback we need! Our ambition is to
| start on as many build tools as possible, so we can find a nice looking API
| for all. I am changing it so the call works like this:
| `Adept.resolve(repositories, dependencies, configuration,
| configurationMapping)`.

I tried the Tutorial code. Unfortunately, there is no such method
UniqueId.default(Coordinates, Date), only default(Coordinates, Seq[Artifact])
and default(Coordinates, Date, Seq[Artifact]). But I'm not sure which
artifacts I'm supposed to provide here, as Module also has another constructor
parameter, artifacts, containing the one I want to resolve. Should I use the
same ones for the UniqueId of the Module? I think I shouldn't. I might try
an empty Seq, too.

I'm afraid I have to wait for your next version that no longer requires a
Module. Alternatively, you might help me with the construction of a
(temporary) Module.

Best regards,
Tobias

Fredrik Ekholdt

Sep 3, 2013, 8:04:29 AM
to adep...@googlegroups.com
Yep, I think it is better to wait to avoid causing you too much pain :) I am hoping to finish it up tonight. I almost got to the end of it yesterday but hit some minor snags. Will keep you posted!
>
> Best regards,
> Tobias

Fredrik Ekholdt

Sep 4, 2013, 2:21:28 PM
to adep...@googlegroups.com
Hey Tobias!
Here is the example with the modified API: https://github.com/adept-dm/adept/wiki/API-Example
Tell me what you think!

I have also released a new version: 0.8.0-ALPHA-20130904200806 on Bintray, in case you want the binary files. I haven't updated the example projects (https://github.com/adept-dm/adept/wiki/Example-Projects) with this yet, though.

Regards,
Fredrik

Tobias Roeser

Sep 5, 2013, 3:47:46 AM
to adep...@googlegroups.com
Hi Fredrik,

with your latest changes and the new example code I was able to get it
running. I made some changes to the Adept code base, as I wanted to run it
with Scala 2.10, which also worked well.

Thank you for your help!

Now I think I need to get used to the download destination names. Is there a
chance to change the scheme a bit? As builds are written, maintained and
debugged by humans, and humans are not good at remembering hashes when it
comes to understanding the output of a broken build (wait, what jars do I have
on my classpath again?), I would suggest adding the coordinates somewhere in
the path.

In Jackage, an internally used (but open source) tool that does almost the
same as Adept, but without transitivity (yet), we use the following pattern:
<group>/<artifact>-<version>[-<nature>].<type>, which could be adapted for
Adept to also contain the hash, e.g.

<group>/<artifact>-<version>[-<other needed info>]-hash.<type>

The pros:
- easy to read and understand what is part of a classpath
- easily browsable
- artifacts needed outside the build system can be easily found

The cons:
- The same artifact under a different GAV is downloaded and stored twice (which
is not a huge problem IMHO)

What do you think?

Best regards,
Tobias

Fredrik Ekholdt

Sep 5, 2013, 6:10:04 AM
to adep...@googlegroups.com


On Thursday, 5 September 2013 09:47:46 UTC+2, Tobias Roeser wrote:
Hi Fredrik,

with your latest changes and the new example code I was able to get it
running.
Great! :) 
I did some changes to the Adept code base as I wanted to run it
with Scala 2.10, which also worked well.
That is really cool! Do you want to submit a PR with the changes? I am targeting 2.9 and 2.10 for the time being, so it would be good to have a 2.10 branch. It helps me to know what I am breaking as well ;)

Thank you for your help!  

Now I think I need to get used to the download destination names. Is there a
chance to change the scheme a bit? As builds are written, maintained and
debugged by humans, and humans are not good at remembering hashes when it
comes to understanding the output of a broken build (wait, what jars do I have
on my classpath again?), I would suggest adding the coordinates somewhere in
the path.
Yes, I have gotten this feedback from Mark as well. I definitely see what you mean, and one of the tenets of Adept is to be debuggable.


In Jackage, an internally used (but open source) tool that does almost the
same as Adept, but without transitivity (yet), we use the following pattern:
<group>/<artifact>-<version>[-<nature>].<type> which could be adapted for
adept to also contain the hash, e.g.
Interesting! I see Jackage is from tototec as well? If we are trying to solve the same problems, it would be great to unite the effort. Is there something I can do to make that happen (or are we doing it already :)? I do have a bit of traction on Adept: the owners of Gradle (Gradleware), sbt and Buildr know about Adept and have expressed interest in seeing it happen. My hope is of course that there will be more than only interest, but it is a start :)

  <group>/<artifact>-<version>[-<other needed info>]-hash.<type>
Yep, including the hash is a great idea! 

The pros:
- easy to read and understand what is part of a classpath
- easy browsable
- artifacts needed outside the build system can be easily found
Yep, I agree with all of those.

The cons:
- A same artifact under a different GAV is downloaded and stored twice (which
is not a huge problem IMHO)
Yep, I agree that is not a huge problem. My idea to fix the issue in sbt was to have a task where you could see only the modules included and the artifact locations. Just to mention it: the reason I chose to have only hashes is because the design is arguably cleaner in the sense that only modules know about their artifacts; the artifacts are blissfully ignorant of the module. That being said, it doesn't help to have a "clean" design if it is harder to use. I will give it a day or 2 and think about it. If I cannot come up with a better argument against doing it, I will fix it. Does that sound fair?

Tobias Roeser

Sep 5, 2013, 7:22:40 AM
to adep...@googlegroups.com

On Thursday, 5 September 2013 at 12:10:04, Fredrik Ekholdt wrote:
| On Thursday, 5 September 2013 09:47:46 UTC+2, Tobias Roeser wrote:
| > Hi Fredrik,
| >
| > with your latest changes and the new example code I was able to get it
| > running.
|
| Great! :)
|
| > I did some changes to the Adept code base as I'm wanted to run it
| > with Scala 2.10, which also worked well.
|
| That is really cool! .Do you want to submit a PR with the changes? I am
| targeting 2.9 and 2.10 for the time being, it would be good to have a 2.10
| branch. Helps me to know what I am breaking as well ;)

I can do that, but I built Adept with my own toolchain (because I can, and
because I need as many "test" projects as I can get for SBuild), so a PR from the
current state would be incomplete from an sbt standpoint (e.g. missing
dependency bumps). So I would need some time to prepare a PR. In essence, I
bumped all dependencies to Scala 2.10 binary compatible versions and replaced
the Akka API calls removed in 2.1 with their Scala 2.10 counterparts
(akka.util.duration and Future stuff).

|
| > Thank you for your help!
| >
| >
| > Now, I think, I need to get used to the download destination names. Is
| > there a
| > chance to change the scheme a bit? As builds are written, maintained and
| > debugged by humans, and humans are not good in remembering hashes when it
| > comes to understand the output of a broken build (Whait, what jars do I
| > have
| > in my classpath. again?), I would suggest to add the coordinates
| > somewhere in
| > the path.
|
| Yes, I have gotten this feedback from Mark as well. I definitely see what
| you mean and one of the tenants of adept is to be debuggable.
|
| > In Jackage, an internally used (but open source) tool that does almost
| > the same as Adept, but without transitivity (yet), we use the following
| > pattern:
| > <group>/<artifact>-<version>[-<nature>].<type> which could be adapted for
| > adept to also contain the hash, e.g.
|
| Interesting! I see Jackage is from tototec as well? If we are trying to
| solve the same problems, it would be great to unite the effort. Is there
| something I can do to make that happen (or are we doing it already :)? I do
| have a bit of traction on adept. the owners of gradle (gradleware), sbt
| and Buildr knows about adept and have expressed interest to seeing it
| happen. My hope is of course there will be more than only interest, but it
| is a start :)

Jackage is only one of my many open source projects which do not produce any
monetary income, and it has definitely less priority than SBuild, CmdOption, and
all the others, so it suffers from the too-little-time-for-it syndrome. In this
regard, I would say that Adept is already a lot more mature and publicly
known, despite its early state API-wise. So, you already got me, I would say.
;-)

| > <group>/<artifact>-<version>[-<other needed info>]-hash.<type>
|
| Yep, including the hash is a great idea!
|
| > The pros:
| > - easy to read and understand what is part of a classpath
| > - easy browsable
| > - artifacts needed outside the build system can be easily found
|
| Yep, I agree with all of those.
|
| > The cons:
| > - A same artifact under a different GAV is downloaded and stored twice
| > (which
| > is not a huge problem IMHO)
|
| Yep, I agree that is not a huge problem. My idea to fix the issue on sbt
| was to have a task where you could see only the modules included and the
| artifact locations. Just to mention it: the reason I chose to have only
| hashes is because the design is arguably cleaner in the sense that only
| modules knows about it's artifacts, the artifacts are blissfully ignorant
| of the module. That being said, it doesn't help to have a "clean" design,
| if it is harder to use. I will give it a day or 2 and think about it. If I
| cannot come up with a better argument against doing it, I will fix. Does
| that sound fair?

Definitly.

Given I find some time, I will open a new thread with some ideas/concepts from
Jackage and Portage that would, I think, improve the ease of use and
lightweight feel of Adept and the overall user experience. For now, let me
express some more thoughts about the artifact locations.

In Portage, all sources/artifacts will be downloaded from their original
locations (which is a good thing) and optionally will first be looked up from
some kind of shared cache. All files will be saved in a "distfiles" directory
and keep their original name (!). This might cause name collisions (which
will of course be detected), but that happens very rarely. After some time you
have a very large directory, which is not so cool, especially if you want to
clean up some content but not all.

So, I would prefer to keep some structure; that's why I suggested the group
directory. But otherwise, I really like the idea of keeping the original
names. Unfortunately, in Java land, you sometimes find artifacts without any
version in their names (e.g. JUnit), but if we also add the hash, that would
be no problem. So, my final (for now) suggestion is the following scheme:

${artifacts}/<group>/<original-file-name>-<hash>.<original-file-type>

If, for some reason, the original file name is not known,
<artifact>-<version>[-<other-parts>]-<hash>.<type> might be a good fallback.
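As a sketch, the suggested scheme, including the synthetic fallback, could look like the following. The function and parameter names here are made up for illustration; they are not part of Adept:

```scala
// Illustrative only: builds ${artifacts}/<group>/<original-name>-<hash>.<type>,
// falling back to a synthetic <artifact>-<version> base name when the original
// file name is unknown (e.g. for artifacts like JUnit's unversioned jars).
def downloadPath(
    artifactsDir: String,
    group: String,
    hash: String,
    fileType: String,
    originalName: Option[String],
    artifact: String,
    version: String): String = {
  val base = originalName.getOrElse(s"$artifact-$version")
  s"$artifactsDir/$group/$base-$hash.$fileType"
}
```

Because the content hash is part of the file name, the same GAV with different bytes can never silently collide, which is what makes the human-readable prefix safe to add.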

Regards,
Tobias

Josh Suereth

Sep 5, 2013, 7:47:54 AM
to adept-dev
I understand the desire to keep structure in artifact names, but the reality is that module metadata is inherently *unstable* in the cache. This is one thing Ivy suffers from drastically when dealing with Maven: the assumption that metadata can be complete and stable.

In particular:
  • One of the driving features of Adept is that we can alter the metadata of modules in a way that doesn't destroy repeatable builds.   If you tie your project to a particular git-revision of the metadata, you won't see different resolution semantics.   So we can evolve/fix metadata over time without breaking clients.
  • JARs/Artifacts can actually be shared across modules, in which case any scheme which encodes the original module means you need to redownload that file.

IIRC, at one point we had discussed using either symlinks (or copying on Windows) to dump module-named files for local builds out of the cache for debugging. I agree that hashes are quite ugly for users. I'm using a hash-only cache for one of our internal "integrate the world" builds. I developed a quick utility to browse that repository and pull out human readable names for SHAs *very* quickly. However, the only way to safely cache/share files that may be duplicated is to keep artifacts agnostic of modules.

I'm personally more a fan of copying/symlinking files into human-readable variants locally for build tools.   If we can keep the artifact cache "pure" I think we can leverage more awesome out of it.  That's what we're doing in dbuild, and in practice it's not terrible.  
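A minimal sketch of that layering, assuming a content-hash cache with a human-readable symlink on top. All names and the directory layout here are invented for illustration; this is neither dbuild's nor Adept's actual code:

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest

object CacheSketch {
  // Hex-encoded SHA-256 of the artifact bytes.
  def sha256(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-256").digest(bytes).map("%02x".format(_)).mkString

  // Store the artifact under its content hash only; the friendly name is a
  // symlink layered on top, so the cache itself stays module-agnostic and
  // identical bytes shared across modules are stored exactly once.
  def store(cacheDir: Path, friendlyLink: Path, bytes: Array[Byte]): Path = {
    val hashed = cacheDir.resolve(sha256(bytes))
    if (!Files.exists(hashed)) Files.write(hashed, bytes)
    if (!Files.exists(friendlyLink)) Files.createSymbolicLink(friendlyLink, hashed)
    friendlyLink
  }
}
```

The design choice is exactly the indirection described above: the machine-facing cache is pure (hash-addressed), and the human-facing names are a disposable projection that can be regenerated from metadata at any time.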

I'm rather hesitant of embedding module information in artifact paths though.   This way leads to danger, methinks.

- Josh

Fredrik Ekholdt

Sep 5, 2013, 8:11:45 AM
to adep...@googlegroups.com


On Thursday, 5 September 2013 13:22:40 UTC+2, Tobias Roeser wrote:

On Thursday, 5 September 2013 at 12:10:04, Fredrik Ekholdt wrote:
| On Thursday, 5 September 2013 09:47:46 UTC+2, Tobias Roeser wrote:
| > Hi Fredrik,
| >
| > with your latest changes and the new example code I was able to get it
| > running.
|
| Great! :)
|
| > I did some changes to the Adept code base as I'm wanted to run it
| > with Scala 2.10, which also worked well.
|
| That is really cool! .Do you want to submit a PR with the changes? I am
| targeting 2.9 and 2.10 for the time being, it would be good to have a 2.10
| branch. Helps me to know what I am breaking as well ;)

I can do that, but I built adept with my own toolchain (because I can, and
because I need as much "test" projects as I can for SBuild), so a PR from the
current state would be incomplete from an SBT standpoint (e.g. missing
dependency bumps). So, I would need some time to prepare a PR. In essence, I
bumped all dependencies to Scala 2.10 binary compatible versions and fixed the
in 2.1 removed Akka API calls by their Scala 2.10 replacements
(akka.util.duration and Future stuff).

Ok - makes sense :) 
Alright! :) 

Tobias Roeser

Sep 5, 2013, 8:37:14 AM
to adep...@googlegroups.com
Hi Josh,

On Thursday, 5 September 2013 at 13:47:54, Josh Suereth wrote:
| I understand the desire to keep structure in artifact names, but the
| reality is that module metadata is inherently *unstable* in the cache.
| This is one thing Ivy suffers with drastically when dealing with maven ->
| The assumption that metadata can be complete + stable.
|
| In particular:
|
| - One of the driving features of Adept is that we can alter the metadata
| of modules in a way that doesn't destroy repeatable builds. If you tie
| your project to a particular git-revision of the metadata, you won't see
| different resolution semantics. So we can evolve/fix metadata over
| time without breaking clients.
| - JARs/Artifacts can actually be shared across modules, in which case
| any scheme which encodes the original module means you need to
| redownload that file.

Ok, maybe we did not refer to the same source for the hash. I have to admit
that I'm not very familiar with the complete inner details of Adept. But when
discussing a released and thus stable artifact, I would use a hash which is
some kind of checksum of that artifact file. That way, no matter how many
versions of metadata you have, as long as the corresponding artifact(s) stay
the same, they will point to the same resource in your file system (as long as
the group is the same, of course).

A huge problem with Maven is, IMHO, that on the one hand it tries to please the
developer (it tries to be a build system) and on the other hand it tries to
please the user of the dependencies (it tries to be a dependency management
solution). Unfortunately, when the work for the developer stops - he makes a
release - it starts for the package maintainer - she needs to revise the
metadata from time to time. But Maven has no mechanism to evolve metadata
without modifying already released versions and without changing the version
at the same time. That's the reason it has to fail. It's broken by design.

Jackage derived the revision concept from Portage. You can improve the package
metadata without bumping the package version. Each improved version gets a new
revision. That way, the package version consists of the artifact version plus
the revision part (e.g. 1.0.0-r1, -r1 being the revision). The first revision
(-r0) will be omitted. If I understand it correctly, Adept tries to reach the
same goal by referring to git revisions. I'm quite sure that I only have
a partial understanding of Adept's inner workings, but that way Adept has a
hard dependency on the behavior and semantics of git, and a package consumer is
not able to tell which revision (which is just a hash) is newer without
tool support. From a package metadata maintainer's point of view, a git-
independent, easy to understand revision mechanism (e.g. -r1, -r2) is much more
desirable.
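For illustration, such a revision suffix is trivial for both humans and machines to compare, unlike opaque git hashes. A tiny sketch (function names invented here, not from Jackage or Portage):

```scala
// Portage-style revisions: "1.0.0-r2" is the second metadata revision of
// artifact version 1.0.0; a missing suffix means the implicit first revision -r0.
def revisionOf(packageVersion: String): Int =
  packageVersion.split("-r") match {
    case Array(_, rev) => rev.toInt
    case _             => 0
  }

// For two revisions of the same artifact version, the newer metadata wins.
def newerRevision(a: String, b: String): String =
  if (revisionOf(a) >= revisionOf(b)) a else b
```

The ordering is visible to a human at a glance (-r2 is newer than -r1), which is exactly what a bare git SHA cannot offer without tool support.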

| IIRC - At one point we had discussed using either symlinks (or copying in
| windows) to dump out module-named files for local builds out of the cache
| for debugging. I agree that hashes are quite ugly for users. I'm using
| a hash-only cache for one of our internal "integrate the world" builds. I
| developed a quick utility to browse that repository and pull out human
| readable names for SHAs *very* quickly. However, the only way to safely
| cache/share files that may be duplciated is to keep artifacts agnostic of
| modules.
|
| I'm personally more a fan of copying/symlinking files into human-readable
| variants locally for build tools. If we can keep the artifact cache
| "pure" I think we can leverage more awesome out of it. That's what we're
| doing in dbuild, and in practice it's not terrible.
|
| I'm rather hesitant of embedding module information in artifact paths
| though. This way leads to danger, methinks.

If I understand correctly (and given the hash in the artifact name is some
kind of checksum, not the git hash), the artifact path is free of any module
information (if "module" refers to the project currently being built). Just the
metadata of the artifact itself is part of the path.

Please correct me if I misunderstood Adept so far. Causing confusion is
not my intent.


Best Regard,
Tobias

Josh Suereth

Sep 5, 2013, 9:12:17 AM
to adept-dev
I think we're in general agreement here about what's needed. I also think Maven epically failed at what users *REALLY* want with dependency management, and that is fewer decisions. Here's the current conversation:

User: "I want to use JAwesomeLib".   
Maven: "Please run a search for possible versions".
User: "Ok, I'll just grab this version here, and copy-paste junk into my build."
Maven: "IncompatibleClassChangeError! The library is not compatible with your dependencies!"
User "@#%@%@%#"


What it should be in Adept:

User: "I want to use JAwesomeLib"
Adept: "Ok, we found this version which is compatible with your existing dependencies."


Part of that requires us to be able to modify metadata after the fact as compatibility issues are discovered. In any case, I'm cool with that.

 
| IIRC - At one point we had discussed using either symlinks (or copying in
| windows) to dump out module-named files for local builds out of the cache
| for debugging.   I agree that hashes are quite ugly for users.   I'm using
| a hash-only cache for one of our internal "integrate the world" builds.   I
| developed a quick utility to browse that repository and pull out human
| readable names for SHAs *very* quickly.  However, the only way to safely
| cache/share files that may be duplciated is to keep artifacts agnostic of
| modules.
|
| I'm personally more a fan of copying/symlinking files into human-readable
| variants locally for build tools.   If we can keep the artifact cache
| "pure" I think we can leverage more awesome out of it.  That's what we're
| doing in dbuild, and in practice it's not terrible.
|
| I'm rather hesitant of embedding module information in artifact paths
| though.   This way leads to danger, methinks.

If I understand correctly (and given the hash in the artifact name is some
kind of checksum, not the git hash), the artifact path is free of any module
information (if "module" refers to the project currently being built). Just the
metadata of the artifact itself is part of the path.

Please correct me if I misunderstood Adept so far. Causing confusion is
not my intent.



Right, let me go specific. Here's your proposed format:

 <group>/<artifact>-<version>[-<other needed info>]-hash.<type>


The primary issue I have is "group". I agree that "hash" would be the hash of the file. However, I could, theoretically, rebundle artifacts in an "uber" package that literally just includes the underlying artifacts (rather than actually rebundling them). There's no reason to prevent that, but if we embed "module" information (like group) in here, then we start preventing it.

Also, the artifact name may be something I wish to change for the exact same jar. In my org.josh.awesome project, I have an awesome-core module (which has the core artifact in it) and awesome-mathy-stuffs (which has the math jar and math native DLL artifacts).

When I bundle the artifacts together in a new "awesome-all" module, I'd like to just include the *same* math native DLL artifacts.  However, since I'm rebundling, I'd like to give this thing a new name, i.e. awesome-all-native.dll, rather than just awesome-all.dll.


Hypothetical situation (I'm sure there are holes), but I hope it gets across the idea. I think tying module information to artifacts limits our ability to cache/re-use. I feel copying/symlinking "machine readable" files into "human-readable" files is the way to go.

Yes, we are humans, and we need nicely formatted stuff. But that's a layer on top of what the machine needs. The machine needs nice hashes to be effective. Let's not try to make the storage format solve ALL the needs; let's add a layer of indirection so we can solve both "well". I.e. if we copy/symlink files into some local resolution cache for projects, then we can just outright DROP the hash we use internally and keep only human-readable info. When re-resolving things, we can just run SHA diffs (they're pretty fast), or wipe out the project-local resolution cache and pull out of our artifact cache again.



In any case, I think it's important to note this idea: resolution should *NOT* be occurring during every freaking build. Resolution is something we do when asking for new dependencies or checking for new dependencies. While integration builds can opt in to doing this all the time, it should not be an all-the-time thing.

  • We want to pull metadata local to make resolving deps faster, but we don't really want to be hitting servers for metadata during our day-to-day dev (maybe 1x a day), or more frequently if doing integrations.
  • We want to give users the maximum flexibility in how they define modules, and the machine the optimal mechanism for caching artifacts/metadata.
  • Builds should be 100% reproducible for given git revisions. This means some aspects of version control may need to be committed in a repo, like the git SHA of the metadata used, or some kind of intermediate "these are the artifacts we're using, based on this dependency requirements list" file.
  • Making something efficient for a machine makes it inefficient for a human. We should optimize the core for the machine, and put a "porcelain" layer on top for humans that is so easy to use, I don't want to kick my repository server every day. However, the machine should be able to do the tricksy stuff, like parallel downloads, avoiding cache corruption, etc.

So yeah, interested to hear feedback. I could be a grumpy-old-man here (or just avoiding anything that reminds me of the maven repo format).  In any case, a lot of repository tool folks I talk with take a similar approach.  Artifacts are stored 100% by SHA/hash, and you reconstitute a friendly name from the metadata on demand.


Again, I agree that having a friendly name is of paramount importance.  However, I think we should try to go for best of both worlds here, rather than dig ourselves a hole and find out later we didn't bring a ladder.

- Josh

Tobias Roeser

Sep 5, 2013, 10:26:01 AM
to adep...@googlegroups.com
This meets my wish to keep the original artifact name if possible. Only if we
cannot keep/gather the original artifact name (I fail to come up with a good
reason why not, though) would I use a synthetic one.

I proposed the following:

${artifacts}/<group>/<original-file-name>-<hash>.<original-file-type>

Of course, the group is artifact-metadata specific. I would be ok with that,
but am open to better suggestions that also maintain some structure a
developer can easily glance at.

| Hypothetical situation (I'm sure there are holes), but I hope it gets
| across the idea. I think tieing module-information to artifacts limits
| our ability to cache/re-use. I feel copying/symlinking "machine readable"
| files into "human-readable" files is the way to go.

Of course, there are a lot of pitfalls one can run into with symlinks:
equality, up-to-date-ness, the same-filesystem requirement. And falling back to
copying the artifacts is exactly the situation you want to avoid, right?

| Yes we are humans, and we need nicely formatted stuff. But that's a layer
| on-top of what the machine needs. The machine needs nice hashes to be
| effective. Let's not try to make the storage format sovle ALL The needs,
| let's add a layer of indirection so we can solve both "well". I.e. if we
| copy/symlink files into some local resolution cache for projects, then we
| can just outright DROP the HASH we use internally, and keep only
| human-readable info. When re-resolving things, we can just run SHA-diffs
| (they're pretty fast), or wipe out the project-local resolution cache and
| pull out of our artifact cache again.
|
|
|
| In any case, I think it's important to note this idea: Resolution should
| *NOT* be occuring during every freaking build. Resolution is something we
| do when asking for new dependencies or checking for new dependencies.
| While integration builds can opt-in to doing this all the time, it should
| not be an all the time thing.

Sidenote: SBuild always checks all dependencies for each target it has to run.
But "dependencies" here refers to any kind of dependency, including other
targets. If a dependency is handled by a scheme handler, which might be e.g. a
transitive dependency resolver like Adept, Aether, or Ivy, then it will be
checked too. It is up to the configuration of the scheme handler whether it
caches some decisions. Of course, SBuild also has a mechanism to generically
cache results based on unchanged input. Until now, this has all been pretty
fast, and orders of magnitude faster than e.g. firing up the Scala compiler.

So, forcing a technically motivated, huge, cryptic artifacts directory of
thousands of files, which one cannot easily get rid of other than by deleting
everything, is something I want to question.

| - We want to pull metadata local to make resolving deps faster, but we
| don't really want to be hitting servers for metadata during our
| day-to-day dev (maybe 1x a day) or more frequently if doing integrations.
| - We want to give users the maximum flexibility in how they define
| modules, and the machine the optimal mechanism for caching
| artifacts/metadata
| - Builds should be 100% reproducible for given git reviisons. This
| means some aspects of version control may need to be commited in a repo,
| like the git SHA of the metadata used, or some kind of intermediate
| "these are the artifacts we're using, based on this dependnecy
| requirements list" file.
| - Making something efficient for a machine makes it inefficient for a
| human. We should optimize the core for the machine, and put a
| "porcelain" layer on top for humans that is so easy to use, I don't want
| to kick my repository server every day. However, the machine should be
| able to do the tricksy stuff, like parallel downloads, avoiding cache
| corruption, etc.

I think we agree in terms of always-reproducible builds that avoid unnecessary
work. I'm not convinced that we should optimize for the machine in this case.
I think a clear structure that a human understands can also be processed
programmatically in a timely fashion. After all, the artifact download
directory is only the output of the dependency resolver. Besides checking for
up-to-date-ness, it should not impact the resolution process in any way.

If you are talking about a repository server, I agree, such a system should
work with a technically motivated directory/file structure. (In my vision of a
good dependency management system, a repository server is not needed at all.)


| So yeah, interested to hear feedback. I could be a grumpy-old-man here (or
| just avoiding anything that reminds me of the maven repo format). In any
| case, a lot of repository tool folks I talk with take a similar approach.
| Artifacts are stored 100% by SHA/hash, and you reconstitute a friendly
| name from the metadata on demand.

Again: we are not discussing the repository format here, right? We are talking
about the directory where downloaded artifacts end up.

If I understand correctly, you cannot reconstitute a friendly name for an
artifacts directory you just found in your home directory without knowing the
metadata repository(s) that was used to produce/fetch all these files. And even
then, you need to fire up some tool to get that information. And remember that
awkward moment when you again need that specific Oracle driver that is
fetch-protected. You know you have it in your artifacts dir somewhere. I, for
one, am very bad at guessing hashes and am probably faster if I re-download it
from the website.

In a more human-friendly artifacts directory, I can easily find a specific
artifact. I can also glance through all the groups I have not used for half a
year and wipe them out with a simple "rm -r".

Josh Suereth

Sep 5, 2013, 10:48:17 AM9/5/13
to adept-dev
I'll reference it again, but I think git is a decent model for us to follow for offline + remote repository usage.  The git binary formats are highly tuned and optimized for machines.  Features that make things human readable (branches) are placed *on top* of the machine-readable formats.
 
If you are talking about a repository server, I agree, such a system should
work with a technically motivated directory/file structure. (In my vision of a
good dependency management system, a repository server is not needed at all.)



Agreed.  However, that means things a server usually does are up to the client now.
 
| So yeah, interested to hear feedback. I could be a grumpy-old-man here (or
| just avoiding anything that reminds me of the maven repo format).  In any
| case, a lot of repository tool folks I talk with take a similar approach.
|  Artifacts are stored 100% by SHA/hash, and you reconstitute a friendly
| name from the metadata on demand.

Again: We don't discuss the repository format here, right? We talk about the
directory, where downloaded artifacts will go to.


Which is the format of the repository (on disk).
 
If I understand correctly, you can not reconstitute a friendly name for such a
artifacts directory you just found in your home directory, without knowing the
metadata repository(s) that was used to produce/fetch all these files. And even
then, you need to fire up some tool to get some information. And remember that
awkward moment when you again need that specific oracle driver that is fetch
protected. You know, you have it in your artifact dir somewhere. I, for me, am
very bad at guessing hashes and am probably faster when I redownload it from
the website.

In a more human friendly artifacts directory, I can easily find a specific
artifact. Also I can glance through all these groups I no longer use since a
half year and wipe them out with a simple "rm -r".


I'm not sure this use case is one we want to promote......

I'd rather see an "adept clean" or "adept autoclean" that can remove artifacts that haven't been resolved in a while.  If we let folks muck with our repo, we have to code/live dangerously.  It would mean we can't record information and expect it to remain in our local repo.  We would no longer own the repo.

It may be better to push people through an API/utility (like git).  When I look at the .git design, I feel we should mimic it.  We keep the artifacts in some machine-readable format (like git objects), and when you check out your dependencies, we copy them into some location.  When you ask to resolve, we can re-check them out.  However, the "checkout" you have can use whatever naming scheme you want.  Yes, we duplicate files on the local machine, but once they're there, things are good, and disk space usually isn't quite as limited.

I'm not arguing that people should just *like* SHA files.  BUT, I don't think copying files out of the machine-readable repository is that terrible an idea.  It lets us keep the raw artifact repo simple.  We can expose APIs to clean things, and limit interaction with the repo to API calls where we can control access.  We're already going to have to figure out how multiple concurrent processes can access the same repo.  I'd like to simplify some assumptions (to start with).
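The git-style split described here could look roughly like the following sketch (all names are illustrative assumptions, not the actual Adept API): artifacts live in a content-addressed store keyed purely by hash, and a "checkout" copies them out under whatever friendly names the metadata suggests.

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

object ArtifactStore {
  // Machine-readable side: store a file purely by its hash,
  // fanned out like .git/objects (first two hex chars as a directory)
  def store(repoDir: File, hash: String, source: File): File = {
    val target = new File(new File(repoDir, hash.take(2)), hash.drop(2))
    target.getParentFile.mkdirs()
    Files.copy(source.toPath, target.toPath, StandardCopyOption.REPLACE_EXISTING)
    target
  }

  // Human-readable side: "check out" an artifact under a friendly name.
  // The name comes from metadata; the store itself never needs to know it.
  def checkout(repoDir: File, hash: String,
               checkoutDir: File, friendlyName: String): File = {
    val stored = new File(new File(repoDir, hash.take(2)), hash.drop(2))
    val target = new File(checkoutDir, friendlyName)
    target.getParentFile.mkdirs()
    Files.copy(stored.toPath, target.toPath, StandardCopyOption.REPLACE_EXISTING)
    target
  }
}
```

The design point is that renaming, cleaning, or re-checking-out never touches the hash-keyed store, so the machine-readable side stays simple and cacheable while humans only ever see the checkout.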

Tobias Roeser

Sep 5, 2013, 12:07:16 PM9/5/13
to adep...@googlegroups.com
Hi Fredrik,
If you want to review the impact for adept-core and adept-cli:
https://github.com/lefou/adept/compare/adept-dm:master...lefou:scala_2_10?expand=1

Kind regards,
Tobias

Fredrik Ekholdt

Sep 5, 2013, 12:26:11 PM9/5/13
to adep...@googlegroups.com
Hey Tobias! Thanks for this!
At a glimpse, it looks good to me. Interesting to see how SBuild works.
Would be nice to have it on its own 2.10.x branch at the minimum.
Do you think it is possible to re-use the exact same dependencies in the SBuild and the sbt declarations?

F

Tobias Roeser

Sep 5, 2013, 3:02:59 PM9/5/13
to adep...@googlegroups.com
Hey Fredrik,

Am Donnerstag, 5. September 2013, 18:26:11 schrieb Fredrik Ekholdt:
| Hey Tobias! Thanks for this!
| At a glimpse, it looks good to me. Interesting to see how SBuild works.

I'm currently experimenting and set up each new (multi-)project somewhat
differently. SBuild, as an imperative build tool, is very flexible and IMHO
rock stable. The next milestone will be to add some declarative configuration
support on top of the "low level" configuration.

| Would be nice to have it on it's a branch 2.10.x branch at the minimum.

I'm not quite sure what you mean here.

| Do you think it is possible to re-use the exact same dependencies in the
| SBuild and the sbt declarations?

This should be possible. SBuild can include and use any Scala or JAR file
directly; Scala files will be compiled implicitly. So it should be possible
to include the Dependencies file with @include("project/Dependencies.scala").
You will have to provide the SBT API too, of course, e.g. with
@classpath("http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/api/0.13.0/jars/api.jar").

You can see in the file Adept.scala that I decided to provide all dependencies
explicitly, with no transitivity involved. This is, per se, a good thing, but
as a result the dependencies between SBuild and SBT differ. But one can decide
to also use e.g. the AetherSchemeHandler, which uses the Eclipse Aether library
(formerly Maven Aether) to resolve dependencies transitively. There is
currently no IvySchemeHandler, so I'm not sure whether we would get exactly the
same result as from SBT's Ivy resolver.

Best regards,
Tobias

Fredrik Ekholdt

Sep 5, 2013, 3:18:36 PM9/5/13
to adep...@googlegroups.com

On Sep 5, 2013, at 9:02 PM, Tobias Roeser <le.pet...@web.de> wrote:

> Hey Fredrik,
>
> Am Donnerstag, 5. September 2013, 18:26:11 schrieb Fredrik Ekholdt:
> | Hey Tobias! Thanks for this!
> | At a glimpse, it looks good to me. Interesting to see how SBuild works.
>
> I'm currently experimenting and setup each new (Multi-)Project somewhat
> differently. SBuild as imperative build tool is very flexible and MHO rock
> stable. Next milestone will be, to add some declarative configuration support
> on top of the "low level" configuration.
>
> | Would be nice to have it on it's a branch 2.10.x branch at the minimum.
>
> I'm not quite sure, what you mean here.
Ah, sorry - I meant I wanted to have 2.10.x on a separate branch. I have changed my mind now though: I think the best thing would be to have a crossbuild (so building both 2.10 and 2.9 as the default). This way the default is having both platforms built and tested at the same time. For the build plugins it gets a bit nightmarish (at least for sbt), but I think it is feasible and perhaps worth the experience. Will Adept be a plugin for SBuild as well, or directly integrated (excuse my ignorance)?
>
> | Do you think it is possible to re-use the exact same dependencies in the
> | SBuild and the sbt declarations?
>
> This should be possible. SBuild can include and use any Scala or JAR file
> directly. Scala files will be compiled implicitly. So, it should be possible,
> to include the Dependencies file with @include("project/Dependencies.scala").
> You will have to provide the SBT API too, of course, e.g. with
> @classpath("http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/api/0.13.0/jars/api.jar").
>
> You can see in file Adept.scala, that I decided to provide all dependencies
> explicitly, no transitivity involved. This is, per se, a good thing, but
> therefore the dependencies between SBuild and SBT differ. But one can decide,
> to also use e.g. the AetherSchemeHandler, which uses Eclipse Aether library
> (formerly Maven Aether) to resolve dependencies transitively. There is
> currently no IvySchemeHandler, so I'm not sure, if we would get the exactly
> same result from SBT's ivy resolver.
Right, when I think of it, the correct way to solve this predicament is of course to use Adept :) I think that should be possible for sbt already, and perhaps for SBuild soon? :) - my better half tells me that my hacking hours are up for the day, so I will have a look at it tomorrow.

Tobias Roeser

Sep 5, 2013, 4:20:41 PM9/5/13
to adep...@googlegroups.com
Hello Fredrik,

Am Donnerstag, 5. September 2013, 21:18:36 schrieb Fredrik Ekholdt:
| On Sep 5, 2013, at 9:02 PM, Tobias Roeser <le.pet...@web.de> wrote:
| > Hey Fredrik,
| >
| > Am Donnerstag, 5. September 2013, 18:26:11 schrieb Fredrik Ekholdt:
| > | Hey Tobias! Thanks for this!
| > | At a glimpse, it looks good to me. Interesting to see how SBuild works.
| >
| > I'm currently experimenting and setup each new (Multi-)Project somewhat
| > differently. SBuild as imperative build tool is very flexible and MHO
| > rock stable. Next milestone will be, to add some declarative
| > configuration support on top of the "low level" configuration.
| >
| > | Would be nice to have it on it's a branch 2.10.x branch at the minimum.
| >
| > I'm not quite sure, what you mean here.
|
| Ah, sorry - I meant I wanted to have 2.10.x on a separate branch. I have
| changed my mind now though: I think the best thing would be to have a
| crossbuild (so building both 2.10 and 2.9 as the default). This way the
| default is having both platforms built and tested at the same time. For
| the build plugins it gets a bit nightmarish (at least for sbt), but I
| think it feasible and perhaps worth experience. Will adept be a plugin on
| SBuild as well or directly integrated (excuse my ignorance)?

Setting up a cross build with SBuild is easy. And to answer your question:
Adept will be "some kind of" a plugin. (At least it will be a separate JAR
which one can easily add. No special plugin semantics are required for now.)

But crossbuilding the same code base for 2.9 and 2.10 would be problematic,
as the refactorings needed for Akka 2.1.x resulted in code that will only
compile and run with 2.10+. Akka 2.0.x is not built for 2.10, nor does 2.9
have the required API (e.g. scala.concurrent.duration.FiniteDuration).

So I think a branch is needed. Alternatively, you would have to provide a
(partially) separate source tree.


| > | Do you think it is possible to re-use the exact same dependencies in
| > | the SBuild and the sbt declarations?
| >
| > This should be possible. SBuild can include and use any Scala or JAR file
| > directly. Scala files will be compiled implicitly. So, it should be
| > possible, to include the Dependencies file with
| > @include("project/Dependencies.scala"). You will have to provide the SBT
| > API too, of course, e.g. with
| > @classpath("http://repo.typesafe.com/typesafe/ivy-releases/org.scala-sbt/
| > api/0.13.0/jars/api.jar").
| >
| > You can see in file Adept.scala, that I decided to provide all
| > dependencies explicitly, no transitivity involved. This is, per se, a
| > good thing, but therefore the dependencies between SBuild and SBT
| > differ. But one can decide, to also use e.g. the AetherSchemeHandler,
| > which uses Eclipse Aether library (formerly Maven Aether) to resolve
| > dependencies transitively. There is currently no IvySchemeHandler, so
| > I'm not sure, if we would get the exactly same result from SBT's ivy
| > resolver.
|
| Right, when I think of it the correct way to solve predicament is of course
| to use adept :) I think that should be possible for sbt already and
| perhaps SBuild soon? :)

I'd rather not use Adept to build Adept, at least not while there are no
stable releases yet and practical alternatives exist.

| - my better half tells me that my hacking hours
| are up for the day so I will have a look it tomorrow.

So, then don't let the family wait.

Best regards,
Tobias