Format of modules


Fredrik Ekholdt

Apr 26, 2013, 4:52:58 AM
to adep...@googlegroups.com
Hello everybody!
I am excited to see that we are progressing so much! 

I think we have landed on using JSON as the format for describing modules.
It would be nice if we could settle at least some parts of what it should look like.

Below is an initial draft:
{
  "organization": "org",
  "name": "name1",
  "modules": [
    {
      "version": "1.0-alpha",
      "hash": "ae26b43ce1",
      "locations": [
        "http://repo1.maven.org/org/name1/1.0-alpha.jar"
      ],
      "related-artifacts": [{"type": "doc", "hash": "3bc4d3cae1df5cd", "locations": ["http://repo1.maven.org/org/name1/1.0-alpha-javadoc.jar"]}]
    }
  ]
}

Is there anything missing/wrong? Any names that need changing?
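
For anyone who wants to experiment with the draft, here is a minimal Python sketch of reading such a descriptor. It assumes each entry in "modules" is a JSON object, and the helper name `artifact_locations` is hypothetical. It only illustrates the shape of the data and is not Adept's actual format or API.

```python
import json

# The draft from this thread, with each "modules" entry written as an object
# so that it parses as JSON (an assumption, not a settled format).
DESCRIPTOR = """
{
  "organization": "org",
  "name": "name1",
  "modules": [
    {
      "version": "1.0-alpha",
      "hash": "ae26b43ce1",
      "locations": ["http://repo1.maven.org/org/name1/1.0-alpha.jar"],
      "related-artifacts": [
        {"type": "doc",
         "hash": "3bc4d3cae1df5cd",
         "locations": ["http://repo1.maven.org/org/name1/1.0-alpha-javadoc.jar"]}
      ]
    }
  ]
}
"""


def artifact_locations(descriptor, version):
    """Return {hash: [locations]} for the main artifact and its related ones."""
    for module in descriptor["modules"]:
        if module["version"] == version:
            found = {module["hash"]: module["locations"]}
            for related in module.get("related-artifacts", []):
                found[related["hash"]] = related["locations"]
            return found
    raise KeyError("no module with version " + version)


descriptor = json.loads(DESCRIPTOR)
locations = artifact_locations(descriptor, "1.0-alpha")
```

Keying the result by hash mirrors the idea that the hash, not the file name, identifies an artifact.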



David Illsley

Apr 27, 2013, 7:58:50 AM
to adep...@googlegroups.com
FWIW, I'd encourage you to consider YAML. The advantage is that it's line oriented, and therefore very well suited to use with version control.

Naftoli Gugenheim

Apr 28, 2013, 1:20:24 AM
to adep...@googlegroups.com

Evan Chan

Apr 28, 2013, 3:27:50 AM
to adep...@googlegroups.com
+1 for HOCON.  It's also line oriented, allows comments, basically a much better JSON.

And, not that this matters, but there are no good YAML parsing libraries for Scala (AFAIK).

Evan Chan

Apr 28, 2013, 3:47:58 AM
to adep...@googlegroups.com
Hi Fredrik,

The big missing piece is the dependencies specification.   I'd like to propose something like this (in HOCON syntax):

dependencies {
  compile = [
    {name: "org.slf4j/slf4j-api", version: ">= 1.6.6"}
    {name: "scala-api", version: "2.10.*"}
  ]
  provided = [
    {name: "org.apache/hadoop-core", version: "1.0.4"}
  ] 
  conflicts = [
    {name: "*/log4j*"}
  ]
}

provides = [
  {name: "scala-logging", version: "1.6.3"}
]


A couple of important things to note:
* There are multiple types or scopes of dependencies.  Compile dependencies, plus the "provided" scope.   I can't tell you how many times I wish that provided dependencies could be transitively declared in Maven, whereas today, "provided" deps aren't included in the POM at all, and must be re-declared in the app using a library... a total hack.
* Wildcards and operators like >= should be supported for versions.   Requiring a single fixed version is extremely stifling and limits binary compatibility. 
* "Provides" gives a way for a library to advertise that it supports or implements a common API (for example, SLF4J).  This means multiple apps can depend on the advertised API,  which can be satisfied by any number of backends, without having to go through a fake "meta" jar/package.
* "Conflicts" gives dep-resolution-time insight into packages that may not work together.  Classic example is LOG4J vs LOGBACK.  Today this conflict is detected at runtime and is a huge pain to debug.

These concepts are commonplace in virtually every single Linux package management system, and even in some language package managers.     I know this seems much more complex, but it solves many, many issues with the simplistic Maven dependency system.

-Evan
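
A minimal Python sketch of how "conflicts" and "provides" from the proposal above could be checked at resolution time. The function names and the sample module names (other than those in the proposal) are hypothetical, and treating the conflict entries as shell-style wildcard patterns is an assumption about the intended semantics.

```python
from fnmatch import fnmatchcase


def find_conflicts(conflict_patterns, resolved_names):
    """Return the resolved module names matched by any conflict pattern."""
    return sorted(
        name
        for name in resolved_names
        if any(fnmatchcase(name, pattern) for pattern in conflict_patterns)
    )


def providers_of(api_name, modules):
    """Return names of modules that advertise `api_name` under "provides"."""
    return sorted(
        name
        for name, meta in modules.items()
        if any(p["name"] == api_name for p in meta.get("provides", []))
    )


# Hypothetical module metadata, loosely following the proposal above.
modules = {
    "com.example/logging-a": {"provides": [{"name": "scala-logging", "version": "1.6.3"}]},
    "com.example/logging-b": {"provides": [{"name": "scala-logging", "version": "1.6.3"}]},
    "org.apache/hadoop-core": {},
}

# A module declaring conflicts = [{name: "*/log4j*"}] would flag log4j here.
conflicting = find_conflicts(["*/log4j*"], ["log4j/log4j", "org.slf4j/slf4j-api"])

# Any of these modules could satisfy a dependency on "scala-logging".
backends = providers_of("scala-logging", modules)
```

The point of the sketch is that both checks need only metadata, so they can run before anything is downloaded or put on a classpath.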

Fredrik Ekholdt

Apr 29, 2013, 7:37:23 AM
to adep...@googlegroups.com
Hey Evan! 
Thanks for the input!
On YAML, I do not know it well enough to judge its pros and cons.
Whatever format we choose, it would be good to have a good and fast parser for it in Scala/Java.
Preferably it should be something that most people already know.
I have nothing against HOCON - it is great for configuration. I am not sure I see the benefits of HOCON vs JSON when it is simply used for de/serialization.
Could you elaborate a bit? Being line-oriented is a good one though :)

See inline for more!

On Apr 28, 2013, at 9:47 AM, Evan Chan wrote:

> HI Fredrik,
>
> The big missing piece is the dependencies specification.   I'd like to propose something like this (in HOCON syntax):

Yes, absolutely!

> dependencies {
>   compile = [
>     {name: "org.slf4j/slf4j-api", version: ">= 1.6.6"}
>     {name: "scala-api", version: "2.10.*"}
>   ]
>   provided = [
>     {name: "org.apache/hadoop-core", version: "1.0.4"}
>   ]
>   conflicts = [
>     {name: "*/log4j*"}
>   ]
> }
>
> provides = [
>   {name: "scala-logging", version: "1.6.3"}
> ]
>
> A couple of important things to note:
> * There are multiple types or scopes of dependencies.  Compile dependencies, plus the "provided" scope.   I can't tell you how many times I wish that provided dependencies could be transitively declared in Maven, whereas today, "provided" deps aren't included in the POM at all, and must be re-declared in the app using a library... a total hack.

Oh, that is a great idea! How do you see this integrating with Ivy's configurations?

> * Wildcards and operators like >= should be supported for versions.   Requiring a single fixed version is extremely stifling and limits binary compatibility.

So I am still on the fence about whether things should be more or less strictly defined (i.e. you define exactly what is supported vs. you can define wildcards etc.). I agree that strictness inhibits readability, but the good thing about strictness is that you won't be surprised. I am leaning towards wildcards though :)

> * "Provides" gives a way for a library to advertise that it supports or implements a common API (for example, SLF4J).  This means multiple apps can depend on the advertised API,  which can be satisfied by any number of backends, without having to go through a fake "meta" jar/package.
>
> * "Conflicts" gives dep-resolution-time insight into packages that may not work together.  Classic example is LOG4J vs LOGBACK.  Today this conflict is detected at runtime and is a huge pain to debug.

Yep, this is a good idea.
What do you think about also having a "replaces", where you declare that this org/name replaces another org/name (I am thinking about handling conflicts such as google collections/guava)?

Naftoli Gugenheim

Apr 29, 2013, 6:21:46 PM
to adep...@googlegroups.com

On Sun, Apr 28, 2013 at 3:47 AM, Evan Chan <vel...@gmail.com> wrote:
> * Wildcards and operators like >= should be supported for versions.   Requiring a single fixed version is extremely stifling and limits binary compatibility.

Does that hold even with Mark's approach of separating version components?

Josh Suereth

Apr 29, 2013, 6:28:36 PM
to adept-dev
Yeah, I'd like to disagree.  For example, imagine we track MANY different versions:


full-version=1.1-GITSHA-BUILD_DATE
integration-version=1.1-INPROGRESS
binary-version=1.0



NOW, when choosing a library, I can lock on a specific version, or grab the "latest" for the integration target of my corporation, or grab just the latest that's binary compatible for what I want.  *That's* flexibility.  The cost is for authors to use meaningful versions and enforce some sanity about them.
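
A minimal Python sketch of the three kinds of lookup described above. The list of builds, the placeholder SHAs and dates, and the helper name `latest` are all hypothetical; the sketch also assumes builds are stored oldest first, so "latest" simply means the last match.

```python
# Hypothetical published builds, each carrying the three version fields
# from the message above. Listed oldest first.
builds = [
    {"full": "1.0-aaa111-20130401", "integration": "1.0-RELEASE", "binary": "1.0"},
    {"full": "1.1-bbb222-20130420", "integration": "1.1-INPROGRESS", "binary": "1.0"},
    {"full": "1.1-ccc333-20130428", "integration": "1.1-INPROGRESS", "binary": "1.0"},
    {"full": "2.0-ddd444-20130429", "integration": "2.0-INPROGRESS", "binary": "2.0"},
]


def latest(builds, field, value):
    """Return the newest build whose `field` equals `value`, or None."""
    matching = [b for b in builds if b[field] == value]
    return matching[-1] if matching else None


# Lock on one specific build.
locked = latest(builds, "full", "1.1-bbb222-20130420")
# Track the newest build of an integration target.
integration = latest(builds, "integration", "1.1-INPROGRESS")
# Take the newest build that is binary compatible with 1.0.
compatible = latest(builds, "binary", "1.0")
```

Each lookup is an exact match on one field, so no wildcard or range syntax is needed to get the flexibility described above.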

Evan Chan

May 1, 2013, 3:41:32 AM
to adep...@googlegroups.com
I agree with Josh.  :)        Here is an idea along the same lines, to borrow from Ruby (http://docs.rubygems.org/read/chapter/16):

~> 2.2

The above means I'm OK with versions 2.2.1, 2.2.2, but I don't trust that 3.0 isn't going to break my code.   I think it's equivalent to >= 2.2.0, < 3.0.

At a minimum, if we allowed wildcards, I could say, "2.2.*" which would be more flexible than strict versions.

--
Because the people who are crazy enough to think they can change the world,
are the ones who do.     -- Steve Jobs
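
A minimal Python sketch of the constraint forms mentioned in this thread: ">=", "~>", a trailing ".*" wildcard, and an exact version. The function names are hypothetical, and the sketch handles purely numeric dotted versions only, so something like "1.0-alpha" is out of scope. The "~>" branch follows the reading given above, where "~> 2.2" means at least 2.2 and below 3.0.

```python
import re


def parse(version):
    """Split a dotted numeric version such as "2.2.1" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def same_length(a, b):
    """Pad the shorter tuple with zeros so that 2.2 compares equal to 2.2.0."""
    n = max(len(a), len(b))
    return a + (0,) * (n - len(a)), b + (0,) * (n - len(b))


def satisfies(version, constraint):
    """Check `version` against ">= X", "~> X", "X.*" or an exact "X"."""
    v = parse(version)
    constraint = constraint.strip()
    if constraint.endswith(".*"):
        prefix = parse(constraint[:-2])
        return v[:len(prefix)] == prefix
    match = re.match(r"(>=|~>)\s*(\S+)$", constraint)
    if match is None:
        left, right = same_length(v, parse(constraint))
        return left == right
    operator, bound = match.group(1), parse(match.group(2))
    left, right = same_length(v, bound)
    if operator == ">=":
        return left >= right
    # "~>": drop the last component of the bound (if there is more than one)
    # and bump the new last component to get the exclusive upper limit.
    head = bound[:-1] if len(bound) > 1 else bound
    upper = head[:-1] + (head[-1] + 1,)
    low_ok = left >= right
    left, right = same_length(v, upper)
    return low_ok and left < right
```

For example, `satisfies("2.2.1", "~> 2.2")` holds while `satisfies("3.0", "~> 2.2")` does not, and `satisfies("2.10.3", "2.10.*")` holds while `satisfies("2.11.0", "2.10.*")` does not.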

Mark Harrah

May 2, 2013, 9:21:28 AM
to adep...@googlegroups.com
Is it necessary to distinguish one artifact from "related-artifacts"? I'd just call them all "artifacts" and let the "type" distinguish them. Also, there should probably be an advisory name for each artifact to show the user so that they aren't called ae26b43ce1 on the classpath or wherever.

-Mark
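
A minimal Python sketch of what a single "artifacts" list with a "type" and an advisory name could look like in practice. The field name "name", the type value "jar", the file names, and both helper functions are hypothetical; only the hashes and URLs come from the draft in this thread.

```python
# Hypothetical module entry: one "artifacts" list, each entry tagged by
# "type" and carrying an advisory "name" to show to the user.
module = {
    "version": "1.0-alpha",
    "artifacts": [
        {"type": "jar", "name": "name1-1.0-alpha.jar", "hash": "ae26b43ce1",
         "locations": ["http://repo1.maven.org/org/name1/1.0-alpha.jar"]},
        {"type": "doc", "name": "name1-1.0-alpha-javadoc.jar", "hash": "3bc4d3cae1df5cd",
         "locations": ["http://repo1.maven.org/org/name1/1.0-alpha-javadoc.jar"]},
    ],
}


def artifacts_of_type(module, artifact_type):
    """Return the artifacts of a module whose "type" matches."""
    return [a for a in module["artifacts"] if a["type"] == artifact_type]


def display_name(artifact):
    """Prefer the advisory name; fall back to the hash if none was given."""
    return artifact.get("name", artifact["hash"])


# Names to show on the classpath instead of bare hashes.
classpath = [display_name(a) for a in artifacts_of_type(module, "jar")]
```

The hash still identifies the artifact; the advisory name is only what gets shown.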

Fredrik Ekholdt

May 2, 2013, 9:23:01 AM
to adep...@googlegroups.com

On May 2, 2013, at 3:21 PM, Mark Harrah wrote:

> On Fri, 26 Apr 2013 01:52:58 -0700 (PDT)
> Fredrik Ekholdt <fre...@gmail.com> wrote:
>
>> Hello everybody!
>> I am excited to see that we are progressing so much!
>>
>> I think we have landed on using json the format for describing modules.
>> What would be nice is if we could land at least some parts of how it should
>> look like.
>>
>> Below is an initial draft:
>> {
>> "organization":"org",
>> "name":"name1",
>> "modules": [
>> "version":"1.0-alpha",
>> "hash": "ae26b43ce1",
>> "locations":[
>> "http://repo1.maven.org/org/name1/1.0-alpha.jar"
>> ],
>> "related-artifacts": [{"type": "doc", "hash": "3bc4d3cae1df5cd",
>> "locations":["http://repo1.maven.org/org/name1/1.0-alpha-javadoc.jar"]}]
>> ]
>> }
>>
>> Is there anything missing/wrong. Any names that needs changing?
>
> Is it necessary to distinguish one artifact from "related-artifacts"? I'd just call them all "artifacts" and let the "type" distinguish them.
Heh, yes of course :)
> Also, there should probably be an advisory name for each artifact to show the user so that they aren't called ae26b43ce1 on the classpath or wherever.
Yep, I noticed that this is important! Thanks for the tips!
>
> -Mark

Mark Harrah

May 2, 2013, 9:53:38 AM
to adep...@googlegroups.com
On Mon, 29 Apr 2013 13:37:23 +0200
Fredrik Ekholdt <fre...@gmail.com> wrote:

> Hey Evan!
> Thanks for the input!
> On YAML, I do not know it well enough to see the pros/cons on it.
> It would be good if there is a good and fast parser in Scala/Java for whatever format we choose.
> Preferably it should be something that is known by most people.
> I have nothing against HOCON - it is great for configuration. I am not sure I see the benefits of HOCON vs JSON when it is simply used for de/serialization.
> Could you elaborate a bit? Being line-oriented is a good one though :)

Humans will still read and write these things, even if 99% of the time a program does.

> See inline for more!
>
> On Apr 28, 2013, at 9:47 AM, Evan Chan wrote:
>
> > HI Fredrik,
> >
> > The big missing piece is the dependencies specification. I'd like to propose something like this (in HOCON syntax):
> Yes, absolutely!
> >
> > dependencies {
> > compile = [
> > {name: "org.slf4j/slf4j-api", version: ">= 1.6.6"}
> > {name: "scala-api", version: "2.10.*"}
> > ]
> > provided = [
> > {name: "org.apache/hadoop-core", version: "1.0.4"}
> > ]
> > conflicts = [
> > {name: "*/log4j*"}
> > ]
> > }
> >
> > provides = [
> > {name: "scala-logging", version: "1.6.3"}
> > ]
> >
> >
> > A couple of important things to note:
> > * There are multiple types or scopes of dependencies. Compile dependencies, plus the "provided" scope. I can't tell you how many times I wish that provided dependencies could be transitively declared in Maven, whereas today, "provided" deps aren't included in the POM at all, and must be re-declared in the app using a library... a total hack.
> Oh, that is a great idea! How do you see this integrating with configurations of Ivy?

I personally think Ivy configurations can be pretty much copied as is.

> > * Wildcards and operators like >= should be supported for versions. Requiring a single fixed version is extremely stifling and limits binary compatibility.
> So I am still on the fence when it comes to whether things should be more or less strictly defined (i.e. you define exactly what is supported VS you can define wildcards etc etc). I agree that strictness inhibits readability, but what is good with strictness is you won't be surprised. I leaning towards wildcards though :)

A common theme that I heard after giving the talk is that people want fewer surprises from resolution. The simplest scheme that meets this is strictness/no wildcards. That might work for a larger team that puts a high priority on it, but I think it otherwise imposes a large cost. This is why I proposed the multiple version scheme. I think it has fewer surprises, requires less work, but is still flexible enough.

> > * "Provides" gives a way for a library to advertise that it supports or implements a common API (for example, SLF4J). This means multiple apps can depend on the advertised API, which can be satisfied by any number of backends, without having to go through a fake "meta" jar/package.

Yes, agree.

> > * "Conflicts" gives dep-resolution-time insight into packages that may not work together. Classic example is LOG4J vs LOGBACK. Today this conflict is detected at runtime and is a huge pain to debug.
> Yep, this is a good idea.
> What do you think about having a: "replaces", where you define if this org, name replaces another org, name (thinking about handling conflicts that are google collections/guava).

Replaces implies that there is an ordering. One dependency can be chosen automatically instead of the other. Conflicts says that the two cannot both be resolved together. If the resolution engine can pick dependencies so that this is true, great. If not, it has to fail. It can't pick one automatically. Replaces sounds useful separately from conflicts, but mainly for when a module changes group ID or something. Otherwise, it seems pretty open to abuse. (As an exaggerated example, org.scala-lang "replaces" org.codehaus.groovy.)

-Mark
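
A minimal Python sketch of the distinction drawn above between "replaces" and "conflicts". The `resolve` function, the `ResolutionError` class, and the input shapes are hypothetical, and the sketch is deliberately simplified: a real engine would first try to pick other dependencies that avoid the conflict, whereas this one just fails as soon as both sides of a conflict are selected.

```python
class ResolutionError(Exception):
    """Raised when two selected modules are declared as conflicting."""


def resolve(candidates, replaces, conflicts):
    """Apply "replaces" first, then fail on any remaining "conflicts" pair.

    candidates: set of module names selected so far
    replaces:   dict mapping a module to the module it supersedes
    conflicts:  iterable of (a, b) pairs that cannot be resolved together
    """
    selected = set(candidates)
    for newer, older in replaces.items():
        # An ordering is known, so the older module is dropped automatically.
        if newer in selected:
            selected.discard(older)
    for a, b in conflicts:
        # No ordering is known, so nothing can be picked automatically.
        if a in selected and b in selected:
            raise ResolutionError(a + " conflicts with " + b)
    return selected


# Illustrative declarations, using the examples mentioned in this thread.
replaces = {"com.google.guava/guava": "com.google.collections/google-collections"}
conflicts = [("log4j/log4j", "ch.qos.logback/logback-classic")]

resolved = resolve(
    {"com.google.guava/guava", "com.google.collections/google-collections"},
    replaces,
    conflicts,
)
```

With these inputs guava wins silently, while selecting both log4j and logback raises `ResolutionError` instead of choosing one.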