On 16 Dec 2009, at 21:22, mob wrote:
> I've mined the various information and pulled together a proposal for
> Packages 0.1. This proposal defines the package.json file, the package
> directory structure and package file format. It does not specify the
> catalog file (yet).
>
> I'd like to get feedback on this so we can iterate and then nail down
> this part of the packaging system. Then we can all create packages
> with confidence they will be usable in the various package managers,
> tools and loaders.
>
> I've tried to collect the various input from all, but I've also added
> a couple of new fields. Please point out any fields that are missing
> from the descriptor file.
>
> I've kept the package.json file fairly flat to make it easier for
> tools to inspect and modify fields.
>
> Enjoy
>
> --mob
What are the differences between categories and keywords? And do we really need either?
Author: rather than a single field with a fixed format that people will inevitably get wrong, split it into three fields (or an array of 3 elements), of which only the name is required.
Dependencies: not sure the version handling is right. If we are using semantic versioning, then isn't just [ "ejs", "1.0.0" ] enough? It implies the >= part, and "< 2.0" is implicit from SemVer. (I also don't like '>=' as it implies '=' should work, and I really don't want to get into the culture of letting people do that.)
stability: again, doesn't SemVer cover this and make it redundant?
checksum: why [ "md5", "valuehere" ] and not just { "md5": "valuehere" }? It's not like you can have more than one checksum of a given type, and this would make access easier.
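To make the checksum point concrete, here is a minimal sketch of reading the value under both shapes. The descriptor objects and the helper names md5FromTuple and md5FromObject are made up for illustration; neither shape is confirmed by the proposal text beyond the two forms quoted above.

```javascript
// Two hypothetical descriptors, one per checksum shape discussed above.
const asTuple = { checksum: ["md5", "valuehere"] };
const asObject = { checksum: { md5: "valuehere" } };

// Tuple form: the reader has to unpack the pair and check the kind.
function md5FromTuple(descriptor) {
  const [kind, value] = descriptor.checksum;
  return kind === "md5" ? value : undefined;
}

// Object form: a plain property lookup.
function md5FromObject(descriptor) {
  return descriptor.checksum.md5;
}

console.log(md5FromTuple(asTuple), md5FromObject(asObject));
```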
Can you identify any candidates I've missed?
* maintainer
Is this the point contributor to receive issue requests? Can you
elaborate on the purpose. I tried to cut down to author and
contributors. Can we refactor these to only have 2 fields for these.
ie. pick 2 from (author, contributors, maintainer). Or refactor in
some way to have two instead of three.
* directory.jars should be recommended even though it is not generally applicable
I think this is an example of a collection of things which are not
required, but if present should adopt this form. I'll create a section
for these.
* directory.engines should be in there for any system that has engine-specific components for one or more engines, including general engines like "default". We may or may not want to maintain a registry for engine, os, cpu, and license names. We probably should, as a freely-editable wiki page; "spidermonkey", "v8", "node", "rhino", "mozilla", "jsc", "flusspferd", "narwhal", "gpsee". It's like a stylesheet.
Yes, this is just the same as os and cpu. I'll add a list of
approved names to the proposal.
* async needs to be there as a recommendation to package maintainers or users who are trying to maintain the "no-sync" ideology.
This doesn't quite feel right. I know the need, but there will always
be a long list of things like this. Surely this is the same as
"stability" and the answers are the same. ie. we may need an async-
only catalog.
On Dec 16, 2:34 pm, Kris Kowal <cowbertvon...@gmail.com> wrote:
We should have keywords for "package-manager search keyword" commands, like "pkgmgr search wiki markup". We might want to go so far as having an evolving wiki page to recommend keywords for particular types of services.
Agree, that was my intent/understanding of the keywords field. Are you
advocating then to remove categories?
Dependencies: not sure the version handling is right. If we are using semantic versioning, then isn't just [ "ejs", "1.0.0" ] enough? It implies the >= part, and "< 2.0" is implicit from SemVer. (I also don't like '>=' as it implies '=' should work, and I really don't want to get into the culture of letting people do that.)
stability: again, doesn't SemVer cover this and make it redundant?
Okay, let's remove version operators and take a stance with SemVer.
scripts: Hmmm I thought this was going to be list of scripts to install or some such. Perhaps "commands" or "build_commands" is a better name?
+1 to scripts to forestall bike shedding if possible.
Yippee.
* name - the name of the package. This must be a lowercase alpha-numeric name without spaces. It may include "." or "_" or "-" characters.
I recommend ditching "_" and ".".
We need "." in ejscript. Please... We could recommend that the names
be camelCase if multi-word instead of "_". But do we really care
here? They are opaque names.
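For reference, the naming rule as quoted (lowercase alpha-numeric, no spaces, with ".", "_" and "-" allowed) can be sketched as a single check. The function name isValidPackageName is hypothetical, and requiring the first character to be a letter or digit is an extra assumption, not something the proposal states.

```javascript
// Sketch of the quoted rule: lowercase letters and digits, plus ".", "_" and "-".
// Assumption: the first character must be a letter or digit.
function isValidPackageName(name) {
  return typeof name === "string" && /^[a-z0-9][a-z0-9._-]*$/.test(name);
}

console.log(isValidPackageName("ejs.core"));
console.log(isValidPackageName("My Package"));
```

Note that under the rule as written, a camelCase name would be rejected because of the uppercase letter, so recommending camelCase would mean relaxing the lowercase requirement.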
* bugs - URL for submitting bugs. Can be mailto or http.
I'm thinking we should do something similar to author for this one, where "url" and "email" can be specified, comma delimited, or (urled) and <emailed>.
Makes sense. How about:
bugs: {
mail: "bugs@acme",
http: "....",
}
* license - array of licenses under which the package is provided. Each license is a tuple where the first element is the kind of license and the second element is a URL to the license. For example:
licenses: [["GPL", "http://www.ejscript.org/products/ejs/doc/licenses/gpl.html"]]
I'm thinking we should have a standard set of license names, and then allow an Object for custom variants with {"name", "url"} objects instead of tuples.
It is a bit of a zoo. I tried this, but there are so many different
licenses.
* location - Array of repositories where the package can be located. Each repository is a tuple where the first element is the kind of repository and the second element is a URL path to clone/checkout the package. For example:
location: [["mercurial", "http://hg.embedthis.com/ejs"]]
I think we ought to leave this out for now since it's highly coupled to the catalog specification, I think. In my latest work on Tusk, the catalog consolidation system automatically adds a "source" property to the package descriptor which is the source descriptor that was used to get the rest of the information. It has a "type" field and the rest of its properties depend on the VCS or whether a VCS was used at all.
I was thinking this is the authoritative source for the package not
necessarily where a catalogue may be proxying it. So I think we should
retain this here. Tusk could still add a source which is where this
particular download came from.
* dependencies - Array of prerequisite packages on which this package depends in order to install and run. Each dependency is an array with one to three elements. The first element is the dependent package name and is mandatory. The second is an optional version expression defining the lowest qualifying version. The third element is a version expression defining the highest qualifying version. Version expressions are of the form: operator SPACE version. For example:
dependencies: [ "ejs", ">= 1.0.0", "< 2.0"].
It does need to be an array. For the simple case, I think a mere package name should be permitted, and the version "" implied, which should accept anything.
Agreed.
For the complex case, it should be an Object with something like {name, version} or {name, min, max} if you think that's critical.
I might wait for more input on whether this should be a tuple or hash.
I can see your point.
* bugs - URL for submitting bugs. Can be mailto or http.
I'm thinking we should do something similar to author for this one, where "url" and "email" can be specified, comma delimited, or (urled) and <emailed>.
Makes sense. How about:
bugs: {
mail: "bugs@acme",
http: "....",
}
Sounds good. Optionally allow just a simple string, where it looks for "@" as an email or else treats it as a URL (this might be more complex than is needed)?
* license - array of licenses under which the package is provided. Each license is a tuple where the first element is the kind of license and the second element is a URL to the license. For example:
licenses: [["GPL", "http://www.ejscript.org/products/ejs/doc/licenses/gpl.html"]]
I'm thinking we should have a standard set of license names, and then allow an Object for custom variants with {"name", "url"} objects instead of tuples.
It is a bit of a zoo. I tried this, but there are so many different
licenses.
Licenses are a bit of a zoo, but there are some common ones: GPLv2, GPLv3, LGPL, MIT, BSD, Apache, Mozilla would be the most common, and anything not in that list needs {"name": "url"}
--
You received this message because you are subscribed to the Google Groups "CommonJS" group.
Could be right. The original author is not nearly as relevant as the
current maintainer. How about we use those 2: maintainer and
contributors. The author could be the first contributor by convention?
> As a corollary to this how do you signal that "this package only works on engine Y"?
Create a pseudo package for the engine and depend on it.
For ejscript, we will have a package for each key component of the
platform.
> Or this is a keyword/category? (as dean said while I was composing this)
Agree.
> Unless we enumerate the values for one of these they are both just free-form inputs. So I think we either need to enumerate one and have the other as freeform, or remove one of them and have the remaining as 'free form'.
I'm now leaning this way. Suggest we remove categories and just have
keywords. Simpler and then the package tools can take over from there.
> >>> Dependencies: not sure the version handling is right. If we are using semantic versioning, then isn't just [ "ejs", "1.0.0" ] enough? It implies the >= part, and "< 2.0" is implicit from SemVer. (I also don't like '>=' as it implies '=' should work, and I really don't want to get into the culture of letting people do that.)
> >>> stability: again, doesn't SemVer cover this and make it redundant?
>
> > Okay, lets remove version operators and take a stance with SemVer.
>
> Going with just a single version (implying v2 is not compatible with v1, correct?)
We have a choice with that. One version could mean:
- That version or later
- That major version only.
I'm leaning toward the former. If someone says "package", "1.2.3",
what they really mean is 1.2.3 or later.
If you want it to mean only version 1, you would say
"package", "1", "2"
> >>> scripts: Hmmm I thought this was going to be list of scripts to install or some such. Perhaps "commands" or "build_commands" is a better name?
> >> +1 to scripts to forestall bike shedding if possible.
> Good point: name is fine as it is. Is there a known/defined set of keys on scripts? Just the 4 listed?
I think this will grow from experience. We use install, uninstall and
build. But other platforms may need others. Perhaps "test"?
> > Make sense. How about:
>
> > bugs: {
> > mail: "bugs@acme",
> > http: "....",
> > }
>
> Sounds good. Optionally allow just a simple string where in it looks for "@" as an email, or else treats as a URL (this might be more complex than is needed)?
I think so. I'd advocate that we don't allow such simplifications. The
above is very clear, and the shortcut just makes parsers and tools
harder to write and maintain.
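A small sketch of the trade-off being argued here. The object form needs no guessing, while the string shortcut needs the "@" heuristic described in the quoted suggestion. The function names and the example.org address are hypothetical.

```javascript
// Object form from the proposal: read the two keys directly.
function bugsFromObject(bugs) {
  return { mail: bugs.mail || null, http: bugs.http || null };
}

// String shortcut as suggested: "@" means email, anything else is a URL.
function bugsFromString(text) {
  if (text.includes("@")) return { mail: text, http: null };
  return { mail: null, http: text };
}

console.log(bugsFromObject({ mail: "bugs@acme", http: "http://example.org/bugs" }));
// A URL with a user part contains "@" and is misread as an email address.
console.log(bugsFromString("http://user@example.org/bugs"));
```

The second call shows the kind of corner case every tool would have to agree on if the shortcut were allowed.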
> >>> * license - array of licenses under which the package is provided. Each
> >>> license is a tuple where the first element is the kind of license and the
> >>> second element is a URL to the license. For example
> >>> licenses: [["GPL",
> >>> "http://www.ejscript.org/products/ejs/doc/licenses/gpl.html"]]
>
> >> I'm thinking we should have a standard set of license names, and then allow an
> >> Object for custom variants with {"name", "url"} objects instead of tuples.
>
> > It is a bit of a zoo. I tried this, but there are so many different
> > licenses.
>
> Licenses are a bit of a zoo, but there are some common ones: GPLv2, GPLv3, LGPL, MIT, BSD, Apache, Mozilla would be the most common, and anything not in that list needs {"name": "url"}
Okay how about:
license: {
kind: "GPLv2",
url: "http://www.example.org/licenses/gpl.html"
},
Where the standard kinds are what you listed++
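As a rough sketch of how a tool might check the {kind, url} form against the list of common names given earlier in the thread. The function name licenseProblems is hypothetical, and the rule that a url is only mandatory for non-standard kinds is an assumption drawn from the "anything not in that list needs a URL" suggestion, not an agreed part of the proposal.

```javascript
// The common license names listed in the thread.
const STANDARD_KINDS = ["GPLv2", "GPLv3", "LGPL", "MIT", "BSD", "Apache", "Mozilla"];

// Returns a list of human-readable problems; an empty list means the entry looks fine.
// Assumption: url is only required when kind is not one of the standard names.
function licenseProblems(license) {
  const problems = [];
  if (typeof license.kind !== "string" || license.kind === "") {
    problems.push("kind is required");
  } else if (!STANDARD_KINDS.includes(license.kind) && !license.url) {
    problems.push("url is required for non-standard kind " + license.kind);
  }
  return problems;
}

console.log(licenseProblems({ kind: "GPLv2", url: "http://www.example.org/licenses/gpl.html" }));
console.log(licenseProblems({ kind: "MyOwnLicense" }));
```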
Thanks ash.
http://wiki.commonjs.org/wiki/Packages/B
--mob
+1 I think it is essential to be able to easily find the original source
of a package to encourage collaboration no matter how it was distributed.
If bug reports always target the original source it may make sense to
combine "bugs" and "location".
>> * version - a version string conforming to the Semantic Versioning
> requirements (http://semver.org/)
As for versioning and stability. Being able to declare a 2.X release as
alpha or beta is important I think until it becomes stable. Can we
extend the Semantic Versioning with this:
https://developer.mozilla.org/en/Toolkit_version_format
>> Package managers and loaders should ignore unknown fields in the
>> package descriptor file.
>
> "schema", "extra", "package", "descriptor", "info", "docs",
> "documentation", "reserved", "archive", "format", ...
Please add: "using", "build"
Christoph
Yes! Add "test" please.
Christoph
> Ash Berlin wrote:
>>
>>
>> So the package.json should include VCS repository locations, but yeah,
>> maybe not the location where you can get releases from.
>
> +1 I think it is essential to be able to easily find the original source
> of a package to encourage collaboration no matter how it was distributed.
>
> If bug reports always target the original source it may make sense to
> combine "bugs" and "location".
Hmmm yes that could work.
>
>>> * version - a version string conforming to the Semantic Versioning
>> requirements (http://semver.org/)
>
> As for versioning and stability. Being able to declare a 2.X release as
> alpha or beta is important I think until it becomes stable. Can we
> extend the Semantic Versioning with this:
>
> https://developer.mozilla.org/en/Toolkit_version_format
Or there is the .deb form of 2.0~1 (which is < 2.0 iirc.) In general I think a way of putting developer/preview versions up for explicit download is a good thing (so long as they need to be explicitly required, and by default the package installer only looks for normal versions).
How is the signature generated? If it is for the archive file and the
archive file includes package.json is that not going to be a problem?
Christoph
Actually, if they say "1.2", they mean greater or equal to "1.2.0" but
less than "2". Major versions introduce changes that require code
changes. Minor versions introduce changes that are strictly backward
compatible. Patch versions are backward and forward compatible and
the most recent is always the best choice. I think all the major
cases are covered.
Kris Kowal
s/kind/type/g since they're synonyms with the latter being the
"codish" term, or s/kind/name/ if we assure that there is only one
license for each name.
Kris Kowal
One of the things that came out of the 'python want a CPAN' thread was that the module metadata (which in our case is package.json) should live outside of the package, or at least be available outside it. In which case the one inside it could just not have a signatures section. This does slightly complicate the upload process a bit.
-ash
> SemVer is one day old and we're already talking about extending it...heh...
>
> But this /is/ relevant version info and I can't see how (after 1.0.0)
> it's possible to express this. Not being able to cut a beta release
> package would be pretty constraining, especially for slow-moving
> projects. One comparatively easy way to extend SemVer is to allow
> letters and specify the sort as letters < numbers. Then you could have
> 2.a.1, 2.a.2, 2.b.1, 2.0.0, etc... You could also use this to
> create a train for minor releases (2.2.a, 2.2.b, 2.2.c... all sorting
> before 2.2.0). Dependencies would never resolve to an /extended/ SemVer
> because they're technically invalid and thus unstable.
If you can create a train with 2.2.a it is only really useful if you can
resolve dependencies to it. I think it should depend on how your
dependency is defined (i.e. does it include the letter or not)
Christoph
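The "letters sort before numbers" idea quoted above can be sketched as a comparator. This is only an illustration of that proposal; the function names are made up, and treating missing components as "0" is an assumption.

```javascript
// Compare one dot-separated component; a letter component sorts before any number.
function compareComponent(a, b) {
  const aIsNumber = /^\d+$/.test(a);
  const bIsNumber = /^\d+$/.test(b);
  if (aIsNumber && bIsNumber) return Number(a) - Number(b);
  if (aIsNumber) return 1;
  if (bIsNumber) return -1;
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compare two extended versions component by component.
function compareExtended(x, y) {
  const xs = x.split(".");
  const ys = y.split(".");
  const length = Math.max(xs.length, ys.length);
  for (let i = 0; i < length; i++) {
    const result = compareComponent(
      i < xs.length ? xs[i] : "0",
      i < ys.length ? ys[i] : "0"
    );
    if (result !== 0) return result;
  }
  return 0;
}

// Only purely numeric versions count as resolvable under the quoted rule.
function isPlainSemVer(version) {
  return /^\d+\.\d+\.\d+$/.test(version);
}

const releases = ["2.0.0", "2.b.1", "2.a.2", "1.9.0", "2.a.1"];
const sorted = releases.slice().sort(compareExtended);
const resolvable = sorted.filter(isPlainSemVer);
console.log(sorted.join(" < "));
console.log(resolvable.join(", "));
```

Whether a dependency that names a lettered version explicitly should resolve to it, as suggested in the reply, would be a change to the filter step rather than to the ordering.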
Ah, it is probably technically impossible for the signature to be
inside the package anyway. Let's leave this out as something that
gets added to a package descriptor in a catalog.
Of course, this means that we'll have to be careful about the order in
which we add packages to the catalog for the using-packages style,
since the signature of each dependency needs to be in the package.json
of its dependees. Also means there can't be cyclic dependencies among
packages.
Kris Kowal
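The ordering constraint described here (each dependency must be in the catalog before its dependees, and cycles are impossible) amounts to a topological sort. A minimal sketch, assuming a hypothetical input shape where each package name maps to an array of the names it depends on:

```javascript
// packages: { name: [dependency names] } (hypothetical shape for illustration).
// Returns names in an order where every dependency precedes its dependees.
// Throws if the dependency graph contains a cycle.
function catalogOrder(packages) {
  const deps = new Map(Object.entries(packages));
  const state = new Map(); // name -> "visiting" | "done"
  const order = [];

  function visit(name, trail) {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      throw new Error("cyclic dependency: " + trail.concat(name).join(" -> "));
    }
    state.set(name, "visiting");
    for (const dep of deps.get(name) || []) {
      visit(dep, trail.concat(name));
    }
    state.set(name, "done");
    order.push(name);
  }

  for (const name of deps.keys()) visit(name, []);
  return order;
}

console.log(catalogOrder({ app: ["lib"], lib: ["base"], base: [] }).join(", "));
```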
That's true for checksums, but signatures? As in, "this package must
be signed by Kris Kowal"?
Ihab
--
Ihab A.B. Awad, Palo Alto, CA
What do you mean combine? Can you please give an example?
> >> * version - a version string conforming to the Semantic Versioning
> > requirements (http://semver.org/)
>
> As for versioning and stability. Being able to declare a 2.X release as
> alpha or beta is important I think until it becomes stable. Can we
> extend the Semantic Versioning with this:
>
> https://developer.mozilla.org/en/Toolkit_version_format
Specifically, do you mean append suffixes to the digit portions?
It is a shame to mess with SemVer.
> >> Package managers and loaders should ignore unknown fields in the
> >> package descriptor file.
> >
> > "schema", "extra", "package", "descriptor", "info", "docs",
> > "documentation", "reserved", "archive", "format", ...
>
> Please add: "using", "build"
Sure.
--mob
Good catch.
The package.json lives inside the package AND should be present
outside the package in a catalog. This should be specified as only
existing in free standing package.json files when they are part of a
catalog.
--mob
That is fairly neat. Another similar proposal is to allow any alpha
suffix. ie.
1.0.0beta4
1.0b2
Parsing is easy to just parse as a number, the suffix will be ignored.
--mob
Agree that sometimes that is the case. The use case I was meaning if I
have a package that needs feature X in package y which was introduced
in version 1.3. ie. I depend on any version >= 1.3.
With the meaning you propose, my package expires as the dependent
package increases its version. I'd like to be able to say I depend on
any version after 1.3 and not have to say:
"y", "1.3", "99.99.999"
I think the meaning of SemVer which says my package may break if Y
changes to a new major release is a "may break" not a must break. So
package managers should load the latest version of a package that
qualifies. If a user wants to lock to a given major version, they
should say:
"Y", "1.3", "2.0"
That then locks it to 1.X.
--mob
That's kinda what I suggested in IRC, but with one alteration:
"If the version isn't a valid SemVer, then this version is treated as a developer/beta release and will only ever be installed by specifically asking for this version".
Good
--mob
Depends on whether we're talking about:
1. a declaration that insists that some other package be signed, or
2. a declaration that this package has been signed
I'm suggesting that 2 is paradoxical, for both the crypto hash and
crypto sign cases. The "signature" property in a "package.json" would
be the 2nd case. The "using" property of a package descriptor would
be the first.
M.O'B., that's another thing we need to add to the spec, and Christoph
can speak to the point: a schema for a "using" property for
using-style packages, which we haven't yet ratified, but baby steps.
Kris Kowal
That may get too complicated.
I'd say allow suffixes, but dependencies work on the pure numbers
only.
--mob
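Putting the suffix idea and the "only installed when specifically asked for" rule together, a sketch might look like this. The function names and the exact suffix grammar (letters followed by letters or digits, attached directly to the numbers) are assumptions based on the examples 1.0.0beta4 and 1.0b2.

```javascript
// Split "1.0.0beta4" into its numeric part and its alpha suffix.
// Missing minor/patch numbers are padded with 0 (assumption).
function splitVersion(text) {
  const match = /^(\d+(?:\.\d+){0,2})([A-Za-z][A-Za-z0-9]*)?$/.exec(text);
  if (!match) throw new Error("unrecognised version: " + text);
  const numbers = match[1].split(".").map(Number);
  while (numbers.length < 3) numbers.push(0);
  return { numbers, suffix: match[2] || "" };
}

function compareNumbers(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// Default choice: the highest version that has no suffix.
// Suffixed (preview) versions are skipped unless asked for by exact name.
function pickDefault(available) {
  const stable = available.map(splitVersion).filter((v) => v.suffix === "");
  stable.sort((a, b) => compareNumbers(a.numbers, b.numbers));
  return stable.length ? stable[stable.length - 1].numbers.join(".") : null;
}

console.log(splitVersion("1.0b2"));
console.log(pickDefault(["1.0.0", "1.1.0beta4", "1.0.2"]));
```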
To keep it simple, let's only ever have signatures in catalogs. Never
in package.json.
Then it is the job of the package manager tool that has access to the
catalog to check on a download.
--mob
Dependency groups: "I need *one* of these set of modules, if any are installed great, else pick the first in the list and install that"
This has been a pain many times in my life as a perl developer.
-ash
There are a number of paradoxes I see. Maybe it's just me. :)
For case 1, for *hashes*, if
package A says "import package B having hash X"
package B says "import package A having hash Y"
there is no way to simultaneously compute X and Y (that I know of...).
For case 2, the problem can be solved by signing *part* of the
package, not the package itself. For example, the signatures in Java
JAR files are for individual files, but not the whole thing. With that
in mind, one could adopt some similar packaging convention that
applies signatures to individual files -- including any ancillary
JSON.
This sounds more like a spec dependency than an implementation
dependency. That is, it's related more to our "implements" property
than our "dependencies" property. More than one package might conform
to a specification. We need a property to permit a package to state
that it depends on a particular specification that can be fulfilled by
multiple packages. How the installer handles that is a different
issue. I suggest that the user be presented with the choices through
a package search and be required to explicitly include one of them in
the next install line.
Any ideas for the property name?
Kris Kowal
My 2 cents' worth -- we should strive for zero-admin operation. If
there is a choice, maybe that can be added to the catalog format
somehow. In other words, "I rely on catalog X to choose for me a
conforming implementation of Y".
On Dec 16, 4:57 pm, ihab.a...@gmail.com wrote:
I'd agree. This should be done through the catalog and browser tools
and outside of the package.json file.
A package should state what the "code" depends on. What modules must
be present.
One solution to this is to define meta packages. For example: an "ssl"
package could mean that some form of SSL is present (openssl, gnutls
etc). We want to keep the job for the consuming package simple. They
should depend on a package. That package can be a real package
providing code, or it can be a meta package representing a service or
platform.
--mob
With semantic versioning, if a feature is introduced in version 1.3,
it is guaranteed to exist in any minor version or patch that is equal
to or greater than 1.3, but when the major version changes, API
compatibility is no longer guaranteed, thus the need for explicit
declaration of compatibility. Perhaps this is a weakness of SemVer;
perhaps there's a need to state the maximum major version that is
guaranteed to be compatible. That means that the minimum bound would
need to be either two digits or one digit and the maximum bound would
only need to be one digit.
{"name": "package-name", "min": "1.3", "max": "3"}
1.3 <= version < 3
If that's what you're recommending, I'm down. It would be unwise to
have a package state that it is guaranteed to work with future APIs
that have not yet been defined, but if we studiously update our
dependency lists, there won't be a problem. This enforces a good
practice.
Kris Kowal
Let's call these things for what they are; a meta-package is a
standard that multiple packages can conform to. I do not think we
should conflate the package name space with the standard name space;
it would make the system too "clever" and confusing like so many
package management systems that have come before.
Maybe instead of "implements", we need a "standards" name space:
{"standards": {
"implements": [],
"dependencies": []
}}
Kris Kowal
On Wed, Dec 16, 2009 at 4:53 PM, <ihab...@gmail.com> wrote:
> For case 2, the problem can be solved by signing *part* of the
> package, not the package itself. For example, the signatures in Java
> JAR files are for individual files, but not the whole thing. With that
> in mind, one could adopt some similar packaging convention that
> applies signatures to individual files -- including any ancillary
> JSON.
Requiring "package A signed by X" means "package A containing a
well-known signatures file which in turn contains a signature, by X,
for each file of package A that I end up using".
Hmmm yes and no. If the dep-group is something specced by CommonJS then wonderful. But the use case I'm thinking of is when you wish to wrap two or three (or more) underlying packages. And those packages are just random features: both work, and the consuming package knows how to feature test as needed (if they don't just expose the same API anyway). For example, MySuperWonderfulDocTool needs a PEG parser, so it says:
dependencies: [
{ group: [
[ "packrat" ],
[ "some-other-peg-parser" ]
]
}
]
And I'm with Ihab on the "zero-admin operation". The most annoying part about cpan (the binary) is when it asks you questions.
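A sketch of how an installer could resolve such a group without asking questions: if any alternative is already installed, do nothing, otherwise install the first one listed. The function name packagesToInstall and the use of a Set for installed names are assumptions; the dependency shape follows the example above.

```javascript
// dependencies: entries are either ["name"] or { group: [["a"], ["b"]] }.
// installed: a Set of package names already present.
// Returns the names that still need installing, with no user prompts.
function packagesToInstall(dependencies, installed) {
  const toInstall = [];
  for (const dep of dependencies) {
    if (Array.isArray(dep)) {
      if (!installed.has(dep[0])) toInstall.push(dep[0]);
    } else {
      const names = dep.group.map((alternative) => alternative[0]);
      // Any installed alternative satisfies the group; otherwise take the first.
      if (!names.some((name) => installed.has(name))) toInstall.push(names[0]);
    }
  }
  return toInstall;
}

const dependencies = [
  { group: [["packrat"], ["some-other-peg-parser"]] }
];
console.log(packagesToInstall(dependencies, new Set(["some-other-peg-parser"])));
console.log(packagesToInstall(dependencies, new Set()));
```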
On 17 Dec 2009, at 01:02, mob wrote:
> One solution to this is to define meta packages. For example: an "ssl"
> package could mean that some form of SSL is present (openssl, gnutls
> etc). We want to keep the job for the consuming package simple. They
> should depend on a package. That package can be a real package
> providing code, or it can be a meta package representing a service or
> platform.
Hmmm, this doesn't feel like the right solution to me. It's been done for perl (there are various ::Any modules, such as Config::Any) but you still have the issue of when you are installing on a native system you need *one* of the modules, but if you have the second choice already installed you don't want to install the first choice.
-ash
This is asserted by a catalog, right? Not by any individual package?
I was thinking of the package, bearing in mind that this is not a
feature that "using-packages" would use ever. When you "use" a
package, it is a direct reference.
If we want to avoid having any installer ever ask you which package
you want that conforms to a particular specification, we should avoid
both standard implementation declarations and standard dependency
declarations since they are only informative and introduce
non-determinism respectively.
Kris Kowal
Kris, thanks for the use case. I understand the coercing packages to
do the right thing, but it has a downside.
What about this case:
I've got a package X and I use package Y 1.0 APIs. I create the
package and archive it never to return, but users continue to download
and use it (you see, it was perfect the first time I created package X).
Package Y is under rapid development and goes 1.1, 1.2. .... 2.0
pretty quickly. All catalogs move quickly to host the new versions of
package Y. Users don't want to store every version of package Y, they
only really want the most recent package. Whereas they have the
perfect package X, no need to upgrade. Soon people stop hosting
package Y 1.0. And now no-one can use package X because it said:
"y", "1.0"
Whereas, package X should continue to be useful and should not die.
Package dependencies are not guarantees that something WILL work.
Rather, it is a statement of what is required to function.
So the choices are:
- Have an implicit upper version and have packages only run up to that
version. If the dependent package is upgraded, then the depending
package WILL eventually need to be upgraded
or
- Have no implicit upper version. Packages continue to work. They may
fail if the dependent package changes an API that is being used by the
depending package.
Both have holes. My preference is for the latter as it doesn't
forcibly break anything, but I understand your perspective. We need to
choose one.
---mob
What does the package manager do with dependencies?
I was assuming that implements was informative only and not enforced
by package managers.
Ok, in your use case where you might ask a user what to do, that is
fine. Package X can assert that "I implement Mail 2.0" and, as long as
you are auditing that yourself, you can presumably take precautions.
However, for the zero-admin case which I advocate, I wanted to clarify
the security scenario: I rely on catalog Y to tell me that package X
implements Mail 2.0, and therefore I delegate the burden of due
diligence to the author of catalog Y.
> If we want to avoid having any installer ever ask you which package
> you want that conforms to a particular specification, we should avoid
> both standard implementation declarations and standard dependency
> declarations since they are only informative and introduce
> non-determinism respectively.
Standard implementation declarations, if asserted by a catalog based
on knowledge (testing, ...) that the catalog author has done, can be
far stronger than informative.
Standard dependency declarations are non-deterministic only to the
extent that any dependency on a possibly varying object is
non-deterministic. We already have that if we rely on something via
its location and a signature, or range of versions, or ....
Zero: Packages can declare that they are dependent on version X.Y.Z
through W. All of the terms of X.Y.Z default to zero if they are not
provided and W defaults to (X + 1) if it is not provided.
Implications: if a package maintainer falls asleep AND multiple
versions are not supported by the catalog system AND if old versions
of packages are culled from the repository, it is possible that a
package will become unusable because it is dependent on a package that
no longer exists. This also means that, except for errors, semantic
versioning provides strong guarantees.
Infinity: Packages can declare that they are dependent on version
X.Y.Z through W. All of the terms of X.Y.Z default to zero if they
are not provided and W defaults to Infinity. Implications: packages
may go unmaintained indefinitely without causing their dependees to
ever become unusable. Incrementing the major version of your package
does not guarantee that package maintainers will be required to review
whether their packages still work and users will be forced to upgrade.
Kris Kowal
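The two options in the hand show differ only in the default upper bound. A minimal sketch, assuming the upper bound is exclusive (as in the earlier "1.3 <= version < 3" example) and that versions are plain numeric strings; the function names are hypothetical.

```javascript
// "1.3" -> [1, 3, 0]; missing terms default to zero, as in both models.
function toTriple(text) {
  const parts = String(text).split(".").map(Number);
  while (parts.length < 3) parts.push(0);
  return parts;
}

function compareTriples(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] < b[i] ? -1 : 1;
  }
  return 0;
}

// model is "zero" (upper bound defaults to next major) or
// "infinity" (no upper bound unless one is given explicitly).
function acceptedRange(min, max, model) {
  const lower = toTriple(min);
  if (max !== undefined) return { lower, upper: toTriple(max) };
  return { lower, upper: model === "zero" ? [lower[0] + 1, 0, 0] : null };
}

// Assumption: the upper bound is exclusive.
function accepts(range, version) {
  const v = toTriple(version);
  if (compareTriples(v, range.lower) < 0) return false;
  return range.upper === null || compareTriples(v, range.upper) < 0;
}

console.log(accepts(acceptedRange("1.3", undefined, "zero"), "2.0.0"));
console.log(accepts(acceptedRange("1.3", undefined, "infinity"), "2.0.0"));
console.log(accepts(acceptedRange("1.3", "2", "infinity"), "2.0.0"));
```

The last call is the case where a package under the Infinity model opts back into the Zero behaviour by stating an upper bound itself.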
One question before hand show.
If we select the "Infinity model" you can implement the zero case by
specifying an upper version bound.
If we select the "zero model", how can you specify the infinity case?
--mob
Is there also an option for
Explicit: Packages MUST specify a minimum and maximum version --
possibly "zero" or "infinity".
?
ashb: Zero. (to muddy further might be an idea to be able to specify multiple versions: "I work with 1.3 and 2" etc?)
+1 for Infinity
But either will work.
--mob
> On Wed, Dec 16, 2009 at 5:26 PM, Kris Kowal <cowber...@gmail.com> wrote:
>> Hand show.
>
> Is there also an option for
>
> Explicit: Packages MUST specify a minimum and maximum version --
> possibly "zero" or "infinity".
>
> ?
>
> Ihab
I personally don't like this option. I'd prefer to foster a culture where a major version number bump means just that - something major and non-backwards compatible changed. In which case the upper bound is to my mind the next major version up. If we were to specify a maximum version, it should only be a single number, not a full version (i.e. an upper version of 3.2.56 should not be allowed)
For reference: Backpan currently stands at about 7gig. Backpan is like CPAN, but it includes every version of every module ever uploaded to CPAN. If the mirrors limit themselves to just keeping the latest version of old major version numbers, I can't see the disk space burden being particularly large.
-ash
On 17 Dec 2009, at 01:29, ihab...@gmail.com wrote:
> On Wed, Dec 16, 2009 at 5:26 PM, Kris Kowal <cowber...@gmail.com> wrote:
>> Hand show.
>
> Is there also an option for
>
> Explicit: Packages MUST specify a minimum and maximum version --
> possibly "zero" or "infinity".
>
> ?
>
> Ihab
Agree with this, but let's be clear:
When a package goes from 1.2 to 1.3, hopefully it won't break APIs
that other packages depend on. It may, but it is good practice for us
all to strive to not break APIs. I believe this is the normal case.
Packages evolve, release new features, update versions. This normal
practice should not require extra work on the part of packages
depending on a package. ie. a package should not have more work to do,
just because packages that it depends upon go through a rapid (non-
breaking) evolution. It should not have to be re-released just because
another package did a release.
Take this to its extreme: Windows releases version 7 and every
application breaks unless it is repackaged and re-released to specify
that it works with Windows 7. Surely if things aren't broken, they
should keep working? (I know, I hate using Windows as an example.
Please don't ping me for that.)
--mob
I worked through a lot of versioning issues while designing and building
the new tusk (still under development). I believe I have come to a
satisfactory solution for the needs of a userland package developer
relying on "using packages" to compose programs. I am not a sysadmin and
cannot speak to the extent these concepts can be applied to system
packages that compose a platform on which userland packages run but am
sure there is a common ground.
In my small world:
- You build programs (in userland) on top of a platform
- The platform is made up of system packages
- The platform is maintained by sysadmins (or automated cousins)
- The platform exposes a *frozen* API as far as userland is concerned
- The platform has entry points (JSGI, command line, ...)
- To run a program on a platform you:
  - Install a program package with using package dependencies
  - Load a configuration that:
    - Hooks the program to platform entry points
    - Configures the lifecycle of the program on the platform
    - Provides credentials (DB, services, ...)
    - Configures internal workings of the program
As a userland developer you care about:
- Versions of system packages (if working on the edge)
- Versions of using packages (always)
- Publishing a package once and having it work forever (ideally)
- Collaborating with others easily
As a userland developer you know about:
- Versions of system packages *YOU* have tested
- Versions of using packages *YOU* have tested
As a userland developer you *DO NOT* know about:
- Future versions of system packages that *MAY* be compatible
- Future versions of using packages that *MAY* be compatible
- Unintended uses of your package (e.g. a distro mashes it up)
Code versioning and distribution is a real mess across different
technologies, communities, projects, ... and for a good reason. Any one
solution does not fit all and many have evolved over time. That is not
going to change.
You *cannot* rely on a package author to understand the implications of
their versioning decisions. You *can* expect a package author to
*declare* what they *have* tested.
The logical consequence is that a "package.json" file can accurately
reflect the *current* state but not infer a possible *future* state.
A future state is one where the package is used in a way other than what was *actually tested*.
I am advocating an indirection via catalogs to manage the future state:
"package.json" files must be freezable and treated as code. They
represent what the author has been able to *actually* accomplish
(speaking to dependency compatibility) up to the point a release is tagged.
A catalog is a composition of packages intended for consumption by a
decided audience. The audience knows its requirements and the catalog
author can ensure that the catalog and the containing packages meet all
communicated needs (speaking to versions of packages).
A catalog:
- May rewrite the names of packages
- May rewrite the versions of packages
- May provide release trains/streams/channels
- May provide multiple versions of the same package or just one
- Should *actually test* all combinations (where possible/realistic)
When you build a program you become a catalog author. You construct a
catalog for your program's components. This catalog may be shared with
other developers to collaborate and ensures a completely automated
toolchain making it trivial to contribute to a project.
Your program's catalog may be entirely of your own composition or may
mirror or incorporate a catalog shared by a larger community. You may
distribute your program via your catalog and/or offer it for syndication
by other catalogs. When syndicated via other catalogs, your catalog (or
program) may be recomposed entirely, swapping out packages etc ...
For all this to work a catalog *must always* override the "package.json"
included by the package. While the original "package.json" file informs
the catalog author, the package descriptor in the catalog informs the
package manager and runtime system.
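A minimal sketch of that override rule, assuming a catalog is a plain object keyed by package name whose entries are complete descriptors. The helper name `effectiveDescriptor`, the catalog shape and the sample fields are hypothetical, and falling back to the bundled file when the catalog has no entry is also an assumption:

```javascript
// Hypothetical helper: pick the descriptor a package manager should act on.
// Assumption: a catalog looks like { packages: { <name>: <descriptor> } }.
function effectiveDescriptor(catalog, name, bundledDescriptor) {
    var packages = catalog.packages || {};
    if (Object.prototype.hasOwnProperty.call(packages, name)) {
        // The catalog entry replaces the bundled package.json wholesale;
        // the bundled file only informed the catalog author.
        return packages[name];
    }
    // Assumption: with no catalog entry, use what the package shipped.
    return bundledDescriptor;
}

var bundled = {
    name: "mypackage",
    version: "1.0.0",
    dependencies: [["ejs", "1.0.0"]]
};
var catalog = {
    packages: {
        mypackage: {
            name: "mypackage",
            version: "1.0.0",
            dependencies: [["ejs", "1.2.0"]] // rewritten by the catalog author
        }
    }
};
var descriptor = effectiveDescriptor(catalog, "mypackage", bundled);
```

Here `descriptor` is the catalog's entry, so a package manager would see the rewritten dependency rather than the one frozen into the release.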
There are numerous advantages to this approach. At the very minimum we
have a system that can be abused just like any other. At best we have a
system that can provide a very reliable and automated toolchain where
others have failed.
The primary tenets for "catalog-based package management" are:
* Package authors write code.
* Catalog authors distribute code.
* Catalogs are cheap. They are easily composed and recomposed.
* Unit, integration, API and user-experience tests are king.
* Users choose which catalogs to use or compose their own.
* Collaborate on catalogs, not just packages.
Maybe catalog is not the best term for this. We are talking about a
deterministic instrument, not one requiring user-decision beyond catalog
authoring.
I realize this is a different way of thinking but am hoping it is
considered as a possible solution to the dependency/versioning mess.
Christoph
Hmm. How about:
    {
        origin: {
            location: {
                type: "git",
                url: "http://hg.example.com/mypackage.git"
            },
            bugs: {
                mail: "d...@example.com",
                web: "http://www.example.com/bugs"
            }
        }
    }
Ideally "origin" (or equivalent) lists all resources needed for
effective collaboration.
Christoph
To me a catalog is just an index of packages created on a server and
downloaded to package managers. In our model, catalogs are not
necessarily required by loaders (though can speed up locating and
loading) and catalogs are never required to run an app. So app
developers don't have to create or directly use catalogs. I suppose
this is similar to the debian/apt model. Could you please explain what
a catalog is in your mind?
If at all possible, I'd like to nail down the package.json without
opening the catalog can of worms. ie. solve one piece and then build
on that.
--mob
Here's how I see it -- not to hijack Christoph's answer. Hope this helps.
A Reference is a pointer to some external stuff and has a known schema. It includes:
* Where to find the stuff (may be multiple locations)
* Digital signatures applied to the stuff
* The checksum of the stuff
A Package Reference is a Reference that points to a package file.
A Catalog Reference is a Reference that points to a catalog.
A Catalog is a mapping from short, *locally* unique names to Package
References, and a possible inclusion of other Catalogs. For example:
    {
        packages: {
            'strutils' : <PackageReference>,
            'regexp' : <PackageReference>
        },
        catalogs: [
            <CatalogReference>,
            <CatalogReference>,
        ]
    }
I may have the details wrong, but that doesn't matter. What matters is
that I think the package.json of a package must include a snippet of
catalog matter in it to refer to the packages it requires.
In other words, package.json can have a "catalog" slot and point to
the (to be determined) catalog schema, and be done with it.
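A rough sketch of how a short name might be resolved against such a catalog. It assumes the Catalog References have already been fetched and turned into catalog objects, and that local names win over names from included catalogs; neither point is settled in this thread, and `resolveReference` and the example URLs are made up for illustration:

```javascript
// Hypothetical resolver for a catalog of the shape
// { packages: { shortName: reference }, catalogs: [ otherCatalog, ... ] }.
function resolveReference(catalog, shortName, seen) {
    seen = seen || [];
    if (seen.indexOf(catalog) !== -1) {
        return null; // already visited: catalogs may refer to one another
    }
    seen.push(catalog);
    var packages = catalog.packages || {};
    if (Object.prototype.hasOwnProperty.call(packages, shortName)) {
        return packages[shortName]; // assumption: local names take priority
    }
    var included = catalog.catalogs || [];
    for (var i = 0; i < included.length; i++) {
        var found = resolveReference(included[i], shortName, seen);
        if (found !== null) {
            return found;
        }
    }
    return null;
}

var shared = {
    packages: { regexp: { location: "http://example.com/regexp.zip" } },
    catalogs: []
};
var mine = {
    packages: { strutils: { location: "http://example.com/strutils.zip" } },
    catalogs: [shared]
};
shared.catalogs.push(mine); // a deliberate cycle, to exercise the guard
```

With these two catalogs, `resolveReference(mine, 'regexp')` finds the reference through the included catalog, and an unknown name yields `null` instead of looping forever.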
Again, hope this helps. Cheers,
If that is the case, I'd like to propose a simpler or more layered
foundation to enable simpler and smaller systems to be built. I've no
issue with having catalogs as you describe, but we should be able to
build packages and systems without them. They must not be required to
create or use packages. Let me explain ...
With the package.json that we have proposed and discussed, users can
create complete packages with dependencies fully specified. Small
systems can then have very simple loaders and package managers. Most
of the complex stuff can be done by tooling and server side
repositories. The loader simply reads the package.json and loads
required dependencies. The actual location of packages can be defined
by a local or central repository search path. Either way, the
package.json does not need to know about locations.
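As a sketch of how small such a loader's lookup could be, assuming each package lives in a directory named after it and the search path is an ordered list of directories. The function `locatePackage`, the directory layout and the example paths are assumptions for illustration; the existence check is passed in so the sketch does not depend on any particular file API:

```javascript
// Hypothetical lookup: find a package descriptor by opaque name on a search path.
// `exists` is a caller-supplied function(path) -> boolean.
function locatePackage(name, searchPath, exists) {
    for (var i = 0; i < searchPath.length; i++) {
        var candidate = searchPath[i] + "/" + name + "/package.json";
        if (exists(candidate)) {
            return candidate; // first match on the path wins
        }
    }
    return null; // not installed anywhere on the path
}

// Stand-in for a real file system check.
var installed = { "/usr/local/packages/ejs/package.json": true };
function fakeExists(path) {
    return installed[path] === true;
}

var found = locatePackage("ejs", ["/home/me/packages", "/usr/local/packages"], fakeExists);
```

Note that the descriptor itself never mentions a location; only the search path does.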
So I'm not against catalogs, but let's define a package.json that can
operate without them. This is a huge win for small systems and it
really helps layer the concerns.
--mob
What M.O'B is proposing is much closer to my mental model of packages too, fwiw.
One thing that isn't clear though is how to deal with the problem of name clashes ("template" being the one that springs to mind. There are already 10+ different template-y modules): We either need a central repository/authority of package names, or we need more than just URL to fully qualify a package.
I'm happy with a single central authority, but I think most other people aren't. So how about:
    dependencies: [
        [ "template1", "http://foo.com/catalog#template", 1.0 ],
        [ "template2", "http://bar.com/my_site/template", 1.0 ]
    ]
I'm not particularly attached (or happy with) that exact syntax, but how does the idea sound?
(also dependencies should probably not be a required key. Not all packages will have them. Ditto for implements).
Great -- that's fine. I have discussed previously the idea of a
"default" or "system" or what-not package, that grabs stuff from some
implementation-defined path (in a Pythonic manner,
/usr/local/packages/ or /usr/share/site-javascript/ or whatever).
Perhaps that can be the source for the small systems you mention.
So long as it's clear by looking at a package that it uses local stuff
in this manner, that sounds fine. So a require from my own package
would look like:
    require('foo/bar');
and from a "using packages" with a catalog reference would look like:
    require('foo/bar', 'catalogShortName');
and from the "system" (or whatever) package would look like:
    require('foo/bar', 'system');
?
In the catalog case, I rely on the catalog and the catalog disambiguates.
In the local case, a local sysadmin must organize the local search
path appropriately via manual intervention.
> (also dependencies should probably not be a required key. Not all packages
> will have them. Ditto for implements).
I think both of these should be catalog concerns and should be removed.
But if you choose a name like "template" and you don't have a lot of
visibility --- I doubt it will get accepted into the repositories for
public distribution.
Package descriptors should not have physical locations of other
packages in them. That becomes a nightmare to maintain. That is the
job of package repositories and catalogs.
--mob
>> (also dependencies should probably not be a required key. Not all packages
>> will have them. Ditto for implements).
>
> I think both of these should be catalog concerns and should be removed.
Absolutely not: the major purpose of such a file to me (as a package developer) is so that I can state the dependencies of a package.
If you remove it then catalog authors would have to manually go and work out what the dependencies are, and what specs a package implements. (ignoring for now whether or not I agree with the catalog proposal/idea)
-ash
So you are also suggesting a single central repository of packages (at least in effect)?
The name that a package asserts about itself is merely a polite
suggestion. The important name is the (memorable) nickname in the
scope of a catalog. See --
http://en.wikipedia.org/wiki/Zooko's_triangle
> Package descriptors should not have physical locations of other
> packages in them. That becomes a nightmare to maintain. That is the
> job of package repositories and catalogs.
It would be *nice* for a package descriptor to contain only references
to catalogs. However, *disallowing* physical references is limiting
useful use cases. I say just let the package descriptor contain a
piece of catalog (which could just be a single reference to an
external catalog -- remember, catalogs refer to one another) and be
done with it.
Ihab
+2
> If you remove it then catalog authors would have to manually go and work out what the dependencies are, and what specs a package implements. (ignoring for now whether or not I agree with the catalog proposal/idea)
We need to separate these concerns. Keep the package.json small. Have
it specify via opaque handles what are the dependencies on other
packages and include version guarantees.
Then catalogs and other tools can manage them as discrete units.
Catalogs can build up higher order environments and groupings.
--mob
But do you agree with me, though, that "dependencies" can be dealt
with by including a snippet of catalog schema in the package.json?
These two schemata overlap considerably. There is no reason to have
two ways of referring to packages.
Doesn't have to be a single repository, but I think a few leading
repositories will naturally develop. There may be others focused on
specific needs, such as embedded or browser-centric ones.
What I'm saying really is we provide a mechanism for unique names, but
don't force it. Allow briefer names to evolve also. In practice, if we
have some repositories set up that mirror packages, I think this will
be managed fairly easily.
--mob
Disagree. The name in the package.json should be THE name. That name
should be constant whoever uses the package.
> It would be *nice* for a package descriptor to contain only references
> to catalogs. However, *disallowing* physical references is limiting
> useful use cases. I say just let the package descriptor contain a
> piece of catalog (which could just be a single reference to an
> external catalog -- remember, catalogs refer to one another) and be
> done with it.
That creates a dependency on catalogs. Packages should be able to
exist without catalogs. If names are opaque and unique then we can
achieve this. Catalogs should refer to packages not the other way
round.
--mob
We can only enforce their uniqueness if we rely on some disambiguating
party to serve as the context for the unique and short names
(essentially, nicknames). Otherwise, we either utter long and messy
names (the location / checksum / signature / ... guff) or we expose
ourselves to substitution attacks where an attacker can camp on
someone else's namespace with a malicious package.
> Catalogs should refer to packages not the other way round.
That is true in all I propose.
I think catalogs are still several different things in different
people's minds. If we can separate the concerns and move on to catalogs
next with a detailed catalog schema -- it would help greatly.
So can I ask a question: What is missing from the package.json that
must be in there?
The open issues I hear are:
- Whether we use infinity- or zero-based version bounds, on which Kris
asked for a show of hands. Currently 2 hands for 1 all.
- Whether package names can be opaque and collisions be resolved by
repositories / catalogs or whether we must impart meaning inside the
package name
I propose:
- package.json files don't have the actual locations of dependent
packages (or of itself). The names remain opaque
- package.json files don't refer to catalogs or include catalog
sections
I'm flexible on the dependency version issue Kris raised. If we get
more votes we can resolve that issue. I'm swayable either way after
thinking more about Kris's perspective.
--mob
> [attachment: packagesAndCatalogs.png, 161K]
What is your definition of "opaque"?
> - package.json files don't refer to catalogs or include catalog
> sections
We can get to this after we've achieved understanding on the first item. :)
In that case, we should remove "dependencies" from your schema too and
defer the issue wholesale.
> Attached is an updated diagram with an attack scenario at the bottom. -- Ihab
>
I don't get it - how is it an attack? If you're going to download random code and put it in the search paths you've got all kinds of other issues.
Ah, I see. So really this is about stuff you get from what I'm calling
(provisionally for our discussion) the "system" package? In other
words, from the PATH of some local storage that has been hand-curated
by a sysadmin?
If so, then that's fine -- the "dependencies" are a set of hints to a
sysadmin who will do real work and research. But I strongly suggest
that the "require" for this use case specify what is being
require()-ed in a namespace that cannot be confused with the secure,
catalog-using requires.
People can put anything in the name, subject to the character set
restrictions. It's their job to pick a unique string.
If we had a single, authoritative repository, it could ensure
uniqueness of names when they are published. Perhaps we will get such a
single repository, as other platforms have.
If a package is private to an organization, they can use whatever
scheme they like.
--mob
No, because a package knows what other packages it needs to run. By
having dependencies, a package.json can work and be a complete
specification of the environment required to successfully run the
package.
--mob
Well, perhaps, if you mean that someone who types apt-get on
Debian/Ubuntu is a sysadmin. But only in that sense.
A package tool will use the package.json to ensure it pre-installs
dependent packages. And a loader MAY use the dependencies to pre-load
packages (Not required as require will load -- but I could imagine an
implementation that did this).
> If so, then that's fine -- the "dependencies" are a set of hints to a
> sysadmin who will do real work and research. But I strongly suggest
There should be little or no research required other than:
    pkg install package-A
This will download package-A, get the package.json, look up the
dependencies, retrieve those and recursively install the lot. No big
deal and SOP for apt. ie. a familiar model.
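To make the recursion concrete, here is a small sketch over an in-memory stand-in for a repository. The `repository` object, the `installOrder` function and the `[name, version]` shape of each dependency are assumptions for illustration; version checks are left out to keep it short:

```javascript
// Hypothetical repository: opaque package name -> contents of its package.json.
var repository = {
    "package-A": {
        name: "package-A",
        dependencies: [["package-B", "1.0.0"], ["package-C", "1.0.0"]]
    },
    "package-B": { name: "package-B", dependencies: [["package-C", "1.0.0"]] },
    "package-C": { name: "package-C" } // no dependencies key at all
};

// Return the names to install, dependencies first, each exactly once.
function installOrder(name, repo, state) {
    state = state || { visited: Object.create(null), order: [] };
    if (state.visited[name]) {
        return state.order; // already handled (this also stops cycles)
    }
    state.visited[name] = true;
    if (!Object.prototype.hasOwnProperty.call(repo, name)) {
        throw new Error("package not found in repository: " + name);
    }
    var deps = repo[name].dependencies || [];
    for (var i = 0; i < deps.length; i++) {
        installOrder(deps[i][0], repo, state); // deps[i] is [name, version]
    }
    state.order.push(name);
    return state.order;
}

var order = installOrder("package-A", repository);
```

For this repository the order comes out as package-C, package-B, package-A: each package is installed only after everything it depends on.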
> that the "require" for this use case specify what is being
> require()-ed in a namespace that cannot be confused with the secure,
> catalog-using requires.
I'm really not familiar with secure catalog-using require. I'm basing
this proposal on Modules 1.0 as currently approved.
Perhaps when that is more ratified we could expand this proposal in
step?
--mob
Ok, though do we acknowledge that catalog matter may intersect with
"dependencies" and maybe the two concepts could be merged at a future
date?
Sure, that is possible. I'm currently not in favor because it then
mandates the existence of catalogs which makes smaller systems harder.
But as we learn and get experience, they may more than justify their
weight.
Once we solidify this package proposal, I suggest that someone with
more experience with catalogs drafts a similar proposal focusing
narrowly on catalogs and we do the same for that -- layer upon layer.
--mob
I'm proposing a two-phased approach to the problem of catalog
management, first for system-packages and second for using-packages.
1. For system-packages, the catalog would be centralized and depending
on the release criteria of the catalog, integration tested, peer
reviewed, verified, dissected; in a word: managed. Even in this
scenario, system administrators may subscribe to a prioritized list of
catalogs or install different versions deliberately. The catalog
would be composed from selected package descriptors. The catalog
would reserve the right to rewrite these descriptors with different
names and dependencies and the system administrator's package manager
would be required to overwrite the package.json in the package
directory with the one provided by the catalog in order to maintain
consistency. For this layer to function, there must be communication
and validation between the catalog maintainer and the package
maintainer. There can be many catalogs, but they apply at the
"system" context. It is sufficient at this stage for the dependencies
to be explicated with short names and perhaps version criteria. It is
not necessary for this stage to specify the catalog schema, which
separates concerns.
2. For the second stage of system-packages, we specify the catalog
schema, which is trivial in terms of the package descriptor schema.
3. For using-packages, the responsibilities of the catalog and the
package descriptor become tightly coupled. The purpose of
using-packages can be satisfied in a couple of ways, both of which I
think we should support. First, the package descriptor may subscribe
to a remote catalog, in which case it must have a reference to the
catalog, a validating hash, and a cryptographic signature in terms of
a known public key. Second, the package descriptor may itself contain
a catalog of the packages it uses, in which case it needs
ref/hash/sign for each individual package. There are no version
criteria: one reference to one version of the package.
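A sketch of the "validating hash" half of that ref/hash/sign triple. It assumes Node.js's built-in `crypto` module is available, and it picks SHA-256 only for illustration since no algorithm is named here. The `hashMatches` function, the reference fields and the URL are hypothetical, and signature checking is left out entirely:

```javascript
// Assumption: running where Node.js's "crypto" module can be required.
var crypto = require("crypto");

// Hypothetical check: does the downloaded content match the pinned hash?
function hashMatches(reference, contents) {
    var actual = crypto.createHash("sha256").update(contents).digest("hex");
    return actual === reference.hash;
}

var contents = "exports.hello = function () { return 'hi'; };";
var reference = {
    location: "http://example.com/packages/hello-1.0.0.js", // made-up URL
    hash: crypto.createHash("sha256").update(contents).digest("hex")
};

var ok = hashMatches(reference, contents);
var tampered = hashMatches(reference, contents + " ");
```

Because the reference pins one exact version by its hash, any change to the content, however small, fails the check.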
Because the using-packages-package-descriptor-schema is likely to be
cohesive with the system-packages-catalog-schema, and because the
system-packages-catalog-schema is likely to be cohesive with the
system-packages-package-descriptor, and because ideally the
using-packages-package-descriptor is to be a strict superset of the
system-packages-package-descriptor, I propose that we converge in
these three separate phases.
M.O'B. is presently working on the first phase. In other words, I
appreciate where you're going, but I think that we should refrain from
bringing up concerns from phase 3 until we get there, except in the
case where a proposal in phases 1 or 2 precludes consistency with 3.
Kris Kowal
I like the phased approach. Can we finalize the open issues with Phase 1
before getting too deep into 2 & 3? We need a solid foundation.
The open issues for phase 1 I mentioned earlier:
- Versions: zero or infinity bounds. Kris's post described this issue.
We need more hands.
- Package names. Can they be opaque as currently documented?
If we resolve these and there are no other open issues that I'm not
aware of -- we can proceed to the next phases.
PS. I'd like to better understand the problem catalogs are trying to
solve. What are the requirements or use cases for them? It sounds
awfully complex -- but it could be I just don't understand some basic
details.
--mob
On Dec 17, 12:27 pm, Kris Kowal <cowbertvon...@gmail.com> wrote:
> I'm proposing a two-phased approach to the problem of catalog
> management, first for system-packages and second for using-packages.
One more thing. When we do phase 2 and 3, I really want to enable
package.json files to run without catalogs. We have systems with real
embedded memory constraints and so we want to have a very low barrier
to entry to use packages in these systems. ie. they can't afford a
complex catalog system. Layering it on top is fine, rewriting
package.json files may be okay too. But requiring catalogs is not
good.
--mob
Ok. With the most recent clarification that "dependencies" as
discussed here are disjoint from anything in the using-packages land,
that's fine.
Ok -- so the real requirement is "low (or easily reducible) memory and
CPU footprint". I agree.
Hand show.
Question 1: Regarding dependency versions and how to interpret the
digits.
Option Zero: Packages can declare that they are dependent on version
X.Y.Z through W. All of the terms of X.Y.Z default to zero if they are
not provided and W defaults to (X + 1) if it is not provided.
Implications: if a package maintainer falls asleep AND multiple
versions are not supported by the catalog system AND if old versions
of packages are culled from the repository, it is possible that a
package will become unusable because it is dependent on a package that
no longer exists. This also means that, except for errors, semantic
versioning provides strong guarantees.
Option Infinity: Packages can declare that they are dependent on
version X.Y.Z through W. All of the terms of X.Y.Z default to zero if
they are not provided and W defaults to Infinity. Implications: packages
may go unmaintained indefinitely without causing their dependees to
ever become unusable. Incrementing the major version of your package
does not guarantee that package maintainers will be required to review
whether their packages still work and users will be forced to upgrade.
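The two options differ only in the default for W, which a short sketch can show. It assumes W is a bare major version number and is treated as an exclusive upper bound (so a default of X + 1 admits everything below the next major version); the function names and the `"zero"` / `"infinity"` option strings are made up for illustration:

```javascript
// Parse "X.Y.Z" into three numbers; missing terms default to zero.
function parseVersion(text) {
    var parts = String(text).split(".");
    return [Number(parts[0] || 0), Number(parts[1] || 0), Number(parts[2] || 0)];
}

// Returns -1, 0 or 1 for a < b, a == b, a > b.
function compareVersions(a, b) {
    for (var i = 0; i < 3; i++) {
        if (a[i] !== b[i]) {
            return a[i] < b[i] ? -1 : 1;
        }
    }
    return 0;
}

// option is "zero" or "infinity"; upperMajor is W, or undefined when omitted.
// Assumption: W is exclusive and names a major version only.
function satisfies(candidate, minimum, upperMajor, option) {
    var min = parseVersion(minimum);
    var version = parseVersion(candidate);
    if (upperMajor === undefined) {
        upperMajor = option === "zero" ? min[0] + 1 : Infinity;
    }
    return compareVersions(version, min) >= 0 && version[0] < upperMajor;
}

var zeroAccepts2 = satisfies("2.0.0", "1.2", undefined, "zero");
var infinityAccepts2 = satisfies("2.0.0", "1.2", undefined, "infinity");
```

Under Option Zero a dependency on "1.2" stops matching once the dependee reaches 2.0.0; under Option Infinity it keeps matching until someone declares an explicit W.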
Question 2: Confirm that package names are opaque unique strings.
Question 3: Please raise any open issues now.
--mob