Packages/Registry initial spec

Isaac Schlueter

Aug 31, 2010, 11:50:52 PM
to comm...@googlegroups.com

Kris Kowal

Sep 1, 2010, 12:13:56 AM
to comm...@googlegroups.com
For purposes of quotation and dissection…

Packages/Registry

This specification describes a method for identifying package
descriptors by a combination of name, version, and registry base URL.
A package registry is considered a “CommonJS Compliant package registry”
if it implements the API described in this specification.
The “client” refers to a package.json-aware system that uses a
CommonJS Compliant package registry to locate packages.
Scope

This API only describes identifying and locating packages. It
specifically does not describe the mechanisms by which packages are
added to the registry. It is assumed that each registry may implement
different mechanisms for user authentication, package acceptance,
package removal, and so on.
Furthermore, this specification only dictates the behavior for the
URLs it describes. Any URL which is not a valid registry root url,
package root url, or package version url, is not defined by this
specification.
This API does not assume any particular implementation technology.
HTTP Request Method and Headers

HTTP GET is the only method required for consuming the data in a
package registry. Registries MAY use other methods for entering data
into the registry, authenticating user accounts, and so on, but that
is outside the scope of this specification.
Clients SHOULD send an Accept header of application/json. Behavior of
the registry in the presence of other Accept headers is undefined. For
instance, a Compliant registry MAY send an HTML document description
of a package if given the Accept value of text/html, or it MAY send
JSON in all cases.
Registries MUST send responses with an application/json Content-Type
if the client sends an Accept header value of application/json.
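As a rough illustration of the header rules above, a client could build its request along these lines. This is only a sketch using the Python standard library; the helper names `build_request` and `is_json_response` are made up for this example and do not come from the spec.

```python
import urllib.request


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the registry for JSON."""
    # No data argument is passed, so urllib treats this as a GET.
    return urllib.request.Request(url, headers={"Accept": "application/json"})


def is_json_response(content_type: str) -> bool:
    """True if a Content-Type value names application/json (parameters ignored)."""
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/json"


# Building the request does not touch the network.
req = build_request("http://registry.npmjs.org/")
print(req.get_method(), req.get_header("Accept"))
```

A client that sent this Accept value could then treat a non-JSON Content-Type as a sign that the registry is not compliant.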
HTTP Errors

In the case of errors, registries SHOULD still send any response body
in the form of valid JSON with an error member, and optionally any
other useful information. For instance:
HTTP/1.1 404 Object Not Found
Content-Length: 52

{"error":"not found","reason":"document not found"}
Registries MAY send back an empty response body, but any response body
that is sent MUST be valid JSON.
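A client-side reading of this rule might look like the sketch below. It assumes the client already has the raw response body as bytes; `parse_error_body` and `error_message` are hypothetical helper names. Since the "error" member is only a SHOULD, the sketch tolerates its absence.

```python
import json


def parse_error_body(body: bytes):
    """Return the decoded JSON value, or None for an empty body.

    Raises ValueError if a non-empty body is not valid JSON, since the
    text above says any body that is sent MUST be valid JSON.
    """
    if not body.strip():
        return None  # an empty body is explicitly allowed
    # json.JSONDecodeError is a subclass of ValueError.
    return json.loads(body.decode("utf-8"))


def error_message(parsed):
    """Return the "error" member if the body is an object that has one."""
    if isinstance(parsed, dict):
        return parsed.get("error")
    return None


parsed = parse_error_body(b'{"error":"not found","reason":"document not found"}')
print(error_message(parsed), "/", parsed["reason"])
```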
URLs

The following classes of urls are described:
registry root url
package root url
package version url
registry root url
Examples: http://registry.npmjs.org/ http://js.packag.es/registry/
The root URL is the base of the package registry. Given this url, a
name, and a version, a package can be uniquely identified, assuming it
exists in the registry.
When requested, the registry root URL SHOULD return a list of packages
in the registry in the form of a hash of package names to package root
descriptors.
The package root descriptor MUST be either: an Object that would be
valid for the “package root url” contents for the named package, or a
string URL that should be used as the package root url.
In the case of a string URL, it MAY refer to a different registry. In
that case, a request for {registry root url}/{package name} SHOULD be
EITHER a 301 or 302 redirect to the same URL as named in the string
value, OR a valid “package root url” response.
The following would be a valid response for the registry root url at
http://example.com/.
{ "foo" : "http://foo.com/package/versions/foo"
, "quux" : "http://example.com/quux"
, "asdf" : "http://example.com/-/third-party/asdf"
, "bar" :
  { "name" : "bar"
  , "maintainers" : [ { "name" : "isaacs", "email" : "i...@izs.me" } ]
  , "mtime": "2010-08-29T23:10:47Z"
  , "versions" : { "1.0.0" : "http://example.com/bar/1.0.0" }
  }
, "baz" :
  { "name" : "baz"
  , "versions" :
    { "1.0.0" :
      { "name" : "baz"
      , "version" : "1.0.0"
      , "main" : "./lib/baz.js"
      , "description" : "The bazziest!"
      , "dist" : { "tarball" : "http://example.com/-/baz-1.0.0.tgz" }
      }
    }
  }
}
The “foo” package is a redirect to another registry. The “quux”
package is served from the typical URL on this registry, but its details
are not inlined in the registry root response. The “asdf” package is
served from this registry, but at a different URL. The “bar” package lists
top-level information but supplies a URL for specific versions. The
“baz” package lists top-level information as well as package
descriptors for specific versions.
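To make the two descriptor shapes concrete, here is a small sketch of how a client might tell them apart after decoding the registry root response. The function name `classify` and the "url"/"object" labels are inventions for this example only.

```python
from typing import Dict, Union

# A package root descriptor is either a URL string or an inline object.
Descriptor = Union[str, dict]


def classify(registry_root: Dict[str, Descriptor]) -> Dict[str, str]:
    """Label each entry as a 'url' pointer or an inline 'object'."""
    kinds = {}
    for name, descriptor in registry_root.items():
        if isinstance(descriptor, str):
            kinds[name] = "url"  # fetch this URL to get the package root object
        elif isinstance(descriptor, dict):
            kinds[name] = "object"  # usable as a package root object directly
        else:
            raise ValueError(f"{name}: descriptor must be a string or an object")
    return kinds


root = {
    "foo": "http://foo.com/package/versions/foo",
    "bar": {"name": "bar", "versions": {"1.0.0": "http://example.com/bar/1.0.0"}},
}
print(classify(root))
```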
package root url
The package root url is the base URL where a client can get top-level
information about a package and all of the versions known to the
registry.
A valid “package root url” response MUST be returned when the client
requests {registry root url}/{package name}
Redirection
The package root URL SHOULD be found at {registry root url}/{package
name}. If the registry would rather proxy to a different URL for a
specific package, then it MUST respond with a 301 or 302 status code
and a Location header to the desired address. It MAY send a JSON
response body with a “location” field containing the intended
location. For example, requesting http://example.com/foo in the
previous example might return this:
HTTP/1.1 302 Found
Location: http://foo.com/package/versions/foo
Content-Type: application/json
Content-Length: 54

{ "location" : "http://foo.com/package/versions/foo" }
Redirection may also be used within the same host if the registry
wishes to serve package root information from a different location
than {registry root url}/{package name}.
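A client could extract the redirect target roughly as follows. This is a sketch, not a definitive implementation: `redirect_target` is a hypothetical helper, and falling back to the JSON body when the Location header is missing is an assumption of this example (the text above makes the header mandatory and the body optional).

```python
import json
from typing import Mapping, Optional


def redirect_target(status: int, headers: Mapping[str, str], body: bytes = b"") -> Optional[str]:
    """Return the redirect target for a 301/302 response, else None."""
    if status not in (301, 302):
        return None
    # Header names are case-insensitive, so normalise before looking up.
    lowered = {key.lower(): value for key, value in headers.items()}
    if "location" in lowered:
        return lowered["location"]
    # Assumption of this sketch: fall back to the optional JSON body.
    if body.strip():
        data = json.loads(body.decode("utf-8"))
        if isinstance(data, dict):
            return data.get("location")
    return None


print(redirect_target(302, {"Location": "http://foo.com/package/versions/foo"}))
```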
Package Root Object
The root object that describes all versions of a package MUST be a
JSON object with the following fields:
name: The name of the package. When both are decoded, this MUST match
the “package name” portion of the URL. That is, packages with
irregular characters in their names would be URL-Encoded in the
request, and JSON-encoded in the data. So, a request to
/%C3%A7%C2%A5%C3%A5%C3%B1%C3%AE%E2%88%82%C3%A9 would show a package
root object with “\u00e7\u00a5\u00e5\u00f1\u00ee\u2202\u00e9” as the
name, and would refer to the “ç¥åñî∂é” project.
versions: An object hash of version identifiers to valid “package
version url” responses: either URL strings or package descriptor
objects.
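The two encodings of the name can be checked against each other with a few lines of Python. This sketch uses only the standard library; `package_root_path` and `name_matches` are hypothetical names chosen for the example.

```python
import json
from urllib.parse import quote, unquote


def package_root_path(name: str) -> str:
    """URL-encode a package name as a single path segment (UTF-8 percent-encoding)."""
    # safe="" means even "/" would be escaped rather than splitting the path.
    return "/" + quote(name, safe="")


def name_matches(url_segment: str, json_text: str) -> bool:
    """Decode both forms and compare them, as the 'name' rule describes."""
    return unquote(url_segment) == json.loads(json_text)["name"]


path = package_root_path("ç¥åñî∂é")
doc = json.dumps({"name": "ç¥åñî∂é"})  # ensure_ascii is on by default, giving \uXXXX escapes
print(path)
print(doc)
print(name_matches(path[1:], doc))
```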
The following fields are optional, and thus MAY be present in the
package root response. If they are present, then they MUST have the
meanings described:
mtime: The last modified time of the package root object, expressed as
an ISO String.
ctime: The creation time of the package root object, expressed as an ISO String.
maintainers: An array of identifiers of package registry users who
maintain the package described.
repository: The repository where the project source is managed. This
SHOULD match the repository field on the most recently published
version of the package.
url: A URL where the package root object can be found. (This is
largely unnecessary in the case of requesting a package root object
directly, but can be helpful when abbreviated package root objects are
returned in the registry root response.)
description: A description of the project. This SHOULD match the
description field on the most recently published version of the
package.
package version url
The package version url is the base URL where a client can get package
descriptor information about a specific version of a package.
A valid “package version url” response MUST be returned when the
client requests {registry root url}/{package name}/{package version}
Redirection
The package version URL SHOULD be found at {registry root
url}/{package name}/{package version}. If the registry would rather
proxy to a different URL for a specific package version, then it MUST
respond with a 301 or 302 status code and a Location header to the
desired address. It MAY send a JSON response body with a “location”
field containing the intended location. For example, requesting
http://example.com/foo/1.0.0 in the previous example might return
this:
HTTP/1.1 302 Found
Location: http://foo.com/package/versions/foo/1.0.0
Content-Type: application/json
Content-Length: 60

{ "location" : "http://foo.com/package/versions/foo/1.0.0" }
Redirection may also be used within the same host if the registry
wishes to serve package version information from a different location
than {registry root url}/{package name}/{package version}.
Note: In this example, the “foo” package root was served on a
different registry. If that registry is also a CommonJS Compliant
Package Registry, then tacking /1.0.0 onto the package root URL would
be a valid package version url.
Package Version Object
The Package Version Object is almost identical to the Package
Descriptor object described in the CommonJS Packages specification.
For the purposes of the package registry, the following fields are
required. Note that some of these do not exist in the Packages
specification.
name: The package name. This MUST match the {package name} portion of the URL.
version: The package version. This MUST match the {package version}
portion of the URL.
dist: An object hash with urls of where the package archive can be
found. The key is the type of archive. At the moment the following
archive types are supported, but more may be added in the future:
tarball: A url to a gzipped tar archive containing a single folder
with the package contents (including the package.json file in the root
of said folder)
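The required fields could be checked on the client side along these lines. This is a sketch under assumptions: `check_version_object` is a hypothetical helper, the name and version arguments are taken to be already URL-decoded, and "dist" is only checked for being a non-empty hash of strings rather than for any particular archive type.

```python
def check_version_object(obj: dict, url_name: str, url_version: str) -> list:
    """Return a list of problems; an empty list means the object looks acceptable."""
    problems = []
    if obj.get("name") != url_name:
        problems.append("name does not match the URL")
    if obj.get("version") != url_version:
        problems.append("version does not match the URL")
    dist = obj.get("dist")
    if not isinstance(dist, dict) or not dist:
        problems.append("dist is missing or empty")
    elif not all(isinstance(url, str) for url in dist.values()):
        problems.append("dist values must be URL strings")
    return problems


# The "baz" 1.0.0 descriptor from the registry root example.
baz = {
    "name": "baz",
    "version": "1.0.0",
    "main": "./lib/baz.js",
    "description": "The bazziest!",
    "dist": {"tarball": "http://example.com/-/baz-1.0.0.tgz"},
}
print(check_version_object(baz, "baz", "1.0.0"))
```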
Changes to Packages Spec

Besides the addition of fields to the Package Version Object, this
addition to the Packages spec imposes the following restrictions on
the “name” and “version” fields:
MUST NOT start with “-”
MUST NOT contain any “/” characters
MUST NOT be “.” or “..”
SHOULD contain only URL-safe characters
This makes it simpler for package names and versions to map to a URL
scheme. Since they may not start with “-”, this makes it possible for
registry owners to “escape” from the package registry on the same
hostname.
For example, a package registry owner might wish to serve the tarball
for “foo” version 1.0.0 from http://example.com/foo/-/foo-1.0.0.tgz.
Since - is never a valid version, this will not be interpreted as a
request for package data. (See “Scope” above.)
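The restrictions above translate into a short check. In this sketch, `check_identifier` is a hypothetical helper; reading "URL-safe" as the RFC 3986 unreserved characters and rejecting the empty string are both assumptions of the example, since the text does not define either.

```python
import re

# Assumption: "URL-safe" means RFC 3986 unreserved characters.
_URL_SAFE = re.compile(r"[A-Za-z0-9._~-]+")


def check_identifier(value: str) -> tuple:
    """Return (ok, warnings) for a package name or version string."""
    # MUST NOT rules: leading "-", any "/", or being "." or "..".
    # Rejecting the empty string is an extra assumption of this sketch.
    if not value or value.startswith("-") or "/" in value or value in (".", ".."):
        return False, []
    # The SHOULD rule only produces a warning.
    warnings = [] if _URL_SAFE.fullmatch(value) else ["contains non-URL-safe characters"]
    return True, warnings


for candidate in ("foo", "1.0.0", "-", "a/b", "..", "ç¥åñî∂é"):
    print(candidate, check_identifier(candidate))
```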
Prior Art

npm Registry http://registry.npmjs.org/

Kris Zyp

Sep 4, 2010, 4:20:32 PM
to comm...@googlegroups.com, Isaac Schlueter
I think this looks great.

I only have a few small suggestions:
First, I think we should list "zip" in the type of archives. Yeah, I
prefer using tarballs too, but I know that whenever I put up a
distribution in both tarball and zip format, the zip gets way more
downloads. For better or for worse, the reality is that zip is extremely
popular. It should at least be included in the spec.
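To make the suggestion concrete: if "zip" were listed next to "tarball" in the "dist" hash, a client could pick whichever archive type it supports. The sketch below is only an illustration; `pick_archive`, the preference order, and the zip URL are hypothetical.

```python
def pick_archive(dist: dict, preferred=("tarball", "zip")) -> tuple:
    """Return (kind, url) for the first archive type the client supports."""
    for kind in preferred:
        url = dist.get(kind)
        if isinstance(url, str) and url:
            return kind, url
    raise ValueError("dist has none of the supported archive types")


dist = {
    "tarball": "http://example.com/-/baz-1.0.0.tgz",
    "zip": "http://example.com/-/baz-1.0.0.zip",  # hypothetical URL
}
print(pick_archive(dist))
print(pick_archive(dist, preferred=("zip",)))
```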

Second, I would try to be a little more RESTful, and give the repo
server more control over their own URL namespace with regards to
searching. Could we provide a way for the server to indicate how to
search for packages with a URI template? I realize that we can control
the package root url by giving the URL for each package root object
returned from the repository root url (which is very nice, a great
RESTful feature), but it does mean you have to download the entire list
of packages to be certain of the package root url for a given package
name. Maybe allow a root URL that is separate from the package listing
url so one could do something like (where a "package-by-name" relation
would override the default package root url setting in the spec):
"links":[
 {
   "rel":"package-by-name",
   "href":"repository/?name={name}"
 },
 {
   "rel":"fulltext-search",
   "href":"repository/?fulltext={keyword}"
 }
]
Maybe we aren't expecting repos to be so big that downloading the entire
list of packages is that bad, and this isn't worth it, not sure.

Thanks,
Kris


Irakli Gozalishvili

Sep 6, 2010, 9:00:42 AM
to comm...@googlegroups.com, Isaac Schlueter
Thanks Isaac, this looks good to me as well!

On Sat, Sep 4, 2010 at 22:20, Kris Zyp <kri...@gmail.com> wrote:
> I think this looks great.
>
> I only have a few small suggestions:
> First, I think we should list "zip" in the type of archives. Yeah, I
> prefer using tarballs too, but I know that whenever I put up a
> distribution in both tarball and zip format, the zip gets way more
> downloads. For better or for worse, the reality is that zip is extremely
> popular. It should at least be included in the spec.


I agree with Kris, especially since zip is the only type of archive used for bundling Firefox extensions.
 
> Second, I would try to be a little more RESTful, and give the repo
> server more control over their own URL namespace with regards to
> searching. Could we provide a way for the server to indicate how to
> search for packages with a URI template? I realize that we can control
> the package root url by giving the URL for each package root object
> returned from the repository root url (which is very nice, a great
> RESTful feature), but it does mean you have to download the entire list
> of packages to be certain of the package root url for a given package
> name. Maybe allow a root URL that is separate from the package listing
> url so one could do something like (where a "package-by-name" relation
> would override the default package root url setting in the spec):
> "links":[
>  {
>    "rel":"package-by-name",
>    "href":"repository/?name={name}"
>  },
>  {
>    "rel":"fulltext-search",
>    "href":"repository/?fulltext={keyword}"
>  }
> ]
> Maybe we aren't expecting repos to be so big that downloading the entire
> list of packages is that bad, and this isn't worth it, not sure.


This seems like a reasonable suggestion, but I think it can be omitted for now. We can come back to it once a registry's package list actually gets that large and we have some practical experience with the problem.
 


Kris Zyp

Sep 7, 2010, 11:55:47 AM
to comm...@googlegroups.com
One more comment, the specification does not indicate whether or not
relative URLs are allowed in the URL properties ("url", "registry",
etc). I would hope that relative URLs are allowed and are resolved
relative to the current document. It is certainly best practice to use
relative URLs for resources on the same server to make the data more
portable (can be replicated or moved to another server without having to
change the data).

--
Thanks,
Kris

Isaac Schlueter

Sep 7, 2010, 5:53:14 PM
to comm...@googlegroups.com
On Sat, Sep 4, 2010 at 13:20, Kris Zyp <kri...@gmail.com> wrote:
> First, I think we should list "zip" in the type of archives.

Sure, that's fine. How about this?

The "dist" hash MUST contain one or more of the following fields:

"tarball" : URL to a gzipped tar archive
"zip" : URL to a zip archive


> Second, I would try to be a little more RESTful, and give the repo
> server more control over their own URL namespace with regards to
> searching. Could we provide a way for the server to indicate how to
> search for packages with a URI template?

I'd prefer not to support URI templates. It adds surprising
complexity to the client for little to no gain.

If a server wants to have a package at somewhere other than /{name}
then they can set up a redirect. Likewise if they want to host a
version somewhere other than /{name}/{version}. These redirects can
even be to other servers. This is *more* flexible than a URI template
(since it can be directed to a different host), and is significantly
simpler for the easy case (just set it up the same way.)

For version 1 of this spec, I'd suggest we leave it as is, or even
remove the redirect stuff. When there's a compelling use case that is
only served by URI templates, then I'd suggest we do it then.


> but it does mean you have to download the entire list
> of packages to be certain of the package root url for a given package
> name.

No, you can just request /{name} or /{name}/{version}, and follow any
redirects you get.
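In code, that lookup might be sketched like this. The `fetch` callable is injected so the example runs without a network; it, the `resolve` helper, the hop limit, and the fake registry table are all assumptions of this sketch, not anything npm actually ships.

```python
from typing import Callable, Mapping, Tuple

# fetch(url) -> (status, headers, body). Injected so the sketch runs offline.
Fetch = Callable[[str], Tuple[int, Mapping[str, str], bytes]]


def resolve(url: str, fetch: Fetch, max_hops: int = 5) -> Tuple[str, bytes]:
    """Request url, following 301/302 redirects, and return (final_url, body)."""
    for _ in range(max_hops + 1):
        status, headers, body = fetch(url)
        if status in (301, 302):
            url = headers["Location"]  # the registry MUST send this header
            continue
        if status != 200:
            raise LookupError(f"{url}: HTTP {status}")
        return url, body
    raise RuntimeError("too many redirects")


# A fake registry standing in for real HTTP responses.
FAKE = {
    "http://example.com/foo": (302, {"Location": "http://foo.com/package/versions/foo"}, b""),
    "http://foo.com/package/versions/foo": (200, {}, b'{"name":"foo","versions":{}}'),
}


def fake_fetch(url):
    return FAKE.get(url, (404, {}, b'{"error":"not found"}'))


print(resolve("http://example.com/foo", fake_fetch))
```

No package listing is downloaded at any point; the client only ever asks for the one name it cares about.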


> Maybe we aren't expecting repos to be so big that downloading the entire
> list of packages is that bad, and this isn't worth it, not sure.

CPAN has 18328 modules, and the page that lists them all is very fast.
And that's HTML, which adds a lot more baggage than a JSON list of
the package names.

When anyone's js package registry hits 20k packages, let's revisit
this suggestion. ;)


On Mon, Sep 6, 2010 at 21:07, Kris Zyp <kri...@gmail.com> wrote:
> BTW, does NPM implement all of
> http://wiki.commonjs.org/wiki/Packages/Registry right now? That would be
> really cool if it did...

Yep. npm's registry at http://registry.npmjs.org/ is a "CommonJS
Compliant Package Registry" according to this spec. If you point your
npm "registry" config to any other registry that complies with this
spec, then it would be able to install packages just fine from it.

It also has a lot of other extra stuff to handle putting stuff INTO
the registry, which this spec specifically does not cover. I think
speccing that would just cut down innovation.


On Tue, Sep 7, 2010 at 08:55, Kris Zyp <kri...@gmail.com> wrote:
> I would hope that relative URLs are allowed and are resolved
> relative to the current document.

I have no problem with that. The only tricky thing is that /{name}/
and /{name} are not the same thing wrt relative URIs, and the spec
does not talk about the trailing slash. For that reason (and so that
they're clickable in JSONView) npm's registry lists the full absolute
URL.
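The trailing-slash difference is easy to see with Python's standard `urljoin`, which follows the usual relative-reference resolution rules. The example.com URLs are just placeholders.

```python
from urllib.parse import urljoin

# Same relative reference, two base URLs that differ only in the trailing slash.
without_slash = urljoin("http://example.com/foo", "1.0.0")
with_slash = urljoin("http://example.com/foo/", "1.0.0")

print(without_slash)  # http://example.com/1.0.0
print(with_slash)     # http://example.com/foo/1.0.0
```

Without the trailing slash the last path segment is replaced, so a relative version URL would resolve to the wrong place.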

--i

Kris Zyp

Sep 7, 2010, 5:56:29 PM
to comm...@googlegroups.com, Isaac Schlueter
Sounds good to me. Thanks for putting this together.
Thanks,
Kris


Mikeal Rogers

Sep 7, 2010, 6:06:47 PM
to comm...@googlegroups.com
On Tue, Sep 7, 2010 at 2:53 PM, Isaac Schlueter <i...@izs.me> wrote:
> On Sat, Sep 4, 2010 at 13:20, Kris Zyp <kri...@gmail.com> wrote:
> > First, I think we should list "zip" in the type of archives.
>
> Sure, that's fine. How about this?
>
> The "dist" hash MUST contain one or more of the following fields:
>
> "tarball" : URL to a gzipped tar archive
> "zip" : URL to a zip archive
>
> > Second, I would try to be a little more RESTful, and give the repo
> > server more control over their own URL namespace with regards to
> > searching. Could we provide a way for the server to indicate how to
> > search for packages with a URI template?
>
> I'd prefer not to support URI templates. It adds surprising
> complexity to the client for little to no gain.
>
> If a server wants to have a package at somewhere other than /{name}
> then they can set up a redirect. Likewise if they want to host a
> version somewhere other than /{name}/{version}. These redirects can
> even be to other servers. This is *more* flexible than a URI template
> (since it can be directed to a different host), and is significantly
> simpler for the easy case (just set it up the same way.)

Maybe just specify at the top of the spec that any resource MAY return a 307 temporary redirect. I wouldn't want to open up the can of worms that is permanent redirects + caching.

-Mikeal
 

Isaac Schlueter

Sep 10, 2010, 1:21:13 PM
to comm...@googlegroups.com
On Tue, Sep 7, 2010 at 15:06, Mikeal Rogers <mikeal...@gmail.com> wrote:
> Maybe just specify at the top of the spec that any resource MAY return a 307
> temporary redirect. I wouldn't want to open up the can of worms that is
> permanent redirects + caching.

307 is handled oddly by MSIE (and perhaps others). It's better to use
the more general "302 Found" (which MUST NOT be cached) if it's
temporary.

The spec just says that the registry can use 3xx status codes, but
doesn't specify which ones. I think it's fair to assume that if the
redirect is permanent (and cacheable) then a 301 could be used,
otherwise a 302.

That being said, it's a good point in general, and specs should err on
the side of being too specific, so I'll add it :)
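For what it's worth, a client following that reading could keep a tiny redirect cache like the sketch below. `RedirectCache` is a hypothetical class, and remembering 301 targets while never caching 302 targets is this sketch's interpretation of the paragraph above, not spec text.

```python
class RedirectCache:
    """Remember permanent (301) redirects; always re-check temporary (302) ones."""

    def __init__(self):
        self._permanent = {}

    def record(self, url: str, status: int, location: str) -> None:
        """Store the target only when the redirect was permanent."""
        if status == 301:
            self._permanent[url] = location

    def lookup(self, url: str) -> str:
        """Return the cached target, or the original URL if nothing is cached."""
        return self._permanent.get(url, url)


cache = RedirectCache()
cache.record("http://example.com/foo", 301, "http://foo.com/package/versions/foo")
cache.record("http://example.com/asdf", 302, "http://example.com/-/third-party/asdf")
print(cache.lookup("http://example.com/foo"))
print(cache.lookup("http://example.com/asdf"))
```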

--i
