http://wiki.commonjs.org/wiki/Packages/Mappings/C
Kris Kowal
From what I can tell, existing package managers use one of two methods:
1. The "lib" directory (whatever it may be called) of each package in the dependencies hash is added to the search paths. This essentially means that packages overwrite each other unless you cooperatively avoid it through a naming convention. This is how Tusk works, I believe.
2. The "lib" directory for the package is linked into the global library space used by the underlying engine to search for modules. This way you reference the package by using its name as the first term of your module - much like mappings makes possible but less flexible.
Since mappings is optional, would it be worth specifying what behavior the module runtime should use without it?
Sent from my iPad
Deliberately undefined. If packages are designed to this specification,
the only way to reference another package is through a mapping.
> From what I can tell, existing package managers use one of two methods:
>
> 1. The "lib" directory (whatever it may be called) of each package in the dependencies hash is added to the search paths. This essentially means that packages overwrite each other unless you cooperatively avoid it through a naming convention. This is how Tusk works, I believe.
To be precise, this is how Narwhal works; Tusk just downloads packages
and only uses package.json to compose a registry/catalog so that it
can download dependencies. This approach is really great for
Narwhal's engine system, but not so great for packages in the wild
because of the coordinated naming you mention, and also for
performance, since require's performance degrades as O(n), where "n" is
the number of installed packages. There are ways around this, but
they're hard and we haven't implemented them yet. The next version of
Narwhal will target the use of "mappings" exclusively for packages in
the wild. We will still support "overlaid" packages, but only for
engines and migration of legacy packages.
> 2. The "lib" directory for the package is linked into the global library space used by the underlying engine to search for modules. This way you reference the package by using its name as the first term of your module - much like mappings makes possible but less flexible.
It's my hope that these implementations will at least use this
technique to emulate mappings.
> Since mappings is optional, would it be worth specifying what behavior the module runtime should use without it?
I really think we should not. It's one less thing to argue about.
Kris Kowal
Again, to be precise, this is what I want to do; Tom and I have not
talked about this or made any plans. In any case, I plan to work on
supporting whatever specification works for everyone.
Kris Kowal
I am still wondering what would be recommended for branching to
platform-specific modules. Obviously mappings are inappropriate if you can only
use full URLs. Is putting require statements in if-branches likely to be
the best approach going forward?
--
Thanks,
Kris
Looks good. I have the following comments:
1) The "catalog" hint should be allowed at a per-mapping level as well
and not just globally.
2) Point 7) Indicate that the descriptor property is to be merged on top
of the package descriptor of the external package.
3) Indicate that a mapping may contain additional properties that are to
be ignored by package managers who do not support them.
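To make the three points concrete, a single mapping entry might carry them like this; every key below other than "mappings" itself is a hypothetical illustration, not current draft text:

```javascript
var descriptor = {
  "mappings": {
    "foo": {
      "location": "http://example.com/foo.zip",      // where to fetch the package (illustrative)
      "catalog": "http://example.com/catalog.json",  // point 1: per-mapping catalog hint
      "descriptor": { "main": "lib/index.js" },      // point 2: merged on top of foo's own package.json
      "x-mytool-cache": true                         // point 3: extension property others ignore
    }
  }
};
```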
Christoph
Sure.
> 2) Point 7) Indicate that the descriptor property is to be merged on top of
> the package descriptor of the external package.
I want to hear whether this would be a deal-breaker for other folks.
That's obviously the intent.
> 3) Indicate that a mapping may contain additional properties that are to be
> ignored by package managers who do not support them.
While I agree in principle, and such extensions are inevitable in
practice, there's no way to state this normatively or to frame it
in a test, and we should discourage people from using names that are
likely to be consolidated in a future version of the specification.
Perhaps it would be sufficient to add a note that various
implementations should use a per-implementation sub-object to avoid
name space collisions with other implementations and future versions
of the specification.
Kris Kowal
I would prefer that we note that implementations should use an implementation prefix for their keys instead of requiring a sub-object, because this is simpler to implement and supports both approaches anyway.
-C
http://wiki.commonjs.org/index.php?title=Packages/Mappings/C&diff=2832&oldid=2831
The prefix is a good idea. I am perfectly fine with that.
I think we should specify what the delimiter should be for *any* prefix
in package.json.
1) prefix_xxx
2) prefix-xxx
3) prefix.xxx
4) prefixXxx
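A hedged sketch of the two conventions under discussion; the property names are hypothetical, chosen only to show the shapes:

```javascript
// Option A: per-implementation prefix on top-level keys.
var prefixed = {
  "name": "foo",
  "npm-installHook": "scripts/install.js"  // hypothetical npm-specific key
};

// Option B: per-implementation sub-object.
var nested = {
  "name": "foo",
  "npm": { "installHook": "scripts/install.js" }
};
// Note the prefix form can emulate the sub-object form: a single
// prefixed key may itself hold an object of implementation settings.
```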
Christoph
Thanks,
Kris
On 7/7/2010 12:16 PM, Mikeal Rogers wrote:
> IMO mappings is significantly harder to understand than the rest of
> the Packages spec and at the moment could be a barrier to adoption of
> what is already in Packages.
>
> I'd like to get Packages/1.1 finalized (my bad, should have called for
> a vote already) and let more implementations pick it up and then see
> how much of a need they have for Mappings/C.
>
> What current Packages implementers are asking for Mappings/C?
I am (a package implementer) asking for it. I am also a package user
that needs to specify dependencies. Most of the packages I have
written have dependencies. Without being able to specify the
dependencies of a package, the package spec seems pretty useless for
most of my stuff (and the current "dependencies" property basically
doesn't get us anywhere). I am not that interested in a package spec
that continues to get us nowhere.
--
Thanks,
Kris
On 7/7/2010 12:51 PM, Mikeal Rogers wrote:
> Right, and you can use the current draft of Mappings/C because that is
> your preferred solution to solve this problem.
>
> I know that npm has gone a different route and doesn't want/need
> mappings and before we add mappings to packages which would endorse
> Mappings/C as the preferred solution to this problem I would like to
> see how others go about resolving this because clearly Mappings/C is
> not the only route.
Certainly, npm and others have mechanisms for dependencies, since deps
are crucial. You can take a look at their documentation if you want to
know how they work. But the whole point of these discussions and the
spec is so we can reach consensus and have an interoperable format
rather than incompatibility.
--
Thanks,
Kris
I'm planning to implement Mappings/C for Narwhal.
Kris Kowal
I am planning to migrate to Mappings/C for all my code with some custom
additions if needed. The packages spec without a way to define
dependencies is utterly useless for anything other than toying around.
It might be a bit early to ratify Mappings/C but I think we need a group
of people to move towards one spec.
I propose we treat Mappings/C as the *leading candidate* and encourage
implementers to adopt and refine the spec. If there is something you
don't like you need to speak up and propose changes.
As for wide adoption of Mappings/C. I think there are basically two
groups of users. One set sees mappings as useful and sufficient and the
other does not. I believe these stances are based on use-cases and past
experience with systems that treat packages as global/system packages
rather than package-local dependencies.
Maybe a good path to consensus would be to have a section on the wiki
that lists package manager implementations including an example of the
supported package.json formats that would allow us to contrast and
compare and come up with a unified property that can be used by all.
Christoph
In fact, I'd even suggest (as Mikeal has) dropping support for
directories.lib as a way to do the require("pkg/module") stuff. It
just opens up a lot of edge cases, unnecessarily increases the API
surface of *every* package, and makes mappings/c harder to implement.
I haven't weighed in on Mappings/C yet, because I haven't had a chance
to really thoroughly examine what it would take to implement in npm.
--i
I propose we treat Mappings/C as the *leading candidate* and encourage implementers to adopt and refine the spec. If there is something you don't like you need to speak up and propose changes.
I think we all agree that we need a way to define dependencies sooner
rather than later. If we cannot ratify it then I think we should
provide a direction.
To me this means we have a spec implementers can work towards knowing
that packages will begin to become available using the Mappings/C
dependency declaration semantics. This is required to fine-tune the spec
to be package manager agnostic and be able to cover all edge cases.
We may not have had input from all camps on the Mappings/C spec yet but
it represents the agreement of several implementers who have
implementations evolved enough to have challenged the relevant issues.
> Also, Mappings/C may be the leading *specification* but npm solves this
> problem a different way and is probably in wider use than the Mappings/C
> implementations.
To me this is not a valid argument. npm is one implementation for one
platform. Mappings/C is being discussed based on several implementations
by implementers who want to have interoperable packages across package
managers and platforms.
> I'd like to see how npm plays out and how future Packages
> implementations go about solving this problem before we consider further
> ratification of Mappings/C. As it stands Mappings/C is actually quite a
> lot to implement and has implications on how Modules would have to be
> implemented in order to support it.
I think we all share this sentiment. We all want one awesome spec but we
have to start somewhere and the nature of what we are specifying
requires a collection of packages to comply which takes time.
I would like to see the following:
* npm challenge Mappings/C in features, not structure (at least initially)
* finalize agreement on structure
* npm migrate to Mappings/C for dependency declarations
IMO there is no reason why npm could not follow Mappings/C and maintain
its current implementation. The spec is designed in a way to support
different ways of defining dependencies.
> I'll be implementing support for Packages/1.1 in CouchDB in the near
> future and I'm still not sure how I will go about solving this problem
> but the implications on Modules kind of scares me away from using
> Mappings/C.
I am looking forward to your feedback on Mappings/C once you have a
chance to challenge it by implementation. That is how Mappings/C has
been arrived at thus far.
Christoph
Ok. I've had some time to sit down and have a good long chat with the
spec. We didn't get along very well.
Here's a start.
======
Issue 1: "all modules in the given package"
> describes special behavior of the "require" function that is
> scoped to all modules in the given package,
npm is not a module loader. It is a package manager, strictly. All
that it does is create an environment that will make things work as
expected using node's built-in bare-bones require() implementation.
I can't hope to create shims for "all modules in the given package",
if we are going to continue to expect directories.lib to be part of
the exposed API of a package.
If we remove the directories.lib bit, and replace it instead with a
specific registry of exposed modules, then this is no longer as
problematic from my point of view. I believe that this is the right
way to go, even though Narwhal may have a long list of modules that it
wants to expose. Perhaps there are things we could do to mitigate
that, using globs or some such, and Narwhal is a minority edge case in
many ways anyway.
From what I've seen in npm's usage thus far, the vast majority of
packages are happy to export a single "main" module. Very many of
them are only a single file.
That should be a different thread. Suffice it to say here that support
in npm for Mappings/C as it stands today is blocked by that issue. We
can either fix that, or replace "all modules" with "the main module,
and any child modules it includes".
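A hypothetical explicit registry of exposed modules, replacing directories.lib, might look like this; the "modules" key is an illustration, not part of the current draft:

```javascript
var descriptor = {
  "name": "foo",
  "main": "lib/index.js",    // require("foo") resolves here
  "modules": {
    "utils": "lib/utils.js"  // require("foo/utils") resolves here
  }
  // Anything not listed (e.g. lib/private.js) is not public API, so a
  // package manager only needs to create shims for the listed entries
  // instead of recursing through directories.
};
```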
Also problematic is the "Modules" section, item 1.2.2.1, for obviously
similar reasons.
======
Issue 2: catalog
> The URL of the registry or catalog may be provided for
> informational purposes, but this specification does not
> impose any restrictions on the format or content of the
> registry.
I think specifying the catalog URL is problematic if we don't also
specify what a catalog looks like. Pointing to a json file is not
sufficient, and raises more questions than it answers.
npm's registry is very different from Narwhal's catalog.json, which is
quite different from pinf's distributed registry URLs. I'm very wary
about a spec that gives developers a tool that looks like a solution,
but is in fact just another incompatible thing.
I don't have a solution to this one, but I know it's an issue. npm's
registry is not ever going to have a single URL with the full package
information for all packages. (In fact, even the current "listAll"
view is probably going away or getting paginated at some point.)
========
Issue 3: "to be usable by all package managers..."
> To be usable by any package manager, a package must provide
> all of the recommended metadata.
That's a lot to ask of developers. I mean, seriously, *two* URLs for
*each* dependency? As well as the name and the version?
Consuming something should be *easy*. These 4 pieces of metadata
belong to the package being required, not in the thing doing the
requiring. If you write a program that depends on just *three*
libraries, and you shoot for full spec compliance, you're going to
spend about half your package.json bytes just outlining dependencies,
and that just feels super wrong.
I think this is the crux of my initial negative gut reaction to this
specification. We all want a solution that is interoperable
and works for all of us. The problem is that Mappings/C is not a
solution we can all share; it's just taking 4 different solutions and
putting them all in the same place, and asking developers to use all
4. That feels a bit more optimized for us than for our users, and
that's not right.
I realize that you asked me to object on features, and perhaps you
could argue that this is a structural question. However, it is
fundamental. This approach leads necessarily to a lot of data
duplication.
======
Let's back up and focus on the problems we're trying to solve, so that
we can do it in the way that is as easy as possible for the people
using it.
> To me this is not a valid argument. npm is one implementation for one
> platform.
Every package manager is one implementation for N platform(s). The
fact that N=1 in npm's case seems irrelevant to me.
> Mappings/C is being discussed based on several implementations by
> implementers who want to have interoperable packages across package managers
> and platforms.
I want that just as much as anyone else. I object to Mappings/C
because it is an unsatisfactory solution in general, not because I
believe it is just a bad solution for npm. I'd rather have a good
specification that serves all our use cases, even if it means a lot of
refactoring work for me, than something that is fundamentally
problematic, but easy to implement today.
--i
Christoph
Let's back up and focus on the problems we're trying to solve, so that
we can do it in the way that is as easy as possible for the people
using
All the packages I have written are designed such that the majority of
their modules should be exported and accessible from other packages. Of
course that does include some packages that only have one module :).
> That should be a different thread. Suffice it to say here that support
> in npm for Mappings/C as it stands today is blocked by that issue. We
> can either fix that, or replace "all modules" with "the main module,
> and any child modules it includes".
>
I'm curious how npm is able to work better if a list of exposed modules
is provided? If your list of exposed modules includes all the modules
would npm still work? How is including a list of exposed modules
different than reading that list from a directory listing?
Anyway, that being said/asked, I really don't mind including a list of
exported modules in my package.json.
Also, my hope was that the mappings design should be trivial to
implement in npm. It should behave the same as npm treats its
"dependencies" property, where npm can create symlinks to the target
packages, with the only difference being that the target package is
looked up by a fully qualified, unambiguous identifier (URI) rather
than an ambiguous, non-namespaced identifier. For example, these
should have the same effect:

    dependencies: {foo: "*"}

and this:

    mappings: {foo: "http://github.com/kriszyp/foo/zipball/master"}
I would not want the mappings proposal designed to introduce any
unnecessary burden on npm.
> Also problematic is the "Modules" section, item 1.2.2.1, for obviously
> similar reasons.
>
> ======
>
> Issue 2: catalog
>
>
>> The URL of the registry or catalog may be provided for
>> informational purposes, but this specification does not
>> impose any restrictions on the format or content of the
>> registry.
>>
> I think specifying the catalog URL is problematic if we don't also
> specify what a catalog looks like. Pointing to a json file is not
> sufficient, and raises more questions than it answers.
>
> npm's registry is very different from Narwhal's catalog.json, which is
> quite different from pinf's distributed registry URLs. I'm very wary
> about a spec that gives developers a tool that looks like a solution,
> but is in fact just another incompatible thing.
>
> I don't have a solution to this one, but I know it's an issue. npm's
> registry is not ever going to have a single URL with the full package
> information for all packages. (In fact, even the current "listAll"
> view is probably going away or getting paginated at some point.)
>
I agree.
> ========
>
> Issue 3: "to be usable by all package managers..."
>
>
>> To be usable by any package manager, a package must provide
>> all of the recommended metadata.
>>
> That's a lot to ask of developers. I mean, seriously, *two* urls for
> *each* dependency? As well as the name and the version?
>
> Consuming something should be *easy*. These 4 pieces of metadata
> belong to the package being required, not in the thing doing the
> requiring. If you write a program that depends on just *three*
> libraries, and you shoot for full spec compliance, you're going to
> spend about half your package.json bytes just outlining dependencies,
> and that just feels super wrong.
>
> I think this is the crux of my initial negative gut reaction to this
> specification. We all want a solution that is interoperable
> and works for all of us. The problem is that Mappings/C is not a
> solution we can all share; it's just taking 4 different solutions and
> putting them all in the same place, and asking developers to use all
> 4. That feels a bit more optimized for us than for our users, and
> that's not right.
>
I agree, this should be simple, I would hope most packages could provide
single name-value pairs for each dependency.
> I realize that you asked me to object on features, and perhaps you
> could argue that this is a structural question. However, it is
> fundamental. This approach leads necessarily to a lot of data
> duplication.
>
> ======
>
> Let's back up and focus on the problems we're trying to solve, so that
> we can do it in the way that is as easy as possible for the people
> using it.
>
Would it be helpful to make sure we have agreement on the requirements?
To me the biggest concern is providing unambiguous references to
packages. In a world of github branching and forking and multiple
package managers, a single namespace for package names is woefully
inadequate. The reason I initially proposed the mappings concept was the
problems I experienced with the single namespace of package names in
tusk and being unable to declare a dependency on my fork of jack
(which obviously wasn't the fork in the tusk catalog). I just don't want
to see us go down the road of failing to provide robust package
identification just because a single namespace works on a small scale.
--
Thanks,
Kris
I would be happy to iterate on the catalog functionality as an optional
extension as is being proposed. Thus it could be removed from Mappings/C.
> To me the biggest concern is providing unambiguous references to
> packages. In a world of github branching and forking and multiple
> package managers, a single namespace for package names is woefully
> inadequate. The reason I initially proposed the mappings concept was the
> problems I experienced with the single namespace of package names in
> tusk and being unable to declare a dependency on my fork of jack
> (which obviously wasn't the fork in the tusk catalog). I just don't want
> to see us go down the road of failing to provide robust package
> identification just because a single namespace works on a small scale.
+10
Christoph
Sorry, I might have been unclear before.
With the exception of the first 2 issues I mentioned in my initial
email, Mappings/C would be trivial to implement. I could just use the
name/version like I have been, and ignore the rest. But I don't think
it's a *good* solution, that's the problem. It's not one solution;
it's 4 solutions glommed onto a single object. I think we can do
better.
> I'm curious how npm is able to work better if a list of exposed modules
> is provided? If your list of exposed modules includes all the modules
> would npm still work? How is including a list of exposed modules
> different than reading that list from a directory listing?
1. It would make package authors *really think* about what they're
exposing, rather than just relying on everything being public by
default.
2. Many packages have a lot of internal stuff, and a single exported
module. (Qv fab, npm, connect, express.) If someone is doing
require("npm/utils/registry/adduser") then they're Doing It Wrong, and
it will break a lot. Hiding the internal modules is a desirable
feature for many packages.
So, in the common case, npm would only have to create one shim,
instead of dozens. It wouldn't have to recurse through
subdirectories, or any directories at all.
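In that common case the generated shim could be a single line; here is a sketch of a generator for it, with an illustrative (made-up) install path:

```javascript
// Hypothetical generator for the one-line shim a package manager would
// write when a package exposes only its "main" module: requiring the
// package name simply re-exports the installed package's main file.
function makeShim(installedMainPath) {
  return "module.exports = require(" + JSON.stringify(installedMainPath) + ");\n";
}

var shim = makeShim("/install/root/foo/1.0.0/lib/index.js"); // path illustrative
```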
> Would it be helpful to make sure we have agreement on the requirements?
Yes.
> To me the biggest concern is providing unambiguous references to
> packages. In a world of github branching and forking and multiple
> package managers, a single namespace for package names is woefully
> inadequate. The reason I initially proposed the mappings concept was the
> problems I experienced with the single namespace of package names in
> tusk and being unable to declare a dependency on my fork of jack
> (which obviously wasn't the fork in the tusk catalog). I just don't want
> to see us go down the road of failing to provide robust package
> identification just because a single namespace works on a small scale.
I definitely appreciate your concerns. For my purposes, it'd be good
to know how to acquire a package dependency even if it's not in npm's
registry, so we have somewhat overlapping goals.
I'm not so convinced that a single namespace is a big problem,
however. Couldn't you just name your jack fork something different?
I mean, it IS different, so why not NAME it differently?
> I agree, this should be simple, I would hope most packages could provide
> single name-value pairs for each dependency.
Maybe what we need for this is to keep the name:version-range
dependency hash that npm loves so well, and tack on a clear way to map
that to a registry or archive somehow, but without going the route of
providing registered name, require-as name, version, archive, and
source location. The dependency should ideally be the minimum data
footprint to uniquely specify what's required. Name:version-range
works great, provided you know which registry to query for that info,
and how to do so.
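A hedged sketch of that shape (compact name/range pairs, plus a separate registry hint rather than per-dependency URLs); the "registries" key is hypothetical:

```javascript
var descriptor = {
  "name": "myapp",
  "dependencies": {
    "jack": ">=0.2.0 <0.3.0"  // minimal data: just name and version range
  },
  "registries": [             // hypothetical key: where to resolve those names
    "http://registry.example.org/"
  ]
};
```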
I suspect that a registry API specification will be a tough thing to
settle on. I'm perfectly ok with making radical changes to npm's
registry in the name of interoperability, but there are issues I am
somewhat unwilling to bend on.
I'm definitely not ok with a single catalog.json that contains everything.
Also, I'm not thrilled about having to segregate which packages came
from which registries. I'd planned multiple registries to work
somewhat like apt-get's PPAs. That is, you can install "foo" from any
registry, but the system assumes that "f...@1.2.3" is an identity. A
registry that has a *different* thing with the same name and version
is considered a broken/corrupt registry. A collection of registries
should act like a peer distribution chain, not like different
independent sources that might disagree with one another.
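That identity rule could be sketched as follows; the record shapes are hypothetical:

```javascript
// Sketch: name@version is treated as a global identity across peer
// registries. Two records with the same name and version must describe
// the same artifact; if their contents differ, one registry is
// considered broken/corrupt.
function sameIdentity(a, b) {
  return a.name === b.name && a.version === b.version;
}
function consistent(a, b) {
  // Consistent unless they claim the same identity with different content.
  return !sameIdentity(a, b) || a.checksum === b.checksum;
}
```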
In the end, I want to be able to say that a package manager is not
compliant unless it supports ALL of the specified ways to express
dependencies, and I want to ideally give package authors a single
specific correct way to list their dependencies. Mappings/C is the
opposite of that.
--i
We don't agree on the requirements. About half of us consider it a
requirement to have a distributed name space. Another half consider
it a requirement to have a managed centralized naming system.
Somewhere in between, some are willing to have a single name space
where every name we choose is a roll of the dice, where names are
chosen not to be accurate but to be unlikely to collide. Which is to
say: a central name space that we pretend is distributed.
However, I recognize that this is an area where we can probably at
best agree to disagree. Mappings/C lets us agree to disagree, while
giving package maintainers the ability to support whomever they agree
with. That's a big win: taking the decision away from the mailing
list.
You're of course welcome to write a Packages proposal that documents
compatibility with NPM.
Kris Kowal
UUIDs are the solution for package identification. They require no central management (unlike DNS and other registry-based methods), they are constant over time (which DNS is not), and they imply no information other than identification (unlike DNS). Since all UUIDs are equally undesirable from a marketing standpoint, there is no need to judge fairness of distribution or pass that burden on to another agency (DNS, github, etc.).
-- Thanks, Kris
You've still got a central authority and a single namespace. It just
makes the names incompatible with human brains.
It's worth noting that hostnames are a single-level namespace managed
by a central authority. At some level, it always comes down to this.
I don't really see why relying on internic and DNS is significantly
better than relying on a registry that we all control. I think that
we could come up with a simple way to synchronize registries so that
it's not tied to a single server, but still uses name+version as a
global unique identifier.
If package name *isn't* a unique identifier, then you run into
problems anyway. For instance, if I install Jack from Narwhal's
registry, and then also install Kris Zyp's Jack fork, then what does
the "jackup" command do? In the installation space, there's only one
PATH.
--i
If you have a REST api for getting location info and metadata from a
UUID, then how is that any better than a name/version pair with a
registry?
With npm's registry, at least, "getting there first" is the rule.
It's like twitter. Once you claim the name, it's yours, unless the
registry maintainer (ie, me) deletes it for some reason. There's no
TOS. It's just a benevolent dictatorship. For a few thousand users
or less, that's fine. I sincerely hope to have to solve the "too many
projects" problem eventually.
I'm suggesting a ring of peer registries run by mutual agreement
between the registry owners. That is, a trust-network of
dictatorships with treaties between them and specified data-sharing
mechanisms. I have to build this for npm anyway, and I'd love to do
so in a way that's generally useful.
> But there are problems. Everyone
> loves to use jquery as an example, so what if I go and create a
> jquery2/jquery2 project over on github?
Then you're a jerk. Don't do that. :)
If you create a "fab2" project in npm, I'll delete it, because being a
jerk is not allowed.
> But for package.json, it seems like all that's required is to identify the
> package, hence a UUID should be sufficient.
I am really hesitant to suggest that package authors need to juggle
UUIDs. In every way that URLs are unsavory, UUIDs are worse by far.
> It's all very theoretical, though, since the infrastructure
> doesn't exist.
Of course. Same for what I'm saying.
--i
If you create a "fab2" project in npm, I'll delete it, because being a jerk is not allowed.
--
Kris Kowal
Agreed, UUIDs are not an improvement over URLs. While URLs are not
perfect, having some semantic meaning is better than none. URLs also
allow an easy way to map usable short names to a place to get the code
without needing another central authority.
> It's worth noting that hostnames are a single-level namespace managed
> by a central authority. At some level, it always comes down to this.
> I don't really see why relying on internic and DNS is significantly
> better than relying on a registry that we all control. I think that
> we could come up with a simple way to synchronize registries so that
> it's not tied to a single server, but still uses name+version as a
> global unique identifier.
But is a central registry that "we all control", any better than URLs?
If our own central registry is not better, I do not see the management
of that repository as worth the cost. From what I have heard
described, I see some downsides with our own central registry,
some mentioned below.
> If package name *isn't* a unique identifier, then you run into
> problems anyway. For instance, if I install Jack from Narwhal's
> registry, and then also install Kris Zyp's Jack fork, then what does
> the "jackup" command do? In the installation space, there's only one
> PATH.
Package names need not be unique across all packages -- they just need
to be unique within a package or an app to be useful. Using short
names for packages that have a mapping to URLs is very flexible and
useful for programming in the large. There will be times when you do
not want a central control involved with package resolution. Local
patches to a module that is used inside firewalls will not be unusual.
There are times when you do want to coordinate different packages
using the same dependent package too, but they are not the only case.
For your Jack question, I hope that for Kris Zyp's project that uses
Jack, he could specify the version of Jack he needs for his app, and
have it locally installed for his app without conflict with the global
area of packages.
The purpose of packages and modules is to make programming in the
large easier. Making the "simple" case of using packages easy at the
cost of more issues for programming in the large, or limiting its use,
seems to run counter to the purposes of having packages and modules.
The simple programming cases are already easy; see browser development
approaches.
One of the goals for module IDs is to allow mapping of where those
IDs are found on disk (a type of URL). Taking that same principle to
package identifiers seems a natural extension of that concept, and
keeps with the goal of making programming in the large possible with
discrete entities. While it is nice to have a catalog that can store
some mappings, it should not be the only way to get modules. Allowing
a package to be explicit with the mapping helps programming in the
large, and while it still has some backing in a central service (DNS),
it is still more decentralized than a central repository for package
names, which will use URLs at some point internally.
In summary, central repos that store some mappings are useful, but
they are not a full or robust solution. They add another dependency in
the chain that could go wrong. Packages should have the option to be
explicit about mappings.
James
--
But URLs aren't constant - everyone's got their stuff on github right now, but a couple of years from now there might be a new king of the hill. If the maintainers move their repository, then what? Will the mappings still work? Curious... Correct semantic meanings are surely better than none, but what about wrong semantic meanings?
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
There are a lot of things to complain about with CPAN or rubyforge or
pear. But "there is an authority that maps names to things" is not
something I'm aware of as a real problem. With apt-get's PPAs, you
can point at your own internal registry if you want to change the
meanings of the names. Different distributions can use different
registries with their own ports. That is, there could be a RingoJS
registry that pulls all the non-node-specific stuff from the npm
registry, and vice versa.
Expecting a name to mean one thing is basically saying that we, as a
community, expect a word to have a single meaning in a single context.
I think that's a good thing. A common vocabulary gels a community
and makes it work together. It builds interoperability and cohesion.
On Tue, Jul 13, 2010 at 12:33, Wes Garland <w...@page.ca> wrote:
> What about keeping a UUID -> URL mapping on something.commonjs.org?
So... Why not a name@version -> URL mapping on something.commonjs.org?
UUIDs combine the lookup overhead of names with the readability of URLs.
It solves the identity problem, but it makes every other problem
worse. That is the wrong move.
On Tue, Jul 13, 2010 at 11:40, Kris Kowal <kris....@cixar.com> wrote:
> I've been helping maintain the Narwhal
> catalog for a year or more. It's not a job anyone should have to do.
Yeah, that's because it's a manually maintained list of every module.
I did this for about a week in npm before realizing that I needed a
registry. (You may be able to dig up some forks of the now-deleted
npm-data repository. Turns out, "publish via pull request" sucks a
LOT for the owner.) It is a testament to your patience and resistance
to frustration that you have managed to deal with this for so long.
A hands-off web service that works with a tool is a whole different
story. With the exception of dealing with bugs in node's HTTPS client
and nginx's HTTPS server, it's been pretty easy to maintain. And
that's in the midst of making breaking changes to npm's API. We're up
to 121 package names (303 distinct versions), not counting my "testing
please ignore" package, and that number goes up without me having to
ever think about it. It's pretty awesome, actually.
For the first several months of npm's life, it used URLs. Specifying
a dependency was a matter of pointing to a URL of a tarball. You
installed stuff by pointing to a tarball. (The url-as-dependency
doesn't work in npm any more, but installing a tarball still does.)
So, when I say that users hate this, what I mean is, I hated
using a package manager that worked this way. It's awful. I kind of
just dealt with it, because I had to get the URL to a tarball somehow
anyway, and I grokked why it worked that way. But there's no way that
I could in good conscience ask anyone else to suffer with that.
By comparison, when someone publishes something for the first time,
and does "npm install blerg", and sees it work, it's like magic.
> At best, a catalog should be a convenience for
> collecting common mappings from short names (catalog url, name,
> version) to long names (url). That's provisioned for in Mappings/C.
So, you're saying that a catalog that maps a short name to a URL is a
stop-gap? I couldn't disagree more.
If we specify what a "catalog" is, and then have package.json supply
the name, version, and catalog, then why wouldn't that give everyone
what they need?
> We should strive to make programming less political.
I think that's out of scope for this exercise.
If there are 3 things named "Jack", then you have a political problem,
even if they're all pointing to different URLs.
--i
On 7/13/2010 10:53 AM, Isaac Schlueter wrote:
> If you have a REST api for getting location info and metadata from a
> UUID, then how is that any better than a name/version pair with a
> registry?
>
> You've still got a central authority and a single namespace. It just
> makes the names incompatible with human brains.
>
> It's worth noting that hostnames are a single-level namespace managed
> by a central authority. At some level, it always comes down to this.
> I don't really see why relying on internic and DNS is significantly
> better than relying on a registry that we all control.
Because Internic is a massively scaled, proven, and accepted central
authority. Why are we trying to reinvent this? Plus we don't have any
type of delegation of authority (the domain name is from a single
namespace, the host name and path names are delegated) with the naive
flat namespace of package repositories. We can continue to look at the
challenges and complexities of scaling our own simple namespace (even if
it has synced with other servers) in the face of modern social
development with forks, branches, and multiple package management tools
and repositories. We can try to invent new naming conventions for things
like forking and branching and play the mediator of who deserves which
names (there will certainly be some desirable names out there), but why
do this when we already have an enormously successful architecture to
build on (the web)? Trying to force a single namespace is a road that just
gets uglier the further down you go, and this is why the web wisely
chose to distribute authority as much as possible (only needing to
centrally manage domain names).
> I think that
> we could come up with a simple way to synchronize registries so that
> it's not tied to a single server, but still uses name+version as a
> global unique identifier.
>
I think this is another example of how the single namespace path
continues to lead to further need for invention.
> If package name *isn't* a unique identifier, then you run into
> problems anyway. For instance, if I install Jack from Narwhal's
> registry, and then also install Kris Zyp's Jack fork, then what does
> the "jackup" command do? In the installation space, there's only one
> PATH.
>
This is an issue for installation of scripts and is irrelevant to the
issue of referencing packages within JavaScript. Most packages don't
care about startup scripts. And people have certainly successfully
installed different Jack forks on the same machine; it's not that hard.
On 7/13/2010 2:10 PM, Isaac Schlueter wrote:
> I started npm because the common sentiment in the NodeJS community was
> that NodeJS needed something like CPAN/rubyforge/pear/etc. I'd had a
> lot of experience, both good and bad, dealing with package managers,
> and a lot of very strong feelings about how things should be done.
>
> There are a lot of things to complain about with CPAN or rubyforge or
> pear. But "there is an authority that maps names to things" is not
> something I'm aware of as a real problem. With apt-get's PPAs, you
> can point at your own internal registry if you want to change the
> meanings of the names. Different distributions can use different
> registries with their own ports. That is, there could be a RingoJS
> registry that pulls all the non-node-specific stuff from the npm
> registry, and vice versa.
>
These registries are coupled with much more centralized language
communities that grew up with old-school SVN repos, so of course they
have more centralized repositories. It is wrong to assume that what
fits for Perl or Ruby is
the right fit for the much more diverse web-oriented heavily-forking
community (FTW) of JavaScript or CommonJS.
> Expecting a name to mean one thing is basically saying that we, as a
> community, expect a word to have a single meaning in a single context.
> I think that's a good thing. A common vocabulary gels a community
> and makes it work together. It builds interoperability and cohesion.
I agree with this sentiment. I don't want to have to use the term
"CommonJS 1.0 ratified module" within this community when everybody
understands what I mean when I simply say "module", even though "module"
can have different meanings in different contexts. This is exactly why
we have aliases, AKA mappings. It is critical that we understand the
correct order here. The term "module" isn't the ultimate identifier for
what we mean when we say "module", it is the alias we use in this
community as shorthand for the more explicit "CommonJS 1.0 ratified
module" (or whatever). If you have this backwards you end up playing the
game of trying to disambiguate poorly identified entities. Aliasing is a
much simpler problem to solve.
> I'm not so convinced that a single namespace is a big problem,
> however. Couldn't you just name your jack fork something different?
> I mean, it IS different, so why not NAME it differently?
So if I fork to fix a single bug, it is something different? And it
needs an entirely new name? That sounds horrible to me. And if we lack
aliasing, that means I have to use the different package name throughout
my code. And when the bug fix gets pulled into the main branch and I
want to point back to the main fork, I'd have to rewrite all my code
(instead of just changing package.json). This type of tight coupling is
another example of the failure of a single namespace. Sorry if I am
beating a dead horse here...
--
Thanks,
Kris
Question for any DNS experts out there:
It seems like a UUID to URL mapping could be done using DNS by storing the URL in a TXT record for a specially formatted domain name. So in DNS, this domain name:
d0eed110-8f4d-11df-a4ee-0800200c9a66.packages.commonjs.org
--
If we define how a package registry works, then a package author can
do something like this:
{ "dependencies" : { "foo" : "1.0.2" }
, "registry" : "http://registry.npmjs.org" }
Then the dependency names can be anything that the registry supports,
and the trust is between the package author and the registry
maintainer.
Everything about how the registry gets its data, where it comes from,
etc., could all be kept under the hood. npm's registry can be a
repository of static tarballs. Another repo might map package names
to git repositories, and versions to git tags. As long as the public
API surface is consistent, all our PMs could use any of our
registries.
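For concreteness, the kind of API surface this implies might look something like the sketch below; the URL scheme and field names are assumptions for discussion, not a ratified spec:

```javascript
// Hypothetical registry-API sketch. A package manager turns a
// (registry, name, version) triple into a metadata URL, fetches the
// JSON document there, and reads a tarball (or git, etc.) location
// out of it. All names and the URL layout here are illustrative.
function registryUrl(registry, name, version) {
  return registry + '/' + encodeURIComponent(name) + '/' + version;
}

// e.g. registryUrl('http://registry.npmjs.org', 'foo', '1.0.2')
//   -> 'http://registry.npmjs.org/foo/1.0.2'
//
// and the JSON document at that URL might carry something like:
//   { "name": "foo", "version": "1.0.2",
//     "dist": { "tarball": "http://.../foo-1.0.2.tgz" } }
```

How the registry produces that document (static tarballs, git tags, whatever) stays under the hood, which is the point.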
Then npm can assume that all the registries in use are compatible (and
die if something clashes), and others who feel strongly that a code's
identity should be registry+name+version can do it that way if they
choose. Then we can stop fighting over the spec, and go back to just
trash talking each others' programs ;) (Good-natured ribbing only, of
course.)
Is that any less flexible or reliable than using URLs directly? It's
more work for us as package manager authors, but it's SO much easier
for package authors, which is what really matters.
I would love to start the discussion about specifying the package
registry API. npm's is working, but I've run into a few problems with
it, and it'd be great to fix those problems in a way that increases
interoperability rather than decreases it.
Does this sound like a valid direction to pursue?
--i
--