tusk catalog and package installation path

Christoph Dorn

Sep 28, 2009, 2:59:10 PM

to narw...@googlegroups.com

I have completed an initial comprehensive API for catalog management and
package installation for tusk. At this time the API will allow you to do
all sorts of things that may or may not be desirable. While implementing
this and applying it to my use-cases I have learned a few things
(discussion following) that may help us narrow down on a decent
implementation.

I was hoping to get agreement on a few concepts before I write unit
tests and simplify, refactor, javascriptize and lock down the API and
internals.

Overall my design goal has been to provide a way of managing package
dependencies via tusk that is simple yet flexible enough to meet the
requirements of distributed open source development.

I am seeking feedback on the following:

NOTE: The implementation described below is not 100% compatible with my
current tusk-catalog branch.

------------------------------------------------------
All package dependencies are resolved through catalogs
------------------------------------------------------

Assumptions:

* There will be many independently developed packages
* Many different packages will have the same name
* Packages will come from many sources
* Many package authors want the ability to distribute their packages
* Different forms (source, binary) of a package must be addressable
* Dependency resolution and installation must be completely automatic
* A user must be able to override a dependency

Solution:

Resolve dependencies via URLs such as:

tusk://<catalog>/<packageName>

Where <catalog> is the name of a system-wide catalog containing a list
of packages addressable via <packageName>.

Furthermore the name of a catalog <catalog> must be of the form:

localhost.* for local non-published catalogs
TLD.* (top level domain) for published catalogs

This allows us to enforce unique catalog names (based on package manager
implementation) such as:

tusk://com.github.tlrobinson/*
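
A minimal sketch of the naming rule above, assuming a catalog name is a
dot-separated list of simple segments. The function name
classifyCatalogName is hypothetical and not part of tusk, and the sketch
does not check the first segment against a real list of top level
domains.

```javascript
// Hypothetical helper, not part of tusk: classifies a catalog name as
// local ("localhost.*") or published ("<tld>.*").
function classifyCatalogName(name) {
    var segments = name.split(".");
    // Require at least two non-empty segments made of letters, digits or "-".
    var valid = segments.length >= 2 && segments.every(function (segment) {
        return /^[a-z0-9-]+$/i.test(segment);
    });
    if (!valid) {
        throw new Error("Invalid catalog name: " + name);
    }
    return segments[0] === "localhost" ? "local" : "published";
}

console.log(classifyCatalogName("localhost.bar"));          // local
console.log(classifyCatalogName("com.github.tlrobinson"));  // published
```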

The name of a catalog is defined in its catalog.json ~

{
"name": "com.github.tlrobinson"
}

Allowing the addition of a catalog to a tusk installation via:

tusk catalog add http://github.com/tlrobinson/narwhal/\
raw/master/catalog.json

Which is downloaded and stored as:

com.github.tlrobinson.catalog.json

And contains the following additional info:

{
"origin": {
"url": "http://github.com/tlrobinson/narwhal/\
raw/master/catalog.json"
},
"revision": "783ACDC2E7CF45EA4901F624811F2ABF"
}

Where:

* origin.url - is the URL of where the catalog was downloaded from
* revision - is a hash of the JSON of all "packages" in the catalog
(this is used to check for updates)
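
The update check could be sketched as follows. The post does not say
which hash function is used; MD5 is assumed here only because it yields
32 hex digits like the example revision. The function names
catalogRevision and hasUpdate are hypothetical. Note that
JSON.stringify is sensitive to key order, so a real implementation
would probably need a canonical serialization.

```javascript
var crypto = require("crypto");

// Assumption: the revision is an MD5 digest of the serialized "packages"
// member of the catalog. The original post does not name the hash.
function catalogRevision(catalog) {
    var json = JSON.stringify(catalog.packages || {});
    return crypto.createHash("md5").update(json, "utf8").digest("hex").toUpperCase();
}

// A stored catalog is out of date when its recorded revision differs from
// the revision computed for the freshly downloaded catalog.
function hasUpdate(storedCatalog, downloadedCatalog) {
    return storedCatalog.revision !== catalogRevision(downloadedCatalog);
}
```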

You can now define a dependency in foo/package.json with:

{
"dependencies": [
["the-template", "tusk://com.github.tlrobinson/template/latest"]
]
}

Where the "tusk://com.github.tlrobinson/template/latest" package will
have an alias of "the-template" for all modules in our foo package.

The "/latest" specifies the revision of the package either as a keyword
with special meaning or some arbitrary version string.
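
For illustration, a dependency URL of this shape can be split into its
parts as sketched below. parseTuskUrl is a hypothetical name, and
falling back to "latest" when the revision is omitted is an assumption,
not something the proposal specifies.

```javascript
// Hypothetical parser for tusk://<catalog>/<packageName>[/<revision>].
function parseTuskUrl(url) {
    var match = /^tusk:\/\/([^\/]+)\/([^\/]+)(?:\/(.+))?$/.exec(url);
    if (!match) {
        throw new Error("Not a tusk URL: " + url);
    }
    return {
        catalog: match[1],
        name: match[2],
        // Assumption: a missing revision means "latest".
        revision: match[3] || "latest"
    };
}

// A dependency entry is an [alias, url] pair as in foo/package.json.
function parseDependency(entry) {
    var parsed = parseTuskUrl(entry[1]);
    parsed.alias = entry[0];
    return parsed;
}
```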

Now suppose you are working on the "foo" package and realize you need to
make improvements to the "tusk://com.github.tlrobinson/template" package,
i.e. you need the source checkout (a clone of your github fork) instead
of the published package. You can override the dependency with:

git clone <url> ./template
tusk catalog create com.github.cadorn
tusk package link --catalog com.github.cadorn ./template
tusk catalog overlay --catalog com.github.tlrobinson com.github.cadorn
tusk package install -f foo

These steps overlay your "com.github.cadorn" catalog on top of
"com.github.tlrobinson" for your "planet" (your system-wide tusk
install) causing all package searches in "com.github.tlrobinson" to
check your catalog first. Since you linked the package into your catalog
the package will be linked to the source during the installation (if
/latest was used as the dependency revision).
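
The lookup order described above might be sketched like this. The
"planet" object layout (a catalogs map plus an overlays map) and the
function name findPackage are assumptions made for the example and do
not reflect the actual tusk-catalog data structures.

```javascript
// Hypothetical lookup: overlay catalogs are searched before the catalog
// they overlay, so a linked source checkout wins over the published package.
function findPackage(planet, catalogName, packageName) {
    var order = (planet.overlays[catalogName] || []).concat([catalogName]);
    for (var i = 0; i < order.length; i++) {
        var catalog = planet.catalogs[order[i]];
        if (catalog && Object.prototype.hasOwnProperty.call(catalog.packages, packageName)) {
            return { catalog: order[i], descriptor: catalog.packages[packageName] };
        }
    }
    return null;
}

var planet = {
    catalogs: {
        "com.github.tlrobinson": { packages: { template: { type: "published" } } },
        "com.github.cadorn": { packages: { template: { type: "link", path: "./template" } } }
    },
    overlays: { "com.github.tlrobinson": ["com.github.cadorn"] }
};

console.log(findPackage(planet, "com.github.tlrobinson", "template").catalog);
// com.github.cadorn
```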

------------------------------------------------------
Package installation paths & loading
------------------------------------------------------

In order to avoid collisions during package install, packages must be
namespaced with the catalog and revision.

This makes most sense in the context of a "sea". For every
project/application you work on you create a sea:

tusk sea create --name foo ./foo

Which creates foo/package.json ~

{
"name": "foo"
}

To switch to the sea you can run:

tusk sea switch foo

Which calls foo/bin/sea to activate your virtual environment that tusk
respects.

Adding the dependency from above to foo/package.json:

{
"dependencies": [
["the-template", "tusk://com.github.tlrobinson/template/latest"]
]
}

And running:

tusk package install -f foo

(-f is needed because the package is already installed in the sea, since
it is the sea)

will install all dependencies. This will cause
"tusk://com.github.tlrobinson/template/latest" to be installed at:

foo/packages/dependencies/com.github.tlrobinson/template/latest

If the "template" package defines its own dependencies they will be
installed into the same foo/packages/dependencies namespace.
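
As a sketch, the install location follows directly from the catalog,
package name and revision. installPath is a hypothetical helper; only
the directory layout is taken from the description above.

```javascript
var path = require("path");

// Hypothetical helper: namespaces a dependency by catalog and revision
// underneath <sea>/packages/dependencies.
function installPath(seaRoot, dependency) {
    return path.posix.join(
        seaRoot, "packages", "dependencies",
        dependency.catalog, dependency.name, dependency.revision
    );
}

console.log(installPath("foo", {
    catalog: "com.github.tlrobinson",
    name: "template",
    revision: "latest"
}));
// foo/packages/dependencies/com.github.tlrobinson/template/latest
```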

Any packages in foo/packages/dependencies are of no real interest to you
other than being able to review the source for debugging. If you need to
work on a dependency you overlay the catalog as outlined above in which
case you have a second sea available to work on that package
independently and in its own environment.

The package loader knows about dependencies and is able to map the
dependent packages to the aliases defined in package.json and create an
appropriate list of load search paths for modules.
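
One way to picture what the loader derives from package.json is
sketched below. buildLoaderConfig is a hypothetical name, and the
assumption that each dependency keeps its modules in a "lib" directory
is not stated in the proposal.

```javascript
// Hypothetical sketch: map each dependency alias to the directory its
// modules are loaded from, and keep the same directories as an ordered
// search path (in dependency declaration order).
function buildLoaderConfig(seaRoot, packageJson) {
    var aliases = {};
    var searchPaths = [];
    (packageJson.dependencies || []).forEach(function (entry) {
        var alias = entry[0];
        var parts = /^tusk:\/\/([^\/]+)\/([^\/]+)\/(.+)$/.exec(entry[1]);
        if (!parts) {
            throw new Error("Unsupported dependency URL: " + entry[1]);
        }
        // Assumption: modules live under "lib" inside the installed package.
        var libDir = [seaRoot, "packages", "dependencies",
                      parts[1], parts[2], parts[3], "lib"].join("/");
        aliases[alias] = libDir;
        searchPaths.push(libDir);
    });
    return { aliases: aliases, searchPaths: searchPaths };
}
```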

------------------------------------------------------
Go forward
------------------------------------------------------

There are several scenarios I have worked through but not addressed in
the above explanations and I am sure we will run into others that will
require further work, however we need to find agreement on the above
fundamentals for me to continue in an effective manner.

Please let me know what you think about:

1) All package dependencies are resolved through catalogs
2) Package installation paths & loading

Christoph

Kris Kowal

Sep 28, 2009, 8:58:03 PM

to narw...@googlegroups.com

I, unfortunately, do not have time at the moment to reply in full, but
I'd like to add to the list of requirements. Some of these are long
term vision requirements, not expectations of this stage of work.

On Mon, Sep 28, 2009 at 11:59 AM, Christoph Dorn
<christ...@christophdorn.com> wrote:
> Assumptions:
>
> * There will be many independently developed packages
> * Many different packages will have the same name
> * Packages will come from many sources
> * Many package authors want the ability to distribute their packages
> * Different forms (source, binary) of a package must be addressable
> * Dependency resolution and installation must be completely automatic
> * A user must be able to override a dependency

* Downloading a package and placing it in the "packages/" directory
must be sufficient to install a package.
* Parameters for different forms of a package must be addressable,
including source/binary, architecture, engine, os, and others. The
parameter space must be extensible.
* It must be possible to "reheat" a "frozen" "sea", that is, to
reconstruct a packages/ directory with the exact same versions of all
packages in the transitive dependencies of another sea, by their
"universal" name and "version", albeit targeting a different
source/binary, architecture, engine, os, or other parameter.
* Packages and their transitive dependencies should be downloaded,
built, and tested in a single transaction in a "staging sea".
* It should be possible to build from source in a staging sea and
install into a "lean" sea, to minimize the file count for a system
like Google App Engine.

Very long term, as it will require loader support:

* To address the universal name space problem, we should use a system
like what Ihab is working on in CommonJS, where individual packages
can explicate a miniature catalog in their package.json, which would
be used only to resolve special URI style references to modules in
those packages, from modules within the parent package.

I think Kevin might be able to offer some more requirements for the
long-term vision. I'll get back to a detailed analysis of this
proposal soon.

Kris Kowal

Sep 28, 2009, 9:01:33 PM

to narw...@googlegroups.com

Oh,

* It should be possible for a sea to contain multiple versions of a
package, perhaps in a versions/{name}/{version} directory tree, with
packages/{name} containing, or containing a link to, the "active"
package for a particular sea. So, a "sea" might inherit packages from
the other seas in the system.prefixes array, but the activated
packages in the earliest prefixes have precedence over those that come
later.
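
The precedence rule could be sketched as below. The shape of the
prefix objects (a path plus an "active" map of name to version) and the
function name resolveActive are assumptions for illustration; only the
versions/{name}/{version} layout and the "earliest prefix wins" rule
come from the text.

```javascript
// Hypothetical sketch: walk the prefixes in order and return the first
// sea that has an activated version of the named package.
function resolveActive(prefixes, name) {
    for (var i = 0; i < prefixes.length; i++) {
        var version = prefixes[i].active[name];
        if (version !== undefined) {
            return {
                prefix: prefixes[i].path,
                version: version,
                directory: prefixes[i].path + "/versions/" + name + "/" + version
            };
        }
    }
    return null;
}

var prefixes = [
    { path: "/seas/app", active: { template: "1.9" } },
    { path: "/seas/base", active: { template: "1.2", jack: "0.1" } }
];

console.log(resolveActive(prefixes, "template").directory);
// /seas/app/versions/template/1.9
```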

Kris Kowal

Christoph Dorn

Sep 28, 2009, 9:48:37 PM

to narw...@googlegroups.com

I do not see a problem fitting these requirements into what I have done
so far.

You have hit on a critical separation of responsibility I completely
agree with:

> * Downloading a package and placing it in the "packages/" directory
> must be sufficient to install a package.

The catalog work is focused on where to get packages from in an
automated installation workflow. I believe it should be just as easy to
"install" a customized development environment for a package/application
as it typically is to deploy the package/application to a production system.

The level of indirection I have introduced by means of catalogs meets
many of my requirements. It is important for us to know if this will
sufficiently support and not limit or hinder any current or foreseeable
use-cases.

That is, what design changes will be required to merge this into master
and provide a foundation we are happy to push and expand on?

It would be great to get Tom's and Ihab's feedback to see if their ideas
fit with what I have outlined.

Thanks for the initial feedback.

Christoph

Kris Kowal

Sep 29, 2009, 2:47:24 AM

to narw...@googlegroups.com

Do you have a stance on whether you would like multiple versions of a
package with one name to be available in the same sandbox?

Kris Kowal

Andy

Sep 29, 2009, 12:05:22 PM

to Narwhal and Jack

On Sep 28, 5:58 pm, Kris Kowal <cowbertvon...@gmail.com> wrote:
> I, unfortunately, do not have time at the moment to reply in full, but
> I'd like to add to the list of requirements. Some of these are long
> term vision requirements, not expectations of this stage of work.
>
> On Mon, Sep 28, 2009 at 11:59 AM, Christoph Dorn
> <christoph...@christophdorn.com> wrote:
> > Assumptions:
>
> > * There will be many independently developed packages
> > * Many different packages will have the same name
> > * Packages will come from many sources
> > * Many package authors want the ability to distribute their packages
> > * Different forms (source, binary) of a package must be addressable
> > * Dependency resolution and installation must be completely automatic
> > * A user must be able to override a dependency
>
> * Downloading a package and placing it in the "packages/" directory
> must be sufficient to install a package.

+1

My biggest pet peeve with a lot of systems is that you run this opaque
tool and they do all sorts of magic for you. Then the magic doesn't
work and you're stuck debugging it. Or even worse, you have a rarely-
used program that is silently broken for months. So I prefer to
manage things myself so I know what's going on.

I can't tell from the original post if this was mentioned, but I would
add:

* Multiple collections of packages which are completely isolated
should be easily constructable. There are a lot of reasons for this,
but one is testing compatibility across many versions of the same
package.

Basically you should NOT need what "virtualenv" does in Python.
Python unfortunately follows the Unix convention in sticking
everything in global system dirs, and so this rather elaborate
workaround is needed.

http://pypi.python.org/pypi/virtualenv

Andy

Christoph Dorn

Sep 29, 2009, 12:57:45 PM

to narw...@googlegroups.com

> Do you have a stance on whether you would like multiple versions of a
> package with one name to be available in the same sandbox?

I think that is an important feature. With what I have done it can be
accomplished with:

sea/packages/foo/package.json ~

{
"dependencies": [
["template-old", "tusk://catalog/template/1.2"],
["template-new", "tusk://catalog/template/1.9"]
]
}

sea/packages/bar/package.json ~

{
"dependencies": [
["template-new", "tusk://catalog/template/latest"]
]
}

Where dependencies will be installed at:

sea/packages/dependencies/catalog/template/1.2
sea/packages/dependencies/catalog/template/1.9
sea/packages/dependencies/catalog/template/latest

And package modules can be loaded with:

/* from sea/packages/foo/lib/module.js */

require("template-old", "parser");
require("template-new", "parser");

/* from sea/packages/bar/lib/module.js */

require("template-new", "parser");
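
To illustrate how the same alias can point at different revisions in
different packages, here is a rough sketch of resolving a two-argument
require. resolveModule is a hypothetical function, and the "lib"
directory and ".js" extension are assumptions.

```javascript
// Hypothetical resolution of require(alias, moduleId) relative to the
// package.json of the package that contains the calling module.
function resolveModule(packageJson, alias, moduleId) {
    var dependencies = packageJson.dependencies || [];
    for (var i = 0; i < dependencies.length; i++) {
        if (dependencies[i][0] === alias) {
            var location = dependencies[i][1].replace(/^tusk:\/\//, "");
            // Assumption: modules live under lib/ and end in .js.
            return "sea/packages/dependencies/" + location + "/lib/" + moduleId + ".js";
        }
    }
    throw new Error('Unknown package alias "' + alias + '"');
}

var fooPackage = { dependencies: [
    ["template-old", "tusk://catalog/template/1.2"],
    ["template-new", "tusk://catalog/template/1.9"]
] };
var barPackage = { dependencies: [
    ["template-new", "tusk://catalog/template/latest"]
] };

console.log(resolveModule(fooPackage, "template-new", "parser"));
// sea/packages/dependencies/catalog/template/1.9/lib/parser.js
console.log(resolveModule(barPackage, "template-new", "parser"));
// sea/packages/dependencies/catalog/template/latest/lib/parser.js
```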

Christoph

Christoph Dorn

Sep 29, 2009, 1:06:48 PM

to narw...@googlegroups.com

>> * Downloading a package and placing it in the "packages/" directory
>> must be sufficient to install a package.
>
> +1
>
> My biggest pet peeve with a lot of systems is that you run this opaque
> tool and they do all sorts of magic for you. Then the magic doesn't
> work and you're stuck debugging it. Or even worse, you have a rarely-
> used program that is silently broken for months. So I prefer to
> manage things myself so I know what's going on.
>
> I can't tell from the original post if this was mentioned, but I would
> add:
>
> * Multiple collections of packages which are completely isolated
> should be easily constructable. There are a lot of reasons for this,
> but one is testing compatibility across many versions of the same
> package.

I am in complete agreement. The need for a package manager should be
completely optional and the primary purposes of a package manager are to:

* Fetch packages/dependencies and "place" (i.e. copy or link) them
into the packages path
* Automate the building and compiling of packages after they have
been placed in the packages path using the same tools that can be run
manually from the command line. (i.e. make, ant, ...)

Christoph

Christoph Dorn

Sep 29, 2009, 1:54:25 PM

to narw...@googlegroups.com

> And package modules can be loaded with:
>
> /* from sea/packages/foo/lib/module.js */
>
> require("template-old", "parser");
> require("template-new", "parser");
>
> /* from sea/packages/bar/lib/module.js */
>
> require("template-new", "parser");

If you do not want to use a "package" require you can omit the package
and just load the module in which case you will get the first module on
the search path (which is constructed according to the order of the
dependencies).
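
The fallback could look roughly like this. findOnSearchPath is a
hypothetical name, the ".js" extension is an assumption, and the
"exists" callback stands in for whatever file system check the loader
actually performs.

```javascript
// Hypothetical sketch: return the first candidate file that exists when
// walking the search path in dependency order, or null if none does.
function findOnSearchPath(searchPaths, moduleId, exists) {
    for (var i = 0; i < searchPaths.length; i++) {
        var candidate = searchPaths[i] + "/" + moduleId + ".js";
        if (exists(candidate)) {
            return candidate;
        }
    }
    return null;
}

// Usage with an in-memory stand-in for the file system.
var files = {
    "sea/packages/dependencies/catalog/template/1.2/lib/parser.js": true,
    "sea/packages/dependencies/catalog/template/1.9/lib/parser.js": true
};
var searchPaths = [
    "sea/packages/dependencies/catalog/template/1.2/lib",
    "sea/packages/dependencies/catalog/template/1.9/lib"
];
console.log(findOnSearchPath(searchPaths, "parser", function (file) {
    return files[file] === true;
}));
// sea/packages/dependencies/catalog/template/1.2/lib/parser.js
```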

Christoph

Kevin Dangoor

Sep 29, 2009, 2:17:40 PM

to narw...@googlegroups.com

On Mon, Sep 28, 2009 at 11:59 AM, Christoph Dorn <christ...@christophdorn.com> wrote:

> I have completed an initial comprehensive API for catalog management and
> package installation for tusk. At this time the API will allow you to do
> all sorts of things that may or may not be desirable. While implementing
> this and applying it to my use-cases I have learned a few things
> (discussion following) that may help us narrow down on a decent
> implementation.

This looks good to me on first read, adding in Kris' comments.

I have also found myself wanting to "tusk install ../foo" or something like that to install a package that I have sitting on my local disk. (Put the bin files in the right place, make the libraries available...) Imagine, for a moment, that you checked out Jack and want to work from source. It would be nice to not manually create the symlinks. It would also be nice to be able to later install a proper package when you're ready to move on from source.

Kevin

--
Kevin Dangoor

work: http://labs.mozilla.com/
email: k...@blazingthings.com
blog: http://www.BlueSkyOnMars.com

Christoph Dorn

Sep 29, 2009, 3:04:38 PM

to narw...@googlegroups.com

> This looks good to me on first read, adding in Kris' comments.
>
> I have also found myself wanting to "tusk install ../foo" or something
> like that to install a package that I have sitting on my local disk.
> (Put the bin files in the right place, make the libraries available...)
> Imagine, for a moment, that you checked out Jack and want to work from
> source. It would be nice to not manually create the symlinks. It would
> also be nice to be able to later install a proper package when you're
> ready to move on from source.

tusk install ../foo

That is still supported. The new syntax is:

tusk package install ../foo

Internally it gets converted to:

tusk package install file://../foo

which first triggers:

tusk package add file://../foo

which pulls the "name" out of file://../foo/package.json and writes an
entry into sea/catalog.json (if it does not already exist).

Finally the package is installed into the sea with:

tusk package install "name"

which checks the sea/catalog.json for the source location and installs
the package.
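
The "add" step above might be sketched as follows. The names toFileUrl
and addToSeaCatalog, and the shape of the catalog entry (a single
"location" field), are assumptions for illustration rather than the
actual tusk-catalog format.

```javascript
// Hypothetical: turn a bare path argument into a file:// locator and
// leave anything that already has a scheme untouched.
function toFileUrl(argument) {
    return /^[a-z][a-z0-9+.-]*:\/\//i.test(argument) ? argument : "file://" + argument;
}

// Hypothetical: record the package under its "name" in the sea catalog,
// but only if no entry exists yet. Returns the name to install by.
function addToSeaCatalog(seaCatalog, packageJson, location) {
    var name = packageJson.name;
    if (!name) {
        throw new Error("package.json does not define a name");
    }
    if (!Object.prototype.hasOwnProperty.call(seaCatalog.packages, name)) {
        seaCatalog.packages[name] = { location: location };
    }
    return name;
}

var seaCatalog = { name: "localhost.bar", packages: {} };
var name = addToSeaCatalog(seaCatalog, { name: "foo" }, toFileUrl("../foo"));
console.log(name, seaCatalog.packages[name].location);
// foo file://../foo
```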

If you want to symlink the package into the sea you would run the
following instead:

tusk package link file://../foo
tusk package install "name"

The sea/catalog.json is intended to track arbitrary packages you end up
including in your sea until you have made up your mind on how these will
be distributed. This allows you to do the following:

tusk catalog add sea/catalog.json

Which adds a planet catalog (system wide) with the name of "name" taken
from sea/catalog.json (i.e. if you ran 'tusk sea create --name bar' the
catalog name would be 'localhost.bar').

Once the catalog is added to the planet you can use package "foo" as a
dependency in another package with:

tusk://localhost.bar/foo/<revision>

Where <revision> can typically only have a value of "latest".

Christoph

Kris Kowal

Sep 30, 2009, 3:13:52 AM

to narw...@googlegroups.com

On Tue, Sep 29, 2009 at 9:57 AM, Christoph Dorn
<christ...@christophdorn.com> wrote:
>
>> Do you have a stance on whether you would like multiple versions of a
>> package with one name to be available in the same sandbox?
>
> I think that is an important feature. With what I have done it can be
> accomplished with:

> require("template-old", "parser");

> require("template-new", "parser");

This kind of extension will require coordination with the CommonJS
list and a lot of care.

However, extending the common package with singleton modules uniquely
identified by top-ids requires no coordination with CommonJS, except
perhaps to help design the package layout and metadata schema. You
might want to break your work into two milestones; the addition of
"packages" to the "common" package with the abstraction of catalogs
within catalogs, and the eventual addition of per-package catalogs
with explicit package identifiers in require calls. I can heartily
expedite the integration of the former, but the latter I would be
hesitant to merge until there's something similar to consensus on
CommonJS. That is not to discourage an experimental branch; I think
we'll need some exercise with your approach to help us drive the
discussion in CommonJS.

And, I still need to make a more thorough analysis of your
proposal/implementation. I've integrated your branch as my "tusk"
branch. Could you post here your vision for the tusk command tree?
The implementation appears to be in transition.

Kris Kowal

Christoph Dorn

Sep 30, 2009, 4:54:22 PM

to narw...@googlegroups.com

>>> Do you have a stance on whether you would like multiple versions of a
>>> package with one name to be available in the same sandbox?
>> I think that is an important feature. With what I have done it can be
>> accomplished with:
>
>> require("template-old", "parser");
>> require("template-new", "parser");
>
> This kind of extension will require coordination with the CommonJS
> list and a lot of care.

Right. I am participating in the discussions. The syntax and how this is
going to work is not set in stone yet.

> However, extending the common package with singleton modules uniquely
> identified by top-ids requires no coordination with CommonJS, except
> perhaps to help design the package layout and metadata schema. You
> might want to break your work into two milestones; the addition of
> "packages" to the "common" package with the abstraction of catalogs
> within catalogs, and the eventual addition of per-package catalogs
> with explicit package identifiers in require calls. I can heartily
> expedite the integration of the former, but the latter I would be
> hesitant to merge until there's something similar to consensus on
> CommonJS. That is not to discourage an experimental branch; I think
> we'll need some exercise with your approach to help us drive the
> discussion in CommonJS.

I am implementing and documenting on my tusk-catalog branch as the
CommonJS discussions evolve. So far everything I have done seems to be
backwards compatible which is an important design goal for me.

I am not following what you mean with "the addition of "packages" to the
"common" package with the abstraction of catalogs within catalogs".

> And, I still need to make a more thorough analysis of your
> proposal/implementation. I've integrated your branch as my "tusk"
> branch. Could you post here your vision for the tusk command tree?
> The implementation appears to be in transition.

I have not removed the old commands yet and only mapped "tusk install"
to the new "tusk package install".

The commands I have added are documented here [1].

As mentioned the current state of implementation is not 100% consistent
as I have not written any unit tests to cover command combinations I
have not reviewed in a while.

Would it help if I put together a screencast to show a typical workflow
with tusk-catalog?

Christoph

[1] - http://github.com/cadorn/narwhal/blob/tusk-catalog/docs/tusk.md
