At Thu, 30 Apr 2015 03:11:05 -0700 (PDT), Thomas Lynch wrote:
>
> I suspect there is an opportunity to tighten up the use of these terms.
I'm enthusiastic about this effort to improve terminology! For now,
I'll try to offer clarifications on the current intent --- although I
don't expect to get even that completely right here.
> To start off with, the only concrete proposal in this post, does this look
> useful/correct? :
>
> A module is a programmer defined grouping of syntax objects which
> defines bindings and access to those bindings. A module may be
> defined implicitly by putting the syntax belonging to the module
> together in a file. The name of the file is then taken as the name of
> the module. Modules may also be defined explicitly using the
> @racket[module] form.
Ok. I'd use the word "declared" instead of "defined" for modules.
> I gather there is no way to list modules? One might wish for something
> like raco module show or some such. Using 'ls -R' on a directory gets the
> files, which is close. Also, no way to list the modules and their
> dependencies? Or perhaps all this is in DrRacket, and this geiser just
> didn't see it ..
It's possible to enumerate collection-based module paths by inspecting
`current-library-collection-paths`, etc., but there's no tool for that
so far, I think.
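As a rough sketch of that kind of enumeration, the following lists the top-level collection directories reachable through `current-library-collection-paths`. The function name `top-level-collection-names` is made up for illustration, and the sketch deliberately ignores collections that are registered through links files (the "etc." above), so it will miss most package-installed collections.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical helper: names of the directories found directly inside
;; each directory listed in `current-library-collection-paths`.
;; Collections registered via links files are NOT covered here.
(define (top-level-collection-names)
  (remove-duplicates
   (sort
    (for*/list ([dir (in-list (current-library-collection-paths))]
                #:when (directory-exists? dir)
                [entry (in-list (directory-list dir))]
                #:when (directory-exists? (build-path dir entry)))
      (path->string entry))
    string<?)))

;; Print one collection name per line.
(for ([name (in-list (top-level-collection-names))])
  (displayln name))
```

This only walks one level deep, so subcollections such as "racket/gui" do not show up as separate entries.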
> The module defines syntax groupings within the context of racket, which is
> a little different than a file within a file system (which is place to put
> bits). It would be good to understand the term 'collection' in a parallel
> manner, though corresponding to a subdirectory tree rather than a file.
>
> The section organizing modules sets the stage for this. So I gather that
> within that discussion it would make sense to interject at some point that
> the organization of modules being described is called a *collection*.
Yes.
> Just as the file system may be used as a container for collections (via the
> vehicle of subdirectory trees) -- is it the case that a package may also be
> called a container for one or more collections? I suppose a package would
> have some additional information? Now a collection is a container for
> modules, so I wonder, is not a collection of collections really just a
> bigger collection? Then the distinction between collection and package is
> one of context. A collection is functionally defined from the point of
> view of the language, indeed implied by the syntax - in particular that of
> the path syntax in require; while a package is a utility used for
> transferring and installing collections.
Yes, mostly.
One detail is that multiple packages can have different modules that
are in the same collection. For example, the "gui-lib" package includes
modules in the "racket" collection, specifically the "racket/gui"
subcollection. So, when we say that a package provides a collection,
it's a shorthand for a more precise statement: a package provides a set
of modules that are organized via collections.
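One way to see this split is to ask which installed package supplies the file behind a given module path. The sketch below assumes that `path->pkg` from `pkg/path` and `resolve-module-path` from `syntax/modresolve` behave as documented; the helper name `module-path->package-name` is made up for illustration, and the results depend on what is installed locally.

```racket
#lang racket/base
(require pkg/path syntax/modresolve)

;; Hypothetical helper: the name of the installed package that contains
;; the file a module path resolves to, or #f if no package claims it.
(define (module-path->package-name mod-path)
  (define resolved (resolve-module-path mod-path #f))
  (and (path? resolved)
       (path->pkg resolved)))

;; Two modules in the same "racket" collection; they need not come
;; from the same package.
(module-path->package-name 'racket/gui/base)
(module-path->package-name 'racket/list)
```

If the two calls report different answers, that is the situation described above: one collection, with modules supplied from more than one place.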
> In the context of racket, what does 'library' mean?
I use "library" to refer to a module --- particularly one that is meant
to be accessed via `require`. Other modules I refer to as "languages" ---
the ones meant to be referenced via `#lang`. I think "library" and
"language" together cover all "module"s.
See also 4.2.1 in the Scribble manual (which is probably not the right
place for that information, but no one has moved it to the overall
Racket style guide, so far.)
> I daresay it seems that
> in some places modules are called libraries, and in other places
> collections are called libraries, and in yet other places the term library
> seems to indicate that something is installed in a standard place.
Can you point to a place where "library" refers to a collection? That
sounds like a mistake to me.
> We also
> see the term 'installed' used. Particularly to distinguish between quoted
> and unquoted paths in require statements. However, all software is
> installed, rather this term seems to mean that there is a process for
> binding modules to the unquoted path names in require - this process is
> then called 'installing'.
That's the intent, although you're getting to an area where the
terminology is less precise in my mind.
> So trying to put this together. In racket then, there is a unique special
> place called "The Library". One may place collections in The Library, by
> installing them via a package manager command. Then "library modules" (i.e.,
> modules found within a collection within The Library) may be accessed via
> unquoted path names embedded in require syntax. We then nix the terms
> 'installed module' and instead use 'library module'. *Installing* is then
> a process for copying collections from packages and putting them in The
> Library. It would make sense to point at a package and say that it has
> been installed -- this would mean that the contained collections are in
> "The Library".
I like the idea of having a name for this concept, but since I (intend
to) consistently use "library" as a subset of "module", I'm not
enthusiastic about calling it "The Library".
> Are there other processes for putting a module into The
> Library?
There's `raco link`, and the PLTCOLLECTS environment variable can point
to additional collection-containing directories. Those other methods
are not common or encouraged, though.
> Now by this nomenclature every racket installation would have a standard
> library that comes with racket version x, and then installation specific
> additions to that library added by the administrator later. So now we
> have 'standard library' specific to a racket version, and 'local library'
> specific to an installation. Is this accurate? However, in this doc
> though the term library is used, we are never told about The Library, a
> distribution specific 'standard' library, or a local library. Would this
> be summarily rejected, or if done nicely, be useful?
I'm not enthusiastic about the "standard" part of "standard library".
We use the phrase "main distribution" to refer to the set of packages
(and therefore modules, as organized into collections) that you get
from the current download at
download.racket-lang.org.
> The argument of *require* is a series of identifiers separated by slashes
> with special meaning whether it is in quotes or not. The last identifier
> indicates a module by name. The first identifier is the collection name.
> The string as a whole is called the collection relative module pathname ??
> is that correct? And the string except for the last identifier (except
> for the module name) is called ?? It is analogous to a directory, but
> directory is a file system terminology. We might call it a *module path*,
> distinguished from a *module pathname*, Or is this too subtle. Perhaps
> "relative module location with respect to the collection top" for the
> prefix to the module name?
A reference to a module is a module path. I agree that the usual forms
of reference are "collection-based module paths".
I don't think there are good terms for pieces of various module-path
forms, and it's trickier than distinguishing pieces before and after a
"/". For example, the module path `racket` refers to the "main.rkt"
module within the "racket" collection.
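A small sketch of that resolution, assuming `resolve-module-path` from `syntax/modresolve` is available; the helper name `module-file-name` is made up for illustration:

```racket
#lang racket/base
(require syntax/modresolve)

;; Hypothetical helper: the file name (last path element) that a
;; collection-based module path resolves to.
(define (module-file-name mod-path)
  (define resolved (resolve-module-path mod-path #f))
  (define-values (base name must-be-dir?) (split-path resolved))
  (path->string name))

(module-file-name 'racket)       ; "main.rkt"
(module-file-name 'racket/list)  ; "list.rkt"
```

The module path `racket` has no "/" at all, yet it still names a file inside a collection, which is why splitting at the last "/" does not give a reliable "collection part" and "module part".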
> Can collections contain collections? My sense is no. However, if there
> can be collections in collections then there is a top collection, and
> sub-collections, etc. Then listing collections in The Library only gives
> the top level collections. We also then have to wonder where in the
> collection tree installation puts things when we install a package.
Collections can contain collections. We do sometimes call the contained
ones "subcollections".
> Though the term library is used, in the current nomenclature it refers to
> installed packages, but an installed package is a package we can point at
> and say 'hey that was installed' rather than a directory we can point at
> and say 'hey that is a library of collections'. These two things live in
> different places. Indeed I'm likely to delete an installed package because
> it takes up disk space and all I need is the library. Or is it the case,
> that a package is not like a conventional zip file, but that the racket
> uses it even after installation?
You're right that there's a useful distinction between "package" as an
implementation that you can install and "package" as something that is
already installed. I tend to use "package implementation" for the
former and "installed package" for the latter, but I doubt that we have
consistent terminology here.
When you install a package by referring to a ".zip" file, the installed
package does not refer back to the ".zip" file. The same is true if you
refer to a directory and use `--copy`, but the default behavior of
referring to a directory is to link it directly as an installed package.
> The catalog apparently lists paths to
> collections?
I think the terminology here is relatively well defined (see chapter 2
of "Package Management in Racket"): a catalog is a mapping from package
names to package sources, where a package source is a reference to a
package implementation.
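To make the shape of that mapping concrete, here is a toy model only. It is not how the package system stores catalogs; the package names, URLs, and the names `toy-catalog` and `catalog-lookup` are all invented for illustration.

```racket
#lang racket/base

;; Toy catalog: package name -> package source.
;; Every entry here is made up; a real catalog is served by a catalog
;; server or stored in a directory or database.
(define toy-catalog
  (hash "example-lib" "https://example.com/example-lib.zip"
        "example-doc" "https://example.com/example-doc.zip"))

;; Look up the package source for a package name, or #f if the
;; catalog has no entry for it.
(define (catalog-lookup catalog pkg-name)
  (hash-ref catalog pkg-name #f))

(catalog-lookup toy-catalog "example-lib")
(catalog-lookup toy-catalog "no-such-package")
```

Note that nothing in the mapping mentions collections or module paths: the catalog only says where a package implementation can be found.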
> Now we have a catalog of collections, is this the same as a library?
Hopefully it's clear from the above that I think this is a misuse of
"catalog" and "library".
> So we
> don't have a special place where things go as part of an installation
> process, but rather the installation process inserts a file pathname into
> a catalog and this locates the collection. But what distinguishes a
> collection from a package at that point. I could have a zipped
> collection, that I unzip, then install.
I think you're trying to get at the idea of the mapping from module
paths to the filesystem --- as in "The Library", where we don't have a
name for it, and it would be nice to have one.