
PEP 382: Namespace Packages


"Martin v. Löwis"

unread,
Apr 2, 2009, 11:32:02 AM4/2/09
to Python-Dev, Python List
I propose the following PEP for inclusion to Python 3.1.
Please comment.

Regards,
Martin

Abstract
========

Namespace packages are a mechanism for splitting a single Python
package across multiple directories on disk. In current Python
versions, an algorithm to compute the package's __path__ must be
formulated. With the enhancement proposed here, the import machinery
itself will construct the list of directories that make up the
package.

Terminology
===========

Within this PEP, the term package refers to Python packages as defined
by Python's import statement. The term distribution refers to
separately installable sets of Python modules as stored in the Python
package index, and installed by distutils or setuptools. The term
vendor package refers to groups of files installed by an operating
system's packaging mechanism (e.g. Debian or Redhat packages install
on Linux systems).

The term portion refers to a set of files in a single directory (possibly
stored in a zip file) that contribute to a namespace package.

Namespace packages today
========================

Python currently provides pkgutil.extend_path to mark a package as a
namespace package. The recommended way of using it is to put::

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

in the package's ``__init__.py``. Every distribution needs to provide
the same contents in its ``__init__.py``, so that extend_path is
invoked independently of which portion of the package gets imported
first. As a consequence, the package's ``__init__.py`` cannot
practically define any names, since which portion is imported first
depends on the order of the package fragments on sys.path. As a
special feature, extend_path reads files named ``*.pkg``, which allow
additional portions to be declared.

setuptools provides a similar function pkg_resources.declare_namespace
that is used in the form::

import pkg_resources
pkg_resources.declare_namespace(__name__)

In the portion's __init__.py, no assignment to __path__ is necessary,
as declare_namespace modifies the package's __path__ through sys.modules.
As a special feature, declare_namespace also supports zip files, and
registers the package name internally so that future additions to sys.path
by setuptools can properly add additional portions to each package.

setuptools allows declaring namespace packages in a distribution's
setup.py, so that distribution developers don't need to put the
magic __path__ modification into __init__.py themselves.
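For illustration, such a declaration in a distribution's setup.py might look like the following sketch; the distribution and package names are made up, and ``namespace_packages`` is the setuptools keyword being referred to:

```python
# Hypothetical setup.py for one distribution contributing a portion of
# the namespace package "example"; all names here are illustrative.
from setuptools import setup, find_packages

setup(
    name='example.portion1',
    version='1.0',
    packages=find_packages(),
    # Tells setuptools to emit the __path__-extension magic at install
    # time, so the author never writes it into __init__.py by hand.
    namespace_packages=['example'],
)
```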

Rationale
=========

The current imperative approach to namespace packages has led to
multiple slightly-incompatible mechanisms for providing namespace
packages. For example, pkgutil supports ``*.pkg`` files; setuptools
doesn't. Likewise, setuptools supports inspecting zip files, and
supports adding portions to its _namespace_packages variable, whereas
pkgutil doesn't.

In addition, the current approach causes problems for system vendors.
Vendor packages typically must not provide overlapping files, and an
attempt to install a vendor package that has a file already on disk
will fail or cause unpredictable behavior. As vendors might choose to
package distributions such that all portions of a namespace package
end up in a single directory, the portions would then provide
conflicting __init__.py files.

Specification
=============

Rather than using an imperative mechanism for importing packages, a
declarative approach is proposed here, as an extension to the existing
``*.pkg`` mechanism.

The import statement is extended so that it directly considers ``*.pkg``
files during import; a directory is considered a package if it either
contains a file named __init__.py, or a file whose name ends with
".pkg".

In addition, the format of the ``*.pkg`` file is extended: a line with
the single character ``*`` indicates that the entire sys.path will
be searched for portions of the namespace package at the time the
namespace package is imported.

Importing a package will immediately compute the package's __path__;
the ``*.pkg`` files are not considered anymore after the initial import.
If a ``*.pkg`` file contains an asterisk, this asterisk is prepended
to the package's __path__ to indicate that the package is a namespace
package (and that thus further extensions to sys.path might also
want to extend __path__). At most one such asterisk gets prepended
to the path.
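As an illustration of the proposed mechanism (all directory and file names here are invented), two separately installed portions of a namespace package might look like this:

```
path1/
    example/
        portion1.pkg        (contains the single line "*")
        module1.py
path2/
    example/
        portion2.pkg        (contains the single line "*")
        module2.py

After "import example", the package's __path__ would be
['*', 'path1/example', 'path2/example'], so both
"import example.module1" and "import example.module2" succeed.
```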

extend_path will be extended to recognize namespace packages according
to this PEP, and avoid adding directories twice to __path__.

No other change to the importing mechanism is made; the search for
modules (including __init__.py) will continue to stop at the first
module encountered.

Discussion
==========

With the addition of ``*.pkg`` files to the import mechanism, portions
no longer need to fill out the namespace package's __init__.py.
As a consequence, extend_path and declare_namespace become obsolete.

It is recommended that distributions put a file <distribution>.pkg
into their namespace packages, with a single asterisk. This allows
vendor packages to install multiple portions of a namespace package
into a single directory, with no risk of overlapping files.

Namespace packages can start providing non-trivial __init__.py
implementations; to do so, it is recommended that a single distribution
provides a portion with just the namespace package's __init__.py
(and potentially other modules that belong to the namespace package
proper).

The mechanism is mostly compatible with the existing namespace
mechanisms. extend_path will be adjusted to this specification;
any other mechanism might cause portions to get added twice to
__path__.

Copyright
=========

This document has been placed in the public domain.

P.J. Eby

unread,
Apr 2, 2009, 1:14:42 PM4/2/09
to Martin v. Löwis, Python-Dev, Python List
At 10:32 AM 4/2/2009 -0500, Martin v. Löwis wrote:
>I propose the following PEP for inclusion to Python 3.1.
>Please comment.

An excellent idea. One thing I am not 100% clear on, is how to get
additions to sys.path to work correctly with this. Currently, when
pkg_resources adds a new egg to sys.path, it uses its existing
registry of namespace packages in order to locate which packages need
__path__ fixups. It seems under this proposal that it would have to
scan sys.modules for objects with __path__ attributes that are lists
that begin with a '*', instead... which is a bit troubling because
sys.modules doesn't always only contain module objects. Many major
frameworks place lazy module objects, module proxies, or wrappers of
various sorts in there, so scanning through it arbitrarily is not
really a good idea.

Perhaps we could add something like a sys.namespace_packages that
would be updated by this mechanism? Then, pkg_resources could check
both that and its internal registry to be both backward and forward compatible.

Apart from that, this mechanism sounds great! I only wish there was
a way to backport it all the way to 2.3 so I could drop the messy
bits from setuptools.

Carl Banks

unread,
Apr 2, 2009, 2:38:50 PM4/2/09
to
On Apr 2, 8:32 am, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> I propose the following PEP for inclusion to Python 3.1.
> Please comment.
>
> Regards,
> Martin
>
> Abstract
> ========
>
> Namespace packages are a mechanism for splitting a single Python
> package across multiple directories on disk. In current Python
> versions, an algorithm to compute the packages __path__ must be
> formulated. With the enhancement proposed here, the import machinery
> itself will construct the list of directories that make up the
> package.

-0

My main concern is that we'll start seeing all kinds of packages with
names like:

com.dusinc.sarray.ptookkit.v_1_34_beta.btree.BTree

The current lack of global package namespace effectively prevents
bureaucratic package naming, which in my mind makes it worth the
cost. However, I'd be willing to believe this can be kept under
control some other way.


Carl Banks

Kay Schluehr

unread,
Apr 2, 2009, 2:56:09 PM4/2/09
to

Wow. You python-dev guys are really jumping the shark. Isn't your Rube
Goldberg "import machinery" already complex enough for you?

Chris Rebert

unread,
Apr 2, 2009, 3:08:36 PM4/2/09
to Carl Banks, pytho...@python.org

Agreed, although I'd be slightly less optimistic on its usage being
kept under control. It seems this goes a bit against the "Flat is
better than nested" principle.
Then again, we also have the "Namespaces are honkingly great"
principle to contend with as well, so it's definitely a balancing act.

Cheers,
Chris

--
I have a blog:
http://blog.rebertia.com

Chris Withers

unread,
Apr 2, 2009, 4:03:34 PM4/2/09
to "Martin v. Löwis", Python List, Python-Dev
Martin v. Löwis wrote:
> I propose the following PEP for inclusion to Python 3.1.
> Please comment.

Would this support the following case:

I have a package called mortar, which defines useful stuff:

from mortar import content, ...

I now want to distribute large optional chunks separately, but ideally
so that the following will work:

from mortar.rbd import ...
from mortar.zodb import ...
from mortar.wsgi import ...

Does the PEP support this? The only way I can currently think to do this
would result in:

from mortar import content,..
from mortar_rbd import ...
from mortar_zodb import ...
from mortar_wsgi import ...

...which looks a bit unsightly to me.

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
- http://www.simplistix.co.uk

Chris Withers

unread,
Apr 2, 2009, 4:03:49 PM4/2/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
P.J. Eby wrote:
> Apart from that, this mechanism sounds great! I only wish there was a
> way to backport it all the way to 2.3 so I could drop the messy bits
> from setuptools.

Maybe we could? :-)

andrew cooke

unread,
Apr 2, 2009, 4:14:14 PM4/2/09
to Chris Withers, Python List
Chris Withers wrote:
> Martin v. Löwis wrote:
>> I propose the following PEP for inclusion to Python 3.1.
>> Please comment.
>
> Would this support the following case:
>
> I have a package called mortar, which defines useful stuff:
>
> from mortar import content, ...
>
> I now want to distribute large optional chunks separately, but ideally
> so that the following will will work:
>
> from mortar.rbd import ...
> from mortar.zodb import ...
> from mortar.wsgi import ...
>
> Does the PEP support this? The only way I can currently think to do this
> would result in:
>
> from mortar import content,..
> from mortar_rbd import ...
> from mortar_zodb import ...
> from mortar_wsgi import ...

i may be misunderstanding, but i think you can already do this.

in lepl i have code spread across many modules (equivalent to your
mortar.rbd, i have lepl.matchers etc). then in lepl/__init__.py i import
those and define __all__ to export them into the lepl namespace. so you
can then do either:

from lepl import Literal

or

from lepl.matchers import Literal

and you get the same code.

i copied this from sqlalchemy, but i am sure other packages do something
similar. it's described in the python docs.
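the re-export pattern described above can be sketched self-containedly with in-memory modules; the names "mypkg", "matchers" and "Literal" below are stand-ins, not the real lepl API:

```python
# Sketch of the re-export pattern: a package __init__ imports names
# from submodules so users can import them from either place. Built
# with in-memory modules so the example is self-contained.
import sys
import types

# A fake submodule holding the "real" implementation.
matchers = types.ModuleType('mypkg.matchers')

class Literal:           # stand-in for a matcher class
    pass

matchers.Literal = Literal

# The package module re-exports the name, as an __init__.py doing
# "from mypkg.matchers import Literal" would.
pkg = types.ModuleType('mypkg')
pkg.__path__ = []        # mark it as a package
pkg.Literal = matchers.Literal
pkg.__all__ = ['Literal']

sys.modules['mypkg'] = pkg
sys.modules['mypkg.matchers'] = matchers

# Both import paths now yield the same object.
from mypkg import Literal as L1
from mypkg.matchers import Literal as L2
assert L1 is L2
```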

andrew


Chris Withers

unread,
Apr 2, 2009, 4:18:57 PM4/2/09
to andrew cooke, Python List
andrew cooke wrote:
>> I now want to distribute large optional chunks separately, but ideally
>> so that the following will will work:
>>
>> from mortar.rbd import ...
>> from mortar.zodb import ...
>> from mortar.wsgi import ...

> i may be misunderstanding, but i think you can already do this.


>
> in lepl i have code spread across many modules (equivalent to your
> mortar.rbd, i have lepl.matchers etc). then in lepl/__init__.py i import
> those and define __this__ to export them into the lepl namespace. so you
> can import either do:

Okay, but do you:

- distribute lepl.matchers in a separate distribution to lepl?

- have actual code in the lepl package?

cheers,

andrew cooke

unread,
Apr 2, 2009, 4:29:24 PM4/2/09
to Python List, Chris Withers
andrew cooke wrote:
> Chris Withers wrote:
>> Martin v. Löwis wrote:
>>> I propose the following PEP for inclusion to Python 3.1.
>>> Please comment.
>>
>> Would this support the following case:
>>
>> I have a package called mortar, which defines useful stuff:
>>
>> from mortar import content, ...
>>
>> I now want to distribute large optional chunks separately, but ideally
>> so that the following will will work:
[...]

>
> i may be misunderstanding, but i think you can already do this.

ah, sorry, i think i was misunderstanding. you mean within the mechanism
provided by this new pep (which i also don't understand....)

andrew


M.-A. Lemburg

unread,
Apr 2, 2009, 4:33:25 PM4/2/09
to "Martin v. Löwis", Python List, Python-Dev
On 2009-04-02 17:32, Martin v. Löwis wrote:
> I propose the following PEP for inclusion to Python 3.1.

Thanks for picking this up.

I'd like to extend the proposal to Python 2.7 and later.

> Please comment.
>
> Regards,
> Martin


>
> Specification
> =============
>
> Rather than using an imperative mechanism for importing packages, a
> declarative approach is proposed here, as an extension to the existing
> ``*.pkg`` mechanism.
>
> The import statement is extended so that it directly considers ``*.pkg``
> files during import; a directory is considered a package if it either
> contains a file named __init__.py, or a file whose name ends with
> ".pkg".

That's going to slow down Python package detection a lot - you'd
replace an O(1) test with an O(n) scan.

Alternative Approach:
---------------------

Wouldn't it be better to stick with a simpler approach and look for
"__pkg__.py" files to detect namespace packages using that O(1) check ?

This would also avoid any issues you'd otherwise run into if you want
to maintain this scheme in an importer that doesn't have access to a
list of files in a package directory, but is well capable of checking
the existence of a file.

Mechanism:
----------

If the import mechanism finds a matching namespace package (a directory
with a __pkg__.py file), it then goes into namespace package scan mode and
scans the complete sys.path for more occurrences of the same namespace
package.

The import loads all __pkg__.py files of matching namespace packages
having the same package name during the search.

One of the namespace packages, the defining namespace package, will have
to include a __init__.py file.

After having scanned all matching namespace packages and loading
the __pkg__.py files in the order of the search, the import mechanism
then sets the packages .__path__ attribute to include all namespace
package directories found on sys.path and finally executes the
__init__.py file.

(Please let me know if the above is not clear, I will then try to
follow up on it.)
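A rough Python model of the proposed scan (function name and structure are invented for illustration; this is not actual import-machinery code):

```python
# Toy model of the proposed __pkg__.py mechanism.
import os

def scan_namespace(name, search_path):
    """Collect every <entry>/<name> directory on search_path carrying a
    __pkg__.py marker; the 'defining' portion also ships __init__.py,
    whose code would be executed after the scan completes."""
    portions = []
    defining = None
    for entry in search_path:
        candidate = os.path.join(entry, name)
        # One stat per sys.path entry: the O(1) check being argued for.
        if os.path.isfile(os.path.join(candidate, '__pkg__.py')):
            portions.append(candidate)
            if defining is None and os.path.isfile(
                    os.path.join(candidate, '__init__.py')):
                defining = candidate
    return portions, defining
```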

Discussion:
-----------

The above mechanism allows the same kind of flexibility we already
have with the existing normal __init__.py mechanism.

* It doesn't add yet another .pth-style sys.path extension (which are
difficult to manage in installations).

* It always uses the same naive sys.path search strategy. The strategy
is not determined by some file contents.

* The search is only done once - on the first import of the package.

* It's possible to have a defining package dir and add-on package
dirs.

* Namespace packages are easy to recognize by testing for a single
resource.

* Namespace __pkg__.py modules can provide extra meta-information,
logging, etc. to simplify debugging namespace package setups.

* It's possible to freeze such setups, to put them into ZIP files,
or only have parts of it in a ZIP file and the other parts in the
file-system.

Caveats:

* Changes to sys.path will not result in an automatic rescan for
additional namespace packages, if the package was already loaded.
However, we could have a function to make such a rescan explicit.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Apr 02 2009)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
________________________________________________________________________
2009-03-19: Released mxODBC.Connect 1.0.1 http://python.egenix.com/

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

Ben Finney

unread,
Apr 2, 2009, 5:59:41 PM4/2/09
to
Kay Schluehr <kay.sc...@gmx.net> writes:

> Wow. You python-dev guys are really jumping the shark. Isn't your
> Rube Goldberg "import machinery" already complex enough for you?

Thanks for your constructive criticism, and your considerate quote
trimming.

--
\ “I was married by a judge. I should have asked for a jury.” |
`\ —Groucho Marx |
_o__) |
Ben Finney

drob...@gmail.com

unread,
Apr 2, 2009, 7:20:18 PM4/2/09
to
On Apr 2, 5:59 pm, Ben Finney <ben+pyt...@benfinney.id.au> wrote:

> Kay Schluehr <kay.schlu...@gmx.net> writes:
> > Wow. You python-dev guys are really jumping the shark. Isn't your
> > Rube Goldberg "import machinery" already complex enough for you?
>
> Thanks for your constructive criticism, and your considerate quote
> trimming.
Ben, you should use google groups. No trimming necessary.

Ben Finney

unread,
Apr 2, 2009, 7:36:08 PM4/2/09
to
"drob...@gmail.com" <drob...@gmail.com> writes:

No, I really shouldn't.

> No trimming necessary.

It's not me that should do the trimming.
<URL:http://en.wikipedia.org/wiki/Posting_style#Inline_replying>

--
\ “Jealousy: The theory that some other fellow has just as little |
`\ taste.” —Henry L. Mencken |
_o__) |
Ben Finney

P.J. Eby

unread,
Apr 2, 2009, 8:44:00 PM4/2/09
to M.-A. Lemburg, Martin v. Löwis, Python List, Python-Dev
At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote:
>That's going to slow down Python package detection a lot - you'd
>replace an O(1) test with an O(n) scan.

I thought about this too, but it's pretty trivial considering that
the only time it takes effect is when you have a directory name that
matches the name you're importing, and that it will only happen once
for that directory, unless there is no package on sys.path with that
name, and the program tries to import the package multiple times. In
other words, the overhead isn't likely to be much, compared to the
time needed to say, open and marshal even a trivial __init__.py file.


>Alternative Approach:
>---------------------
>
>Wouldn't it be better to stick with a simpler approach and look for
>"__pkg__.py" files to detect namespace packages using that O(1) check ?

I thought the same thing (or more precisely, a single .pkg file), but
when I got lower in the PEP I saw the reason was to support system
packages not having overlapping filenames. The PEP could probably be
a little clearer about the connection between needing *.pkg and the
system-package use case.


>One of the namespace packages, the defining namespace package, will have
>to include a __init__.py file.

Note that there is no such thing as a "defining namespace package" --
namespace package contents are symmetrical peers.


>The above mechanism allows the same kind of flexibility we already
>have with the existing normal __init__.py mechanism.
>
>* It doesn't add yet another .pth-style sys.path extension (which are
>difficult to manage in installations).
>
>* It always uses the same naive sys.path search strategy. The strategy
>is not determined by some file contents.

The above are also true for using only a '*' in .pkg files -- in that
event there are no sys.path changes. (Frankly, I'm doubtful that
anybody is using extend_path and .pkg files to begin with, so I'd be
fine with a proposal that instead used something like '.nsp' files
that didn't even need to be opened and read -- which would let the
directory scan stop at the first .nsp file found.)


>* The search is only done once - on the first import of the package.

I believe the PEP does this as well, IIUC.


>* It's possible to have a defining package dir and add-one package
>dirs.

Also possible in the PEP, although the __init__.py must be in the
first such directory on sys.path. (However, such "defining" packages
are not that common now, due to tool limitations.)

Neal Becker

unread,
Apr 2, 2009, 9:15:49 PM4/2/09
to pytho...@python.org, pytho...@python.org
While solving this problem, is it possible also to address an issue that
shows up in certain distributions? I'm specifically talking about the fact
that on Redhat/Fedora, we have on x86_64 both /usr/lib/pythonxx/ and
/usr/lib64/pythonxx. The former is supposed to be for non-arch specific
packages (pure python) and the latter for arch-specific.

If it happens that there is:
/usr/lib/pythonxxx/site-packages/mypackage/
and
/usr/lib64/pythonxxx/site-packages/mypackage/subpackage

This (and probably some similar variations) won't work with the current
module loading algorithm. (I believe this is the issue I encountered, it
was a while ago).


Matthias Klose

unread,
Apr 2, 2009, 9:21:10 PM4/2/09
to "Martin v. Löwis", Python List, Python-Dev
Martin v. Löwis schrieb:

> I propose the following PEP for inclusion to Python 3.1.
> Please comment.
>
> Regards,
> Martin
>
> Abstract
> ========
>
> Namespace packages are a mechanism for splitting a single Python
> package across multiple directories on disk. In current Python
> versions, an algorithm to compute the packages __path__ must be
> formulated. With the enhancement proposed here, the import machinery
> itself will construct the list of directories that make up the
> package.

+1

speaking as a downstream packager of Python for Debian/Ubuntu, I welcome
this approach. The current practice of shipping the very same file
(__init__.py) in different packages leads to conflicts during the
installation of these packages (this is not specific to dpkg, but is true
for rpm packaging as well).

Current practice of packaging (for downstreams) so-called "name space
packages" is:

- either to split out the namespace __init__.py into a separate
(linux distribution) package (needing manual packaging effort for each
name space package)

- using downstream specific packaging techniques to handle conflicting files
(diversions)

- replicating the current behaviour of setuptools simply overwriting the
file conflicts.

Following this proposal (downstream) packaging of namespace packages is made
possible independent of any manual downstream packaging decisions or any
downstream specific packaging decisions.

Matthias

P.J. Eby

unread,
Apr 2, 2009, 11:12:18 PM4/2/09
to Matthias Klose, Martin v. Löwis, Python List, Python-Dev
At 03:21 AM 4/3/2009 +0200, Matthias Klose wrote:
>+1
>
>speaking as a downstream packaging python for Debian/Ubuntu I welcome
>this approach. The current practice of shipping the very same file
>(__init__.py) in different packages leads to conflicts for the
>installation of these packages (this is not specific to dpkg, but is
>true for rpm packaging as well).
>
>Current practice of packaging (for downstreams) so called "name space
>packages" is:
>
>- either to split out the namespace __init__.py into a separate
> (linux distribution) package (needing manual packaging effort for
> each name space package)
>
>- using downstream specific packaging techniques to handle conflicting
> files (diversions)
>
>- replicating the current behaviour of setuptools simply overwriting
> the file conflicts.
>
>Following this proposal (downstream) packaging of namespace packages
>is made possible independent of any manual downstream packaging
>decisions or any downstream specific packaging decisions.

A clarification: setuptools does not currently install the
__init__.py file when installing in
--single-version-externally-managed or --root mode. Instead, it uses
a project-version-nspkg.pth file that essentially simulates a
variation of Martin's .pkg proposal, by abusing .pth file
support. If this PEP is adopted, setuptools would replace its
nspkg.pth file with a .pkg file on Python versions that provide
native support for .pkg imports, keeping the .pth file only for older Pythons.

(.egg files and directories will not be affected by the change,
unless the zipimport module will also supports .pkg files... and
again, only for Python versions that support the new approach.)

"Martin v. Löwis"

unread,
Apr 3, 2009, 3:49:58 PM4/3/09
to P.J. Eby, Python List, Python-Dev
> Perhaps we could add something like a sys.namespace_packages that would
> be updated by this mechanism? Then, pkg_resources could check both that
> and its internal registry to be both backward and forward compatible.

I could see no problem with that, so I have added this to the PEP.

Thanks for the feedback,

Martin

"Martin v. Löwis"

unread,
Apr 3, 2009, 3:55:22 PM4/3/09
to Chris Withers, Python List, Python-Dev
Chris Withers wrote:
> Martin v. Löwis wrote:
>> I propose the following PEP for inclusion to Python 3.1.
>> Please comment.
>
> Would this support the following case:
>
> I have a package called mortar, which defines useful stuff:
>
> from mortar import content, ...
>
> I now want to distribute large optional chunks separately, but ideally
> so that the following will will work:
>
> from mortar.rbd import ...
> from mortar.zodb import ...
> from mortar.wsgi import ...
>
> Does the PEP support this?

That's the primary purpose of the PEP. You can do this today already
(see the zope package, and the reference to current techniques in the
PEP), but the PEP provides a cleaner way.

In each chunk (which the PEP calls portion), you had a structure like
this:

mortar/
mortar/rbd.pkg (contains just "*")
mortar/rbd.py

or

mortar/
mortar/zodb.pkg
mortar/zodb/
mortar/zodb/__init__.py
mortar/zodb/backends.py

As a side effect, you can also do "import mortar", but that would just
give you the (nearly) empty namespace package, whose only significant
content is the variable __path__.

Regards,
Martin

"Martin v. Löwis"

unread,
Apr 3, 2009, 4:07:10 PM4/3/09
to M.-A. Lemburg, Python List, Python-Dev
> I'd like to extend the proposal to Python 2.7 and later.

I don't object, but I also don't want to propose this, so
I added it to the discussion.

My (and perhaps other people's) concern is that 2.7 might
well be the last release of the 2.x series. If so, adding
this feature to it would make 2.7 an odd special case for
users and providers of third party tools.

> That's going to slow down Python package detection a lot - you'd
> replace an O(1) test with an O(n) scan.

I question that claim. In traditional Unix systems, the file system
driver performs a linear search of the directory, so it's rather
O(n)-in-kernel vs. O(n)-in-Python. Even for advanced file systems,
you need at least O(log n) to determine whether a specific file is
in a directory. For all practical purposes, the package directory
will fit in a single disk block (containing a single .pkg file and
one or a few subpackages), making listdir complete as fast as stat.

> Wouldn't it be better to stick with a simpler approach and look for
> "__pkg__.py" files to detect namespace packages using that O(1) check ?

Again - this wouldn't be O(1). More importantly, it breaks system
packages, which now again have to deal with the conflicting file names
if they want to install all portions into a single location.

> This would also avoid any issues you'd otherwise run into if you want
> to maintain this scheme in an importer that doesn't have access to a list
> of files in a package directory, but is well capable for the checking
> the existence of a file.

Do you have a specific mechanism in mind?

Regards,
Martin

"Martin v. Löwis"

unread,
Apr 3, 2009, 4:15:55 PM4/3/09
to P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
> Note that there is no such thing as a "defining namespace package" --
> namespace package contents are symmetrical peers.

With the PEP, a "defining package" becomes possible - at most one
portion can define an __init__.py.

I know that the current mechanisms don't support it, and it might
not be useful in general, but now there is a clean way of doing it,
so I wouldn't exclude it. Distribution-wise, all distributions
relying on the defining package would need to require (or
install_require, or depend on) it.

> The above are also true for using only a '*' in .pkg files -- in that
> event there are no sys.path changes. (Frankly, I'm doubtful that
> anybody is using extend_path and .pkg files to begin with, so I'd be
> fine with a proposal that instead used something like '.nsp' files that
> didn't even need to be opened and read -- which would let the directory
> scan stop at the first .nsp file found.

That would work for me as well. Nobody at PyCon could remember where
.pkg files came from.

> I believe the PEP does this as well, IIUC.

Correct.

>> * It's possible to have a defining package dir and add-one package
>> dirs.
>
> Also possible in the PEP, although the __init__.py must be in the first
> such directory on sys.path.

I should make it clear that this is not the case. I envision it to work
this way: import zope
- searches sys.path, until finding either a directory zope, or a file
zope.{py,pyc,pyd,...}
- if it is a directory, it checks for .pkg files. If it finds any,
it processes them, extending __path__.
- it *then* checks for __init__.py, taking the first hit anywhere
on __path__ (just like any module import would)
- if no .pkg was found, nor an __init__.py, it proceeds with the next
sys.path item (skipping the directory entirely)
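The steps above can be sketched in plain Python as a toy model of the lookup (the function name and details are invented for illustration, not the actual import-machinery implementation):

```python
# Toy model of the envisioned lookup order for "import zope".
import os

def find_package_path(name, search_path):
    """Collect the __path__ for package `name`: gather matching
    directories, process *.pkg files, then find __init__.py anywhere
    on the resulting __path__."""
    pkg_path = []
    is_namespace = False
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if not os.path.isdir(candidate):
            continue
        pkg_files = sorted(f for f in os.listdir(candidate)
                           if f.endswith('.pkg'))
        has_init = os.path.exists(os.path.join(candidate, '__init__.py'))
        if not pkg_files and not has_init:
            continue                      # neither .pkg nor __init__.py: skip
        pkg_path.append(candidate)
        for pf in pkg_files:
            with open(os.path.join(candidate, pf)) as f:
                for line in f:
                    line = line.strip()
                    if line == '*':
                        is_namespace = True   # keep scanning sys.path
                    elif line:
                        pkg_path.append(line)  # explicitly listed portion
        if not is_namespace:
            break                         # ordinary package: first hit wins
    if is_namespace:
        pkg_path.insert(0, '*')           # the PEP's namespace marker
    # __init__.py is taken from the first __path__ directory that has one.
    init_dir = next((d for d in pkg_path
                     if os.path.exists(os.path.join(d, '__init__.py'))),
                    None)
    return pkg_path, init_dir
```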

Regards,
Martin

gl...@divmod.com

unread,
Apr 3, 2009, 5:16:49 PM4/3/09
to Python List, Python-Dev
On 08:15 pm, mar...@v.loewis.de wrote:
>>Note that there is no such thing as a "defining namespace package" --
>>namespace package contents are symmetrical peers.
>
>With the PEP, a "defining package" becomes possible - at most one
>portion can define an __init__.py.

For what it's worth, this is a _super_ useful feature for Twisted. We
have one "defining package" for the "twisted" package (twisted core) and
then a bunch of other things which want to put things into twisted.*
(twisted.web, twisted.conch, et al.).

For debian we already have separate packages, but such a definition of
namespace packages would allow us to actually have things separated out
on the cheeseshop as well.

P.J. Eby

unread,
Apr 3, 2009, 5:23:19 PM4/3/09
to Martin v. Löwis, Python List, M.-A. Lemburg, Python-Dev
At 10:15 PM 4/3/2009 +0200, Martin v. Löwis wrote:
>I should make it clear that this is not the case. I envision it to work
>this way: import zope
>- searches sys.path, until finding either a directory zope, or a file
> zope.{py,pyc,pyd,...}
>- if it is a directory, it checks for .pkg files. If it finds any,
> it processes them, extending __path__.
>- it *then* checks for __init__.py, taking the first hit anywhere
> on __path__ (just like any module import would)
>- if no .pkg was found, nor an __init__.py, it proceeds with the next
> sys.path item (skipping the directory entirely)

Ah, I missed that. Maybe the above should be added to the PEP to clarify.

"Martin v. Löwis"

unread,
Apr 4, 2009, 8:24:34 AM4/4/09
to
> -0
>
> My main concern is that we'll start seeing all kinds of packages with
> names like:
>
> com.dusinc.sarray.ptookkit.v_1_34_beta.btree.BTree
>
> The current lack of global package namespace effectively prevents
> bureaucratic package naming, which in my mind makes it worth the
> cost. However, I'd be willing to believe this can be kept under
> control some other way.

In principle, people can do this today already. That they are not
doing it is a good sign.

I think this bureaucratic naming in Java originates more from an
explicitly stated policy that people should use such naming than
from the ability to actually do so easily.

Regards,
Martin

"Martin v. Löwis"

unread,
Apr 4, 2009, 8:27:06 AM4/4/09
to
Neal Becker wrote:
> While solving this problem, is it possible also to address an issue that
> shows up in certain distributions? I'm specifically talking about the fact
> that on Redhat/Fedora, we have on x86_64 both /usr/lib/pythonxx/ and
> /usr/lib64/pythonxx. The former is supposed to be for non-arch specific
> packages (pure python) and the latter for arch-specific.

I can't see how this is related to this PEP. It's not at all about how
sys.path should be constructed.

Regards,
Martin

Chris Withers

unread,
Apr 6, 2009, 9:00:18 AM4/6/09
to "Martin v. Löwis", Python List, Python-Dev

Martin v. Löwis wrote:


> Chris Withers wrote:
>> Martin v. Löwis wrote:

>>> I propose the following PEP for inclusion to Python 3.1.
>>> Please comment.
>> Would this support the following case:
>>
>> I have a package called mortar, which defines useful stuff:
>>
>> from mortar import content, ...
>>
>> I now want to distribute large optional chunks separately, but ideally
>> so that the following will work:
>>
>> from mortar.rdb import ...
>> from mortar.zodb import ...
>> from mortar.wsgi import ...
>>
>> Does the PEP support this?
>
> That's the primary purpose of the PEP.

Are you sure?

Does the PEP really allow for:

from mortar import content
from mortar.rdb import something

...where 'content' is a function defined in mortar/__init__.py and
'something' is a function defined in mortar/rdb/__init__.py *and* the
following are separate distributions on PyPI:

- mortar
- mortar.rdb

...where 'mortar' does not contain 'mortar.rdb'.

> You can do this today already
> (see the zope package,

No, they have nothing but a (functionally) empty __init__.py in the zope
package.
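That "functionally empty" __init__.py is essentially just the status-quo pkgutil idiom, and its effect can be demonstrated with `pkgutil.extend_path` directly. The bundle names below are invented for illustration; this only shows the merging behavior, not the PEP's proposed mechanism:

```python
import os
import sys
import tempfile
from pkgutil import extend_path

# Build two on-disk "portions" of a package named "mortar", as two
# hypothetical distributions would install them side by side.
root = tempfile.mkdtemp()
portions = []
for bundle in ("mortar_core", "mortar_rdb_dist"):
    d = os.path.join(root, bundle, "mortar")
    os.makedirs(d)
    # In practice each portion ships the same two-line __init__.py:
    with open(os.path.join(d, "__init__.py"), "w") as f:
        f.write("from pkgutil import extend_path\n"
                "__path__ = extend_path(__path__, __name__)\n")
    portions.append(os.path.dirname(d))

sys.path[:0] = portions
try:
    # Simulate the call made inside mortar/__init__.py: start from the
    # first portion's directory and let extend_path find the rest.
    merged = extend_path([os.path.join(portions[0], "mortar")], "mortar")
finally:
    for p in portions:
        sys.path.remove(p)

# merged now lists both portion directories, so submodules from either
# distribution are importable under the single "mortar" name.
```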

Jesse Noller

unread,
Apr 6, 2009, 9:21:06 AM4/6/09
to M.-A. Lemburg, Python List, "Martin v. Löwis", Python-Dev
On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <m...@egenix.com> wrote:

> On 2009-04-02 17:32, Martin v. Löwis wrote:
>> I propose the following PEP for inclusion to Python 3.1.
>
> Thanks for picking this up.
>
> I'd like to extend the proposal to Python 2.7 and later.
>

-1 to adding it to the 2.x series. There was much discussion around
adding features to 2.x *and* 3.0, and the consensus seemed to *not*
add new features to 2.x and use those new features as carrots to help
lead people into 3.0.

jesse

Barry Warsaw

unread,
Apr 6, 2009, 9:26:24 AM4/6/09
to Jesse Noller, Python List, Python-Dev, "Martin v. Löwis", M.-A. Lemburg

Actually, isn't the policy just that nothing can go into 2.7 that
isn't backported from 3.1? Whether the actual backport happens or not
is up to the developer though. OTOH, we talked about a lot of things
and my recollection is probably fuzzy.

Barry


Eric Smith

unread,
Apr 6, 2009, 10:40:56 AM4/6/09
to Barry Warsaw, Jesse Noller, Python List, M.-A. Lemburg, Martin v. Löwis, Python-Dev
> On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
>
>> On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>> On 2009-04-02 17:32, Martin v. Löwis wrote:
>>>> I propose the following PEP for inclusion to Python 3.1.
>>>
>>> Thanks for picking this up.
>>>
>>> I'd like to extend the proposal to Python 2.7 and later.
>>>
>>
>> -1 to adding it to the 2.x series. There was much discussion around
>> adding features to 2.x *and* 3.0, and the consensus seemed to *not*
>> add new features to 2.x and use those new features as carrots to help
>> lead people into 3.0.
>
> Actually, isn't the policy just that nothing can go into 2.7 that
> isn't backported from 3.1? Whether the actual backport happens or not
> is up to the developer though. OTOH, we talked about a lot of things
> and my recollection is probably fuzzy.

I believe Barry is correct. The official policy is "no features in 2.7
that aren't also in 3.1". Personally, I'm not going to put anything
else into 2.7, specifically not the ',' formatter stuff from PEP 378: 3.1
has diverged too far from 2.7 in this regard to make the backport easy to
do. But this decision is left up to the individual committer.

P.J. Eby

unread,
Apr 6, 2009, 11:21:42 AM4/6/09
to Chris Withers, Martin v. Löwis, Python List, Python-Dev
At 02:00 PM 4/6/2009 +0100, Chris Withers wrote:
>Martin v. Löwis wrote:
>>Chris Withers wrote:
>>>Would this support the following case:
>>>
>>>I have a package called mortar, which defines useful stuff:
>>>
>>>from mortar import content, ...
>>>
>>>I now want to distribute large optional chunks separately, but ideally
>>>so that the following will work:
>>>
>>>from mortar.rdb import ...
>>>from mortar.zodb import ...
>>>from mortar.wsgi import ...
>>>
>>>Does the PEP support this?
>>That's the primary purpose of the PEP.
>
>Are you sure?
>
>Does the PEP really allow for:
>
>from mortar import content
>from mortar.rdb import something
>
>...where 'content' is a function defined in mortar/__init__.py and
>'something' is a function defined in mortar/rdb/__init__.py *and*
>the following are separate distributions on PyPI:
>
>- mortar
>- mortar.rdb
>
>...where 'mortar' does not contain 'mortar.rdb'.

See the third paragraph of http://www.python.org/dev/peps/pep-0382/#discussion

Chris Withers

unread,
Apr 6, 2009, 11:57:59 AM4/6/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
P.J. Eby wrote:

Indeed, I guess the PEP could be made more explanatory then, 'cos as a
packager, I don't see what I'd put in the various setup.py and
__init__.py files to make this work...

That said, I'm delighted to hear it's going to be possible and
wholeheartedly support the PEP and its backporting to 2.7 as a result...

Jesse Noller

unread,
Apr 6, 2009, 12:00:46 PM4/6/09
to Barry Warsaw, Python List, Python-Dev, "Martin v. Löwis", M.-A. Lemburg
On Mon, Apr 6, 2009 at 9:26 AM, Barry Warsaw <ba...@python.org> wrote:
> On Apr 6, 2009, at 9:21 AM, Jesse Noller wrote:
>
>> On Thu, Apr 2, 2009 at 4:33 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>>
>>> On 2009-04-02 17:32, Martin v. Löwis wrote:
>>>>
>>>> I propose the following PEP for inclusion to Python 3.1.
>>>
>>> Thanks for picking this up.
>>>
>>> I'd like to extend the proposal to Python 2.7 and later.
>>>
>>
>> -1 to adding it to the 2.x series. There was much discussion around
>> adding features to 2.x *and* 3.0, and the consensus seemed to *not*
>> add new features to 2.x and use those new features as carrots to help
>> lead people into 3.0.
>
> Actually, isn't the policy just that nothing can go into 2.7 that isn't
> backported from 3.1?  Whether the actual backport happens or not is up to
> the developer though.  OTOH, we talked about a lot of things and my
> recollection is probably fuzzy.
>
> Barry

That *is* the official policy, but there were discussions about doing
no further backporting of features from 3.1 into 2.x, thereby providing
more of an upgrade incentive.

M.-A. Lemburg

unread,
Apr 7, 2009, 8:02:54 AM4/7/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-03 02:44, P.J. Eby wrote:
> At 10:33 PM 4/2/2009 +0200, M.-A. Lemburg wrote:
>> Alternative Approach:
>> ---------------------

>>
>> Wouldn't it be better to stick with a simpler approach and look for
>> "__pkg__.py" files to detect namespace packages using that O(1) check ?
>
>> One of the namespace packages, the defining namespace package, will have
>> to include an __init__.py file.
>
> Note that there is no such thing as a "defining namespace package" --
> namespace package contents are symmetrical peers.

That was a definition :-)

Defining namespace package := the namespace package having the
__init__.py file

This is useful to have since packages allowing integration of
other sub-packages typically come as a base package with some
basic infrastructure in place which is required by all other
namespace packages.

If the __init__.py file is not found among the namespace directories,
the importer will have to raise an exception, since the result
would not be a proper Python package.

>> * It's possible to have a defining package dir and add-on package
>> dirs.
>
> Also possible in the PEP, although the __init__.py must be in the first
> such directory on sys.path. (However, such "defining" packages are not
> that common now, due to tool limitations.)

That's a strange limitation of the PEP. Why should the location of
the __init__.py file depend on the order of sys.path ?

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Apr 03 2009)

M.-A. Lemburg

unread,
Apr 7, 2009, 8:30:19 AM4/7/09
to "Martin v. Löwis", Python List, Python-Dev
[Resent due to a python.org mail server problem]

On 2009-04-03 22:07, Martin v. Löwis wrote:
>> I'd like to extend the proposal to Python 2.7 and later.
>

> I don't object, but I also don't want to propose this, so
> I added it to the discussion.
>
> My (and perhaps other people's) concern is that 2.7 might
> well be the last release of the 2.x series. If so, adding
> this feature to it would make 2.7 an odd special case for
> users and providers of third party tools.

I certainly hope that we'll see more useful features backported
from 3.x to the 2.x series or forward ported from 2.x to 3.x
(depending on what the core developer preferences are).

Regarding this particular PEP, it is well possible to implement
an importer that provides the functionality for Python 2.3-2.7
versions, so it doesn't have to be an odd special case.

>> That's going to slow down Python package detection a lot - you'd
>> replace an O(1) test with an O(n) scan.
>
> I question that claim. In traditional Unix systems, the file system
> driver performs a linear search of the directory, so it's rather
> O(n)-in-kernel vs. O(n)-in-Python. Even for advanced file systems,
> you need at least O(log n) to determine whether a specific file is
> in a directory. For all practical purposes, the package directory
> will fit in a single disk block (containing a single .pkg file, and
> one or few subpackages), making listdir complete as fast as stat.

On second thought, you're right, it won't be that costly. It
requires an os.listdir() scan due to the wildcard approach and in
some cases, such a scan may not be possible, e.g. when using
frozen packages. Indeed, the freeze mechanism would not even add
the .pkg files - it only handles .py file content.

The same is true for distutils, MANIFEST generators and other
installer mechanisms - they would have to learn to package
the .pkg files along with the Python files.

Another problem with the .pkg file approach is that the file extension
is already in use for e.g. Mac OS X installers.

You don't have those issues with the __pkg__.py file approach
I suggested.

>> Wouldn't it be better to stick with a simpler approach and look for
>> "__pkg__.py" files to detect namespace packages using that O(1) check ?
>

> Again - this wouldn't be O(1). More importantly, it breaks system
> packages, which now again have to deal with the conflicting file names
> if they want to install all portions into a single location.

True, but since that means changing the package infrastructure, I think
it's fair to ask distributors who want to use that approach to also take
care of looking into the __pkg__.py files and merging them if
necessary.

Most of the time the __pkg__.py files will be empty, so that's not
really much to ask for.

>> This would also avoid any issues you'd otherwise run into if you want
>> to maintain this scheme in an importer that doesn't have access to a list
>> of files in a package directory, but is well capable of checking
>> the existence of a file.
>
> Do you have a specific mechanism in mind?

Yes: frozen modules and imports straight from a web resource.

The .pkg file approach requires a directory scan and additional
support from all importers.

The __pkg__.py approach I suggested can use existing importers
without modifications by checking for the existence of such
a Python module in an importer managed resource.

--
Marc-Andre Lemburg
eGenix.com


P.J. Eby

unread,
Apr 7, 2009, 10:05:45 AM4/7/09
to M.-A. Lemburg, Martin v. Löwis, Python List, Python-Dev
At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
> >> Wouldn't it be better to stick with a simpler approach and look for
> >> "__pkg__.py" files to detect namespace packages using that O(1) check ?
> >
> > Again - this wouldn't be O(1). More importantly, it breaks system
> > packages, which now again have to deal with the conflicting file names
> > if they want to install all portions into a single location.
>
>True, but since that means changing the package infrastructure, I think
>it's fair to ask distributors who want to use that approach to also take
>care of looking into the __pkg__.py files and merging them if
>necessary.
>
>Most of the time the __pkg__.py files will be empty, so that's not
>really much to ask for.

This means your proposal actually doesn't add any benefit over the
status quo, where you can have an __init__.py that does nothing but
declare the package a namespace. We already have that now, and it
doesn't need a new filename. Why would we expect OS vendors to start
supporting it, just because we name it __pkg__.py instead of __init__.py?

M.-A. Lemburg

unread,
Apr 7, 2009, 10:58:39 AM4/7/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev

I lost you there.

Since when do we support namespace packages in core Python without
the need to add some form of magic support code to __init__.py ?

My suggestion basically builds on the same idea as Martin's PEP,
but uses a single __pkg__.py file as opposed to some non-Python
file yaddayadda.pkg.

Here's a copy of the proposal, with some additional discussion
bullets added:

"""
Alternative Approach:
---------------------

Wouldn't it be better to stick with a simpler approach and look for
"__pkg__.py" files to detect namespace packages using that O(1) check ?

This would also avoid any issues you'd otherwise run into if you want
to maintain this scheme in an importer that doesn't have access to a list
of files in a package directory, but is well capable of checking
the existence of a file.

Mechanism:
----------

If the import mechanism finds a matching namespace package (a directory
with a __pkg__.py file), it then goes into namespace package scan mode and
scans the complete sys.path for more occurrences of the same namespace
package.

The import loads all __pkg__.py files of matching namespace packages
having the same package name during the search.

One of the namespace packages, the defining namespace package, will have
to include an __init__.py file.

After having scanned all matching namespace packages and loading
the __pkg__.py files in the order of the search, the import mechanism
then sets the package's __path__ attribute to include all namespace
package directories found on sys.path and finally executes the
__init__.py file.

(Please let me know if the above is not clear, I will then try to
follow up on it.)
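To help make the above concrete, here is a rough sketch of the scan being described. The helper name `scan_namespace` is hypothetical, and a real importer would of course hook into the import machinery rather than walk directories like this:

```python
import os

def scan_namespace(name, search_path):
    """Return (portions, defining) for namespace package `name`.

    `portions` lists every directory carrying a __pkg__.py file;
    `defining` is the one that also ships __init__.py.
    """
    portions = []
    defining = None
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if not os.path.isdir(candidate):
            continue
        if os.path.isfile(os.path.join(candidate, "__pkg__.py")):
            portions.append(candidate)
            if defining is None and \
               os.path.isfile(os.path.join(candidate, "__init__.py")):
                defining = candidate
    if portions and defining is None:
        # Without an __init__.py the result would not be a proper
        # Python package, so the importer must raise.
        raise ImportError("no defining portion found for %r" % name)
    return portions, defining
```

The importer would then execute each portion's __pkg__.py, set the package's __path__ to `portions`, and finally execute the defining portion's __init__.py.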

Discussion:
-----------

The above mechanism allows the same kind of flexibility we already
have with the existing normal __init__.py mechanism.

* It doesn't add yet another .pth-style sys.path extension (which are
difficult to manage in installations).

* It always uses the same naive sys.path search strategy. The strategy
is not determined by some file contents.

* The search is only done once - on the first import of the package.

* It's possible to have a defining package dir and add-on package
dirs.

* The search does not depend on the order of directories in sys.path.
There's no requirement for the defining package to appear first
on sys.path.

* Namespace packages are easy to recognize by testing for a single
resource.

* There's no conflict with existing files using the .pkg extension
such as Mac OS X installer files or Solaris packages.

* Namespace __pkg__.py modules can provide extra meta-information,
logging, etc. to simplify debugging namespace package setups.

* It's possible to freeze such setups, to put them into ZIP files,
or only have parts of it in a ZIP file and the other parts in the
file-system.

* There's no need for a package directory scan, allowing the
mechanism to also work with resources that do not permit
(easily and efficiently) scanning the contents of a package "directory",
e.g. frozen packages or imports from web resources.

Caveats:

* Changes to sys.path will not result in an automatic rescan for
additional namespace packages, if the package was already loaded.
However, we could have a function to make such a rescan explicit.

David Cournapeau

unread,
Apr 7, 2009, 11:29:02 AM4/7/09
to M.-A. Lemburg, P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On Tue, Apr 7, 2009 at 11:58 PM, M.-A. Lemburg <m...@egenix.com> wrote:

>>
>> This means your proposal actually doesn't add any benefit over the
>> status quo, where you can have an __init__.py that does nothing but
>> declare the package a namespace.  We already have that now, and it
>> doesn't need a new filename.  Why would we expect OS vendors to start
>> supporting it, just because we name it __pkg__.py instead of __init__.py?
>
> I lost you there.
>
> Since when do we support namespace packages in core Python without
> the need to add some form of magic support code to __init__.py ?

I think P. Eby refers to the problem that most packaging systems don't
like several packages to have the same file - be it empty or not.
That's my main personal gripe against namespace packages, and from this
POV, I think it is fair to say the proposal does not solve anything.
Not that I have a solution, of course :)

cheers,

David


P.J. Eby

unread,
Apr 7, 2009, 1:46:21 PM4/7/09
to M.-A. Lemburg, Python List, Martin v. Löwis, Python-Dev
At 04:58 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
>On 2009-04-07 16:05, P.J. Eby wrote:
> > At 02:30 PM 4/7/2009 +0200, M.-A. Lemburg wrote:
> >> >> Wouldn't it be better to stick with a simpler approach and look for
> >> >> "__pkg__.py" files to detect namespace packages using that O(1)
> >> check ?
> >> >
> >> > Again - this wouldn't be O(1). More importantly, it breaks system
> >> > packages, which now again have to deal with the conflicting file names
> >> > if they want to install all portions into a single location.
> >>
> >> True, but since that means changing the package infrastructure, I think
> >> it's fair to ask distributors who want to use that approach to also take
> >> care of looking into the __pkg__.py files and merging them if
> >> necessary.
> >>
> >> Most of the time the __pkg__.py files will be empty, so that's not
> >> really much to ask for.
> >
> > This means your proposal actually doesn't add any benefit over the
> > status quo, where you can have an __init__.py that does nothing but
> > declare the package a namespace. We already have that now, and it
> > doesn't need a new filename. Why would we expect OS vendors to start
> > supporting it, just because we name it __pkg__.py instead of __init__.py?
>
>I lost you there.
>
>Since when do we support namespace packages in core Python without
>the need to add some form of magic support code to __init__.py ?
>
>My suggestion basically builds on the same idea as Martin's PEP,
>but uses a single __pkg__.py file as opposed to some non-Python
>file yaddayadda.pkg.

Right... which completely obliterates the primary benefit of the
original proposal compared to the status quo. That is, that the PEP
382 way is more compatible with system packaging tools.

Without that benefit, there's zero gain in your proposal over having
__init__.py files just call pkgutil.extend_path() (in the stdlib
since 2.3, btw) or pkg_resources.declare_namespace() (similar
functionality, but with zipfile support and some other niceties).

IOW, your proposal doesn't actually improve the status quo in any way
that I am able to determine, except that it calls for loading all the
__pkg__.py modules, rather than just the first one. (And the
setuptools implementation of namespace packages actually *does* load
multiple __init__.py's, so that's still no change over the status quo
for setuptools-using packages.)

M.-A. Lemburg

unread,
Apr 14, 2009, 11:02:39 AM4/14/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev

The purpose of the PEP is to create a standard for namespace packages.
That's orthogonal to trying to enhance or change some existing
techniques.

I don't see the emphasis in the PEP on Linux distribution support and the
remote possibility of them wanting to combine separate packages back
into one package as a good argument for adding yet another separate hierarchy
of special files which Python scans during imports.

That said, note that most distributions actually take the other route:
they try to split up larger packages into smaller ones, so the argument
becomes even weaker.

It is much more important to standardize the approach than to try
to extend some existing trickery and make it even more opaque than
it already is by introducing yet another level of complexity.

My alternative approach builds on existing methods and fits nicely
with the __init__.py approach Python has already been using for more
than a decade now. It's transparent, easy to understand and provides
enough functionality to build upon - much like the original __init__.py
idea.

I've already laid out the arguments for and against it in my
previous reply, so won't repeat them here.

--
Marc-Andre Lemburg
eGenix.com


P.J. Eby

unread,
Apr 14, 2009, 12:27:51 PM4/14/09
to M.-A. Lemburg, Python List, Martin v. Löwis, Python-Dev
At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
>I don't see the emphasis in the PEP on Linux distribution support and the
>remote possibility of them wanting to combine separate packages back
>into one package as a good argument for adding yet another separate hierarchy
>of special files which Python scans during imports.
>
>That said, note that most distributions actually take the other route:
>they try to split up larger packages into smaller ones, so the argument
>becomes even weaker.

I think you've misunderstood something about the use case. System
packaging tools don't like separate packages to contain the *same
file*. That means that they *can't* split a larger package up with
your proposal, because every one of those packages would have to
contain a __pkg__.py -- and thus be in conflict with each
other. Either that, or they would have to make a separate system
package containing *only* the __pkg__.py, and then make all packages
using the namespace depend on it -- which is more work and requires
greater co-ordination among packagers.

Allowing each system package to contain its own .pkg or .nsp or
whatever files, on the other hand, allows each system package to be
built independently, without conflict between contents (i.e., having
the same file), and without requiring a special pseudo-package to
contain the additional file.
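To make the contrast concrete, here is a sketch (all package and file names invented for illustration) of how two independently built system packages could share one package directory under the PEP, each shipping only uniquely named files:

```text
site-packages/
    zope/
        zope.interface-1.0.pkg    # shipped only by system package A
        interface/                # code from system package A
        zope.component-3.5.pkg    # shipped only by system package B
        component/                # code from system package B
```

No file is owned by two system packages, so no merged pseudo-package and no coordination between packagers is required.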

Also, although your proposal calls for executing multiple __pkg__.py
files, when multiple system packages are installed to site-packages
only one __pkg__.py file can exist there, and so only one of them
could possibly be executed. (Note that, even though the system
packages themselves are not "combined", in practice they will all be
installed to the same directory, i.e., site-packages or the platform
equivalent thereof.)

M.-A. Lemburg

unread,
Apr 14, 2009, 4:59:39 PM4/14/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-14 18:27, P.J. Eby wrote:
> At 05:02 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
>> I don't see the emphasis in the PEP on Linux distribution support and the
>> remote possibility of them wanting to combine separate packages back
>> into one package as a good argument for adding yet another separate
>> hierarchy
>> of special files which Python scans during imports.
>>
>> That said, note that most distributions actually take the other route:
>> they try to split up larger packages into smaller ones, so the argument
>> becomes even weaker.
>
> I think you've misunderstood something about the use case. System
> packaging tools don't like separate packages to contain the *same
> file*. That means that they *can't* split a larger package up with your
> proposal, because every one of those packages would have to contain a
> __pkg__.py -- and thus be in conflict with each other. Either that, or
> they would have to make a separate system package containing *only* the
> __pkg__.py, and then make all packages using the namespace depend on it
> -- which is more work and requires greater co-ordination among packagers.

You are missing the point: When breaking up a large package that lives in
site-packages into smaller distribution bundles, you don't need namespace
packages at all, so the PEP doesn't apply.

The way this works is by having a base distribution bundle that includes
the needed __init__.py file and a set of extension bundles that add
other files to the same directory (without including another copy of
__init__.py). The extension bundles include a dependency on the base
package to make sure that it always gets installed first.

Debian has been using that approach for egenix-mx-base for years. Works
great:

http://packages.debian.org/source/lenny/egenix-mx-base

eGenix has been using that approach for mx package add-ons as well -
long before "namespace" packages were given that name :-)
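On disk, the base+extensions layout described here looks roughly like this (subpackage names are illustrative, modeled loosely on the egenix-mx-base bundles):

```text
# Base bundle (installed first) owns the package directory:
mx/
    __init__.py
    BeeBase/
    DateTime/

# Extension bundle depends on the base bundle and only drops extra
# subpackages into the same tree; it ships no __init__.py of its own:
mx/
    Experimental/
```

Because only the base bundle contains __init__.py, the system packages never conflict over a shared file.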

Please note that the PEP is about providing ways to have package parts
live on sys.path that reintegrate themselves into a single package at
import time.

As such it's targeting Python developers that want to ship add-ons to
existing packages, not Linux distributions (they usually have their
own ideas about what goes where - something that's completely out-of-
scope for the PEP).

P.J. Eby

unread,
Apr 14, 2009, 8:32:34 PM4/14/09
to M.-A. Lemburg, Python List, Martin v. Löwis, Python-Dev
At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
>You are missing the point: When breaking up a large package that lives in
>site-packages into smaller distribution bundles, you don't need namespace
>packages at all, so the PEP doesn't apply.
>
>The way this works is by having a base distribution bundle that includes
>the needed __init__.py file and a set of extension bundles the add
>other files to the same directory (without including another copy of
>__init__.py). The extension bundles include a dependency on the base
>package to make sure that it always gets installed first.

If we're going to keep that practice, there's no point to having the
PEP: all three methods (base+extensions, pkgutil, setuptools) all
work just fine as they are, with no changes to importing or the stdlib.

In particular, without the feature of being able to drop that
practice, there would be no reason for setuptools to adopt the
PEP. That's why I'm -1 on your proposal: it's actually inferior to
the methods we already have today.

M.-A. Lemburg

unread,
Apr 15, 2009, 3:51:30 AM4/15/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-15 02:32, P.J. Eby wrote:
> At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
>> You are missing the point: When breaking up a large package that lives in
>> site-packages into smaller distribution bundles, you don't need namespace
>> packages at all, so the PEP doesn't apply.
>>
>> The way this works is by having a base distribution bundle that includes
>> the needed __init__.py file and a set of extension bundles that add
>> other files to the same directory (without including another copy of
>> __init__.py). The extension bundles include a dependency on the base
>> package to make sure that it always gets installed first.
>
> If we're going to keep that practice, there's no point to having the
> PEP: all three methods (base+extensions, pkgutil, setuptools) all work
> just fine as they are, with no changes to importing or the stdlib.

Again: the PEP is about creating a standard for namespace
packages. It's not about making namespace packages easy to use for
Linux distribution maintainers. Instead, it's targeting *developers*
that want to enable shipping a single package in multiple, separate
pieces, giving the user the freedom to select the ones she needs.

Of course, this is possible today using various other techniques. The
point is that there is no standard for namespace packages and that's
what the PEP is trying to solve.

> In particular, without the feature of being able to drop that practice,
> there would be no reason for setuptools to adopt the PEP. That's why
> I'm -1 on your proposal: it's actually inferior to the methods we
> already have today.

It's simpler and more in line with the Python Zen, not inferior.

You are free not to support it in setuptools - the methods
implemented in setuptools will continue to work as they are,
but continue to require support code and, over time, no longer
be compatible with other tools building upon the standard
defined in the PEP.

In the end, it's the user that decides: whether to go with a
standard or not.

--
Marc-Andre Lemburg
eGenix.com


P.J. Eby

unread,
Apr 15, 2009, 10:44:17 AM4/15/09
to M.-A. Lemburg, Python List, Martin v. Löwis, Python-Dev

Up until this point, I've been trying to help you understand the use
cases, but it's clear now that you already understand them, you just
don't care.

That wouldn't be a problem if you just stayed on the sidelines,
instead of actively working to make those use cases more difficult
for everyone else than they already are.

Anyway, since you clearly understand precisely what you're doing, I'm
now going to stop trying to explain things, as my responses are
apparently just encouraging you, and possibly convincing bystanders
that there's some genuine controversy here as well.

Aahz

unread,
Apr 15, 2009, 12:10:33 PM4/15/09
to Python List, Python-Dev
[much quote-trimming, the following is intended to just give the gist,
but the bits quoted below are not in direct response to each other]

On Wed, Apr 15, 2009, P.J. Eby wrote:
> At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote:
>>

>> [...]


>> Again: the PEP is about creating a standard for namespace
>> packages. It's not about making namespace packages easy to use for
>> Linux distribution maintainers. Instead, it's targeting *developers*
>> that want to enable shipping a single package in multiple, separate
>> pieces, giving the user the freedom to select the ones she needs.

>> [...]
>
> [...]


> Anyway, since you clearly understand precisely what you're doing, I'm
> now going to stop trying to explain things, as my responses are
> apparently just encouraging you, and possibly convincing bystanders that
> there's some genuine controversy here as well.

For the benefit of us bystanders, could you summarize your vote at this
point? Given the PEP's intended goals, if you do not oppose the PEP, are
there any changes you think should be made?
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

Why is this newsgroup different from all other newsgroups?

M.-A. Lemburg

unread,
Apr 15, 2009, 12:15:46 PM4/15/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-15 16:44, P.J. Eby wrote:
> At 09:51 AM 4/15/2009 +0200, M.-A. Lemburg wrote:
>> On 2009-04-15 02:32, P.J. Eby wrote:
>> > At 10:59 PM 4/14/2009 +0200, M.-A. Lemburg wrote:
>> >> You are missing the point: When breaking up a large package that
>> lives in
>> >> site-packages into smaller distribution bundles, you don't need
>> namespace
>> >> packages at all, so the PEP doesn't apply.
>> >>
>> >> The way this works is by having a base distribution bundle that
>> includes
>> >> the needed __init__.py file and a set of extension bundles the add
>> >> other files to the same directory (without including another copy of
>> >> __init__.py). The extension bundles include a dependency on the base
>> >> package to make sure that it always gets installed first.
>> >
>> > If we're going to keep that practice, there's no point to having the
>> > PEP: all three methods (base+extensions, pkgutil, setuptools) all work
>> > just fine as they are, with no changes to importing or the stdlib.
>>
>> Again: the PEP is about creating a standard for namespace
>> packages. It's not about making namespace packages easy to use for
>> Linux distribution maintainers. Instead, it's targeting *developers*
>> that want to enable shipping a single package in multiple, separate
>> pieces, giving the user the freedom to select the ones she needs.
>>
>> Of course, this is possible today using various other techniques. The
>> point is that there is no standard for namespace packages and that's
>> what the PEP is trying to solve.
>>
>> > In particular, without the feature of being able to drop that practice,
>> > there would be no reason for setuptools to adopt the PEP. That's why
>> > I'm -1 on your proposal: it's actually inferior to the methods we
>> > already have today.
>>
>> It's simpler and more in line with the Python Zen, not inferior.
>>
>> You are free not to support it in setuptools - the methods
>> implemented in setuptools will continue to work as they are,
>> but continue to require support code and, over time, no longer
>> be compatible with other tools building upon the standard
>> defined in the PEP.
>>
>> In the end, it's the user that decides: whether to go with a
>> standard or not.
>
> Up until this point, I've been trying to help you understand the use
> cases, but it's clear now that you already understand them, you just
> don't care.
>
> That wouldn't be a problem if you just stayed on the sidelines, instead
> of actively working to make those use cases more difficult for everyone
> else than they already are.
>
> Anyway, since you clearly understand precisely what you're doing, I'm
> now going to stop trying to explain things, as my responses are
> apparently just encouraging you, and possibly convincing bystanders that
> there's some genuine controversy here as well.

Hopefully, bystanders will understand that the one single use case
you are always emphasizing, namely that of Linux distribution maintainers
trying to change the package installation layout, is really a rather
uncommon and rare use case.

It is true that I do understand what the namespace package idea is
all about. I've been active in Python package development since they
were first added to Python as a new built-in import feature in Python 1.5
and have been distributing packages with package add-ons for more than
a decade...

For some history, have a look at:

http://www.python.org/doc/essays/packages.html

Also note how that essay discourages the use of .pth files:

"""
If the package really requires adding one or more directories on sys.path (e.g.
because it has not yet been structured to support dotted-name import), a "path
configuration file" named package.pth can be placed in either the site-python or
site-packages directory.
...
A typical installation should have no or very few .pth files or something is
wrong, and if you need to play with the search order, something is very wrong.
"""

Back to the PEP:

The much more common use case is that of wanting to have a base package
installation with optional add-ons that live in the same logical
package namespace.

The PEP provides a way to solve this use case by giving both developers
and users a standard they can follow without having to rely on
non-standard helpers, and which works across Python implementations.

My proposal tries to solve this without adding yet another .pth
file like mechanism - hopefully in the spirit of the original Python
package idea.

P.J. Eby

unread,
Apr 15, 2009, 1:49:20 PM4/15/09
to Aahz, Python List, Python-Dev
At 09:10 AM 4/15/2009 -0700, Aahz wrote:
>For the benefit of us bystanders, could you summarize your vote at this
>point? Given the PEP's intended goals, if you do not oppose the PEP, are
>there any changes you think should be made?

I'm +1 on Martin's original version of the PEP, subject to the point
brought up by someone that .pkg should be changed to a different extension.

I'm -1 on all of MAL's proposed revisions, as IMO they are a step
backwards: they "standardize" an approach that will create problems
that don't need to exist, and don't exist now. Martin's proposal is
an improvement on the status quo, Marc's proposal is a dis-improvement.

P.J. Eby

unread,
Apr 15, 2009, 1:59:34 PM4/15/09
to M.-A. Lemburg, Python List, "Martin v. Löwis", Python-Dev
At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
>The much more common use case is that of wanting to have a base package
>installation with optional add-ons that live in the same logical
>package namespace.

Please see the large number of Zope and PEAK distributions on PyPI as
minimal examples that disprove this being the common use case. I
expect you will find a fair number of others, as well.

In these cases, there is NO "base package"... the entire point of
using namespace packages for these distributions is that a "base
package" is neither necessary nor desirable.

In other words, the "base package" scenario is the exception these
days, not the rule. I actually know specifically of only one other
such package besides your mx.* case, the logilab ll.* package.

M.-A. Lemburg

unread,
Apr 15, 2009, 2:00:42 PM4/15/09
to James Y Knight, P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-15 19:38, James Y Knight wrote:

>
> On Apr 15, 2009, at 12:15 PM, M.-A. Lemburg wrote:
>
>> The much more common use case is that of wanting to have a base package
>> installation with optional add-ons that live in the same logical
>> package namespace.
>>
>> The PEP provides a way to solve this use case by giving both developers
>> and users a standard at hand which they can follow without having to
>> rely on some non-standard helpers and across Python implementations.
>
> I'm not sure I understand what advantage your proposal gives over the
> current mechanism for doing this.
>
> That is, add to your __init__.py file:
>
> from pkgutil import extend_path
> __path__ = extend_path(__path__, __name__)
>
> Can you describe the intended advantages over the status-quo a bit more
> clearly?

Simple: you don't need the above lines in your __init__.py file
anymore and can rely on a Python standard for namespace packages
instead of some helper implementation.

The fact that you have a __pkg__.py file in your package dir
will signal the namespace package character to Python's importer
and this will take care of the lookup process for you.

Namespace packages will be just as easy to write, install and
maintain as regular Python packages.
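For comparison, the status-quo mechanism that this thread keeps referring to can be exercised end-to-end with the stdlib's pkgutil.extend_path. The sketch below builds two portions of a hypothetical package "ns" on the fly (all directory and module names are invented for the demo):

```python
import os
import sys
import tempfile

# Create two sys.path entries, each contributing one portion of package "ns".
base = tempfile.mkdtemp()
portions = []
for i, mod in enumerate(("feature1", "feature2")):
    root = os.path.join(base, "portion%d" % i)
    pkg = os.path.join(root, "ns")
    os.makedirs(pkg)
    # Every portion must ship the same extend_path boilerplate.
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        f.write("from pkgutil import extend_path\n"
                "__path__ = extend_path(__path__, __name__)\n")
    with open(os.path.join(pkg, mod + ".py"), "w") as f:
        f.write("name = %r\n" % mod)
    portions.append(root)

sys.path[:0] = portions

# Whichever portion's __init__.py runs first, extend_path merges the rest.
from ns import feature1, feature2
print(feature1.name, feature2.name)  # -> feature1 feature2
```

Note how every portion has to carry an identical `__init__.py` -- exactly the duplication that both proposals in this thread aim to remove.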

James Y Knight

unread,
Apr 15, 2009, 1:38:19 PM4/15/09
to M.-A. Lemburg, P.J. Eby, Python List, Python-Dev, "Martin v. Löwis"

On Apr 15, 2009, at 12:15 PM, M.-A. Lemburg wrote:

> The much more common use case is that of wanting to have a base package
> installation with optional add-ons that live in the same logical
> package namespace.
>
> The PEP provides a way to solve this use case by giving both
> developers
> and users a standard at hand which they can follow without having to
> rely on some non-standard helpers and across Python implementations.

I'm not sure I understand what advantage your proposal gives over the
current mechanism for doing this.

That is, add to your __init__.py file:

from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)

Can you describe the intended advantages over the status-quo a bit
more clearly?

James

M.-A. Lemburg

unread,
Apr 15, 2009, 2:09:11 PM4/15/09
to P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
On 2009-04-15 19:59, P.J. Eby wrote:

> At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
>> The much more common use case is that of wanting to have a base package
>> installation with optional add-ons that live in the same logical
>> package namespace.
>
> Please see the large number of Zope and PEAK distributions on PyPI as
> minimal examples that disprove this being the common use case. I expect
> you will find a fair number of others, as well.
>
> In these cases, there is NO "base package"... the entire point of using
> namespace packages for these distributions is that a "base package" is
> neither necessary nor desirable.
>
> In other words, the "base package" scenario is the exception these days,
> not the rule. I actually know specifically of only one other such
> package besides your mx.* case, the logilab ll.* package.

So now you're arguing against having base packages... at least you've
dropped the strange idea of using Linux distribution maintainers
as the central use case ;-)

Think of base namespace packages (the ones providing the __init__.py
file) as defining the namespace. They set up ownership and the basic
infrastructure needed by add-ons.

If you take Zope as example, the Products/ package dir is a good
example: the __init__.py file in that directory is provided by the
Zope installation (generated during Zope instance creation), so Zope
"owns" the package.

With the proposal, Zope could declare this package dir a namespace
base package by adding a __pkg__.py file to it.

Zope add-ons could then be installed somewhere else on sys.path
and include a Products/ dir as well, only this time it doesn't have
the __init__.py file, but only a __pkg__.py file.

Python would then take care of integrating the add-on Products/ dir's
module and package contents with the base package.

syt

unread,
Apr 16, 2009, 4:00:47 AM4/16/09
to
On Apr 14, 6:27 pm, "P.J. Eby" <p...@telecommunity.com> wrote:
> I think you've misunderstood something about the use case.  System
> packaging tools don't like separate packages to contain the *same
> file*.  That means that they *can't* split a larger package up with
> your proposal, because every one of those packages would have to
> contain a __pkg__.py -- and thus be in conflict with each
> other.  Either that, or they would have to make a separate system
> package containing *only* the __pkg__.py, and then make all packages
> using the namespace depend on it -- which is more work and requires
> greater co-ordination among packagers.

I've maybe missed some point, but doesn't the PEP require
coordination so that *.pkg files have different names in each portion?
The same applies if one wants to provide a non-empty __init__.py.

Also, providing a main package with a non-empty __init__.py and others
with no __init__.py turns all this into the "base package" scenario
described later in this discussion, right?

BTW, it's unclear to me what distinction you draw between this usage
and Zope's or PEAK's.

> Allowing each system package to contain its own .pkg or .nsp or
> whatever files, on the other hand, allows each system package to be
> built independently, without conflict between contents (i.e., having
> the same file), and without requiring a special pseudo-package to
> contain the additional file.

As said above, provided some conventions are respected...

What's worrying me is that as time goes on, the import mechanism becomes
more and more complicated, with more and more tricks involved. Of
course I agree we should unify the way namespace packages are handled,
and this should live in the Python stdlib. What I like in MAL's
proposal is that it makes things simpler... Another point: I don't
like .pth and .pkg files. Isn't this PEP an opportunity to at least
unify them?
--
Sylvain Thénault

"Martin v. Löwis"

unread,
Apr 16, 2009, 5:33:36 PM4/16/09
to syt
> I've maybe missed some point, but doesn't the PEP require
> coordination so that *.pkg files have different names in each portion?
> The same applies if one wants to provide a non-empty __init__.py.

To some degree, coordination is necessary. However, the PEP recommends
that you use <distribution>.pkg as the name; IMO, that should be
sufficient (at least when all competing packages are on PyPI, which
requires unique distribution names).

>> Allowing each system package to contain its own .pkg or .nsp or
>> whatever files, on the other hand, allows each system package to be
>> built independently, without conflict between contents (i.e., having
>> the same file), and without requiring a special pseudo-package to
>> contain the additional file.
>
> As said above, provided some conventions are respected...

Yes, however, these are easy to achieve. If a conflict is ever
encountered, the author of the package violating the convention
is asked to follow it, and he usually will - or a fork will occur.

> Another point: I don't
> like .pth, .pkg files. Isn't this pep an opportunity to at least unify
> them?

I don't see this as a problem. I can add it to the discussion section if
you want.

Regards,
Martin

"Martin v. Löwis"

unread,
Apr 29, 2009, 2:41:20 AM4/29/09
to syt
syt wrote:
> Another point: I don't like .pth, .pkg files. Isn't this pep an opportunity
> to at least unify them?

Can you propose a unification? I'm concerned that if I propose one,
you may still not like it.

Regards,
Martin

Chris Withers

unread,
May 1, 2009, 12:26:29 PM5/1/09
to M.-A. Lemburg, P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
M.-A. Lemburg wrote:
> """
> If the package really requires adding one or more directories on sys.path (e.g.
> because it has not yet been structured to support dotted-name import), a "path
> configuration file" named package.pth can be placed in either the site-python or
> site-packages directory.
> ...
> A typical installation should have no or very few .pth files or something is
> wrong, and if you need to play with the search order, something is very wrong.
> """

I'll say! I think .pth files are absolute evil and I wish they could
just be banned.

+1 on anything that makes them closer to going away or reduces the
possibility of yet another similar feature from hurting the
comprehensibility of a python setup.

Chris Withers

unread,
May 1, 2009, 12:30:16 PM5/1/09
to M.-A. Lemburg, P.J. Eby, Python List, "Martin v. Löwis", Python-Dev
M.-A. Lemburg wrote:
> The much more common use case is that of wanting to have a base package
> installation with optional add-ons that live in the same logical
> package namespace.
>
> The PEP provides a way to solve this use case by giving both developers
> and users a standard at hand which they can follow without having to
> rely on some non-standard helpers and across Python implementations.
>
> My proposal tries to solve this without adding yet another .pth
> file like mechanism - hopefully in the spirit of the original Python
> package idea.

Okay, I need to issue a plea for a little help.

I think I kinda get what this PEP is about now, and as someone who wants
to ship a base package with several add-ons that live in the same
logical package namespace, I'm very interested.

However, despite trying to follow this thread *and* having tried to read
the PEP a couple of times, I still don't know how I'd go about doing this.

I did give some examples from what I'd be looking to do much earlier.

I'll ask again in the vague hope of you or someone else explaining
things to me like I'm a 5 year old - something I'm mentally equipped to
be well ;-)

In either of the proposals on the table, what code would I write and
where to have a base package with a set of add-on packages?

Simple examples would be greatly appreciated, and might bring things
into focus for some of the less mentally able bystanders - like myself!

cheers,

Chris Withers

unread,
May 1, 2009, 12:32:14 PM5/1/09
to P.J. Eby, Python List, Python-Dev, "Martin v. Löwis", M.-A. Lemburg
P.J. Eby wrote:

> At 06:15 PM 4/15/2009 +0200, M.-A. Lemburg wrote:
>> The much more common use case is that of wanting to have a base package
>> installation with optional add-ons that live in the same logical
>> package namespace.
>
> Please see the large number of Zope and PEAK distributions on PyPI as
> minimal examples that disprove this being the common use case.

If you mean "the common use case as opposed to having code in the
__init__.py of the namespace package", I think you'll find that's
because people (especially me!) don't know how to do this, not because
we don't want to!

Chris - who would actually like to know how to do this, with or without
the PEP, and how to indicate interdependencies in situations like this
to setuptools...

"Martin v. Löwis"

unread,
May 1, 2009, 12:41:03 PM5/1/09
to Chris Withers, P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
> In either of the proposals on the table, what code would I write and
> where to have a base package with a set of add-on packages?

I don't quite understand the question. Why would you want to write code
(except for the code that actually is in the packages)?

PEP 382 is completely declarative - no need to write code.

Regards,
Martin

Scott David Daniels

unread,
May 1, 2009, 12:48:54 PM5/1/09
to
Chris Withers wrote:
> M.-A. Lemburg wrote:
>> """
>> If the package really requires adding one or more directories on
>> sys.path (e.g. because it has not yet been structured to support
>> dotted-name import), a "path configuration file" named package.pth
>> can be placed in either the site-python or site-packages directory....
> I'll say! I think .pth files are absolute evil and I wish they could
> just be banned.
> +1 on anything that makes them closer to going away or reduces the
> possibility of yet another similar feature from hurting the
> comprehensibility of a python setup.

".pth" files are great when used in moderation. Especially when used
with the new per-user location searched when setting up sys.path.
I can move my current project's directory name into a .pth file that
I change as I switch projects. I can also throw in a reference to
my convenience tools (dinky little functions and classes that I would
otherwise waste time scratching together each time I needed them).
They obviate the need to fiddle with site.py.

--Scott David Daniels
Scott....@Acm.Org

Chris Withers

unread,
May 1, 2009, 12:58:18 PM5/1/09
to "Martin v. Löwis", P.J. Eby, Python List, Python-Dev, M.-A. Lemburg

"code" is anything I need to write to make this work...

So, what do I need to do?

Chris

"Martin v. Löwis"

unread,
May 1, 2009, 1:38:12 PM5/1/09
to Chris Withers, P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
>>> In either of the proposals on the table, what code would I write and
>>> where to have a base package with a set of add-on packages?
>>
>> I don't quite understand the question. Why would you want to write code
>> (except for the code that actually is in the packages)?
>>
>> PEP 382 is completely declarative - no need to write code.
>
> "code" is anything I need to write to make this work...
>
> So, what do I need to do?

Ok, so create three tar files:

1. base.tar, containing

simplistix/
simplistix/__init__.py

2. addon1.tar, containing

simplistix/addon1.pth (containing a single "*")
simplistix/feature1.py

3. addon2.tar, containing

simplistix/addon2.pth
simplistix/feature2.py

Unpack each of them anywhere on sys.path, in any order.
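Those three archives can also be assembled programmatically. The sketch below builds them in memory with the stdlib tarfile module; the file names and the single-`*` .pth content are taken from Martin's example above (the in-package .pth semantics are specific to the PEP 382 draft, not to any released Python):

```python
import io
import tarfile

def make_tar(members):
    """Build an in-memory tar archive from a {name: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tf:
        for name, data in members.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tf.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf

base = make_tar({"simplistix/__init__.py": b"# package code may live here\n"})
addon1 = make_tar({
    "simplistix/addon1.pth": b"*\n",  # marks this directory as a portion
    "simplistix/feature1.py": b"feature = 1\n",
})
addon2 = make_tar({
    "simplistix/addon2.pth": b"*\n",
    "simplistix/feature2.py": b"feature = 2\n",
})

with tarfile.open(fileobj=addon1) as tf:
    print(tf.getnames())  # -> ['simplistix/addon1.pth', 'simplistix/feature1.py']
```

Only the base archive ships an `__init__.py`; each add-on ships its own uniquely named .pth file, so no two archives contain the same file.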

Regards,
Martin

Tim Golden

unread,
May 1, 2009, 3:13:13 PM5/1/09
to Python List
Chris Withers wrote:
> I'll say! I think .pth files are absolute evil and I wish they could
> just be banned.
>
> +1 on anything that makes them closer to going away or reduces the
> possibility of yet another similar feature from hurting the
> comprehensibility of a python setup.

I've seen this view expressed by you (and others) a number
of times, but I use them myself in my admittedly simple
circumstances without any problem. Would you mind sharing
what's so bad about them?

TJG

Chris Withers

unread,
May 9, 2009, 5:06:52 AM5/9/09
to "Martin v. Löwis", P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
Martin v. Löwis wrote:
> Ok, so create three tar files:
>
> 1. base.tar, containing
>
> simplistix/
> simplistix/__init__.py

So this __init__.py can have code in it? And base.tar can have other
modules and subpackages in it?
What happens if the base and an addon both define a package called
simplistix.somepackage?

> 2. addon1.tar, containing
>
> simplistix/addon1.pth (containing a single "*")

What does that * mean? I thought .pth files just had python in them?

> Unpack each of them anywhere on sys.path, in any order.

How would this work if base, addon1 and addon2 were eggs managed by
buildout or setuptools?

cheers,

"Martin v. Löwis"

unread,
May 9, 2009, 5:27:22 AM5/9/09
to Chris Withers, P.J. Eby, Python List, M.-A. Lemburg, Python-Dev
>> Ok, so create three tar files:
>>
>> 1. base.tar, containing
>>
>> simplistix/
>> simplistix/__init__.py
>
> So this __init__.py can have code in it?

That's the point, yes.

> And base.tar can have other modules and subpackages in it?

Certainly, yes.

> What happens if the base and an addon both define a package called
> simplistix.somepackage?

Depends on whether simplistix.somepackage is a namespace package
(it should be). If so, they get merged just as any other namespace
package.

>> 2. addon1.tar, containing
>>
>> simplistix/addon1.pth (containing a single "*")
>
> What does that * mean?

See PEP 382 (search for "*").

> I thought .pth files just had python in them?

Not at all - they never did. They have paths in them.

>> Unpack each of them anywhere on sys.path, in any order.
>
> How would this work if base, addon1 and addon2 were eggs managed by
> buildout or setuptools?

What is a managed egg (i.e. what kind of management does buildout
or setuptools apply to it)?

Regards,
Martin

Zooko O'Whielacronx

unread,
May 9, 2009, 9:49:13 AM5/9/09
to Chris Withers, P.J. Eby, Python List, Python-Dev, "Martin v. Löwis", M.-A. Lemburg
.pth files are why I can't easily use GNU stow with easy_install.
If installing a Python package involved writing new files into the
filesystem, but did not require reading, updating, and re-writing any
extant files such as .pth files, then GNU stow would Just Work with
easy_install the way it Just Works with most things.

Regards,

Zooko

Chris Withers

unread,
May 9, 2009, 10:10:23 AM5/9/09
to "Martin v. Löwis", P.J. Eby, Python List, M.-A. Lemburg, Python-Dev
Martin v. Löwis wrote:
>> So this __init__.py can have code in it?
>
> That's the point, yes.
>
>> And base.tar can have other modules and subpackages in it?
>
> Certainly, yes.

Great, when is the PEP due to land in 2.x? ;-)

>> What happens if the base and an addon both define a package called
>> simplistix.somepackage?
>
> Depends on whether simplistix.somepackage is a namespace package
> (it should). If so, they get merged just as any other namespace
> package.

Sorry, I was looking at potential bug cases here. What happens if it's
not a namespace package?

> See PEP 382 (search for "*").
>
>> I thought .pth files just had python in them?
>
> Not at all - they never did. They have paths in them.

I've certainly seen them with python in, and that's what I hate about
them...

>>> Unpack each of them anywhere on sys.path, in any order.
>> How would this work if base, addon1 and addon2 were eggs managed by
>> buildout or setuptools?
>
> What is a managed egg (i.e. what kind of management does buildout
> or setuptools apply to it)?

Sorry, bad wording on my part... I guess I meant more: how would
buildout/setuptools go about installing/uninstalling/etc. packages
that conform to PEP 382? Would setuptools/buildout need modification or
would the changes take effect lower down in the stack?

"Martin v. Löwis"

unread,
May 9, 2009, 10:18:44 AM5/9/09
to Zooko O'Whielacronx, P.J. Eby, Python List, Chris Withers, Python-Dev, M.-A. Lemburg

Please understand that this is the fault of easy_install, not of .pth
files. There is no technical need for easy_install to rewrite .pth
files on installation. It could just as well have created new .pth
files, rather than modifying existing ones.

If you always use --single-version-externally-managed with easy_install,
it will stop editing .pth files on installation.

Regards,
Martin

"Martin v. Löwis"

unread,
May 9, 2009, 10:32:39 AM5/9/09
to Chris Withers, P.J. Eby, Python List, M.-A. Lemburg, Python-Dev
Chris Withers wrote:
> Martin v. Löwis wrote:
>>> So this __init__.py can have code in it?
>>
>> That's the point, yes.
>>
>>> And base.tar can have other modules and subpackages in it?
>>
>> Certainly, yes.
>
> Great, when is the PEP due to land in 2.x? ;-)

Most likely, never - it probably will be implemented only after
the last feature release of 2.x was made.

>>> What happens if the base and an addon both define a package called
>>> simplistix.somepackage?
>>
>> Depends on whether simplistix.somepackage is a namespace package
>> (it should). If so, they get merged just as any other namespace
>> package.
>
> Sorry, I was looking at potential bug cases here. What happens if it's
> not a namespace package?

Then it will be imported as a regular child package.

>>>> Unpack each of them anywhere on sys.path, in any order.
>>> How would this work if base, addon1 and addon2 were eggs managed by
>>> buildout or setuptools?
>>
>> What is a managed egg (i.e. what kind of management does buildout
>> or setuptools apply to it)?
>
> Sorry, bad wording on my part... I guess I meant more: how would
> buildout/setuptools go about installing/uninstalling/etc. packages
> that conform to PEP 382? Would setuptools/buildout need modification or
> would the changes take effect lower down in the stack?

Unfortunately, I don't know precisely what they do, so I don't know
whether any of it needs modification.

All I can say is that if they want to install namespace packages
using the mechanism of PEP 382, they will have to produce the file
layout specified in the PEP.

For distutils (which is the only library in that area that I do know),
I think just installing any .pth files inside a package would be
sufficient.

Regards,
Martin

P.J. Eby

unread,
May 9, 2009, 10:41:02 AM5/9/09
to Martin v. Löwis, Zooko O'Whielacronx, Python List, Chris Withers, Python-Dev, M.-A. Lemburg

It's --multi-version (-m) that does
that. --single-version-externally-managed is a "setup.py install" option.

Both have the effect of not editing .pth files, but they do so in
different ways. The "setup.py install" option causes it to install
in a distutils-compatible layout, whereas --multi-version simply
drops .egg files or directories in the target location and leaves it
to the user (or the generated script wrappers) to add them to sys.path.

"Martin v. Löwis"

unread,
May 9, 2009, 10:42:01 AM5/9/09
to P.J. Eby, Python List, Zooko O'Whielacronx, Chris Withers, Python-Dev, M.-A. Lemburg
>> If you always use --single-version-externally-managed with easy_install,
>> it will stop editing .pth files on installation.
>
> It's --multi-version (-m) that does that.
> --single-version-externally-managed is a "setup.py install" option.
>
> Both have the effect of not editing .pth files, but they do so in
> different ways. The "setup.py install" option causes it to install in a
> distutils-compatible layout, whereas --multi-version simply drops .egg
> files or directories in the target location and leaves it to the user
> (or the generated script wrappers) to add them to sys.path.

Ah, ok. Is there also an easy_install invocation that unpacks the zip
file into some location of sys.path (which then wouldn't require
editing sys.path)?

Regards,
Martin

P.J. Eby

unread,
May 9, 2009, 11:39:52 AM5/9/09
to Martin v. Löwis, Python List, Zooko O'Whielacronx, Chris Withers, Python-Dev, M.-A. Lemburg

Not as yet. I'm sort of waiting to see what comes out of PEP 376
discussions re: an installation manifest... but then, if I actually
had time to work on it right now, I'd probably just implement something.

Currently, you can use pip to do that, though, as long as the
packages you want are in source form. pip doesn't unzip eggs as yet.

It would be really straightforward, though, for someone to implement
an easy_install variant that does this. Just invoke "easy_install
-Zmaxd /some/tmpdir packagelist" to get a full set of unpacked .egg
directories in /some/tmpdir, and then move the contents of the
resulting .egg subdirs to the target location, renaming EGG-INFO
subdirs to projectname-version.egg-info subdirs.

(Of course, this ignores the issue of uninstalling previous versions,
or overwriting of conflicting files in the target -- does pip handle these?)
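The variant described above is straightforward to sketch. The helper below is hypothetical (not part of setuptools or pip); it performs only the move-and-rename step on a directory of already-unpacked .egg dirs, and deliberately ignores the uninstall/conflict questions raised at the end:

```python
import os
import shutil

def flatten_eggs(tmpdir, target):
    """Move the contents of unpacked .egg dirs (as produced by
    `easy_install -Zmaxd tmpdir ...`) into `target`, renaming each
    EGG-INFO subdir to <projectname-version>.egg-info."""
    for egg in os.listdir(tmpdir):
        if not egg.endswith(".egg"):
            continue  # skip anything that isn't an unpacked egg dir
        eggdir = os.path.join(tmpdir, egg)
        for entry in os.listdir(eggdir):
            src = os.path.join(eggdir, entry)
            if entry == "EGG-INFO":
                # Foo-1.0-py2.5.egg/EGG-INFO -> target/Foo-1.0-py2.5.egg-info
                dst = os.path.join(target, egg[:-len(".egg")] + ".egg-info")
            else:
                dst = os.path.join(target, entry)
            shutil.move(src, dst)
```

Existing files in `target` would simply be overwritten by `shutil.move`, which is exactly the conflict case the parenthetical above asks about.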

Дамјан Георгиевски

unread,
May 9, 2009, 2:40:06 PM5/9/09
to
> Ah, ok. Is there also an easy_install invocation that unpacks the zip
> file into some location of sys.path (which then wouldn't require
> editing sys.path)?

You have pip that does that :)


--
дамјан ( http://softver.org.mk/damjan/ )

... knowledge is exactly like power - something
to be distributed as widely as humanly possible,
for the betterment of all. -- jd

Zooko Wilcox-O'Hearn

unread,
May 10, 2009, 11:41:33 AM5/10/09
to P.J. Eby, Python List, M.-A. Lemburg, Martin v. Löwis, Python-Dev
On May 9, 2009, at 9:39 AM, P.J. Eby wrote:

> It would be really straightforward, though, for someone to
> implement an easy_install variant that does this. Just invoke
> "easy_install -Zmaxd /some/tmpdir packagelist" to get a full set of
> unpacked .egg directories in /some/tmpdir, and then move the
> contents of the resulting .egg subdirs to the target location,
> renaming EGG-INFO subdirs to projectname-version.egg-info subdirs.

Except for the renaming part, this is exactly what GNU stow does.

> (Of course, this ignores the issue of uninstalling previous
> versions, or overwriting of conflicting files in the target -- does
> pip handle these?)

GNU stow does handle these issues.

Regards,

Zooko

"Martin v. Löwis"

unread,
May 10, 2009, 1:18:16 PM5/10/09
to Zooko Wilcox-O'Hearn, P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
> GNU stow does handle these issues.

If GNU stow solves all your problems, why do you want to
use easy_install in the first place?

Regards,
Martin

Zooko Wilcox-O'Hearn

unread,
May 10, 2009, 2:04:57 PM5/10/09
to Martin v. Löwis, P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:

> If GNU stow solves all your problems, why do you want to use
> easy_install in the first place?

That's a good question. The answer is that there are two separate
jobs: building executables and putting them in a directory structure
of the appropriate shape for your system is one job, and installing
or uninstalling that tree into your system is another. GNU stow does
only the latter.

The input to GNU stow is a set of executables, library files, etc.,
in a directory tree that is of the right shape for your system. For
example, if you are on a Linux system, then your scripts all need to
be in $prefix/bin/, your shared libs should be in $prefix/lib, your
Python packages ought to be in $prefix/lib/python$x.$y/site-packages/,
etc. GNU stow is blissfully ignorant about all issues of
building binaries, and choosing where to place files, etc. -- that's
the job of the build system of the package, e.g. the "./configure
--prefix=foo && make && make install" for most C packages, or the
"python ./setup.py install --prefix=foo" for Python packages using
distutils (footnote 1).

Once GNU stow has the well-shaped directory which is the output of
the build process, then it follows a very dumb, completely reversible
(uninstallable) process of symlinking those files into the system
directory structure.
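The "very dumb" symlinking process described above can be sketched in a
few lines of Python. This is a hypothetical toy for illustration only
(real GNU stow is a Perl program with tree-folding and many more
safeguards); the function names toy_stow/toy_unstow are invented here:

```python
import os

def toy_stow(pkg_dir, target_dir):
    # Walk the well-shaped package tree and symlink every file into the
    # same relative location under the target, refusing to overwrite
    # anything that already exists (the "dumb" conflict rule).
    for dirpath, _dirnames, filenames in os.walk(pkg_dir):
        rel = os.path.relpath(dirpath, pkg_dir)
        dest_dir = target_dir if rel == '.' else os.path.join(target_dir, rel)
        if not os.path.isdir(dest_dir):
            os.makedirs(dest_dir)
        for name in filenames:
            dest = os.path.join(dest_dir, name)
            if os.path.lexists(dest):
                raise RuntimeError('conflict: %s already exists' % dest)
            os.symlink(os.path.join(dirpath, name), dest)

def toy_unstow(pkg_dir, target_dir):
    # Uninstall is just as dumb and completely reversible: remove
    # exactly the symlinks that point back into the package tree.
    for dirpath, _dirnames, filenames in os.walk(pkg_dir):
        rel = os.path.relpath(dirpath, pkg_dir)
        dest_dir = target_dir if rel == '.' else os.path.join(target_dir, rel)
        for name in filenames:
            dest = os.path.join(dest_dir, name)
            if os.path.islink(dest):
                os.remove(dest)
```

Because installation is nothing but symlink creation, uninstallation is
nothing but symlink removal, which is why the process is so easy to
reason about.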

It is a beautiful, elegant hack because it is sooo dumb. It is also
very nice to use the same tool to manage packages written in any
programming language, provided only that they can build a directory
tree of the right shape and content.

However, there are lots of things that it doesn't do, such as
automatically acquiring and building dependencies, or producing
executables for the target platform for each of your console
scripts. Not to mention creating a directory named
"$prefix/lib/python$x.$y/site-packages" and cp'ing your Python files
into it. That's
why you still need a build system even if you use GNU stow for an
install-and-uninstall system.

The thing that prevents this from working with setuptools is that
setuptools creates a file named easy_install.pth during the "python
./setup.py install --prefix=foo" step. If you build two different
Python packages this way, they will each create an easy_install.pth file,
and then when you ask GNU stow to link the two resulting packages
into your system, it will say "You are asking me to install two
different packages which both claim that they need to write a file
named '/usr/local/lib/python2.5/site-packages/easy_install.pth'. I'm
too dumb to deal with this conflict, so I give up.". If I understand
correctly, your (MvL's) suggestion that easy_install create a .pth
file named "easy_install-$PACKAGE-$VERSION.pth" instead of
"easy_install.pth" would indeed make it work with GNU stow.

Regards,

Zooko

footnote 1: Aside from the .pth file issue, the other reason that
setuptools doesn't work for this use while distutils does is that
setuptools tries too hard to save you from making a mistake: maybe you
don't know what you are doing if you ask it to install into a
previously non-existent prefix dir "foo". This one is easier to fix:
http://bugs.python.org/setuptools/issue54 # "be more like distutils
with regard to --prefix=" .

"Martin v. Löwis"

unread,
May 10, 2009, 2:21:48 PM5/10/09
to Zooko Wilcox-O'Hearn, P.J. Eby, Python List, Python-Dev, M.-A. Lemburg
Zooko Wilcox-O'Hearn wrote:

> On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:
>
>> If GNU stow solves all your problems, why do you want to use
>> easy_install in the first place?
>
> That's a good question. The answer is that there are two separate jobs:
> building executables and putting them in a directory structure of the
> appropriate shape for your system is one job, and installing or
> uninstalling that tree into your system is another. GNU stow does only
> the latter.

And so does easy_install - its job is *not* to build the executables
and to put them in a directory structure. Instead, it's
distutils/setuptools which has this job.

The primary purpose of easy_install is to download the files from PyPI
(IIUC).

> The thing that prevents this from working with setuptools is that
> setuptools creates a file named easy_install.pth

It will stop doing that if you ask nicely. That's why I recommended
earlier that you do ask it not to edit .pth files.

> If I understand correctly,
> your (MvL's) suggestion that easy_install create a .pth file named
> "easy_install-$PACKAGE-$VERSION.pth" instead of "easy_install.pth" would
> indeed make it work with GNU stow.

My recommendation is that you use the already existing flag to
setup.py install that stops it from editing .pth files.

Regards,
Martin

Zooko O'Whielacronx

unread,
May 10, 2009, 2:21:57 PM5/10/09
to Zooko Wilcox-O'Hearn, P.J. Eby, Python List, M.-A. Lemburg, Martin v. Löwis, Python-Dev
following-up to my own post to mention one very important reason why
anyone cares:

On Sun, May 10, 2009 at 12:04 PM, Zooko Wilcox-O'Hearn <zo...@zooko.com> wrote:

> It is a beautiful, elegant hack because it is sooo dumb.  It is also very
> nice to use the same tool to manage packages written in any programming
> language, provided only that they can build a directory tree of the right
> shape and content.

And, you are not relying on the author of the package that you are
installing to avoid accidentally or maliciously screwing up your
system. You're not even relying on the authors of the *build system*
(e.g. the authors of distutils or easy_install). You are relying
*only* on GNU stow to avoid accidentally or maliciously screwing up
your system, and GNU stow is very dumb, so it is easy to understand
what it is going to do and why that isn't going to irreversibly screw
up your system.

That is: you don't run the "build yourself and install into $prefix"
step as root. This is an important consideration for a lot of people,
who absolutely refuse on principle to ever run "sudo python
./setup.py" on a system that they care about unless they wrote the
"setup.py" script themselves. (Likewise they refuse to run "sudo make
install" on packages written in C.)

Regards,

Zooko

P.J. Eby

unread,
May 10, 2009, 2:48:46 PM5/10/09
to Zooko Wilcox-O'Hearn, Martin v. Löwis, Python List, Python-Dev, M.-A. Lemburg
At 12:04 PM 5/10/2009 -0600, Zooko Wilcox-O'Hearn wrote:
>The thing that prevents this from working with setuptools is that
>setuptools creates a file named easy_install.pth during the "python
>./ setup.py install --prefix=foo" if you build two different Python
>packages this way, they will each create an easy_install.pth file,
>and then when you ask GNU stow to link the two resulting packages
>into your system, it will say "You are asking me to install two
>different packages which both claim that they need to write a file
>named '/usr/local/lib/python2.5/site-packages/easy_install.pth'.

Adding --record and --single-version-externally-managed to that
command line will prevent the .pth file from being used or needed,
although I believe you already know this.

(What that mode won't do is install dependencies automatically.)

Nick Craig-Wood

unread,
May 11, 2009, 5:30:04 AM5/11/09
to
Zooko Wilcox-O'Hearn <zo...@zooko.com> wrote:
> On May 10, 2009, at 11:18 AM, Martin v. Löwis wrote:
>
> > If GNU stow solves all your problems, why do you want to use
> > easy_install in the first place?
>
> That's a good question. The answer is that there are two separate
> jobs: building executables and putting them in a directory structure
> of the appropriate shape for your system is one job, and installing
> or uninstalling that tree into your system is another. GNU stow does
> only the latter.
>
> The input to GNU stow is a set of executables, library files, etc.,
> in a directory tree that is of the right shape for your system. For
> example, if you are on a Linux system, then your scripts all need to
> be in $prefix/bin/, your shared libs should be in $prefix/lib, your
> Python packages ought to be in $prefix/lib/python$x.$y/site-
> packages/, etc. GNU stow is blissfully ignorant about all issues of
> building binaries, and choosing where to place files, etc. -- that's
> the job of the build system of the package, e.g. the "./configure --
> prefix=foo && make && make install" for most C packages, or the
> "python ./setup.py install --prefix=foo" for Python packages using
> distutils (footnote 1).
>
> Once GNU stow has the well-shaped directory which is the output of
> the build process, then it follows a very dumb, completely reversible
> (uninstallable) process of symlinking those files into the system
> directory structure.

Once you've got that well formed directory structure it is very easy
to make it into a package (eg deb or rpm) so that idea is useful in
general for package managers, not just stow.

--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick

Giuseppe Ottaviano

unread,
May 11, 2009, 8:26:49 AM5/11/09
to Python List, Python-Dev
Talking of stow, I take advantage of this thread to do some shameless
advertising :)
Recently I uploaded to PyPI a software of mine, BPT [1], which does
the same symlinking trick of stow, but it is written in Python (and
with a simple api) and, more importantly, it allows with another trick
the relocation of the installation directory (it creates a semi-
isolated environment, similar to virtualenv).
I find it very convenient when I have to switch between several
versions of the same packages (for example during development), or I
have to deploy on the same machine software that needs different
versions of the dependencies.

I am planning to write an integration layer with buildout and
easy_install. It should be very easy, since BPT can handle directly
tarballs (and directories, in trunk) which contain a setup.py.

HTH,
Giuseppe

[1] http://pypi.python.org/pypi/bpt
P.S. I was not aware of stow, I'll add it to the references and see if
there are any features that I can steal


P.J. Eby

unread,
May 11, 2009, 12:35:58 PM5/11/09
to Martin v. Löwis, Python List, M.-A. Lemburg, Python-Dev
At 04:42 PM 5/9/2009 +0200, Martin v. Löwis wrote:
> >> If you always use --single-version-externally-managed with easy_install,
> >> it will stop editing .pth files on installation.
> >
> > It's --multi-version (-m) that does that.
> > --single-version-externally-managed is a "setup.py install" option.
> >
> > Both have the effect of not editing .pth files, but they do so in
> > different ways. The "setup.py install" option causes it to install in a
> > distutils-compatible layout, whereas --multi-version simply drops .egg
> > files or directories in the target location and leaves it to the user
> > (or the generated script wrappers) to add them to sys.path.
>
>Ah, ok. Is there also an easy_install invocation that unpacks the zip
>file into some location of sys.path (which then wouldn't require
>editing sys.path)?

No; you'd have to use the -e option to easy_install to download and
extract a source version of the package; then run that package's
setup.py, e.g.:

easy_install -eb /some/tmpdir SomeProject
cd /some/tmpdir/someproject # subdir is always lowercased/normalized
setup.py install --single-version-externally-managed --record=...

I suspect that this is basically what pip is doing under the hood, as
that would explain why it doesn't support .egg files.

I previously posted code to the distutils-sig that was an .egg
unpacker with appropriate renaming, though. It was untested, and
assumes you already checked for collisions in the target directory,
and that you're handling any uninstall manifest yourself. It could
probably be modified to take a filter function, though, something like:

import os
from setuptools.archive_util import unpack_archive

def flatten_egg(egg_filename, extract_dir, filter=lambda s, d: d):
    # e.g. "SomeProject-1.0-py2.5.egg" -> "SomeProject-1.0-py2.5.egg-info"
    eggbase = os.path.basename(egg_filename) + '-info'
    def file_filter(src, dst):
        if src.startswith('EGG-INFO/'):
            # rename EGG-INFO/... to projectname-version.egg-info/...
            src = eggbase + src[8:]
        dst = os.path.join(extract_dir, *src.split('/'))
        return filter(src, dst)
    return unpack_archive(egg_filename, extract_dir, file_filter)

Then you could pass in a None-returning filter function to check and
accumulate collisions and generate a manifest. A second run with the
default filter would do the unpacking.
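The None-returning first pass might look like the following sketch. The
helper name make_check_filter is invented here; it only assumes the
unpack_archive convention that a filter returning None causes the file
to be skipped, so the first pass touches nothing on disk:

```python
import os

def make_check_filter(collisions, manifest):
    # First pass: record what the unpacker *would* write, without
    # extracting anything.  Paths that already exist in the target are
    # collisions; the rest become the uninstall manifest.
    def check_filter(src, dst):
        if os.path.exists(dst):
            collisions.append(dst)
        else:
            manifest.append(dst)
        return None  # None tells unpack_archive to skip the file
    return check_filter
```

Usage would then be: run flatten_egg once with this filter; if
collisions stays empty, run flatten_egg again with the default filter
to do the real unpacking, and save the manifest for later uninstall.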

(This function should work with either .egg files or .egg directories
as input, btw, since unpack_archive treats a directory input as if it
were an archive.)

Anyway, if you used "easy_install -mxd /some/tmpdir [specs]" to get
your target eggs found/built, you could then run this flattening
function (with appropriate filter functions) over the *.egg contents
of /some/tmpdir to do the actual installation.

(The reason for using -mxd instead of -Zmaxd or -zmaxd is that we
don't care whether the eggs are zipped or not, and we leave out the
-a so that dependencies already present on sys.path aren't copied or
re-downloaded to the target; only dependencies we don't already have
will get dropped in /some/tmpdir.)

Of course, the devil of this is in the details; to handle conflicts
and uninstalls properly you would need to know what namespace
packages were in the eggs you are installing. But if you don't care
about blindly overwriting things (as the distutils does not), then
it's actually pretty easy to make such an unpacker.

I mainly haven't made one myself because I *do* care about things
being blindly overwritten.
