
PEP 3147 - new .pyc format


John Roth

Jan 30, 2010, 5:14:54 PM
PEP 3147 has just been posted, proposing that, beginning in release
3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
directory with a .pyr extension. The reason is so that compiled
versions of a program can coexist, which isn't possible now.

Frankly, I think this is a really good idea, although I've got a few
comments.

1. Apple's Mac OS X should be mentioned, since 10.5 (and presumably
10.6) ship with both Python release 2.3 and 2.5 installed.

2. I think the proposed logic is too complex. If this is installed in
3.2, then that release should simply store its .pyc file in the .pyr
directory, without the need for either a command line switch or an
environment variable (both are mentioned in the PEP.)

3. Tool support. There are tools that look for the .pyc files; these
need to be upgraded somehow. The ones that ship with Python should, of
course, be fixed with the PEP, but there are others.

4. I'm in favor of putting the source in the .pyr directory as well,
but that's got a couple more issues. One is tool support, which is
likely to be worse for source, and the other is some kind of algorithm
for identifying which source goes with which object.

Summary: I like it, but I think it needs a bit more work.

John Roth

Mensanator

Jan 30, 2010, 5:36:52 PM
On Jan 30, 4:14 pm, John Roth <johnro...@gmail.com> wrote:
> PEP 3147 has just been posted, proposing that, beginning in release
> 3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
> directory with a .pyr extension. The reason is so that compiled
> versions of a program can coexist, which isn't possible now.
>
> Frankly, I think this is a really good idea, although I've got a few
> comments.
>
> 1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
> 10.6) ship with both Python release 2.3 and 2.5 installed.

Mac OS X 10.6 has 2.6 installed.

MRAB

Jan 30, 2010, 5:57:57 PM
to pytho...@python.org
The PEP has a .pyr directory for each .py file:

foo.py
foo.pyr/
    f2b30a0d.pyc # Python 2.5
    f2d10a0d.pyc # Python 2.6
    f2d10a0d.pyo # Python 2.6 -O
    f2d20a0d.pyc # Python 2.6 -U
    0c4f0a0d.pyc # Python 3.1

Other possibilities are:

1. A single pyr directory:

foo.py
pyr/
    foo.f2b30a0d.pyc # Python 2.5
    foo.f2d10a0d.pyc # Python 2.6
    foo.f2d10a0d.pyo # Python 2.6 -O
    foo.f2d20a0d.pyc # Python 2.6 -U
    foo.0c4f0a0d.pyc # Python 3.1

2. A .pyr directory for each version of Python:

foo.py
f2b30a0d.pyr/ # Python 2.5
    foo.pyc
f2d10a0d.pyr/ # Python 2.6/Python 2.6 -O
    foo.pyc
    foo.pyo
f2d20a0d.pyr/ # Python 2.6 -U
    foo.pyc
0c4f0a0d.pyr/ # Python 3.1
    foo.pyc
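
The three layouts differ only in how a source path and a magic-number tag are combined into a cache path. A minimal sketch, assuming a hypothetical helper `candidate_paths` (it is not part of the PEP or of any Python API) and the PEP's example tag for Python 2.5:

```python
import posixpath


def candidate_paths(source, tag, ext=".pyc"):
    """Return the cache path for `source` under each layout discussed here.

    `tag` is the hex string used in the PEP's examples (e.g. "f2b30a0d").
    All names are illustrative only.
    """
    directory, filename = posixpath.split(source)
    stem = posixpath.splitext(filename)[0]
    return {
        # PEP draft: one .pyr directory per source file
        "per_file": posixpath.join(directory, stem + ".pyr", tag + ext),
        # Alternative 1: a single shared pyr/ directory
        "single_dir": posixpath.join(directory, "pyr", stem + "." + tag + ext),
        # Alternative 2: one .pyr directory per bytecode version
        "per_version": posixpath.join(directory, tag + ".pyr", stem + ext),
    }


paths = candidate_paths("pkg/foo.py", "f2b30a0d")
for layout, path in sorted(paths.items()):
    print(layout, path)
```

Whichever layout is chosen, the lookup stays a pure function of the source path and the tag, so the choice is mostly about how the directory looks to a human.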

John Bokma

Jan 30, 2010, 6:38:52 PM
MRAB <pyt...@mrabarnett.plus.com> writes:

> The PEP has a .pyr directory for each .py file:
>
> foo.py
> foo.pyr/
> f2b30a0d.pyc # Python 2.5
> f2d10a0d.pyc # Python 2.6
> f2d10a0d.pyo # Python 2.6 -O
> f2d20a0d.pyc # Python 2.6 -U
> 0c4f0a0d.pyc # Python 3.1

wow: so much for human readable file names :-(

--
John Bokma j3b

Hacking & Hiking in Mexico - http://johnbokma.com/
http://castleamber.com/ - Perl & Python Development

Alf P. Steinbach

Jan 30, 2010, 6:46:03 PM
* John Bokma:

> MRAB <pyt...@mrabarnett.plus.com> writes:
>
>> The PEP has a .pyr directory for each .py file:
>>
>> foo.py
>> foo.pyr/
>> f2b30a0d.pyc # Python 2.5
>> f2d10a0d.pyc # Python 2.6
>> f2d10a0d.pyo # Python 2.6 -O
>> f2d20a0d.pyc # Python 2.6 -U
>> 0c4f0a0d.pyc # Python 3.1
>
> wow: so much for human readable file names :-(

I agree.

Human readable filenames would be much better.


Cheers,

- Alf

Carl Banks

Jan 30, 2010, 7:14:53 PM
On Jan 30, 2:14 pm, John Roth <johnro...@gmail.com> wrote:
> PEP 3147 has just been posted, proposing that, beginning in release
> 3.2 (and possibly 2.7) compiled .pyc and .pyo files be placed in a
> directory with a .pyr extension. The reason is so that compiled
> versions of a program can coexist, which isn't possible now.
>
> Frankly, I think this is a really good idea, although I've got a few
> comments.


-1

I think it's a terrible, drastic approach to a minor problem. I'm not
sure why the simple approach of just appending a number (perhaps the
major-minor version, or a serial number) to the filename wouldn't
work, like this:

foo.pyc25

All I can think of is they are concerned with the typically minor
expense of listing the directory (to see if there's already a .pyc??
file present). This operation can be reasonably cached; when scanning
a directory listing it need only record all occurrences of .pyc?? and
mark those modules as subject to version-specific .pyc files. Anyway,
I'd expect the proposed -R switch would only be used in special cases
(like installation time) when a minor inefficiency would be tolerable.
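
A rough sketch of that caching idea, assuming the two-digit suffix scheme from my example (foo.pyc25); the regular expression and the function name are made up for illustration and are not anything the PEP specifies:

```python
import re

# Hypothetical suffix scheme from the post: "foo.pyc25" for Python 2.5.
VERSIONED_PYC = re.compile(r"^(?P<module>.+)\.pyc(?P<version>\d{2})$")


def versioned_modules(listing):
    """Map module name -> set of version suffixes seen in one directory listing."""
    found = {}
    for name in listing:
        match = VERSIONED_PYC.match(name)
        if match:
            found.setdefault(match.group("module"), set()).add(match.group("version"))
    return found


# One pass over the listing is enough; plain "bar.pyc" is not versioned.
cache = versioned_modules(["foo.py", "foo.pyc25", "foo.pyc26", "bar.py", "bar.pyc"])
print(cache)
```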

> 1. Apple's MAC OS X should be mentioned, since 10.5 (and presumably
> 10.6) ship with both Python release 2.3 and 2.5 installed.
>
> 2. I think the proposed logic is too complex. If this is installed in
> 3.2, then that release should simply store its .pyc file in the .pyr
> directory, without the need for either a command line switch or an
> environment variable (both are mentioned in the PEP.)

This is utterly unacceptable. Versioned *.pyc files should only be
optionally requested by people who have to deal with multiple versions,
such as distro package maintainers. For my projects I don't give a
flying F about versioned *.pyc and I don't want my project directory
cluttered with a million subdirectories. (It would be a bit more
tolerable if my directory was merely cluttered with *.pyc?? files, but
I'd still rather Python didn't do that unless asked.)


> 3. Tool support. There are tools that look for the .pyc files; these
> need to be upgraded somehow. The ones that ship with Python should, of
> course, be fixed with the PEP, but there are others.

How will this affect tools like Py2exe? Now you have a bunch of
identically-named *.pyc files.


> 4. I'm in favor of putting the source in the .pyr directory as well,
> but that's got a couple more issues. One is tool support, which is
> likely to be worse for source, and the other is some kind of algorithm
> for identifying which source goes with which object.

Now this is just too much. I didn't like the suggestion that I should be
forced to put up with dozens of subdirectories; now you want to
force me to put the source files into the subdirectories as well?
That would be a deal-breaker. Thankfully it is too ridiculous to ever
happen.


> Summary: I like it, but I think it needs a bit more work.

I hope it's replaced with something less drastic.


Carl Banks

MRAB

Jan 30, 2010, 7:19:50 PM
to pytho...@python.org
The names are the magic numbers. It's all in the PEP.

John Bokma

Jan 30, 2010, 7:51:31 PM
MRAB <pyt...@mrabarnett.plus.com> writes:

Naming files using magic numbers is really beyond me. The fact that the
above needs comments to explain what's what already shows to me that
there's a problem with this naming scheme. What if for one reason or
another I want to delete all pyc files for Python 2.5? Where do I look
up the magic number?

Neil Hodgson

Jan 30, 2010, 8:09:07 PM
John Roth:

> 4. I'm in favor of putting the source in the .pyr directory as well,
> but that's got a couple more issues. One is tool support, which is
> likely to be worse for source, and the other is some kind of algorithm
> for identifying which source goes with which object.

Many tools work recursively except for hidden directories so would
return both the source in the repository as well as the original source.
If you want to do this then the repository directory should be hidden by
starting with ".".
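
A small sketch of the kind of recursive tool meant here, assuming the common convention of skipping any directory whose name starts with "."; the function name is illustrative:

```python
import os


def walk_visible(root):
    """Yield .py paths under root, skipping directories whose names start with '.'."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

With a visible foo.pyr/ holding a copy of the source, such a tool reports the file twice; with a hidden .foo.pyr/ it reports it once.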

Neil

MRAB

Jan 30, 2010, 8:16:54 PM
to pytho...@python.org
John Bokma wrote:
> MRAB <pyt...@mrabarnett.plus.com> writes:
>
>> Alf P. Steinbach wrote:
>>> * John Bokma:
>>>> MRAB <pyt...@mrabarnett.plus.com> writes:
>>>>
>>>>> The PEP has a .pyr directory for each .py file:
>>>>>
>>>>> foo.py
>>>>> foo.pyr/
>>>>> f2b30a0d.pyc # Python 2.5
>>>>> f2d10a0d.pyc # Python 2.6
>>>>> f2d10a0d.pyo # Python 2.6 -O
>>>>> f2d20a0d.pyc # Python 2.6 -U
>>>>> 0c4f0a0d.pyc # Python 3.1
>>>> wow: so much for human readable file names :-(
>>> I agree.
>>>
>>> Human readable filenames would be much better.
>>>
>> The names are the magic numbers. It's all in the PEP.
>
> Naming files using magic numbers is really beyond me. The fact that the
> above needs comments to explain what's what already shows to me that
> there's a problem with this naming scheme. What if for one reason or
> another I want to delete all pyc files for Python 2.5? Where do I look
> up the magic number?
>
True. You might also want to note that "Python 2.6 -U" appears to have a
different magic number from "Python 2.6" and "Python 2.6 -O".

I don't know whether they always change for each new version.

Steven D'Aprano

Jan 30, 2010, 8:18:52 PM
On Sat, 30 Jan 2010 14:14:54 -0800, John Roth wrote:

> PEP 3147 has just been posted, proposing that, beginning in release 3.2
> (and possibly 2.7) compiled .pyc and .pyo files be placed in a directory
> with a .pyr extension. The reason is so that compiled versions of a
> program can coexist, which isn't possible now.


http://www.python.org/dev/peps/pep-3147/


Reading through the PEP, I went from an instinctive "oh no, that sounds
horrible" reaction to "hmmm, well, that doesn't sound too bad". I don't
think I need this, but I could live with it.

Firstly, it does sound like there is a genuine need for a solution to the
problem of multiple Python versions. This is not the first PEP trying to
solve it, so even if you personally don't see the need, we can assume
that others do.

Secondly, the current behaviour will remain unchanged. Python will
compile spam.py to spam.pyc (or spam.pyo with the -O switch) by default.
If you don't need to support multiple versions, you don't need to care
about this PEP. I like this aspect of the PEP very much. I would argue
that any solution MUST support the status quo for those who don't care
about multiple versions.

To get the new behaviour, you have to explicitly ask for it. You ask for
it by calling python with the -R switch, by setting an environment
variable, or explicitly providing the extra spam/<magic>.pyc files.

Thirdly, the magic file names aren't quite as magic as they appear at
first glance. They represent the hexified magic number of the version of
Python. More about the magic number here:

http://nedbatchelder.com/blog/200804/the_structure_of_pyc_files.html

Unfortunately the magic number doesn't seem to be documented anywhere I
can find other than in the source code (import.c). The PEP gives some
examples:

f2b30a0d.pyc # Python 2.5
f2d10a0d.pyc # Python 2.6
f2d10a0d.pyo # Python 2.6 -O
f2d20a0d.pyc # Python 2.6 -U
0c4f0a0d.pyc # Python 3.1

but how can one map magic numbers to versions, short of reading import.c?
I propose that sys grow an object sys.magic which is the hexlified magic
number.


> 2. I think the proposed logic is too complex. If this is installed in
> 3.2, then that release should simply store its .pyc file in the .pyr
> directory, without the need for either a command line switch or an
> environment variable (both are mentioned in the PEP.)

I disagree. Making the new behaviour optional is an advantage, even if it
leads to extra complexity. It is pointless forcing .pyc files to be in a
subdirectory if you don't need multiple versions.


> 3. Tool support. There are tools that look for the .pyc files; these
> need to be upgraded somehow. The ones that ship with Python should, of
> course, be fixed with the PEP, but there are others.

Third party tools will be the responsibility of the third parties.


> 4. I'm in favor of putting the source in the .pyr directory as well, but
> that's got a couple more issues. One is tool support, which is likely to
> be worse for source, and the other is some kind of algorithm for
> identifying which source goes with which object.

It certainly does.

What's the advantage of forcing .py files to live inside a directory with
the same name?

Modules:
mymodule.py => mymodule/mymodule.py

Packages:
mypackage/__init__.py => mypackage/__init__/__init__.py
mypackage/spam.py => mypackage/spam/spam.py


Seems like a pointless and annoying extra layer to me.

--
Steven

Ben Finney

Jan 30, 2010, 9:39:49 PM
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

> Unfortunately the magic number doesn't seem to be documented anywhere
> I can find other than in the source code (import.c). The PEP gives
> some examples:
>
> f2b30a0d.pyc # Python 2.5
> f2d10a0d.pyc # Python 2.6
> f2d10a0d.pyo # Python 2.6 -O
> f2d20a0d.pyc # Python 2.6 -U
> 0c4f0a0d.pyc # Python 3.1
>
> but how can one map magic numbers to versions, short of reading
> import.c?

Mapping magic numbers to versions is infeasible and will be incomplete:
Any mapping that exists in (say) Python 3.1 can't know in advance what
the magic number will be for Python 4.5.

The more important mapping is from version to magic number.

> I propose that sys grow an object sys.magic which is the hexlified magic
> number.

The ‘imp’ module already has this::

>>> import sys
>>> import imp
>>> sys.version
'2.5.4 (r254:67916, Jan 24 2010, 16:09:54) \n[GCC 4.4.2]'
>>> imp.get_magic().encode('hex')
'b3f20d0a'

Unfortunately, I think the examples in the PEP have mangled the magic
number into little-endian byte ordering.
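
For what it's worth, the two spellings can be related mechanically. A sketch, assuming the magic is the 16-bit number from import.c packed little-endian and followed by CR LF (which matches the 2.5 output above, where the number is 62131); the helper names are invented for illustration:

```python
import binascii
import struct


def magic_bytes(number):
    """First four bytes of a .pyc: 16-bit magic, little-endian, then CR LF."""
    return struct.pack("<H", number) + b"\r\n"


def pep_style_name(number):
    """Spelling used in the PEP's examples: each 16-bit half byte-swapped."""
    return "%04x0a0d" % number


on_disk = binascii.hexlify(magic_bytes(62131)).decode("ascii")
in_pep = pep_style_name(62131)
print(on_disk, in_pep)
```

So the PEP's f2b30a0d and the b3f20d0a reported by `imp.get_magic()` describe the same magic number, just with the bytes of each half written in the opposite order.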

--
\ “I got up the other day, and everything in my apartment has |
`\ been stolen and replaced with an exact replica.” —Steven Wright |
_o__) |
Ben Finney

Paul Rubin

Jan 30, 2010, 9:40:42 PM
Ben Finney <ben+p...@benfinney.id.au> writes:
>> 0c4f0a0d.pyc # Python 3.1

> Mapping magic numbers to versions is infeasible and will be incomplete:
> Any mapping that exists in (say) Python 3.1 can't know in advance what
> the magic number will be for Python 4.5.

But why do the filenames have magic numbers instead of version numbers?

Steven D'Aprano

Jan 30, 2010, 10:09:25 PM

The magic number changes with each incompatible change in the byte code
format, which is not the same as each release. Selected values taken from
import.c:

    Python 2.5a0: 62071
    Python 2.5a0: 62081 (ast-branch)
    Python 2.5a0: 62091 (with)
    Python 2.5a0: 62092 (changed WITH_CLEANUP opcode)
    Python 2.5b3: 62101 (fix wrong code: for x, in ...)
    Python 2.5b3: 62111 (fix wrong code: x += yield)
    Python 2.5c1: 62121 (fix wrong lnotab with for loops and
                         storing constants that should have been removed)
    Python 2.5c2: 62131 (fix wrong code: for x, in ... in listcomp/genexp)


http://svn.python.org/view/python/trunk/Python/import.c?view=markup

The relationship between byte code magic number and release version
number is not one-to-one. We could have, for the sake of the argument,
releases 3.2.3 through 3.5.0 (say) all having the same byte codes. What
version number should the .pyc file show?


--
Steven

Alf P. Steinbach

Jan 30, 2010, 10:44:18 PM
* Steven D'Aprano:

I don't know enough about Python yet to comment on your question, but, just an
idea: how about a human readable filename /with/ some bytecode version id (that
added id could be the magic number)?

I think that combo could serve both the human and machine needs, so to speak. :-)


Cheers,

- Alf

Steven D'Aprano

Jan 31, 2010, 12:03:35 AM
On Sun, 31 Jan 2010 04:44:18 +0100, Alf P. Steinbach wrote:

>> The relationship between byte code magic number and release version
>> number is not one-to-one. We could have, for the sake of the argument,
>> releases 3.2.3 through 3.5.0 (say) all having the same byte codes. What
>> version number should the .pyc file show?
>
> I don't know enough about Python yet to comment on your question, but,
> just an idea: how about a human readable filename /with/ some bytecode
> version id (that added id could be the magic number)?

Sorry, that still doesn't work. Consider the hypothetical given above.
For simplicity, I'm going to drop the micro point versions, so let's say
that releases 3.2 through 3.5 all use the same byte-code. (In reality,
you'll be likely looking at version numbers like 3.2.1 rather than just
3.2.) Now suppose you have 3.2 and 3.5 both installed, and you look
inside your $PYTHONPATH and see this:

spam.py
spam.pyr/
    3.2-f2e70a0d.pyc


It would be fairly easy to have the import machinery clever enough to
ignore the version number prefix, so that Python 3.5 correctly uses
3.2-f2e70a0d.pyc. That part is easy.
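
To show how little is involved, here is a sketch of such a lookup. The function name and the directory contents are hypothetical, and f2e70a0d is the made-up tag from the example above:

```python
import fnmatch


def find_bytecode(listing, magic_hex):
    """Return names in `listing` whose magic tag matches, ignoring any prefix."""
    pattern = "*" + magic_hex + ".pyc"
    return [name for name in listing if fnmatch.fnmatchcase(name, pattern)]


# A file written by "3.2" is found by any release that shares the same tag.
matches = find_bytecode(["3.2-f2e70a0d.pyc", "2.6-f2d10a0d.pyc"], "f2e70a0d")
print(matches)
```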

(I trust nobody is going to suggest that Python should create multiple
redundant, identical, copies of the .pyc files. That would be just lame.)

But now consider the human reader. You want human-readable file names for
the benefit of the human reader. How is the human reader supposed to know
that Python 3.5 is going to use 3.2-f2e70a0d.pyc?

Suppose I'm running Python 3.5, and have a troubling bug, and I think "I
know, maybe there's some sort of problem with the compiled byte code, I
should delete it". I go looking for spam.pyr/3.5-*.pyc and don't find
anything.

Now I have two obvious choices:

(1) Start worrying about why Python 3.5 isn't writing .pyc files. Is my
installation broken? Have I set the PYTHONDONTWRITEBYTECODE environment
variable? Do I not have write permission to the folder? WTF is going on?
Confusion will reign.

(2) Learn that the 3.2- prefix is meaningless, and ignore it.

Since you have to ignore the version number prefix anyway, let's not lay
a trap for people by putting it there. You need to know the magic number
to do anything sensible with the .pyc files other than delete them, so
let's not pretend otherwise.

If you don't wish to spend time looking up the magic number, the solution
is simple: hit it with a sledgehammer. Why do you care *which*
specific .pyc file is being used anyway?

rm -r spam.pyr/

or for the paranoid:

rm spam.pyr/*.py[co]

(Obviously there will be people who care about the specific .pyc file
being deleted. Those people can't use a sledgehammer, they need one of
those little tack-hammers with the narrow head. But 99% of the time, a
sledgehammer is fine.)

Frankly, unless you're a core developer or you're hacking byte-code, you
nearly always won't care about the .pyc files. You want the compiler to
manage them. And for the few times you do care, it isn't hard to find out:

>>> import binascii, imp
>>> binascii.hexlify(imp.get_magic())
b'4f0c0d0a'

Stick that in your "snippets" folder (you do have a snippets folder,
don't you?) and, as they say Down Under, "She'll be right mate".
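
A note for readers on current Python 3: the `imp` module used above was removed in Python 3.12, and the same four bytes are exposed as `importlib.util.MAGIC_NUMBER`. A minimal sketch of an equivalent snippet (the function name is only illustrative):

```python
import binascii
import importlib.util


def current_magic_hex():
    """Hex spelling of the running interpreter's bytecode magic number."""
    return binascii.hexlify(importlib.util.MAGIC_NUMBER).decode("ascii")


print(current_magic_hex())
```

The value printed depends on the interpreter that runs it, which is exactly the point.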


--
Steven

Martin v. Loewis

Jan 31, 2010, 5:13:06 AM
to John Bokma
> Naming files using magic numbers is really beyond me. The fact that the
> above needs comments to explain what's what already shows to me that
> there's a problem with this naming scheme. What if for one reason or
> another I want to delete all pyc files for Python 2.5? Where do I look
> up the magic number?

py> import imp, binascii
py> binascii.hexlify(imp.get_magic())
'b3f20d0a'

(note: this is on a little-endian system)

Regards,
Martin

Martin v. Loewis

Jan 31, 2010, 5:17:19 AM
to MRAB
> True. You might also want to note that "Python 2.6 -U" appears to have a
> different magic number from "Python 2.6" and "Python 2.6 -O".
>
> I don't know whether they always change for each new version.

Here is a recent list of magic numbers:

Python 2.6a0: 62151 (peephole optimizations and STORE_MAP opcode)
Python 2.6a1: 62161 (WITH_CLEANUP optimization)
Python 2.7a0: 62171 (optimize list comprehensions/change LIST_APPEND)
Python 2.7a0: 62181 (optimize conditional branches:
                     introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
Python 2.7a0 62191 (introduce SETUP_WITH)
Python 2.7a0 62201 (introduce BUILD_SET)
Python 2.7a0 62211 (introduce MAP_ADD and SET_ADD)

#define MAGIC (62211 | ((long)'\r'<<16) | ((long)'\n'<<24))
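
The same arithmetic can be reproduced in Python. This is only a sketch; it assumes the 32-bit value is written to the .pyc file in little-endian order, which is what produces the familiar four-byte prefix ending in CR LF:

```python
import struct

# 62211 is the last value in the list above; '\r' and '\n' fill the upper bytes.
MAGIC = 62211 | (ord("\r") << 16) | (ord("\n") << 24)

# Packed little-endian, the 32-bit value becomes the four-byte .pyc prefix.
header = struct.pack("<I", MAGIC)
print(hex(MAGIC), repr(header))
```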

Regards,
Martin

Andrej Mitrovic

Jan 31, 2010, 9:13:44 AM
Leave magic to the witches of Perl. :)

John Bokma

Jan 31, 2010, 10:06:18 AM
Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

> On Sun, 31 Jan 2010 04:44:18 +0100, Alf P. Steinbach wrote:
>
>>> The relationship between byte code magic number and release version
>>> number is not one-to-one. We could have, for the sake of the argument,
>>> releases 3.2.3 through 3.5.0 (say) all having the same byte codes. What
>>> version number should the .pyc file show?
>>
>> I don't know enough about Python yet to comment on your question, but,
>> just an idea: how about a human readable filename /with/ some bytecode
>> version id (that added id could be the magic number)?
>
> Sorry, that still doesn't work. Consider the hypothetical given above.
> For simplicity, I'm going to drop the micro point versions, so let's say
> that releases 3.2 through 3.5 all use the same byte-code. (In reality,

Based on the magic numbers I've seen so far it looks like that's not an
option. They increment with every minor change. So to me, at this moment
(and maybe it's my ignorance) it looks like a made up example to justify
what to me still looks like a bad decision.

Sean DiZazzo

Jan 31, 2010, 3:04:36 PM

> Here is a recent list of magic numbers:
>
>        Python 2.6a0: 62151 (peephole optimizations and STORE_MAP opcode)
>        Python 2.6a1: 62161 (WITH_CLEANUP optimization)
>        Python 2.7a0: 62171 (optimize list comprehensions/change LIST_APPEND)
>        Python 2.7a0: 62181 (optimize conditional branches:
>                             introduce POP_JUMP_IF_FALSE and POP_JUMP_IF_TRUE)
>        Python 2.7a0  62191 (introduce SETUP_WITH)
>        Python 2.7a0  62201 (introduce BUILD_SET)
>        Python 2.7a0  62211 (introduce MAP_ADD and SET_ADD)
>
> #define MAGIC (62211 | ((long)'\r'<<16) | ((long)'\n'<<24))
>
> Regards,
> Martin

Does "magic" really need to be used? Why not just use the revision
number?

Benjamin Peterson

Jan 31, 2010, 4:32:42 PM
to pytho...@python.org
Sean DiZazzo <half.italian <at> gmail.com> writes:
> Does "magic" really need to be used? Why not just use the revision
> number?

Because magic is easier and otherwise CPython developers would have to rebuild
their pycs every time their working copy was updated.


Message has been deleted

Steven D'Aprano

Jan 31, 2010, 5:37:04 PM
On Sun, 31 Jan 2010 09:06:18 -0600, John Bokma wrote:

> Based on the magic numbers I've seen so far it looks like that not an
> option. They increment with every minor change.

They increment with every *incompatible* change to the marshal format,
not every change to the compiler.

> So to me, at this moment
> (and maybe it's my ignorance) it looks like a made up example to justify
> what to me still looks like a bad decision.

Of course it's a made-up example. But with Python now entering a period
where there is a moratorium on changes to the language, is it really so
difficult to imagine that the marshal format will settle down for a while
even as the standard library goes through upgrades?

--
Steven

Steven D'Aprano

Jan 31, 2010, 5:40:03 PM
On Sun, 31 Jan 2010 14:10:34 -0800, Dennis Lee Bieber wrote:

> Ugh... That would mean that for an application using, say 20
> files,
> one now has 20 subdirectories for what, in a lot of cases, will contain
> just one file each (and since I doubt older Python's will be modified to
> support this scheme, it will only be applicable to 3.x, and maybe a
> 2.7?)

If you only use one version of Python, then don't run it with the -R
switch.

Have you read the PEP? It is quite explicit that the default behaviour
of .pyc files will remain unchanged, that to get the proposed behaviour
you have to specifically ask for it.

--
Steven

Daniel Fetchinson

Feb 1, 2010, 5:14:42 AM
to Python

I also think the PEP is a great idea and proposes a solution to a real
problem. But I also hear the 'directory clutter' argument and I'm
really concerned too, having all these extra directories around (and
quite a large number of them indeed!). How about this scheme:

1. install python source files to a shared (among python
installations) location /this/is/shared
2. when python X.Y imports a source file from /this/is/shared it will
create pyc files in its private area /usr/lib/pythonX.Y/site-packages/
Time comparison would be between /this/is/shared/x.py and
/usr/lib/pythonX.Y/site-packages/x.pyc, for instance.

Obviously pythonX.Y needs to know the path to /this/is/shared so it
can import modules from there, but this can be controlled by an
environment variable. There would be only .py files in
/this/is/shared.

Linux distro packagers would only offer a single python-myapp to
install and it would only contain python source, and the
version-specific pyc files would be created the first time the
application is used by python. In /usr/lib/pythonX.Y/site-packages
there would be only pyc files with magic number matching python X.Y.

So, basically nothing would change; only the location of py and pyc
files would be different from current behavior, but the same algorithm
would be run to determine which one to load, when to create a pyc
file, when to ignore the old one, etc.

What would be wrong with this setup?

Cheers,
Daniel


--
Psss, psss, put it down! - http://www.cafepress.com/putitdown

Steven D'Aprano

Feb 1, 2010, 10:27:31 AM
On Mon, 01 Feb 2010 11:14:42 +0100, Daniel Fetchinson wrote:

> I also think the PEP is a great idea and proposes a solution to a real
> problem. But I also hear the 'directory clutter' argument and I'm really
> concerned too, having all these extra directories around (and quite a
> large number of them indeed!).

Keep in mind that if you don't explicitly ask for the proposed feature,
you won't see any change at all. You need to run Python with the -R
switch, or set an environment variable. The average developer won't see
any clutter at all unless she is explicitly supporting multiple versions.

> How about this scheme:
>
> 1. install python source files to a shared (among python installations)
> location /this/is/shared
> 2. when python X.Y imports a source file from /this/is/shared it will
> create pyc files in its private area /usr/lib/pythonX.Y/site-packages/

$ touch /usr/lib/python2.5/site-packages/STEVEN
touch: cannot touch `/usr/lib/python2.5/site-packages/STEVEN': Permission
denied

There's your first problem: most users don't have write-access to the
private area. When you install a package, you normally do so as root, and
it all works. When you import a module and it gets compiled as a .pyc
file, you're generally running as a regular user.


> Time comparison would be between /this/is/shared/x.py and
> /usr/lib/pythonX.Y/site-packages/x.pyc, for instance.

I don't quite understand what you mean by "time comparison".


[...]


> In /usr/lib/pythonX.Y/site-packages there would be only pyc files with
> magic number matching python X.Y.

Personally, I think it is a terrible idea to keep the source file and
byte code file in such radically different places. They should be kept
together. What you call "clutter" I call having the files that belong
together kept together.

> So, basically nothing would change only the location of py and pyc files
> would be different from current behavior, but the same algorithm would
> be run to determine which one to load, when to create a pyc file, when
> to ignore the old one, etc.

What happens when there is a .pyc file in the same location as the .py
file? Because it *will* happen. Does it get ignored, or does it take
precedence over the site specific file?

Given:

./module.pyc
/usr/lib/pythonX.Y/site-packages/module.pyc

and you execute "import module", which gets used? Note that in this
situation, there may or may not be a module.py file.


> What would be wrong with this setup?

Consider:

./module.py
./package/module.py

Under your suggestion, both of these will compile to

/usr/lib/pythonX.Y/site-packages/module.pyc
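
The collision is easy to demonstrate. A sketch, assuming the scheme keeps only the base name of the source file (which is how I read the proposal); `flat_cache_path` is a hypothetical helper, not an existing function:

```python
import os


def flat_cache_path(source, cache_dir):
    """Hypothetical mapping for the proposed scheme: keep only the base name."""
    stem = os.path.splitext(os.path.basename(source))[0]
    return os.path.join(cache_dir, stem + ".pyc")


cache_dir = "/usr/lib/pythonX.Y/site-packages"
first = flat_cache_path("./module.py", cache_dir)
second = flat_cache_path("./package/module.py", cache_dir)
print(first == second)
```

Avoiding this would mean mirroring the whole package path under the private area, which is a bigger change than just moving the .pyc files.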

--
Steven

Daniel Fetchinson

Feb 1, 2010, 3:19:52 PM
to Python
>> I also think the PEP is a great idea and proposes a solution to a real
>> problem. But I also hear the 'directory clutter' argument and I'm really
>> concerned too, having all these extra directories around (and quite a
>> large number of them indeed!).
>
> Keep in mind that if you don't explicitly ask for the proposed feature,
> you won't see any change at all. You need to run Python with the -R
> switch, or set an environment variable. The average developer won't see
> any clutter at all unless she is explicitly supporting multiple versions.
>
>
>
>> How about this scheme:
>>
>> 1. install python source files to a shared (among python installations)
>> location /this/is/shared
>> 2. when python X.Y imports a source file from /this/is/shared it will
>> create pyc files in its private area /usr/lib/pythonX.Y/site-packages/
>
> $ touch /usr/lib/python2.5/site-packages/STEVEN
> touch: cannot touch `/usr/lib/python2.5/site-packages/STEVEN': Permission
> denied
>
> There's your first problem: most users don't have write-access to the
> private area.

True, I hadn't thought about that (I should have, though).

> When you install a package, you normally do so as root, and
> it all works. When you import a module and it gets compiled as a .pyc
> file, you're generally running as a regular user.
>
>
>> Time comparison would be between /this/is/shared/x.py and
>> /usr/lib/pythonX.Y/site-packages/x.pyc, for instance.
>
> I don't quite understand what you mean by "time comparison".

I meant the comparison of timestamps on .py and .pyc files in order to
determine which is newer and if a recompilation should take place or
not.
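
Roughly what I had in mind, as a simplified sketch: it compares the two files' modification times directly, whereas CPython actually records the source's modification time inside the .pyc header and checks against that. The function name is just for illustration:

```python
import os


def needs_recompile(source_path, bytecode_path):
    """True if the bytecode file is missing or older than the source file."""
    if not os.path.exists(bytecode_path):
        return True
    return os.path.getmtime(bytecode_path) < os.path.getmtime(source_path)
```

The check itself does not care whether the two files live in the same directory, e.g. `needs_recompile("/this/is/shared/x.py", "/usr/lib/pythonX.Y/site-packages/x.pyc")`.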

> [...]
>> In /usr/lib/pythonX.Y/site-packages there would be only pyc files with
>> magic number matching python X.Y.
>
> Personally, I think it is a terribly idea to keep the source file and
> byte code file in such radically different places. They should be kept
> together. What you call "clutter" I call having the files that belong
> together kept together.

I see why you think so, and it's reasonable; however, there is a compelling
argument, I think, for the opposite view: namely to keep things
separate. An average developer definitely wants easy access to .py
files. However I see no good reason for having access to .pyc files. I
for one have never inspected a .pyc file. Why would you want to have a
.pyc file at hand?

If we don't really want to have .pyc files in convenient locations
because we (almost) never want to access them, then I'd say
it's a good idea to keep them totally separate so that they don't get
in the way.

>> So, basically nothing would change only the location of py and pyc files
>> would be different from current behavior, but the same algorithm would
>> be run to determine which one to load, when to create a pyc file, when
>> to ignore the old one, etc.
>
> What happens when there is a .pyc file in the same location as the .py
> file? Because it *will* happen. Does it get ignored, or does it take
> precedence over the site specific file?
>
> Given:
>
> ./module.pyc
> /usr/lib/pythonX.Y/site-packages/module.pyc
>
> and you execute "import module", which gets used? Note that in this
> situation, there may or may not be a module.py file.
>
>
>> What would be wrong with this setup?
>
> Consider:
>
> ./module.py
> ./package/module.py
>
> Under your suggestion, both of these will compile to
>
> /usr/lib/pythonX.Y/site-packages/module.pyc

I see the problems with my suggestion. However it would be great if in
some other way the .pyc files could be kept out of the way. Granted, I
don't have a good proposal for this.
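
To make the collision concrete, here is a minimal sketch. It assumes a hypothetical flat scheme in which the central location is derived from the base name of the source file only; `central_pyc_path` and `CENTRAL_DIR` are made-up names for illustration, not anything Python provides.

```python
import os.path

# Hypothetical central location taken from the example above; not a real setting.
CENTRAL_DIR = "/usr/lib/pythonX.Y/site-packages"

def central_pyc_path(source_path, central_dir=CENTRAL_DIR):
    """Map a source file to a byte code path using only its base name."""
    base = os.path.basename(source_path)       # e.g. "module.py"
    stem, _ext = os.path.splitext(base)        # e.g. "module"
    return central_dir + "/" + stem + ".pyc"

a = central_pyc_path("./module.py")
b = central_pyc_path("./package/module.py")
print(a)
print(b)
print(a == b)  # both source files map to the same byte code file
```

Any scheme of this kind would have to encode the package path (or the full source path) in the byte code location to avoid the clash.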

Steven D'Aprano

Feb 1, 2010, 7:49:00 PM
to
On Mon, 01 Feb 2010 21:19:52 +0100, Daniel Fetchinson wrote:

>> Personally, I think it is a terrible idea to keep the source file and
>> byte code file in such radically different places. They should be kept
>> together. What you call "clutter" I call having the files that belong
>> together kept together.
>
> I see why you think so, it's reasonable, however there is a compelling
> argument, I think, for the opposite view: namely to keep things
> separate. An average developer definitely wants easy access to .py
> files. However I see no good reason for having access to .pyc files. I
> for one have never inspected a .pyc file. Why would you want to have a
> .pyc file at hand?

If you don't care about access to .pyc files, why do you care where they
are? If they are in a subdirectory module.pyr, then shrug and ignore the
subdirectory.

If you (generic you) are one of those developers who don't care
about .pyc files, then when you are browsing your source directory and
see this:


module.py
module.pyc

you just ignore the .pyc file. Or delete it, and Python will re-create it
as needed. So if you see

module.pyr/

just ignore that as well.

> If we don't really want to have .pyc files in convenient locations
> because we (almost) never want to access them really, then I'd say it's
> a good idea to keep them totally separate so they don't get in the
> way.

I like seeing them in the same place as the source file, because when I
start developing a module, I often end up renaming it multiple times
before it settles on a final name. When I rename or move it, I delete
the .pyc file, and that ensures that if I miss changing an import, and
try to import the old name, it will fail.

By hiding the .pyc file elsewhere, it is easy to miss deleting one, and
then the import won't fail, it will succeed, but use the old, obsolete
byte code.
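
The failure mode can be reproduced with a short sketch. It assumes the byte code is written next to the source file; the module name `oldname` and the temporary directory are made up for the demonstration.

```python
import importlib
import os
import py_compile
import sys
import tempfile

workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "oldname.py")   # hypothetical module name
with open(source, "w") as f:
    f.write("VALUE = 'stale byte code'\n")

# Write the byte code into the same directory as the source file.
py_compile.compile(source, cfile=source + "c", doraise=True)

# Simulate renaming the module: the source goes away, the .pyc is forgotten.
os.remove(source)

sys.path.insert(0, workdir)
importlib.invalidate_caches()

# The import still succeeds, silently using the leftover byte code.
stale = importlib.import_module("oldname")
print(stale.VALUE)
```

When the leftover .pyc sits next to the source it is at least easy to spot and delete; the worry is that the same leftover in an out-of-sight location would go unnoticed.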

--
Steven

Daniel Fetchinson

Feb 2, 2010, 3:38:07 AM
to Python


Okay, I see your point but I think your argument about importing shows
that python is doing something suboptimal because I have to worry
about .pyc files. Ideally, I only would need to worry about python
source files. There is now a chance to 'fix' (quotation marks because
maybe there is nothing to fix, according to some) this issue and make
all pyc files go away and have Python magically do the right
thing. A central pyc repository would be something I was thinking
about, but I admit it's a half baked or not even that, probably
quarter baked idea.

Steven D'Aprano

Feb 2, 2010, 8:42:18 PM
to
On Tue, 02 Feb 2010 09:38:07 +0100, Daniel Fetchinson wrote:

>> I like seeing them in the same place as the source file, because when I
>> start developing a module, I often end up renaming it multiple times
>> before it settles on a final name. When I rename or move it, I delete
>> the .pyc file, and that ensures that if I miss changing an import, and
>> try to import the old name, it will fail.
>>
>> By hiding the .pyc file elsewhere, it is easy to miss deleting one, and
>> then the import won't fail, it will succeed, but use the old, obsolete
>> byte code.
>
>
> Okay, I see your point but I think your argument about importing shows
> that python is doing something suboptimal because I have to worry about
> .pyc files. Ideally, I only would need to worry about python source
> files.

That's no different from any language that is compiled: you have to worry
about keeping the compiled code (byte code or machine language) in sync
with the source code.

Python does most of that for you: it automatically recompiles the source
whenever the source code's last modified date stamp is newer than that of
the byte code. So to a first approximation you can forget all about
the .pyc files and just care about the source.
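
As a rough illustration of that rule, here is a simplified sketch. It compares the two files' modification times directly, as described above; the real interpreter records the source's modification time inside the .pyc header and checks against that, so treat `needs_recompile` as a hypothetical helper, not the actual import logic.

```python
import os
import tempfile

def needs_recompile(source_path, bytecode_path):
    """Return True if the byte code is missing or older than the source."""
    if not os.path.exists(bytecode_path):
        return True
    return os.path.getmtime(source_path) > os.path.getmtime(bytecode_path)

# Demonstration with empty placeholder files and hand-set time stamps.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "module.py")
pyc = os.path.join(workdir, "module.pyc")
for path in (src, pyc):
    open(path, "w").close()

os.utime(src, (1000, 1000))            # source older than byte code
os.utime(pyc, (2000, 2000))
fresh = needs_recompile(src, pyc)      # byte code is up to date

os.utime(src, (3000, 3000))            # source edited after compilation
stale = needs_recompile(src, pyc)      # byte code must be rebuilt

print(fresh, stale)
```

A check like this is also why a wonky clock matters: if the time stamps lie, the comparison gives the wrong answer.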

But that's only a first approximation. You might care about the .pyc
files if:

(1) you want to distribute your application in a non-human readable
format;

(2) if you care about clutter in your file system;

(3) if you suspect a bug in the compiler;

(4) if you are working with byte-code hacks;

(5) if the clock on your PC is wonky;

(6) if you leave random .pyc files floating around earlier in the
PYTHONPATH than your source files;

etc.


> There is now a chance to 'fix' (quotation marks because maybe
> there is nothing to fix, according to some) this issue and make all pyc
> files go away and have Python magically do the right thing.

Famous last words...

The only ways I can see to have Python magically do the right thing in
all cases would be:

(1) Forget about byte-code compiling, and just treat Python as a purely
interpreted language. If you think Python is slow now...

(2) Compile as we do now, but only keep the byte code in memory. This
would avoid all worries about scattered .pyc files, but would slow Python
down significantly *and* reduce functionality (e.g. losing the ability to
distribute non-source files).

Neither of these is seriously an option.
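
For the curious, option (2) amounts to something like the following sketch: the source is compiled on every run and the resulting code object lives only in memory. The source string and the names in it are made up for illustration.

```python
# Hypothetical module source; in practice this would be read from a .py file.
source = "def greet(name):\n    return 'hello ' + name\n"

# Compile to a code object that exists only in memory; nothing is written to disk.
code = compile(source, "<in-memory module>", "exec")

# Execute the code object in a fresh namespace, much as an import would.
namespace = {}
exec(code, namespace)

result = namespace["greet"]("world")
print(result)
```

The compile step is repeated in every new process, which is exactly the cost that cached byte code files avoid.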


> A
> central pyc repository would be something I was thinking about, but I
> admit it's a half baked or not even that, probably quarter baked idea.

A central .pyc repository doesn't eliminate the issues developers may
have with byte code files, it just puts them somewhere else, out of
sight, where they are more likely to bite.

--
Steven

Daniel Fetchinson

Feb 3, 2010, 5:55:57 AM
to Python
>>> I like seeing them in the same place as the source file, because when I
>>> start developing a module, I often end up renaming it multiple times
>>> before it settles on a final name. When I rename or move it, I delete
>>> the .pyc file, and that ensures that if I miss changing an import, and
>>> try to import the old name, it will fail.
>>>
>>> By hiding the .pyc file elsewhere, it is easy to miss deleting one, and
>>> then the import won't fail, it will succeed, but use the old, obsolete
>>> byte code.
>>
>>
>> Okay, I see your point but I think your argument about importing shows
>> that python is doing something suboptimal because I have to worry about
>> .pyc files. Ideally, I only would need to worry about python source
>> files.
>
> That's no different from any language that is compiled: you have to worry
> about keeping the compiled code (byte code or machine language) in sync
> with the source code.

True.

> Python does most of that for you: it automatically recompiles the source
> whenever the source code's last modified date stamp is newer than that of
> the byte code. So to a first approximation you can forget all about
> the .pyc files and just care about the source.

True, but the .pyc file is lying around and I always have to do 'ls
-al | grep -v pyc' in my python source directory.

> But that's only a first approximation. You might care about the .pyc
> files if:
>
> (1) you want to distribute your application in a non-human readable
> format;

Sure, I do care about pyc files, of course, I just would prefer to
have them at a separate location.

> (2) if you care about clutter in your file system;

You mean having an extra directory structure for the pyc files? This I
think would be better than having the pyc files in the source
directory, but we are getting into 'gut feelings' territory :)

> (3) if you suspect a bug in the compiler;

If the pyc files are somewhere else you can still inspect them if you want.

> (4) if you are working with byte-code hacks;

Again, just because they are somewhere else doesn't mean you can't get to them.

> (5) if the clock on your PC is wonky;

Same as above.

> (6) if you leave random .pyc files floating around earlier in the
> PYTHONPATH than your source files;
>
> etc.
>
>
>
>
>> There is now a chance to 'fix' (quotation marks because maybe
>> there is nothing to fix, according to some) this issue and make all pyc
>> files go away and have Python magically do the right thing.
>
> Famous last words...
>
> The only ways I can see to have Python magically do the right thing in
> all cases would be:
>
> (1) Forget about byte-code compiling, and just treat Python as a purely
> interpreted language. If you think Python is slow now...

I'm not advocating this option, naturally.

> (2) Compile as we do now, but only keep the byte code in memory. This
> would avoid all worries about scattered .pyc files, but would slow Python
> down significantly *and* reduce functionality (e.g. losing the ability to
> distribute non-source files).

I'm not advocating this option either.

> Neither of these is seriously an option.

Agreed.

>> A
>> central pyc repository would be something I was thinking about, but I
>> admit it's a half baked or not even that, probably quarter baked idea.
>
> A central .pyc repository doesn't eliminate the issues developers may
> have with byte code files, it just puts them somewhere else, out of
> sight, where they are more likely to bite.

Here is an example: shared object files. If your code needs them, you
can use them easily, you can access them easily if you want to, but
they are not in the directory where you keep your C files. They are
somewhere in /usr/lib for example, where they are conveniently
collected, you can inspect them, look at them, distribute them, do
basically whatever you want, but they are out of the way, and 99% of
the time while you develop your code, you don't need them. In the 1%
of the case you can easily get at them in the centralized location,
/usr/lib in our example.

Of course the relationship between C source files and shared objects
is not parallel to the relationship between Python source files and the
created pyc files, please don't nitpick on this point. The analogy is
in the sense that your project inevitably needs for whatever reason
some binary files which are rarely needed at hand, only the
linker/compiler/interpreter/etc needs to know where they are. These
files can be stored separately, but at a location where one can
inspect them if needed (which rarely happens).

Steven D'Aprano

Feb 3, 2010, 4:17:42 PM
to
On Wed, 03 Feb 2010 11:55:57 +0100, Daniel Fetchinson wrote:

[...]


>> Python does most of that for you: it automatically recompiles the
>> source whenever the source code's last modified date stamp is newer
>> than that of the byte code. So to a first approximation you can forget
>> all about the .pyc files and just care about the source.
>
> True, but the .pyc file is lying around and I always have to do 'ls -al
> | grep -v pyc' in my python source directory.


So alias a one-word name to that :)


[...]


> Here is an example: shared object files. If your code needs them, you
> can use them easily, you can access them easily if you want to, but they
> are not in the directory where you keep your C files. They are somewhere
> in /usr/lib for example, where they are conveniently collected, you can
> inspect them, look at them, distribute them, do basically whatever you
> want, but they are out of the way, and 99% of the time while you develop
> your code, you don't need them. In the 1% of the case you can easily get
> at them in the centralized location, /usr/lib in our example.
>
> Of course the relationship between C source files and shared objects is
> not parallel to the relationship between Python source files and the created
> pyc files, please don't nitpick on this point. The analogy is in the
> sense that your project inevitably needs for whatever reason some binary
> files which are rarely needed at hand, only the
> linker/compiler/interpreter/etc needs to know where they are. These
> files can be stored separately, but at a location where one can inspect
> them if needed (which rarely happens).

I'll try not to nit-pick :)

When an object file is in /usr/lib, you're dealing with it as a user.
You, or likely someone else, have almost certainly compiled it in a
different directory and then used make to drop it in place. It's now a
library, you're a user of that library, and you don't care where the
object file is so long as your app can find it (until you have a
conflict, and then you do).

While you are actively developing the library, on the other hand, the
compiler typically puts the object file in the same directory as the
source file. (There may be an option to gcc to do otherwise, but surely
most people don't use it often.) While the library is still being
actively developed, the last thing you want is for the object file to be
placed somewhere other than in your working directory. A potentially
unstable or broken library could end up in /usr/lib and stomp all over a
working version. Even if it doesn't, it means you have to be flipping
backwards and forwards between two locations to get anything done.

Python development is much the same; the only(?) differences are that we
have a lower threshold between "in production" and "in development", and
that we typically install both the source and the binary instead of just
the binary.

When you are *using* a library/script/module, you don't care whether
import uses the .py file or the .pyc, and you don't care where they are,
so long as they are in your PYTHONPATH (and there are no conflicts). But
I would argue that while you are *developing* the module, it would be more
nuisance than help to have the .pyc file anywhere other than immediately
next to the .py file (either in the same directory, or in a clearly named
sub-directory).

--
Steven

Daniel Fetchinson

Feb 3, 2010, 5:28:14 PM
to Python

Okay, I think we got to a point where it's more about rationalizing
gut feelings than factual stuff. But that's okay,
system/language/architecture design is often more about gut
feelings than facts so nothing to be too surprised about :)
