[Python-Dev] #12982: Should -O be required to *read* .pyo files?


Terry Reedy

Jun 12, 2012, 2:16:04 PM
to pytho...@python.org
http://bugs.python.org/issue12982

Currently, cpython requires the -O flag to *read* .pyo files as well as
to write them. This is a nuisance to people who receive them from
others, without the source. The originator of the issue quotes the
following from the doc (without giving the location).

"It is possible to have a file called spam.pyc (or spam.pyo when -O is
used) without a file spam.py for the same module. This can be used to
distribute a library of Python code in a form that is moderately hard to
reverse engineer."

There is no warning that .pyo files are viral, in a sense. The user has
to use -O, which is a) a nuisance to remember if he has multiple scripts
and some need it and some not, and b) makes his own .py files used with
.pyo imports cached as .pyo, without docstrings, like it or not.

Currently, the easiest workaround is to rename .pyo to .pyc and all
seems to work fine, even with a mixture of true .pyc and renamed .pyo
files. (The same is true with the -O flag and no renaming.) This
suggests that there is no current reason for the restriction in that the
*execution* of bytecode is not affected by the -O flag. (Another
workaround might be a custom importer -- but this is not trivial,
apparently.)
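
A minimal sketch of the rename workaround mentioned above, assuming a flat directory of sourceless .pyo files produced by an interpreter that still uses that extension (the suffix was dropped in Python 3.5 by PEP 488). The helper name copy_pyo_as_pyc is hypothetical; it copies rather than renames so that the originals stay in place.

```python
import shutil
from pathlib import Path


def copy_pyo_as_pyc(directory):
    """Copy every X.pyo in `directory` to X.pyc unless X.pyc already exists.

    Returns the list of .pyc paths that were created.
    """
    created = []
    for pyo in sorted(Path(directory).glob("*.pyo")):
        pyc = pyo.with_suffix(".pyc")
        if pyc.exists():
            # Never clobber a genuine .pyc that is already in place.
            continue
        shutil.copy2(pyo, pyc)
        created.append(pyc)
    return created
```

Whether the copied files then behave correctly without -O is exactly the open question of this thread; the sketch only automates the file handling.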

So is the import restriction either an accident or obsolete holdover? If
so, can removing it be treated as a bugfix and put into current
releases, or should it be treated as an enhancement only for a future
release?

Or is the restriction an intentional reservation of the possibility of
making *execution* depend on the flag? Which would mean that the
restriction should be kept and only the doc changed?

--
Terry Jan Reedy

_______________________________________________
Python-Dev mailing list
Pytho...@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/dev-python%2Bgarchive-30976%40googlegroups.com

Ethan Furman

Jun 12, 2012, 2:41:52 PM
to pytho...@python.org
I have no history so cannot say what was supposed to happen, but my
$0.02 would be that if -O is *not* specified then we should try to read
.pyc, then .pyo, and finally .py. In other words, I vote for -O being a
write flag, not a read flag.
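
The lookup order suggested here can be written down as a small model. This is only a sketch of the sourceless case (no .py present), with hypothetical function names; it is not how the import machinery is implemented. The message does not say what should happen under -O, so the sketch leaves that branch unchanged.

```python
def sourceless_suffixes(optimize, proposed=False):
    """Bytecode suffixes tried, in order, when no .py file exists.

    optimize -- True if the interpreter was started with -O
    proposed -- False models the behaviour described in this thread,
                True models the fallback suggested in this message
    """
    if optimize:
        # Under -O only optimized bytecode is considered (both models).
        return [".pyo"]
    if proposed:
        # Suggested rule: prefer .pyc, fall back to .pyo.
        return [".pyc", ".pyo"]
    return [".pyc"]


def pick(existing, optimize, proposed=False):
    """Return the first suffix in the search order that is in `existing`."""
    for suffix in sourceless_suffixes(optimize, proposed):
        if suffix in existing:
            return suffix
    return None  # the import would fail
```

With only a .pyo on disk and no -O, pick({".pyo"}, optimize=False) gives None under the current rule and ".pyo" when proposed=True.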

~Ethan~

Alexandre Zani

Jun 12, 2012, 2:49:04 PM
to Ethan Furman, pytho...@python.org
What if I change .py?


Ethan Furman

Jun 12, 2012, 3:14:14 PM
to pytho...@python.org
Well, the case in question is that there is no .py available.

But if it were available, and you changed it, then it would and should
work just like it does now -- if .py is newer, compile it; if -O was
specified, compile it optimized; now run the compiled code.

~Ethan~

Ronan Lamy

Jun 12, 2012, 3:57:37 PM
to Ethan Furman, pytho...@python.org

I don't know much about the history either, but under PEP 3147, there
are really two cases:
* .pyc and .pyo as compilation caches. These live in __pycache__/ and
have a cache_tag, their filename looks like
pkg/__pycache__/module.cpython-33.pyc and their only role is to speed up
imports.
* .pyc and .pyo as standalone, precompiled sources for modules. These
are found in the same place as .py files (e.g. pkg/module.pyc).

In the first case, I think that -O should dictate which of .pyc and .pyo
is used, while the other is completely ignored.

In the second case, both .pyc and .pyo should always be considered as
valid module sources, because -O is a compilation flag and loading a
bytecode file doesn't involve compilation. At most, -O could switch the
priority between .pyc and .pyo.

2.7 doesn't really differentiate between cached .pyc and
standalone .pyc, so I don't know if a consistent behaviour can be
achieved. Maybe the presence or absence of a matching .py can be used to
trigger the first or second case above.
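
The two cases can be told apart from the path alone. A small sketch, assuming a Python new enough to provide importlib.util.cache_from_source (3.4 or later); the helper names are hypothetical. Note that current interpreters no longer write .pyo at all, so the cache name returned here always ends in .pyc.

```python
import importlib.util
from pathlib import PurePath


def cache_path_for(source_path):
    """Return the PEP 3147 cache location for a source file."""
    return importlib.util.cache_from_source(source_path)


def is_cache_file(bytecode_path):
    """True if the file sits in a __pycache__ directory (the first case)."""
    return PurePath(bytecode_path).parent.name == "__pycache__"
```

For example, is_cache_file("pkg/__pycache__/module.cpython-33.pyc") is true, while is_cache_file("pkg/module.pyc") is false, which is the standalone case.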

Kristján Valur Jónsson

Jun 12, 2012, 6:03:30 PM
to Ronan Lamy, Ethan Furman, pytho...@python.org
[Vaguely related:
-B prevents the writing of .pyc and .pyo files (I don't know how it works for PEP 3147).
However, it doesn't prevent the _reading_ of said files. Preventing reads has been discussed here before and was considered useful, since leftover .pyc files tend to stick around. Maybe a -BB flag should be considered?]
K
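
For reference, the effect of -B is visible from inside the interpreter as sys.dont_write_bytecode. A small sketch that starts a child interpreter and reports the setting; the helper name is hypothetical, and the flag only governs writing, as the message says.

```python
import subprocess
import sys


def dont_write_bytecode_with(*flags):
    """Start a child interpreter with `flags`; report sys.dont_write_bytecode."""
    result = subprocess.run(
        [sys.executable, *flags, "-c",
         "import sys; print(sys.dont_write_bytecode)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"
```

Calling dont_write_bytecode_with("-B") returns True; existing bytecode files are still read in that child process.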

________________________________________
From: python-dev-bounces+kristjan=ccpgam...@python.org [python-dev-bounces+kristjan=ccpgam...@python.org] on behalf of Ronan Lamy [ronan...@gmail.com]
Sent: 12 June 2012 19:57
To: Ethan Furman
Cc: pytho...@python.org
Subject: Re: [Python-Dev] #12982: Should -O be required to *read* .pyo files?

R. David Murray

Jun 13, 2012, 1:58:10 PM
to pytho...@python.org
On Wed, 13 Jun 2012 13:19:43 -0400, Brett Cannon <br...@python.org> wrote:
> On Tue, Jun 12, 2012 at 2:16 PM, Terry Reedy <tjr...@udel.edu> wrote:
>
> > http://bugs.python.org/issue12982
> >
> > Currently, cpython requires the -O flag to *read* .pyo files as well as
> > to write them. This is a nuisance to people who receive them from others,
> > without the source. The originator of the issue quotes the following from
> > the doc (without giving the location).
[...]
> > So is the import restriction either an accident or obsolete holdover?
>
> Neither. .pyo files are actually different from .pyc files in terms of what
> bytecode they may emit. Currently -O causes all asserts to be left out of
> the bytecode and -OO leaves out all docstrings on top of what -O does. This
> makes a difference if you are trying to introspect at the interpreter
> prompt or are testing things in development and want those asserts to be
> triggered if needed.
>
> > If so, can removing it be treated as a bugfix and put into current
> > releases, or should it be treated as an enhancement only for a future
> > release?
> >
> The behaviour shouldn't change. There has been talk of doing even more
> aggressive optimizing under -O, which once again would cause an even larger
> deviation between a .pyo file and a .pyc file (e.g. allowing Python code to
> hook into the peephole optimizer or an entirely new AST optimizer).
>
> > Or is the restriction an intentional reservation of the possibility of
> > making *execution* depend on the flag? Which would mean that the
> > restriction should be kept and only the doc changed?
>
> The docs should get updated to be more clear.

OK, but you didn't answer the question :). If I understand correctly,
everything you said applies to *writing* the bytecode, not reading it.

So, is there any reason to not use the .pyo file (if that's all that is
around) when -O is not specified?

The only technical reason I can see why -O should be required for a .pyo
file to be used (*if* it is the only thing around) is if it won't *run*
without the -O switch. Is there any expectation that that will ever be
the case?

On the other hand, not wanting to make any extra effort to support sourceless
distributions could be a reason as well. But if that's the case we
should be transparent about it.

--David

Brett Cannon

Jun 13, 2012, 1:19:43 PM
to Terry Reedy, pytho...@python.org
On Tue, Jun 12, 2012 at 2:16 PM, Terry Reedy <tjr...@udel.edu> wrote:
> http://bugs.python.org/issue12982
>
> Currently, cpython requires the -O flag to *read* .pyo files as well as to write them. This is a nuisance to people who receive them from others, without the source. The originator of the issue quotes the following from the doc (without giving the location).
>
> "It is possible to have a file called spam.pyc (or spam.pyo when -O is used) without a file spam.py for the same module. This can be used to distribute a library of Python code in a form that is moderately hard to reverse engineer."
>
> There is no warning that .pyo files are viral, in a sense. The user has to use -O, which is a) a nuisance to remember if he has multiple scripts and some need it and some not, and b) makes his own .py files used with .pyo imports cached as .pyo, without docstrings, like it or not.
>
> Currently, the easiest workaround is to rename .pyo to .pyc and all seems to work fine, even with a mixture of true .pyc and renamed .pyo files. (The same is true with the -O flag and no renaming.) This suggests that there is no current reason for the restriction in that the *execution* of bytecode is not affected by the -O flag. (Another workaround might be a custom importer -- but this is not trivial, apparently.)

In Python 3.3 it's actually trivial.

> So is the import restriction either an accident or obsolete holdover?

Neither. .pyo files are actually different from .pyc files in terms of what bytecode they may emit. Currently -O causes all asserts to be left out of the bytecode and -OO leaves out all docstrings on top of what -O does. This makes a difference if you are trying to introspect at the interpreter prompt or are testing things in development and want those asserts to be triggered if needed.

> If so, can removing it be treated as a bugfix and put into current releases, or should it be treated as an enhancement only for a future release?

The behaviour shouldn't change. There has been talk of doing even more aggressive optimizing under -O, which once again would cause an even larger deviation between a .pyo file and a .pyc file (e.g. allowing Python code to hook into the peephole optimizer or an entirely new AST optimizer).

> Or is the restriction an intentional reservation of the possibility of making *execution* depend on the flag? Which would mean that the restriction should be kept and only the doc changed?

The docs should get updated to be more clear.

Toshio Kuratomi

Jun 13, 2012, 2:20:24 PM
to pytho...@python.org
On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>
> OK, but you didn't answer the question :). If I understand correctly,
> everything you said applies to *writing* the bytecode, not reading it.
>
> So, is there any reason to not use the .pyo file (if that's all that is
> around) when -O is not specified?
>
> The only technical reason I can see why -O should be required for a .pyo
> file to be used (*if* it is the only thing around) is if it won't *run*
> without the -O switch. Is there any expectation that that will ever be
> the case?
>
Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
Another piece of code can legally import that and try to use the docstring
for something. This would fail if only the .pyo was present.

Of course, it would also fail under the present behaviour since no .py or
.pyc was present to be imported. The error that's displayed might be
clearer if we fail when attempting to read a .py/.pyc rather than failing
when the docstring is found to be missing, though.

-Toshio

Ethan Furman

Jun 13, 2012, 2:46:47 PM
to pytho...@python.org
Brett Cannon wrote:
> On Tue, Jun 12, 2012 at 2:16 PM, Terry Reedy wrote:
>
>> http://bugs.python.org/issue12982
But what does this have to do with those cases where *only* the .pyo
file is available, and we are trying to run it? In these cases it would
have to be in the main folder (not __pycache__) which means somebody did
it deliberately.

~Ethan~

Antoine Pitrou

Jun 13, 2012, 2:46:50 PM
to pytho...@python.org
On Wed, 13 Jun 2012 11:20:24 -0700
Toshio Kuratomi <a.ba...@gmail.com> wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> >
> > OK, but you didn't answer the question :). If I understand correctly,
> > everything you said applies to *writing* the bytecode, not reading it.
> >
> > So, is there any reason to not use the .pyo file (if that's all that is
> > around) when -O is not specified?
> >
> > The only technical reason I can see why -O should be required for a .pyo
> > file to be used (*if* it is the only thing around) is if it won't *run*
> > without the -O switch. Is there any expectation that that will ever be
> > the case?
> >
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.

Not only docstrings, but also asserts. I think running a pyo without -O
would be a bug.

Regards

Antoine.

Ethan Furman

Jun 13, 2012, 2:41:56 PM
to pytho...@python.org
Toshio Kuratomi wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>> OK, but you didn't answer the question :). If I understand correctly,
>> everything you said applies to *writing* the bytecode, not reading it.
>>
>> So, is there any reason to not use the .pyo file (if that's all that is
>> around) when -O is not specified?
>>
>> The only technical reason I can see why -O should be required for a .pyo
>> file to be used (*if* it is the only thing around) is if it won't *run*
>> without the -O switch. Is there any expectation that that will ever be
>> the case?
>>
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.

Why should it fail? -OO causes docstring access to return None, just as
if a docstring had not been specified in the first place. Any decent
code will be checking for an undefined docstring -- after all, they are
not rare.
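
A short sketch of that point, using the optimize argument of the built-in compile() to stand in for -OO (optimize=2 strips docstrings, as -OO does). The names SOURCE, load_spam and describe are made up for the illustration.

```python
SOURCE = '''
def spam():
    "Return the answer."
    return 42
'''


def load_spam(optimize):
    """Compile SOURCE at the given optimization level and return spam()."""
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["spam"]


def describe(func):
    """Use the docstring when present, with a fallback when it is None."""
    return func.__doc__ or "(no docstring available)"
```

load_spam(0).__doc__ is the original string, load_spam(2).__doc__ is None, and describe() copes with both.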

~Ethan~

R. David Murray

Jun 13, 2012, 2:57:35 PM
to pytho...@python.org
On Wed, 13 Jun 2012 11:20:24 -0700, Toshio Kuratomi <a.ba...@gmail.com> wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> >
> > OK, but you didn't answer the question :). If I understand correctly,
> > everything you said applies to *writing* the bytecode, not reading it.
> >
> > So, is there any reason to not use the .pyo file (if that's all that is
> > around) when -O is not specified?
> >
> > The only technical reason I can see why -O should be required for a .pyo
> > file to be used (*if* it is the only thing around) is if it won't *run*
> > without the -O switch. Is there any expectation that that will ever be
> > the case?
> >
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.

Yes, but that's not what I'm talking about. I would treat code that
depends on the presence of docstrings and doesn't have a fallback for
dealing with their absence as buggy code, since anyone might decide
to run that code with -OO, and the code would fail in that case too.

I'm talking about a case where the code runs correctly with -O (or -OO),
but fails if the code from the .pyo is loaded and python is run
*without* -O (or -OO).

> Of course, it would also fail under the present behaviour since no .py or
> .pyc was present to be imported. The error that's displayed might be
> clearer if we fail when attempting to read a .py/.pyc rather than failing
> when the docstring is found to be missing, though.

Well, right now if there is only a .pyo file and you run python without
-O, you get an import error. The question is, is that the way we really
want it to work?

R. David Murray

Jun 13, 2012, 3:13:54 PM
to pytho...@python.org
On Wed, 13 Jun 2012 20:46:50 +0200, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Wed, 13 Jun 2012 11:20:24 -0700
> Toshio Kuratomi <a.ba...@gmail.com> wrote:
> > On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> > >
> > > OK, but you didn't answer the question :). If I understand correctly,
> > > everything you said applies to *writing* the bytecode, not reading it.
> > >
> > > So, is there any reason to not use the .pyo file (if that's all that is
> > > around) when -O is not specified?
> > >
> > > The only technical reason I can see why -O should be required for a .pyo
> > > file to be used (*if* it is the only thing around) is if it won't *run*
> > > without the -O switch. Is there any expectation that that will ever be
> > > the case?
> > >
> > Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> > Another piece of code can legally import that and try to use the docstring
> > for something. This would fail if only the .pyo was present.
>
> Not only docstrings, but also asserts. I think running a pyo without -O
> would be a bug.

Again, a program that depends on asserts is buggy.

As Ethan pointed out we are asking about the case where someone is
*deliberately* setting the .pyo file up to be run as the "normal"
case.

I'm not sure we want to support that, I just want us to be clear
about why we don't :)

--David

Ethan Furman

Jun 13, 2012, 3:36:55 PM
to pytho...@python.org
R. David Murray wrote:
> On Wed, 13 Jun 2012 20:46:50 +0200, Antoine Pitrou <soli...@pitrou.net> wrote:
>> On Wed, 13 Jun 2012 11:20:24 -0700
>> Toshio Kuratomi <a.ba...@gmail.com> wrote:
>>> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>>>> OK, but you didn't answer the question :). If I understand correctly,
>>>> everything you said applies to *writing* the bytecode, not reading it.
>>>>
>>>> So, is there any reason to not use the .pyo file (if that's all that is
>>>> around) when -O is not specified?
>>>>
>>>> The only technical reason I can see why -O should be required for a .pyo
>>>> file to be used (*if* it is the only thing around) is if it won't *run*
>>>> without the -O switch. Is there any expectation that that will ever be
>>>> the case?
>>>>
>>> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
>>> Another piece of code can legally import that and try to use the docstring
>>> for something. This would fail if only the .pyo was present.
>> Not only docstrings, but also asserts. I think running a pyo without -O
>> would be a bug.
>
> Again, a program that depends on asserts is buggy.
>
> As Ethan pointed out we are asking about the case where someone is
> *deliberately* setting the .pyo file up to be run as the "normal"
> case.
>
> I'm not sure we want to support that, I just want us to be clear
> about why we don't :)

Currently, the alternative to supporting this behavior is to either:

1) require the end-user to specify -O (major nuisance)

or

2) have the distributor rename the .pyo file to .pyc

I think 1 is a non-starter (non-finisher? ;) but I could live with 2 --
after all, if someone is going to the effort of removing the .py file
and moving the .pyo file into its place, renaming the .pyo to .pyc is
trivial.

So the question, then, is: is option 2 better than just supporting .pyo
files without -O when they are all that is available?

~Ethan~

Terry Reedy

Jun 13, 2012, 4:06:22 PM
to pytho...@python.org
On 6/13/2012 2:46 PM, Antoine Pitrou wrote:

> Not only docstrings, but also asserts. I think running a pyo without -O
> would be a bug.

That cat is already out of the bag ;-)
People are doing that now by renaming x.pyo to x.pyc.
Brett claims that it is also easy to do in 3.3 with a custom importer.

--
Terry Jan Reedy

Terry Reedy

Jun 13, 2012, 4:32:01 PM
to pytho...@python.org
On 6/13/2012 1:19 PM, Brett Cannon wrote:

> On Tue, Jun 12, 2012 at 2:16 PM, Terry Reedy <tjr...@udel.edu
> <mailto:tjr...@udel.edu>> wrote:
>
> http://bugs.python.org/issue12982
>
> Currently, cpython requires the -O flag to *read* .pyo files as well
> as to write them. This is a nuisance to people who receive them
> from others, without the source. The originator of the issue quotes
> the following from the doc (without giving the location).
>
> "It is possible to have a file called spam.pyc (or spam.pyo when -O
> is used) without a file spam.py for the same module. This can be
> used to distribute a library of Python code in a form that is
> moderately hard to reverse engineer."
>
> There is no warning that .pyo files are viral, in a sense. The user
> has to use -O, which is a) a nuisance to remember if he has multiple
> scripts and some need it and some not, and b) makes his own .py
> files used with .pyo imports cached as .pyo, without docstrings,
> like it or not.
>
> Currently, the easiest workaround is to rename .pyo to .pyc and all
> seems to work fine, even with a mixture of true .pyc and renamed
> .pyo files. (The same is true with the -O flag and no renaming.)
> This suggests that there is no current reason for the restriction in
> that the *execution* of bytecode is not affected by the -O flag.
> (Another workaround might be a custom importer -- but this is not
> trivial, apparently.)
>
>
> In Python 3.3 it's actually trivial.

For you. Anyway, I am sure Michael of #12982 is using an earlier version.

> So is the import restriction either an accident or obsolete holdover?
>
>
> Neither. .pyo files are actually different from .pyc files in terms of
> what bytecode they may emit. Currently -O causes all asserts to be left
> out of the bytecode and -OO leaves out all docstrings on top of what -O
> does. This makes a difference if you are trying to introspect at the
> interpreter prompt or are testing things in development and want those
> asserts to be triggered if needed.

I suggested to Michael that he should request an all-.pyc library for
that reason.

> If so, can removing it be treated as a bugfix and put into current
> releases, or should it be treated as an enhancement only for a
> future release?
>
>
> The behaviour shouldn't change. There has been talk of doing even more
> aggressive optimizing under -O, which once again would cause an even
> larger deviation between a .pyo file and a .pyc file (e.g. allowing
> Python code to hook into the peephole optimizer or an entirely new AST
> optimizer).

Would such a change mean that *reading* a .pyo file as if it were a .pyc
file would start failing? (It now works, by renaming the file.)
If so, would not it be better to rely on having a different magic number
*in* the file rather than on its mutable external name?

You just said above that evading import restriction by name is now
trivial. So what is the point of keeping it? If, in the future, there
*are* separate execution pathways*, and we want the .pyo pathway closed
unless -O is passed, then it seems that that could only be enforced by a
magic number in the file.
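
To illustrate what a check on the file contents (rather than on the name) looks like, here is a minimal sketch, assuming Python 3.4 or later for importlib.util.MAGIC_NUMBER. The helper name is hypothetical. The existing magic number identifies the bytecode version only; a distinct value for optimized files is the suggestion made here, not something the sketch implements.

```python
import importlib.util


def has_current_magic(path):
    """True if the file starts with this interpreter's bytecode magic number."""
    magic = importlib.util.MAGIC_NUMBER
    with open(path, "rb") as handle:
        return handle.read(len(magic)) == magic
```

Renaming a file does not change the result of this check, which is the property being asked for.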

*Unladen Swallow would have *optionally* produced cache files completely
different from current bytecode, with a different extension. A couple of
people have suggested using wordcode instead of bytecode. If this were
also introduced as an option, its cache files would also need a
different extension and magic number.

> Or is the restriction an intentional reservation of the possibility
> of making *execution* depend on the flag? Which would mean that the
> restriction should be kept and only the doc changed?
>
> The docs should get updated to be more clear.

Steven D'Aprano

Jun 13, 2012, 8:55:59 PM
to pytho...@python.org
On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:

> So, is there any reason to not use the .pyo file (if that's all that is
> around) when -O is not specified?

.pyo and .pyc files have potentially different semantics. Right now,
.pyo files don't include asserts, so that's one difference right there.
In the future there may be more aggressive optimizations.

Good practice is to never write an assert that actually changes the
semantics of your program, but in practice people don't write asserts
correctly, e.g. they use them for checking user-input or function
parameters.
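
A sketch of the kind of misuse meant here, again using compile(..., optimize=1) to stand in for -O. The function names are invented for the example: one validates its argument with assert, the other with an explicit raise.

```python
SOURCE = '''
def set_age_with_assert(age):
    assert age >= 0, "age must be non-negative"   # vanishes under -O
    return age

def set_age_with_raise(age):
    if age < 0:                                   # survives under -O
        raise ValueError("age must be non-negative")
    return age
'''


def load(optimize):
    """Compile SOURCE at the given level and return its namespace."""
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace
```

With load(0), set_age_with_assert(-1) raises AssertionError; with load(1) the same call quietly returns -1, while set_age_with_raise(-1) still raises ValueError.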

So, no, we should never use .pyo files unless explicitly told to do so,
since doing so risks breaking poorly-written but otherwise working code.



--
Steven

Nick Coghlan

Jun 13, 2012, 9:48:08 PM
to Terry Reedy, pytho...@python.org
On Thu, Jun 14, 2012 at 6:06 AM, Terry Reedy <tjr...@udel.edu> wrote:
> On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
>
>> Not only docstrings, but also asserts. I think running a pyo without -O
>> would be a bug.
>
>
> That cat is already out of the bag ;-)
> People are doing that now by renaming x.pyo to x.pyc.
> Brett claims that it is also easy to do in 3.3 with a custom importer.

Right, but by resorting to either of those approaches, people are
clearly doing something that isn't formally supported by the core.
Yes, you can do it, and most of the time it will work out OK, but any
weird glitches that result are officially *not our problem*.

The main reason this matters is that the "__debug__" flag is
*supposed* to be process global - if you check it in one place, the
answer should be correct for all Python code loaded in the process. If
you load a .pyo file into a process running without -O (or a .pyc file
into a process running *with* -O), then you have broken that
assumption. Because the compiler understands __debug__, and is
explicitly free to make optimisations based on the value of that flag
at compile time (such as throwing away unreachable branches in if
statements or applying constant folding operations), the following
code will do different things if loaded from a .pyo file instead of
.pyc:

print("__debug__ is not a builtin, it is checked at compile time")
if __debug__:
    print("A .pyc file always has __debug__ == True")
else:
    print("A .pyo file always has __debug__ == False")

$ ./python -c "import foo"
__debug__ is not a builtin, it is checked at compile time
A .pyc file always has __debug__ == True
$ ./python -O -c "import foo"
__debug__ is not a builtin, it is checked at compile time
A .pyo file always has __debug__ == False
$ ./python __pycache__/foo.cpython-33.pyo
__debug__ is not a builtin, it is checked at compile time
A .pyo file always has __debug__ == False
$ ./python -O __pycache__/foo.cpython-33.pyc
__debug__ is not a builtin, it is checked at compile time
A .pyc file always has __debug__ == True
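
The same effect can be reproduced inside a single process, without any files. This is a sketch that uses the optimize argument of compile() in place of the -O switch; SOURCE and branch_taken are names made up for the illustration.

```python
SOURCE = '''
if __debug__:
    branch = "debug"
else:
    branch = "optimized"
'''


def branch_taken(optimize):
    """Compile SOURCE at `optimize` and report which branch the code kept."""
    namespace = {}
    exec(compile(SOURCE, "<sketch>", "exec", optimize=optimize), namespace)
    return namespace["branch"]
```

branch_taken(0) and branch_taken(1) give different answers in the same process, whatever flags that process was started with, because the choice is fixed when the code is compiled.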

Cheers,
Nick.

--
Nick Coghlan   |   ncog...@gmail.com   |   Brisbane, Australia

Terry Reedy

Jun 13, 2012, 9:54:30 PM
to pytho...@python.org
On 6/13/2012 8:55 PM, Steven D'Aprano wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>
>> So, is there any reason to not use the .pyo file (if that's all that is
>> around) when -O is not specified?
>
> .pyo and .pyc files have potentially different semantics. Right now,
> .pyo files don't include asserts, so that's one difference right there.
> In the future there may be more aggressive optimizations.
>
> Good practice is to never write an assert that actually changes the
> semantics of your program, but in practice people don't write asserts
> correctly, e.g. they use them for checking user-input or function
> parameters.
>
> So, no, we

You mean the interpreter?

> should never use

Do you mean import or execute?
Currently, the interpreter executes any bytecode that gets imported.

> .pyo files unless explicitly told to do so,

What constitutes 'explicitly told to do so'? Currently, an 'optimized'
file written as .pyo gets imported (and hence executed) if
1) the interpreter is started with -O
2) a custom importer ignores the absence of -O
3) someone renames x.pyo to x.pyc.
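
For route 2, a rough sketch of loading a bytecode file from an explicit path with importlib, assuming Python 3.5 or later (SourcelessFileLoader itself dates from 3.3). The function name is hypothetical, and this is not necessarily the importer Brett had in mind.

```python
import importlib.machinery
import importlib.util


def load_bytecode_file(name, path):
    """Import the bytecode file at `path` as module `name`, ignoring -O."""
    loader = importlib.machinery.SourcelessFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module
```

The loader checks the magic number in the file header but does not look at the interpreter's optimization level, so bytecode compiled with optimization runs as it was compiled. The returned module is not registered in sys.modules.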

> since doing so risks breaking poorly-written but otherwise working code.

Agreed, though a slightly different issue. Would you somehow disable 2)
or 3) if not considered 'explicit' enough?

--
Terry Jan Reedy

R. David Murray

Jun 13, 2012, 10:47:45 PM
to pytho...@python.org
On Thu, 14 Jun 2012 11:48:08 +1000, Nick Coghlan <ncog...@gmail.com> wrote:
> On Thu, Jun 14, 2012 at 6:06 AM, Terry Reedy <tjr...@udel.edu> wrote:
> > On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
> >
> >> Not only docstrings, but also asserts. I think running a pyo without -O
> >> would be a bug.
> >
> > That cat is already out of the bag ;-)
> > People are doing that now by renaming x.pyo to x.pyc.
> > Brett claims that it is also easy to do in 3.3 with a custom importer.
>
> Right, but by resorting to either of those approaches, people are
> clearly doing something that isn't formally supported by the core.
> Yes, you can do it, and most of the time it will work out OK, but any
> weird glitches that result are officially *not our problem*.
>
> The main reason this matters is that the "__debug__" flag is
> *supposed* to be process global - if you check it in one place, the

OK, the above are the two concrete reasons I have heard in this thread
for continuing the current behavior:

1) we do not wish to support running from .pyo files without -O
being on, even if it currently happens to work

2) the __debug__ setting is supposed to be process-global

Both of these are good reasons. IMO the issue should be closed with a
documentation fix, which could optionally include either or both of the
above motivations.

--David

Terry Reedy

Jun 13, 2012, 11:06:53 PM
to pytho...@python.org
On 6/13/2012 10:47 PM, R. David Murray wrote:
> On Thu, 14 Jun 2012 11:48:08 +1000, Nick Coghlan<ncog...@gmail.com> wrote:

>> Right, but by resorting to either of those approaches, people are
>> clearly doing something that isn't formally supported by the core.

That was not clear to me until I read your post -- the key word being
formally (or officially). I see now that distributing a sourceless
library as a mixture of .pyc and .pyo files is even crazier than I thought.

>> Yes, you can do it, and most of the time it will work out OK, but any
>> weird glitches that result are officially *not our problem*.
>>
>> The main reason this matters is that the "__debug__" flag is
>> *supposed* to be process global - if you check it in one place, the
>
> OK, the above are the two concrete reasons I have heard in this thread
> for continuing the current behavior:
>
> 1) we do not wish to support running from .pyo files without -O
> being on, even if it currently happens to work
>
> 2) the __debug__ setting is supposed to be process-global
>
> Both of these are good reasons. IMO the issue should be closed with a
> documentation fix, which could optionally include either or both of the
> above motivations.

I agree. We have gotten what we need from this thread.

--
Terry Jan Reedy

Steven D'Aprano

Jun 13, 2012, 11:12:16 PM
to pytho...@python.org
On Wed, Jun 13, 2012 at 03:13:54PM -0400, R. David Murray wrote:

> Again, a program that depends on asserts is buggy.
>
> As Ethan pointed out we are asking about the case where someone is
> *deliberately* setting the .pyo file up to be run as the "normal"
> case.

You can't be sure that the .pyo file is there due to *deliberate*
choice. It may be accidental. Perhaps the end user has ignorantly
deleted the .pyc file, but failed to delete the .pyo file. Perhaps the
developer has merely made a mistake.

Under current behaviour, deleting the .pyc file shouldn't matter:

- if the source file is available, that will be used
- if not, a clear error is raised

Under the proposed change:

- if the source file is *newer* than the .pyo file, it will be used
- but if it is missing or older, the .pyo file is used

This opens a potential mismatch between the code I *think* is being run,
and the actual code being run: I think the .py[c] code is running when
the .pyo is actually running.
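
The two rule sets can be written out as a small decision function. This is a sketch of the scenario as described in this message only (python run without -O, the .pyc already deleted); the function name and the use of plain numbers for modification times are assumptions made for the illustration.

```python
def code_that_runs(py_mtime, pyo_mtime, proposed):
    """Which file supplies the code without -O when no .pyc exists.

    py_mtime and pyo_mtime are modification times, or None if the
    corresponding file is missing.
    """
    if not proposed:
        # Current behaviour: use the source, or fail clearly.
        if py_mtime is not None:
            return "source"
        raise ImportError("no source and no .pyc")
    # Proposed behaviour: the .pyo wins unless the source is newer.
    if py_mtime is not None and (pyo_mtime is None or py_mtime > pyo_mtime):
        return "source"
    if pyo_mtime is not None:
        return ".pyo"
    raise ImportError("nothing importable found")
```

The mismatch is the case code_that_runs(50, 100, proposed=True), which returns ".pyo" even though a source file is sitting right there.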

Realistically, we should expect that most people don't *sufficiently*
test their apps under -O (if at all!) even if they are aware that there
are differences in behaviour. I know I don't, and I know I should. This
is just a matter of priority: testing without -O is a higher priority
for me than testing with -O and -OO.

The consequence is that I may then receive a mysterious bug report that
I can't duplicate, because the user correctly reports that they are
running *without* -O, but unknown to anyone, they are actually running
the .pyo file.


> I'm not sure we want to support that, I just want us to be clear
> about why we don't :)

If I receive a bug report that only occurs under -O, then I immediately
suspect that the bug has something to do with assert.

If I receive a bug report that occurs without -O, under the proposed
change I can't be sure whether the optimized code or standard code is
running. That adds complexity and confusion.


--
Steven

Steven D'Aprano

Jun 13, 2012, 11:15:42 PM6/13/12
to pytho...@python.org
On Wed, Jun 13, 2012 at 04:06:22PM -0400, Terry Reedy wrote:
> On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
>
> >Not only docstrings, but also asserts. I think running a pyo without -O
> >would be a bug.
>
> That cat is already out of the bag ;-)
> People are doing that now by renaming x.pyo to x.pyc.
> Brett claims that it is also easy to do in 3.3 with a custom importer.

That's fine. Both steps require an overt, deliberate act, and so are
under the control of (and the responsibility of) the developer. It's not
something that could happen innocently by accident.



--
Steven

Steven D'Aprano

Jun 13, 2012, 11:25:25 PM6/13/12
to pytho...@python.org
On Wed, Jun 13, 2012 at 09:54:30PM -0400, Terry Reedy wrote:

> >So, no, we
>
> You mean the interpreter?

Yes.

> >should never use
>
> Do you mean import or execute?
> Current, the interpreter executes any bytecode that gets imported.

Both.

> >.pyo files unless explicitly told to do so,
>
> What constitutes 'explicitly told to do so'? Currently, an 'optimized'
> file written as .pyo gets imported (and hence executed) if
> 1) the interpreter is started with -O
> 2) a custom importer ignores the absence of -O
> 3) someone renames x.pyo to x.pyc.

Any of the above are fine by me.

I oppose this one:

4) the interpreter is started without -O but there is no .pyc file.

since it can lead to a mismatch between what I (the developer) think is
being run and what is actually being run (or imported).

For the avoidance of doubt, if my end-users secretly rename .pyo to .pyc
files, that's my problem, not the Python interpreter's problem. I
don't expect Python to be idiot-proof.



--
Steven

Ethan Furman

Jun 13, 2012, 11:39:04 PM6/13/12
to pytho...@python.org
Steven D'Aprano wrote:
> On Wed, Jun 13, 2012 at 03:13:54PM -0400, R. David Murray wrote:
>
>> Again, a program that depends on asserts is buggy.
>>
>> As Ethan pointed out we are asking about the case where someone is
>> *deliberately* setting the .pyo file up to be run as the "normal"
>> case.
>
> You can't be sure that the .pyo file is there due to *deliberate*
> choice. It may be accidental. Perhaps the end user has ignorantly
> deleted the .pyc file, but failed to delete the .pyo file. Perhaps the
> developer has merely made a mistake.

You can't just delete the .pyc file to get the .pyo file to run;
remember in 3.x compiled files are kept in a __pycache__ folder, and if
there is no .py file the compiled files are ignored (correct me if I'm
wrong), so to get either the .pyc file /or/ the .pyo file to run
/without/ a .py file, you have to physically move the compiled file to
where the source file should be. It could still be accidental, but it's
far less likely to be.
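
As a rough illustration of the __pycache__ layout mentioned above, the standard library can report where the cached file for a source file would live. This sketch assumes Python 3.5 or later, where the "optimization" argument exists and optimized files are named .opt-N.pyc rather than .pyo; "spam.py" is a made-up file name and does not need to exist.

```python
import importlib.util

# Cache path for the unoptimized bytecode of a (hypothetical) spam.py.
plain = importlib.util.cache_from_source("spam.py", optimization="")

# Cache path for the same module compiled at optimization level 1 (-O).
opt1 = importlib.util.cache_from_source("spam.py", optimization=1)

print(plain)
print(opt1)
```

Both paths point into the cache directory rather than next to the source, so a compiled file has to be moved by hand before it can stand in for a missing .py file.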


> Under current behaviour, deleting the .pyc file shouldn't matter:
>
> - if the source file is available, that will be used
> - if not, a clear error is raised
>
> Under the proposed change:
>
> - if the source file is *newer* than the .pyo file, it will be used
> - but if it is missing or older, the .pyo file is used

Again, not in 3.x.

~Ethan~

Maciej Fijalkowski

Jun 14, 2012, 6:11:42 AM6/14/12
to R. David Murray, pytho...@python.org
On Wed, Jun 13, 2012 at 9:13 PM, R. David Murray <rdmu...@bitdance.com> wrote:
> On Wed, 13 Jun 2012 20:46:50 +0200, Antoine Pitrou <soli...@pitrou.net> wrote:
> > On Wed, 13 Jun 2012 11:20:24 -0700
> > Toshio Kuratomi <a.ba...@gmail.com> wrote:
> > > On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> > > >
> > > > OK, but you didn't answer the question :).  If I understand correctly,
> > > > everything you said applies to *writing* the bytecode, not reading it.
> > > >
> > > > So, is there any reason to not use the .pyo file (if that's all that is
> > > > around) when -O is not specified?
> > > >
> > > > The only technical reason I can see why -O should be required for a .pyo
> > > > file to be used (*if* it is the only thing around) is if it won't *run*
> > > > without the -O switch.  Is there any expectation that that will ever be
> > > > the case?
> > > >
> > > Yes.  For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> > > Another piece of code can legally import that and try to use the docstring
> > > for something.  This would fail if only the .pyo was present.
> >
> > Not only docstrings, but also asserts. I think running a pyo without -O
> > would be a bug.
>
> Again, a program that depends on asserts is buggy.
>
> As Ethan pointed out we are asking about the case where someone is
> *deliberately* setting the .pyo file up to be run as the "normal"
> case.
>
> I'm not sure we want to support that, I just want us to be clear
> about why we don't :)

The PyPy toolchain is an example of such a buggy program. And, oh, so are any tests. I would not be impressed if my Python read .pyo files out of nowhere when not running with the -O flag (I try very hard never to run Python with -O, because it's a different Python after all).

Antoine Pitrou

Jun 14, 2012, 6:25:24 AM6/14/12
to pytho...@python.org
On Wed, 13 Jun 2012 12:36:55 -0700
Ethan Furman <et...@stoneleaf.us> wrote:
>
> Currently, the alternative to supporting this behavior is to either:
>
> 1) require the end-user to specify -O (major nuisance)
>
> or
>
> 2) have the distributor rename the .pyo file to .pyc
>
> I think 1 is a non-starter (non-finisher? ;) but I could live with 2 --
> after all, if someone is going to the effort of removing the .py file
> and moving the .pyo file into its place, renaming the .pyo to .pyc is
> trivial.
>
> So the question, then, is: is option 2 better than just supporting .pyo
> files without -O when they are all that is available?

Honestly, I think the best option would be to deprecate .pyo files as
well as the useless -O option. They only cause confusion without
providing any significant benefits.

(also, they ironically make Python installs bigger since both .pyc
and .pyo files have to be provided by system packages)

Regards

Antoine.

Floris Bruynooghe

Jun 14, 2012, 7:58:16 AM6/14/12
to pytho...@python.org
On 14 June 2012 11:25, Antoine Pitrou <soli...@pitrou.net> wrote:
> Honestly, I think the best option would be to deprecate .pyo files as
> well as the useless -O option. They only cause confusion without
> providing any significant benefits.

+1

But what happens to __debug__ and assert statements? I think it
should be possible to always put assert statements inside a __debug__
block and then make -O a simple switch for setting __debug__ to
False. If desired a simple strip tool could then easily remove
__debug__ blocks and (unused) docstrings.
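
A strip tool of the kind suggested here could be sketched with the standard ast module. This is a minimal illustration, not an existing tool: the class name is made up, only docstrings are handled (not __debug__ blocks), and it does not guard against a second string statement becoming the new docstring once the first is removed.

```python
import ast


class DocstringStripper(ast.NodeTransformer):
    """Hypothetical transformer that drops docstrings from an AST."""

    def _strip(self, node):
        self.generic_visit(node)  # handle nested functions and classes first
        if ast.get_docstring(node, clean=False) is not None:
            # Drop the leading string statement; keep the body non-empty.
            node.body = node.body[1:] or [ast.Pass()]
        return node

    visit_Module = _strip
    visit_ClassDef = _strip
    visit_FunctionDef = _strip
    visit_AsyncFunctionDef = _strip


source = 'def f():\n    "Some documentation."\n    return 1\n'
tree = DocstringStripper().visit(ast.parse(source))
ast.fix_missing_locations(tree)  # give the inserted nodes line numbers

namespace = {}
exec(compile(tree, "<stripped>", "exec"), namespace)
print(namespace["f"].__doc__)
```

Compiling the transformed tree directly avoids having to write source back out, which keeps the sketch independent of any particular unparsing helper.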

--
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org

Antoine Pitrou

Jun 14, 2012, 8:14:54 AM6/14/12
to pytho...@python.org
On Thu, 14 Jun 2012 12:58:16 +0100
Floris Bruynooghe <fl...@devork.be> wrote:
> On 14 June 2012 11:25, Antoine Pitrou <soli...@pitrou.net> wrote:
> > Honestly, I think the best option would be to deprecate .pyo files as
> > well as the useless -O option. They only cause confusion without
> > providing any significant benefits.
>
> +1
>
> But what happens to __debug__ and assert statements? I think it
> should be possible to always put assert statements inside a __debug__
> block and then create -O a simple switch for setting __debug__ to
> False. If desired a simple strip tool could then easily remove
> __debug__ blocks and (unused) docstrings.

I don't really see the point. In my experience there is no benefit to
removing assert statements in production mode. This is a C-specific
notion that doesn't really map very well to Python code. Do other
high-level languages have similar functionality?

Regards

Antoine.

R. David Murray

Jun 14, 2012, 9:14:59 AM6/14/12
to pytho...@python.org
On Thu, 14 Jun 2012 14:14:54 +0200, Antoine Pitrou <soli...@pitrou.net> wrote:
> On Thu, 14 Jun 2012 12:58:16 +0100
> Floris Bruynooghe <fl...@devork.be> wrote:
> > On 14 June 2012 11:25, Antoine Pitrou <soli...@pitrou.net> wrote:
> > > Honestly, I think the best option would be to deprecate .pyo files as
> > > well as the useless -O option. They only cause confusion without
> > > providing any significant benefits.
> >
> > +1
> >
> > But what happens to __debug__ and assert statements? I think it
> > should be possible to always put assert statements inside a __debug__
> > block and then create -O a simple switch for setting __debug__ to
> > False. If desired a simple strip tool could then easily remove
> > __debug__ blocks and (unused) docstrings.
>
> I don't really see the point. In my experience there is no benefit to
> removing assert statements in production mode. This is a C-specific
> notion that doesn't really map very well to Python code. Do other
> high-level languages have similar functionality?

What does matter though is the memory savings. I'm working with an
application where the difference between normal and -OO is around a 10%
savings (about 2MB) in program DATA size at startup, and that makes a
difference for an app running in a memory constrained environment.

A docstring stripper would enable the bulk of that savings, but it is
still nice to be able to omit code (such as debug logging statements)
as well.
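
What each optimization level omits can be seen without restarting the interpreter, because the built-in compile() accepts an "optimize" argument (0 for normal, 1 for -O, 2 for -OO). The sketch below is only an illustration; the function f and the helper build() are made-up names.

```python
SOURCE = '''
def f():
    """Docstring that level 2 drops."""
    trace = []
    if __debug__:
        # Stands in for debug logging that is compiled away under -O.
        trace.append("debug-only code ran")
    assert trace, "trace is empty"  # removed at levels 1 and 2
    return trace
'''


def build(optimize):
    """Compile SOURCE at the given optimization level and return f."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["f"]


for level in (0, 1, 2):
    f = build(level)
    print(level, repr(f.__doc__), f())
```

At level 0 the debug-only branch runs and the assert passes; at levels 1 and 2 both are gone, and level 2 also leaves the docstring as None.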

--David

Antoine Pitrou

Jun 14, 2012, 9:23:34 AM6/14/12
to pytho...@python.org
On Thu, 14 Jun 2012 09:14:59 -0400
"R. David Murray" <rdmu...@bitdance.com> wrote:
>
> What does matter though is the memory savings. I'm working with an
> application where the difference between normal and -OO is around a 10%
> savings (about 2MB) in program DATA size at startup, and that makes a
> difference for an ap running in a memory constrained environment.
>
> A docstring stripper would enable the bulk of that savings,

Probably indeed.

> but it is
> still nice to be able to omit code (such as debug logging statements)
> as well.

But does that justify all the additional complication in the core
interpreter, as well as potential user confusion?

Regards

Antoine.

Brett Cannon

Jun 14, 2012, 10:58:55 AM6/14/12
to R. David Murray, pytho...@python.org
On Wed, Jun 13, 2012 at 10:47 PM, R. David Murray <rdmu...@bitdance.com> wrote:
> On Thu, 14 Jun 2012 11:48:08 +1000, Nick Coghlan <ncog...@gmail.com> wrote:
> > On Thu, Jun 14, 2012 at 6:06 AM, Terry Reedy <tjr...@udel.edu> wrote:
> > > On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
> > >
> > >> Not only docstrings, but also asserts. I think running a pyo without -O
> > >> would be a bug.
> > >
> > > That cat is already out of the bag ;-)
> > > People are doing that now by renaming x.pyo to x.pyc.
> > > Brett claims that it is also easy to do in 3.3 with a custom importer.
> >
> > Right, but by resorting to either of those approaches, people are
> > clearly doing something that isn't formally supported by the core.
> > Yes, you can do it, and most of the time it will work out OK, but any
> > weird glitches that result are officially *not our problem*.
> >
> > The main reason this matters is that the "__debug__" flag is
> > *supposed* to be process global - if you check it in one place, the
>
> OK, the above are the two concrete reasons I have heard in this thread
> for continuing the current behavior:
>
>    1) we do not wish to support running from .pyo files without -O
>       being on, even if it currently happens to work
>
>    2) the __debug__ setting is supposed to be process-global
>
> Both of these are good reasons.  IMO the issue should be closed with a
> documentation fix, which could optionally include either or both of the
> above motivations.

Just for completeness, there is a third reason:

3) Would lead to an extra stat call per module when doing sourceless loads.

While minor, it could add up if you ship only .pyo files but never run with -O.

Steven D'Aprano

Jun 14, 2012, 4:38:45 PM6/14/12
to pytho...@python.org
Floris Bruynooghe wrote:
> On 14 June 2012 11:25, Antoine Pitrou <soli...@pitrou.net> wrote:
>> Honestly, I think the best option would be to deprecate .pyo files as
>> well as the useless -O option. They only cause confusion without
>> providing any significant benefits.
>
> +1
>
> But what happens to __debug__ and assert statements? I think it
> should be possible to always put assert statements inside a __debug__
> block and then create -O a simple switch for setting __debug__ to
> False. If desired a simple strip tool could then easily remove
> __debug__ blocks and (unused) docstrings.

So in other words, you want to keep the functionality of -O, but make it the
responsibility of the programmer to write an external tool to implement it?

Apart from the duplication of effort (everyone who wants to optimize their
code has to write their own source-code strip tool), that surely is only going
to decrease the reliability and usefulness of Python optimization, not
increase it.

-O may be under-utilized by programmers who don't need or want it, but that
doesn't mean it isn't useful to those who do want it. -1 on deprecation.



--
Steven

Antoine Pitrou

Jun 14, 2012, 4:46:36 PM6/14/12
to pytho...@python.org
On Fri, 15 Jun 2012 06:38:45 +1000
Steven D'Aprano <st...@pearwood.info> wrote:
>
> Apart from the duplication of effort (everyone who wants to optimize their
> code has to write their own source-code strip tool),

Actually, it could be shipped with Python, or even done dynamically at
runtime (instead of relying on separate bytecode files).

Regards

Antoine.

Steven D'Aprano

Jun 14, 2012, 4:54:28 PM6/14/12
to pytho...@python.org
Antoine Pitrou wrote:

> Do other high-level languages have similar functionality?


Parrot (does anyone actually use Parrot?) has a byte-code optimizer.

javac -O is supposed to emit optimized byte-code, but allegedly it is a no-op.

On the other hand, the Java ecosystem includes third-party Java compilers
which claim to be faster/better than Oracle's compiler, including emitting
much tighter byte-code.

There are also Java byte-code optimizers such as Proguard and Soot.

By default, Perl doesn't write byte-code to files. But when it does, there are
various "optimization back-ends" that you can use.

Until version 1.9, Ruby didn't even use byte-code at all.


--
Steven

"Martin v. Löwis"

Jun 14, 2012, 4:57:40 PM6/14/12
to Antoine Pitrou, pytho...@python.org
> I don't really see the point. In my experience there is no benefit to
> removing assert statements in production mode. This is a C-specific
> notion that doesn't really map very well to Python code. Do other
> high-level languages have similar functionality?

It's not at all C specific. C# also has it:

http://msdn.microsoft.com/en-us/library/ttcc4x86(v=vs.80).aspx

Java makes it a VM option (rather than a compiler option), but it's
still a flag to the VM (-enableassertions):

http://docs.oracle.com/javase/1.4.2/docs/tooldocs/windows/java.html

Delphi also has assertions that can be disabled at compile time.

Regards,
Martin

Gregory P. Smith

Jun 16, 2012, 8:04:17 PM6/16/12
to Martin v. Löwis, Antoine Pitrou, pytho...@python.org
On Thu, Jun 14, 2012 at 1:57 PM, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> > I don't really see the point. In my experience there is no benefit to
> > removing assert statements in production mode. This is a C-specific
> > notion that doesn't really map very well to Python code. Do other
> > high-level languages have similar functionality?
>
> It's not at all C specific. C# also has it:
>
> http://msdn.microsoft.com/en-us/library/ttcc4x86(v=vs.80).aspx
>
> Java makes it a VM option (rather than a compiler option), but it's
> still a flag to the VM (-enableassertions):
>
> http://docs.oracle.com/javase/1.4.2/docs/tooldocs/windows/java.html
>
> Delphi also has assertions that can be disabled at compile time.


It may be a commonly supported feature but that doesn't mean it is a good idea. :)

It seems to me that assertion-laden code where the assertions are removed before shipping it or running it in production is largely a concept from the pre-unittesting era.  One big issue with them in Python is that enabling or disabling them is VM global rather than controlled on a per module/library basis.  That isn't true in C/C++, where you can control it on a per-file basis at compile time (NDEBUG).

As a developer I agree that it is very convenient to toss asserts into code while you are iterating on writing it.  But I regularly have people remove assert statements during code reviews at work and, if the condition is important, replace them with actual if condition checks that handle the problem or raise an appropriate module-specific exception, and document that as part of their API.  At this point, I wish Python just didn't support assert as part of the language, because it causes fewer headaches in the end if code is just written without them to begin with.  Too late now.
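
The kind of replacement described above might look like the following. This is a sketch only; the exception class and both function names are hypothetical and not taken from any real code base.

```python
import math


class InvalidRadiusError(ValueError):
    """Hypothetical module-specific exception for this sketch."""


def circle_area_asserted(radius):
    # This check disappears under -O, so bad input would pass through.
    assert radius >= 0, "radius must be non-negative"
    return math.pi * radius ** 2


def circle_area_checked(radius):
    # This check is part of the documented API and survives -O.
    if radius < 0:
        raise InvalidRadiusError(
            "radius must be non-negative, got %r" % (radius,))
    return math.pi * radius ** 2


try:
    circle_area_checked(-1)
except InvalidRadiusError as exc:
    print("rejected:", exc)
```

The checked version behaves the same with or without -O, which is the property the assert-based version lacks.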

In agreement with others: docstring memory consumption is a big deal, some large Python apps at work strip them or use -O (I don't remember which technique they're using today) before deployment. It does seem like something we should provide a standard way of doing. Docstring data is not needed within non-interactive code.

-gps