Currently, CPython requires the -O flag to *read* .pyo files as well as to write them. This is a nuisance to people who receive them from others, without the source. The originator of the issue quotes the following from the docs (without giving the location).
"It is possible to have a file called spam.pyc (or spam.pyo when -O is used) without a file spam.py for the same module. This can be used to distribute a library of Python code in a form that is moderately hard to reverse engineer."
There is no warning that .pyo files are viral, in a sense. The user has to use -O, which a) is a nuisance to remember if he has multiple scripts and only some of them need it, and b) causes his own .py files used with .pyo imports to be cached as .pyo, without docstrings, like it or not.
Currently, the easiest workaround is to rename .pyo to .pyc and all seems to work fine, even with a mixture of true .pyc and renamed .pyo files. (The same is true with the -O flag and no renaming.) This suggests that there is no current reason for the restriction in that the *execution* of bytecode is not affected by the -O flag. (Another workaround might be a custom importer -- but this is not trivial, apparently.)
So is the import restriction either an accident or an obsolete holdover? If so, can removing it be treated as a bugfix and put into current releases, or should it be treated as an enhancement only for a future release?
Or is the restriction an intentional reservation of the possibility of making *execution* depend on the flag? Which would mean that the restriction should be kept and only the doc changed?
I have no history so cannot say what was supposed to happen, but my $0.02 would be that if -O is *not* specified then we should try to read .pyc, then .pyo, and finally .py. In other words, I vote for -O being a write flag, not a read flag.
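The lookup order proposed above can be sketched as a small, purely illustrative function. The names `candidate_order` and `pick_file` are hypothetical, the behaviour under -O is an assumption (left as the thread describes it, with .pyc ignored), and freshness checks against the .py timestamp are deliberately left out:

```python
def candidate_order(optimize_flag):
    """Return the suffixes to try, in order (hypothetical helper)."""
    if optimize_flag:
        # Assumption: with -O nothing changes, so .pyc files are not considered.
        return [".pyo", ".py"]
    # Proposed: without -O, try .pyc, then .pyo, and finally .py.
    return [".pyc", ".pyo", ".py"]


def pick_file(module_name, available, optimize_flag=False):
    """Pick the first existing candidate from a set of file names."""
    for suffix in candidate_order(optimize_flag):
        filename = module_name + suffix
        if filename in available:
            return filename
    return None


# A sourceless distribution that ships only spam.pyo:
print(pick_file("spam", {"spam.pyo"}))
print(pick_file("spam", {"spam.pyc", "spam.pyo"}))
print(pick_file("spam", {"spam.pyc"}, optimize_flag=True))
```

Under this sketch, -O only decides what gets written and what is preferred; a lone .pyo is still found when the flag is absent.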
Alexandre Zani wrote:
> On Tue, Jun 12, 2012 at 11:41 AM, Ethan Furman <et...@stoneleaf.us> wrote:
>> Terry Reedy wrote:
>>> http://bugs.python.org/issue12982
>>> Currently, cpython requires the -O flag to *read* .pyo files as well as
>>> the write them. This is a nuisance to people who receive them from others,
>>> without the source. The originator of the issue quotes the following from
>>> the doc (without giving the location).
>>> "It is possible to have a file called spam.pyc (or spam.pyo when -O is
>>> used) without a file spam.py for the same module. This can be used to
>>> distribute a library of Python code in a form that is moderately hard to
>>> reverse engineer."
>>> There is no warning that .pyo files are viral, in a sense. The user has to
>>> use -O, which is a) a nuisance to remember if he has multiple scripts and
>>> some need it and some not, and b) makes his own .py files used with .pyo
>>> imports cached as .pyo, without docstrings, like it or not.
>>> Currently, the easiest workaround is to rename .pyo to .pyc and all seems
>>> to work fine, even with a mixture of true .pyc and renamed .pyo files. (The
>>> same is true with the -O flag and no renaming.) This suggests that there is
>>> no current reason for the restriction in that the *execution* of bytecode is
>>> not affected by the -O flag. (Another workaround might be a custom importer
>>> -- but this is not trivial, apparently.)
>>> So is the import restriction either an accident or obsolete holdover? If
>>> so, can removing it be treated as a bugfix and put into current releases, or
>>> should it be treated as an enhancement only for a future release?
>>> Or is the restriction an intentional reservation of the possibility of
>>> making *execution* depend on the flag? Which would mean that the restriction
>>> should be kept and only the doc changed?
>> I have no history so cannot say what was supposed to happen, but my $0.02
>> would be that if -O is *not* specified then we should try to read .pyc, then
>> .pyo, and finally .py. In other words, I vote for -O being a write flag,
>> not a read flag.
> What if I change .py?
Well, the case in question is that there is no .py available.
But if it were available, and you changed it, then it would and should work just like it does now -- if .py is newer, compile it; if -O was specified, compile it optimized; now run the compiled code.
I don't know much about the history either, but under PEP 3147, there are really two cases:
* .pyc and .pyo as compilation caches. These live in __pycache__/ and have a cache_tag, their filename looks like pkg/__pycache__/module.cpython-33.pyc and their only role is to speed up imports.
* .pyc and .pyo as standalone, precompiled sources for modules. These are found in the same place as .py files (e.g. pkg/module.pyc).
In the first case, I think that -O should dictate which of .pyc and .pyo is used, while the other is completely ignored.
In the second case, both .pyc and .pyo should always be considered as valid module sources, because -O is a compilation flag and loading a bytecode file doesn't involve compilation. At most, -O could switch the priority between .pyc and .pyo.
2.7 doesn't really differentiate between cached .pyc and standalone .pyc, so I don't know if a consistent behaviour can be achieved. Maybe the presence or absence of a matching .py can be used to trigger the first or second case above.
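To make the two cases concrete, here is a minimal sketch that asks a current interpreter where it would cache a module. It uses `importlib.util.cache_from_source`, which exists in today's Python 3; note that releases from 3.5 on no longer write .pyo files and mark optimized caches with an `opt-` tag instead, so the output differs from the 3.3-era names mentioned above. The path `pkg/module.py` is just an example and need not exist:

```python
import importlib.util
import os
import sys

source_path = os.path.join("pkg", "module.py")

# Case 1: compilation caches, named after the interpreter's cache tag.
cached_plain = importlib.util.cache_from_source(source_path, optimization="")
cached_opt1 = importlib.util.cache_from_source(source_path, optimization=1)

# Case 2: a standalone bytecode file sitting where the .py would be.
standalone = os.path.join("pkg", "module.pyc")

print("cache tag:      ", sys.implementation.cache_tag)
print("cache (plain):  ", cached_plain)
print("cache (level 1):", cached_opt1)
print("standalone:     ", standalone)
```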
[Vaguely related:
-B prevents the writing of .pyc and .pyo files (I don't know how it works for PEP 3147).
However, it doesn't prevent the _reading_ of said files. Preventing that has been discussed here before and considered useful, since stale .pyc files tend to stick around. Maybe a -BB flag should be considered?]
K
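As a small illustration of the "write" half of that remark: `sys.dont_write_bytecode` is the runtime counterpart of -B. The sketch below (the helper name `import_without_caching` and the module name `nocache_demo` are made up for the example) imports a freshly written module with the flag set and checks that no cache directory appears next to it. It does not show the "read" half; existing cache files are still consulted, as noted above.

```python
import importlib
import os
import sys
import tempfile


def import_without_caching(directory, name):
    """Import `name` from `directory` while bytecode writing is disabled."""
    previous = sys.dont_write_bytecode
    sys.dont_write_bytecode = True  # same effect as running with -B
    sys.path.insert(0, directory)
    try:
        importlib.invalidate_caches()
        return importlib.import_module(name)
    finally:
        sys.path.remove(directory)
        sys.dont_write_bytecode = previous


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "nocache_demo.py"), "w") as handle:
        handle.write("VALUE = 1\n")
    module = import_without_caching(tmp, "nocache_demo")
    cache_written = os.path.isdir(os.path.join(tmp, "__pycache__"))

print("module value:", module.VALUE)
print("cache directory written:", cache_written)
```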
________________________________________
From: python-dev-bounces+kristjan=ccpgames....@python.org [python-dev-bounces+kristjan=ccpgames....@python.org] on behalf of Ronan Lamy [ronan.l...@gmail.com]
Sent: 12 June 2012 19:57
To: Ethan Furman
Cc: python-...@python.org
Subject: Re: [Python-Dev] #12982: Should -O be required to *read* .pyo files?
> > Currently, cpython requires the -O flag to *read* .pyo files as well as
> > the write them. This is a nuisance to people who receive them from others,
> > without the source. The originator of the issue quotes the following from
> > the doc (without giving the location).
[...]
> > So is the import restriction either an accident or obsolete holdover?
> Neither. .pyo files are actually different from .pyc files in terms of what
> bytecode they may emit. Currently -O causes all asserts to be left out of
> the bytecode and -OO leaves out all docstrings on top of what -O does. This
> makes a difference if you are trying to introspect at the interpreter
> prompt or are testing things in development and want those asserts to be
> triggered if needed.
> > If so, can removing it be treated as a bugfix and put into current
> > releases, or should it be treated as an enhancement only for a future
> > release?
> The behaviour shouldn't change. There has been talk of doing even more
> aggressive optimizing under -O, which once again would cause an even larger
> deviation between a .pyo file and a .pyc file (e.g. allowing Python code to
> hook into the peephole optimizer or an entirely new AST optimizer).
> > Or is the restriction an intentional reservation of the possibility of
> > making *execution* depend on the flag? Which would mean that the
> > restriction should be kept and only the doc changed?
> The docs should get updated to be more clear.
OK, but you didn't answer the question :). If I understand correctly,
everything you said applies to *writing* the bytecode, not reading it.
So, is there any reason to not use the .pyo file (if that's all that is
around) when -O is not specified?
The only technical reason I can see why -O should be required for a .pyo
file to be used (*if* it is the only thing around) is if it won't *run*
without the -O switch. Is there any expectation that that will ever be
the case?
On the other hand, not wanting to make any extra effort to support sourceless
distributions could be a reason as well. But if that's the case we
should be transparent about it.
> Currently, cpython requires the -O flag to *read* .pyo files as well as
> the write them. This is a nuisance to people who receive them from others,
> without the source. The originator of the issue quotes the following from
> the doc (without giving the location).
> "It is possible to have a file called spam.pyc (or spam.pyo when -O is
> used) without a file spam.py for the same module. This can be used to
> distribute a library of Python code in a form that is moderately hard to
> reverse engineer."
> There is no warning that .pyo files are viral, in a sense. The user has to
> use -O, which is a) a nuisance to remember if he has multiple scripts and
> some need it and some not, and b) makes his own .py files used with .pyo
> imports cached as .pyo, without docstrings, like it or not.
> Currently, the easiest workaround is to rename .pyo to .pyc and all seems
> to work fine, even with a mixture of true .pyc and renamed .pyo files. (The
> same is true with the -O flag and no renaming.) This suggests that there is
> no current reason for the restriction in that the *execution* of bytecode
> is not affected by the -O flag. (Another workaround might be a custom
> importer -- but this is not trivial, apparently.)
In Python 3.3 it's actually trivial.
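For readers who want a feel for what this involves, here is a rough sketch of loading a sourceless bytecode file explicitly with importlib. It is not a full custom importer and not necessarily what is meant above: the helper name `load_bytecode_module` is made up, and `spec_from_file_location` and `module_from_spec` arrived in releases later than 3.3. The demo compiles a throwaway module to a file named spam.pyo with optimization level 1, deletes the source, and loads the result in an interpreter that was not started with -O:

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile


def load_bytecode_module(name, path):
    """Load a module from a bytecode file, whatever its extension."""
    loader = importlib.machinery.SourcelessFileLoader(name, path)
    spec = importlib.util.spec_from_file_location(name, path, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "spam.py")
    with open(source, "w") as handle:
        handle.write("def check(x):\n    assert x > 0\n    return x\n")
    pyo_path = os.path.join(tmp, "spam.pyo")
    # optimize=1 corresponds to what -O would have compiled.
    py_compile.compile(source, cfile=pyo_path, optimize=1)
    os.remove(source)  # only the bytecode file is left
    spam = load_bytecode_module("spam", pyo_path)
    result = spam.check(-1)

# The assert was compiled out, so the call succeeds without -O.
print("check(-1) returned", result)
```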
> So is the import restriction either an accident or obsolete holdover?
Neither. .pyo files are actually different from .pyc files in terms of what
bytecode they may emit. Currently -O causes all asserts to be left out of
the bytecode and -OO leaves out all docstrings on top of what -O does. This
makes a difference if you are trying to introspect at the interpreter
prompt or are testing things in development and want those asserts to be
triggered if needed.
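The difference can be seen without touching any files by passing the `optimize` argument to the built-in `compile()`, where levels 0, 1 and 2 correspond to no flag, -O and -OO. This is only an illustrative sketch; the function `check` and the helper `build` are invented for the example:

```python
SOURCE = '''
def check(x):
    """Return x after asserting that it is positive."""
    assert x > 0, "x must be positive"
    return x
'''


def build(optimize):
    """Compile SOURCE at the given optimization level and return check()."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["check"]


plain = build(0)        # like running without -O
no_asserts = build(1)   # like -O
no_docs = build(2)      # like -OO

try:
    plain(-1)
except AssertionError as exc:
    print("level 0 raised:", exc)

print("level 1 returned:", no_asserts(-1))
print("level 1 keeps docstring:", no_asserts.__doc__ is not None)
print("level 2 docstring:", no_docs.__doc__)
```

The stripping happens when the code is compiled, so the resulting code objects behave the same whichever flags the running interpreter was started with.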
> If so, can removing it be treated as a bugfix and put into current
> releases, or should it be treated as an enhancement only for a future
> release?
The behaviour shouldn't change. There has been talk of doing even more
aggressive optimizing under -O, which once again would cause an even larger
deviation between a .pyo file and a .pyc file (e.g. allowing Python code to
hook into the peephole optimizer or an entirely new AST optimizer).
> Or is the restriction an intentional reservation of the possibility of
> making *execution* depend on the flag? Which would mean that the
> restriction should be kept and only the doc changed?
On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> OK, but you didn't answer the question :). If I understand correctly,
> everything you said applies to *writing* the bytecode, not reading it.
> So, is there any reason to not use the .pyo file (if that's all that is
> around) when -O is not specified?
> The only technical reason I can see why -O should be required for a .pyo
> file to be used (*if* it is the only thing around) is if it won't *run*
> without the -O switch. Is there any expectation that that will ever be
> the case?
Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
Another piece of code can legally import that and try to use the docstring
for something. This would fail if only the .pyo was present.
Of course, it would also fail under the present behaviour since no .py or
.pyc was present to be imported. The error that's displayed might be
clearer if we fail when attempting to read a .py/.pyc rather than failing
when the docstring is found to be missing, though.
>> Currently, cpython requires the -O flag to *read* .pyo files as well
>> as the write them. This is a nuisance to people who receive them
>> from others, without the source. The originator of the issue quotes
>> the following from the doc (without giving the location).
>> "It is possible to have a file called spam.pyc (or spam.pyo when -O
>> is used) without a file spam.py for the same module. This can be
>> used to distribute a library of Python code in a form that is
>> moderately hard to reverse engineer."
>> There is no warning that .pyo files are viral, in a sense. The user
>> has to use -O, which is a) a nuisance to remember if he has multiple
>> scripts and some need it and some not, and b) makes his own .py
>> files used with .pyo imports cached as .pyo, without docstrings,
>> like it or not.
>> Currently, the easiest workaround is to rename .pyo to .pyc and all
>> seems to work fine, even with a mixture of true .pyc and renamed
>> .pyo files. (The same is true with the -O flag and no renaming.)
>> This suggests that there is no current reason for the restriction in
>> that the *execution* of bytecode is not affected by the -O flag.
>> (Another workaround might be a custom importer -- but this is not
>> trivial, apparently.)
> In Python 3.3 it's actually trivial.
>> So is the import restriction either an accident or obsolete holdover?
> Neither. .pyo files are actually different from .pyc files in terms of
> what bytecode they may emit. Currently -O causes all asserts to be left
> out of the bytecode and -OO leaves out all docstrings on top of what -O
> does. This makes a difference if you are trying to introspect at the
> interpreter prompt or are testing things in development and want those
> asserts to be triggered if needed.
But what does this have to do with those cases where *only* the .pyo file is available, and we are trying to run it? In these cases it would have to be in the main folder (not __pycache__) which means somebody did it deliberately.
Toshio Kuratomi <a.bad...@gmail.com> wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> > OK, but you didn't answer the question :). If I understand correctly,
> > everything you said applies to *writing* the bytecode, not reading it.
> > So, is there any reason to not use the .pyo file (if that's all that is
> > around) when -O is not specified?
> > The only technical reason I can see why -O should be required for a .pyo
> > file to be used (*if* it is the only thing around) is if it won't *run*
> > without the -O switch. Is there any expectation that that will ever be
> > the case?
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.
Not only docstrings, but also asserts. I think running a pyo without -O
would be a bug.
Toshio Kuratomi wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>> OK, but you didn't answer the question :). If I understand correctly,
>> everything you said applies to *writing* the bytecode, not reading it.
>> So, is there any reason to not use the .pyo file (if that's all that is
>> around) when -O is not specified?
>> The only technical reason I can see why -O should be required for a .pyo
>> file to be used (*if* it is the only thing around) is if it won't *run*
>> without the -O switch. Is there any expectation that that will ever be
>> the case?
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.
Why should it fail? -OO causes docstring access to return None, just as if a docstring had not been specified in the first place. Any decent code will be checking for an undefined docstring -- after all, they are not rare.
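A minimal sketch of such a check follows; the helper `describe` and the sample function `increment` are invented for the illustration, and `compile(..., optimize=2)` stands in for code that was built with -OO:

```python
SOURCE = '''
def increment(x):
    """Add one to x."""
    return x + 1
'''


def describe(obj):
    """Return the first docstring line of obj, or a placeholder."""
    doc = getattr(obj, "__doc__", None)
    if not doc:
        # Covers both "never documented" and "stripped by -OO".
        return "(no documentation available)"
    return doc.strip().splitlines()[0]


def build(optimize):
    """Compile SOURCE at the given level and return increment()."""
    namespace = {}
    exec(compile(SOURCE, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["increment"]


documented = build(0)
stripped = build(2)

print(describe(documented))
print(describe(stripped))
```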
On Wed, 13 Jun 2012 11:20:24 -0700, Toshio Kuratomi <a.bad...@gmail.com> wrote:
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> > OK, but you didn't answer the question :). If I understand correctly,
> > everything you said applies to *writing* the bytecode, not reading it.
> > So, is there any reason to not use the .pyo file (if that's all that is
> > around) when -O is not specified?
> > The only technical reason I can see why -O should be required for a .pyo
> > file to be used (*if* it is the only thing around) is if it won't *run*
> > without the -O switch. Is there any expectation that that will ever be
> > the case?
> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> Another piece of code can legally import that and try to use the docstring
> for something. This would fail if only the .pyo was present.
Yes, but that's not what I'm talking about. I would treat code that
depends on the presence of docstrings and doesn't have a fallback for
dealing with their absence as buggy code, since anyone might decide
to run that code with -OO, and the code would fail in that case too.
I'm talking about a case where the code runs correctly with -O (or -OO),
but fails if the code from the .pyo is loaded and python is run
*without* -O (or -OO).
> Of course, it would also fail under the present behaviour since no .py or
> .pyc was present to be imported. The error that's displayed might be
> clearer if we fail when attempting to read a .py/.pyc rather than failing
> when the docstring is found to be missing, though.
Well, right now if there is only a .pyo file and you run python without
-O, you get an import error. The question is, is that the way we really
want it to work?
On Wed, 13 Jun 2012 20:46:50 +0200, Antoine Pitrou <solip...@pitrou.net> wrote:
> On Wed, 13 Jun 2012 11:20:24 -0700
> Toshio Kuratomi <a.bad...@gmail.com> wrote:
> > On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> > > OK, but you didn't answer the question :). If I understand correctly,
> > > everything you said applies to *writing* the bytecode, not reading it.
> > > So, is there any reason to not use the .pyo file (if that's all that is
> > > around) when -O is not specified?
> > > The only technical reason I can see why -O should be required for a .pyo
> > > file to be used (*if* it is the only thing around) is if it won't *run*
> > > without the -O switch. Is there any expectation that that will ever be
> > > the case?
> > Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
> > Another piece of code can legally import that and try to use the docstring
> > for something. This would fail if only the .pyo was present.
> Not only docstrings, but also asserts. I think running a pyo without -O
> would be a bug.
Again, a program that depends on asserts is buggy.
As Ethan pointed out we are asking about the case where someone is
*deliberately* setting the .pyo file up to be run as the "normal"
case.
I'm not sure we want to support that, I just want us to be clear
about why we don't :)
R. David Murray wrote:
> On Wed, 13 Jun 2012 20:46:50 +0200, Antoine Pitrou <solip...@pitrou.net> wrote:
>> On Wed, 13 Jun 2012 11:20:24 -0700
>> Toshio Kuratomi <a.bad...@gmail.com> wrote:
>>> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>>>> OK, but you didn't answer the question :). If I understand correctly,
>>>> everything you said applies to *writing* the bytecode, not reading it.
>>>> So, is there any reason to not use the .pyo file (if that's all that is
>>>> around) when -O is not specified?
>>>> The only technical reason I can see why -O should be required for a .pyo
>>>> file to be used (*if* it is the only thing around) is if it won't *run*
>>>> without the -O switch. Is there any expectation that that will ever be
>>>> the case?
>>> Yes. For instance, if I create a .pyo with -OO it wouldn't have docstrings.
>>> Another piece of code can legally import that and try to use the docstring
>>> for something. This would fail if only the .pyo was present.
>> Not only docstrings, but also asserts. I think running a pyo without -O
>> would be a bug.
> Again, a program that depends on asserts is buggy.
> As Ethan pointed out we are asking about the case where someone is
> *deliberately* setting the .pyo file up to be run as the "normal"
> case.
> I'm not sure we want to support that, I just want us to be clear
> about why we don't :)
Currently, the alternative to supporting this behavior is to either:
1) require the end-user to specify -O (major nuisance)
or
2) have the distributor rename the .pyo file to .pyc
I think 1 is a non-starter (non-finisher? ;) but I could live with 2 -- after all, if someone is going to the effort of removing the .py file and moving the .pyo file into its place, renaming the .pyo to .pyc is trivial.
So the question, then, is: is option 2 better than just supporting .pyo files without -O when they are all that is available?
> Not only docstrings, but also asserts. I think running a pyo without -O
> would be a bug.
That cat is already out of the bag ;-)
People are doing that now by renaming x.pyo to x.pyc.
Brett claims that it is also easy to do in 3.3 with a custom importer.
> Currently, cpython requires the -O flag to *read* .pyo files as well
> as the write them. This is a nuisance to people who receive them
> from others, without the source. The originator of the issue quotes
> the following from the doc (without giving the location).
> "It is possible to have a file called spam.pyc (or spam.pyo when -O
> is used) without a file spam.py for the same module. This can be
> used to distribute a library of Python code in a form that is
> moderately hard to reverse engineer."
> There is no warning that .pyo files are viral, in a sense. The user
> has to use -O, which is a) a nuisance to remember if he has multiple
> scripts and some need it and some not, and b) makes his own .py
> files used with .pyo imports cached as .pyo, without docstrings,
> like it or not.
> Currently, the easiest workaround is to rename .pyo to .pyc and all
> seems to work fine, even with a mixture of true .pyc and renamed
> .pyo files. (The same is true with the -O flag and no renaming.)
> This suggests that there is no current reason for the restriction in
> that the *execution* of bytecode is not affected by the -O flag.
> (Another workaround might be a custom importer -- but this is not
> trivial, apparently.)
> In Python 3.3 it's actually trivial.
For you. Anyway, I am sure Michael of #12982 is using an earlier version.
> So is the import restriction either an accident or obsolete holdover?
> Neither. .pyo files are actually different from .pyc files in terms of
> what bytecode they may emit. Currently -O causes all asserts to be left
> out of the bytecode and -OO leaves out all docstrings on top of what -O
> does. This makes a difference if you are trying to introspect at the
> interpreter prompt or are testing things in development and want those
> asserts to be triggered if needed.
I suggested to Michael that he should request an all-.pyc library for that reason.
> If so, can removing it be treated as a bugfix and put into current
> releases, or should it be treated as an enhancement only for a
> future release?
> The behaviour shouldn't change. There has been talk of doing even more
> aggressive optimizing under -O, which once again would cause an even
> larger deviation between a .pyo file and a .pyc file (e.g. allowing
> Python code to hook into the peephole optimizer or an entirely new AST
> optimizer).
Would such a change mean that *reading* a .pyo file as if it were a .pyc file would start failing? (It now works, by renaming the file.)
If so, would it not be better to rely on having a different magic number *in* the file rather than on its mutable external name?
You just said above that evading the import restriction by name is now trivial. So what is the point of keeping it? If, in the future, there *are* separate execution pathways*, and we want the .pyo pathway closed unless -O is passed, then it seems that that could only be enforced by a magic number in the file.
*Unladen Swallow would have *optionally* produced cache files completely different from current bytecode, with a different extension. A couple of people have suggested using wordcode instead of bytecode. If this were also introduced as an option, its cache files would also need a different extension and magic number.
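To see what the magic number currently encodes, the first four bytes of a cache file can be compared with `importlib.util.MAGIC_NUMBER`. The sketch below assumes a recent CPython 3 and uses a throwaway module named `spam.py` in a temporary directory; the output file names are arbitrary.

```python
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "spam.py")  # hypothetical module
    with open(src, "w") as f:
        f.write("assert True\n")

    headers = {}
    for level in (0, 1):
        # Arbitrary output names; only the file contents matter here.
        cfile = os.path.join(tmp, "spam_opt%d.pyc" % level)
        py_compile.compile(src, cfile=cfile, optimize=level, doraise=True)
        with open(cfile, "rb") as f:
            headers[level] = f.read(4)  # the magic number is the first 4 bytes

# The magic number identifies the bytecode version, not the optimization level.
same_magic = headers[0] == headers[1] == importlib.util.MAGIC_NUMBER
print(same_magic)
```

Both files carry the same magic number, so the header alone cannot tell an optimized file from an unoptimized one. Later Python versions (PEP 488, Python 3.5) dropped .pyo and record the optimization level in the file name instead.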
> Or is the restriction an intentional reservation of the possibility
> of making *execution* depend on the flag? Which would mean that the
> restriction should be kept and only the doc changed?
On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
> So, is there any reason to not use the .pyo file (if that's all that is
> around) when -O is not specified?
.pyo and .pyc files have potentially different semantics. Right now, .pyo files don't include asserts, so that's one difference right there. In the future there may be more aggressive optimizations.
Good practice is to never write an assert that actually changes the semantics of your program, but in practice people don't write asserts correctly, e.g. they use them for checking user-input or function parameters.
So, no, we should never use .pyo files unless explicitly told to do so, since doing so risks breaking poorly-written but otherwise working code.
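A small sketch of the kind of code that breaks: validation written with `assert` vanishes when compiled as -O would compile it, while an explicit check survives. The function names are hypothetical, and `compile(..., optimize=1)` stands in for loading a .pyo file.

```python
SOURCE = '''
def withdraw_unsafe(balance, amount):
    # Validation done with assert: compiled away under -O.
    assert amount <= balance, "insufficient funds"
    return balance - amount

def withdraw_safe(balance, amount):
    # Validation done with an explicit check: survives under -O.
    if amount > balance:
        raise ValueError("insufficient funds")
    return balance - amount
'''


def namespace_for(optimize):
    """Compile SOURCE at the given optimization level and run it."""
    ns = {}
    exec(compile(SOURCE, "<bank>", "exec", optimize=optimize), ns)
    return ns


optimized = namespace_for(1)  # roughly what a .pyo file would contain

# The assert is gone, so the bad input is silently accepted.
unsafe_result = optimized["withdraw_unsafe"](10, 50)

try:
    optimized["withdraw_safe"](10, 50)
    safe_raised = False
except ValueError:
    safe_raised = True

print(unsafe_result, safe_raised)
```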
On Thu, Jun 14, 2012 at 6:06 AM, Terry Reedy <tjre...@udel.edu> wrote:
> On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
>> Not only docstrings, but also asserts. I think running a pyo without -O
>> would be a bug.
> That cat is already out of the bag ;-)
> People are doing that now by renaming x.pyo to x.pyc.
> Brett claims that it is also easy to do in 3.3 with a custom importer.
Right, but by resorting to either of those approaches, people are
clearly doing something that isn't formally supported by the core.
Yes, you can do it, and most of the time it will work out OK, but any
weird glitches that result are officially *not our problem*.
The main reason this matters is that the "__debug__" flag is
*supposed* to be process global - if you check it in one place, the
answer should be correct for all Python code loaded in the process. If
you load a .pyo file into a process running without -O (or a .pyc file
into a process running *with* -O), then you have broken that
assumption. Because the compiler understands __debug__, and is
explicitly free to make optimisations based on the value of that flag
at compile time (such as throwing away unreachable branches in if
statements or applying constant folding operations), the following
code will do different things if loaded from a .pyo file instead of
.pyc:
print("__debug__ is not a builtin, it is checked at compile time")
if __debug__:
    print("A .pyc file always has __debug__ == True")
else:
    print("A .pyo file always has __debug__ == False")
$ ./python -c "import foo"
__debug__ is not a builtin, it is checked at compile time
A .pyc file always has __debug__ == True
$ ./python -O -c "import foo"
__debug__ is not a builtin, it is checked at compile time
A .pyo file always has __debug__ == False
$ ./python __pycache__/foo.cpython-33.pyo
__debug__ is not a builtin, it is checked at compile time
A .pyo file always has __debug__ == False
$ ./python -O __pycache__/foo.cpython-33.pyc
__debug__ is not a builtin, it is checked at compile time
A .pyc file always has __debug__ == True
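The same effect can be shown without cache files. In this sketch (an illustration, not the session above) the same source is compiled twice with different `optimize` levels and both code objects are run in one process; each keeps the `__debug__` value it was compiled with.

```python
SOURCE = '''
if __debug__:
    flavour = "pyc-style (__debug__ is True)"
else:
    flavour = "pyo-style (__debug__ is False)"
'''


def flavour_of(optimize):
    """Compile SOURCE at the given level, run it, and return its 'flavour'."""
    ns = {}
    exec(compile(SOURCE, "<foo>", "exec", optimize=optimize), ns)
    return ns["flavour"]


as_pyc = flavour_of(0)  # compiled as without -O
as_pyo = flavour_of(1)  # compiled as with -O
print(as_pyc)
print(as_pyo)
```

Both results coexist in a single interpreter, which is exactly the broken process-global assumption described above.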
> On Wed, Jun 13, 2012 at 01:58:10PM -0400, R. David Murray wrote:
>> So, is there any reason to not use the .pyo file (if that's all that is
>> around) when -O is not specified?
> .pyo and .pyc files have potentially different semantics. Right now,
> .pyo files don't include asserts, so that's one difference right there.
> In the future there may be more aggressive optimizations.
> Good practice is to never write an assert that actually changes the
> semantics of your program, but in practice people don't write asserts
> correctly, e.g. they use them for checking user-input or function
> parameters.
> So, no, we
You mean the interpreter?
> should never use
Do you mean import or execute?
Currently, the interpreter executes any bytecode that gets imported.
> .pyo files unless explicitly told to do so,
What constitutes 'explicitly told to do so'? Currently, an 'optimized' file written as .pyo gets imported (and hence executed) if
1) the interpreter is started with -O
2) a custom importer ignores the absence of -O
3) someone renames x.pyo to x.pyc.
> since doing so risks breaking poorly-written but otherwise working code.
Agreed, though a slightly different issue. Would you somehow disable 2) or 3) if not considered 'explicit' enough?
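For reference, routes 2) and 3) can be sketched together with the standard `importlib` machinery in a recent Python 3. This is an assumption-laden illustration, not the importer discussed in the thread: the module `foo` and the file name `foo_renamed.pyc` are made up, and `py_compile` with `optimize=1` stands in for a .pyo file produced under -O.

```python
import importlib.machinery
import importlib.util
import os
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "foo.py")  # hypothetical module
    with open(src, "w") as f:
        f.write("flag = __debug__\n")

    # Compile as -O would, but store the result under a plain .pyc name.
    renamed = os.path.join(tmp, "foo_renamed.pyc")
    py_compile.compile(src, cfile=renamed, optimize=1, doraise=True)
    os.remove(src)  # simulate a sourceless distribution

    # Load the bytecode directly, regardless of how this process was started.
    loader = importlib.machinery.SourcelessFileLoader("foo", renamed)
    spec = importlib.util.spec_from_file_location("foo", renamed, loader=loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)

loaded_flag = module.flag
print(loaded_flag)
```

The loaded module reports `False` for `__debug__` even when the surrounding process was started without -O.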
On Thu, 14 Jun 2012 11:48:08 +1000, Nick Coghlan <ncogh...@gmail.com> wrote:
> On Thu, Jun 14, 2012 at 6:06 AM, Terry Reedy <tjre...@udel.edu> wrote:
> > On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
> >> Not only docstrings, but also asserts. I think running a pyo without -O
> >> would be a bug.
> > That cat is already out of the bag ;-)
> > People are doing that now by renaming x.pyo to x.pyc.
> > Brett claims that it is also easy to do in 3.3 with a custom importer.
> Right, but by resorting to either of those approaches, people are
> clearly doing something that isn't formally supported by the core.
> Yes, you can do it, and most of the time it will work out OK, but any
> weird glitches that result are officially *not our problem*.
> The main reason this matters is that the "__debug__" flag is
> *supposed* to be process global - if you check it in one place, the
OK, the above are the two concrete reasons I have heard in this thread
for continuing the current behavior:
1) we do not wish to support running from .pyo files without -O
being on, even if it currently happens to work
2) the __debug__ setting is supposed to be process-global
Both of these are good reasons. IMO the issue should be closed with a
documentation fix, which could optionally include either or both of the
above motivations.
> On Thu, 14 Jun 2012 11:48:08 +1000, Nick Coghlan <ncogh...@gmail.com> wrote:
>> Right, but by resorting to either of those approaches, people are
>> clearly doing something that isn't formally supported by the core.
That was not clear to me until I read your post -- the key word being formally (or officially). I see now that distributing a sourceless library as a mixture of .pyc and .pyo files is even crazier than I thought.
>> Yes, you can do it, and most of the time it will work out OK, but any
>> weird glitches that result are officially *not our problem*.
>> The main reason this matters is that the "__debug__" flag is
>> *supposed* to be process global - if you check it in one place, the
> OK, the above are the two concrete reasons I have heard in this thread
> for continuing the current behavior:
> 1) we do not wish to support running from .pyo files without -O
> being on, even if it currently happens to work
> 2) the __debug__ setting is supposed to be process-global
> Both of these are good reasons. IMO the issue should be closed with a
> documentation fix, which could optionally include either or both of the
> above motivations.
I agree. We have gotten what we need from this thread.
On Wed, Jun 13, 2012 at 03:13:54PM -0400, R. David Murray wrote:
> Again, a program that depends on asserts is buggy.
> As Ethan pointed out we are asking about the case where someone is
> *deliberately* setting the .pyo file up to be run as the "normal"
> case.
You can't be sure that the .pyo file is there due to *deliberate* choice. It may be accidental. Perhaps the end user has ignorantly deleted the .pyc file, but failed to delete the .pyo file. Perhaps the developer has merely made a mistake.
Under current behaviour, deleting the .pyc file shouldn't matter:
- if the source file is available, that will be used
- if not, a clear error is raised
Under the proposed change:
- if the source file is *newer* than the .pyo file, it will be used
- but if it is missing or older, the .pyo file is used
This opens a potential mismatch between the code I *think* is being run, and the actual code being run: I think the .py[c] code is running when the .pyo is actually running.
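The two policies can be written down as a toy decision function. This only models the scenario described here (no -O, the .pyc deleted); it is not the real import machinery, and the function and parameter names are invented for the sketch.

```python
def file_used(source_mtime, pyo_mtime, proposed):
    """Return 'source', 'pyo' or 'error' for a process started without -O
    whose .pyc file has been deleted.

    Each mtime is a number, or None if that file does not exist.
    """
    if not proposed or pyo_mtime is None:
        # Current behaviour: the .pyo is never considered without -O.
        return "source" if source_mtime is not None else "error"
    if source_mtime is not None and source_mtime > pyo_mtime:
        return "source"  # source is newer than the .pyo
    return "pyo"         # source missing or older: .pyo silently used


# Source present but older than a stale .pyo left on disk.
current = file_used(source_mtime=100, pyo_mtime=200, proposed=False)
changed = file_used(source_mtime=100, pyo_mtime=200, proposed=True)
print(current, changed)
```

In the last case the current rules run the source, while the proposed rules would run the optimized file without anyone having asked for it.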
Realistically, we should expect that most people don't *sufficiently* test their apps under -O (if at all!) even if they are aware that there are differences in behaviour. I know I don't, and I know I should. This is just a matter of priority: testing without -O is a higher priority for me than testing with -O and -OO.
The consequence is that I may then receive a mysterious bug report that I can't duplicate, because the user correctly reports that they are running *without* -O, but unknown to anyone, they are actually running the .pyo file.
> I'm not sure we want to support that, I just want us to be clear
> about why we don't :)
If I receive a bug report that only occurs under -O, then I immediately suspect that the bug has something to do with assert.
If I receive a bug report that occurs without -O, under the proposed change I can't be sure whether the optimized code or the standard code is running. That adds complexity and confusion.
On Wed, Jun 13, 2012 at 04:06:22PM -0400, Terry Reedy wrote:
> On 6/13/2012 2:46 PM, Antoine Pitrou wrote:
> >Not only docstrings, but also asserts. I think running a pyo without -O
> >would be a bug.
> That cat is already out of the bag ;-)
> People are doing that now by renaming x.pyo to x.pyc.
> Brett claims that it is also easy to do in 3.3 with a custom importer.
That's fine. Both steps require an overt, deliberate act, and so are under the control of (and the responsibility of) the developer. It's not something that could happen innocently by accident.
On Wed, Jun 13, 2012 at 09:54:30PM -0400, Terry Reedy wrote:
> >So, no, we
> You mean the interpreter?
Yes.
> >should never use
> Do you mean import or execute?
> Current, the interpreter executes any bytecode that gets imported.
Both.
> >.pyo files unless explicitly told to do so,
> What constitutes 'explicitly told to do so'? Currently, an 'optimized'
> file written as .pyo gets imported (and hence executed) if
> 1) the interpreter is started with -O
> 2) a custom importer ignores the absence of -O
> 3) someone renames x.pyo to x.pyc.
Any of the above are fine by me.
I oppose this one:
4) the interpreter is started without -O but there is no .pyc file.
since it can lead to a mismatch between what I (the developer) think is being run and what is actually being run (or imported).
For the avoidance of doubt, if my end-users secretly rename .pyo to .pyc files, that's my problem, not the Python interpreter's problem. I don't expect Python to be idiot-proof.