[Python-ideas] Prevent importing yourself?

151 views
Skip to first unread message

Ned Batchelder

unread,
Jan 29, 2016, 5:42:45 PM1/29/16
to python-ideas
Hi,

A common question we get in the #python IRC channel is, "I tried
importing a module, but I get an AttributeError trying to use the things
it said it provided." Turns out the beginner named their own file the
same as the module they were trying to use.

That is, they want to try (for example) the "azure" package. So they
make a file called azure.py, and start with "import azure". The import
succeeds, but it has none of the contents the documentation claims,
because they have imported themselves. It's baffling, because they have
used the exact statements shown in the examples, but it doesn't work.

Could we make this a more obvious failure? Is there ever a valid reason
for a file to import itself? Is this situation detectable in the import
machinery?

--Ned.
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

Ryan Gonzalez

unread,
Jan 29, 2016, 5:57:52 PM1/29/16
to Ned Batchelder, python-ideas


On January 29, 2016 4:42:18 PM CST, Ned Batchelder <n...@nedbatchelder.com> wrote:
>Hi,
>
>A common question we get in the #python IRC channel is, "I tried
>importing a module, but I get an AttributeError trying to use the
>things
>it said it provided." Turns out the beginner named their own file the
>same as the module they were trying to use.
>
>That is, they want to try (for example) the "azure" package. So they
>make a file called azure.py, and start with "import azure". The import
>succeeds, but it has none of the contents the documentation claims,
>because they have imported themselves. It's baffling, because they
>have
>used the exact statements shown in the examples, but it doesn't work.
>
>Could we make this a more obvious failure? Is there ever a valid
>reason
>for a file to import itself? Is this situation detectable in the
>import
>machinery?
>

Haha, +1. This bit me a good 50 times when I started learning Python.

>--Ned.
>_______________________________________________
>Python-ideas mailing list
>Python...@python.org
>https://mail.python.org/mailman/listinfo/python-ideas
>Code of Conduct: http://python.org/psf/codeofconduct/

--
Sent from my Nexus 5 with K-9 Mail. Please excuse my brevity.

Greg Ewing

unread,
Jan 29, 2016, 6:10:50 PM1/29/16
to python-ideas
Ned Batchelder wrote:
> Could we make this a more obvious failure? Is there ever a valid reason
> for a file to import itself?

I've done it occasionally, but only when doing something
very unusual, and I probably wouldn't mind having to
pull it out of sys.modules in cases like that.

--
Greg

Sjoerd Job Postmus

unread,
Jan 29, 2016, 6:42:58 PM1/29/16
to Ned Batchelder, python-ideas
On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote:
> Hi,
>
> A common question we get in the #python IRC channel is, "I tried
> importing a module, but I get an AttributeError trying to use the
> things it said it provided." Turns out the beginner named their own
> file the same as the module they were trying to use.
>
> That is, they want to try (for example) the "azure" package. So
> they make a file called azure.py, and start with "import azure". The
> import succeeds, but it has none of the contents the documentation
> claims, because they have imported themselves. It's baffling,
> because they have used the exact statements shown in the examples,
> but it doesn't work.
>
> Could we make this a more obvious failure? Is there ever a valid
> reason for a file to import itself? Is this situation detectable in
> the import machinery?
>
> --Ned.

I feel this is only a partial fix. I've been bitten by something like
this, but not precisely like this. The difference in how I experienced
this makes it enough for me to say: I don't think your suggestion is
that useful.

What I experienced was having collisions on the python-path, and modules
from my codebase colliding with libraries in the stdlib (or outside it).
For example, a library might import one of its dependencies which
coincidentally had the same name as one of the libraries I have.

Maybe a suggestion would be to add the path of the module to the error
message?

Currently the message is

sjoerdjob$ cat json.py
import json

TEST = '{"foo": "bar"}'

print(json.loads(TEST))


sjoerdjob$ python3.5 json.py
Traceback (most recent call last):
File "json.py", line 1, in <module>
import json
File "/Users/sjoerdjob/Development/spikes/importself/json.py", line 5, in <module>
print(json.loads(TEST))
AttributeError: module 'json' has no attribute 'loads'


But maybe the error could/should be

AttributeError: module 'json' (imported from /Users/sjoerdjob/Development/spikes/importself/json.py) has no attribute 'loads'.

As another corner case, consider the following:

#json.py
JSON_DATA = '{"foo": "bar"}'

#mod_a.py
import json
def parse(blob):
return json.loads(blob)

#mod_b.py
from json import JSON_DATA
from mod_a import parse
print(parse(JSON_DATA))

(Now, consider that instead of 'json' we chose a less common module
name.). You still get the error `module 'json' has no attribute
'loads'`. In this case, I think it's more helpful to know the filename
of the 'json' module. For me that'd sooner be a trigger to "What's going
on.", because I haven't been bitten by the 'import self' issue as often
as 'name collision in dependency tree'.

(Of course, another option would be to look for other modules of the
same name when you get an attribute-error on a module to aid debugging,
but I think that's too heavy-weight.)

Kind regards,
Sjoerd Job

John Wong

unread,
Jan 29, 2016, 7:05:58 PM1/29/16
to Sjoerd Job Postmus, python-ideas
On Fri, Jan 29, 2016 at 6:42 PM, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
I feel this is only a partial fix. I've been bitten by something like
this, but not precisely like this. The difference in how I experienced
this makes it enough for me to say: I don't think your suggestion is
that useful.


Yes, your example is actually more likely to happen, and it happened to me many times. One reason is some of the stdlib module names are kind of commons. Once I defined my own random.py and then another time I had a requests.py which collided with requests library.

I think the right solution is assume every import error needs some guidance, some hints. Don't just target a specific problem.

Ned probably familiar with this, in the case of Ansible, if Ansible cannot resolve and locate the role you specify in the playbook, Ansible will complain and give this error message:

ERROR: cannot find role in /current/path/roles/some-role or /current/path/some-role or /etc/ansible/roles/some-role

So the import error should be more or less like this

AttributeError: module 'json' has no attribute 'loads'

Possible root causes:
   * json is not found in the current PYTHONPATH. Python tried /current/python/site-packages/json, /current/python/site-packages/json.py. For the full list of PYTHONPATH, please refer to THIS DOC ON PYTHON.ORG.
   * your current module has the same name as the module you intent to import.

You can even simplify this to say possible cause please go to this doc on python.org and we can go verbose there.

John

Oscar Benjamin

unread,
Jan 29, 2016, 7:14:27 PM1/29/16
to Sjoerd Job Postmus, python-ideas
On 29 January 2016 at 23:42, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
> On Fri, Jan 29, 2016 at 05:42:18PM -0500, Ned Batchelder wrote:
>> Hi,
>>
>> A common question we get in the #python IRC channel is, "I tried
>> importing a module, but I get an AttributeError trying to use the
>> things it said it provided." Turns out the beginner named their own
>> file the same as the module they were trying to use.
...
>> Could we make this a more obvious failure? Is there ever a valid
>> reason for a file to import itself? Is this situation detectable in
>> the import machinery?
>
> I feel this is only a partial fix. I've been bitten by something like
> this, but not precisely like this. The difference in how I experienced
> this makes it enough for me to say: I don't think your suggestion is
> that useful.
>
> What I experienced was having collisions on the python-path, and modules
> from my codebase colliding with libraries in the stdlib (or outside it).
> For example, a library might import one of its dependencies which
> coincidentally had the same name as one of the libraries I have.

Another way that the error can arrive is if your script has the same
name as an installed module that is indirectly imported. I commonly
see my students choose the name "random.py" for a script which can
lead to this problem:

$ echo 'import urllib2' > random.py
$ python random.py
Traceback (most recent call last):
File "random.py", line 1, in <module>
import urllib2
File "/usr/lib/python2.7/urllib2.py", line 94, in <module>
import httplib
File "/usr/lib/python2.7/httplib.py", line 80, in <module>
import mimetools
File "/usr/lib/python2.7/mimetools.py", line 6, in <module>
import tempfile
File "/usr/lib/python2.7/tempfile.py", line 35, in <module>
from random import Random as _Random
ImportError: cannot import name Random

To fully avoid this error you need to know every possible top-level
module/package name and not use any of them as the name of your
script. This would be avoided if module namespaces had some nesting
e.g. 'import stdlib.random' but apparently flat is better than
nested...

> Maybe a suggestion would be to add the path of the module to the error
> message?
>
> Currently the message is
...
> AttributeError: module 'json' has no attribute 'loads'
>
> But maybe the error could/should be
>
> AttributeError: module 'json' (imported from /Users/sjoerdjob/Development/spikes/importself/json.py) has no attribute 'loads'.

I think that would be an improvement. It would still be a problem for
absolute beginners but at least the error message gives enough
information to spot the problem.

In general though I think it's unfortunate that it's possible to be
able to override installed or even stdlib modules just by having a .py
file with the same name in the same directory as the running script. I
had a discussion with a student earlier today about why '.' is not
usually on PATH for precisely this reason: you basically never want
the ls (or whatever) command to run a program in the current
directory.

--
Oscar

Andrew Barnert via Python-ideas

unread,
Jan 29, 2016, 7:17:30 PM1/29/16
to Sjoerd Job Postmus, python-ideas
On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
>
> What I experienced was having collisions on the python-path, and modules
> from my codebase colliding with libraries in the stdlib (or outside it).
> For example, a library might import one of its dependencies which
> coincidentally had the same name as one of the libraries I have.

Yes. The version of this I've seen most from novices is that they write a program named "json.py" that imports and uses requests, which tries to use the stdlib module json, which gives them an AttributeError on json.loads.

(One of my favorite questions on StackOverflow came from a really smart novice who'd written a program called "time.py", and he got an error about time.time on one machine, but not another. He figured out that obviously, requests wants him to define his own time function, which he was able to do by using the stuff in datetime. And he figured out the probable difference between the two machines--the working one had an older version of requests. He just wanted to know why requests didn't document this new requirement that they'd added. :))

> Maybe a suggestion would be to add the path of the module to the error
> message?

That would probably help, but think about what it entails:

Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

To make matters worse, AttributeError objects don't even carry the name of the object being attributed, so even if you wanted to make tracebacks do some magic if isinstance(obj, types.ModuleType), there's no way to do it.

So, that means you'd have to make ModuleType.__getattr__ do the special error message formatting.

> (Of course, another option would be to look for other modules of the
> same name when you get an attribute-error on a module to aid debugging,
> but I think that's too heavy-weight.)

If that could be done only when the exception escapes to top level and dumps s traceback, that might be reasonable. And it would _definitely_ be helpful. But I don't think it's possible without major changes.

Oscar Benjamin

unread,
Jan 29, 2016, 7:30:39 PM1/29/16
to Andrew Barnert, python-ideas
On 30 January 2016 at 00:16, Andrew Barnert via Python-ideas
<python...@python.org> wrote:
>> Maybe a suggestion would be to add the path of the module to the error
>> message?
>
> That would probably help, but think about what it entails:
>
> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

Oh yeah, good point. Somehow I read the AttributeError as an ImportError e.g.

$ python random.py
Traceback (most recent call last):
File "random.py", line 1, in <module>
import urllib2
File "/usr/lib/python2.7/urllib2.py", line 94, in <module>
import httplib
File "/usr/lib/python2.7/httplib.py", line 80, in <module>
import mimetools
File "/usr/lib/python2.7/mimetools.py", line 6, in <module>
import tempfile
File "/usr/lib/python2.7/tempfile.py", line 35, in <module>
from random import Random as _Random
ImportError: cannot import name Random

That error message could be changed to something like

ImportError: cannot import name Random from module 'random'
(/home/oscar/random.py)

Attribute errors would be more problematic.

--
Oscar

Chris Angelico

unread,
Jan 29, 2016, 10:11:38 PM1/29/16
to python-ideas
On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin
<oscar.j....@gmail.com> wrote:
> In general though I think it's unfortunate that it's possible to be
> able to override installed or even stdlib modules just by having a .py
> file with the same name in the same directory as the running script. I
> had a discussion with a student earlier today about why '.' is not
> usually on PATH for precisely this reason: you basically never want
> the ls (or whatever) command to run a program in the current
> directory.
>

One solution would be to always work in a package. As of Python 3,
implicit relative imports don't happen, so you should be safe. Maybe
there could be a flag like -m that means "run as if current directory
is a module"? You can change to a parent directory and run "python3 -m
dir.file" to run dir/file.py; if "python3 -r file" could run file.py
from the current directory (and assume the presence of an empty
__init__.py if one isn't found), that would prevent all accidental
imports - if you want to grab a file from right next to you, that's
"from . import otherfile", which makes perfect sense.

It'd be 100% backward compatible, as the new behaviour would take
effect only if the option is explicitly given. Doable?

ChrisA

Andrew Barnert via Python-ideas

unread,
Jan 29, 2016, 10:23:46 PM1/29/16
to Chris Angelico, python-ideas
On Jan 29, 2016, at 19:11, Chris Angelico <ros...@gmail.com> wrote:
>
> On Sat, Jan 30, 2016 at 11:13 AM, Oscar Benjamin
> <oscar.j....@gmail.com> wrote:
>> In general though I think it's unfortunate that it's possible to be
>> able to override installed or even stdlib modules just by having a .py
>> file with the same name in the same directory as the running script. I
>> had a discussion with a student earlier today about why '.' is not
>> usually on PATH for precisely this reason: you basically never want
>> the ls (or whatever) command to run a program in the current
>> directory.
>
> One solution would be to always work in a package. As of Python 3,
> implicit relative imports don't happen, so you should be safe. Maybe
> there could be a flag like -m that means "run as if current directory
> is a module"? You can change to a parent directory and run "python3 -m
> dir.file" to run dir/file.py; if "python3 -r file" could run file.py
> from the current directory (and assume the presence of an empty
> __init__.py if one isn't found), that would prevent all accidental
> imports - if you want to grab a file from right next to you, that's
> "from . import otherfile", which makes perfect sense.
>
> It'd be 100% backward compatible, as the new behaviour would take
> effect only if the option is explicitly given. Doable?

I like it.

The only problem is that people on platforms where you can add the -r on the shbang line will start doing that, and then their scripts won't be portable...

Sjoerd Job Postmus

unread,
Jan 30, 2016, 1:59:15 AM1/30/16
to Andrew Barnert, python-ideas
On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
> >
> > What I experienced was having collisions on the python-path, and modules
> > from my codebase colliding with libraries in the stdlib (or outside it).
> > For example, a library might import one of its dependencies which
> > coincidentally had the same name as one of the libraries I have.
>
> Yes. The version of this I've seen most from novices is that they write a program named "json.py" that imports and uses requests, which tries to use the stdlib module json, which gives them an AttributeError on json.loads.
>
> (One of my favorite questions on StackOverflow came from a really smart novice who'd written a program called "time.py", and he got an error about time.time on one machine, but not another. He figured out that obviously, requests wants him to define his own time function, which he was able to do by using the stuff in datetime. And he figured out the probable difference between the two machines--the working one had an older version of requests. He just wanted to know why requests didn't document this new requirement that they'd added. :))
>
> > Maybe a suggestion would be to add the path of the module to the error
> > message?
>
> That would probably help, but think about what it entails:
>
> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.

True. Most AttributeErrors are on user-defined classes with a typo. But
that's not the case we're discussing here. Here we are discussing how a
user should debug the effects of module name collisions, and the
resulting AttributeError.

I would expect it to be quite unlikely that two modules with the same
name each have a class with the same name, and you accidentally
initialize the wrong one.

More likely (in my experience) is that you get an AttributeError on a
module (in the case of module-name collisions).

> To make matters worse, AttributeError objects don't even carry the name of the object being attributed, so even if you wanted to make tracebacks do some magic if isinstance(obj, types.ModuleType), there's no way to do it.
>
> So, that means you'd have to make ModuleType.__getattr__ do the special error message formatting.

Yes, indeed. That's what I was thinking of. I decided to write up a quick hack that added the filename to the exception string.

sjoerdjob$ ../python mod_a.py
Traceback (most recent call last):
File "mod_a.py", line 4, in <module>
print(parse(JSON_DATA))
File "/home/sjoerdjob/dev/cpython/tmp/mod_b.py", line 4, in parse
return json.loads(blob)
AttributeError: module 'json' (loaded from /home/sjoerdjob/dev/cpython/tmp/json.py) has no attribute 'loads'

Here's the patch, in case anyone is interested.

diff --git a/Objects/moduleobject.c b/Objects/moduleobject.c
index 24c5f4c..5cc144a 100644
--- a/Objects/moduleobject.c
+++ b/Objects/moduleobject.c
@@ -654,17 +654,25 @@ module_repr(PyModuleObject *m)
static PyObject*
module_getattro(PyModuleObject *m, PyObject *name)
{
- PyObject *attr, *mod_name;
+ PyObject *attr, *mod_name, *mod_file;
attr = PyObject_GenericGetAttr((PyObject *)m, name);
if (attr || !PyErr_ExceptionMatches(PyExc_AttributeError))
return attr;
PyErr_Clear();
if (m->md_dict) {
_Py_IDENTIFIER(__name__);
mod_name = _PyDict_GetItemId(m->md_dict, &PyId___name__);
if (mod_name) {
- PyErr_Format(PyExc_AttributeError,
+ _Py_IDENTIFIER(__file__);
+ mod_file = _PyDict_GetItemId(m->md_dict, &PyId___file__);
+ if (mod_file && PyUnicode_Check(mod_file)) {
+ PyErr_Format(PyExc_AttributeError,
+ "module '%U' (loaded from %U) has no attribute '%U'", mod_name, mod_file, name);
+ } else {
+ PyErr_Format(PyExc_AttributeError,
"module '%U' has no attribute '%U'", mod_name, name);
+ }
return NULL;
}
else if (PyErr_Occurred()) {

Unfortunately, I do think this might impose **some** performance issue, but on
the other hand, I'd be inclined to think that attribute-errors on module
objects are not that likely to begin with, except for typos and issues
like these. (And of course the case that you have to support older
versions of Python with a slower implementation, but you most often see
those checks being done at the module-level, so it would only impact
load-time and not running-time.)

The added benefit would be quicker debugging when finally having posted
to a forum: "Ah, I see from the message that the path of the module is
not likely a standard-library path. Maybe you have a name collision?
Check for files or directories named '<module name here>(.py)' in your
working directory / project / ... .

>
> > (Of course, another option would be to look for other modules of the
> > same name when you get an attribute-error on a module to aid debugging,
> > but I think that's too heavy-weight.)
>
> If that could be done only when the exception escapes to top level and dumps s traceback, that might be reasonable. And it would _definitely_ be helpful. But I don't think it's possible without major changes.

No, indeed, that was also my expectation: helpful, but too big a hassle
to be worth it.

Andrew Barnert via Python-ideas

unread,
Jan 30, 2016, 2:45:18 AM1/30/16
to Sjoerd Job Postmus, python-ideas
On Jan 29, 2016, at 22:58, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
>
>> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
>>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
>>>
>>> Maybe a suggestion would be to add the path of the module to the error
>>> message?
>>
>> That would probably help, but think about what it entails:
>>
>> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.
>
> True. Most AttributeErrors are on user-defined classes with a typo. But
> that's not the case we're discussing here. Here we are discussing how a
> user should debug the effects of module name collisions, and the
> resulting AttributeError.

Right. So my point is, either we have to do the extra work in module.__getattr__ when formatting the string, or we have to extend the interface of AttributeError to carry more information in general (the object and attr name, presumably). The latter may be better, but it's also clearly not going to happen any time soon. (People have been suggesting since before 3.0 that all the standard exceptions should have more useful info, but nobody's volunteered to change the hundreds of lines of C code, Python code, and docs to do it...)

So, the only argument against your idea I can see is the potential performance issues. Which should be pretty easy to dismiss with a microbenchmark showing it's pretty small even in the worst case and a macrobenchmark showing it's not even measurable in real code, right?

Sebastian Kreft

unread,
Jan 30, 2016, 3:05:56 AM1/30/16
to Andrew Barnert, Python-Ideas


On Jan 30, 2016 6:45 PM, "Andrew Barnert via Python-ideas" <python...@python.org> wrote:
>
> On Jan 29, 2016, at 22:58, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
> >
> >> On Fri, Jan 29, 2016 at 04:16:46PM -0800, Andrew Barnert wrote:
> >>> On Jan 29, 2016, at 15:42, Sjoerd Job Postmus <sjoe...@sjec.nl> wrote:
> >>>
> >>> Maybe a suggestion would be to add the path of the module to the error
> >>> message?
> >>
> >> That would probably help, but think about what it entails:
> >>
> >> Most AttributeErrors aren't on module objects, they're on instances of user-defined classes with a typo, or on None because the user forgot a "return" somewhere, or on str because the user didn't realize the difference between the string representation of an object and the objects, etc.
> >
> > True. Most AttributeErrors are on user-defined classes with a typo. But
> > that's not the case we're discussing here. Here we are discussing how a
> > user should debug the effects of module name collisions, and the
> > resulting AttributeError.
>
> Right. So my point is, either we have to do the extra work in module.__getattr__ when formatting the string, or we have to extend the interface of AttributeError to carry more information in general (the object and attr name, presumably). The latter may be better, but it's also clearly not going to happen any time soon. (People have been suggesting since before 3.0 that all the standard exceptions should have more useful info, but nobody's volunteered to change the hundreds of lines of C code, Python code, and docs to do it...)

Pep 473 centralizes all of this requests. https://www.python.org/dev/peps/pep-0473/.

I started adding support for name error, the most simple change and it turned out to be much more complex as I had thought. I had a couple of tests which were failing and didn't have the bandwidth to debug.

Nick Coghlan

unread,
Jan 30, 2016, 4:30:30 AM1/30/16
to Ned Batchelder, python-ideas
On 30 January 2016 at 08:42, Ned Batchelder <n...@nedbatchelder.com> wrote:
> Hi,
>
> A common question we get in the #python IRC channel is, "I tried importing a
> module, but I get an AttributeError trying to use the things it said it
> provided." Turns out the beginner named their own file the same as the
> module they were trying to use.
>
> That is, they want to try (for example) the "azure" package. So they make a
> file called azure.py, and start with "import azure". The import succeeds,
> but it has none of the contents the documentation claims, because they have
> imported themselves. It's baffling, because they have used the exact
> statements shown in the examples, but it doesn't work.
>
> Could we make this a more obvious failure? Is there ever a valid reason for
> a file to import itself? Is this situation detectable in the import
> machinery?

We could potentially detect when __main__ is being reimported under a
different name and issue a user visible warning when it happens, but
we can't readily detect a file importing itself in the general case
(since it may be an indirect circular reference rather than a direct).

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Ned Batchelder

unread,
Jan 30, 2016, 6:20:07 AM1/30/16
to Nick Coghlan, python-ideas
On 1/30/16 4:30 AM, Nick Coghlan wrote:
> On 30 January 2016 at 08:42, Ned Batchelder <n...@nedbatchelder.com> wrote:
>> Hi,
>>
>> A common question we get in the #python IRC channel is, "I tried importing a
>> module, but I get an AttributeError trying to use the things it said it
>> provided." Turns out the beginner named their own file the same as the
>> module they were trying to use.
>>
>> That is, they want to try (for example) the "azure" package. So they make a
>> file called azure.py, and start with "import azure". The import succeeds,
>> but it has none of the contents the documentation claims, because they have
>> imported themselves. It's baffling, because they have used the exact
>> statements shown in the examples, but it doesn't work.
>>
>> Could we make this a more obvious failure? Is there ever a valid reason for
>> a file to import itself? Is this situation detectable in the import
>> machinery?
> We could potentially detect when __main__ is being reimported under a
> different name and issue a user visible warning when it happens, but
> we can't readily detect a file importing itself in the general case
> (since it may be an indirect circular reference rather than a direct).

I thought about the indirect case, and for the errors I'm trying to make
clearer, the direct case is plenty.

While we're at it though, re-importing __main__ is a separate kind of
behavior that is often a problem, since it means you'll have the same
classes defined twice.

--Ned.
> Cheers,
> Nick.

Nick Coghlan

unread,
Jan 30, 2016, 6:57:37 AM1/30/16
to Ned Batchelder, python-ideas
On 30 January 2016 at 21:19, Ned Batchelder <n...@nedbatchelder.com> wrote:
> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>> We could potentially detect when __main__ is being reimported under a
>> different name and issue a user visible warning when it happens, but
>> we can't readily detect a file importing itself in the general case
>> (since it may be an indirect circular reference rather than a direct).
>
> I thought about the indirect case, and for the errors I'm trying to make
> clearer, the direct case is plenty.

In that case, the only problem I see off the top of my head with
emitting a warning for direct self-imports is that it would rely on
import system behaviour we're currently trying to reduce/minimise: the
import machinery needing visibility into the globals for the module
initiating the import.

It's also possible that by the time we get hold of the __spec__ for
the module being imported, we've already dropped our reference to the
importing module's globals, so we can't check against __file__ any
more. However, I'd need to go read the code to remember how quickly we
get to extracting just the globals of potential interest.

> While we're at it though, re-importing __main__ is a separate kind of
> behavior that is often a problem, since it means you'll have the same
> classes defined twice.

Right, but it combines with the name shadowing behaviour to create a
*super* confusing combination when you write a *script* that shadows
the name of a standard library module:
http://python-notes.curiousefficiency.org/en/latest/python_concepts/import_traps.html#the-name-shadowing-trap

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Oscar Benjamin

unread,
Jan 30, 2016, 8:21:10 AM1/30/16
to Nick Coghlan, python-ideas
On 30 January 2016 at 11:57, Nick Coghlan <ncog...@gmail.com> wrote:
> On 30 January 2016 at 21:19, Ned Batchelder <n...@nedbatchelder.com> wrote:
>> On 1/30/16 4:30 AM, Nick Coghlan wrote:
>>> We could potentially detect when __main__ is being reimported under a
>>> different name and issue a user visible warning when it happens, but
>>> we can't readily detect a file importing itself in the general case
>>> (since it may be an indirect circular reference rather than a direct).
>>
>> I thought about the indirect case, and for the errors I'm trying to make
>> clearer, the direct case is plenty.
>
> In that case, the only problem I see off the top of my head with
> emitting a warning for direct self-imports is that it would rely on
> import system behaviour we're currently trying to reduce/minimise: the
> import machinery needing visibility into the globals for the module
> initiating the import.
>
> It's also possible that by the time we get hold of the __spec__ for
> the module being imported, we've already dropped our reference to the
> importing module's globals, so we can't check against __file__ any
> more. However, I'd need to go read the code to remember how quickly we
> get to extracting just the globals of potential interest.

Maybe this is because I don't really understand how the import
machinery works but I would say that if I run

$ python random.py

Then the interpreter should be able to know that __main__ is called
"random" and know the path to that file. It should also be evident if
'' is at the front of sys.path then "import random" is going to import
that same module. Why is it difficult to detect that case?

I think it would be better to try and solve the problem a little more
generally though. Having yesterday created a file called random.py in
my user directory (on Ubuntu 15.04) I get the following today:

$ cat random.py
import urllib2
$ python3
Python 3.4.3 (default, Mar 26 2015, 22:03:40)
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> *a = [1, 2]
File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple
Error in sys.excepthook:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/apport_python_hook.py", line
63, in apport_excepthook
from apport.fileutils import likely_packaged, get_recent_crashes
File "/usr/lib/python3/dist-packages/apport/__init__.py", line 5, in <module>
from apport.report import Report
File "/usr/lib/python3/dist-packages/apport/report.py", line 12, in <module>
import subprocess, tempfile, os.path, re, pwd, grp, os, time
File "/usr/lib/python3.4/tempfile.py", line 175, in <module>
from random import Random as _Random
File "/home/oscar/random.py", line 1, in <module>
import urllib2
ImportError: No module named 'urllib2'

Original exception was:
File "<stdin>", line 1
SyntaxError: starred assignment target must be in a list or tuple

--
Oscar

Steven D'Aprano

unread,
Jan 30, 2016, 5:47:53 PM1/30/16
to python...@python.org
On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:

> While we're at it though, re-importing __main__ is a separate kind of
> behavior that is often a problem, since it means you'll have the same
> classes defined twice.

As far as I can tell, importing __main__ is fine. It's only when you
import __main__ AND the main module under its real name at the same time
that you can run into problems -- and even then, not always. The sort of
errors I've seen involve something like this:

import myscript
import __main__ # this is actually myscript
a = myscript.TheClass()
# later
assert isinstance(a, __main__.TheClass)

which fails, because myscript and __main__ don't share state, despite
actually coming from the same source file.

So I think it's pretty rare for something like this to actually happen.
I've never seen it happen by accident, I've only seen it done
deliberately as a counter-example to to the "modules are singletons"
rule.


--
Steve

Nathaniel Smith

unread,
Jan 30, 2016, 6:26:17 PM1/30/16
to Steven D'Aprano, python...@python.org
On Sat, Jan 30, 2016 at 2:47 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
>
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
>
> import myscript
> import __main__ # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
>
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
>
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to to the "modules are singletons"
> rule.

Not only is importing __main__ fine, it's actually unavoidable...
__main__ is just an alias to the main script's namespace.

-- test_main.py --
class Foo:
pass
import __main__
# Prints "True"
print(Foo is __main__.Foo)
-- end test_main.py --

So importing __main__ never creates any new copies of any singletons;
it's only importing the main script under its filesystem name that
creates the problem.

-n

--
Nathaniel J. Smith -- https://vorpus.org

Ned Batchelder

unread,
Jan 30, 2016, 7:59:12 PM1/30/16
to python...@python.org
On 1/30/16 5:47 PM, Steven D'Aprano wrote:
> On Sat, Jan 30, 2016 at 06:19:35AM -0500, Ned Batchelder wrote:
>
>> While we're at it though, re-importing __main__ is a separate kind of
>> behavior that is often a problem, since it means you'll have the same
>> classes defined twice.
> As far as I can tell, importing __main__ is fine. It's only when you
> import __main__ AND the main module under its real name at the same time
> that you can run into problems -- and even then, not always. The sort of
> errors I've seen involve something like this:
>
> import myscript
> import __main__ # this is actually myscript
> a = myscript.TheClass()
> # later
> assert isinstance(a, __main__.TheClass)
>
> which fails, because myscript and __main__ don't share state, despite
> actually coming from the same source file.
>
> So I think it's pretty rare for something like this to actually happen.
> I've never seen it happen by accident, I've only seen it done
> deliberately as a counter-example to to the "modules are singletons"
> rule.
>
>

Something like this does happen in the real world. A class is defined
in the main module, and then the module is later imported with its real
name. Now you have __main__.Class and module.Class both defined. You
don't need to actually "import __main__" for it to happen.
__main__.Class is used implicitly from the main module simply as Class.

--Ned.

Nick Coghlan

unread,
Jan 31, 2016, 1:45:24 AM1/31/16
to Oscar Benjamin, python-ideas
Yes, this is the case I originally said we could definitely detect.
The case I don't know if we can readily detect is the one where a
module *other than __main__* is imported a second time under a
different name. However, I'm not sure that latter capability would be
at all useful, so it probably doesn't matter whether or not it's
feasible.

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Terry Reedy

unread,
Feb 3, 2016, 4:17:36 PM2/3/16
to python...@python.org
On 1/30/2016 1:58 AM, Sjoerd Job Postmus wrote:

> Yes, indeed. That's what I was thinking of. I decided to write up a quick hack that added the filename to the exception string.
>
> sjoerdjob$ ../python mod_a.py
> Traceback (most recent call last):
> File "mod_a.py", line 4, in <module>
> print(parse(JSON_DATA))
> File "/home/sjoerdjob/dev/cpython/tmp/mod_b.py", line 4, in parse
> return json.loads(blob)
> AttributeError: module 'json' (loaded from /home/sjoerdjob/dev/cpython/tmp/json.py) has no attribute 'loads'

Would it help is we added an AttributeError subclass,
ModuleAttributeError and raise that instead module_getattro?
--
Terry Jan Reedy

Steven D'Aprano

unread,
Feb 3, 2016, 7:31:37 PM2/3/16
to python...@python.org
Ah, yes of course you are correct, you don't need an explicit "import
__main__", but you do need an explicit "import module" somewhere.

I think that in order to have a problem, you need something like this
set of circumstances:


(1) You need a module which is intended to be used as BOTH an executable
script and an importable library.

(2) Your module ends up directly or indirectly importing itself when
running as __main__.

(3) Your code relies on the "module is a singleton" invariant that you
have just violated. (If you ever do something like `if type(obj) is
MyClass`, you're relying on that invariant.)

(If your module is never __main__, you have no problem. If your module
is always __main__ and never imported under the real file name, you have
no problem. But if it is both, you may have a problem.)

I'm prepared to believe that actually diagnosing this error can be
difficult, especially for those who have never come across this before.
But I don't think it is especially common.

Rather than trying to ban self imports, could we change sys.modules so
it caches the module object under *both* the original name and
'__main__'?


# running "script.py" as the main module
sys.modules['__main__'] = sys.modules['script'] = module_object

# now "import script" will pick up the same module object as __main__

If the script name isn't a valid identifier, there's no need for the
second entry, since it can't be imported.

(I have a vague feeling I've already asked this question before, but I
can't find any mail with it. If I have asked it, and it's been answered,
my apologies. What was the answer again?)


--
Steve

Chris Angelico

unread,
Feb 3, 2016, 8:05:16 PM2/3/16
to python-ideas
On Thu, Feb 4, 2016 at 11:30 AM, Steven D'Aprano <st...@pearwood.info> wrote:
> (3) Your code relies on the "module is a singleton" invariant that you
> have just violated. (If you ever do something like `if type(obj) is
> MyClass`, you're relying on that invariant.)

If ever you do "except MyException:", you're relying on that invariant, too.

ChrisA

Eric Snow

unread,
Mar 1, 2016, 5:47:43 PM3/1/16
to Ned Batchelder, python-ideas
(Wow, I'm late to this thread.)

FTR, the underlying cause is that a script's directory is prepended to
sys.path. [1] Otherwise importing your script wouldn't be a problem
nearly as often. This is something I've hoped we could address at
some point. [2] Note that PEP 395 [3] aimed to help with a number of
related issues and PEP 432 [4] may still help make it easier to sort
out interpreter-startup-related matters like this. Personally I'm
still in favor of deprecating "sys.path[0] auto-initialisation", but
I'm not gonna hold my breath on that. :)

-eric


[1] https://hg.python.org/cpython/file/7e48300c7f3b/Modules/main.c#l251
https://hg.python.org/cpython/file/7e48300c7f3b/Lib/runpy.py#l258
[2] http://bugs.python.org/issue13475
[3] http://www.python.org/dev/peps/pep-0395/
[4] http://www.python.org/dev/peps/pep-0432/

Rick Johnson

unread,
Mar 15, 2016, 4:02:51 PM3/15/16
to python-ideas, python...@python.org
On Friday, January 29, 2016 at 6:14:27 PM UTC-6, Oscar Benjamin wrote:
> To fully avoid this error you need to know every possible
> top-level module/package name and not use any of them as
> the name of your script. This would be avoided if module
> namespaces had some nesting e.g. 'import stdlib.random'
> but apparently flat is better than nested...

The Python Zen should not be a suicide pact.
Reply all
Reply to author
Forward
0 new messages