How to guard against bugs like this one?


kj

Feb 1, 2010, 9:34:07 PM

I just spent about 1-1/2 hours tracking down a bug.

An innocuous little script, let's call it buggy.py, only 10 lines
long, and whose output should have been at most two lines, was
quickly dumping tens of megabytes of non-printable characters to
my screen (aka gobbledygook), and in the process was messing up my
terminal *royally*. Here's buggy.py:

import sys
import psycopg2
connection_params = "dbname='%s' user='%s' password='%s'" % tuple(sys.argv[1:])
conn = psycopg2.connect(connection_params)
cur = conn.cursor()
cur.execute('SELECT * FROM version;')
print '\n'.join(x[-1] for x in cur.fetchall())


(Of course, buggy.py is pretty useless; I reduced the original,
more useful, script to this to help me debug it.)

Through a *lot* of trial and error I finally discovered that the
root cause of the problem was the fact that, in the same directory
as buggy.py, there is *another* innocuous little script, totally
unrelated, whose name happens to be numbers.py. (This second script
is one I wrote as part of a little Python tutorial I put together
months ago, and is not much more of a script than hello_world.py;
it's baby-steps for the absolute beginner. But apparently, it has
a killer name! I had completely forgotten about it.)

Both scripts live in a directory filled with *hundreds* of little
one-off scripts like the two of them. I'll call this directory
myscripts in what follows.

It turns out that buggy.py imports psycopg2, as you can see, and
apparently psycopg2 (or something imported by psycopg2) tries to
import some standard Python module called numbers; instead it ends
up importing the innocent myscripts/numbers.py, resulting in *absolute
mayhem*.

(This is no mere Python "wart"; this is a suppurating chancre, and
the fact that it remains unfixed is a neverending source of puzzlement
for me.)

How can the average Python programmer guard against this sort of
time-devouring bug in the future (while remaining a Python programmer)?
The only solution I can think of is to avoid like the plague the
basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files,
and *pray* that whatever name one chooses for one's script does
not suddenly pop up in the appropriate /usr/lib/pythonX.XX directory
of a future release.
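
The "avoid the stdlib basenames" idea can at least be checked mechanically
rather than by memory. What follows is only a minimal sketch, assuming
Python 3: sys.stdlib_module_names exists from 3.10 on, and the fallback
simply lists the stdlib directory. The helper names stdlib_names and
shadowing_scripts are made up for illustration. It cannot say anything
about names a future release might add.

```python
import os
import pkgutil
import sys
import sysconfig


def stdlib_names():
    """Return the set of top-level standard-library module names."""
    names = getattr(sys, "stdlib_module_names", None)  # Python 3.10+
    if names is not None:
        return set(names)
    # Fallback for older interpreters: list the stdlib directory and add
    # the modules that are compiled into the interpreter itself.
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    found = {info.name for info in pkgutil.iter_modules([stdlib_dir])}
    return found | set(sys.builtin_module_names)


def shadowing_scripts(directory):
    """List module names in `directory` that collide with the stdlib."""
    known = stdlib_names()
    clashes = []
    for entry in os.listdir(directory):
        name, ext = os.path.splitext(entry)
        if ext == ".py" and name in known:
            clashes.append(name)
    return sorted(clashes)
```

Calling shadowing_scripts on a directory like myscripts would report
numbers if a numbers.py lives there.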

What else can one do? Let's see, one should put every script in its
own directory, thereby containing the damage.

Anything else?

Any suggestion would be appreciated.

TIA!

~k

Chris Rebert

Feb 1, 2010, 9:57:27 PM
to pytho...@python.org
On Mon, Feb 1, 2010 at 6:34 PM, kj <no.e...@please.post> wrote:
> I just spent about 1-1/2 hours tracking down a bug.
<snip>

I think absolute imports avoid this problem:

from __future__ import absolute_import

For details, see PEP 328:
http://www.python.org/dev/peps/pep-0328/
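
To make the effect concrete, here is a small sketch of what absolute imports
change inside a package. It assumes Python 3, where absolute imports are
already the default and the __future__ line is a harmless no-op; demo_pkg,
build_demo_package and run_demo are invented names. As far as I can tell
this only covers modules that live inside a package: a top-level script
still has its own directory at the front of sys.path, so on its own this
would probably not have prevented the numbers.py clash described above.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Create demo_pkg/ containing a local numbers.py and a module using it."""
    pkg = os.path.join(root, "demo_pkg")
    os.mkdir(pkg)
    files = {
        "__init__.py": "",
        "numbers.py": "WHO = 'local demo_pkg.numbers'\n",
        "use.py": (
            "from __future__ import absolute_import\n"
            "import numbers as top_level      # absolute: searches sys.path\n"
            "from . import numbers as sibling  # explicit relative import\n"
        ),
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as fh:
            fh.write(body)


def run_demo():
    """Import demo_pkg.use from a temporary directory and return the module."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            return importlib.import_module("demo_pkg.use")
        finally:
            sys.path.remove(root)
```

After use = run_demo(), use.top_level is the standard numbers module and
use.sibling is the package's own numbers.py.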

Cheers,
Chris
--
http://blog.rebertia.com

Roy Smith

Feb 1, 2010, 10:15:32 PM
In article <hk82uv$8kn$1...@reader1.panix.com>, kj <no.e...@please.post>
wrote:

> Through a *lot* of trial an error I finally discovered that the
> root cause of the problem was the fact that, in the same directory
> as buggy.py, there is *another* innocuous little script, totally
> unrelated, whose name happens to be numbers.py.

> [...]


> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to
> import some standard Python module called numbers; instead it ends
> up importing the innocent myscript/numbers.py, resulting in *absolute
> mayhem*.

I feel your pain, but this is not a Python problem, per se. The general
pattern is:

1) You have something which refers to a resource by name.

2) There is a sequence of places which are searched for this name.

3) The search finds the wrong one because another resource by the same name
appears earlier in the search path.

I've gotten bitten like this by shells finding the wrong executable (in
$PATH). By dynamic loaders finding the wrong library (in
$LD_LIBRARY_PATH). By C compilers finding the wrong #include file. And so
on. This is just Python's import finding the wrong module in your
sys.path (which $PYTHONPATH feeds into).

The solution is the same in all cases. You either have to refer to
resources by some absolute name, or you need to make sure you set up your
search paths correctly and know what's in them. In your case, one possible
solution would be to make sure "." (or "") isn't in sys.path (although that might
cause other issues).
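
A rough sketch of that last idea, assuming you are willing to edit sys.path
at the very top of a script, before anything else is imported. The function
name drop_directory is made up here, and the usual caveat applies: once the
script's directory is gone from the path, the script can no longer import
its own sibling modules.

```python
import os
import sys


def drop_directory(search_path, directory):
    """Return a copy of search_path without entries pointing at `directory`."""
    target = os.path.realpath(directory)
    kept = []
    for entry in search_path:
        # An empty entry means "the current working directory".
        resolved = os.path.realpath(entry or os.getcwd())
        if resolved != target:
            kept.append(entry)
    return kept


# Hypothetical use as the first lines of a script:
#     here = os.path.dirname(os.path.abspath(__file__))
#     sys.path[:] = drop_directory(sys.path, here)
```

Newer interpreters (Python 3.11 and later) also offer the -P option and the
PYTHONSAFEPATH environment variable, which stop the script directory from
being prepended in the first place.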

Steven D'Aprano

Feb 1, 2010, 10:28:55 PM
On Tue, 02 Feb 2010 02:34:07 +0000, kj wrote:

> I just spent about 1-1/2 hours tracking down a bug.
>
> An innocuous little script, let's call it buggy.py, only 10 lines long,
> and whose output should have been, at most two lines, was quickly
> dumping tens of megabytes of non-printable characters to my screen (aka
> gobbledygook), and in the process was messing up my terminal *royally*.
> Here's buggy.py:

[...]


> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to import
> some standard Python module called numbers; instead it ends up importing
> the innocent myscript/numbers.py, resulting in *absolute mayhem*.


There is no module numbers in the standard library, at least not in 2.5.

>>> import numbers
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named numbers

It must be specific to psycopg2.

I would think this is a problem with psycopg2 -- it sounds like it should
be written as a package, but instead is written as a bunch of loose
modules. I could be wrong of course, but if it is just a collection of
modules, I'd definitely call that a poor design decision, if not a bug.


> (This is no mere Python "wart"; this is a suppurating chancre, and the
> fact that it remains unfixed is a neverending source of puzzlement for
> me.)

No, it's a wart. There's no doubt it bites people occasionally, but I've
been programming in Python for about ten years and I've never been bitten
by this yet. I'm sure it will happen some day, but not yet.

In this case, the severity of the bug (megabytes of binary crud to the
screen) is not related to the cause of the bug (shadowing a module).

As for fixing it, unfortunately it's not quite so simple to fix without
breaking backwards-compatibility. The opportunity to do so for Python 3.0
was missed. Oh well, life goes on.


> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?
> The only solution I can think of is to avoid like the plague the
> basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files, and
> *pray* that whatever name one chooses for one's script does not suddenly
> pop up in the appropriate /usr/lib/pythonX.XX directory of a future
> release.

Unfortunately, Python makes no guarantee that there won't be some clash
between modules. You can minimize the risks by using packages, e.g. given
a package spam containing modules a, b, c, and d, if you refer to spam.a
etc. then you can't clash with modules a, b, c, d, but only spam. So
you've cut your risk profile from five potential clashes to only one.

Also, generally most module clashes are far more obvious. If you do this:

import module
x = module.y

and module is shadowed by something else, you're *much* more likely to
get an AttributeError than megabytes of crud to the screen.
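
When the failure is not that obvious, one quick diagnostic is to ask the
module where it was loaded from. The following is a sketch only, assuming
Python 3 and a conventional on-disk standard library; loaded_from_stdlib is
an invented helper, and treating file-less (built-in or frozen) modules as
"standard" is an assumption made for simplicity.

```python
import numbers
import os
import sysconfig


def loaded_from_stdlib(module):
    """Guess whether `module` came from the standard-library directory."""
    origin = getattr(module, "__file__", None)
    if origin is None:
        # Built-in and frozen modules have no file; assume they are standard.
        return True
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    return os.path.realpath(origin).startswith(stdlib_dir + os.sep)


# Printing the origin is often enough to spot a shadowed module.
print(numbers.__file__, loaded_from_stdlib(numbers))
```

A shadowing myscripts/numbers.py would show up here with a path outside the
standard-library directory.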

I'm sorry that you got bitten so hard by this, but in practice it's
uncommon, and relatively mild when it happens.


> What else can one do? Let's see, one should put every script in its own
> directory, thereby containing the damage.

That's probably a bit extreme, but your situation:

"Both scripts live in a directory filled with *hundreds* little
one-off scripts like the two of them."

is far too chaotic for my liking. You don't need to go to the extreme of
a separate directory for each file, but you can certainly tidy things up
a bit. For example, anything that's obsolete should be moved out of the
way where it can't be accidentally executed or imported.


--
Steven

Tim Chase

Feb 1, 2010, 10:33:10 PM
to Stephen Hansen, pytho...@python.org
Stephen Hansen wrote:
> First, I don't shadow built in modules. Its really not very hard to avoid.

Given the comprehensive nature of the batteries included in
Python, it's not that hard to accidentally shadow a built-in
module that, unknown to you, is imported by a module you are
using. The classic that's stung me enough times (and many others
on c.l.p and other forums, as a quick google evidences) such that
I *finally* remember:

   bash$ touch email.py
   bash$ python
   ...
   >>> import smtplib
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/usr/lib/python2.5/smtplib.py", line 46, in <module>
       import email.Utils
   ImportError: No module named Utils

Using "email.py" is an innocuous name for a script/module you
might want to do emailish things, and it's likely you'll use
smtplib in the same code...and kablooie, things blow up even if
your code doesn't reference or directly use the built-in email.py.
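
The same effect can be observed without breaking anything by asking the
import machinery which file it would pick for a given search path. This is a
sketch assuming Python 3's importlib; which_file_wins and email_shadow_demo
are made-up names, and the temporary directory stands in for a scripts
directory that sits at the front of sys.path.

```python
import importlib.machinery
import os
import sys
import tempfile


def which_file_wins(name, search_path):
    """Return the file the path-based finder would load for `name`."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return None if spec is None else spec.origin


def email_shadow_demo():
    """Compare the lookup of 'email' with and without a shadowing file."""
    with tempfile.TemporaryDirectory() as scripts_dir:
        # Equivalent of `touch email.py` in the scripts directory.
        open(os.path.join(scripts_dir, "email.py"), "w").close()
        shadowed = which_file_wins("email", [scripts_dir] + sys.path)
        normal = which_file_wins("email", list(sys.path))
    return shadowed, normal
```

With the empty email.py first on the path, the lookup lands on that file
instead of the standard email package, which is why a later import of a
submodule such as email.Utils fails.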

Yes, as Chris mentions, PEP-328 absolute vs. relative imports
should help ameliorate the problem, but it's not yet commonly
used (unless you're using Py3, it's only at the request of a
__future__ import in 2.5+).

-tkc


Carl Banks

Feb 1, 2010, 11:01:11 PM
On Feb 1, 6:34 pm, kj <no.em...@please.post> wrote:
> Both scripts live in a directory filled with *hundreds* little
> one-off scripts like the two of them.  I'll call this directory
> myscripts in what follows.

[snip]

> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?


Don't put hundreds of little one-off scripts in a single directory.
Python can't save you from polluting your own namespace.

Don't choose such generic names for modules. Keep in mind module
names are potentially globally visible and any sane advice you ever
heard about globals is to use descriptive names. I instinctively use
adjectives, compound words, and abstract nouns for the names of all my
modules so as to be more descriptive, and to avoid name conflicts with
classes and variables.

Also learn to debug better.


Carl Banks

Carl Banks

Feb 1, 2010, 11:12:25 PM
On Feb 1, 7:33 pm, Tim Chase <python.l...@tim.thechases.com> wrote:
> Stephen Hansen wrote:
> > First, I don't shadow built in modules. Its really not very hard to avoid.
>
> Given the comprehensive nature of the batteries-included in
> Python, it's not as hard to accidentally shadow a built-in,
> unknown to you, but yet that is imported by a module you are
> using.  The classic that's stung me enough times (and many others
> on c.l.p and other forums, as a quick google evidences) such that
> I *finally* remember:
>
>    bash$ touch email.py
>    bash$ python
>    ...
>    >>> import smtplib
>    Traceback (most recent call last):
>      File "<stdin>", line 1, in <module>
>      File "/usr/lib/python2.5/smtplib.py", line 46, in <module>
>        import email.Utils
>    ImportError: No module named Utils
>
> Using "email.py" is an innocuous name for a script/module you
> might want to do emailish things, and it's likely you'll use
> smtplib in the same code...and kablooie, things blow up even if
> your code doesn't reference or directly use the built-in email.py.


email.py is not an innocuous name, it's a generic name in a global
namespace, which is a Bad Thing. Plus what does a script or module
called "email.py" actually do? Send email? Parse email? "email" is
a terrible name for a module and you deserve what you got for using it.

Name your modules "send_email.py" or "sort_email.py" or if it's a
library module of related functions, "email_handling.py". Modules and
scripts do things (usually), they should be given action words as
names.


(**) Questionable though it be, if the Standard Library wants to use
an "innocuous" name, It can.


Carl Banks

Mark Dickinson

Feb 2, 2010, 4:20:59 AM
On Feb 2, 3:28 am, Steven D'Aprano <ste...@REMOVE.THIS.cybersource.com.au> wrote:
>
> There is no module numbers in the standard library, at least not in 2.5.

It's new in 2.6 (and 3.0, I think; it's there in 3.1, anyway). It
provides abstract base classes for numeric types; see the fractions
module source for some of the ways it can be used. Here are the docs:

http://docs.python.org/library/numbers.html

and also PEP 3141, on which it's based:

http://www.python.org/dev/peps/pep-3141/
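
For a feel of what the module offers, here is a tiny sketch using the
abstract base classes with isinstance. The function numeric_kind is invented
for this example; it walks the numeric tower from the narrowest class to the
widest.

```python
import numbers
from fractions import Fraction


def numeric_kind(value):
    """Name the narrowest numbers ABC that `value` is an instance of."""
    tower = (numbers.Integral, numbers.Rational, numbers.Real, numbers.Complex)
    for abc in tower:
        if isinstance(value, abc):
            return abc.__name__
    return "not a number"


for sample in (3, Fraction(1, 2), 1.5, 1j, "3"):
    print(repr(sample), numeric_kind(sample))
```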

--
Mark

Jean-Michel Pichavant

Feb 2, 2010, 5:49:04 AM
to Carl Banks, pytho...@python.org
That does not solve anything, if the smtplib follows your advice, then
you'll be shadowing its send_email module.
The only way to avoid collision would be to name your module
__PDSFLSDF_send_email__13221sdfsdf__.py

That way, the probability you'd shadow some package's hidden module is below
the probability that Miss Hilton ever says something relevant.
However nobody wants to use such names.

Stephen gave good advice in this thread that helps avoid this issue.

JM

kj

Feb 2, 2010, 9:13:19 AM

Let me preface everything by thanking you and all those who
replied for their comments.

I have only one follow-up question (or rather, set of related
questions) that I'm very keen about, plus a bit of a vent at the
end.

>As for fixing it, unfortunately it's not quite so simple to fix without
>breaking backwards-compatibility. The opportunity to do so for Python 3.0
>was missed.

This last point is to me the most befuddling of all. Does anyone
know why this opportunity was missed for 3.0? Anyone out there
with the inside scoop on this? Was the fixing of this problem
discussed in some PEP or some mailing list thread? (I've tried
Googling this but did not hit on the right keywords to bring up
the deliberations I'm looking for.)

~k

[NB: as I said before, what follows begins to slide into a vent,
and is quite unimportant; I've left it, for whatever grain of truth
it may contain, as a grossly overgrown PS; feel free to ignore
it, I'm *by far* most interested in the question stated in the
paragraph right above, because it will give me, I hope, a better
sense of where the biggest obstacles to fixing this problem lie.]

P.S. Yes, I see the backwards-compatibility problem, but that's
what rolling out a whole new version is good for; it's a bit of
a fresh start. I remember hearing GvR's Google Talk on the coming
Python 3, which was still in the works then, and being struck by
the sheer *modesty* of the proposed changes (while the developers
of the mythical Perl6 seemed to be on a quest for transcendence to
a Higher Plane of Programming, as they still are). In particular
the business with print -> print() seemed truly bizarre to me: this
is a change that will break a *huge* volume of code, and yet,
judging by the rationale given for it, the change solves what are,
IMHO, relatively minor annoyances. Python's old print statement
is, I think, at most a tiny little zit invisible to all but those
obsessed with absolute perfection. And I can't imagine that whatever
would be required to fix Python's import system could break more
code than redefining the rules for a workhorse like print.

In contrast, the Python import problem is a ticking bomb potentially
affecting all code that imports other modules. All that needs to
happen is that, in a future release of Python, some new standard
module emerges (like numbers.py emerged in 2.6), and this module
is imported by some module your code imports. Boom! Note that it
was only coincidental that the bug I reported in this thread occurred
in a script I wrote recently. I could have written both scripts
before 2.6 was released, and the new numbers.py along with it;
barring the uncanny clairvoyance of some responders, there would
have been, at the time, absolutely no plausible reason for not
naming one of the two scripts numbers.py.

To the argument that the import system can't be easily fixed because
it breaks existing code, one can reply that the *current* import
system already breaks existing code, as illustrated by the example
I've given in this thread: this could have easily been old pre-2.6
code that got broken just because Python decided to add numbers.py
to the distribution. (Yes, Python can't guarantee that the names
of new standard modules won't clash with the names of existing
local modules, but this is true for Perl as well, and due to Perl's
module import scheme (and naming conventions), a scenario like the
one I presented in this thread would have been astronomically
improbable. The Perl example shows that the design of the module
import scheme and naming conventions for standard modules can go
a long way to minimize the consequences of this unavoidable potential
for future name clashes.)


Grant Edwards

Feb 2, 2010, 10:00:28 AM
On 2010-02-02, Roy Smith <r...@panix.com> wrote:
> In article <hk82uv$8kn$1...@reader1.panix.com>, kj <no.e...@please.post>
> wrote:
>
>> Through a *lot* of trial an error I finally discovered that the
>> root cause of the problem was the fact that, in the same directory
>> as buggy.py, there is *another* innocuous little script, totally
>> unrelated, whose name happens to be numbers.py.
>> [...]
>> It turns out that buggy.py imports psycopg2, as you can see, and
>> apparently psycopg2 (or something imported by psycopg2) tries to
>> import some standard Python module called numbers; instead it ends
>> up importing the innocent myscript/numbers.py, resulting in *absolute
>> mayhem*.
>
> I feel your pain, but this is not a Python problem, per-se.

I think it is. There should be different syntax to import from
"standard" places and from "current directory". Similar to the
difference between "foo.h" and <foo.h> in cpp.

> The general
> pattern is:
>
> 1) You have something which refers to a resource by name.
>
> 2) There is a sequence of places which are searched for this
> name.

Searching the current directory by default is the problem.
Nobody in their right mind has "." in the shell PATH and IMO it
shouldn't be in Python's import path either. Even those
reckless souls who do put "." in their path put it at the end
so they don't accidentally override system commands.

--
Grant Edwards                   grante at visi.com
Yow! So this is what it feels like to be potato salad

Nobody

Feb 2, 2010, 12:02:10 PM
On Tue, 02 Feb 2010 15:00:28 +0000, Grant Edwards wrote:

>>> It turns out that buggy.py imports psycopg2, as you can see, and
>>> apparently psycopg2 (or something imported by psycopg2) tries to
>>> import some standard Python module called numbers; instead it ends
>>> up importing the innocent myscript/numbers.py, resulting in *absolute
>>> mayhem*.
>>
>> I feel your pain, but this is not a Python problem, per-se.
>
> I think it is.

I agree.

> There should be different syntax to import from
> "standard" places and from "current directory". Similar to the
> difference between "foo.h" and <foo.h> in cpp.

I don't know if that's necessary. Only supporting the "foo.h" case would
work fine if Python behaved like gcc, i.e. if the "current directory"
referred to the directory containing the file performing the import rather
than in the process' CWD.

As it stands, imports are dynamically scoped, when they should be
lexically scoped.

>> The general
>> pattern is:
>>
>> 1) You have something which refers to a resource by name.
>>
>> 2) There is a sequence of places which are searched for this
>> name.
>
> Searching the current directory by default is the problem.
> Nobody in their right mind has "." in the shell PATH and IMO it
> shouldn't be in Python's import path either. Even those
> wreckless souls who do put "." in their path put it at the end
> so they don't accidentally override system commands.

Except, what should be happening here is that it should be searching the
directory containing the file performing the import *first*. If foo.py
contains "import bar", and there's a bar.py in the same directory as
foo.py, that's the one it should be using.

The existing behaviour is simply wrong, and there's no excuse for it
("but it's easier to implement" isn't a legitimate argument).

The only situation where the process' CWD should be used is for an import
statement in a non-file source (i.e. stdin or the argument to the -c
switch).

Alf P. Steinbach

Feb 2, 2010, 12:20:41 PM
* Nobody:

+1


> The only situation where the process' CWD should be used is for an import
> statement in a non-file source (i.e. stdin or the argument to the -c
> switch).

Hm, not sure about that last.


Cheers,

- Alf

Terry Reedy

Feb 2, 2010, 1:29:54 PM
to pytho...@python.org
On 2/2/2010 9:13 AM, kj wrote:

>> As for fixing it, unfortunately it's not quite so simple to fix without
>> breaking backwards-compatibility. The opportunity to do so for Python 3.0
>> was missed.
>
> This last point is to me the most befuddling of all. Does anyone
> know why this opportunity was missed for 3.0? Anyone out there
> with the inside scoop on this? Was the fixing of this problem
> discussed in some PEP or some mailing list thread? (I've tried
> Googling this but did not hit on the right keywords to bring up
> the deliberations I'm looking for.)

There was a proposal to put the whole stdlib into a gigantic package, so
that

import itertools

would become, for instance

import std.itertools.

Guido rejected that. I believe he both did not like it and was concerned
about making upgrade to 3.x even harder. The discussion was probably on
the now closed py3k list.

Terry Jan Reedy

Carl Banks

Feb 2, 2010, 1:38:53 PM
On Feb 2, 9:02 am, Nobody <nob...@nowhere.com> wrote:
> I don't know if that's necessary. Only supporting the "foo.h" case would
> work fine if Python behaved like gcc, i.e. if the "current directory"
> referred to the directory contain the file performing the import rather
> than in the process' CWD.
>
> As it stands, imports are dynamically scoped, when they should be
> lexically scoped.

Mostly incorrect. The CWD is in sys.path only for interactive
sessions, and when started with -c switch. When running scripts, the
directory where the script is located is used instead, not the
process's working directory.
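
This is easy to check for yourself. The sketch below assumes Python 3.7 or
later (for the text= argument of subprocess); the file name show_path.py and
the function first_path_entry_for_script are made up. It runs a one-line
script from an unrelated working directory and reports what ended up in
sys.path[0].

```python
import os
import subprocess
import sys
import tempfile


def first_path_entry_for_script():
    """Run a tiny script from another directory and inspect its sys.path[0].

    Returns (is_script_dir, is_working_dir).
    """
    # Drop PYTHONSAFEPATH (Python 3.11+) so the default behaviour is shown.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    with tempfile.TemporaryDirectory() as scripts_dir, \
            tempfile.TemporaryDirectory() as elsewhere:
        script = os.path.join(scripts_dir, "show_path.py")
        with open(script, "w") as fh:
            fh.write("import sys\nprint(sys.path[0])\n")
        out = subprocess.check_output(
            [sys.executable, script], cwd=elsewhere, env=env, text=True
        )
        reported = os.path.realpath(out.strip())
        return (
            reported == os.path.realpath(scripts_dir),
            reported == os.path.realpath(elsewhere),
        )
```

The first element of the result says whether sys.path[0] was the script's
own directory, the second whether it was the working directory the script
was launched from.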

So, no, it isn't anything like dynamic scoping.


> The only situation where the process' CWD should be used is for an import
> statement in a non-file source (i.e. stdin or the argument to the -c
> switch).

It already is that way, chief.

I think you're misunderstanding what's wrong here; the CWD doesn't
have anything to do with it. Even if CWD isn't in the path you still
get the bad behavior kj noted. So now what?

Python's importing can be improved but there's no foolproof way to get
rid of the fundamental problem of name clashes.


Carl Banks

Carl Banks

Feb 2, 2010, 1:45:07 PM
On Feb 2, 2:49 am, Jean-Michel Pichavant <jeanmic...@sequans.com>
wrote:

> Carl Banks wrote:
> > Name your modules "send_email.py" or "sort_email.py" or if it's a
> > library module of related functions, "email_handling.py".  Modules and
> > scripts do things (usually), they should be given action words as
> > names.
>
> > (**) Questionable though it be, if the Standard Library wants to use
> > an "innocuous" name, It can.
>
> That does not solve anything,

Of course it does, it solves the problem of having poorly-named
modules. It also helps reduce possibility of name clashes.

> if the smtplib follows your advice, then
> you'll be shadowing its send_email module.
> The only way to avoid collision would be to name your module
> __PDSFLSDF_send_email__13221sdfsdf__.py

I know, and as we all know accidental name clashes are the end of the
world and Mother Python should protect us feeble victims from any
remote possibility of ever having a name clash.


Carl Banks

Jean-Michel Pichavant

Feb 2, 2010, 2:07:48 PM
to Carl Banks, pytho...@python.org
Carl Banks wrote:
> On Feb 2, 2:49 am, Jean-Michel Pichavant <jeanmic...@sequans.com>
> wrote:
>
>> Carl Banks wrote:
>>
>>> Name your modules "send_email.py" or "sort_email.py" or if it's a
>>> library module of related functions, "email_handling.py". Modules and
>>> scripts do things (usually), they should be given action words as
>>> names.
>>>
>>> (**) Questionable though it be, if the Standard Library wants to use
>>> an "innocuous" name, It can.
>>>
>> That does not solve anything,
>>
>
> Of course it does, it solves the problem of having poorly-named
> modules. It also helps reduce possibility of name clashes.
>

Actually, don't you think it will increase the possibility? There are
far fewer ways to name an object properly than to name it badly.
So if everybody tends to name their objects properly, with the obvious
names you proposed, the set of possible names will shrink,
increasing the clash ratio.

I'm just nitpicking by the way, but it may be better to ask for better
namespacing instead of naming (which is a good thing but unrelated to the
OP's issue).

JM


kj

Feb 2, 2010, 2:43:40 PM

>import itertools

>would become, for instance

>import std.itertools.


Thanks. I'll look for this thread.

~K

Carl Banks

Feb 2, 2010, 3:26:16 PM
On Feb 2, 11:07 am, Jean-Michel Pichavant <jeanmic...@sequans.com> wrote:
> Carl Banks wrote:
> > On Feb 2, 2:49 am, Jean-Michel Pichavant <jeanmic...@sequans.com>
> > wrote:
>
> >> Carl Banks wrote:
>
> >>> Name your modules "send_email.py" or "sort_email.py" or if it's a
> >>> library module of related functions, "email_handling.py".  Modules and
> >>> scripts do things (usually), they should be given action words as
> >>> names.
>
> >>> (**) Questionable though it be, if the Standard Library wants to use
> >>> an "innocuous" name, It can.
>
> >> That does not solve anything,
>
> > Of course it does, it solves the problem of having poorly-named
> > modules.  It also helps reduce possibility of name clashes.
>
> Actually don't you think it will increase the possibility ? There are
> much less possibilties of properly naming an object than badly naming it.

You've got to be kidding me, you're saying that a bad name like
email.py is less likely to clash than a more descriptive name like
send_email.py?

> So if everybody tend to properly name their object with their obvious
> version like you proposed, the set of possible names will decrease,
> increasing the clash ratio.

I did not propose obvious module names. I said obvious names like
email.py are bad; more descriptive names like send_email.py are
better.


Carl Banks

Roel Schroeven

Feb 2, 2010, 4:38:43 PM
Op 2010-02-02 18:02, Nobody schreef:

> On Tue, 02 Feb 2010 15:00:28 +0000, Grant Edwards wrote:
>
>>>> It turns out that buggy.py imports psycopg2, as you can see, and
>>>> apparently psycopg2 (or something imported by psycopg2) tries to
>>>> import some standard Python module called numbers; instead it ends
>>>> up importing the innocent myscript/numbers.py, resulting in *absolute
>>>> mayhem*.
>>>
>>> I feel your pain, but this is not a Python problem, per-se.
>>
>> I think it is.
>
> I agree.
>
>> There should be different syntax to import from
>> "standard" places and from "current directory". Similar to the
>> difference between "foo.h" and <foo.h> in cpp.
>
> I don't know if that's necessary. Only supporting the "foo.h" case would
> work fine if Python behaved like gcc, i.e. if the "current directory"
> referred to the directory contain the file performing the import rather
> than in the process' CWD.

That is what I would have expected, it is the way I would have
implemented it, and I don't understand why anyone would think
differently. Yet not everyone seems to agree.

Apparently, contrary to my expectations, Python looks in the directory
containing the currently running script instead. That means that the
behavior of "import foo" depends very much on circumstances not under
control of the module in which that statement appears. Very fragile.
Suggestions to use better names are just poor workarounds, IMO. Of the
same nature are suggestions to limit the number of scripts/modules in a
directory... my /usr/bin contains no less than 2685 binaries, with 0
problems of name clashes; there is IMO no reason why Python should
restrict itself to any less.

Generally I like the design decisions used in Python, or at least I
understand the reasons; in this case though, I don't see the advantages
of the current approach.

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
-- Isaac Asimov

Roel Schroeven

Jonathan Gardner

Feb 2, 2010, 4:39:19 PM
On Feb 1, 6:34 pm, kj <no.em...@please.post> wrote:
>
> An innocuous little script, let's call it buggy.py, only 10 lines
> long, and whose output should have been, at most two lines, was
> quickly dumping tens of megabytes of non-printable characters to
> my screen (aka gobbledygook), and in the process was messing up my
> terminal *royally*.
>

In linux terminals, try running the command "reset" to clear up any
gobbledy-gook. It also works if you happen to hit CTRL-C while
entering a password, in the rare case that it fails to set the text
back to visible mode.

Terry Reedy

Feb 2, 2010, 5:47:19 PM
to pytho...@python.org

Stephen Hansen's post explains a bit more than I did. To supplement his
explanation: since print *was* a keyword, every use of 'print' in 2.x
denotes a print statement with standard semantics. Therefore 2to3
*knows* what the statement means and can translate it. On the other
hand, 'import string' usually means 'import the string module of the
stdlib', but it could mean 'import my string module'. This depends on
the execution environment. Moreover, I believe people have intentionally
shadowed stdlib modules. So, like it or not, 2to3 cannot know what
'import string' means.

Terry Jan Reedy

Steven D'Aprano

Feb 2, 2010, 8:49:17 PM
On Tue, 02 Feb 2010 12:26:16 -0800, Carl Banks wrote:

> I did not propose obvious module names. I said obvious names like
> email.py are bad; more descriptive names like send_email.py are better.

But surely send_email.py doesn't just send email, it parses email and
receives email as well?


--
Steven

kj

Feb 2, 2010, 9:35:55 PM

(For reasons I don't understand Stephen Hansen's posts don't show
in my news server. I became aware of his reply from a passing
reference in one of Terry Reedy's posts. Then I found Hansen's post
online, and then an earlier one, and pasted the relevant portion
below.)

> First, I don't shadow built in modules. Its really not very hard to avoid.

...*if* you happen to be clairvoyant. I still don't see how the rest of us
could have followed this fine principle in the case of numbers.py
prior to Python 2.6.

> Secondly, I use packages structuring my libraries, and avoid junk
> directories of a hundred some odd 'scripts'.

<small>(I feel so icky now...)</small>

> Third, I don't execute scripts in that directory structure directly, but
> instead do python -c 'from package.blah import main; main.main()' or some
> such. Usually via some short-cut, or a runner batch file.

Breathtaking... I wonder why the Python documentation, in particular
the official Python tutorial, is not more forthcoming with these
rules.

~K

kj

Feb 2, 2010, 9:36:49 PM

Thanks, this dispels some of the mystery.

~K

Steve Holden

Feb 2, 2010, 10:53:44 PM
to pytho...@python.org
kj wrote:
>
> (For reasons I don't understand Stephen Hansen's posts don't show
> in my news server. I became aware of his reply from a passing
> reference in one of Terry Reedy's post. Then I found Hansen's post
> online, and then an earlier one, and pasted the relevant portion
> below.)
>
>
>
>> First, I don't shadow built-in modules. It's really not very hard to avoid.
>
> ...*if* you happen to be clairvoyant. I still don't see how the rest of us
> could have followed this fine principle in the case of numbers.py
> prior to Python 2.6.
>
Clearly the more you know about the standard library the less likely
this is to be a problem. Had you been migrating from an earlier version
the breakage would have alerted you to look for some version-dependent
difference.

>> Secondly, I use packages structuring my libraries, and avoid junk
>> directories of a hundred some odd 'scripts'.
>
> <small>(I feel so icky now...)</small>
>

Be as flippant as you like, but that is good advice.

>> Third, I don't execute scripts in that directory structure directly, but
>> instead do python -c 'from package.blah import main; main.main()' or some
>> such. Usually via some short-cut, or a runner batch file.
>
> Breathtaking... I wonder why the Python documentation, in particular
> the official Python tutorial, is not more forthcoming with these
> rules.
>

Because despite the fact that this issue has clearly bitten you badly
enough to sour you against the language, such issues are remarkably rare
in practice and normally rather easier to debug.

regards
Steve
--
Steve Holden +1 571 484 6266 +1 800 494 3119
PyCon is coming! Atlanta, Feb 2010 http://us.pycon.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS: http://holdenweb.eventbrite.com/

Carl Banks

Feb 2, 2010, 10:55:15 PM2/2/10
to
On Feb 2, 5:49 pm, Steven D'Aprano

No, it doesn't.


Carl Banks

Steven D'Aprano

Feb 2, 2010, 11:52:42 PM2/2/10
to

Nevertheless, as a general principle, modules will tend to be multi-
purpose and/or generic. How would you rename the math or random modules
to be less "obvious" and more "descriptive"?

And of course, the less obvious the name, the harder it becomes for
people to find and use it. Which extreme would you rather?

import zip
import compress_and_decompress_files_to_zip_archives


I'm sympathetic to the position you're taking. It's not bad advice at
all, but I think you're over-selling it as a complete solution to the
problem of name clashes. I think it can only slightly alleviate the
problem of name clashes, not eliminate it.


--
Steven

kj

Feb 3, 2010, 8:52:03 AM2/3/10
to

Steve, I apologize for the snarkiness of my previous reply to you.
After all, I started the thread by asking the forum for advice on
how to avoid a certain kind of bug, and you were among those who gave
me advice. So nothing other than thanking you for it was in order.
I just let myself get carried away by my annoyance with the Python
import scheme. I'm sorry about it. Even though I don't think I
can put into practice all of your advice, I can still learn a good
deal from it.

Cheers,

~kj


Steve Holden <st...@holdenweb.com> writes:

>kj wrote:
>>
>>> First, I don't shadow built-in modules. It's really not very hard to avoid.
>>
>> ...*if* you happen to be clairvoyant. I still don't see how the rest of us
>> could have followed this fine principle in the case of numbers.py
>> prior to Python 2.6.
>>
>Clearly the more you know about the standard library the less likely
>this is to be a problem. Had you been migrating from an earlier version
> the breakage would have alerted you to look for some version-dependent
>difference.

<snip>

kj

Feb 3, 2010, 11:17:52 AM2/3/10
to
In <hkbv23$c07$1...@reader2.panix.com> kj <no.e...@please.post> writes:


>Steve, I apologize for the snarkiness of my previous reply to you.
>After all, I started the thread by asking the forum for advice on
>how to avoid a certain kind of bug, and you were among those who gave
>me advice. So nothing other than thanking you for it was in order.
>I just let myself get carried away by my annoyance with the Python
>import scheme. I'm sorry about it. Even though I don't think I
>can put into practice all of your advice, I can still learn a good
>deal from it.


Boy, that was dumb of me. The above apology was meant for Stephen
Hansen, not Steve Holden. I guess this is now a meta-apology...
(Sheesh.)

~kj

Nobody

Feb 3, 2010, 11:55:40 AM2/3/10
to
On Tue, 02 Feb 2010 10:38:53 -0800, Carl Banks wrote:

>> I don't know if that's necessary. Only supporting the "foo.h" case would
>> work fine if Python behaved like gcc, i.e. if the "current directory"
>> referred to the directory containing the file performing the import rather
>> than in the process' CWD.
>>
>> As it stands, imports are dynamically scoped, when they should be
>> lexically scoped.
>
> Mostly incorrect. The CWD is in sys.path only for interactive
> sessions, and when started with -c switch. When running scripts, the
> directory where the script is located is used instead, not the
> process's working directory.

Okay, so s/CWD/directory containing __main__ script/, but the general
argument still holds.

> So, no, it isn't anything like dynamic scoping.

That's what it looks like to me. The way that an import name is resolved
depends upon the run-time context in which the import occurs.

>> The only situation where the process' CWD should be used is for an import
>> statement in a non-file source (i.e. stdin or the argument to the -c
>> switch).
>
> It already is that way, chief.
>
> I think you're misunderstanding what's wrong here; the CWD doesn't
> have anything to do with it. Even if CWD isn't in the path you still
> get the bad behavior kj noted. So now what?

Search for imports first in the directory containing the file performing
the import.

This is essentially the situation with gcc; the directory containing the
current file takes precedence over directories specified by -I switches.
If you want to override this, you have to use the -I- switch, which makes
it very unlikely to happen by accident.

Steve Holden

Feb 3, 2010, 2:22:59 PM2/3/10
to pytho...@python.org
Don't give it another thought. I'd much rather you cared than you didn't ...

regards
Steve

Steve Holden

Feb 3, 2010, 2:23:53 PM2/3/10
to pytho...@python.org
Oh, so you don't like *my* advice? ;-)

regards
Steve

Dan Stromberg

Feb 3, 2010, 4:09:55 PM2/3/10
to kj, pytho...@python.org
kj wrote:
> I just spent about 1-1/2 hours tracking down a bug.

>
> An innocuous little script, let's call it buggy.py, only 10 lines
> long, and whose output should have been, at most two lines, was
> quickly dumping tens of megabytes of non-printable characters to
> my screen (aka gobbledygook), and in the process was messing up my
> terminal *royally*. Here's buggy.py:
>
>
>
> import sys
> import psycopg2
> connection_params = "dbname='%s' user='%s' password='%s'" % tuple(sys.argv[1:])
> conn = psycopg2.connect(connection_params)
> cur = conn.cursor()
> cur.execute('SELECT * FROM version;')
> print '\n'.join(x[-1] for x in cur.fetchall())
>
>
> (Of course, buggy.py is pretty useless; I reduced the original,
> more useful, script to this to help me debug it.)

>
> Through a *lot* of trial and error I finally discovered that the
> root cause of the problem was the fact that, in the same directory
> as buggy.py, there is *another* innocuous little script, totally
> unrelated, whose name happens to be numbers.py. (This second script
> is one I wrote as part of a little Python tutorial I put together
> months ago, and is not much more of a script than hello_world.py;
> it's baby-steps for the absolute beginner. But apparently, it has
> a killer name! I had completely forgotten about it.)
>
> Both scripts live in a directory filled with *hundreds* of little
> one-off scripts like the two of them. I'll call this directory
> myscripts in what follows.
>
> It turns out that buggy.py imports psycopg2, as you can see, and
> apparently psycopg2 (or something imported by psycopg2) tries to
> import some standard Python module called numbers; instead it ends
> up importing the innocent myscript/numbers.py, resulting in *absolute
> mayhem*.
>
> (This is no mere Python "wart"; this is a suppurating chancre, and
> the fact that it remains unfixed is a neverending source of puzzlement
> for me.)
>
> How can the average Python programmer guard against this sort of
> time-devouring bug in the future (while remaining a Python programmer)?
> The only solution I can think of is to avoid like the plague the
> basenames of all the 200 or so /usr/lib/pythonX.XX/xyz.py{,c} files,
> and *pray* that whatever name one chooses for one's script does
> not suddenly pop up in the appropriate /usr/lib/pythonX.XX directory
> of a future release.
>
> What else can one do? Let's see, one should put every script in its
> own directory, thereby containing the damage.
>
> Anything else?
>
> Any suggestion would be appreciated.
>
> TIA!
>
> ~k
>
Here's a pretty simple fix that should work in about any version of
python available:

Put modules in ~/lib. Put scripts in ~/bin. Your modules end with
.py. Your scripts don't. Your scripts add ~/lib to sys.path as
needed. Things that go in ~/lib are named carefully. Things in ~/bin
also need to be named carefully, but for an entirely different reason -
if you name something "ls", you may get into trouble.

Then things in ~/lib plainly could cause issues. Things in ~/bin don't.

Ending everything with .py seems to come from the perl tradition of
ending everything with .pl. This perl tradition appears to have come
from perl advocates wanting everyone to know (by looking at a URL) that
they are using a perl CGI. IMO, it's language vanity, and best
dispensed with - aside from this issue, it also keeps you from rewriting
your program in another language with an identical interface.

This does, however, appear to be a scary issue from a security
standpoint. I certainly hope that scripts running as root don't search
"." for modules.
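
A minimal sketch of the "scripts add ~/lib to sys.path as needed" step, assuming Python 3. The helper name add_private_lib is invented for illustration, and appending (rather than inserting at the front) is a choice made here, not something stated above: it means a carelessly named file in ~/lib cannot shadow the standard library, at the cost that the standard library wins any clash.

```python
import os
import sys


def add_private_lib(lib_dir="~/lib"):
    """Append a personal module directory to sys.path, at most once.

    Hypothetical helper: the ~/lib default mirrors the layout described
    above. Appending keeps the standard library ahead of private modules
    in the search order.
    """
    path = os.path.abspath(os.path.expanduser(lib_dir))
    if path not in sys.path:
        sys.path.append(path)
    return path


if __name__ == "__main__":
    add_private_lib()
    # Modules kept in ~/lib can be imported from here on.
    print(sys.path[-1])
```

On the security worry: recent interpreters can be told not to put the script's directory (or the current directory) on sys.path at all, via the -I isolated-mode switch (Python 3.4 and later) or -P / PYTHONSAFEPATH (Python 3.11 and later). Neither existed when this thread was written.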


Carl Banks

Feb 3, 2010, 5:24:59 PM2/3/10
to
On Feb 3, 8:55 am, Nobody <nob...@nowhere.com> wrote:
> On Tue, 02 Feb 2010 10:38:53 -0800, Carl Banks wrote:
> >> I don't know if that's necessary. Only supporting the "foo.h" case would
> >> work fine if Python behaved like gcc, i.e. if the "current directory"
> >> referred to the directory containing the file performing the import rather
> >> than in the process' CWD.
>
> >> As it stands, imports are dynamically scoped, when they should be
> >> lexically scoped.
>
> > Mostly incorrect.  The CWD is in sys.path only for interactive
> > sessions, and when started with -c switch.  When running scripts, the
> > directory where the script is located is used instead, not the
> > process's working directory.
>
> Okay, so s/CWD/directory containing __main__ script/, but the general
> argument still holds.
>
> > So, no, it isn't anything like dynamic scoping.
>
> That's what it looks like to me. The way that an import name is resolved
> depends upon the run-time context in which the import occurs.

Well it has one superficial similarity to dynamic binding, but that's
pretty much it.

When the directory containing the __main__ script is in the path, it
doesn't matter whether you invoke it from a different directory or call
os.chdir() during the program; the same modules get imported. I.e., there's
nothing dynamic at all about which modules are used. (Unless you
fiddle with the path, but that's another question.)

All I'm saying is the analogy is bad. A better analogy would be if
you have lexical binding, but you automatically look in various sister
scopes before your own scope.
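
The claim that the script's directory, not the process's working directory, goes on the path can be checked with a small experiment. This is a sketch assuming Python 3.7 or later; the file name show_path.py and the helper first_path_entry are made up, and PYTHONSAFEPATH is cleared from the child environment because that (much later) setting changes the behaviour being demonstrated.

```python
import os
import subprocess
import sys


def first_path_entry(script_dir, run_from):
    """Run a tiny script stored in script_dir with run_from as the working
    directory, and return the sys.path[0] the child interpreter reports."""
    script = os.path.join(script_dir, "show_path.py")  # hypothetical name
    with open(script, "w") as f:
        f.write("import sys\nprint(sys.path[0])\n")
    # Drop PYTHONSAFEPATH so the child uses the default path setup.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONSAFEPATH"}
    result = subprocess.run(
        [sys.executable, script],
        cwd=run_from,
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

Called with two different directories, the reported entry should be the directory holding the script, whatever run_from is (compare with os.path.realpath, since the interpreter may resolve symlinks).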


Carl Banks

Carl Banks

Feb 3, 2010, 5:34:13 PM2/3/10
to
On Feb 2, 8:52 pm, Steven D'Aprano

<ste...@REMOVE.THIS.cybersource.com.au> wrote:
> On Tue, 02 Feb 2010 19:55:15 -0800, Carl Banks wrote:
> > On Feb 2, 5:49 pm, Steven D'Aprano
> > <ste...@REMOVE.THIS.cybersource.com.au> wrote:
> >> On Tue, 02 Feb 2010 12:26:16 -0800, Carl Banks wrote:
> >> > I did not propose obvious module names.  I said obvious names like
> >> > email.py are bad; more descriptive names like send_email.py are
> >> > better.
>
> >> But surely send_email.py doesn't just send email, it parses email and
> >> receives email as well?
>
> > No, it doesn't.
>
> Nevertheless, as a general principle, modules will tend to be multi-
> purpose and/or generic.

Uh, no?

If your module is a library with a public API, then you might
defensibly have a "generic and/or multi-purpose module", but if that's
the case you should have already christened it something unique.

Otherwise modules should stick to a single purpose that can be
summarized in a short action word or phrase.


Carl Banks

Tim Golden

Feb 3, 2010, 11:33:57 AM2/3/10
to pytho...@python.org
On 03/02/2010 16:17, kj wrote:
> Boy, that was dumb of me. The above apology was meant for Stephen
> Hansen, not Steve Holden. I guess this is now a meta-apology...
> (Sheesh.)

You see? That's what I like about the Python community:
people even apologise for apologising :)

TJG

Steve Holden

Feb 4, 2010, 7:04:35 AM2/4/10
to pytho...@python.org
Tim Golden wrote:
> On 03/02/2010 16:17, kj wrote:
>> Boy, that was dumb of me. The above apology was meant for Stephen
>> Hansen, not Steve Holden. I guess this is now a meta-apology...
>> (Sheesh.)
>
> You see? That's what I like about the Python community:
> people even apologise for apologising :)
>
QOTW?

John Nagle

Feb 5, 2010, 3:16:46 PM2/5/10
to
kj wrote:
...

>
> Through a *lot* of trial and error I finally discovered that the
> root cause of the problem was the fact that, in the same directory
> as buggy.py, there is *another* innocuous little script, totally
> unrelated, whose name happens to be numbers.py.

The right answer to this is to make module search return an
error if two modules satisfy the search criteria. "First find"
isn't a good solution.

John Nagle

MRAB

Feb 5, 2010, 3:44:58 PM2/5/10
to pytho...@python.org
Stephen Hansen wrote:
> On Fri, Feb 5, 2010 at 12:16 PM, John Nagle <na...@animats.com
> <mailto:na...@animats.com>> wrote:
> And thereby slow down every single import and application startup time as
> the common case of finding a module in one of the first couple entries
> in sys.path now has to search for it in every single item on that path. It's
> not uncommon to have a LOT of things on sys.path.
>
> No thanks. "First Find" is good enough, especially with PEP 328 and
> absolute_import being on in Python 3 (and presumably 2.7). It doesn't
> really help older Python versions unfortunately, but changing how import
> works wouldn't help them anyways. Yeah, there might be two paths on
> sys.path which both have a 'numbers.py' at the top level and First Find
> might return the wrong one, but... people making poor decisions on code
> organization and not using packages isn't something the language really
> needs to fix.
>
You might want to write a script that looks through the search paths for
duplicated names, especially ones which hide modules in the standard
library. Has anyone done this already?
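
A rough sketch of such a checker, assuming Python 3 (sys.stdlib_module_names needs 3.10 or later; on older versions the standard-library check below simply reports nothing). All function names are invented for illustration, and only .py files and packages with an __init__.py are considered, so extension modules and zip archives on the path are ignored.

```python
import os
import sys


def top_level_names(directory):
    """Return the module/package names importable directly from directory."""
    names = set()
    try:
        entries = os.listdir(directory)
    except OSError:  # missing directory, zip archive entry, no permission
        return names
    for entry in entries:
        full = os.path.join(directory, entry)
        stem, ext = os.path.splitext(entry)
        if ext == ".py" and stem.isidentifier():
            names.add(stem)
        elif os.path.isdir(full) and os.path.isfile(
            os.path.join(full, "__init__.py")
        ):
            names.add(entry)
    return names


def find_shadowed(paths=None):
    """Map each name found in more than one path entry to the directories
    that provide it, in search order (the first one listed wins)."""
    paths = sys.path if paths is None else paths
    seen = {}
    for directory in paths:
        directory = directory or os.getcwd()  # '' means current directory
        for name in top_level_names(directory):
            dirs = seen.setdefault(name, [])
            if directory not in dirs:
                dirs.append(directory)
    return {name: dirs for name, dirs in seen.items() if len(dirs) > 1}


def stdlib_clashes(directory):
    """Return names in directory that match standard-library module names."""
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    return sorted(top_level_names(directory) & stdlib)


if __name__ == "__main__":
    for name, dirs in sorted(find_shadowed().items()):
        print(name, "found in:", ", ".join(dirs))
    print("stdlib clashes here:", stdlib_clashes(os.getcwd()))
```

Run from a directory like myscripts, the second report would flag a local numbers.py on Python 3.10 or later. Unlike making the import system raise an error on duplicates, a check like this costs nothing at import time and leaves deliberate shadowing possible.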

Ethan Furman

Feb 5, 2010, 4:48:22 PM2/5/10
to pytho...@python.org

Then what happens when you *want* to shadow a module? As MRAB suggests,
if you are really concerned about it use a script that checks for
duplicate modules (not a bad idea for debugging), but don't start
throwing errors... next thing you know we won't be able to shadow
classes, functions, or built-ins! !-)

~Ethan~
