Deprecating reload() ???

Ellinghaus, Lance

Mar 11, 2004, 3:10:59 PM

to pytho...@python.org

> > Other surprises: Deprecating reload()

>Reload doesn't work the way most people think
>it does: if you've got any references to the old module,
>they stay around. They aren't replaced.

>It was a good idea, but the implementation simply
>doesn't do what the idea promises.

I agree that it does not really work as most people think it does, but how
would you perform the same task as reload() without the reload()?

Would this be done by "del sys.modules['modulename']" and then performing an
'import'?

I use reload() for many purposes, knowing how it works/does not work.

Lance Ellinghaus

John Roth

Mar 11, 2004, 4:43:14 PM

"Ellinghaus, Lance" <lance.el...@eds.com> wrote in message
news:mailman.282.10790359...@python.org...

I usually figure out another way to do it - mostly reload the entire
program from scratch. In any case, it's not a real issue for me -
it's just enough of an irritation that the PythonWin editor does it that
I keep a command line open rather than use what is (to me) a
really broken implementation of run.

There are a lot of things in life that I wish worked the way they
are advertised. They don't, so you cope.

John Roth
>
> Lance Ellinghaus
>

David MacQuigg

Mar 11, 2004, 7:45:43 PM

On Thu, 11 Mar 2004 15:10:59 -0500, "Ellinghaus, Lance"
<lance.el...@eds.com> wrote:

>> > Other surprises: Deprecating reload()
>
>>Reload doesn't work the way most people think
>>it does: if you've got any references to the old module,
>>they stay around. They aren't replaced.
>
>>It was a good idea, but the implementation simply
>>doesn't do what the idea promises.
>
>I agree that it does not really work as most people think it does, but how
>would you perform the same task as reload() without the reload()?

Seems like reload() could be *made* to work by scanning currently
loaded modules for all references to objects in the reloaded module,
and resetting them to the new objects. If an object is missing from
the new module, throw an exception.

GvR suggests using 'exec' as an alternative.
http://python.org/doc/essays/ppt/regrets/6
I don't see how that solves the problem of eliminating references to
old objects.

-- Dave

us...@domain.invalid

Mar 12, 2004, 12:27:36 AM

Ellinghaus, Lance wrote:

> I agree that it does not really work as most people think it does, but how
> would you perform the same task as reload() without the reload()?
>
> Would this be done by "del sys.modules['modulename']" and then perform an
> 'import'?

Ah, interesting! I've been doing this:

del modulename; import modulename

but it doesn't pick up any recent changes to the
modulename.py file.

That's a better solution than exiting Python and
starting it up again. Thanks!

--
Steven D'Aprano

Peter Hansen

Mar 12, 2004, 6:58:06 AM

Just to clarify, *neither* of the above solutions does anything useful,
as far as I know. Only reload() will really reload a module. Certainly
the del mod/import mod approach is practically a no-op...

-Peter

David MacQuigg

Mar 12, 2004, 7:43:25 AM

This doesn't work. It doesn't matter if you first "del modulename".
All that does is remove the reference to "modulename" from the current
namespace.

testimport.py
a = 8
print 'a =', a

>>> import testimport
a = 8
>>> ## ==> change a to 9 in testimport.py
>>> testimport.a
8
>>> dir()
['__builtins__', '__doc__', '__name__', 'testimport']
>>> del testimport
>>> dir()
['__builtins__', '__doc__', '__name__']
>>> import testimport
>>> dir()
['__builtins__', '__doc__', '__name__', 'testimport']
>>> testimport.a
8 <== The old module is still in memory !!!
>>> a = testimport.a
>>> dir()
['__builtins__', '__doc__', '__name__', 'a', 'testimport']
>>> reload(testimport)
a = 9
>>> testimport.a
9 <== reload() changes new fully-qualified references
>>> a
8 <== but not any other references that were set up before.
>>>

I know this is the *intended* behavior of reload(), but it has always
seemed to me like a bug. Why would you *ever* want to keep pieces of
an old module when that module is reloaded?

Seems to me we should *fix* reload, not deprecate it.

-- Dave
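
For readers on Python 3, where reload() has moved to importlib.reload, the session above can be replayed roughly as follows. This is only a sketch: the module name testimport_demo and the function stale_alias_demo are made up for illustration, and bytecode writing is switched off so that a quick edit to the file is not hidden by a cached .pyc.

```python
import importlib
import os
import sys
import tempfile


def stale_alias_demo():
    """Replay the testimport session with a throwaway module (hypothetical name)."""
    saved_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True   # no .pyc, so a fast edit cannot be masked
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "testimport_demo.py")
        with open(path, "w") as fh:
            fh.write("a = 8\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import testimport_demo
            a = testimport_demo.a            # a second reference to the object 8

            with open(path, "w") as fh:      # "change a to 9" in the file
                fh.write("a = 9\n")

            del testimport_demo              # removes only the local name
            import testimport_demo           # served from sys.modules, nothing re-runs
            after_del_import = testimport_demo.a

            importlib.reload(testimport_demo)
            after_reload = testimport_demo.a
            return after_del_import, after_reload, a
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("testimport_demo", None)
            sys.dont_write_bytecode = saved_flag


if __name__ == "__main__":
    print(stale_alias_demo())
```

The three values returned correspond to testimport.a after del/import, testimport.a after the reload, and the separately bound name a.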

Peter Hansen

Mar 12, 2004, 8:45:24 AM

David MacQuigg wrote:

> I know this is the *intended* behavior of reload(), but it has always
> seemed to me like a bug. Why would you *ever* want to keep pieces of
> an old module when that module is reloaded?

If you wanted a dynamic module, where changes would be picked up by the
application from time to time, but didn't want to terminate existing
instances of the old version. There's nothing wrong with the idea that
both old and new versions of something could exist in memory, at least
for a while until the old ones are finished with whatever they are doing.

Basically, envision an application update mechanism for long-running
applications, and its special needs.

> Seems to me we should *fix* reload, not deprecate it.

Reload is not broken, and certainly shouldn't be deprecated at least
until there's a better solution that won't suffer from reload's one
problem, IMHO, which is that it surprises some people by its behaviour.
I think that when you consider Python's namespace mechanism, you can't
avoid the possibility of situations like the ones reload can now lead to.

-Peter

Michael Hudson

Mar 12, 2004, 8:39:07 AM

"Ellinghaus, Lance" <lance.el...@eds.com> writes:

> > > Other surprises: Deprecating reload()
>
> >Reload doesn't work the way most people think
> >it does: if you've got any references to the old module,
> >they stay around. They aren't replaced.
>
> >It was a good idea, but the implementation simply
> >doesn't do what the idea promises.

I missed these mails...

> I agree that it does not really work as most people think it does, but how
> would you perform the same task as reload() without the reload()?
>
> Would this be done by "del sys.modules['modulename']" and then perform an
> 'import'?
>
> I use reload() for many purposes, knowing how it works/does not work.

To paraphrase Tim Peters, you can have reload() when I'm dead. Just
because it doesn't and never has done what some people expect is no
argument at all for deprecating it.

Cheers,
mwh

--
But maybe I've just programmed in enough different languages to
assume that they are, in fact, different.
-- Tony J Ibbs explains why Python isn't Java on comp.lang.python

Skip Montanaro

Mar 12, 2004, 9:20:14 AM

to Peter Hansen, pytho...@python.org

Regarding

>> del sys.modules['modulename']; import modulename

vs.

>> del modulename ; import modulename

Peter sez:

Peter> Just to clarify, *neither* of the above solutions does anything
Peter> useful, as far as I know. Only reload() will really reload a
Peter> module. Certainly the del mod/import mod approach is practically
Peter> a no-op...

Not so. del sys.modules['mod']/import mod is effectively what reload()
does. Given foo.py with this content:

a = 1
print a

here's a session which shows that it is outwardly the same as reload(...).
That is, the module-level code gets reexecuted (precisely because there's
not already a reference in sys.modules).

>>> import foo
1
>>> reload(foo)
1
<module 'foo' from 'foo.pyc'>
>>> import foo
>>> import sys
>>> del sys.modules['foo']
>>> import foo
1

David MacQuigg's contention that it doesn't work is missing the fact that
when he executed

a = testimport.a

he simply created another reference to the original object which was outside
the scope of the names in testimport. Reloading testimport created a new
object and bound "testimport.a" to it. That has nothing to do with the
object to which "a" is bound. This is the problem reload() doesn't solve.
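
One difference between the two approaches can be checked directly. The sketch below assumes Python 3 (importlib.reload) and a throwaway module called foo_demo; both the name and the helper functions are hypothetical. As far as I can tell, reload re-executes the source inside the existing module object, while deleting the sys.modules entry and importing again builds a new module object, so anything still holding the old module keeps seeing the old values.

```python
import importlib
import os
import sys
import tempfile


def _write(path, text):
    with open(path, "w") as fh:
        fh.write(text)


def compare_reload_strategies():
    """Contrast del sys.modules[...] + import with importlib.reload()."""
    saved_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True   # keep a stale .pyc from masking quick edits
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "foo_demo.py")
        _write(path, "a = 1\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            original = importlib.import_module("foo_demo")

            # Strategy 1: drop the sys.modules entry, then import again.
            _write(path, "a = 2\n")
            del sys.modules["foo_demo"]
            fresh = importlib.import_module("foo_demo")
            result = {
                "fresh_is_original": fresh is original,  # a new module object
                "original_a": original.a,                # old object, old value
                "fresh_a": fresh.a,
            }

            # Strategy 2: reload() runs the new source in the same module object.
            _write(path, "a = 3\n")
            reloaded = importlib.reload(fresh)
            result["reloaded_is_fresh"] = reloaded is fresh
            result["fresh_a_after_reload"] = fresh.a
            return result
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("foo_demo", None)
            sys.dont_write_bytecode = saved_flag


if __name__ == "__main__":
    print(compare_reload_strategies())
```

Both strategies re-execute the module-level code, which matches the session shown earlier; they differ only in which module object ends up holding the result.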

Given Python's current object model it would be an interesting challenge to
write a "super reload" which could identify all the objects created as a
side effect of importing a module and for those with multiple references,
locate those other references by traversing the known object spaces, then
perform the import and finally rebind the references found in the first
step. I thought I saw something like this posted in a previous thread on
this subject.

In case you're interested in messing around with this, there are three
object spaces to be careful of, builtins (which probably won't hold any
references), all the imported module globals (which you can find in
sys.modules) and all the currently active local Python functions (this is
where it gets interesting ;-). There's a fourth set of references
- those held by currently active functions which were defined in
extension modules - but I don't think you can get to them from pure Python
code. Whether or not that's a serious enough problem to derail the quest
for a super_reload() function is another matter. Most of the time it probably
won't matter. Every once in a great while it will probably bite you in the
ass. Accordingly, if super_reload() can't track down all the references
which need rebinding it should probably raise a warning/exception or return
a special value to indicate that.

Skip

David MacQuigg

Mar 12, 2004, 10:54:11 AM

On Fri, 12 Mar 2004 08:45:24 -0500, Peter Hansen <pe...@engcorp.com>
wrote:

>David MacQuigg wrote:
>
>> I know this is the *intended* behavior of reload(), but it has always
>> seemed to me like a bug. Why would you *ever* want to keep pieces of
>> an old module when that module is reloaded?
>
>If you wanted a dynamic module, where changes would be picked up by the
>application from time to time, but didn't want to terminate existing
>instances of the old version. There's nothing wrong with the idea that
>both old and new versions of something could exist in memory, at least
>for a while until the old ones are finished with whatever they are doing.

OK, I agree this is a valid mode of operation.

>Basically, envision an application update mechanism for long-running
>applications, and its special needs.

I think a good example might be some objects that need to retain the
state they had before the reload.

>> Seems to me we should *fix* reload, not deprecate it.
>
>Reload is not broken, and certainly shouldn't be deprecated at least
>until there's a better solution that won't suffer from reload's one
>problem, IMHO, which is that it surprises some people by its behaviour.

It's worse than just a surprise. It's a serious problem when what you
need to do is what most people are expecting -- replace every
reference to objects in the old module with references to the new
objects. The problem becomes a near impossibility when those
references are scattered throughout a multi-module program.

For the *exceptional* case where we want to pick and choose which
objects to update, seems like the best solution would be to add some
options to the reload function.

def reload(<module>, objects = '*'):

The default second argument '*' updates all references. '?' prompts
for each object. A list of objects updates references to just those
objects automatically. 'None' would replicate the current behavior,
updating only the name of the module itself.

The '?' mode would normally ask one question for each object in the
module to be reloaded, and then update all references to the selected
objects. We could even have a '??' mode that would prompt for each
*reference* to each object.

In a typical debug session, there is only one object that you have
updated, so I think there would seldom be a need to enter more than
one name as the second argument.

> I think that when you consider Python's namespace mechanism, you can't
> avoid the possibility of situations like the ones reload can now lead to.

I don't understand. My assumption is you would normally update all
references to the selected objects in all namespaces.

-- Dave

Mel Wilson

Mar 12, 2004, 10:29:18 AM

In article <2j12505t4oaniqq76...@4ax.com>,

David MacQuigg <d...@gain.com> wrote:
>On Thu, 11 Mar 2004 15:10:59 -0500, "Ellinghaus, Lance"
><lance.el...@eds.com> wrote:
>
>>> > Other surprises: Deprecating reload()
>>
>>>Reload doesn't work the way most people think
>>>it does: if you've got any references to the old module,
>>>they stay around. They aren't replaced.
>>
>>>It was a good idea, but the implementation simply
>>>doesn't do what the idea promises.
>>
>>I agree that it does not really work as most people think it does, but how
>>would you perform the same task as reload() without the reload()?
>
>Seems like reload() could be *made* to work by scanning currently
>loaded modules for all references to objects in the reloaded module,
>and resetting them to the new objects. If an object is missing from
>the new module, throw an exception.

I don't quite get this. I don't see objects _in_ a
module. I see objects referenced from the module's
namespace, but they can be referenced from other modules'
namespaces at the same time. Who is to say which module the
objects are *in*?

e.g. (untested)

#===============================
"M1.py"

f1 = file ('a')
f2 = file ('b')

#===============================
"main.py"
import M1

a_file = M1.f1
another = file ('c')
M1.f2 = another
reload (M1)

Regards. Mel.

David MacQuigg

Mar 12, 2004, 12:25:35 PM

On Fri, 12 Mar 2004 08:20:14 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

>Given Python's current object model it would be an interesting challenge to
>write a "super reload" which could identify all the objects created as a
>side effect of importing a module and for those with multiple references,
>locate those other references by traversing the known object spaces, then
>perform the import and finally rebind the references found in the first
>step. I thought I saw something like this posted in a previous thread on
>this subject.

I'm not familiar with the internals of Python, but I was assuming that
all objects could be easily associated with the module which created
them. If not, maybe what we need is the ability to put a selected
module in "debug" mode and keep a list of all objects created by that
module and all references to those objects. That would add a little
overhead, but avoid the difficulties of searching all object spaces
with every reload.

-- Dave

John Roth

Mar 12, 2004, 1:24:21 PM

"David MacQuigg" <d...@gain.com> wrote in message
news:hgr3501a1qu612fvu...@4ax.com...

That's actually the wrong end of the problem. Even if you
could associate all the objects with the module that created
them, you would still have to find all the references to those
objects. That's the harder of the two tasks.

It's actually relatively easy to find the objects that would have
to be replaced: it's all of the objects that are bound at the module
level in the module you're replacing. Since CPython uses memory
addresses as IDs, it's trivially easy to stick them in a dictionary and
compare them while scanning all the places they could possibly be
bound.
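
The id-dictionary idea can be sketched like this. It is only an illustration under stated assumptions: find_outside_references and the two demo module names are invented here, and the scan looks only at the top-level namespaces of modules listed in sys.modules. References held in containers, instance attributes, or local variables of running functions are not found.

```python
import sys
import types


def find_outside_references(module):
    """Return (holder_module, name_in_holder, name_in_module) triples."""
    # Map id(object) -> name for everything bound at the module's top level.
    # Dunder names are skipped so that values like None are not matched.
    wanted = {id(obj): name for name, obj in vars(module).items()
              if not name.startswith("__")}
    hits = []
    for holder_name, holder in list(sys.modules.items()):
        if holder is None or holder is module:
            continue
        namespace = getattr(holder, "__dict__", {})
        for name, obj in list(namespace.items()):
            if id(obj) in wanted:
                hits.append((holder_name, name, wanted[id(obj)]))
    return hits


def demo():
    """Build two in-memory modules (hypothetical names) and scan for aliases."""
    provider = types.ModuleType("provider_demo")
    consumer = types.ModuleType("consumer_demo")
    provider.helper = lambda: "v1"
    consumer.h = provider.helper          # an alias living in another module
    sys.modules["provider_demo"] = provider
    sys.modules["consumer_demo"] = consumer
    try:
        return find_outside_references(provider)
    finally:
        del sys.modules["provider_demo"], sys.modules["consumer_demo"]


if __name__ == "__main__":
    print(demo())
```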

John Roth
>
> -- Dave
>

Peter Hansen

Mar 12, 2004, 1:29:48 PM

Skip Montanaro wrote:
> Regarding
>
> >> del sys.modules['modulename']; import modulename
>
> vs.
>
> >> del modulename ; import modulename
>
> Peter sez:
>
> Peter> Just to clarify, *neither* of the above solutions does anything
> Peter> useful, as far as I know. Only reload() will really reload a
> Peter> module. Certainly the del mod/import mod approach is practically
> Peter> a no-op...
>
> Not so. del sys.modules['mod']/import mod is effectively what reload()
> does.

Oops, sorry. On reflection that seems logical, but I thought there was
a little more to reload() than just this. :-(

-Peter

John Roth

Mar 12, 2004, 1:29:33 PM

"David MacQuigg" <d...@gain.com> wrote in message

news:8ok3505b3b3g22bk8...@4ax.com...

I can certainly see wanting to do that in a debug session - or even
with an editor that's broken the way PythonWin is broken (it doesn't
reload the top level module when you say Run unless it's been updated,
so it never reloads the lower level modules even if they've been updated.)

However, I don't remember the last time I used the debugger,
it's been so long. When you use TDD to develop, you find that
the debugger becomes a (thankfully) long ago memory.

However, if you want to be able to replace a module in a
running program, I suspect you'd be much better off designing
your program to make it easy, rather than depending on the
system attempting to find all the references.

John Roth

>
> -- Dave
>

Peter Hansen

Mar 12, 2004, 1:34:36 PM

David MacQuigg wrote:

> On Fri, 12 Mar 2004 08:45:24 -0500, Peter Hansen <pe...@engcorp.com>
> wrote:
>>Reload is not broken, and certainly shouldn't be deprecated at least
>>until there's a better solution that won't suffer from reload's one
>>problem, IMHO, which is that it surprises some people by its behaviour.
>
> It's worse than just a surprise. It's a serious problem when what you
> need to do is what most people are expecting -- replace every
> reference to objects in the old module with references to the new
> objects. The problem becomes a near impossibility when those
> references are scattered throughout a multi-module program.

I don't consider this a problem with reload, I consider it a design
defect. If there's a need for such a thing, it should be designed in to
the application, and certainly one would remove the "scattering" of
objects such as these which are about to be replaced en masse.

I think many applications would be inherently broken if a programmer
thought a simple "reload" of the style you envision would work without
serious but possibly quite subtle side effects.

>>I think that when you consider Python's namespace mechanism, you can't
>>avoid the possibility of situations like the ones reload can now lead to.
>
> I don't understand. My assumption is you would normally update all
> references to the selected objects in all namespaces.

I guess we're coming at this from different viewpoints. My comments
above should probably explain why I said that. Basically, it seems to
me very unlikely there are good use cases for wanting to update the
classes behind the backs of objects regardless of where references to
them are bound. I'm open to suggestions though.

-Peter

Denis S. Otkidach

Mar 12, 2004, 10:48:35 AM

to Skip Montanaro, pytho...@python.org, Peter Hansen

On Fri, 12 Mar 2004, Skip Montanaro wrote:

SM> Given Python's current object model it would be an
SM> interesting challenge to
SM> write a "super reload" which could identify all the objects
SM> created as a
SM> side effect of importing a module and for those with
SM> multiple references,
SM> locate those other references by traversing the known object
SM> spaces, then
SM> perform the import and finally rebind the references found
SM> in the first
SM> step. I thought I saw something like this posted in a
SM> previous thread on
SM> this subject.

Is it possible? How to handle the following cases:

module.var = new_value
adict = {module.var: value}
atuple = (module.var,)

module.var = someobj = new_value # now they refer to the same
# object

module.var = 0 # 0, 1 etc. are always shared. We cannot test
someobj = 0 # where it came from

--
Denis S. Otkidach
http://www.python.ru/ [ru]
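
Two of these cases are easy to check on CPython. The sketch below uses an invented function name, rebinding_limits, and relies on CPython's caching of small integers, which is an implementation detail rather than a language guarantee. It shows that two zeros produced in unrelated places are the same object, so identity cannot say which one "came from" the module, and that a tuple slot cannot be rebound in place.

```python
def rebinding_limits():
    """Return (zeros_share_identity, tuple_slot_is_fixed)."""
    module_var = len([])       # stands in for module.var = 0
    someobj = len(())          # an unrelated 0 computed somewhere else
    zeros_share_identity = module_var is someobj   # True on CPython

    atuple = (module_var,)
    try:
        atuple[0] = 1          # a tuple slot cannot be pointed at a new object
        tuple_slot_is_fixed = False
    except TypeError:
        tuple_slot_is_fixed = True
    return zeros_share_identity, tuple_slot_is_fixed


if __name__ == "__main__":
    print(rebinding_limits())
```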

Skip Montanaro

Mar 12, 2004, 12:21:44 PM

to Denis S. Otkidach, pytho...@python.org, Peter Hansen

Denis> module.var = 0 # 0, 1 etc. are always shared. We cannot test
Denis> someobj = 0 # where it came from

Yikes... You're right. Sorry for the bumbled suggestion.

Skip

Skip Montanaro

Mar 12, 2004, 12:42:06 PM

to David MacQuigg, pytho...@python.org

>> Reload is not broken, and certainly shouldn't be deprecated at least
>> until there's a better solution that won't suffer from reload's one
>> problem, IMHO, which is that it surprises some people by its
>> behaviour.

David> It's worse than just a surprise. It's a serious problem when
David> what you need to do is what most people are expecting -- replace
David> every reference to objects in the old module with references to
David> the new objects. The problem becomes a near impossibility when
David> those references are scattered throughout a multi-module program.

This is where I think your model of how Python works has broken down.
Objects don't live within modules. References to objects do. All objects
inhabit a space not directly associated with any particular Python
namespace. If I execute

import urllib
quote = urllib.quote
reload(urllib)

quote and urllib.quote refer to different objects and compare False. (I
suppose the cmp() routine for functions could compare code objects and other
attributes which might change.) The object referred to by urllib.quote is
not bound in any other way to the urllib module other than the reference
named "quote" in the urllib module's dict. reload() simply decrements the
reference count to urllib, which decrements the reference count to its dict,
which causes it to clean up and decrement the reference count for all its
key/value pairs.

David> For the *exceptional* case where we want to pick and choose which
David> objects to update, seems like the best solution would be to add
David> some options to the reload function.

David> def reload(<module>, objects = '*'):

David> The default second argument '*' updates all references. '?'
David> prompts for each object. A list of objects updates references to
David> just those objects automatically. 'None' would replicate the
David> current behavior, updating only the name of the module itself.

If you want to implement this and all you're interested in are class,
function and method definitions and don't care about local variables I think
you can make a reasonable stab at this. I wouldn't recommend you overload
reload() with this code, at least not initially. Write a super_reload() in
Python. If/when you get that working, then rewrite it in C, then get that
working, then you're free to recommend that the builtin reload() function be
modified.

David> I don't understand. My assumption is you would normally update
David> all references to the selected objects in all namespaces.

As Denis Otkidach pointed out in response to an earlier post of mine,
objects on shared free lists can't be rebound using this mechanism. I think
it's safe initially to restrict super_reload() to classes, functions and
methods.

Skip
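
A pure-Python super_reload() along these lines might look like the sketch below. Everything here is an assumption for illustration: the names super_reload, plugin_demo and consumer_demo are invented, Python 3's importlib.reload stands in for the builtin, and only functions and classes defined in the reloaded module are considered. Only references bound at the top level of other modules in sys.modules are rebound; aliases in containers, instances, or running functions are left alone, and names that vanish from the new source are silently skipped.

```python
import importlib
import os
import sys
import tempfile
import types

_REBINDABLE = (types.FunctionType, type)   # functions and classes only


def super_reload(module, names=None):
    """Reload module, then rebind stale top-level aliases in other modules.

    names=None covers every function/class defined in the module; a list
    restricts the update to those names. Returns [(holder, attr), ...].
    """
    old = {name: obj for name, obj in vars(module).items()
           if isinstance(obj, _REBINDABLE)
           and getattr(obj, "__module__", None) == module.__name__
           and (names is None or name in names)}
    importlib.reload(module)
    # id(old object) -> new object; `old` keeps the old objects alive.
    replacements = {id(obj): getattr(module, name)
                    for name, obj in old.items() if hasattr(module, name)}
    rebound = []
    for holder_name, holder in list(sys.modules.items()):
        namespace = getattr(holder, "__dict__", None)
        if holder is module or not isinstance(namespace, dict):
            continue
        for attr, value in list(namespace.items()):
            new_obj = replacements.get(id(value))
            if new_obj is not None and new_obj is not value:
                namespace[attr] = new_obj
                rebound.append((holder_name, attr))
    return rebound


def demo():
    """Edit a throwaway module on disk and check that an alias follows it."""
    saved_flag = sys.dont_write_bytecode
    sys.dont_write_bytecode = True   # keep a stale .pyc from masking the edit
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "plugin_demo.py")
        with open(path, "w") as fh:
            fh.write("def greet():\n    return 'old'\n")
        sys.path.insert(0, tmp)
        consumer = types.ModuleType("consumer_demo")
        sys.modules["consumer_demo"] = consumer
        try:
            importlib.invalidate_caches()
            plugin = importlib.import_module("plugin_demo")
            consumer.greet = plugin.greet      # the alias plain reload() misses
            with open(path, "w") as fh:
                fh.write("def greet():\n    return 'new'\n")
            rebound = super_reload(plugin)
            return consumer.greet(), rebound
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("plugin_demo", None)
            sys.modules.pop("consumer_demo", None)
            sys.dont_write_bytecode = saved_flag


if __name__ == "__main__":
    print(demo())
```

Passing names=["greet"] would correspond to the "list of objects" option proposed earlier in the thread; the '?' prompting modes are not attempted here.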

David MacQuigg

Mar 12, 2004, 2:08:54 PM

On Fri, 12 Mar 2004 13:34:36 -0500, Peter Hansen <pe...@engcorp.com>
wrote:

>David MacQuigg wrote:
>
>> On Fri, 12 Mar 2004 08:45:24 -0500, Peter Hansen <pe...@engcorp.com>
>> wrote:
>>> Reload is not broken, and certainly shouldn't be deprecated at least
>>> until there's a better solution that won't suffer from reload's one
>>> problem, IMHO, which is that it surprises some people by its behaviour.
>>
>> It's worse than just a surprise. It's a serious problem when what you
>> need to do is what most people are expecting -- replace every
>> reference to objects in the old module with references to the new
>> objects. The problem becomes a near impossibility when those
>> references are scattered throughout a multi-module program.
>
> I don't consider this a problem with reload, I consider it a design
> defect. If there's a need for such a thing, it should be designed in to
> the application, and certainly one would remove the "scattering" of
> objects such as these which are about to be replaced en masse.

I agree, most programs should not have 'reload()' designed in, and
those that do, should be well aware of its limitations. I'm concerned
more about interactive use, specifically of programs which cannot be
conveniently restarted from the beginning. I guess I'm spoiled by HP
BASIC, where you can change the program statements while the program
is running! (half wink)

>>>I think that when you consider Python's namespace mechanism, you can't
>>>avoid the possibility of situations like the ones reload can now lead to.
>>
>> I don't understand. My assumption is you would normally update all
>> references to the selected objects in all namespaces.
>
>I guess we're coming at this from different viewpoints. My comments
>above should probably explain why I said that. Basically, it seems to
>me very unlikely there are good use cases for wanting to update the
>classes behind the backs of objects regardless of where references to
>them are bound. I'm open to suggestions though.

Objects derived from classes are a different, and probably unsolvable
problem. Attempting to update those would be like trying to put the
program in the state it would have been, had the module changes been
done some time in the past. We would have to remember the values at
the time the object was created of all variables that went into the
__init__ call. Classes don't have this problem, and they should be
updatable.

Here is a use-case for classes. I've got hundreds of variables in a
huge hierarchy of "statefiles". In my program, that hierarchy is
handled as a hierarchy of classes. If I want to access a particular
variable, I say something like:
wavescan.window1.plot2.xaxis.label.font.size = 12
These classes have no methods, just names and values and other
classes.

If I reload a module that changes some of those variables, I would
like to not have to hunt down every reference in the running program
and change it manually.

-- Dave

David MacQuigg

Mar 12, 2004, 2:35:10 PM

On Fri, 12 Mar 2004 18:48:35 +0300 (MSK), "Denis S. Otkidach"
<o...@strana.ru> wrote:

>Is it possible? How to handle the following cases:
>
>module.var = new_value
>adict = {module.var: value}
>atuple = (module.var,)
>
>module.var = someobj = new_value # now they refer to the same
> # object
>
>module.var = 0 # 0, 1 etc. are always shared. We cannot test
>someobj = 0 # where it came from

I'm not sure what you are intending with this code, but I'll assume
'new_value' is the object to be updated with a reload. 'adict' and
'atuple' should retain their references to the old object, since
dictionary keys and tuples are immutable. 'module.var' and 'someobj'
no longer point to 'new_value', so they can be left as is.

If the reload were done *before* the last two lines, then 'module.var'
and 'someobj' should be changed to point to the new 'new_value'.

-- Dave

David MacQuigg

Mar 12, 2004, 2:54:15 PM

On Fri, 12 Mar 2004 11:42:06 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> David> It's worse than just a surprise. It's a serious problem when

> David> what you need to do is what most people are expecting -- replace
> David> every reference to objects in the old module with references to
> David> the new objects. The problem becomes a near impossibility when
> David> those references are scattered throughout a multi-module program.
>
> This is where I think your model of how Python works has broken down.
> Objects don't live within modules. References to objects do. All objects
> inhabit a space not directly associated with any particular Python
> namespace. If I execute
>
> import urllib
> quote = urllib.quote
> reload(urllib)
>
>quote and urllib.quote refer to different objects and compare False. (I
>suppose the cmp() routine for functions could compare code objects and other

[ snip ]

Understood. When I said "objects in the old module", I should have
said "objects from the old module". I wasn't making any assumption
about where these objects reside once loaded. I'm still assuming it
is possible ( even if difficult ) to locate and change all current
references to these objects. This may require a special "debug" mode
to keep track of this information.

Another "brute force" kind of solution would be to replace the old
objects with links to the new. Every reference, no matter where it came
from, would be re-routed. The inefficiency would only last until you
restart the program.

-- Dave

Paul Miller

Mar 12, 2004, 2:21:41 PM

to pytho...@python.org

> >> It's worse than just a surprise. It's a serious problem when what you
> >> need to do is what most people are expecting -- replace every
> >> reference to objects in the old module with references to the new
> >> objects. The problem becomes a near impossibility when those
> >> references are scattered throughout a multi-module program.

...

> > I don't consider this a problem with reload, I consider it a design
> > defect. If there's a need for such a thing, it should be designed in to

...

>I agree, most programs should not have 'reload()' designed in, and
>those that do, should be well aware of its limitations. I'm concerned

I've been working around the problem with reload by loading my modules into
self-contained interpreters, using the multiple interpreter API. This all
worked wonderfully until Python 2.3, where it all sort of broke.

I would be able to do it all with one interpreter, *if* reload did "what I
expect".

Clearly, there needs to be SOME solution to this problem, as I'm definitely
not the only person trying to do this.

(the context comes from wanting to use Python modules as "plugins" to a C++
application - and to aid in development, be able to reload plugins on the
fly if their code is changed).

Skip Montanaro

Mar 12, 2004, 4:17:54 PM

to David MacQuigg, pytho...@python.org

Dave> Another "brute force" kind of solution would be to replace the old
Dave> objects with links to the new. Every reference, no matter where it
Dave> came from, would be re-routed. The inefficiency would only last
Dave> until you restart the program.

That would require that you be able to transmogrify an object into a proxy
of some sort without changing its address. In theory I think this could be
done, but not in pure Python. It would require a helper written in C.

Skip
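
For plain Python functions there is a narrower trick that keeps the object's address without any C helper: overwrite the old function's code in place. This is not the general proxy described above, just a sketch of a related idea in Python 3 spelling. The helper name patch_function_in_place and the two sample functions are invented for illustration, and assigning __code__ only works when both functions have the same number of closure variables.

```python
def patch_function_in_place(old_func, new_func):
    """Make old_func behave like new_func while keeping old_func's identity."""
    old_func.__code__ = new_func.__code__            # the compiled body
    old_func.__defaults__ = new_func.__defaults__    # positional defaults
    old_func.__kwdefaults__ = new_func.__kwdefaults__
    old_func.__doc__ = new_func.__doc__
    old_func.__dict__.update(new_func.__dict__)      # function attributes
    return old_func


def scale(x):
    """Old version: doubles its argument."""
    return x * 2


def scale_fixed(x, factor=3):
    """New version: multiplies by a configurable factor."""
    return x * factor


def demo():
    alias = scale                        # a reference held somewhere else
    before = alias(5)                    # runs the old body
    patch_function_in_place(scale, scale_fixed)
    after = alias(5)                     # same object, new body
    return before, after, alias is scale


if __name__ == "__main__":
    print(demo())
```

Every existing reference sees the new behaviour because the function object itself never changes. Classes and instances would need separate handling.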

David MacQuigg

Mar 12, 2004, 8:08:52 PM

On Fri, 12 Mar 2004 11:42:06 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> >> Reload is not broken, and certainly shouldn't be deprecated at least

> >> until there's a better solution that won't suffer from reload's one
> >> problem, IMHO, which is that it surprises some people by its
> >> behaviour.

I've written a short description of what reload() does to try and help
reduce the confusion. This is intended for EEs who are new to Python.
Please see
http://ece.arizona.edu/~edatools/Python/Reload.htm

I've also started a new thread to discuss this. See "Reload()
Confusion". Comments are welcome.

-- Dave

Skip Montanaro

Mar 12, 2004, 10:56:24 PM

to David MacQuigg, pytho...@python.org

Just an FYI, I didn't write this statement:

Dave> On Fri, 12 Mar 2004 11:42:06 -0600, Skip Montanaro <sk...@pobox.com>
Dave> wrote:

>> >> Reload is not broken, and certainly shouldn't be deprecated at least
>> >> until there's a better solution that won't suffer from reload's one
>> >> problem, IMHO, which is that it surprises some people by its
>> >> behaviour.

Dave> I've written a short description of what reload() does to try and
Dave> help reduce the confusion. This is intended for EEs who are new
Dave> to Python.

I'm not sure why you're planning to teach them reload(). I've used it
rarely in about ten years of Python programming. Its basic semantics are
straightforward, but as we've seen from the discussions in this thread
things can go subtly awry. Just tell people to either not create references
which refer to globals in other modules (e.g. "quote = urllib.quote") if
they intend to use reload() or tell them to just exit and restart their
application, at least until they understand the limitations of trying to
modify a running Python program.

Skip

David MacQuigg

Mar 13, 2004, 10:12:07 AM

On Fri, 12 Mar 2004 21:56:24 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> Dave> I've written a short description of what reload() does to try and
> Dave> help reduce the confusion. This is intended for EEs who are new
> Dave> to Python.
>
>I'm not sure why you're planning to teach them reload(). I've used it
>rarely in about ten years of Python programming. Its basic semantics are
>straightforward, but as we've seen from the discussions in this thread
>things can go subtly awry. Just tell people to either not create references
>which refer to globals in other modules (e.g. "quote = urllib.quote") if
>they intend to use reload() or tell them to just exit and restart their
>application, at least until they understand the limitations of trying to
>modify a running Python program.

I don't think we can avoid reload(). A typical design session has
several tools running, and it is a real pain to restart. Design
engineers often leave sessions open for several days.

What I will try to do is write the modules that are likely to be
reloaded in a way that minimizes the problems, accessing objects in
those modules *only* via their fully-qualified names, etc.

Again, these are interactive sessions. I don't think I will need
reload() as part of the code.

-- Dave

David MacQuigg

Mar 13, 2004, 10:19:26 AM

to

On Fri, 12 Mar 2004 15:17:54 -0600, Skip Montanaro <sk...@pobox.com>
wrote:
>

How about if we could just show the reference counts on all of the
reloaded objects? That way we could know if we've missed one in our
manual search and update. Could avoid the need for transmogrification
of objects. :>)

-- Dave

Skip Montanaro

Mar 13, 2004, 11:30:04 AM

to David MacQuigg, pytho...@python.org

Dave> How about if we could just show the reference counts on all of the
Dave> reloaded objects?

That wouldn't work for immutable objects which can be shared. Ints come to
mind, but short strings are interned, some tuples are shared, maybe some
floats, and of course None, True and False are. You will have to define a
subset of object types to display.
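
For anyone who wants to see why raw reference counts are a poor signal here, a small sketch, assuming CPython and Python 3. The exact numbers vary by version and build, so none are shown; the variable names are only for illustration.

```python
import sys

# A brand-new object that nothing else in the interpreter refers to.
fresh = object()

# Counts for objects that CPython shares across the whole interpreter.
fresh_count = sys.getrefcount(fresh)
zero_count = sys.getrefcount(0)
none_count = sys.getrefcount(None)

print("fresh object:", fresh_count)
print("int 0:       ", zero_count)
print("None:        ", none_count)
```

The shared objects report far more references than the fresh one, and almost none of those references have anything to do with the module being reloaded.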

Skip

David MacQuigg

Mar 13, 2004, 1:27:25 PM

to

On Sat, 13 Mar 2004 10:30:04 -0600, Skip Montanaro <sk...@pobox.com>
wrote:
>

Just to make sure I understand this, I think what you are saying is
that if I have a module M1 that defines a value x = 3.1, it will be
impossible to keep track of the number of references to M1.x because
the object '3.1' may have other references to it from other modules
which use the same constant 3.1. This really does make it impossible
to do a complete reload.

I'm not sure at this point if an improved reload() is worth pursuing,
but perhaps we could do something with a "debug" mode in which the
minuscule benefit of creating these shared references is bypassed, at
least for the modules that are in "debug mode". Then it would be
possible to check after each reload and pop a warning:

>>> reload(M1)
<module 'M1' from 'M1.pyc'>
Warning: References to objects in the old module still exist:
M1.x (3)
M1.y (2)
>>>

Even if we don't update all the references after a reload, it would
sure be nice to have a warning like this. We could then avoid
creating direct (not fully-qualified) references to objects within any
module that is likely to be reloaded, and be assured that we will get
a warning if we miss one.

-- Dave

Skip Montanaro

Mar 13, 2004, 3:27:00 PM

to David MacQuigg, pytho...@python.org

David> I'm not sure at this point if an improved reload() is worth
David> pursuing, ...

I wrote something and threw it up on my Python Bits page:

http://www.musi-cal.com/~skip/python/

See if it suits your needs.

Skip

Terry Reedy

Mar 14, 2004, 2:51:13 AM

to pytho...@python.org

"David MacQuigg" <d...@gain.com> wrote in message
news:6kk6501shlve52ds2...@4ax.com...

> Just to make sure I understand this, I think what you are saying is
> that if I have a module M1 that defines a value x = 3.1, it will be
> impossible to keep track of the number of references to M1.x because
> the object '3.1' may have other references to it from other modules
> which use the same constant 3.1. This really does make it impossible
> to do a complete reload.

Currently, this is possible but not actual for floats, but it is actual, in
CPython, for some ints and strings. For a fresh 2.2.1 interpreter

>>> sys.getrefcount(0)
52
>>> sys.getrefcount(1)
50
>>> sys.getrefcount('a')
7

> Warning: References to objects in the old module still exist:

> creating direct (not fully-qualified) references to objects within any

> module that is likely to be reloaded, and be assured that we will get
> a warning if we miss one.

A module is a namespace created by external code, resulting in a namespace
with a few special attributes like __file__, __name__, and __doc__. A
namespace contains names, or if you will, name bindings. It does not,
properly speaking, contain objects -- which are in a separate, anonymous
data space. One can say, reasonably, that functions and classes defined in
a module 'belong' to that module, and one could, potentially, track down
and replace all references to such.

As you have already noticed, you can make this easier by always accessing
the functions and classes via the module (mod.fun(), mod.clas(), etc.) --
which means no anonymous references via tuple, list, or dict slots, etc.

However, there is still the problem of instances and their __class__
attribute. One could, I believe (without trying it) give each class in a
module an __instances__ list that is updated by each call to __init__.
Then super_reload() could grab the instances lists, do a normal reload, and
then update the __instances__ attributes of the reloaded classes and the
__class__ attributes of the instances on the lists. In other words,
manually rebind instances to new classes and vice versa.

Another possibility (also untried) might be to reimport the module as
'temp' (for instance) and then manually replace, in the original module,
each of the function objects and other constants and manually update each
of the class objects. Then instance __class__ attributes would remain
valid.
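
The second possibility could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: patch_module_in_place is a hypothetical helper, the two module "versions" are built in memory with types.ModuleType and exec() instead of a real import, and only methods and plain attributes of classes are copied.

```python
import types

def patch_module_in_place(old_mod, new_mod):
    """Copy definitions from new_mod into old_mod, keeping old class objects."""
    for name, new_obj in list(vars(new_mod).items()):
        if name.startswith("__"):
            continue  # leave module metadata such as __name__ alone
        old_obj = vars(old_mod).get(name)
        if isinstance(old_obj, type) and isinstance(new_obj, type):
            # Update the existing class object so instances keep a valid __class__.
            for attr, value in list(vars(new_obj).items()):
                if attr.startswith("__") and not isinstance(value, types.FunctionType):
                    continue  # skip class metadata, but do copy methods like __init__
                setattr(old_obj, attr, value)
        else:
            # Functions and constants are simply rebound in the original module.
            setattr(old_mod, name, new_obj)

# Demo with two in-memory "versions" of the same module.
old_mod = types.ModuleType("demo")
exec("CONST = 1\nclass C:\n    def hello(self):\n        return 'old'\n", old_mod.__dict__)
inst = old_mod.C()

new_mod = types.ModuleType("temp")
exec("CONST = 2\nclass C:\n    def hello(self):\n        return 'new'\n", new_mod.__dict__)

patch_module_in_place(old_mod, new_mod)
print(inst.hello(), old_mod.CONST, type(inst) is old_mod.C)
```

Two limits of this sketch are worth noting: methods that the new version dropped stay on the old class, and the copied functions still look up their globals in the 'temp' module's namespace, not in the original module's.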

Either method is obviously restricted to modules given special treatment
and planning. Either update process also needs to be 'atomic' from the
viewpoint of Python code. A switch to another Python thread that accesses
the module in the middle of the process would not be good. There might
also be other dependencies I am forgetting, but the above should be a
start.

Terry J. Reedy

David MacQuigg

Mar 14, 2004, 7:14:20 AM

to

On Sat, 13 Mar 2004 14:27:00 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> David> I'm not sure at this point if an improved reload() is worth
> David> pursuing, ...
>
>I wrote something and threw it up on my Python Bits page:
>
> http://www.musi-cal.com/~skip/python/

I get AttributeErrors when I try the super_reload function. Looks like
sys.modules has a bunch of items with no '__dict__'.

I'll work with Skip via email.

-- Dave

Skip Montanaro

Mar 14, 2004, 11:15:16 AM

to David MacQuigg, pytho...@python.org

>> I wrote something and threw it up on my Python Bits page:
>>
>> http://www.musi-cal.com/~skip/python/

Dave> I get AttributeErrors when I try the super_reload function. Looks
Dave> like sys.modules has a bunch of items with no '__dict__'.

You can put objects in sys.modules which are not module objects. I updated
the code to use getattr() and setattr() during the rebinding step. I think
that will help, though of course this entire exercise is obviously only an
approximation to a solution.
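
A rough sketch of the kind of scan this implies, assuming Python 3. find_bindings is a hypothetical helper written for this note, not the code from the page above; it uses getattr() with a default so that entries without a '__dict__' (None, for instance) are skipped.

```python
import sys
import types

def find_bindings(target):
    """Return (module_name, attribute_name) pairs bound directly to target."""
    hits = []
    # Copy the items first: sys.modules can change while we look at it.
    for mod_name, mod in list(sys.modules.items()):
        namespace = getattr(mod, "__dict__", None)
        if namespace is None:
            continue  # e.g. a None placeholder, which has no __dict__
        for attr, value in list(namespace.items()):
            if value is target:
                hits.append((mod_name, attr))
    return hits

# Demo: one placeholder entry and one module holding a direct reference.
def old_function():
    return "old"

sys.modules["placeholder_demo"] = None
holder = types.ModuleType("holder_demo")
holder.func = old_function
sys.modules["holder_demo"] = holder

hits = find_bindings(old_function)
print(hits)

# Remove the demo entries again.
del sys.modules["placeholder_demo"]
del sys.modules["holder_demo"]
```

This only finds names bound at module level. References held in lists, dicts, instances, or closures are not seen, which is part of why the whole exercise stays an approximation.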

Skip

David MacQuigg

Mar 14, 2004, 4:51:43 PM

to

On Sun, 14 Mar 2004 02:51:13 -0500, "Terry Reedy" <tjr...@udel.edu>
wrote:

>
>"David MacQuigg" <d...@gain.com> wrote in message
>news:6kk6501shlve52ds2...@4ax.com...
>> Just to make sure I understand this, I think what you are saying is
>> that if I have a module M1 that defines a value x = 3.1, it will be
>> impossible to keep track of the number of references to M1.x because
>> the object '3.1' may have other references to it from other modules
>> which use the same constant 3.1. This really does make it impossible
>> to do a complete reload.
>
>Currently, this is possible but not actual for floats, but it is actual, in
>CPython, for some ints and strings. For a fresh 2.2.1 interpreter
>
>>>> sys.getrefcount(0)
>52
>>>> sys.getrefcount(1)
>50
>>>> sys.getrefcount('a')
>7

I'm amazed how many of these shared references there are.

[snip]

>
>However, there is still the problem of instances and their __class__
>attribute. One could, I believe (without trying it) give each class in a
>module an __instances__ list that is updated by each call to __init__.
>Then super_reload() could grab the instances lists, do a normal reload, and
>then update the __instances__ attributes of the reloaded classes and the
>__class__ attributes of the instances on the lists. In other words,
>manually rebind instances to new classes and vice versa.

We need to draw a clean line between what gets updated and what
doesn't. I would not update instances, because in general, that will
be impossible. Here is a section from my update on Reload Basics
(http://ece.arizona.edu/~edatools/Python/Reload.htm). I need to provide
my students with a clear explanation, hopefully with sensible
motivation, for what gets updated and what does not. Comments are
welcome.

Background on Reload
Users often ask why reload doesn't just "do what we expect" and update
everything. The fundamental problem is that the current state of
objects in a running program can be dependent on the conditions which
existed when the object was created, and those conditions may have
changed. Say you have in your reloaded module:

class C1:
    def __init__(self, x, y):
        ...

Say you have an object x1 created from an earlier version of class C1.
The current state of x1 depends on the values of x and y at the time
x1 was created. Asking reload to "do what we expect" in this case, is
asking to put the object x1 into the state it would be now, had we
made the changes in C1 earlier.

If you are designing a multi-module program, *and* users may need to
reload certain modules, *and* re-starting everything may be
impractical, then you should avoid any direct references to objects
within the modules to be reloaded. Direct references are created by
statements like 'x = M1.x' or 'from M1 import x'. Always access these
variables via the fully-qualified names, like M1.x, and you will avoid
leftover references to old objects after a reload. This won't solve
the object creation problem, but at least it will avoid some surprises
when you re-use the variable x.

--- end of section ---
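
To make the leftover-reference problem concrete, here is a minimal sketch, assuming Python 3, where reload() is importlib.reload() and no longer a built-in. The module name m1_demo and the temporary directory are made up for the demonstration, and bytecode caching is switched off so that the quick rewrite of the file is always seen.

```python
import importlib
import os
import sys
import tempfile

# Skip bytecode caching so a quick rewrite of the source is always picked up.
sys.dont_write_bytecode = True

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "m1_demo.py")   # hypothetical module

def write_module(value):
    with open(path, "w") as f:
        f.write("x = %r\n" % value)

write_module(1)
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import m1_demo

x = m1_demo.x            # direct reference, same effect as 'from m1_demo import x'

write_module(2)          # "edit" the module on disk
importlib.reload(m1_demo)

print(m1_demo.x)         # the fully-qualified name sees the new value
print(x)                 # the direct reference still names the old object
```

The fully-qualified lookup goes through the module object each time, so it follows the reload; the plain name x was bound once and is never touched again.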

I *would* like to do something about numbers and strings and other
shared objects not getting updated, because that is going to be hard
to explain. Maybe we could somehow switch off the generation of
shared objects for modules in a 'debug' mode.

-- Dave

John Roth

Mar 14, 2004, 7:49:08 PM

to

"David MacQuigg" <d...@gain.com> wrote in message
news:rhj950ts4fbrbfadp...@4ax.com...

>
> I *would* like to do something about numbers and strings and other
> shared objects not getting updated, because that is going to be hard
> to explain. Maybe we could somehow switch off the generation of
> shared objects for modules in a 'debug' mode.

It doesn't matter if numbers and strings get updated. They're
immutable objects, so one copy of a number is as good as
another. In fact, that poses a bit of a problem since quite
a few of them are singletons. There's only one object that
is an integer 1 in the system, so if the new version changes
it to, say 2, and you go around and rebind all references to
1 to become references to 2, you might have a real mess
on your hands.

On the other hand, if you don't rebind the ones that came out
of the original version of the module, you've got a different
mess on your hands.

John Roth
>
> -- Dave
>

Greg Ewing (using news.cis.dfn.de)

Mar 14, 2004, 8:19:58 PM

to

Skip Montanaro wrote:
> Not so. del sys.modules['mod']/import mod is effectively what reload()
> does.

Not quite -- reload() keeps the existing module object and changes
its contents, whereas the above sequence creates a new module
object.

The difference will be apparent if any other modules have done
'import mod' before the reload.
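
A small sketch of that difference, assuming Python 3 (importlib.reload() in place of the old built-in). The module name mod_demo and the temporary directory are hypothetical, and 'held' stands in for the reference another module would keep after its own 'import mod_demo'.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "mod_demo.py"), "w") as f:
    f.write("value = 1\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import mod_demo
held = mod_demo                        # what another importer would be holding

reloaded = importlib.reload(mod_demo)
same_after_reload = reloaded is held   # reload() re-executes into the same object

del sys.modules["mod_demo"]
import mod_demo                        # builds a brand-new module object
same_after_reimport = mod_demo is held

print(same_after_reload, same_after_reimport)
```

After the del/import sequence, 'held' still points at the old module object, so any code using it keeps seeing the old definitions.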

--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg

Hung Jung Lu

Mar 15, 2004, 2:01:09 AM

to

> >> On Fri, 12 Mar 2004 08:45:24 -0500, Peter Hansen <pe...@engcorp.com>
> >> wrote:
> >>
> >> It's worse than just a surprise. It's a serious problem when what you
> >> need to do is what most people are expecting -- replace every
> >> reference to objects in the old module with references to the new
> >> objects. The problem becomes a near impossibility when those
> >> references are scattered throughout a multi-module program.

You could use a class instead of a module. I have done that kind of
thing with classes and weakrefs. By the way, it kind of surprises me
that no one has mentioned weakref in this thread. It's not too hard to
keep a list of weakrefs, and every time an object is created, you
register it with that list. Now, when the new class comes in (e.g. via
reload()), you get the list of weakrefs of the existing objects, and
re-assign their __class__, and voila, dynamic change of class
behavior. Of course, if you spend some time and push this feature into
the metaclass, everything becomes even easier.
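
The weakref idea can be sketched like this. It is a minimal illustration, assuming plain classes without __slots__; the registry, the class names, and rebind_instances are all hypothetical, and the "new" class is simply defined inline where a real session would get it from a reloaded module.

```python
import weakref

# Hypothetical registry: weak references do not keep dead instances alive.
_instances = weakref.WeakSet()

class Tracked:
    def __init__(self, name):
        self.name = name
        _instances.add(self)   # register every new instance

    def greet(self):
        return "old hello from " + self.name

a = Tracked("a")
b = Tracked("b")

# Stand-in for the class object a reloaded module would provide.
class TrackedV2:
    def __init__(self, name):
        self.name = name
        _instances.add(self)

    def greet(self):
        return "new hello from " + self.name

def rebind_instances(new_cls):
    """Point every live registered instance at the new class."""
    for obj in list(_instances):
        obj.__class__ = new_cls

rebind_instances(TrackedV2)
print(a.greet())
print(b.greet())
```

Only behavior changes here. The attributes each instance already carries are left exactly as the old __init__ set them.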

But it is true that in Python you have to implement dynamic refreshing
of behavior (module or class) explicitly, whereas in Ruby, as I
understand, class behavior refreshing is automatic.

David MacQuigg <d...@gain.com> wrote in message news:<ba1450tvgdttth3dd...@4ax.com>...

>
> I agree, most programs should not have 'reload()' designed in, and
> those that do, should be well aware of its limitations. I'm concerned
> more about interactive use, specifically of programs which cannot be
> conveniently restarted from the beginning. I guess I'm spoiled by HP
> BASIC, where you can change the program statements while the program
> is running! (half wink)

Edit-and-continue. Which is kind of important. For instance, I often
have to load in tons of data from the database, do some initial
processing, and then do the actual calculations. Or in game
programming, where you have to load up a lot of things, play quite a
few initial steps, before you arrive at the point of your interest.
Now, in these kinds of programs, where the initial state preparation
takes a long time, you really would like some "edit-and-continue"
feature while developing the program.

For GUI programs, edit-and-continue is also very helpful during
development.

And for Web applications, actually most CGI-like programs (EJB in Java
jargon, or external methods in Zope) are all reloadable while the
web/app server is running. Very often you can do open-heart surgery on
web/app servers, while the website is running live.

> Here is a use-case for classes. I've got hundreds of variables in a
> huge hierarchy of "statefiles". In my program, that hierarchy is
> handled as a hierarchy of classes. If I want to access a particular
> variable, I say something like:
> wavescan.window1.plot2.xaxis.label.font.size = 12
> These classes have no methods, just names and values and other
> classes.

"State file" reminds me of a programming paradigm based on REQUEST and
RESPONSE. (Sometimes REQUEST alone.) Basically, your program's
information is stored in a single "workspace" object. The advantage of
this approach is: (1) All function/method calls at higher level have
unique "header", something like f(REQUEST, RESPONSE), or f(REQUEST),
and you will never need to worry about header changes. (2) The REQUEST
and/or RESPONSE object could be serialized and stored on disk, or
passed via remote calls to other computers. Since they can be
serialized, you can also intercept/modify the content and do unit
testing. This is very important in programs that take long time to
build up initial states. Basically, once you are able to serialize and
cache the state on disk (and even modify the states offline), then you
can unit test various parts of your program WITHOUT having to start
from scratch. Some people use XML for serialization to make state
modification even easier, but any other serialization format is just
as fine. This approach is also good when you want/need some
parallel/distributed computing in the future, since the serialized
states could potentially be dispatched independently.
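
A minimal sketch of the serialize-and-cache idea, assuming the workspace is a plain dict of JSON-friendly values. The names workspace, save_state and load_state are hypothetical, and JSON is just one possible format.

```python
import json
import os
import tempfile

def save_state(state, path):
    """Write the workspace to disk in a form that can also be edited by hand."""
    with open(path, "w") as f:
        json.dump(state, f, indent=2, sort_keys=True)

def load_state(path):
    """Read a previously saved workspace back into a dict."""
    with open(path) as f:
        return json.load(f)

# Pretend this took a long time to build up.
workspace = {"request": {"rows": [1, 2, 3], "scale": 2.5}, "response": {}}
state_path = os.path.join(tempfile.mkdtemp(), "workspace.json")
save_state(workspace, state_path)

# Later, or in a unit test: start from the cached state instead of rebuilding it.
restored = load_state(state_path)
rows = restored["request"]["rows"]
restored["response"]["total"] = sum(rows) * restored["request"]["scale"]
print(restored["response"]["total"])
```

With the state on disk, the code under development can be restarted from scratch as often as needed, which sidesteps reload() for this kind of work.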

Today's file access is so fast that disk operations are often
underutilized. In heavy numerical crunching, having a "workspace"
serialization can make development and debugging a lot less painful.

> If I reload a module that changes some of those variables, I would
> like to not have to hunt down every reference in the running program
> and change it manually.

In Python the appropriate tool is weakref. For each class that
matters, keep a weakref list of the instances. This way you can
automate the refreshing. I've done that before.

regards,

Hung Jung

Skip Montanaro

Mar 15, 2004, 6:49:58 AM

to David MacQuigg, pytho...@python.org

Dave> Maybe we could somehow switch off the generation of shared objects
Dave> for modules in a 'debug' mode.

You'd have to disable the integer free list. There's also code in
tupleobject.c to recognize and share the empty tuple. String interning
could be disabled as well. Everybody's ignored the gorilla in the room:

>>> sys.getrefcount(None)
1559

In general, I don't think that disabling immutable object sharing would be
worth the effort. Consider the meaning of module level integers. In my
experience they are generally constants and are infrequently changed once
set. Probably the only thing worth tracking down during a super reload
would be function, class and method definitions.

Skip

Skip Montanaro

Mar 15, 2004, 6:57:19 AM

to Hung Jung Lu, pytho...@python.org

Hung Jung> But it is true that in Python you have to implement dynamic
Hung Jung> refreshing of behavior (module or class) explicitly, whereas
Hung Jung> in Ruby, as I understand, class behavior refreshing is
Hung Jung> automatic.

That has its own attendant set of problems. If an instance's state is
created with an old version of a class definition, then updated later to
refer to a new version, who's to say that the current state of the instance
is what you would have obtained had the instance been created using the new
class from the start?

Skip

Michael Hudson

Mar 15, 2004, 7:45:13 AM

to

Skip Montanaro <sk...@pobox.com> writes:

> Not so. del sys.modules['mod']/import mod is effectively what
> reload() does.

It's more like 'exec mod.__file__[:-1] in mod.__dict__', actually.
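
A rough Python 3 analogue of that description, as a sketch only: the module is created in memory with types.ModuleType, and run_source is a hypothetical stand-in for re-executing the module's source file. It is not what reload() literally does internally, but it shows the two visible effects: the module's namespace dict is reused, and names the new source no longer defines are left in place.

```python
import types

mod = types.ModuleType("demo_mod")
namespace_before = mod.__dict__

def run_source(module, source):
    """Execute source text inside the module's existing namespace."""
    code = compile(source, "<demo_mod>", "exec")
    exec(code, module.__dict__)

# First "version" of the module.
run_source(mod, "def f():\n    return 'old'\nleftover = 1\n")
old_f = mod.f                      # a direct reference taken before the re-run

# Second "version": f is redefined, leftover is no longer mentioned.
run_source(mod, "def f():\n    return 'new'\n")

print(mod.f(), old_f(), mod.leftover, mod.__dict__ is namespace_before)
```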

Cheers,
mwh

--
I don't have any special knowledge of all this. In fact, I made all
the above up, in the hope that it corresponds to reality.
-- Mark Carroll, ucam.chat

Michael Hudson

Mar 15, 2004, 7:46:49 AM

to

David MacQuigg <d...@gain.com> writes:

They'll be None, mostly.

Cheers,
mwh

--
C++ is a siren song. It *looks* like a HLL in which you ought to
be able to write an application, but it really isn't.
-- Alain Picard, comp.lang.lisp

Skip Montanaro

Mar 15, 2004, 8:40:12 AM

to Michael Hudson, pytho...@python.org

>> >I wrote something and threw it up on my Python Bits page:
>> >
>> > http://www.musi-cal.com/~skip/python/
>>
>> I get AttributeErrors when I try the super_reload function. Looks
>> like sys.modules has a bunch of items with no '__dict__'.

Michael> They'll be None, mostly.

What's the significance of an entry in sys.modules with a value of None?
That is, how did they get there and why are they there?

Skip

Michael Hudson

Mar 15, 2004, 9:40:16 AM

to

Skip Montanaro <sk...@pobox.com> writes:

Something to do with packages and things that could have been but
weren't relative imports, I think...

>>> from distutils.core import setup
>>> import sys
>>> for k,v in sys.modules.items():
...     if v is None:
...         print k
...
distutils.distutils
distutils.getopt
encodings.encodings
distutils.warnings
distutils.string
encodings.codecs
encodings.exceptions
distutils.types
encodings.types
distutils.os
distutils.re
distutils.sys
distutils.copy

Cheers,
mwh

--
It's actually a corruption of "starling". They used to be carried.
Since they weighed a full pound (hence the name), they had to be
carried by two starlings in tandem, with a line between them.
-- Alan J Rosenthal explains "Pounds Sterling" on asr

David MacQuigg

Mar 15, 2004, 11:27:12 AM

to

On Sun, 14 Mar 2004 19:49:08 -0500, "John Roth"
<newsg...@jhrothjr.com> wrote:

>"David MacQuigg" <d...@gain.com> wrote in message
>news:rhj950ts4fbrbfadp...@4ax.com...
>>
>> I *would* like to do something about numbers and strings and other
>> shared objects not getting updated, because that is going to be hard
>> to explain. Maybe we could somehow switch off the generation of
>> shared objects for modules in a 'debug' mode.
>
>It doesn't matter if numbers and strings get updated. They're
>immutable objects, so one copy of a number is as good as
>another. In fact, that poses a bit of a problem since quite
>a few of them are singletons. There's only one object that
>is an integer 1 in the system, so if the new version changes
>it to, say 2, and you go around and rebind all references to
>1 to become references to 2, you might have a real mess
>on your hands.

The immutability of numbers and strings is referring only to what you
can do via executable statements. If you use a text editor on the
original source code, clearly you can change any "immutable".

You do raise a good point, however, about the need to avoid changing
*all* references to a shared object. The ones that need to change are
those that were created via a reference to an earlier version of the
reloaded module.

>On the other hand, if you don't rebind the ones that came out
>of the original version of the module, you've got a different
>mess on your hands.

True.

-- Dave

David MacQuigg

Mar 15, 2004, 12:15:33 PM

to

On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> Dave> Maybe we could somehow switch off the generation of shared objects
> Dave> for modules in a 'debug' mode.
>
>You'd have to disable the integer free list. There's also code in
>tupleobject.c to recognize and share the empty tuple. String interning
>could be disabled as well. Everybody's ignored the gorilla in the room:
>
> >>> sys.getrefcount(None)
> 1559

Implementation detail. ( half wink )

>In general, I don't think that disabling immutable object sharing would be
>worth the effort. Consider the meaning of module level integers. In my
>experience they are generally constants and are infrequently changed once
>set. Probably the only thing worth tracking down during a super reload
>would be function, class and method definitions.

If you reload a module M1, and it has an attribute M1.x, which was
changed from '1' to '2', we want to change also any references that
may have been created with statements like 'x = M1.x', or 'from M1
import *'. If we don't do this, reload() will continue to baffle and
frustrate new users. Typically, they think they have just one
variable 'x'.

It's interesting to see how Ruby handles this problem (see
http://userlinux.com/cgi-bin/wiki.pl?RubyPython). I'm no expert on
Ruby, but it is my understanding that there *are* no types which are
implicitly immutable (no need for tuples vs lists, etc.). If you
*want* to make an object (any object) immutable, you do that
explicitly with a freeze() function.

I'm having trouble understanding the benefit of using shared objects
for simple numbers and strings. Maybe you can save a significant
amount of memory by having all the *system* modules share a common
'None' object, but when a user explicitly says 'M1.x = None', surely
we can afford a few bytes to provide a special None for that
reference. The benefit is that when you change None to 'something' by
editing and reloading M1, all references that were created via a
reference to M1.x will change automatically.

We should at least have a special 'debug' mode in which the hidden
sharing of objects is disabled for selected modules. You can always
explicitly share an object by simply referencing it, rather than
typing in a fresh copy.

x = "Here is a long string I want to share."
y = x
z = "Here is a long string I want to share."

In any mode, x and y will be the same object. In debug mode, we
allocate a little extra memory to make z a separate object from x, as
the user apparently intended.

If we do the updates for just certain types of objects, we will have a
non-intuitive set of rules that will be difficult for users to
understand. I would like to make things really simple and say:
"""
If you have a direct reference to an object in a reloaded module, that
reference will be updated. If the reference is created by some other
process (e.g. copying a string, or instantiation of a new object based
on a class in the reloaded module) then that reference will not be
updated. Only references to objects from the old module are updated.
The old objects are then garbage collected.
"""

We may have to pay a price in implementation cost and a little extra
storage to make things simple for the user.

-- Dave

John Roth

Mar 15, 2004, 1:43:07 PM

to

"David MacQuigg" <d...@gain.com> wrote in message
news:tuob50liqn5mcrbvh...@4ax.com...

> On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <sk...@pobox.com>
> wrote:
>

>
> I'm having trouble understanding the benefit of using shared objects
> for simple numbers and strings. Maybe you can save a significant
> amount of memory by having all the *system* modules share a common
> 'None' object, but when a user explicitly says 'M1.x = None', surely
> we can afford a few bytes to provide a special None for that
> reference. The benefit is that when you change None to 'something' by
> editing and reloading M1, all references that were created via a
> reference to M1.x will change automatically.

I believe it's a performance optimization; the memory savings
are secondary.

> We should at least have a special 'debug' mode in which the hidden
> sharing of objects is disabled for selected modules. You can always
> explicitly share an object by simply referencing it, rather than
> typing in a fresh copy.

That would have rather disastrous consequences, since
some forms of comparison depend on there only being
one copy of the object.

>
> -- Dave
>

Jeff Epler

Mar 15, 2004, 12:33:04 PM

to David MacQuigg, pytho...@python.org

On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> >You'd have to disable the integer free list. There's also code in
> >tupleobject.c to recognize and share the empty tuple. String interning
> >could be disabled as well. Everybody's ignored the gorilla in the room:
> >
> > >>> sys.getrefcount(None)
> > 1559

On Mon, Mar 15, 2004 at 10:15:33AM -0700, David MacQuigg wrote:
> Implementation detail. ( half wink )

I'd round that down from half to None, personally.

This is guaranteed to work:
    x = None
    y = None
    assert x is y
by the following text in the language manual:
    None
        This type has a single value. There is a single object with
        this value. This object is accessed through the built-in
        name None. It is used to signify the absence of a value in
        many situations, e.g., it is returned from functions that
        don't explicitly return anything. Its truth value is false.
There are reams of code that rely on the object identity of None, so a
special debug mode where "x = <some literal>" makes x refer to something
that has a refcount of 1 will break code.

The 'is' guarantee applies to at least these built-in values:
    None Ellipsis NotImplemented True False

The only problem I can see with reload() is that it doesn't do what you
want. But on the other hand, what reload() does is perfectly well
defined, and at least the avenues I've seen explored for "enhancing" it
look, well, like a train wreck.

Jeff

Skip Montanaro

Mar 15, 2004, 1:28:50 PM

to David MacQuigg, pytho...@python.org

>> In general, I don't think that disabling immutable object sharing
>> would be worth the effort. Consider the meaning of module level
>> integers. In my experience they are generally constants and are
>> infrequently changed once set. Probably the only thing worth
>> tracking down during a super reload would be function, class and
>> method definitions.

Dave> If you reload a module M1, and it has an attribute M1.x, which was
Dave> changed from '1' to '2', we want to change also any references
Dave> that may have been created with statements like 'x = M1.x', or
Dave> 'from M1 import *'. If we don't do this, reload() will continue to
Dave> baffle and frustrate new users. Typically, they think they have
Dave> just one variable 'x'.

Like I said, I think that sort of change will be relatively rare. Just tell
your users, "don't do that".

Dave> I'm having trouble understanding the benefit of using shared
Dave> objects for simple numbers and strings.

Can you say "space and time savings"? Ints are 12 bytes, strings are 24
bytes (plus the storage for the string), None is 8 bytes. It adds up. More
importantly, small ints, interned strings and None would constantly be
created and freed. The performance savings of sharing them are probably
much more important.

Finally, from a semantic viewpoint, knowing that None is defined by the
language to be a singleton object allows the more efficient "is" operator to
be used when testing objects against None for equality. If you allowed many
copies of that object that wouldn't work.

Dave> We should at least have a special 'debug' mode in which the hidden
Dave> sharing of objects is disabled for selected modules.

It might help with your problem but would change the semantics of the
language.

Skip

David MacQuigg

Mar 15, 2004, 2:23:11 PM

to

On Mon, 15 Mar 2004 11:33:04 -0600, Jeff Epler <jep...@unpythonic.net>
wrote:

>This is guaranteed to work:
> x = None
> y = None
> assert x is y
>by the following text in the language manual:
> None
> This type has a single value. There is a single object with
> this value. This object is accessed through the built-in
> name None. It is used to signify the absence of a value in
> many situations, e.g., it is returned from functions that
> don't explicitly return anything. Its truth value is false.
>There are reams of code that rely on the object identity of None, so a
>special debug mode where "x = <some literal>" makes x refer to something
>that has a refcount of 1 will break code.
>
>The 'is' guarantee applies to at least these built-in values:
> None Ellipsis NotImplemented True False

This certainly complicates things. I *wish* they had not made this
"single object" statement. Why should how things are stored
internally matter to the user? We could have just as easily worked
with x == y, but now, as you say, it may be too late.

The same problem occurs with strings (some strings at least):
>>> x = 'abcdefghighklmnop'
>>> y = 'abcdefghighklmnop'
>>> x is y
True
>>> x = 'abc xyz'
>>> y = 'abc xyz'
>>> x is y
False

Since there is no simple way for the user to distinguish these cases,
it looks like we might break some code if the storage of equal objects
changes. The change would have to be for "debug" mode only, and for
only the modules the user specifically imports in debug mode. We
would need a big, bold warning that you should not use 'is'
comparisons in cases like the above, at least for any objects from
modules that are imported in debug mode.
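
A small sketch of why '==' is the safer comparison for strings, assuming CPython and Python 3. Whether two equal strings are the same object is an implementation detail, so the identity result is printed here but should not be relied on; None is the case where 'is' is backed by the language definition.

```python
# Build an equal string at run time so the compiler cannot merge the two.
x = "abc xyz"
y = " ".join(["abc", "xyz"])

print(x == y)   # equality compares contents
print(x is y)   # identity depends on the implementation; do not rely on it

# None is different: the language guarantees a single None object.
a = None
b = None
print(a is b)
```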

>The only problem I can see with reload() is that it doesn't do what you
>want. But on the other hand, what reload() does is perfectly well
>defined, and at least the avenues I've seen explored for "enhancing" it
>look, well, like train wreck.

It's worse than just a misunderstanding. It's a serious limitation on
what we can do with editing a running program. I don't agree that
what it does now is well defined (at least not in the documentation).
The discussion in Learning Python is totally misleading. We should at
least update the description of the reload function in the Python
Library Reference. See the thread "Reload Confusion" for some
suggested text.

-- Dave

David MacQuigg

Mar 15, 2004, 3:18:34 PM3/15/04

to

On Mon, 15 Mar 2004 12:28:50 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> >> In general, I don't think that disabling immutable object sharing
> >> would be worth the effort. Consider the meaning of module level
> >> integers. In my experience they are generally constants and are
> >> infrequently changed once set. Probably the only thing worth
> >> tracking down during a super reload would be function, class and
> >> method definitions.
>
> Dave> If you reload a module M1, and it has an attribute M1.x, which was
> Dave> changed from '1' to '2', we want to change also any references
> Dave> that may have been created with statements like 'x = M1.x', or
> Dave> 'from M1 import *' If we don't do this, reload() will continue to
> Dave> baffle and frustrate new users. Typically, they think they have
> Dave> just one variable 'x'
>
>Like I said, I think that sort of change will be relatively rare.

I think wanting to change numbers in a reloaded module is very common.

>Just tell your users, "don't do that".

The problem is the complexity of "that" which they can and cannot do.
Even renowned text authors don't seem to explain it clearly. I'm
opting now for "don't do anything" to try and make it simple. By that
I mean - Don't expect reloads to update anything but the reference to
the reloaded module itself. This is simple, just not very convenient.

> Dave> I'm having trouble understanding the benefit of using shared
> Dave> objects for simple numbers and strings.
>
>Can you say "space and time savings"? Ints are 12 bytes, strings are 24
>bytes (plus the storage for the string), None is 8 bytes. It adds up.

Maybe you can save a significant amount of memory by having all the
*system* modules share a common 'None' object, but when a user
explicitly says 'M1.x = None', surely we can afford 8 bytes to
provide a special None for that reference.

I'm no expert on these implementation issues, but these numbers seem
small compared to the 512MB in a typical modern PC. I suppose there
are some rare cases where you need to create an array of millions of
references to a single constant. In those cases the debug mode may be
too much of a burden. In general, we ought to favor simplicity over
efficiency.

>More importantly, small ints, interned strings and None would constantly be
>created and freed. The performance savings of sharing them are probably
>much more important.

Again, as a non-expert, this seems strange. The burden of comparing a
new object to what is already in memory, using an '==' type of
comparison, must be comparable to simply creating a new object.

>Finally, from a semantic viewpoint, knowing that None is defined by the
>language to be a singleton object allows the more efficient "is" operator to
>be used when testing objects against None for equality. If you allowed many
>copies of that object that wouldn't work.

We would lose some, assuming '==' for these small objects is slower
than 'is', but then you would not have to test '==' on a large number
of objects already in memory each time you define a new integer or
small string.

> Dave> We should at least have a special 'debug' mode in which the hidden
> Dave> sharing of objects is disabled for selected modules.
>
>It might help with your problem but would change the semantics of the
>language.

I assume you are referring to the semantics of 'is' when working with
small objects like None, 2, 'abc'. I agree, that is a problem for the
proposed debug mode. I don't see a way around it, other than warning
users not to expect modules imported in the debug mode to optimize the
sharing of small objects in memory. Use 'x == 2' rather than 'x is 2'
if you intend to use the debug mode.

-- Dave

Jeff Epler

Mar 15, 2004, 4:38:13 PM3/15/04

to David MacQuigg, pytho...@python.org

On Mon, Mar 15, 2004 at 01:18:34PM -0700, David MacQuigg wrote:
> Maybe you can save a significant amount of memory by having all the
> *system* modules share a common 'None' object, but when a user
> explicitly says 'M1.x = None', surely we can afford a 8 bytes to
> provide a special None for that reference.

Not only is there a lot of Python code written that says
if x is None: ...
but there are dozens of pointer comparisons to Py_None in the C source
code for Python. In this somewhat stale CVS tree, grep found 263 lines
with both "==" and "Py_None" on the same line, and 86 more with "!=" and
"Py_None" on the same line. Py_True and Py_False appear with == and !=
an additional 45 times. Perhaps you'd like to fix all the third-party
extension modules once you're done inspecting something like 300 sites
in the Python core.

For Py_None, at least, the pointer equality rule is enshrined in the API
documentation:
Since None is a singleton, testing for object identity (using "=="
in C) is sufficient.
http://www.python.org/doc/api/noneObject.html

Jeff

Skip Montanaro

Mar 15, 2004, 2:50:47 PM3/15/04

to David MacQuigg, pytho...@python.org

Dave> The same problem occurs with strings (some strings at least):

>>> x = 'abcdefghighklmnop'
>>> y = 'abcdefghighklmnop'
>>> x is y
True
>>> x = 'abc xyz'
>>> y = 'abc xyz'
>>> x is y
False

The difference is the first string has the form of an identifier, so the
interpreter automatically interns it and it gets shared after that. The
second doesn't. You can force things though:

>>> x = 'abc xyz'
>>> id(x)
10476320
>>> x = intern(x)
>>> x
'abc xyz'
>>> id(x)
10476320
>>> y = 'abc xyz'
>>> id(y)
10477088
>>> y = intern(y)
>>> id(y)
10476320
>>> x is y
True
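
On Python 3 the built-in intern() used above lives in the sys module as
sys.intern(). A minimal sketch of the same experiment follows; the second
string is built at run time so that the compiler has no chance to merge
two identical literals. What 'is' reports before interning is an
implementation detail, so the sketch only relies on the result after
interning.

```python
import sys

# Build two equal strings by different routes so they are created separately.
x = "abc xyz"
y = " ".join(["abc", "xyz"])

print(x == y)   # the values are equal
print(x is y)   # identity before interning: implementation detail, don't rely on it

# sys.intern() returns the single canonical copy for a given string value.
x = sys.intern(x)
y = sys.intern(y)
print(x is y)   # both names now refer to the canonical copy
```

The practical rule is unchanged: compare strings with '==', and treat
interning as an optimization rather than something code should depend on.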

Dave> We should at least update the description of the reload function
Dave> in the Python Library Reference. See the thread "Reload
Dave> Confusion" for some suggested text.

Please file a bug report on Sourceforge so your ideas don't get lost. Feel
free to assign it to me (sf id == "montanaro").

Skip

John Roth

Mar 15, 2004, 5:38:52 PM3/15/04

to

"David MacQuigg" <d...@gain.com> wrote in message

news:fc0c50htlip4mtpjj...@4ax.com...

> On Mon, 15 Mar 2004 11:33:04 -0600, Jeff Epler <jep...@unpythonic.net>
> wrote:
>
>
> It's worse than just a misunderstanding. It's a serious limitation on
> what we can do with editing a running program.

Well, that's a definite yes and no. The limitation is quite simple:
any object in the module that has a reference from outside of the
module will not have that reference changed. It will continue to
refer to the old copy of the object.
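
That limitation can be reproduced in a few lines. This is a sketch only,
written for Python 3 (where reload() is importlib.reload()); the module
name reload_alias_demo and its greeting() function are made up for the
demonstration, and the module is written to a temporary directory so the
example is self-contained.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo free of cached .pyc files

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "reload_alias_demo.py")   # hypothetical module

def write_source(text):
    with open(path, "w") as f:
        f.write(text)

write_source("def greeting():\n    return 'old'\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

import reload_alias_demo
from reload_alias_demo import greeting   # a reference that "leaks out"

# Edit the source on disk, then reload the module.
write_source("def greeting():\n    return 'new and improved'\n")
importlib.reload(reload_alias_demo)

print(reload_alias_demo.greeting())   # looked up through the module: new code
print(greeting())                     # the leaked reference: still the old code
```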

The solution to this is to apply some design discipline. Systems
exist that have absolute "cannot come down for any reason"
type of requirements where software has to be replaced while the
system is running. It's not impossible, it simply requires a great
deal of discipline in not allowing references to wander all over the
place.

As far as updating in place while debugging, there are a few
solutions that, so far, haven't been implemented. One is to
notice that functions are objects that have a reference to another
object called a "code object." This is the actual result of the
compilation, and if you go behind the scenes and replace the
code object reference in the function object, you've basically
done an update in place - as long as you don't have a stack
frame with a pointer into the old code object! (The stack frame
could, of course, be fixed too. Extra credit for doing so.)
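
A rough sketch of that code-object swap, using the Python 3 attribute
name __code__ (in the 2.x series it was func_code). The function names
are invented for the example, and the swap shown here only works when
the old and new code take compatible arguments and have the same number
of free variables.

```python
def response(x):
    return x + 1            # "old" behaviour

alias = response            # a reference that has leaked somewhere else

def _replacement(x):
    return x * 10           # "new" behaviour, e.g. from recompiled source

# Replace the code object inside the existing function object. Existing
# references see the change because the function object itself is the same.
response.__code__ = _replacement.__code__

print(alias(3))             # runs the replacement code
```

Defaults, closures and the function's globals are not touched by this,
so a real tool would have to deal with those separately.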

I can easily imagine a development environment that could
do this kind of magic. If someone wants to build it, I'm
certainly not going to stop them (and wouldn't even if I could.)
I might even find it useful!

The thing that is not going to work, ever, is having reload()
do the work for you.

> I don't agree that
> what it does now is well defined (at least not in the documentation).

It's well enough defined for someone who knows how Python works.

> The discussion in Learning Python is totally misleading. We should at
> least update the description of the reload function in the Python
> Library Reference. See the thread "Reload Confusion" for some
> suggested text.

I agree with that: instead of "There are some caveats" it should say:

WARNING - references to objects in the old copy of the module
that have leaked out of the module will NOT be replaced. A few of
the implications are:

and then continue with the current text (my version of the doc is 2.3.2).

John Roth
>
> -- Dave
>

Hung Jung Lu

Mar 15, 2004, 7:15:00 PM3/15/04

to

David MacQuigg <d...@gain.com> wrote in message news:<fc0c50htlip4mtpjj...@4ax.com>...

> On Mon, 15 Mar 2004 11:33:04 -0600, Jeff Epler <jep...@unpythonic.net>
> wrote:
>
> >The only problem I can see with reload() is that it doesn't do what you
> >want. But on the other hand, what reload() does is perfectly well
> >defined, and at least the avenues I've seen explored for "enhancing" it
> >look, well, like train wreck.
>
> It's worse than just a misunderstanding. It's a serious limitation on
> what we can do with editing a running program.

As I said in another message, you CAN do the kinds of things you want
to do (edit-and-continue), if you use weakrefs, and use classes
instead of modules. Take a look at the weakref module. I am not saying
that it's trivial in Python: it does require a bit of work, but
edit-and-continue in Python is definitely doable. I've done it, many
other people have done it.
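
One possible shape for this, offered as a guess at the general approach
rather than as the scheme actually used: keep weak references to live
instances, and when a class is redefined, repoint the existing instances
at the new class. The names Tracked, upgrade and Amplifier are invented
for the sketch.

```python
import weakref

class Tracked:
    """Base class that remembers live instances without keeping them alive."""
    _live = weakref.WeakSet()

    def __init__(self):
        Tracked._live.add(self)

def upgrade(old_cls, new_cls):
    """Point every live instance of old_cls at new_cls."""
    for obj in list(Tracked._live):
        if type(obj) is old_cls:
            obj.__class__ = new_cls

class Amplifier(Tracked):
    def gain(self):
        return 1

amp = Amplifier()
old_cls = Amplifier

# Simulate an edited class definition arriving after a reload.
class Amplifier(Tracked):
    def gain(self):
        return 2

upgrade(old_cls, Amplifier)
print(amp.gain())   # the existing instance now uses the new method
```

Because the registry holds only weak references, it does not prevent
instances from being garbage collected once the program drops them.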

(Of course, if you asked whether the Ruby behavior is better, I'd
think so. I think it's better to automatically replace class behavior
on reload, by default, and leave open the possibility of explicitly
refusing replacement. Python is the opposite: by default the class
behavior does NOT get modified, and you have to do some work to replace
it.)

I think it is a historical accident in Python that modules are not
made more class-like. Another thing I have seen people wish for is
getter/setter accessor methods (or properties) for module-level
attributes.

It is usually better practice in Python to store attributes in
classes rather than modules, precisely because further down the road
you will often wish your modules had class-like behavior.

regards,

Hung Jung

David MacQuigg

Mar 15, 2004, 9:09:20 PM3/15/04

to

On Mon, 15 Mar 2004 13:50:47 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> Dave> We should at least update the description of the reload function
> Dave> in the Python Library Reference. See the thread "Reload
> Dave> Confusion" for some suggested text.
>
>Please file a bug report on Sourceforge so your ideas don't get lost. Feel
>free to assign it to me (sf id == "montanaro").

Will do. I would like to get a few comments from folks following this
thread before submitting the proposed text to Sourceforge. Here is
the summary:
"""
To summarize what happens with reload(M1):

The module M1 must have been already imported in the current
namespace.

The new M1 is executed, and new objects are created in memory.

The names in the M1 namespace are updated to point to the new objects.
No other references are changed. References to objects removed from
the new module remain unchanged.

Previously created references from other modules to the old objects
remain unchanged and must be updated in each namespace where they
occur.

The old objects remain in memory until all references to them are
gone.
"""

To read this in context, with a nice example, background discussion,
etc. see http://ece.arizona.edu/~edatools/Python/Reload.htm I think
I've finally got it right, but I'm always prepared for another
surprise.
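
The points in the summary can be checked with a small experiment. This is
a sketch for Python 3, where reload() is importlib.reload(); the module
name reload_summary_demo is made up, and lists are used instead of
numbers so that object sharing does not blur the picture.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # keep the demo free of cached .pyc files

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "reload_summary_demo.py")   # hypothetical module

def write_source(text):
    with open(path, "w") as f:
        f.write(text)

write_source("a = [1]\nb = [2]\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
import reload_summary_demo as M1

old_a = M1.a                    # a reference held outside the module

write_source("a = [10, 10]\n")  # new source: 'a' changed, 'b' removed
importlib.reload(M1)

print(M1.a)    # the name in M1 now points to the new object
print(old_a)   # the outside reference still points to the old object
print(M1.b)    # 'b' is gone from the source but remains in the namespace
```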

Once I'm confident that my understanding is correct, I'll see if I can
weave this into the existing text of the Library Reference. I may try
also to put in some of the "motivation" for the way things are (from
the Background section of the above webpage.)

-- Dave

Carl Banks

Mar 15, 2004, 11:43:01 PM3/15/04

to

David MacQuigg wrote:
> On Mon, 15 Mar 2004 05:49:58 -0600, Skip Montanaro <sk...@pobox.com>
> wrote:
>
>> Dave> Maybe we could somehow switch off the generation of shared objects
>> Dave> for modules in a 'debug' mode.
>>
>>You'd have to disable the integer free list. There's also code in
>>tupleobject.c to recognize and share the empty tuple. String interning
>>could be disabled as well. Everybody's ignored the gorilla in the room:
>>
>> >>> sys.getrefcount(None)
>> 1559
>
> Implementation detail. ( half wink )
>
>>In general, I don't think that disabling immutable object sharing would be
>>worth the effort. Consider the meaning of module level integers. In my
>>experience they are generally constants and are infrequently changed once
>>set. Probably the only thing worth tracking down during a super reload
>>would be function, class and method definitions.
>
> If you reload a module M1, and it has an attribute M1.x, which was
> changed from '1' to '2', we want to change also any references that
> may have been created with statements like 'x = M1.x', or 'from M1
> import *' If we don't do this, reload() will continue to baffle and
> frustrate new users.

What if one of your users does something like 'y = M1.x + 1'; then
what are you going to do?

It seems to me that your noble effort to make reload() completely
foolproof is ultimately in vain: there's just too many opportunities
for a module's variables to affect things far away.

--
CARL BANKS http://www.aerojockey.com/software
"If you believe in yourself, drink your school, stay on drugs, and
don't do milk, you can get work."
-- Parody of Mr. T from a Robert Smigel Cartoon

David MacQuigg

Mar 16, 2004, 6:48:49 AM3/16/04

to

On Tue, 16 Mar 2004 04:43:01 GMT, Carl Banks
<imb...@aerojockey.invalid> wrote:

>What if one of your users does something like 'y = M1.x + 1'; then
>what are you going to do?

The goal is *not* to put the program into the state it "would have
been" had the changes in M1 been done earlier. That is impossible.
We simply want to have all *direct* references to objects in M1 be
updated. A direct reference, like 'y = M1.x', sets 'y' to the same
object as 'M1.x' The 'y' in the above example points to a new object,
with an identity different than anything in the M1 module. It should
not get updated.
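
To make the distinction concrete, here is a sketch of what "update all
direct references" could mean for a single namespace. A SimpleNamespace
stands in for the module M1, and the dictionary plays the part of some
other module's namespace; both are assumptions made for illustration.
The identity test used here is exactly what becomes unreliable for
shared objects such as None or small integers.

```python
from types import SimpleNamespace

M1 = SimpleNamespace(x=1000.5)      # stand-in for a module

namespace = {
    "direct": M1.x,                 # same object as M1.x
    "derived": M1.x + 1,            # a new object computed from M1.x
}

old = M1.x
M1.x = 2000.5                       # what a reload with an edited value does

# Replace only the entries that are the very same object as the old value.
for name, value in list(namespace.items()):
    if value is old:
        namespace[name] = M1.x

print(namespace["direct"])          # updated to the new value
print(namespace["derived"])         # left alone
```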

>It seems to me that your noble effort to make reload() completely
>foolproof is ultimately in vain: there's just too many opportunities
>for a module's variables to affect things far away.

It all depends on your goals for reload(). To me, updating all direct
references is a worthy goal, would add a lot of utility, and is easy
to explain. Going further than that, updating objects that are "only
one operation away from a direct reference" for example, gets into a
grey area where I see no clear line we can draw. There might be some
benefit, but the cost in user confusion would be too great.

Reload() will always be a function that needs to be used cautiously.
Changes in a running program can propagate in strange ways. "Train
wreck" was the term another poster used.

-- Dave

Skip Montanaro

Mar 16, 2004, 9:58:38 AM3/16/04

to David MacQuigg, pytho...@python.org

>> What if one of your users does something like 'y = M1.x + 1'; then
>> what are you going to do?

Dave> The goal is *not* to put the program into the state it "would have
Dave> been" had the changes in M1 been done earlier. That is
Dave> impossible. We simply want to have all *direct* references to
Dave> objects in M1 be updated.

The above looks like a pretty direct reference to M1.x. <0.5 wink>

It seems to me that you have a continuum from "don't update anything" to
"track and update everything":

    don't update    update global     update all direct    update
    anything        funcs/classes     references           everything

    reload()        super_reload()    Dave                 nobody?

Other ideas have been mentioned, like fiddling the __bases__ of existing
instances or updating active local variables. I'm not sure precisely where
those concepts fall on the continuum. Certainly to the right of
super_reload() though.

In my opinion you do what's easy as a first step then extend it as you can.
I think you have to punt on shared objects (ints, None, etc). This isn't
worth changing the semantics of the language even in some sort of
interactive debug mode.

Sitting for long periods in an interactive session and expecting it to track
your changes is foreign to me. I will admit to doing stuff like this for
short sessions:

>>> import foo
>>> x = foo.Foo(...)
>>> x.do_it()
...
TypeError ...
>>> # damn! tweak foo.Foo class in emacs
>>> reload(foo)
>>> x = foo.Foo(...)
>>> x.do_it()
...

but that's relatively rare, doesn't go on for many cycles, and is only made
tolerable by the presence of readline/command retrieval/copy-n-paste in the
interactive environment.

Maybe it's just the nature of your users and their background, but an
(edit/test/run)+ cycle seems much more common in the Python community than a
run/(edit/reload)+ cycle. Note the missing "test" from the second cycle and
from the above pseudo-transcript. I think some Python programmers would
take the opportunity to add an extra test case to their code in the first
cycle, where in the second cycle the testing is going on at the interactive
prompt where it can get lost. "I don't need to write a test case. It will
just slow me down. The interactive session will tell me when I've got it
right." Of course, once the interactive session has ended, the sequence of
statements you executed is not automatically saved. You still need to pop
back to your editor to take care of that. It's a small matter of
discipline, but then so is not creating aliases in the first place.

Dave> Reload() will always be a function that needs to be used
Dave> cautiously. Changes in a running program can propagate in strange
Dave> ways. "Train wreck" was the term another poster used.

Precisely. You may wind up making reload() easier to explain in the common
case, but introduce subtleties which are tougher to predict (instances whose
__bases__ change or don't change depending how far along the above continuum
you take things). I think changing the definitions of functions and classes
will be the much more likely result of edits requiring reloads than tweaking
small integers or strings. Forcing people to recreate instances is
generally not that big of a deal.

Finally, I will drag the last line out of Tim's "The Zen of Python":

Namespaces are one honking great idea -- let's do more of those!

By making it easier for your users to get away with aliases like

x = M1.x

you erode the namespace concept ever so slightly just to save typing a
couple extra characters or executing a couple extra bytecodes. Why can't
they just type M1.x again? I don't think the savings is really worth it in
the long run.

Skip

David MacQuigg

Mar 16, 2004, 1:58:23 PM3/16/04

to

On Tue, 16 Mar 2004 08:58:38 -0600, Skip Montanaro <sk...@pobox.com>
wrote:
[snip]

>It seems to me that you have a continuum from "don't update anything" to
>"track and update everything":
>
>   don't update    update global     update all direct    update
>   anything        funcs/classes     references           everything
>
>   reload()        super_reload()    Dave                 nobody?
>
>Other ideas have been mentioned, like fiddling the __bases__ of existing
>instances or updating active local variables. I'm not sure precisely where
>those concepts fall on the continuum. Certainly to the right of
>super_reload() though.
>
>In my opinion you do what's easy as a first step then extend it as you can.
>I think you have to punt on shared objects (ints, None, etc). This isn't
>worth changing the semantics of the language even in some sort of
>interactive debug mode.

I agree, punt is the right play for now, but I want to make one
clarification, in case we need to re-open this question. The semantic
change we are talking about applies only to the 'is' operator, and
only to a few immutable objects which are created via a reload of a
module in "debug" mode. All other objects, including those from other
modules remain unchanged. Objects like None, 1, 'abc', which are
treated as shared objects in normal modules, will be given a unique ID
when loaded from a module in debug mode. This means you will have to
use '==' to test equality of those objects, not 'is'. Since 'is' is
already a tricky, implementation-dependent operator that is best
avoided in these situations, the impact of this added option seems far
less than "changing the semantics of the language".

I'll hold off on any push for expanding super_reload until I have a
good use case. Meanwhile, I'll assume I can work with the existing
reload. This will probably involve a combination of programming
discipline and user education. In programming, I'll do what I can to
avoid problems with the modules I expect will be reloaded. Where
"direct" references are made to constants or other objects in a
reloaded module, I'll be sure to refresh those references at the
beginning of each code section. In my user manual, there will
probably be statements like: """Don't attempt to reload the stats
module while a simulator is active. The reload function can save
having to restart your entire session, but it does not update
functions or data that have already been sent to the simulator. As
with reloading statefiles, always kill and restart the simulator after
a reload of the stats module."""

This is a good description of the program development cycle. My users
(circuit design engineers) won't be doing program development, but
will be making changes in existing data and functions. My goal is to
make that as easy as possible. The biggest step is offering Python as
the scripting language, rather than SKILL, OCEAN, MDL, or a number of
other CPL's ( complex proprietary languages ). I expect them to learn
in two days enough Python to understand a function definition, and to
be able to edit that definition, making it do whatever they want.

> Dave> Reload() will always be a function that needs to be used
> Dave> cautiously. Changes in a running program can propagate in strange
> Dave> ways. "Train wreck" was the term another poster used.
>
>Precisely. You may wind up making reload() easier to explain in the common
>case, but introduce subtleties which are tougher to predict (instances whose
>__bases__ change or don't change depending how far along the above continuum
>you take things). I think changing the definitions of functions and classes
>will be the much more likely result of edits requiring reloads than tweaking
>small integers or strings. Forcing people to recreate instances is
>generally not that big of a deal.
>
>Finally, I will drag the last line out of Tim's "The Zen of Python":
>
> Namespaces are one honking great idea -- let's do more of those!
>
>By making it easier for your users to get away with aliases like
>
> x = M1.x
>
>you erode the namespace concept ever so slightly just to save typing a
>couple extra characters or executing a couple extra bytecodes. Why can't
>they just type M1.x again? I don't think the savings is really worth it in
>the long run.

    def h23(freq):
        s = complex(2*pi*freq)
        h0 = PZfuncs.h0
        z1 = PZfuncs.z1; z2 = PZfuncs.z2
        p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
        return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

Notice the clarity in that last formula. This is a standard form of a
pole-zero transfer function that will be instantly recognized by a
circuit design engineer. The issue isn't the typing of extra
characters, but the compactness of expressions.

In this case we avoid the problem of local variables out-of-sync with
a reloaded module by refreshing those variables with every call to the
function. In other cases, this may add too much overhead to the
computation.

-- Dave

Skip Montanaro

Mar 16, 2004, 3:52:54 PM3/16/04

to David MacQuigg, pytho...@python.org

Dave> I agree, punt is the right play for now, but I want to make one
Dave> clarification, in case we need to re-open this question. The
Dave> semantic change we are talking about applies only to the 'is'
Dave> operator, and only to a few immutable objects which are created
Dave> via a reload of a module in "debug" mode. All other objects,
Dave> including those from other modules remain unchanged. Objects like
Dave> None, 1, 'abc', which are treated as shared objects in normal
Dave> modules, will be given a unique ID when loaded from a module in
Dave> debug mode. This means you will have to use '==' to test equality
Dave> of those objects, not 'is'. Since 'is' is already a tricky,
Dave> implementation-dependent operator that is best avoided in these
Dave> situations, the impact of this added option seems far less than
Dave> "changing the semantics of the language".

Don't forget all the C code. C programmers know the object which represents
None is unique, so their code generally looks like this snippet from
Modules/socketmodule.c:

    if (arg == Py_None)
        timeout = -1.0;

"==" in C is the equivalent of "is" in Python. If you change the uniqueness
of None, you have a lot of C code to change.
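
On the Python side, the reason 'is' matters for None can be shown with a
short sketch. The class AlwaysEqual and the function describe() are
invented for the example; the point is only that '==' can be overridden
by the object being compared, while identity cannot.

```python
class AlwaysEqual:
    """An object whose __eq__ claims equality with anything."""
    def __eq__(self, other):
        return True

value = AlwaysEqual()
print(value == None)   # True: the answer comes from the object's own __eq__
print(value is None)   # False: identity cannot be overridden

def describe(timeout=None):
    # The usual idiom: None as the "no value given" marker, tested with 'is'.
    if timeout is None:
        return "blocking"
    return "timeout %.1f s" % timeout

print(describe())
print(describe(2.5))
```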

    Dave> def h23(freq):
    Dave>     s = complex(2*pi*freq)
    Dave>     h0 = PZfuncs.h0
    Dave>     z1 = PZfuncs.z1; z2 = PZfuncs.z2
    Dave>     p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
    Dave>     return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

Dave> Notice the clarity in that last formula.

Yeah, but h0, z1, z2, etc are not long-lived copies of attributes in
PZfuncs. If I execute:

>>> blah = h23(freq)
>>> reload(PZfuncs)
>>> blah = h23(freq)

things will work properly. It's only long-lived aliases that present a
problem:

    def h23(freq, h0=PZfuncs.h0):
        s = complex(2*pi*freq)
        z1 = PZfuncs.z1; z2 = PZfuncs.z2
        p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
        return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))
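
The difference comes down to when the attribute is looked up: a default
argument is evaluated once, when the def statement runs, while an
assignment in the function body runs on every call. A minimal sketch,
with a SimpleNamespace standing in for the PZfuncs module and an
attribute rebinding standing in for the reload:

```python
from types import SimpleNamespace

# Stand-in for the PZfuncs module (an assumption for this sketch).
PZfuncs = SimpleNamespace(h0=1.0)

def looked_up_per_call():
    h0 = PZfuncs.h0           # fetched again on every call
    return h0

def frozen_default(h0=PZfuncs.h0):
    return h0                 # default captured once, at def time

PZfuncs.h0 = 2.0              # mimics a reload after editing the constant

print(looked_up_per_call())   # sees the new value
print(frozen_default())       # still returns the value captured at def time
```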

Dave> In this case we avoid the problem of local variables out-of-sync
Dave> with a reloaded module by refreshing those variables with every
Dave> call to the function. In other cases, this may add too much
Dave> overhead to the computation.

Unlikely. Creating local copies of frequently used globals (in this case
frequently used globals in another module) is almost always a win. In fact,
it's so much of a win that a fair amount of brain power has been devoted to
optimizing global access. See PEPs 266 and 267 and associated threads in
python-dev from about the time they were written. (Note that optimizing
global access is still an unsolved problem in Python.)

Skip

Terry Reedy

Mar 16, 2004, 4:42:55 PM3/16/04

to pytho...@python.org

"David MacQuigg" <d...@gain.com> wrote in message

news:42je50p1ephn4uoks...@4ax.com...

> def h23(freq):
>     s = complex(2*pi*freq)
>     h0 = PZfuncs.h0
>     z1 = PZfuncs.z1; z2 = PZfuncs.z2
>     p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
>     return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))
>
> Notice the clarity in that last formula. This is a standard form of a
> pole-zero transfer function that will be instantly recognized by a
> circuit design engineer. The issue isn't the typing of extra
> characters, but the compactness of expressions.

For one use only, making local copies adds overhead without the
compensation of faster multiple accesses. To make the formula nearly as
clear without the overhead, I would consider

    import PZfuncs as z

    def h23(freq):
        s = complex(2*pi*freq)
        return z.h0 * (s-z.z1) * (s-z.z2) / ((s-z.p1) * (s-z.p2) * (s-z.p3))

Terry J. Reedy

David MacQuigg

Mar 16, 2004, 7:20:25 PM3/16/04

to

On Tue, 16 Mar 2004 16:42:55 -0500, "Terry Reedy" <tjr...@udel.edu>
wrote:

This is equivalent to:

h0 = PZfuncs.h0
z1 = PZfuncs.z1; z2 = PZfuncs.z2
p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3

    def h23(freq):
        s = complex(2*pi*freq)
        return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

Either way we have the architectural problem of ensuring that the code
just above the def gets executed *after* each reload of PZfuncs, and
*before* any call to h23.
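
One way to keep those two steps together is a small helper that re-binds
the aliases and is always called right after the reload. This is only a
sketch: rebind_aliases is a made-up name, a SimpleNamespace stands in for
PZfuncs, and a plain dictionary stands in for the namespace that holds
the aliases.

```python
from types import SimpleNamespace

def rebind_aliases(module, namespace, names):
    """Copy the current values of the named module attributes into namespace."""
    for name in names:
        namespace[name] = getattr(module, name)

PZfuncs = SimpleNamespace(h0=1.0, z1=1.0)   # stand-in for the real module
ALIASED = ("h0", "z1")
aliases = {}

rebind_aliases(PZfuncs, aliases, ALIASED)

PZfuncs.h0 = 5.0                  # mimics a reload with an edited constant
stale_h0 = aliases["h0"]          # the alias is now out of date

rebind_aliases(PZfuncs, aliases, ALIASED)
fresh_h0 = aliases["h0"]          # back in sync

print(stale_h0, fresh_h0)
```

In a real module the namespace argument could be globals(), with the
helper called immediately after the reload of PZfuncs.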

-- Dave

David MacQuigg

Mar 16, 2004, 7:21:35 PM3/16/04

to

On Tue, 16 Mar 2004 14:52:54 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

> Dave> def h23(freq):
> Dave>     s = complex(2*pi*freq)
> Dave>     h0 = PZfuncs.h0
> Dave>     z1 = PZfuncs.z1; z2 = PZfuncs.z2
> Dave>     p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
> Dave>     return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))
>
> Dave> Notice the clarity in that last formula.
>
>Yeah, but h0, z1, z2, etc are not long-lived copies of attributes in
>PZfuncs. If I execute:
>
> >>> blah = h23(freq)
> >>> reload(PZfuncs)
> >>> blah = h23(freq)
>
>things will work properly. It's only long-lived aliases that present a
>problem:
>
> def h23(freq, h0=PZfuncs.h0):
>     s = complex(2*pi*freq)
>     z1 = PZfuncs.z1; z2 = PZfuncs.z2
>     p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3
>     return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

I think we are saying the same thing. I moved the alias definitions
inside the loop, knowing they would be short-lived, and therefore not a
problem.

> Dave> In this case we avoid the problem of local variables out-of-sync
> Dave> with a reloaded module by refreshing those variables with every
> Dave> call to the function. In other cases, this may add too much
> Dave> overhead to the computation.
>
>Unlikely. Creating local copies of frequently used globals (in this case
>frequently used globals in another module) is almost always a win. In fact,
>it's so much of a win that a fair amount of brain power has been devoted to
>optimizing global access. See PEPs 266 and 267 and associated threads in
>python-dev from about the time they were written. (Note that optimizing
>global access is still an unsolved problem in Python.)

Interesting. I just ran a test comparing 10,000 calls to the original
h23 above with 10,000 calls to h23a below.

    h0 = PZfuncs.h0
    z1 = PZfuncs.z1; z2 = PZfuncs.z2
    p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3

    def h23a(freq):
        s = complex(2*pi*freq)
        return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

The results are very close:

              time/loop (sec)      %
    Test 1:      6.48E-006       119.1   (class PZfuncs)
    Test 2:      6.88E-006       126.4   (module PZfuncs)
    Test 3:      5.44E-006       100.0   (z1, p1, etc outside loop)

There is not much difference in the original loop between accessing
the constants from a class vs accessing them from a module. There is
a significant difference ( but not as much as I expected ) if we move
the six assignments outside the loop. Then we are back to the problem
of ensuring that the aliases get updated each time we update the
module.

-- Dave

Skip Montanaro

Mar 16, 2004, 5:01:12 PM3/16/04

to Terry Reedy, pytho...@python.org

Terry> To make the formula nearly as clear without the overhead, I would
Terry> consider

    Terry> import PZfuncs as z
    Terry> def h23(freq):
    Terry>     s = complex(2*pi*freq)
    Terry>     return z.h0 * (s-z.z1) * (s-z.z2) / ((s-z.p1) * (s-z.p2) * (s-z.p3))

Or even:

    def h23(freq):
        z = PZfuncs
        s = complex(2*pi*freq)
        return z.h0 * (s-z.z1) * (s-z.z2) / ((s-z.p1) * (s-z.p2) * (s-z.p3))

(slightly faster and avoids the module aliasing problem Dave is concerned
about).

Skip

Skip Montanaro

Mar 16, 2004, 8:48:06 PM3/16/04

to David MacQuigg, pytho...@python.org

Dave> Interesting. I just ran a test comparing 10,000 calls to the
Dave> original h23 above with 10,000 calls to h23a below.

...

Dave> The results are very close:

    Dave>           time/loop (sec)      %
    Dave> Test 1:      6.48E-006       119.1   (class PZfuncs)
    Dave> Test 2:      6.88E-006       126.4   (module PZfuncs)
    Dave> Test 3:      5.44E-006       100.0   (z1, p1, etc outside loop)

I'm not sure what these particular tests are measuring. I can't tell which
are h23() calls and which are h23a() calls, but note that because h23() and
h23a() are actually quite simple, the function-call overhead is going to be
a fair fraction of the total time measured.

For timing stuff like this I recommend you use timeit.py. Most people here
are getting used to looking at its output. Put something like:

import PZfuncs
from math import pi    # pi is used by h23a and h23b below

h0 = PZfuncs.h0
z1 = PZfuncs.z1; z2 = PZfuncs.z2
p1 = PZfuncs.p1; p2 = PZfuncs.p2; p3 = PZfuncs.p3

def h23null(freq):
    pass

def h23a(freq):
    s = complex(2*pi*freq)
    return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

def h23b(freq):
    z = PZfuncs
    s = complex(2*pi*freq)
    return z.h0*(s-z.z1)*(s-z.z2)/((s-z.p1)*(s-z.p2)*(s-z.p3))

into h23.py then run timeit.py like:

timeit.py -s "from h23 import * ; freq = NNN" "h23null(freq)"
timeit.py -s "from h23 import * ; freq = NNN" "h23a(freq)"
timeit.py -s "from h23 import * ; freq = NNN" "h23b(freq)"

Its output is straightforward and pretty immediately comparable across the
runs. The h23null() run will give you some idea of the call overhead. You
can, of course, dream up h23[cdefg]() variants as well.
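The same comparison can also be run from inside Python with the timeit module rather than the command-line script. The following is a rough sketch assuming Python 3.5 or later (for the globals argument of timeit.Timer); the constants are the ones Dave posts later in the thread, and best_usec is a hypothetical helper name.

```python
import timeit
from math import pi

# Constants as posted later in the thread (PZfuncs.py).
h0 = 1
z1 = 1; z2 = 1
p1 = -1 + 1j; p2 = -1 - 1j; p3 = -1

def h23null(freq):
    pass

def h23a(freq):
    s = complex(2 * pi * freq)
    return h0 * (s - z1) * (s - z2) / ((s - p1) * (s - p2) * (s - p3))

def best_usec(func, number=20000, repeat=3):
    """Best-of-`repeat` time per call of func(2.0), in microseconds."""
    timer = timeit.Timer("func(freq)", globals={"func": func, "freq": 2.0})
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6

overhead = best_usec(h23null)   # approximate cost of the call itself
total = best_usec(h23a)         # call plus the arithmetic

print("h23null: %.3f usec per loop" % overhead)
print("h23a:    %.3f usec per loop" % total)
```

Subtracting the h23null figure from the others gives a rough idea of how much of each measurement is arithmetic and name lookup rather than call overhead.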

Post code and results and we'll be happy to throw darts... :-)

Skip

David MacQuigg

Mar 17, 2004, 12:52:40 PM

On Tue, 16 Mar 2004 19:48:06 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

>For timing stuff like this I recommend you use timeit.py. Most people here
>are getting used to looking at its output.

Excellent utility. This ought to be highlighted in the docs on the
time module, at least listed under "See also". I just grabbed the
first thing that came up and wrote my own little routine around the
clock() function.

[...]

>Post code and results and we'll be happy to throw darts... :-)

# PZfuncs.py -- Timing test for reloaded constants.

# Local constants:
h0 = 1
z1 = 1; z2 = 1
p1 = -1 +1j; p2 = -1 -1j; p3 = -1
pi = 3.1415926535897931

import Constants

def h23null(freq):
    pass

def h23a(freq):
    s = complex(2*pi*freq)
    return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

def h23b(freq):
    z = Constants
    s = complex(2*pi*freq)
    return z.h0*(s-z.z1)*(s-z.z2)/((s-z.p1)*(s-z.p2)*(s-z.p3))

def h23c(freq):
    h0 = Constants.h0
    z1 = Constants.z1; z2 = Constants.z2
    p1 = Constants.p1; p2 = Constants.p2; p3 = Constants.p3
    s = complex(2*pi*freq)
    return h0*(s-z1)*(s-z2)/((s-p1)*(s-p2)*(s-p3))

% timeit.py -s "from PZfuncs import * ; freq = 2.0" "h23null(freq)"
1000000 loops, best of 3: 0.461 usec per loop
% timeit.py -s "from PZfuncs import * ; freq = 2.0" "h23a(freq)"
100000 loops, best of 3: 4.94 usec per loop
% timeit.py -s "from PZfuncs import * ; freq = 2.0" "h23b(freq)"
100000 loops, best of 3: 5.79 usec per loop
% timeit.py -s "from PZfuncs import * ; freq = 2.0" "h23c(freq)"
100000 loops, best of 3: 6.29 usec per loop

My conclusion is that I should stick with form c in my application.
The savings from moving these assignments outside the function (form
a) do not justify the cost in possible problems after a reload. The
savings in going to form b are negligible. Form a is the easiest to
read, but forms b and c are not much worse.

These functions will typically be used in the interactive part of the
program to set up plots with a few hundred points. The time-consuming
computations are all done in the simulator, which is written in C++.

-- Dave

David MacQuigg

Mar 17, 2004, 1:26:07 PM

On Thu, 11 Mar 2004 15:10:59 -0500, "Ellinghaus, Lance"
<lance.el...@eds.com> wrote:
>
>>Reload doesn't work the way most people think
>>it does: if you've got any references to the old module,
>>they stay around. They aren't replaced.
>
>>It was a good idea, but the implementation simply
>>doesn't do what the idea promises.
>
>I agree that it does not really work as most people think it does, but how
>would you perform the same task as reload() without the reload()?

>>> pzfuncs
<open file 'PZfuncs.py', mode 'r' at 0x00A86160>
>>> exec pzfuncs
>>> p3
-2

--- Edit PZfuncs.py here ---

>>> pzfuncs.seek(0)
>>> exec pzfuncs
>>> p3
-3
>>>

The disadvantage compared to reload() is that you get direct
references to *all* the new objects in your current namespace. With
reload() you get only a reference to the reloaded module. With the
proposed super_reload (at least the version I would like) you get no
new references in your current namespace, just updates on the
references that are already there.

Hmm. Maybe we could reload(), then loop over the available names, and
replace any that exist in the current namespace.
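One way the "reload, then loop over the names" idea might look is sketched below. It is not the proposed super_reload, only an illustration under stated assumptions: Python 3's importlib.reload, a hypothetical helper reload_and_refresh, a hypothetical module consts_demo in a temporary directory, and a plain dict standing in for the current namespace. As an extra guard not mentioned in the post, a name is rebound only if it still points at the module's old object.

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True   # demo only: avoid reusing a stale .pyc

def reload_and_refresh(module, namespace):
    """Reload `module`, then rebind names already present in `namespace`.

    Only names that existed in the module before the reload and whose
    value in `namespace` is still the module's old object are updated.
    No new names are added to `namespace`.
    """
    old = dict(vars(module))
    importlib.reload(module)
    updated = []
    for name, new_value in vars(module).items():
        if name.startswith("__"):
            continue                      # skip module bookkeeping
        if name in namespace and name in old and namespace[name] is old[name]:
            namespace[name] = new_value
            updated.append(name)
    return updated

# Set up a throwaway module (hypothetical name) to reload.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "consts_demo.py")
with open(path, "w") as f:
    f.write("p3 = -2\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
consts_demo = importlib.import_module("consts_demo")

# A dict standing in for the namespace that holds direct references.
namespace = {"p3": consts_demo.p3, "unrelated": 5}

# Edit the source: change p3 and add a new name z9.
with open(path, "w") as f:
    f.write("p3 = -30\nz9 = 7\n")

updated = reload_and_refresh(consts_demo, namespace)
print(updated, namespace)
```

In an interactive session the namespace argument could be globals(). The identity test is only a heuristic: it cannot tell a deliberate alias from an unrelated variable that happens to share both name and object.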

-- Dave

Hung Jung Lu

Mar 17, 2004, 2:08:07 PM

Skip Montanaro <sk...@pobox.com> wrote in message news:<mailman.37.1079449...@python.org>...

>
> Sitting for long periods in an interactive session and expecting it to track
> your changes is foreign to me.

> ...

Not sure whether this is related to what you are talking about. In
VC/VB, while debugging a program, it is very common to get into this
situation:

(a) you have loops somewhere, (say, from i=0 to i=2000000)
(b) your program fails at some particular points in the loop,
(c) your debugger tells you there is a problem (and maybe you have
some assertion points,) and the execution stops at that point.
(d) you want to add some more debugging code to narrow down the spot,
or narrow down the condition of the error,
(e) if you do not have a good IDE, you'll have to start your program all
over again. But in VC/VB, you just insert some more code, and resume
the execution, all in a matter of seconds. And you have much better
insight into the source and nature of the bug. (Is the bug from the
code? Is the bug from the data? What to do? Is the bug from C++? Is
the bug coming from stored-procedure in the database?)

Is this pure theory talk? No, because I just need to use it, now.

Without interactive programming's edit-and-continue feature, very
often you have to stop the program, insert just a few lines of code,
and restart again. This turns really bad when the initial state setup
takes time. Of course, if your programs don't take much initial setup
time, then you won't be able to realize the need or benefit of
edit-and-continue.

Sure, you can unit test things all you want. But in real life,
interactive debugging is, and will always be, the king of bug killers,
especially in large and complex systems.

> Maybe it's just the nature of your users and their background, but an
> (edit/test/run)+ cycle seems much more common in the Python community than a
> run/(edit/reload)+ cycle.

It all depends. For Zope's external methods (CGIs), you don't restart
the whole web/app server every time you make changes to a CGI. The
run/(edit/reload) cycle is the typical behavior of long-running
applications. (Except for some earlier versions of Java and
Microsoft's web/app servers, where you DID have to restart. And that
was very annoying.)

An analogy is with Windows 95, where every time you install/update an
application you need to reboot the OS. We know how annoying that is.
Edit-and-continue addresses a similar problem.

By the way, I am told that Common Lisp also has a good edit-and-continue
feature.

regards,

Hung Jung

David MacQuigg

Mar 18, 2004, 8:07:16 PM

On Mon, 15 Mar 2004 13:50:47 -0600, Skip Montanaro <sk...@pobox.com>
wrote:

>Please file a bug report on Sourceforge so your ideas don't get lost. Feel
>free to assign it to me (sf id == "montanaro").

Done. SF bug ID is 919099. Here is the proposed addition to
reload(module), just after the first paragraph:

"""
When reload(module) is executed:

The objects defined in module are compiled and loaded into memory as
new objects.

The old objects remain in memory until all references to them are
gone, and they are removed by the normal garbage-collection process.

The names in the module namespace are updated to point to any new or
changed objects. Names of unchanged objects, or of objects no longer
present in the new module, remain pointing at the old objects.

Names in other modules that refer directly to the old objects (without
the module-name qualifier) remain unchanged and must be updated in
each namespace where they occur.
"""

Anyone with corrections or clarifications, speak now.

Also here is a bit that I'm not sure of from my write-up on reload()
at http://ece.arizona.edu/~edatools/Python/Reload.htm:

Footnotes
[1] Reload(M1) is equivalent to the following:
>>> file = open(M1.__file__.rstrip('c'), 'r')
>>> file.seek(0) # needed if this is not the first reload
>>> exec file in M1.__dict__ # repeat from line 2
>>> file.close()
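For readers on Python 3, where exec is a function and the file type is gone, a rough counterpart of the footnote might look like the sketch below. It is an approximation under assumptions, not a statement of how reload is implemented: exec_source_into is a hypothetical helper, m1_demo is a hypothetical module in a temporary directory, and the import machinery is bypassed entirely.

```python
import importlib.util
import os
import tempfile

def exec_source_into(module):
    """Re-run the module's source file in its existing namespace.

    Hypothetical helper mirroring the footnote: names defined by the
    source are rebound in module.__dict__; other names are left alone.
    """
    with open(module.__file__) as f:
        source = f.read()
    exec(compile(source, module.__file__, "exec"), module.__dict__)

# Create and load a throwaway module directly from its file path.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "m1_demo.py")
with open(path, "w") as f:
    f.write("p3 = -2\n")

spec = importlib.util.spec_from_file_location("m1_demo", path)
m1 = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m1)
before = m1.p3

# Edit the source, then re-execute it in the same module namespace.
with open(path, "w") as f:
    f.write("p3 = -3\n")
exec_source_into(m1)
after = m1.p3

print(before, after)             # -2 -3
```

Because the source is executed in the module's existing __dict__, the module object itself keeps its identity, so references of the form M1.name see the new values.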

-- Dave