
Python vs Java garbage collection?


Robert Oschler

Dec 21, 2002, 4:19:19 PM
This is a very unscientific observation I have here, in the sense that I've
done no formal research, but in my web and newsgroup perusals, I seem to
have come across quite a few mentions of problems with Java applications in
regards to untimely garbage collection and memory "hogging". Yet I have
come across very few of the same complaints with Python.

Is there a fundamental structural reason for this, or is it simply
anecdotal coincidence, or due to the kinds of apps written in either
language? I'm not knocking Java; it's a fine language and it's been put
to good use, although admittedly I'm biased towards Python.

thx

John Roth

Dec 21, 2002, 4:44:43 PM

"Robert Oschler" <Osc...@earthlink.net> wrote in message
news:rV4N9.5714$uV4.3...@news2.news.adelphia.net...

The differences depend on the Java version you're using. Java
versions through 1.2 used a garbage collector that didn't collect
unreferenced objects until virtual memory filled up, which meant it
could take quite a while before an instance's cleanup actions were executed.

More recent versions use a different strategy that reportedly
has many fewer problems; however, most of the Java runtimes
bundled with browsers are quite old.

Python, on the other hand, uses a reference counting strategy.
That tends to release unused objects as soon as the last
reference vanishes; however, it has problems with objects
that are linked in a cycle. The latest versions fix this (mostly),
but it's still wise to break cycles manually to get the best
results.
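Both behaviors John describes can be seen side by side in a few lines of CPython. This is a hedged sketch (the Node class and variable names are invented for illustration): reference counting alone never frees a cycle, while the cyclic collector does.

```python
import gc
import weakref

class Node:
    """A tiny object that can take part in a reference cycle."""
    def __init__(self):
        self.peer = None

gc.disable()                  # isolate pure reference counting
a, b = Node(), Node()
a.peer, b.peer = b, a         # the two objects now reference each other
probe = weakref.ref(a)        # observe a's lifetime without keeping it alive

del a, b                      # refcounts stay above zero because of the cycle
assert probe() is not None    # reference counting alone did NOT free them

gc.collect()                  # the cyclic collector finds the cycle...
assert probe() is None        # ...and reclaims it
gc.enable()
```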

Hope this explanation makes sense.

John Roth


"Martin v. Löwis"

Dec 21, 2002, 5:12:57 PM, to John Roth
John Roth wrote:
> The differences depend on the Java version you're using. Java
> versions through 1.2 used a garbage collector that didn't collect
> unreferenced objects until virtual memory filled up, which let it take
> quite a while before an instance's cleanup actions might be executed.

In addition, Java's garbage collection used to be conservative, which
did lead to cases where it would not find certain garbage objects. This
is now supposedly fixed; they call it a "precise" collector. Python's
cyclic collector has been precise from the beginning (of course, there
wasn't a cyclic collector in Python's beginning).

There are a few cases where you can exhaust memory without triggering
collection early enough, but those are rare; in those cases,
applications should tune the collector. There are also cases where
collection takes an incredible amount of time, but they are also rare -
there is some hope that a certain class of these cases can be eliminated
by improving the GC scheme in Python.

Regards,
Martin

Stuart D. Gathman

Dec 21, 2002, 9:49:40 PM
On Sat, 21 Dec 2002 16:19:19 -0500, Robert Oschler wrote:

> This is a very unscientific observation I have here, in the sense that
> I've done no formal research, but in my web and newsgroup perusals, I
> seem to have come across quite a few mentions of problems with Java
> applications in regards to untimely garbage collection and memory
> "hogging". Yet I have come across very few of the same complaints with
> Python.

I speak from lots of experience with both Python and Java. As someone
else has mentioned, the very early (1996) collectors for Java were
conservative.

However, since JDK 1.1.6, both Sun and IBM implementations of Java have
had robust garbage collection and fast allocation. I have yet to see a
memory problem in Java that was actually due to the GC. Python can also
have the same problems, but seems to have them less - perhaps because the
language is higher level. In order of prevalence, I have seen:

1. The application has a reference to a large and growing collection
which the programmer has forgotten about. Some call this a "memory
leak", but it is not really a leak since all the memory is in fact
reachable. I call it "data cancer", because it is an unwanted and often
fatal growth of a data structure.

2. A library has a non-Java resource such as a window, which is not
disposed because the application forgot to do so and the library programmer
forgot to do so in the finalizer. (Ditto for Python C extensions.)

3. A native code library has a good old fashioned C memory leak.
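The first failure mode above is easy to reproduce. A hedged sketch (the cache and helper below are invented for illustration): every entry stays reachable through the module-level dict, so no collector, reference-counting or otherwise, may ever reclaim it.

```python
# "Data cancer": an unbounded, forgotten cache. Everything in it stays
# reachable, so it is not a leak the garbage collector can fix.
cache = {}

def lookup(key):
    # stand-in for an expensive computation whose result gets cached
    if key not in cache:
        cache[key] = str(key) * 100
    return cache[key]

for i in range(10000):
    lookup(i)                # the cache only ever grows

assert len(cache) == 10000   # all 10,000 results remain reachable
```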

Python uses reference counting. This is the slowest form of garbage
collection, but it has the virtue that (apart from cycles) memory is
released at the earliest possible moment. Since the language is
interpreted anyway, the overhead for reference counting is not
objectionable. Other forms of GC are faster, but use more memory because
reclamation is delayed.

The problem with Python reference counting is that it encourages sloppy
programming like:

data = open('myfile','r').read()

depending on the reference-counting GC to release and close the file
object immediately when read() returns. This habit must be broken before
Python can evolve to Lisp-like speed. The proper code:

fp = open('myfile','r')
data = fp.read()
fp.close()

is not as pretty. Perhaps some clever pythonista will invent some
syntactic sugar to help the medicine go down.
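As a historical footnote, Python did eventually grow exactly this sugar: the with statement (PEP 343, Python 2.5, well after this thread) closes the file deterministically even if read() raises, on any implementation. A self-contained sketch (the temporary file exists only to make the example runnable):

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'myfile')
with open(path, 'w') as fp:
    fp.write('hello')

# The sugar: the file is closed when the block exits, even on an
# exception, without relying on reference counting.
with open(path, 'r') as fp:
    data = fp.read()

assert fp.closed
assert data == 'hello'
```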

--
Stuart D. Gathman <stu...@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

John Roth

Dec 22, 2002, 8:43:28 AM

"Stuart D. Gathman" <stu...@bmsi.com> wrote in message
news:8L9N9.36408$pe.13...@news2.east.cox.net...

I'm not certain I'd call this sloppy programming. It's a language
design choice. What you seem to be saying is that the garbage
collector shouldn't be required to call cleanup methods when it
releases garbage objects.

The trouble is, that subverts the entire rationale behind
garbage collection. If the programmer has to determine when
an object becomes garbage so it can be finalized properly, then we
might as well just go back to malloc and free.

John Roth

Oren Tirosh

Dec 22, 2002, 9:23:33 AM
On Sat, Dec 21, 2002 at 04:44:43PM -0500, John Roth wrote:
> Python, on the other hand, uses a reference counting strategy.
> That tends to release unused objects immediately the last
> reference vanishes, however, it has problems with objects
> that are linked in a cycle. The latest versions fix this (mostly,)
> but it's still wise to break cycles manually to get the best
> results.

The CPython implementation currently uses reference counting, but the
Jython implementation of the Python language uses the garbage collector
of the underlying JVM. Future versions of CPython could conceivably
use a different garbage collection strategy.

Oren

Bengt Richter

Dec 22, 2002, 12:22:54 PM

How about a file opening mode that means auto-close-when-you-hit-EOF?
Since code that seeks back from EOF would break, it can't be the default,
but it would allow new code to do, e.g., file(name,'R').read() with an
assured immediate close (assuming uppercase R meant that).
(Or maybe use 'rc'.) That would give independence from GC issues.

Regards,
Bengt Richter

Erik Max Francis

Dec 22, 2002, 6:25:04 PM
John Roth wrote:

> I'm not certain I'd call this sloppy programming. It's a language
> design choice. What you seem to be saying is that the garbage
> collector shouldn't be required to call cleanup methods when it
> releases garbage objects.

The problem is that it _is_ sloppy programming, even within Python. In
_CPython_, you're pretty much guaranteed that the file object will get
reclaimed (and thus closed) rather quickly. You've no such guarantee in
other Python implementations, such as Jython, where __del__ is invoked
via finalizers and thus runs whenever the object gets garbage collected,
which may be soon after the statement executes or long after.

Even in Python, you should make sure that code which acquires resources
releases them in a timely manner. Don't rely on reference counting and
object reclamation to do this, because you'll be asking for trouble in
other Python implementations.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE
/ \ God will forgive me; that's his business.
\__/ Heinrich Heine
Maths reference / http://www.alcyone.com/max/reference/maths/
A mathematics reference.

Isaac To

Dec 22, 2002, 10:23:44 PM
>>>>> "Erik" == Erik Max Francis <m...@alcyone.com> writes:

Erik> The problem is that it _is_ sloppy programming, even within
Erik> Python. In _CPython_, you're pretty much guaranteed that the file
Erik> object will get reclaimed (and thus closed) rather quickly.
Erik> You've no such guarantee in other Python implementations, such as
Erik> Jython (where the __del__ method calling and reclaiming is done
Erik> with finalizers) and thus happens whenever the object gets garbage
Erik> collected, which may be soon after the statement executes or long
Erik> after.

What if CPython is the only thing I use and care about? And you simply
cannot say that's broken if I have no way to test it (how about a flag of
Python to "disable garbage collection"? Then one can reasonably test
whether he is relying on the GC too much.).
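CPython does ship part of the switch asked for here: gc.disable() turns off the cyclic collector. What it cannot disable is reference counting itself, so the test is only partial, as this hedged sketch shows (the file name is invented):

```python
import gc
import os
import tempfile
import weakref

gc.disable()                 # disables only the *cyclic* collector

path = os.path.join(tempfile.mkdtemp(), 'data.txt')
with open(path, 'w') as out:
    out.write('x')

f = open(path)
probe = weakref.ref(f)
del f                        # reference counting reclaims (and closes) the
assert probe() is None       # file immediately, gc.disable() notwithstanding

gc.enable()
```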

Regards,
Isaac.

Erik Max Francis

Dec 22, 2002, 10:54:15 PM
Isaac To wrote:

> What if CPython is the only thing I use and care about?

Software tends to have a much longer lifetime and breadth than you first
expect. One day you may find yourself in a situation where you wish you
hadn't been so short-sighted when you first wrote that code.

> And, you simply cannot say that's broken if I have no way to test it
> (how about a flag of Python to "disable garbage collection"? Then one
> can reasonably test whether he is relying on the GC too much.).

Relying on a certain form of unspecified behavior is "broken," whether
or not it happens to work for you right here, right now.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ I want to know God's thought; the rest are details.
\__/ Albert Einstein
WebVal / http://www.alcyone.com/pyos/webval/
URL scanner, maintainer, and validator in Python.

Courageous

Dec 22, 2002, 11:05:55 PM

>Software tends to have a much longer lifetime and breadth than you first
>expect. One day you may find yourself in a situation where you wish you
>hadn't been so short-sighted when you first wrote that code.

I've only barely followed all this. But I'd say that, regardless of
someone's intent to rely on gc from one environment to the next, one
should explicitly call for the closure of files. Files are a resource
that is external to Python, and can have an impact on other programs
in the operating system. An explicit close is clear and helpful. This
thinking can be generalized to other situations; for example, with
sockets and so forth.

Hope this is on subject. :)

C//

Erik Max Francis

Dec 22, 2002, 11:40:18 PM
Courageous wrote:

> I've only barely followed all this. But I'd say that, regardless of
> someone's intent to rely on gc from one environment to the next, one
> should explicitly call for the closure of files. Files are a resource
> that is external to Python, and can have an impact on other programs
> in the operating system. An explicit close is clear and helpful. This
> thinking can be generalized to other situations; for example, with
> sockets and so forth.
>
> Hope this is on subject. :)

Yes, that's essentially exactly what I said a few posts ago: "Even in
Python, you should make sure that code which acquires [external]
resources releases them in a timely manner."

--

Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ Take my advice: Pull down your pants and slide on the ice.
\__/ Dr. Sidney Freedman
Computer science / http://www.alcyone.com/max/reference/compsci/
A computer science reference.

Isaac To

Dec 23, 2002, 12:14:17 AM
>>>>> "Erik" == Erik Max Francis <m...@alcyone.com> writes:

Erik> Software tends to have a much longer lifetime and breadth than you
Erik> first expect. One day you may find yourself in a situation where
Erik> you wish you hadn't been so short-sighted when you first wrote
Erik> that code.

Hm... what if by that time I have already switched to a completely different
language?

>> And, you simply cannot say that's broken if I have no way to test it
>> (how about a flag of Python to "disable garbage collection"? Then
>> one can reasonably test whether he is relying on the GC too much.).

Erik> Relying on a certain form of unspecified behavior is "broken,"
Erik> whether or not it happens to work for you right here, right now.

Okay, if you want to say it's broken, that's up to you, but then
everybody's code tends to be "broken" anyway.

Anyway, my point is not that it is good to rely on GC for things like
closing files (whether it is good depends on the definition of GC: if it
is "to collect unused memory", then it is bad; if it is "to collect
unused resources", then it is good, at least in some cases). It is that
currently, if you use CPython, there is no way, or it is very costly, to
test whether your code relies on "unspecified" behaviour of GCs.

Regards,
Isaac.

"Martin v. Löwis"

Dec 23, 2002, 12:47:49 AM, to Paul Foley
Paul Foley wrote:
> The "entire rationale behind garbage collection" is to present you
> with the illusion that your computer has infinite memory. If it
> actually *had* infinite memory, garbage collection would be a no-op.
> Where would your finalizers be then?

People often extend this rationale to "your computer has infinite
resources". Closing a file is then not necessary since you can have as
many open files as you want to, and finalizers wouldn't be needed for that.

Regards,
Martin

Courageous

Dec 23, 2002, 1:52:29 AM

>Yes, that's essentially exactly what I said a few posts ago: "Even in
>Python, you should make sure that code which acquires [external]
>resources releases them in a timely manner."

I wish I could provide an objective metric for this, but I think it's
mostly opinion and subjective. One bit of objective truth is this,
however: file.close() says quite clearly "I'm done with this." While
one can infer that if the reference to file no longer exists, one is
done with the file, that's an additional inferential step. Explicit
is better than implicit, sometimes. I think this is one of those times.

As another data point, if a junior programmer on one of my projects
relied on implicit file closing, I'd politely insist that he not.

It's simply my gut feel and years of experience which say "closing
explicitly is best practice."

C//

Erik Max Francis

Dec 23, 2002, 2:03:54 AM
Courageous wrote:

> I wish I could provide an objective metric for this, but I think it's
> mostly opinion and subjective. One bit of objective truth is this,
> however: file.close() says quite clearly "I'm done with this." While
> one can infer that if the reference to file no longer exists, one is
> done with the file, that's an additional inferential step. Explicit
> is better than implicit, sometimes. I think this is one of those
> times.

I'm a firm believer in "Explicit is better than implicit" in general
terms, but this is one of those cases where it's verging on a
requirement. Python as a language itself doesn't make any guarantees on
whether finalizers will get called in a timely manner, so if you have
some external resource tied to a Python object, letting the last
reference to it go away does not give any guarantees on when the
resource really will be released.

In the present implementation of CPython, of course, it happens
immediately after the last reference goes away, but Python as a language
does not make that guarantee, and at least one implementation, Jython,
follows Java's garbage collection model and makes no such guarantee.
Indeed, you've really no guarantee that CPython itself will even
continue to behave that way!

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ Whoever named it necking was a poor judge of anatomy.
\__/ Groucho Marx
CAGE / http://www.alcyone.com/pyos/cage/
A cellular automaton simulation system in Python.

Erik Max Francis

Dec 23, 2002, 2:05:57 AM
Isaac To wrote:

> Hm... what if by that time I have already switched to a completely
> different
> language?

What does that have to do with anything? Maybe the world will end
tomorrow. So what?

> Okay, if you want to say it's broken, that's up to you, but then the
> code of
> everybody tends to be "broken" anyway.

It's broken in that it may exhibit unexpected, pathological, and awful
behavior on other Python interpreters. Python is a language; CPython
(and Jython) are particular implementations. Write to the language, not
a particular implementation.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

ma...@pobox.com

Dec 23, 2002, 2:09:41 AM
John Roth <john...@ameritech.net> wrote:
> I'm not certain I'd call this sloppy programming. It's a language
> design choice. What you seem to be saying is that the garbage
> collector shouldn't be required to call cleanup methods when it
> releases garbage objects.

No, what he's saying is that it's sloppy practice to rely on having
resources (other than memory) reclaimed by the invisible hand of the
garbage collector. You seem to be under the misapprehension that Python
guarantees such timely finalization.

    Objects are never explicitly destroyed; however, when they become
    unreachable they may be garbage-collected. An implementation is
    allowed to postpone garbage collection or omit it altogether.
        -- Python language reference manual

> The trouble is, that subverts the entire rationale behind
> garbage collection.

Well, no. Garbage collection is a *memory* management tool.

> If the programmer has to determine when an object becomes garbage so
> it can be finalized properly, then we might as well just go back to
> malloc and free.

See, there, you just said it yourself: garbage collection is memory
management, not object finalization. Or am I mistaken and free()
performs finalization these days? <wink>

"Martin v. Löwis"

Dec 23, 2002, 2:30:20 AM
Erik Max Francis wrote:
> Indeed, you've really no guarantee that CPython itself will even
> continue to behave that way!

I can give my word if you want it...

Regards,
Martin


"Martin v. Löwis"

Dec 23, 2002, 2:34:46 AM
Erik Max Francis wrote:
> It's broken in that it may exhibit unexpected, pathological, and awful
> behavior on other Python interpreters. Python is a language; CPython
> (and Jython) are particular implementations. Write to the language, not
> a particular implementation.

It's not that there are hundreds of alternative Python implementations
out there. There are precisely two implementations, and that's why the
language definition explicitly makes no guarantee as to when the
finalizers are called. If you were to use just what the language
definition guarantees you, you could not write a single useful application.

Regards,
Martin


ma...@pobox.com

Dec 23, 2002, 2:41:56 AM

However, this sort of argument won't get you very far when the external
resource is something like a write lock on a shared data store!

Erik Max Francis

Dec 23, 2002, 2:59:04 AM
"Martin v. Löwis" wrote:

> It's not that there are hundreds of alternative Python implementations
> out there. There are precisely two implementations, and that's why the
> language definition makes explicitly no guarantee as to when the
> finalizers are called.

And that's precisely why you should not rely on any such behavior in a
portable application.

> If you were just to use what the language
> definition guarantees you, you could not write a single useful
> application.

Huh? Presuming by "language definition" you mean the documentation
available at python.org, I don't see how this comment makes sense.

Erik Max Francis

Dec 23, 2002, 3:01:49 AM
"Martin v. Löwis" wrote:

> I can give my word if you want it...

Even with such a guarantee given in good faith (if it were even needed
at all), there's still no assurances that circumstances couldn't come
along that could void that guarantee. You could move on to other
things, or some other faction would take over CPython development and
turn it in new directions, etc.

The point I was trying to make is about what guarantees the language
makes, rather than the particular details of a certain implementation.
I very much doubt that CPython's finalization behavior will ever
change; but that was not the point.

Brian Quinlan

Dec 23, 2002, 3:31:48 AM
> Even with such a guarantee given in good faith (if it were even needed
> at all), there's still no assurances that circumstances couldn't come
> along that could void that guarantee. You could move on to other
> things, or some other faction would take over CPython development and
> turn it in new directions, etc.

Tim Peters has made that guarantee in the past as well.



> The point I was trying to make is about what guarantees the language
> makes, rather than the particular details of a certain implementation.
> I very much doubt that CPython's finalization behavior will ever
> change; but that was not the point.

What guarantees does the language make? I can't think of any...

Cheers,
Brian


Erik Max Francis

Dec 23, 2002, 4:09:17 AM
Brian Quinlan wrote:

> Tim Peters has made that guarantee in the past as well.

Again, I never claimed it was likely or even plausible that the behavior
would change in CPython.

> What guarantees does the language make? I can't think of any...

Everything that the python.org documentation describes as the expected
behavior of Python and is not disclaimed with "Do not rely on this for
X," "This feature only available in Y," etc.

The documentation explicitly says that one should not rely on finalizers
being called in a timely manner or even at all in some cases. So one
should not rely on it. I'm really surprised at the level of furor over
such a basic point that even the documentation specifically mentions.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ There is nothing so subject to the inconstancy of fortune as war.
\__/ Miguel de Cervantes
EmPy / http://www.alcyone.com/pyos/empy/
A templating system for Python.

Ype Kingma

Dec 23, 2002, 4:14:36 AM
Stuart,

> On Sat, 21 Dec 2002 16:19:19 -0500, Robert Oschler wrote:

<snip>



>
> The problem with Python reference counting, is that it encourages sloppy
> programming like:
>
> data = open('myfile','r').read()
>
> depending on the reference counting GC to release and close the file
> object immediately when read() returns. This habit must be broken before
> Python can evolve to Lisp like speed. The proper code:
>
> fp = open('myfile','r')
> data = fp.read()
> fp.close()
>
> is not as pretty. Perhaps some clever pythonista will invent some
> syntactic sugar to help the medicine go down.

Untested code:

def openread(fname):
    try:
        fp = open(fname)
        return fp.read()
    finally:
        fp.close()

Python has enough syntax already.

Have fun,
Ype


--
email at xs4all.nl

Derek Thomson

Dec 23, 2002, 4:29:35 AM
Erik Max Francis wrote:
>
> The documentation explicitly says that one should not rely on finalizers
> being called in a timely manner or even at all in some cases. So one
> should not rely on it. I'm really surprised at the level of furor over
> such a basic point that even the documentation specifically mentions.
>

Because some people don't like it, and think it's a step backwards?

What is the point of having destructors, if they might not be called?
That really annoys me about Java - you can't clean up non-memory
resources transparently. You must add "close" methods if your instance
needs to clean up other resources. And if the class suddenly needs it
where it didn't before, you now need to fix *all* usages of the class.
So much for encapsulation!

I really think destructors were a step forward in C++, as you couldn't
forget to call arbitrary cleanup functions like "close". I was happy to
be rid of such calls in the step from C to C++, and now in Java and
Jython I have to use them again. Sigh.

I can see the benefit of mark and sweep in terms of efficiency, but not
having destructors is a real step backwards for reducing resource leaks IMO.

One way out within the mark and sweep model might be to make "garbage
collection" extensible, so that it can manage other limited resources,
not just memory. That way we could have standard "managers" for files
and sockets, and we would also be able to add our own.

--
D.

Derek Thomson

Dec 23, 2002, 4:44:25 AM
Stuart D. Gathman wrote:
>
> Python uses reference counting. This is the slowest form of garbage
> collection, but it has the virtue that (apart from cycles) memory is
> released at the earliest possible moment.

It has the advantage that it is deterministic, and therefore destructors
can be used to manage non-memory resources, release locks and so on.

Until recently, that is. Suddenly we need to support mark-and-sweep as
well, and so we don't have destructors any more, and need to resort to
"close" methods.

> The problem with Python reference counting, is that it encourages sloppy
> programming like:
>
> data = open('myfile','r').read()
>
> depending on the reference counting GC to release and close the file
> object immediately when read() returns. This habit must be broken before
> Python can evolve to Lisp like speed. The proper code:
>
> fp = open('myfile','r')
> data = fp.read()
> fp.close()

Let he who is without sin, etc.

This code is almost as sloppy. What happens if "read" throws an exception?

>
> is not as pretty. Perhaps some clever pythonista will invent some
> syntactic sugar to help the medicine go down.
>

There is no way. I wish there were. This is a real problem in Java, too.
How do I release locks and non-memory resources without adding functions
like "close" that people have to remember to call all the time?

I understand the benefits of mark and sweep, but there is a real cost
associated with it in terms of potential bugs and creating easy to use
classes.

--
D.

Erik Max Francis

Dec 23, 2002, 4:47:45 AM
Ype Kingma wrote:

> Untested code:
>
> def openread(fname):
>     try:
>         fp = open(fname)
>         return fp.read()
>     finally:
>         fp.close()
>
> Python has enough syntax already.

Really this should be:

fp = open(fname)
try:
    return fp.read()
finally:
    fp.close()

If open throws, then the fp name will not be bound, so on the way out
the finally clause will generate a NameError.

Ype Kingma

Dec 23, 2002, 4:44:07 AM
Erik,
Thanks,
Ype

>
>> Untested code:
>>
>> def openread(fname):
>>     try:
>>         fp = open(fname)
>>         return fp.read()
>>     finally:
>>         fp.close()

> fp = open(fname)

Lulu of the Lotus-Eaters

Dec 23, 2002, 4:14:05 AM
|"Martin v. Löwis" wrote:
|> I can give my word if you want it...

Erik Max Francis <m...@alcyone.com> wrote previously:


|Even with such a guarantee given in good faith (if it were even needed
|at all), there's still no assurances that circumstances couldn't come
|along that could void that guarantee. You could move on to other
|things, or some other faction would take over CPython development and
|turn it in new directions, etc.

Well sure... you also have no *guarantee* that Guido won't decide that
scalar variables must all be prefixed with '$' in Python 3.0 (or
2.3final, for that matter).

I trust Martin's word here, though... even if it is a matter of (good)
percentages.

Yours, Lulu...

P.S. Besides, I let finalization close my files 90% of the time myself
:-)... and I don't think I'd be characterized as a "junior programmer"
very much. Then again, I admit that I like to write command-line
utilities that pretty much want their files until the run is completed
(i.e. a couple seconds later), which probably makes up that 90%.

--
mertz@ _/_/_/_/ THIS MESSAGE WAS BROUGHT TO YOU BY: \_\_\_\_ n o
gnosis _/_/ Postmodern Enterprises \_\_
.cx _/_/ \_\_ d o
_/_/_/ IN A WORLD W/O WALLS, THERE WOULD BE NO GATES \_\_\_ z e


Erik Max Francis

Dec 23, 2002, 5:17:25 AM
Lulu of the Lotus-Eaters wrote:

> Well sure... you also have no *guarantee* that Guido won't decide that
> scalar variables must all be prefixed with '$' in Python 3.0 (or
> 2.3final, for that matter).
>
> I trust Martin's word here, though... even if it is a matter of (good)
> percentages.

My comment about CPython potentially changing was really a side point.
I myself have no doubt that CPython's finalization behavior will never
change. Assurances were offered for a point that didn't need assuring;
it was a theoretical aside in the first place. My point was simply that
Python as a language does not guarantee prompt and certain
finalization, regardless of what CPython does.

What was a minor aside ended up being inflated into a major issue. It
was really a side comment; I was simply pointing out that since the
behavior is marked as unspecified, for all intents and purposes CPython
_could_ change and that would be just fine according to the ad hoc
"specification" that the documentation constitutes. I never suggested
that this was actually likely or even plausible.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ The only completely consistent people are the dead.
\__/ Aldous Huxley
Rules for Buh / http://www.alcyone.com/max/projects/cards/buh.html
The official rules to the betting card game, Buh.

"Martin v. Löwis"

Dec 23, 2002, 5:24:50 AM
Erik Max Francis wrote:
> Even with such a guarantee given in good faith (if it were even needed
> at all), there's still no assurances that circumstances couldn't come
> along that could void that guarantee. You could move on to other
> things, or some other faction would take over CPython development and
> turn it in new directions, etc.

But the same applies to any other guarantee that somebody may give you.
It is inherently hard to predict the future.

> The point I was trying to make is about what guarantees the language
> makes, rather than the particular details of a certain implementation.

But any guarantees that "the language" gives are just as void as any
guarantee that I give. It is trivial for the maintainers to declare
something as a documentation bug, and that has happened many times in
the past, and will happen in the future.

Regards,
Martin

"Martin v. Löwis"

Dec 23, 2002, 5:37:54 AM
Erik Max Francis wrote:
> My point was simply that
> Python as a language does not guarantee prompt and certain
> finalization, regardless of what CPython does.

And I still believe that "Python as a language" is a myth (a unicorn, to
be precise :-). There is no formal language definition, just a "language
reference", and that is changed with every release of CPython. In
addition to that, there are various books that describe "the language",
two fundamentally different "implementations" of it, and many
implementations that differ in detail. Furthermore, people rightly
consider "the standard library" to be a proper part of "the language".
The library varies across systems even within a single release (and for
good reasons).

> What was a minor aside ended up being inflated into a major issue. It
> was really a side comment; I was simply pointing out that since the
> behavior is marked as unspecified, for all intents and purposes CPython
> _could_ change and that would be just fine according to the ad hoc
> "specification" that the documentation constitutes.

See, and this is precisely where you are mistaken. Whether a change can
or cannot happen is only marginally affected by what the language
reference says. The commitment to not breaking too many applications
across releases of CPython is a much bigger influence. The reference
counting is so ingrained into the implementation, and the many extension
modules, and a large number of applications, that there is no chance
that this will change in a foreseeable future, independent of what "the
language" says.

As I just said, predicting the future is always difficult, and it only
becomes a little easier by taking all facts into account. Of course, for
"mere users", taking implementation details into account may not be
advisable, so relying on the documentation is good advice, in general.
In this specific case, trusting that reference counting is a fact should
be allowed, though.

Regards,
Martin


Erik Max Francis

unread,
Dec 23, 2002, 5:41:27 AM12/23/02
to
"Martin v. Löwis" wrote:

> But any guarantees that "the language" gives are just as void as any
> guarantee that I give. It is trivial for the maintainers to declare
> something as a documentation bug, and that has happened many times in
> the past, and will happen in the future.

The difference here is that for the de facto language specification to
change the finalization behavior from unspecified to that of CPython's
and only CPython's would mean they'd have to declare Jython not really a
Python implementation. That seems, to say the least, a lot more
unlikely.

All I was saying was that the finalization behavior is unspecified, and
for very good reason: There is already a Python implementation that
behaves differently than CPython as far as finalization is concerned,
and the Python documentation explicitly acknowledges this as legitimate.
The whole "even CPython could change" subtext was an utterly trivial,
theoretical point that has been blown way out of proportion.

What I was saying is that you need to safeguard external critical
resource acquisition and release with something more stringent than
relying on finalization, and that is because the Python documentation
even says so. That is all.


"Martin v. Löwis"

unread,
Dec 23, 2002, 5:50:05 AM12/23/02
to
Erik Max Francis wrote:
>>If you were just to use what the language
>>definition guarantees you, you could not write a single useful
>>application.
>
>
> Huh? Presuming by "language definition" you mean the documentation
> available at python.org, I don't see how this comment makes sense.

There is a number of things that you cannot strictly rely on. For
example, there is no guarantee that "Hello" is a valid string literal -
parsing it may cause a MemoryError, if the machine has not sufficient
memory to represent the literal. Likewise, 42L may not be a valid long
integer literal. There is a (near) guarantee that integer literals can
be up to 2147483647 (although it strictly speaking does not guarantee
this), see

http://www.python.org/doc/current/ref/integers.html

but there is no guarantee for long literals - anything may be
constrained by "available memory". So an implementation that gives a
MemoryError on 42L would be, strictly speaking, conforming.

Going on to the library, there is, strictly speaking, no guarantee that
you can use files. It says a filename is passed to stdio's fopen, which,
in turn, gives no guarantee that there is a single valid file name on a
system, let alone guaranteeing that "foo.txt" is a valid file name.

So you can't use files, long integers, and strings; pretty useless
language, I'd say :-)

Regards,
Martin

"Martin v. Löwis"

unread,
Dec 23, 2002, 5:53:57 AM12/23/02
to
Ype Kingma wrote:
> def openread(fname):
> try:
> fp = open(fname)
> return fp.read()
> finally:
> fp.close()
>
> Python has enough syntax already.

Nearly correct: the open must be outside the try, or else fp may not be
assigned in the finally clause.

Regards,
Martin
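For reference, here is Ype's sketch with Martin's correction applied (same function name; only the placement of the open changes):

```python
def openread(fname):
    # open() sits outside the try: if it raises, fp was never bound,
    # and there is nothing to close.
    fp = open(fname)
    try:
        return fp.read()
    finally:
        fp.close()
```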

"Martin v. Löwis"

unread,
Dec 23, 2002, 5:52:37 AM12/23/02
to
ma...@pobox.com wrote:
>>People often extend this rationale to "your computer has infinite
>>resources". Closing a file is then not necessary since you can have as
>>many open files as you want to, and finalizers wouldn't be needed for that.
>
> However, this sort argument won't get you very far when the external
> resource is something like a write lock on a shared data store!

Indeed. That's why a write lock shouldn't be released in a finalizer,
but in an explicit release operation. Finalizers are only good for
resources where your program would behave correctly if you had unlimited
supply and no finalizers would be called.

Regards,
Martin

"Martin v. Löwis"

unread,
Dec 23, 2002, 6:00:50 AM12/23/02
to
Erik Max Francis wrote:
> The difference here is that for the de facto language specification to
> change the finalization behavior from unspecified to that of CPython's
> and only CPython's would mean they'd have to declare Jython not really a
> Python implementation. That seems, to say the least, a lot more
> unlikely.

It *is* difficult to predict the future. Jython may become irrelevant,
or Java may grow an extension to invoke finalizers as soon as the object
becomes unreachable.

> The whole "even CPython could change" subtext was an utterly trivial,
> theoretical point that has been blown way out of proportion.

But what, if CPython doesn't change, is the problem with relying on its
implementation details? The application may not work the same way on
Jython. Perhaps this is the case even without relying on refcounting to
close files, or perhaps it is irrelevant whether this application runs
on Jython, as it is meant to be shipped as a frozen HP-UX binary.

> What I was saying is that you need to safeguard external critical
> resource acquisition and release with something more stringent than
> relying on finalization, and that is because the Python documentation
> even says so. That is all.

I completely understand, and I think you are wrong. You don't need
further safeguards; the ones you have are sufficient, at least for some
(the majority?) of the applications.

Regards,
Martin

"Martin v. Löwis"

unread,
Dec 23, 2002, 6:06:33 AM12/23/02
to
Erik Max Francis wrote:
> The documentation explicitly says that one should not rely on finalizers
> being called in a timely manner or even at all in some cases. So one
> should not rely on it. I'm really surprised at the level of furor over
> such a basic point that even the documentation specifically mentions.

It is no surprise that the documentation must mention this; there is
precedent that "some implementations" behave differently (and this note
was explicitly added to make JPython a conforming implementation). The
documentation has no chance to observe the finer points of all this, as
it must take the option into account that somebody produces another
Python implementation based on it (and we do encourage people doing so).

However, it is also not surprising that people defend their programming
practice to rely on the open(filename).read() pattern to DTRT: Nobody
wants to be accused of being a sloppy programmer. If you add the facts
of life to the facts of documentation, you see that there is really
nothing wrong with such a programming practice - if you are aware of all
the finer points.

Regards,
Martin


Erik Max Francis

unread,
Dec 23, 2002, 6:08:55 AM12/23/02
to
"Martin v. Löwis" wrote:

> See, and this is precisely where you are mistaken. Whether a change
> can
> or cannot happen is only marginally affected by what the language
> reference says.

Changes can always happen, of course. But if you're writing software in
a language and want that software to continue to work tomorrow, your
best bet is to follow the guidelines that are set forth by whatever
standards, reference implementations, or documentation that you have
available. Since for Python, that's the documentation available on
python.org, and that documentation _explicitly_ states that relying on
timely finalization is a bad idea, relying upon it is, well, a bad idea
if you want to write software that continues to work properly down the
road.

> The commitment to not breaking too many applications
> across releases of CPython is a much bigger influence. The reference
> counting is so ingrained into the implementation, and the many
> extension
> modules, and a large number of applications, that there is no chance
> that this will change in a foreseeable future, indepedent of what "the
> language" says.

As I have said, and probably will have to say many more times before
this accursed thread dies, the suggestion that the finalization behavior
of CPython might change was entirely theoretical. I have no doubt that
it will never change for the life of CPython. An utterly theoretical,
trivial point has been inflated as if it were the main focus of my
comment, which it never was.

I do not think that CPython's finalization behavior will change. I
suggested it merely as a theoretical possibility. Please stop treating
it as if it were all I had ever said on the subject.

Erik Max Francis

unread,
Dec 23, 2002, 6:38:28 AM12/23/02
to
"Martin v. Löwis" wrote:

> It *is* difficult to predict the future. Jython may become irrelevant,
> or Java may grow an extension to invoke finalizers as soon as the
> object
> becomes unreachable.

If Jython becomes irrelevant or Java changes their finalizers and those
changes become widespread such that the Python documentation changes the
unspecified clause to a guarantee of immediate finalization, then I
would change my recommendation.

> I completely understand, and I think you are wrong. You don't need
> further safeguards; the ones you have are sufficient, at least for some
> (the majority?) of the applications.

_If you're using CPython_. If you're not, then they're not. If general
Python -- not just CPython -- portability is a concern, then this in my
opinion is bad advice.

Laura Creighton

unread,
Dec 23, 2002, 6:15:54 AM12/23/02
to

Assertion: Any programming language that makes me worry about the low-level
details of file closing is not a high enough level of a language
to do the sorts of programming I wish to do.

Now you can line up wherever you like in defence or condemnation of this
assertion, but that is where this argument goes. 'Sloppy' has nothing to
do with it -- these people are leaving out closes, _on purpose_ and
removing them when they find them, because they think it enhances code
readability. Working on jython is not on their list of priorities.

Laura

Ganesan R

unread,
Dec 23, 2002, 7:03:08 AM12/23/02
to
>>>>> "Martin" == Martin v Löwis <mar...@v.loewis.de> writes:

> However, it is also not surprising that people defend their
> programming practice to rely on the open(filename).read() pattern to
> DTRT: Nobody wants to be accused of being a sloppy programmer. If you
> add the facts of life to the facts of documentation, you see that
> there is really nothing wrong with such a programming practice - if
> you are aware of all the finer points.

I guess, that's really the key thing - _if_ you are aware of all the finer
points. This is like assuming "int" is 32 bits in C. You know that on most
current hardware "int" is 32 bits, and that "int" is 32 bits in most 64-bit C
implementations too. So there's really nothing "wrong" in that programming
practice.

However when you were programming in 16-bit DOS, it wouldn't have been a
good idea to assume 16-bits for an int even if you never intended to port
that program to 32-bit hardware. 32-bit hardware is out there and you never
know if somebody else wants that little program you wrote on that hardware.
That's why, FWIW, I tend to agree with Erik's point of view.

This programming practice ought to be deprecated _unless_ the programmer is
fully aware of what (s)he's doing. Take a module writer for example. He may
not care about Jython and choose to depend on the finalizer. Yes, Jython may
disappear tomorrow; but the fact is it's a reality today and there are many
users out there using Jython. Our fictional module writer might have
rendered his module useless for Jython users today.

I code in Java, CPython and Jython and I _am_ frustrated by the lack of
destructors in Java/Jython. It's a pain to take care of everything in a
"finally" block while a simple destructor will do the job. However, Java
garbage collectors are a fact of life and it's not a good idea to make
simplifying assumptions.

I wish there were an equivalent of lexically scoped objects (auto objects in
C/C++) as opposed to references in Java/Python. At least, the language can
guarantee destruction for those objects. It won't simplify everything, but
at least I can use a simpler locking idiom :-).

--
Ganesan R

Samuele Pedroni

unread,
Dec 23, 2002, 7:27:37 AM12/23/02
to
>
> It *is* difficult to predict the future.

Indeed.

> Jython may become irrelevant,
> or Java may grow an extension to invoke finalizers as soon as the object
> becomes unreachable.
>

CPython and Jython could both become irrelevant, (e.g. slowly under MS OSes)
Python would be still a neat idea with maybe some other implementation.

Honestly not closing files explicitly is a bad idea, unless for quick hacks.

regards.


Andrew Dalke

unread,
Dec 23, 2002, 5:06:45 AM12/23/02
to Stuart D. Gathman
Stuart D. Gathman wrote:
> The problem with Python reference counting, is that it encourages sloppy
> programming like:
>
> data = open('myfile','r').read()
>
> depending on the reference counting GC to release and close the file
> object immediately when read() returns. This habit must be broken before
> Python can evolve to Lisp like speed. The proper code:
>
> fp = open('myfile','r')
> data = fp.read()
> fp.close()
>
> is not as pretty.

Actually, that's not proper either. Suppose the read fails and
raises an exception. Then the 'close' will never be called and you
are again dependent on the implementation defined garbage collection
behaviour.

The proper code is more like

fp = open('myfile', 'r')

try:
data = fp.read()
finally:
fp.close()

This guarantees that fp.close() will be called no matter what,
and if an exception was raised in the try block it will be
propagated upwards after the finally clause executes.


However, if there's an exception in the finally clause then
that's the one which will be propagated, as you can see in

>>> try:
... 1/0
... finally:
... qwert
...
Traceback (most recent call last):
File "<stdin>", line 4, in ?
NameError: name 'qwert' is not defined
>>>

So if you want to be really, really correct, you might try
something more like

fp = open('myfile', 'r')

try:
data = fp.read()
finally:
try:
fp.close()
except IOError:
pass

See why automatic garbage collection is so nice?

Andrew
da...@dalkescientific.com
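The masking behaviour in Andrew's interpreter session can also be captured programmatically; a small sketch in today's syntax (the function name is invented):

```python
def which_exception_wins():
    # Mirrors Andrew's session: an exception raised in a finally clause
    # replaces the one raised in the try block.
    try:
        try:
            1 / 0                         # raises ZeroDivisionError
        finally:
            raise KeyError("from finally")
    except Exception as exc:
        return type(exc).__name__

# The finally clause's KeyError propagates, not the ZeroDivisionError.
```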

"Martin v. Löwis"

unread,
Dec 23, 2002, 7:31:31 AM12/23/02
to
Erik Max Francis wrote:
> Since for Python, that's the documentation available on
> python.org, and that documentation _explicitly_ states that relying on
> timely finalization is a bad idea, relying upon it is, well, a bad idea
> if you want to write software that continues to work properly down the
> road.

It doesn't say it is a bad idea. Instead, 3.1 says

# An implementation is allowed to postpone garbage collection or omit it
# altogether ... the current implementation uses a reference-counting
# scheme ... which collects most objects as soon as they become
# unreachable

The documentation contains no assessment as to whether relying on the
current implementation is a bad idea, nor an indication as to whether
future implementations may differ (and we both know they won't).

I have explained the rationale for having these statements in the
documentation (to make JPython a conforming implementation). Given this
rationale, I still think that relying on the current implementation is a
good idea in many cases.

> I do not think that CPython's finalization behavior will change. I
> suggested it merely as a theoretical possibility. Please stop treating
> it as if it were all I had ever said on the subject.

In the previous message you said "for all intents and purposes CPython
_could_ change". Maybe I misunderstood "for all intents and purposes"
(I'm still uncertain what the exact translation of this phrase to German
would be), or perhaps I misunderstood "_could_" (although I'm quite sure
that I do understand "could").

Regards,
Martin

Isaac To

unread,
Dec 23, 2002, 7:21:22 AM12/23/02
to
>>>>> "Erik" == Erik Max Francis <m...@alcyone.com> writes:

>> Hm... what if by that time I have already switched to a completely
>> different language?

Erik> What does that have to do with anything? Maybe the world will end
Erik> tomorrow. So what?

Any reasonable person will try to judge how likely different events will
happen in the future, and accordingly try to choose the policy that increase
his expected gain, averaged over near and long terms. If I judge that
switching from C-Python to JPython is very unlikely to happen to me soon,
that the behaviour of C-Python that I rely on is very unlikely to change
soon, and on the other hand I might switch completely to another language
soon, then I'll definitely not make a policy to make sure code will work in
C-Python and JPython even if the cost for achieving that is high (more
testing, less clean code, more lines of code, etc). If you cannot change my
expectation of the language or my expectation of whether I'll switch to
another implementation of Python, then your best hope to change my behaviour
is to reduce the cost of having it done "the right way".

Regards,
Isaac.

"Martin v. Löwis"

unread,
Dec 23, 2002, 7:47:20 AM12/23/02
to
Ganesan R wrote:
> I guess, that's really the key thing - _if_ you are aware of all the finer
> points. This is like assuming "int" is 32 bits in C. You know that on most
> current hardware "int" is 32 bits, and that "int" is 32 bits in most 64-bit C
> implementations too. So there's really nothing "wrong" in that programming
> practice.

It turns out that relying on 32-bit ints is a much more serious problem
for Python than relying on refcounting. People just *know* that
0x80000000 is a negative number; an attempt to change this for Python
2.4 turns out to be very difficult to implement without breaking too
much code.
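The change Martin describes did eventually go in as part of the int/long unification (PEP 237): in current Python a plain hex literal is always non-negative. A quick check:

```python
# After the int/long unification, hex literals no longer wrap to negative
# values; 0x80000000 is simply 2**31.
value = 0x80000000
assert value == 2 ** 31       # a positive integer, not -2147483648
assert value > 0
```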

People have these assumptions in their code whether they are aware of
them or not. This constrains language evolution, but I consider it a
good thing: the language developers need to understand what hidden
assumptions people rely on, and they need to make an explicit choice to
break those assumption. When they do that, they need to find a way to
either not break the existing code, or to give developers mechanisms to
make the hidden assumptions visible. For Python, the mechanisms are
documented in PEP 5.

> This programming practice ought to be deprecated _unless_ the programmer is
> fully aware of what (s)he's doing. Take a module writer for example. He may
> not care about Jython and choose to depend on the finalizer. Yes, Jython may
> disappear tomorrow; but the fact is it's a reality today and there are many
> users out there using Jython. Our fictional module writer might have
> rendered his module useless for Jython users today.

I find the Jython argument not very convincing. If people need to
support both CPython and Jython, they have much bigger problems than
finalizers. Most likely, some essential library they use is not
available for Jython, so they have to actively test their application on
Jython. They should also review their code, to find where they rely on
finalizers, and perhaps make other, non-portable assumptions in it.

Regards,
Martin

"Martin v. Löwis"

unread,
Dec 23, 2002, 7:59:14 AM12/23/02
to
Paul Foley wrote:
> But you can tell the difference between leaving it open and closing
> it. If the "file" is a socket

But it isn't. The specific code in question was open(filename).read().
That you have to close a socket when you are done to perform graceful
shutdown of the connection is out of question. Of course, there may be
applications where even that is not necessary, e.g. if it is acceptable
that the socket stays open as the program is short-lived, or as the
other end will close the connection after some time of inactivity anyway.

> And if you can't tell the difference, why do you care when the
> finalization happens? Instantly or 10 months later, it makes no
> difference.

Right. The only problem is that the resources aren't unlimited, and the
number of open files is often quite limited. You will notice an
exhaustion of the resource, so you want the system to reclaim unneeded
resources before they contribute to exhaustion.

> [Also, I'd expect a finalizer for a database transaction, for example,
> to /abort/ the transaction, not commit it! Don't you agree?

There should be always an option to explicitly release the resource; if
there are different alternatives, refuse the temptation to guess.

Regards,
Martin

Ype Kingma

unread,
Dec 23, 2002, 8:04:01 AM12/23/02
to
Andrew Dalke wrote:

> Stuart D. Gathman wrote:
>> The problem with Python reference counting, is that it encourages sloppy
>> programming like:
>>
>> data = open('myfile','r').read()

<snip>

>
> So if you want to be really, really correct, you might try
> something more like
>
> fp = open('myfile', 'r')
> try:
> data = fp.read()
> finally:
> try:
> fp.close()
> except IOError:
> pass

To be correct the exception traceback should be printed
because that is what happens to uncaught exceptions during
__del__().

> See why automatic garbage collection is so nice?

Jython has automatic garbage collection, but it does need
try/finally constructs because of the unspecified delay
before the __del__() call.

Regards,
Ype

Robert Oschler

unread,
Dec 23, 2002, 8:50:41 AM12/23/02
to

"Derek Thomson" <de...@wedgetail.com> wrote in message
news:3IAN9.477$9f.1...@news.optus.net.au...

> Erik Max Francis wrote:
> >
>
> What is the point of having destructors, if they might not be called?
> That really annoys me about Java - you can't clean up non-memory
> resources transparently. You must add "close" methods if your instance
> needs to clean up other resources. And if the class suddenly needs it
> where it didn't before, you now need to fix *all* usages of the class.
> So much for encapsulation!
>
> I really think destructors were a step forward in C++, as you couldn't
> forget to call arbitrary cleanup functions like "close". I was happy to
> eliminate them in the step from C to C++, and now in Java and Jython I
> have to use them again. Sigh.
>
> I can see the benefit of mark and sweep in terms of efficiency, but not
> having destructors is a real step backwards for reducing resource leaks
> IMO.
>
> One way out within the mark and sweep model might be to make "garbage
> collection" extensible, so that it can manage other limited resources,
> not just memory. That way we could have standard "managers" for files
> and sockets, and we would also be able to add our own.
>
> --

Derek,

A big hearty agreement. I've been pushing the "destructors please" issue for
quite some time too. I've never heard anyone give a good reason for why
they were dropped from Java. Most arguments are based on the concept that
"destructors are for memory management" and that Java doesn't need them for
that since it is garbage collected. But as we both agree "destructors are
also for _resource_ management" and it amazes me that gc based language
developers miss this time and time again.

thx

"Martin v. Löwis"

unread,
Dec 23, 2002, 9:17:32 AM12/23/02
to
Robert Oschler wrote:
> A big hearty agreement. I've been pushing the "destructors please" issue for
> quite some time too.

Can you please explain what a destructor is? If it is a fragment of code
executed when an object ends its life, what exactly is it that you are
pushing, since both Python and Java have such a feature?

If it is an explicit statement "this object ends its life now":
a) what happens to references that still exist to that object? and
b) why can't you use a method for that?

Regards,
Martin


Peter Hansen

unread,
Dec 23, 2002, 9:29:11 AM12/23/02
to
I offer this merely as representative of one of the scenarios that
could happen if one fails to close files explicitly.

Admittedly this is not a typical sequence, but the point is, you cannot
*always* rely on the CPython finalization behaviour even in CPython, so
why waste time guessing when you can, and when you can't?


Python 2.0 (#8, Oct 16 2000, 17:27:58) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> f = open('test.fil', 'w')
>>> f.write('this is a test')
>>>
>>> # other code might be here
...
>>> g = open('test.fil', 'r')
>>> x = g.read()
>>> x
''
>>> print x

>>> hello, x, where are you?
File "<stdin>", line 1
hello, x, where are you?
^
SyntaxError: invalid syntax
>>> f.close()
>>> x = g.read()
>>> x
'this is a test'
>>>

-Peter
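Peter's session boils down to write buffering: with CPython's default block buffering on a regular file, the data sits in the writer's buffer until flush() or close(). A condensed sketch of the same scenario (file name and layout are illustrative):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "test.fil")

f = open(path, "w")
f.write("this is a test")     # a small write: it stays in f's buffer

g = open(path)
before = g.read()             # nothing has reached the file on disk yet
g.close()

f.flush()                     # or f.close(); either empties the buffer
g = open(path)
after = g.read()
g.close()
f.close()
```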

Robert Oschler

unread,
Dec 23, 2002, 11:28:15 AM12/23/02
to

"Martin v. Löwis" <mar...@v.loewis.de> wrote in message
news:au75tt$jd0$04$1...@news.t-online.com...

Martin,

I don't want to reiterate what's already been said on this thread so pardon
the brief reply, but Python:__del__ and Java:finalize() cannot be relied
upon to be called as soon as an object's reference count goes to 0, so they
"have the feature" but not the call timing guarantee. In C++, using an
auto_ptr example, the instant an object's reference count goes to 0 it's
destructor will be called. So if I'm using a critical resource like a
file/record lock, or perhaps a scarce resource like a speech reco engine, I
can feel confident with C++ that as soon as it's no longer needed by one
part of the program, it will become available for another, and I don't have
make the the cleanup call (destructor) myself so I don't run the risk of
forgetting to make a cleanup call.

thx

"Martin v. Löwis"

unread,
Dec 23, 2002, 12:08:54 PM12/23/02
to
Robert Oschler wrote:
> I don't want to reiterate what's already been said on this thread so pardon
> the brief reply

Robert,

This won't be necessary, either, since I think I followed the thread
closely. Your message surprised me since it appeared to be out of
context. Before saying that it was indeed out of context, I hoped that
you might want to elaborate a bit.

> but Python:__del__ and Java:finalize() cannot be relied
> upon to be called as soon as an object's reference count goes to 0, so they
> "have the feature" but not the call timing guarantee.

No doubt about that (actually, there is doubt about with respect to
CPython; this is the core of the current thread, but clearly not of your
posting).

> In C++, using an auto_ptr example, the instant an object's reference
> count goes to 0 its destructor will be called.

That is certainly also true.

However, I'm still curious as to what precisely the '"destructors
please" issue' is that you have been pushing for quite some time, and
what precisely it would mean to add destructors to Python and Java. I'm
speaking as a language designer here, so I'm not too interested in how
you would use the facility until I know what the facility is.

Regards,
Martin

A. Lloyd Flanagan

unread,
Dec 23, 2002, 1:34:56 PM12/23/02
to
"Martin v. Löwis" <mar...@v.loewis.de> wrote in message news:<au75tt$jd0$04$1...@news.t-online.com>...
> Robert Oschler wrote:
> > A big hearty agreement. I've been pushing the "destructors please" issue for
> > quite some time too.
>
> Can you please explain what a destructor is? If it is a fragment of code
> executed when an object ends its life, what exactly is it that you are
> pushing? since both Python and Java have such a feature.

A destructor is like a finalization method: it only gets called when
all references to the object are gone. The difference between C++
destructors and Java's finalize() or Python's __del__() is that C++
_guarantees_ that the destructor will get called sometime before the
program exits (barring a core dump or other catastrophe). This turns
out to be terribly useful for things like making sure an external
resource gets cleaned up when it's not needed, and not before.

From the Python reference manual: "It is not guaranteed that __del__()
methods are called for objects that still exist when the interpreter
exits."

So I think what he's saying is that he wants a guarantee that
__del__() gets called at some point, and I agree with him.

Donn Cave

unread,
Dec 23, 2002, 2:07:41 PM12/23/02
to
Quoth alloydf...@attbi.com (A. Lloyd Flanagan):

As I understand it, what's really missing here is that you can't expect
__del__() to be invoked when an object is reclaimed from a cycle by
Python's gc. Or rather, you can expect it not to be called. I don't
know for sure that gc actually runs at program exit, but even if it did,
__del__ won't be called because of potential order dependencies.

If A and B escape reference count collection because they refer to each
other, then when they're finally located by gc, will A.__del__ expect
to find B in working condition - what if it's already been collected?

Does your destructor account for this problem?

Donn Cave, do...@u.washington.edu
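Donn's cycle scenario is easy to demonstrate: reference counting never frees a cycle, so finalization waits for the cycle collector. A sketch (note a caveat about era: CPython 3.4 and later, per PEP 442, does run __del__ on cyclic garbage at collection time, whereas in the CPython of this thread such objects ended up uncollected in gc.garbage):

```python
import gc

log = []

class Node:
    def __del__(self):
        log.append("finalized")

a, b = Node(), Node()
a.partner, b.partner = b, a   # a reference cycle

del a, b                      # refcounts never hit zero; the cycle survives
gc.collect()                  # the cycle collector reclaims the pair
# log now holds one entry per Node
```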

Brian Quinlan

unread,
Dec 23, 2002, 3:00:58 PM12/23/02
to
> Since for Python, that's the documentation available on
> python.org, and that documentation _explicitly_ states that relying on
> timely finalization is a bad idea, relying upon it is, well, a bad
> idea
> if you want to write software that continues to work properly down the
> road.

Core language features are less reliable than some CPython
implementation details. For example, the semantics for the / operator
are going to change before the finalization timing.

> As I have said, and probably will have to say many more times before
> this accursed thread dies, the suggestion that the finalization behavior
> of CPython might change was entirely theoretical. I have no doubt that
> it will never change for the life of CPython. An utterly theoretical,
> trivial point has been inflated as if it were the main focus of my
> comment, which it never was.

So why would I worry about an entirely theoretical point while I code?

Cheers,
Brian


"Martin v. Löwis"

unread,
Dec 23, 2002, 4:45:20 PM12/23/02
to
A. Lloyd Flanagan wrote:
> The difference between C++
> destructors and Java's finalize() or Python's __del__() is that C++
> _guarantees_ that the destructor will get called sometime before the
> program exits (barring a core dump or other catastrophe).

In general, C++ makes no such guarantee. If I do

Foo *foo = new Foo;

and never mention foo in a delete statement, the Foo destructor won't be
called.

> From the Python reference manual: "It is not guaranteed that __del__()
> methods are called for objects that still exist when the interpreter
> exits."

The same is true for destructors in C++.

> So I think what he's saying is that he wants a guarantee that
> __del__() gets called at some point, and I agree with him.

I think he means something different. If there was just a guarantee that
the destructor is called at some point, I believe he would not be
satisfied. Unfortunately, I still can't infer from that what it would
mean to integrate destructors into Python.

Regards,
Martin

Robert Oschler

unread,
Dec 23, 2002, 5:26:32 PM12/23/02
to

"Martin v. Löwis" <mar...@v.loewis.de> wrote in message
news:au7fv7$98i$02$1...@news.t-online.com...

>
> However, I'm still curious as to what precisely the '"destructors
> please" issue' is that you have been pushing for quite some time, and
> what precisely it would mean to add destructors to Python and Java. I'm
> speaking as a language designer here, so I'm not too interested in how
> you would use the facility until I know what the facility is.
>

Martin,

Not being a language designer myself, and knowing almost nothing about how
either Java and Python track their reference counts, I hope I don't sound
like I'm making too simple a message by describing below what I mean by
having destructors in Python. But here's the holy grail:

I'd like there to be a method of an object that is called the instant an
object's reference count drops to 0 and not when the garbage collector gets
around to cleaning up. This method is guaranteed to be called when an
object's reference count drops to 0, and at the exact moment that occurs (or
reasonably close). The implied assumption is also that all object reference
counts drop to 0 on program termination, if not, then that additional
behavior too, so that program global objects could be counted on to have
their destructors called at some time.

If garbage collected languages would have trouble adopting this behavior,
due to the treachery of releasing an object's memory outside of the garbage
collector's normal "schedule", then I would like to formally decouple the
sentiment attached to destructors that involves memory management. I can't
speak for the others here who have also indicated that they want to see
destructors, but I'm actually far more concerned with automatic resource
cleanup than with memory reclamation (releasing database file and
record locks, releasing a socket, a kernel object, etc.). If all the
Python/Java destructor did was simply guarantee that there was an object
method that was called automatically when an object's reference count
dropped to 0, and didn't release any memory belonging to the terminating
object at all, leaving that to the gc to do when it normally does, I'd be
extremely happy.
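For the record, CPython's existing reference counting already behaves this way for objects outside of cycles; a minimal sketch (the Tracker class and its names are invented purely for illustration):

```python
class Tracker:
    # Hypothetical resource-holding class, used only for illustration.
    destroyed = []

    def __init__(self, name):
        self.name = name

    def __del__(self):
        # Called by CPython the moment the refcount reaches zero.
        Tracker.destroyed.append(self.name)

t = Tracker("socket-holder")
alias = t
del t                     # one reference remains; __del__ not called yet
assert Tracker.destroyed == []
del alias                 # last reference gone: finalized immediately
assert Tracker.destroyed == ["socket-holder"]
```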

thx


Erik Max Francis

unread,
Dec 23, 2002, 5:34:24 PM12/23/02
to
"Martin v. Löwis" wrote:

> In general, C++ makes no such guarantee. If I do
>
> Foo *foo = new Foo;
>
> and never mention foo in a delete statement, the Foo destructor won't
> be
> called.

As he already said, he was referring to destructors in conjunction with
reference counting entities in C++, such as std::auto_ptr.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ If you think you're free, there's no escape possible.
\__/ Baba Ram Dass
Bosskey.net: Return to Wolfenstein / http://www.bosskey.net/rtcw/
A personal guide to Return to Castle Wolfenstein.

Erik Max Francis

unread,
Dec 23, 2002, 5:36:47 PM12/23/02
to
Brian Quinlan wrote:

> So why would I worry about an entirely theoretical point while I code?

Sigh. It's theoretical that CPython's finalization behavior could
change. It's _not_ theoretical that Jython does not have the same
finalization behavior. If you're concerned about portability between
CPython and Jython, then the difference in finalization is far from a
theoretical issue.

"Martin v. Löwis"

unread,
Dec 23, 2002, 5:44:00 PM12/23/02
to
Robert Oschler wrote:
> I'd like there to be a method of an object that is called the instant an
> object's reference count drops to 0 and not when the garbage collector gets
> around to cleaning up.

I see. This is precisely the definition of Python's __del__ method: it
is called when the reference count drops to zero, at least for CPython.

Now, for some objects, the reference count never drops to zero, because
they are part of cycles (or reachable from cycles); given your
specification, I assume you would expect that destructor is never called
for these objects.

Python goes beyond that (at least from 2.0 on): the cyclic garbage
collector finds and breaks (some) cycles; objects with finalizers that
are reachable from cycles (but are not part of the cycle) have their
__del__ invoked when the cycle is broken. This is actually more than
you've asked for, since you expected that those objects never have their
finalizers called.

If objects with finalizers are part of the cycle, it is not clear how to
break it (since the finalizer may bring the object back to life, but
some references are already broken); those objects are put into
gc.garbage instead.
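The cycle problem described above can be sketched in a few lines (names invented; run as a script, since the interactive prompt keeps extra references alive):

```python
import gc

finalized = []

class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None

    def __del__(self):
        finalized.append(self.name)

# Acyclic case: the refcount hits zero at `del`, so __del__ runs at once.
a = Node("acyclic")
del a
assert finalized == ["acyclic"]

# Cyclic case: each node keeps the other's refcount above zero, so with
# the cyclic collector disabled, plain refcounting never finalizes them.
gc.disable()
b, c = Node("b"), Node("c")
b.partner, c.partner = c, b
del b, c
assert finalized == ["acyclic"]   # the cycle's nodes are still alive
gc.enable()
```

What eventually happens to the cycle's members once the collector runs has varied across Python versions; in the 2.x line, cycle members that define __del__ end up in gc.garbage rather than being finalized.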

> The implied assumption is also that all object reference
> counts drop to 0 on program termination, if not, then that additional
> behavior too, so that program global objects could be counted on to have
> their destructors called at some time.

This is not the case in Python: objects that still exist at interpreter
exit are not guaranteed to have their finalizers called, because of the
resulting semantic difficulties. It should be considered a bug in an
application if it still has such objects when it terminates.

> If garbage collected languages would have trouble adopting this behavior,
> due to the treachery of releasing an object's memory outside of the garbage
> collector's normal "schedule"

Languages with garbage collection have a problem with this strategy if
they don't maintain a reference counter - then there is no efficient way
to find out whether it has dropped to zero. Java is one such language,
that's why it cannot guarantee timely invocation of the finalizer.

Regards,
Martin

"Martin v. Löwis"

unread,
Dec 23, 2002, 5:59:19 PM12/23/02
to
Erik Max Francis wrote:
> As he already said, he was referring to destructors in conjunction with
> reference counting entities in C++, such as std::auto_ptr.

std::auto_ptr is not a reference counting entity (it does not count
references). Adding reference counting entities would not change
anything for Python; at least CPython already has reference counting
entities (*all* entities are reference counted).

So if you use std::auto_ptr as a class member in C++, you can literally
translate your code to Python (just omit the auto_ptr), and it literally
works the same way with respect to finalization (at least if you use
CPython): if the container goes away, and it was holding the only
reference (as is a requirement for std::auto_ptr), the contained object
is finalized immediately.
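A literal translation of that pattern, as a sketch (both classes invented for illustration; CPython semantics assumed):

```python
class Contained:
    # Stands in for the object an auto_ptr would own; hypothetical.
    finalized = False

    def __del__(self):
        Contained.finalized = True

class Container:
    def __init__(self):
        # The container holds the only reference, as auto_ptr requires.
        self.member = Contained()

c = Container()
del c                       # the container's refcount reaches zero...
assert Contained.finalized  # ...and the member is finalized immediately
```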

What he *might* be thinking of (though he just explained something
completely different) is that he wants lexically-scoped references or
lifetimes: if you have a reference in a local variable of a function,
and the function terminates, you want the reference to be released or
the object to end its life.

In the case of terminate-its-life, I see a semantic difficulty in
extending this to Python: what if there are other references to the same
object? This is not a language-specification problem in C++: if you
access the dead object through a dangling pointer, you get undefined
behaviour. Of course, undefined behaviour is not acceptable for Python.

In the case of release-reference, you nearly have that behaviour in
Python: When a function terminates, all its local variables are deleted,
and the objects decrefed. The only exception is an exception: a stack
frame in the traceback keeps the local variables alive even after the
end of the function.
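That traceback effect can be demonstrated directly (a sketch with invented names; the details of exception-state cleanup have shifted between Python versions, so treat this as illustrative):

```python
import sys

class Resource:
    alive = 0

    def __init__(self):
        Resource.alive += 1

    def __del__(self):
        Resource.alive -= 1

def use():
    r = Resource()          # local variable holding the only reference
    raise ValueError("boom")

kept_tb = None
try:
    use()
except ValueError:
    # Keep the traceback: its frames keep the locals (including r) alive.
    kept_tb = sys.exc_info()[2]

assert Resource.alive == 1  # r survived the end of use()
kept_tb = None              # drop the traceback...
assert Resource.alive == 0  # ...and the refcount finally reaches zero
```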

There are two ways to work around this problem: a) don't assign
"interesting" objects to local variables, as done in the open-read
idiom, or b) unassign local variables in a try-finally-block. Of course,
if you have to write a try-finally-block, you could release the
underlying object right away also.

Enhancements in this area are possible, one might consider writing

transient foo = open(filename)

with the expectation that foo is decrefed when the scope is left even in
case of an exception. As with any language change, you need a really
good reason to implement it, though (and you need to think about
semantic subtleties, such as generators).

Regards,
Martin

Robert Oschler

unread,
Dec 23, 2002, 6:29:44 PM12/23/02
to

"Martin v. Löwis" <mar...@v.loewis.de> wrote in message
news:au83ji$4fa$00$1...@news.t-online.com...

> I see. This is precisely the definition of Python's __del__ method: it
> is called when the reference count drops to zero, atleast for CPython.
>

True, but unless I misread the previous posts in this thread, the language
spec does not guarantee this behavior, even for CPython, and therefore
taking advantage of it might "break" something in the future. If it became
a guaranteed item then that would be great. Xmas? :)

thx


Robert Oschler

unread,
Dec 23, 2002, 6:35:37 PM12/23/02
to

"Martin v. Löwis" <mar...@v.loewis.de> wrote in message
news:au84g9$qml$07$1...@news.t-online.com...

> Erik Max Francis wrote:
>
> std::auto_ptr is no reference counting entity (it does not count
> references). Addition of reference counting entities would not change
> anything for Python, atleast CPython already has reference counting
> entities (*all* entities are reference counted).
>

Sorry for the confusion: I said auto_ptr when I meant shared_ptr, which
is reference counted.

thx

Robert Oschler

unread,
Dec 23, 2002, 6:52:04 PM12/23/02
to

"Samuele Pedroni" <pedr...@bluewin.ch> wrote in message
news:3e0701a1$1...@news.bluewin.ch...

> >
>
> Honestly not closing files explicitly is a bad idea, unless for quick
hacks.
>

Samuele, if you are saying this in the spirit of diminishing the value of
destructors, then read on. If not, pardon me, I misunderstood. Sure the
following construct (pseudo-code) handles many situations quite well:

allocate resource

try:
    # do things
finally:
    # cleanup no matter what

But here's a general description of where automatic resource cleanup via
destructors can be a life saver. You have a long-lived object that
allocates an important resource upon its creation. This object is passed
between several containers that store objects temporarily, and the
object creators that committed the object to a container are long gone,
so any try/finally blocks they might have are irrelevant. Perhaps the
object is also accessed across several threads, and the container is an
application-global one, protected by a locking mechanism. In addition,
several other utility classes acquire references to the object during
their lifetimes. In situations like these, having the object
automatically release the resource only when every other reference
holder has "let go" of it, and immediately at that critical moment, is
the most failsafe and bulletproof approach to managing the resource.
Trying to ensure the calling of a user-defined "cleanup" function in
such a situation is a world-class headache and debugging nightmare.
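That scenario can be sketched in miniature (all names hypothetical), showing the resource being released exactly when the last holder lets go:

```python
class Handle:
    # Hypothetical wrapper around an external resource.
    released = False

    def release(self):
        Handle.released = True

    def __del__(self):
        self.release()

h = Handle()
registry = [h]          # a temporary container; outlives the creator
workers = {"w1": h}     # another long-lived holder
del h                   # the creator's reference is long gone
assert not Handle.released

registry.clear()        # one holder releases its reference
assert not Handle.released

del workers["w1"]       # the *last* holder lets go...
assert Handle.released  # ...and the resource is released at that moment
```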

Situations like the one described above are actually pretty common; one
example is asynchronous event handling, which is fairly germane to any
socket application with persistent objects. There are many other
examples: sound-board accelerator channel acquisition, speech
reco/synthesis instance acquisition, etc., especially in server
environments. In these situations I became a fiend for the C++ Boost
shared_ptr library, and it was a lifesaver on many an occasion.

thx


Robert Oschler

unread,
Dec 23, 2002, 7:00:53 PM12/23/02
to

"Peter Hansen" <pe...@engcorp.com> wrote in message
news:3E071DB7...@engcorp.com...

> I offer this merely as representative of one of the scenarios that
> could happen if one fails to close files explicitly.
>
> Admittedly this is not a typical sequence, but the point is, you cannot
> *always* rely on the CPython finalization behaviour even in CPython, so
> why waste time guessing when you can, and when you can't?
>
>

Wait a minute, didn't Martin just say in a previous post that you can rely
on CPython to call __del__ when reference counts reach 0? How could Python
"forget" to decrement a reference count, even if an exception occurs in a
finally block or anywhere else for that matter? If the reference holder is
a stack object (a function local variable), then when the stack is unwound
the reference count would get decremented, no matter how the function
exited (exception, normal return, etc.). If the reference is in a
container, then the reference count would get decremented when the
container is wiped or the object holding the reference is deleted from the
container. Or am I reading too much C++ behavior into Python's reference
count management?

Did I miss something?

thx


Erik Max Francis

unread,
Dec 23, 2002, 7:07:40 PM12/23/02
to
Robert Oschler wrote:

> Sorry for the confusion, I said auto_ptr when I meant shared_ptr which
> is
> reference counted.

I brainfarted your brainfart as well; I was thinking of
boost::shared_ptr or something similar rather than std::auto_ptr, but
just mumbled the same words you did. Oops.

I suppose one could try to make the case that std::auto_ptr is a form of
reference counting; it's just one in which the count is only ever zero
or one :-).

Erik Max Francis

unread,
Dec 23, 2002, 7:21:35 PM12/23/02
to
Robert Oschler wrote:

> Wait a minute, didn't Martin just say in a previous post that you can
> rely
> on CPython to call __del__ when reference counts reach 0? How could
> Python
> "forget" to decrement a reference count, even if an exception occurs
> in a
> finally block or anywhere else for that matter?

Peter was giving an example where the reference count _didn't_ go to
zero, but a naive programmer might not have noticed that. It's yet
another reason why explicit reclaiming of external resources is a good
idea, even independently of this immediate vs. deferred finalization
issue; it leaves open the opportunity for bugs.

--
Erik Max Francis / m...@alcyone.com / http://www.alcyone.com/max/
__ San Jose, CA, USA / 37 20 N 121 53 W / &tSftDotIotE

/ \ I used to walk around / Like nothing could happen to me
\__/ TLC
EmPy / http://www.alcyone.com/pyos/empy/
A templating system for Python.

Samuele Pedroni

unread,
Dec 23, 2002, 7:23:41 PM12/23/02
to

"Robert Oschler" <Osc...@earthlink.net> ha scritto nel messaggio
news:EkNN9.7101$uV4.3...@news2.news.adelphia.net...

>
> "Samuele Pedroni" <pedr...@bluewin.ch> wrote in message
> news:3e0701a1$1...@news.bluewin.ch...
> > >
> >
> > Honestly not closing files explicitly is a bad idea, unless for quick
> hacks.
> >
>
> Samuele, if you are saying this in the spirit of diminishing the value of
> destructors, then read on.

I was saying it in the context of: Python/files.


Peter Hansen

unread,
Dec 23, 2002, 7:58:48 PM12/23/02
to
Erik Max Francis wrote:
>
> Robert Oschler wrote:
>
> > Wait a minute, didn't Martin just say in a previous post that you can
> > rely
> > on CPython to call __del__ when reference counts reach 0? How could
> > Python
> > "forget" to decrement a reference count, even if an exception occurs
> > in a
> > finally block or anywhere else for that matter?
>
> Peter was giving an example where the reference count _didn't_ go to
> zero, but a naive programmer might not have noticed that. It's yet
> another reason why explicit reclaiming of external resources is a good
> idea, even independently of this immediate vs. deferred finalization
> issue; it leaves open the opportunity for bugs.

That's what I intended. As it turns out, on second examination and
a little experimentation, a call to flush() avoids the particular
problem shown.

It doesn't entirely invalidate my argument, which Erik has captured above.
If you replace the attempt to read from the file with a call to
"os.remove('test.fil')", just for an example, the point remains.

Calling .close() explicitly is a Good Idea, though there are cases
where you don't really need to. The number of such cases appears
to diminish as long as this thread continues...

I agree with the comment someone made about the source of much of the
disagreement. I personally don't use .close() explicitly much of the
time, but my lengthy programming experience and more recent maturation
in that respect makes me feel a twinge of guilt every time I skip it.
It *is* sloppy, IMHO, and I *am* being lazy when I don't explicitly
close my files, except perhaps in throw-away utilities or trivial programs
where the cost of not being lazy doesn't necessarily pay off.

In serious programs, even were it not for the existence of Jython, or
garbage collection that can't handle cycles with __del__ involved, it's
a good idea to be explicit, *especially* about resource management.

For my New Year's Resolution, I vow always to close my files explicitly
whenever I can, preferably with a tidy try/finally statement... :-)

-Peter

Robert Oschler

unread,
Dec 23, 2002, 9:10:46 PM12/23/02
to

"Erik Max Francis" <m...@alcyone.com> wrote in message
news:3E07A88F...@alcyone.com...

> Robert Oschler wrote:
>
>
> Peter was giving an example where the reference count _didn't_ go to
> zero, but a naive programmer might not have noticed that. It's yet
> another reason why explicit reclaiming of external resources is a good
> idea, even independently of this immediate vs. deferred finalization
> issue; it leaves open the opportunity for bugs.
>
> --

Erik,

Ok, thanks. Regarding the explicit (a custom-defined 'close' function) vs.
implicit (as in destructors) reclamation of external resources, let me
repost a response I made elsewhere in this mammoth thread, regarding
situations where I feel an explicit (programmer has to remember to call
it) finalization call may be intractable:

<repost/>

Sure the following construct (pseudo-code) handles many situations quite
well:

allocate resource

try:
    # do things
finally:
    # cleanup no matter what (make explicit user-defined finalize call)

Stuart D. Gathman

unread,
Dec 23, 2002, 11:19:40 PM12/23/02
to
On Mon, 23 Dec 2002 12:08:54 -0500, Martin v. Löwis wrote:

>> In C++, using an auto_ptr example, the instant an object's reference
>> count goes to 0 it's destructor will be called.
>
> That is certainly also true.
>
> However, I'm still curious as to what precisely the '"destructors
> please" issue' is that you have been pushing for quite some time, and
> what precisely it would mean to add destructors to Python and Java. I'm
> speaking as a language designer here, so I'm not too interested in how
> you would use the facility until I know what the facility is.

What is desired is the equivalent of try...finally. It seems that
try...finally is cluttered enough that programmers avoid it, to enhance
readability, whenever they deem it unnecessary.

This idiom:

fp = open(...)
try:
    do_something()
finally:
    fp.close()

is what we want to enhance to meet the following objectives:

a) programmers won't avoid proper finalization because of readability

b) programmers would find proper finalization so convenient that they
use it all the time, even when not required - so that a library can
add a finalization requirement (e.g. a close() method) and reasonably
expect most clients not to break.

One feature is a given: there must be a way to identify objects
requiring finalization at their definition - *not* at their point of use.
In Java, the obvious thing is to define an interface (e.g. Disposable)
with a close() or dispose() method. In Python, a base class for python
classes or type flag for C objects would do the trick.

Secondly, Java and Python will not have the problem of C++ with the
danger of references causing undefined behaviour after an object is
disposed. Trying to do things with a disposed object is just as well
defined as trying to do things with a closed file. A well designed
object will probably throw an exception - or possibly do nothing - but it
is well defined and up to the programmer.

Now, the tricky part is that these Disposable objects might go into
containers, so we can't simply declare that they are disposed when the
block where they are created ends. I.e., the following would be
convenient only for certain (common) situations:

def myfunc():
    fp = open(...)  # assume file is "Disposable"
    do_something()
    # fp is closed at block exit

Furthermore, any object which contains a disposable object should itself
be disposable, and so on.

Thinking about this eventually leads to the conclusion that reference
counting is the easiest way to define the desired semantics. The most
telling evidence of this is that Java uses reference counting to handle
external resources in systems like RMI and JINI. This is implemented by
creating proxy objects which in turn manage reference counts.

One way to make this more efficient as a language extension is to have
the virtual machine maintain mandatory reference counts for Disposable objects -
but continue to use implementation defined GC for normal objects. The
language would then guarantee a call to the dispose() method when the
reference count reaches 0. Calling dispose() would *not* release memory
or cause undefined behaviour.
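That proxy arrangement can be sketched in a few lines of Python (a toy illustration of the idea, not RMI's or JINI's actual machinery; all names invented):

```python
class DisposableProxy:
    """Toy sketch of explicit reference counting layered on top of GC:
    dispose() runs exactly when the count reaches zero; reclaiming the
    memory itself is still left to the normal garbage collector."""

    def __init__(self, resource):
        self._resource = resource
        self._count = 1

    def incref(self):
        self._count += 1

    def decref(self):
        self._count -= 1
        if self._count == 0:
            self._resource.dispose()

class FakeSocket:
    closed = False
    def dispose(self):
        FakeSocket.closed = True

p = DisposableProxy(FakeSocket())
p.incref()          # a second holder appears
p.decref()          # first holder done; resource stays open
assert not FakeSocket.closed
p.decref()          # last holder done; dispose() runs right now
assert FakeSocket.closed
```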

--
Stuart D. Gathman <stu...@bmsi.com>
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

Bengt Richter

unread,
Dec 24, 2002, 3:30:39 AM12/24/02
to
On Mon, 23 Dec 2002 12:00:50 +0100, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>Erik Max Francis wrote:
[...]
>> What I was saying is that you need to safeguard external critical
>> resource acquisition and release with something more stringent than
>> relying on finalization, and that is because the Python documentation
>> even says so. That is all.
>
>I completely understand, and I think you are wrong. You don't need
>further safeguards; the ones you have are sufficient, atleast for some
>(the majority?) of the application.
>
What if you took the approach of trying to design a way of having
guarantees of immediate finalization platform-independently?

E.g., suppose you had a suite-introducer whose purpose was to define
a local block of statements, like try/finally, but such that the compiler
would automatically generate code so that all bindings created within
the suite would be monitored and immediately (i.e., maybe even
during expression evaluation, but at least at the end of a statement)
finalized when the equivalent of the ref count going to zero happened. E.g.,

autofinalize:
    s = file('foo').read()

Then maybe it could be a language level guarantee (for conforming
implementations) rather than a CPython guarantee.

Regards,
Bengt Richter

"Martin v. Löwis"

unread,
Dec 24, 2002, 7:03:51 AM12/24/02
to
Bengt Richter wrote:
> E.g., suppose you had a suite-introducer whose purpose it was to define
> a local block of statements like try/finally but so that the compiler
> would automatically generate code so that all bindings created within
> the suite would be monitored and and immediately (i.e., maybe even
> during expression evaluation, but at least at the end of a statement)
> finalized when the equivalent of ref count going to zero happened. E.g.,
> autofinalize:
> s = file('foo').read()

How would you implement this in Jython? In particular, what would be the
equivalent of ref count, and how could you tell it goes to zero?

Regards,
Martin

François Pinard

unread,
Dec 24, 2002, 12:58:44 PM12/24/02
to
[Stuart D. Gathman]

> The problem with Python reference counting, is that it encourages sloppy
> programming like:

> data = open('myfile','r').read()

There is no sloppiness there. The only nit one could pick with the above
line is that `file' is now considered better style than `open'. Let's use it:

data = file('myfile','r').read()

This is clear, legible, elegant, and all functional in CPython.

> [...] The proper code:

> fp = open('myfile','r')
> data = fp.read()
> fp.close()

> is not as pretty.

This code is surely proper in Jython, working around the fact that Jython
relies on Java's garbage collector. For CPython, it is probably less
proper, because it is more cumbersome and less legible. For me, the more
legible it is, the more Python it is. Like it or not, Jython is a tiny
bit less Python than CPython! :-)

Of course, one may ought to use Jython for various reasons, and Jython is
undoubtedly a wonderful blessing whenever you need it. But this is the
choice of a programmer to write his code whether with Jython in mind or not.
The key point is that there is nothing sloppy in choosing the `not' side.

--
François Pinard http://www.iro.umontreal.ca/~pinard

Samuele Pedroni

unread,
Dec 24, 2002, 1:36:04 PM12/24/02
to
For me, more
> legible it is, more Python it is. Like it or not, Jython is a tiny bit
less
> Python than CPython! :-)
>

oh my lord an uncontrollable zealot among us!!! <.5 wink>


François Pinard

unread,
Dec 24, 2002, 1:21:18 PM12/24/02
to
[Courageous]

> It's simply my gut feel and years of experience which say "closing
> explicitly is best practice."

Often, experience might come from languages other than Python. We should
be careful when we extend our experience from one language to another,
recognising the specificities of the new language. Our experience is
often helpful, but in some unusual cases, it may blind us.

Even within Python (and I'm thinking CPython), experience taught me that
best practice is avoiding noisy code. Explicitly closing a file which is
used very locally is similar to using `del' for variables we created through
assignment. I have confidence in function exit cleanup for variables, and
whenever possible, I try to consider a file as just another type.

There _are_ cases where explicitly closing a file is worthwhile. These
are not so frequent in practice: an explicit close draws our attention to
an unusual situation. So you see, by not abusing explicit closes, we even
document our code better and increase its overall clarity. Really, it's
good style!

Martin Maney

unread,
Dec 25, 2002, 12:27:54 AM12/25/02
to

It seems to me that it would most likely be a reference count. I take
it Jython doesn't use reference counting, leaving the entire job to the
Java runtime (with a hook to run Python finalizers, of course)?

Hmmm. So presumably Jython wouldn't care to add ref counting to
everything, but it would be useful to be able to get that behavior for
selected objects (or types of objects) both to avoid a potential
portability problem amongst Python implementations as well as to insure
that this useful behavior was always available. Perhaps it would be
useful to have a way of activating ref counting for an object. In
CPython this would be a no-op; in Jython it would have to add ref
counting to the chosen objects and call finalizers, but would not [need
not?] scavenge memory when the count hit zero.

So the language would guarantee that objects for which ref counting was
selected (hmmm, has to be active at creation time, yes? certainly can't
have multiple bindings when this is turned on) would have finalizers
run when their ref count went to zero. Other objects might or might
not, and cycles would likely continue to be troublesome. This would
address at least most of the uses I've seen described here, I think.

Okay, start poking holes.

Merry Christmas!

Martin v. Löwis

unread,
Dec 26, 2002, 6:00:54 PM12/26/02
to
"Martin Maney" <ma...@pobox.com> writes:

> It seems to me that it would most likely be a reference count.

But that would slow down assignments and parameter passing in
unacceptable ways, and you would also need to accommodate the case
where Java objects hold references to Python objects.

> I take it Jython doesn't use reference counting, leaving the entire
> job to the Java runtime (with a hook to run Python finalizers, of
> course)?

Almost. I believe __del__ code is just put into a Java .finalize()
method, so that it is run by the Java GC's finalization.

> Perhaps it would be useful to have a way of activating ref counting
> for an object.

Now, how would you implement *that*? I.e. when activated, which
machinery precisely changes the reference count, on, say, assignment?

Also, the issue remains how to pass such objects to Java "native"
methods (i.e. pure Java methods, instead of Jython functions).

Regards,
Martin

Antonio Cuni

unread,
Dec 26, 2002, 3:28:35 PM12/26/02
to
Peter Hansen wrote:

> In serious programs, even were it not for the existence of Jython, or
> garbage collection that can't handle cycles with __del__ involved, it's
> a good idea to be explicit, *especially* about resource management.

it could be true, but sometimes it's hard or not-obvious to be explicit
and correct; suppose you have a function that has to acquire a mutex, to
open a file and to open a socket; if you assume that every function
could throw an exception you must write something like this:

def foo():
    mymutex = acquire_mutex()
    try:
        myfile = open(filename, 'w')
        try:
            mysocket = open_socket()
            try:
                do_stuff()
            finally:
                mysocket.close()
        finally:
            myfile.close()
    finally:
        mymutex.release()

or this:

def foo():
    try:
        mymutex = acquire_mutex()
        myfile = open(filename, 'w')
        mysocket = open_socket()
        do_stuff()
    finally:
        try: mymutex.release()
        except: pass
        try: myfile.close()
        except: pass
        try: mysocket.close()
        except: pass

IMHO it is anything but beautiful.
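For what it's worth, the nested version can be flattened without any new syntax by a small helper that releases whatever was actually acquired, in reverse order (a sketch; the acquire/release pairs stand in for acquire_mutex, open, and open_socket):

```python
def run_with_cleanup(acquisitions, body):
    """Acquire resources in order, call body(*resources), and guarantee
    release in reverse order -- even if a later acquisition fails.

    `acquisitions` is a list of (acquire, release) callable pairs.
    """
    acquired = []
    try:
        for acquire, release in acquisitions:
            acquired.append((acquire(), release))
        return body(*[obj for obj, _ in acquired])
    finally:
        for obj, release in reversed(acquired):
            try:
                release(obj)
            except Exception:
                pass  # one failed cleanup must not mask the others

# Usage sketch with stand-in resources:
log = []
result = run_with_cleanup(
    [(lambda: "mutex", lambda m: log.append("release " + m)),
     (lambda: "file", lambda f: log.append("close " + f))],
    lambda m, f: "did stuff with %s and %s" % (m, f))
assert result == "did stuff with mutex and file"
assert log == ["close file", "release mutex"]
```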

I'd like if python had a feature comparable to C++'s deterministic
destructors; something like the following (invented syntax):

def foo():
    auto mymutex = acquire_mutex()
    auto myfile = open(filename, 'w')
    auto mysocket = open_socket()
    do_stuff()

I presume it's not easy to assure that "auto" variables can be destroyed
at the end of the function: e.g., what if objects bound to auto
variables have refcount > 1 ?
I don't really know if this feature could be implemented without breaking
old code, but I'd like it very much.

ciao Anto
--
"Computer science is not about computers any more than astronomy
is about telescopes." -- EW Dijkstra

Martin Maney

unread,
Dec 27, 2002, 12:00:22 PM12/27/02
to
Martin v. Löwis <mar...@v.loewis.de> wrote:
> "Martin Maney" <ma...@pobox.com> writes:
>> It seems to me that it would most likely be a reference count.
>
> But that would slow down assignments and parameter passing in

Yes, I assumed that was at least a good part of the reason Jython
abandoned the traditional ref counting of CPython. This was also
meant as implicit agreement with your perhaps-rhetorical question to
the previous poster. I can't see any way to provide CPython's
finalizer semantics other than by doing ref counting. The rest of my
post was speculation about whether ref counting might be applied to
selected objects (or types) as a compromise between the convenience of
ref counting's determinism and its cost if applied to every object.

> inacceptable ways, and you would also need to accomodate for the case
> where Java objects hold references to Python objects.

Now that would be a problem. I'm not familiar with Jython, so I have
no idea how common that sort of use is, or whether restricting
"guaranteed to finalize like classic CPython objects" objects from
being so referenced would be feasible. My impression, which may be
incomplete or even just wrong, is that Jython's motivation was mostly
to allow a Python language implementation that ran on top of the Java
VM, and which could access the Java libraries. In that context it
doesn't seem impossible that such an extension might be worth accepting
some such restrictions. Dunno.

I was also thinking that something along these lines might also be
useful for other hypothetical implementations. Even accepting that
CPython will never, ever use anything but ref counting (aside from
appendages that attempt to deal with cyclic garbage <wink>), there
might come to be other implementations that would find a different
approach to garbage collection profitable. If they can compile Lisp
into machine code, why not Python? And if that makes the ref counting
a less insignificant cost and favors some other approach to gc...

>> Perhaps it would be useful to have a way of activating ref counting
>> for an object.
>
> Now, how would you implement *that*? I.e. when activated, which
> machinery precisely changes the reference count, on, say, assignment?

Dunno. I have been enjoying Python as a tool without so far having had
any sufficient motivation to open the hood and see how the plumbing
works; nor would I be likely to begin by examining Jython's plumbing.
So I am extrapolating from past experiences with using ref counting
in other contexts to an assumption that Python's implementation might
be able to choose an answer other than "all" or "nothing".

May I reflect an altered version of this back to you? Under the
hypothesis that CPython changed so that not all objects were reference
counted, how many places in the VM would need to distinguish between
ref-counted and non-rc objects? Would it be easier to have the
distinction made based on the object's type or at a lower level such as
a sentinel ref count value? I don't know how directly the answers to
these questions would translate to Jython's implementation, of course.

Skip Montanaro

unread,
Dec 27, 2002, 1:04:37 PM12/27/02
to

>>> It seems to me that it would most likely be a reference count.
>> But that would slow down assignments and parameter passing in
Martin> Yes, I assumed that that was at least a good part of the reason
Martin> Jython abandoned the traditional ref counting of CPython.

Not at all. Jython didn't abandon reference counting. It wasn't an option.
Jython compiles Python code to JVM bytecode, which is what implements the
garbage collection. In CPython, the C code which makes up the PyVM
implements reference counting.

--
Skip Montanaro - sk...@pobox.com
http://www.musi-cal.com/
http://www.mojam.com/

Aahz

unread,
Dec 27, 2002, 2:52:24 PM12/27/02
to
In article <mailman.1041012336...@python.org>,

Skip Montanaro <sk...@pobox.com> wrote:
>
> >>> It seems to me that it would most likely be a reference count.
> >> But that would slow down assignments and parameter passing in
> Martin> Yes, I assumed that that was at least a good part of the reason
> Martin> Jython abandoned the traditional ref counting of CPython.
>
>Not at all. Jython didn't abandon reference counting. It wasn't an option.
>Jython compiles Python code to JVM bytecode, which is what implements the
>garbage collection. In CPython, the C code which makes up the PyVM
>implements reference counting.

Your third sentence isn't quite true. Early work on Jython did attempt
to emulate CPython's reference semantics, but it was abandoned as being
too slow.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"I disrespectfully agree." --SJM

"Martin v. Löwis"

unread,
Dec 27, 2002, 4:59:58 PM12/27/02
to
Martin Maney wrote:
> Yes, I assumed that that was at least a good part of the reason Jython
> abandoned the traditional ref counting of CPython. This was also
> meant to be implicit agreement with your perhaps rhetorical question to
> the previous poster.

It was 50% rhetorical, as I'm convinced there is no satisfying answer; I
assumed a 50% chance that Bengt would try to answer (and was surprised
that somebody else answered).

> Now that would be a problem. I'm not familiar with Jython, so I have
> no idea how common that sort of use is, or whether restricting
> "guaranteed to finalize like classic CPython objects" objects from
> being so referenced would be feasible.

It is often considered a strength of Jython that you can integrate so
easily with the Java library; this may involve call-backs from Java to
Python, which in turn requires that "pure" Java objects hold references
to Python objects. A common example of this is the use of the AWT or
Swing in Jython.

> My impression, which may be
> incomplete or even just wrong, is that Jython's motivation was mostly
> to allow a Python language implementation that ran on top of the Java
> VM, and which could access the Java libraries.

Just *having* that implementation is not enough motivation, people also
want to *use* it. Although they have to suffer a few limitations due to
absent modules, they can overcome these limitations in many cases by
using the Java library. Code that runs both on CPython and Jython often
has try-except-ImportError blocks to choose either the C or the Java
version of some functionality. Where possible, Jython wraps the API, but
for more complex APIs (GUI, XML), this is not feasible.

> If they can compile Lisp into machine code, why not Python?

Feel free to try, and to make any changes to the language that you feel
necessary. As you proceed, you will either
a) find that your entire approach is doomed and give up, or
b) find that you have to make so many changes to the language that
nobody is interested in using your code, or
c) find that some of the language changes that you initially had to make
can be taken back without loss of generality or performance.

If you arrive at alternative (c), feel free to discuss any remaining
language changes with other Python authors. There is no need for a
proactive change to accommodate an imaginary implementation. The
often-cited statement in the language reference with regard to
refcounting came precisely out of the Jython development, and is there
to support Jython only.

I believe Mark Hammond's Python .NET ended up in alternative (b) (or
(a), depending on your point of view). Apart from that, I'm not aware of
any prospective alternative Python implementations that have troubles
with the refcounting.

> So I am extrapolating from past experiences with using ref counting
> in other contexts to an assumption that Python's implementation might
> be able to choose an answer other than "all" or "nothing".

Can you please explain how these schemes work? In CPython, all objects
are of type PyObject*, and Py_INCREF and Py_DECREF are macros that
access the internal structure of a PyObject, which contains the member
ob_refcount.

> May I reflect an altered version of this back to you? Under the
> hypothesis that CPython changed so that not all objects were reference
> counted, how many places in the VM would need to distinguish between
> ref-counted and non-rc objects?

You find a Py_INCREF macro invocation about every three lines of CPython
module source code, not only for core modules, but also for every
extension module out there.

> Would it be easier to have the
> distinction made based on the object's type or at a lower level such as
> a sentinel ref count value?

Both would work, I guess, but both would also slow down CPython, as you
now have to look into the type / for the sentinel value before modifying
the refcount.

This doesn't translate to Jython at all, since Jython uses native VM
instructions and the native VM stack to manipulate object references, be
it references to Jython objects, or to "pure" Java objects.

Regards,
Martin


Tim Peters

unread,
Dec 27, 2002, 5:25:36 PM12/27/02
to
[Martin v. Löwis]
> ...

> The often-cited statement in the language reference with regard to
> refcounting came precisely out of the Jython development, and is there
> to support Jython only.

Guido isn't *that* reactive <wink>. If you look at rev 1.1 of ref3.tex,
you'll see that the following paragraph was in the Language Reference manual
in 1992 (at least -- IIRC, the ref man was maintained in FrameMaker before
1992, and CVS doesn't have any history for that):

Objects are never explicitly destroyed; however, when they become
unreachable they may be garbage-collected. An implementation is
allowed to delay garbage collection or omit it altogether --- it is
a matter of implementation quality how garbage collection is
implemented, as long as no objects are collected that are still
reachable. (Implementation note: the current implementation uses a
reference-counting scheme which collects most objects as soon as
they become unreachable, but never collects garbage containing
circular references.)

The only thing that changed here in 10+ years is that the implementation
note changed a little, to remark that cyclic garbage can be reclaimed now in
CPython. Guido was always explicit that refcounting was simply an
implementation choice, not part of the language definition. Java didn't
exist when this was first written, although Jython may have <wink>.

Stuart D. Gathman

unread,
Dec 27, 2002, 9:08:49 PM12/27/02
to
On Thu, 26 Dec 2002 15:28:35 -0500, Antonio Cuni wrote:

> I'd like if python had a feature comparable to C++'s deterministic
> destructors; something like the following (invented syntax):
>
> def foo():
>     auto mymutex = acquire_mutex()
>     auto myfile = open(filename, 'w')
>     auto mysocket = open_socket()
>     do_stuff
>
> I presume it's not easy to assure that "auto" variables can be destroyed
> at the end of the function: e.g., what if objects bound to auto
> variables have refcount > 1 ?

1) This feature should not reclaim memory, only call __del__(). If
references were still live, the effect would be the same as calling
__del__() on a live object.

2) __del__() would get called again when memory is actually reclaimed.
Therefore, your feature would be much safer if it called another special
function - say close() or dispose() at the end of the block.

3) There is probably a way to do what you want without additional syntax:

def foo():
    mymutex = auto(acquire_mutex())
    myfile = auto(open(filename, 'w'))
    mysocket = auto(open_socket())
    do_stuff()

But your syntax would be prettier. In fact, I proposed exactly that
syntax to the Java Community Process. The reviewers noted that
try..finally already provides the same functionality, and it was
summarily rejected. The problem is that try..finally is so ugly for this
extremely common case, that no one writes the correct code - it is just too
painful.
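To make the comparison concrete, here is a minimal runnable sketch of Stuart's point 3, with all names (Resource, auto, dispose) invented for illustration; note that a separate dispose() method is used rather than __del__, since __del__ may run a second time when memory is actually reclaimed:

```python
# Hedged sketch (invented names): approximating the proposed auto()
# helper today with an explicit list and a single try/finally.

class Resource:
    """Stand-in for a mutex, file, or socket."""
    def __init__(self, name):
        self.name = name
        self.disposed = False

    def dispose(self):
        # safe to call exactly once at block exit; __del__ (if any)
        # may still run later when memory is reclaimed
        self.disposed = True

def foo():
    pending = []              # what a hypothetical auto() would maintain
    def auto(obj):
        pending.append(obj)
        return obj

    mymutex = auto(Resource('mutex'))
    myfile = auto(Resource('file'))
    try:
        pass                  # do_stuff() would go here
    finally:
        # clean up in reverse acquisition order, even on exceptions
        for obj in reversed(pending):
            obj.dispose()
    return mymutex, myfile

m, f = foo()
print(m.disposed, f.disposed)  # True True
```

This sketch ignores failures during acquisition itself (a failed second acquire would leave the first undisposed); handling that correctly is exactly where nested try..finally becomes painful.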

Isaac To

unread,
Dec 28, 2002, 1:59:53 AM12/28/02
to
>>>>> "Stuart" == Stuart D Gathman <stu...@bmsi.com> writes:

Stuart> 1) This feature should not reclaim memory, only call __del__().
Stuart> If references were still live, the effect would be the same as
Stuart> calling __del__() on a live object.

Stuart> 2) __del__() would get called again when memory is actually
Stuart> reclaimed. Therefore, your feature would be much safer if it
Stuart> called another special function - say close() or dispose() at
Stuart> the end of the block.

Stuart> 3) There is probably a way to do what you want without
Stuart> additional syntax:

Stuart> def foo():
Stuart>     mymutex = auto(acquire_mutex())
Stuart>     myfile = auto(open(filename, 'w'))
Stuart>     mysocket = auto(open_socket())
Stuart>     do_stuff()

Is there a PEP for that yet? If not, what about writing one?

Regards,
Isaac.

ma...@pobox.com

unread,
Dec 28, 2002, 12:23:02 AM12/28/02
to
Skip Montanaro <sk...@pobox.com> wrote:
>
> >>> It seems to me that it would most likely be a reference count.
> >> But that would slow down assignments and parameter passing in
> Martin> Yes, I assumed that that was at least a good part of the reason
> Martin> Jython abandoned the traditional ref counting of CPython.
>
> Not at all. Jython didn't abandon reference counting. It wasn't an option.

I'm not sure if we disagree about more than word choices, because the
next bit is oddly difficult to digest:

> Jython compiles Python code to JVM bytecode, which is what implements the
> garbage collection. In CPython, the C code which makes up the PyVM
> implements reference counting.

Is the asymmetry between "JVM bytecode" and "C code...PyVM"
intentional? I would have thought that the cases were more
symmetrical, with the Python being compiled to bytecodes and the VMs
running things, including the gc mechanism, whatever it may be.

The way you stated it would be more promising for the sort of hybrid
I suggested, since if the gc is implemented in the generated bytecodes
then it would be easier to change; however, I suspect that it's not
that simple.

Antonio Cuni

unread,
Dec 28, 2002, 7:27:42 AM12/28/02
to
Stuart D. Gathman wrote:

>> def foo():
>>     auto mymutex = acquire_mutex()
>>     auto myfile = open(filename, 'w')
>>     auto mysocket = open_socket()
>>     do_stuff
>>

> 1) This feature should not reclaim memory, only call __del__(). If
> references were still live, the effect would be the same as calling
> __del__() on a live object.
>
> 2) __del__() would get called again when memory is actually reclaimed.
> Therefore, your feature would be much safer if it called another
> special function - say close() or dispose() at the end of the block.

I can think of something totally different: for example, an "auto" object
couldn't be used as an r-value; such a check could be partially done at
compile time.
But what to do with parameters to functions? There should be a way to
ensure that a function doesn't store an "auto" object passed as a
parameter anywhere; e.g. we could disallow parameter-passing unless the
function declares its arguments "auto".

But these are only my thoughts: perhaps they are completely wrong... ;-)



> 3) There is probably a way to do what you want without additional
> syntax:
>
> def foo():
>     mymutex = auto(acquire_mutex())
>     myfile = auto(open(filename, 'w'))
>     mysocket = auto(open_socket())
>     do_stuff()

I can't think of any way to do what I want using this syntax: what are
you thinking of?

Isaac To

unread,
Dec 29, 2002, 12:01:14 AM12/29/02
to
>>>>> "Antonio" == Antonio Cuni <cuniREM...@programmazione.it> writes:

>> 2) __del__() would get called again when memory is actually
>> reclaimed. Therefore, your feature would be much safer if it called
>> another special function - say close() or dispose() at the end of the
>> block.

Antonio> I can think of something totally different: for example, an
Antonio> "auto" object couldn't be used as an r-value; such a check
Antonio> could be partially done at compile time. But what to do with
Antonio> parameters to functions? There should be a way to ensure that a
Antonio> function doesn't store an "auto" object passed as a parameter
Antonio> anywhere; e.g. we could disallow parameter-passing unless the
Antonio> function declares its arguments "auto".

The burden should be left to the programmer rather than put into the
language. I got the idea that

def foo():
    ... auto(XXX) ...    # E.g., a1 = auto(XXX), or a1 = [auto(XXX)]
    ...
    ... auto(YYY) ...
    ...
# end of foo

is just a clean (and perhaps more efficient) alternative for the more
verbose code

def foo():
    _t1_ = XXX
    try:
        ... _t1_ ...    # E.g., a1 = _t1_, or a1 = [_t1_]
        ...
        _t2_ = YYY
        try:
            ... _t2_ ...
            ...
        finally:
            try:
                _t2_.__cleanup__()
            except:
                pass
    finally:
        try:
            _t1_.__cleanup__()
        except:
            pass
# end of foo

If the programmer does write the more verbose code, and t1/t2 does get
another reference elsewhere in the try part, the program is still screwed
up. (But it is not really that bad: at least the "elsewhere" has a way to
check that the reference is already cleaned.) So this "feature" should
remain there even if the auto keyword is introduced.

If the language enforces that t cannot be referenced somewhere else, there
will be a lot of cases where the programmer wants to use "auto" but the
language forbids it. In fact, what the programmer wants to do might
actually be legitimate: e.g., put it in a dictionary for a while for its
own use, and afterwards remove the dictionary. And preventing t from being
used as an r-value does not stop that either. Even if you call something
like a.xyz(), within the method xyz one can still get a reference to a
through self, and can still store it somewhere global. In short, it
introduces problems without actually solving any.
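To illustrate the "escaped reference can at least check that it was already cleaned" point, here is a small sketch with invented names (registry, Conn, __cleanup__):

```python
# Illustration (invented names): even with explicit cleanup, a
# reference may legitimately escape the block -- e.g. into a
# module-level dict -- and the holder of the escaped reference can
# at least detect that the object has already been cleaned.

registry = {}

class Conn:
    def __init__(self):
        self.closed = False
    def __cleanup__(self):
        self.closed = True

def use():
    c = Conn()
    try:
        registry['last'] = c    # the reference escapes the block
    finally:
        c.__cleanup__()         # cleanup runs regardless

use()
print(registry['last'].closed)  # True
```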

>> 3) There is probably a way to do what you want without additional
>> syntax:
>>
>> def foo():
>>     mymutex = auto(acquire_mutex())
>>     myfile = auto(open(filename, 'w'))
>>     mysocket = auto(open_socket())
>>     do_stuff()

Antonio> I can't think of any way to do what I want using this syntax:
Antonio> what are you thinking of?

It avoids necessitating a modification to the parser, and it emphasizes
that "auto" is something about an object, not about a variable, I think.

Regards,
Isaac.

ma...@pobox.com

unread,
Dec 29, 2002, 1:00:28 AM12/29/02
to
"Martin v. L?wis" <mar...@v.loewis.de> wrote:
> Martin Maney wrote:
>> Yes, I assumed that that was at least a good part of the reason Jython
>> abandonded the traditional ref counting of CPython. This was also
>> meant to be implicit agreement with your perhaps rhetorical question to
>> the previous poster.
>
> It was 50% rhetorical, as I'm convinced there is no satisfying answer; I
> assumed a 50% chance that Bengt tries to answer (and was surprised
> that somebody else answered).

Being alive consists largely of being surprised, no? :-)

I have been surprised at how much my own notions about this have been
shifted by these discussions. A little while ago I would have said (or
maybe did say?) that the problem was only due to sloppiness; with the
latest insights offered by the lack of promises in even pre-Jython
documentation, I seem to be more or less won over to the "reliable,
timely finalizers are nice" side, even as any hope that relying on them
doesn't make one's code unportable to other (non-CPython)
implementations has all but vanished. :-(

It was somewhere in that process of change, as I was examining both the
theory as well as looking closely at some of my own natural Python
coding habits, that Bengt's suggestion came along. I thought I saw
what might be a viable compromise between CPython's de facto behavior
and the semantics of finalization that the language definition (no,
it's not as thorough - or as unalterable - as an ISO spec, but it's
what we have as guidance for the long run) guarantees, which is of
course a guarantee of nothing.

So it is in some ways an accident that I've been discussing this in
terms of Jython, although as the most discussed non-CPython
implementation, and having very different performance w.r.t.
finalization, it's an obvious attractor for any concrete discussion.
It does muddy the water in some respects, as it continues to conflate
object finalization with the scavenging of memory. Of course the two
are not entirely separate, but CPython and Jython are alike, unless I
am misunderstanding some of this yet, in linking the two very closely
in time. You finalize and then you scavenge more or less immediately.
That order is required, but not the temporal propinquity.

> It is often concidered as a strength of Jython that you can integrate so
> easily with the Java library; this may involve call-backs from Java to
> Python, which in turn requires that "pure" Java object hold references
> to Python objects.

Which may or may not be very relevant to a selective ref counting for
finalization scheme. I don't think there's much more to say about that
unless we get empirical data from Jython users about the patterns of
use of Python objects that need timely finalization: are they often
shared between languages, or is that usually avoidable?

It does in any case complicate the issue.

> Can you please explain how these schemes work? In CPython, all objects
> are of type PyObject*, and Py_INCREF and Py_DECREF are macros that
> access the internal structure of a PyObject, which contains the member
> ob_refcount.

Okay. Assume that CPython wished to switch to a non-ref-counting gc;
that there was some advantage to doing so, so that non-counted objects
would be the default; and that a mechanism not described here were
introduced to allow objects to be created as ref-counted when tighter
finalization timing was important.

From a perhaps superficial understanding of this, it looks like it
would be easy enough to implement the INCREF/DECREF changes for this.
It seems likely to make the generated machine code simpler to use a
sentinel value in the ob_refcnt field, and the most convenient value is
probably zero. The macros would then look like this (without debug
aids):

#define Py_INCREF(op) if ((op)->ob_refcnt != 0) ((op)->ob_refcnt++)
#define Py_DECREF(op) \
        if ((op)->ob_refcnt == 0 || --(op)->ob_refcnt != 0) \
            ; \
        else \
            _Py_Dealloc((PyObject *)(op))

BTW, why is DECREF written in that unnatural way?

This adds a little code size to both, and probably makes INCREF a
little slower; DECREF should be a trifle faster for the majority of
objects that are not ref-counted, and it might even turn out to be
worthwhile to move the decrement to a utility function to reclaim the
code space if ref-counting were infrequently enough used. Or would the
call to __del__ be inlined now that this code is no longer responsible
for scavenging the memory?

> This doesn't translate to Jython at all, since Jython uses native VM
> instructions and the native VM stack to manipulate object references, be
> it references to Jython objects, or to "pure" Java objects.

I don't understand what you're saying here at all. Do you really mean
that a Java programmer *cannot* implement reference counting in a way
analogous to what can be done in, say, C++? Of course there are issues
about letting references escape into non-counted contexts, and in Java
one wouldn't be counting for the purpose of scavenging memory.

Paul Moore

unread,
Dec 29, 2002, 10:19:29 AM12/29/02
to
pin...@iro.umontreal.ca (François Pinard) writes:

>> [...] The proper code:
>
>> fp = open('myfile','r')
>> data = fp.read()
>> fp.close()
>
>> is not as pretty.
>
> This code is surely proper in Jython, working around the fact that Jython
> relies on Java's garbage collector.

Technically, "correct" code should do

fp = open('myfile', 'r')
try:
    data = fp.read()
finally:
    fp.close()

as the read() call could raise an exception. And you should catch
exceptions in the open() call. And there are probably other things
that could go wrong.

This is a "practicality beats purity" issue.

The bigger point here is that the C++ "resource acquisition is
initialisation" idiom (and associated principles such as releasing
resources in destructors) doesn't carry across unchanged to Python.
You *can* use the idea, but it's not the end of the story for
exception safety in Python. (For example, Python has "finally", where
C++ doesn't.)

Paul.
--
This signature intentionally left blank

Paul Moore

unread,
Dec 29, 2002, 10:08:14 AM12/29/02
to
"Robert Oschler" <Osc...@earthlink.net> writes:

> "Martin v. Löwis" <mar...@v.loewis.de> wrote in message
> news:au83ji$4fa$00$1...@news.t-online.com...
>
>> I see. This is precisely the definition of Python's __del__ method:
>> it is called when the reference count drops to zero, atleast for
>> CPython.
>>
>
> True, but unless I misread the previous posts in this thread, the
> language spec does not guarantee this behavior, even for CPython,
> and therefore taking advantage of it might "break" something in the
> future. If it became a guaranteed item then that would be great.
> Xmas? :)

The language spec does not guarantee the existence of a concept of
"the reference count" of an object. All the spec says about __del__ is
that it is "called when the instance is about to be destroyed".
Unfortunately, this statement is too weak to be of much practical
value.

For many practical uses, relying on CPython's reference counting
semantics is both useful and understandable. But it doesn't always
give you the results you'd like (cycles, and destruction of objects
alive at program termination being the 2 obvious cases). And you are
risking portability if you rely on this behaviour.

Portability to Jython is easy enough to check (Jython has no reference
counting, and so the timing of __del__ calls is not deterministic).
Portability to a not-yet-created implementation of Python is not
checkable, in general - it's hard to see why anyone really cares about
this, for anything other than theoretical reasons, though.

In theory, an entirely valid implementation of Python (the language)
could allocate objects and *never* delete them, even when they become
inaccessible. (This would have sub-optimal memory consumption
characteristics, of course :-)) In such an implementation, __del__
would never be called.

Does this make __del__ useless? I doubt it...

On the other hand, guaranteeing deterministic __del__ behaviour does
pretty much force the implementation to use reference counting. And
while refcounts have been a reasonably good GC mechanism in CPython
for a long time now, they certainly aren't the final word on the
subject. And as Jython shows, they may not be sensible to implement.
So I don't imagine that deterministic __del__ behaviour will be
mandated in the foreseeable future.
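Paul's two failure cases can be demonstrated concretely; the following is a CPython-specific sketch (invented Noisy class), and the behaviour after gc.collect() assumes a CPython recent enough to finalize cyclic objects:

```python
# CPython-specific illustration: with reference counting, __del__ runs
# as soon as the last reference disappears -- unless the object sits in
# a reference cycle, in which case __del__ is delayed until the cycle
# collector runs.

import gc

log = []

class Noisy:
    def __del__(self):
        log.append('deleted')

n = Noisy()
del n                   # refcount hits zero: __del__ runs immediately
assert log == ['deleted']

a = Noisy()
a.self = a              # reference cycle: refcount can never reach zero
del a                   # __del__ is NOT called here...
assert log == ['deleted']

gc.collect()            # ...only when the cycle collector runs
print(log)              # ['deleted', 'deleted'] on modern CPython
```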

Ganesan R

unread,
Dec 29, 2002, 11:17:34 PM12/29/02
to
>>>>> "Stuart" == Stuart D Gathman <stu...@bmsi.com> writes:

> 1) This feature should not reclaim memory, only call __del__(). If
> references were still live, the effect would be the same as calling
> __del__() on a live object.

> 2) __del__() would get called again when memory is actually reclaimed.
> Therefore, your feature would be much safer if it called another special
> function - say close() or dispose() at the end of the block.

I actually like C#'s deterministic cleanup with the "using" block for
this. See http://www.25hoursaday.com/CsharpVsJava.html#idisposable.
I am not sure how to adapt this for Python. Maybe something like:

using myobj as MyClass("A MyClass Object"):
    do something with myobj
# runtime calls dispose on the myobj object once the "using" code block
# is exited, even if an exception is thrown.
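For comparison, here is what that hypothetical "using" block would expand to with today's syntax; MyClass and its dispose() method are invented names standing in for an IDisposable-style object:

```python
# Sketch: the try/finally expansion of the hypothetical "using" block.

class MyClass:
    def __init__(self, label):
        self.label = label
        self.disposed = False
    def dispose(self):
        self.disposed = True

myobj = MyClass("A MyClass Object")
try:
    result = myobj.label.upper()     # "do something with myobj"
finally:
    myobj.dispose()                  # runs even if an exception is thrown

print(result, myobj.disposed)        # A MYCLASS OBJECT True
```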

Ganesan

--
Ganesan R

Stuart D. Gathman

unread,
Dec 29, 2002, 11:30:39 PM12/29/02
to

Can you think of a way to register a function to get called when the
current block exits? If so, then it is pretty straightforward to
implement the proposal in pure python. The auto function simply adds the
object to a list, and the block exit function calls the cleanup function
for objects in the list with all the appropriate try..finally wrappers.

Actually, you would want to register a function for the calling block, or
perhaps an arbitrary stack frame. It would also help to be able to
retrieve the current callable object for a frame to make the system
thread-safe.

I can't see any way around having to patch the interpreter to get the hook
in there. And I think the compiler needs to be involved to cleanup at
block exit (instead of function exit).

Basically, every block that contains a call to auto(obj) should be
wrapped in something like:

__auto_list__ = []
try:
    code_for_block
finally:
    __auto_exit__(__auto_list__)
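The helpers themselves need no new syntax; only the automatic planting of the try/finally does. A pure-Python sketch of the proposed machinery, with all names (Thing, auto, __auto_exit__) invented for illustration:

```python
# Pure-Python sketch of the proposed auto()/__auto_exit__() machinery.

class Thing:
    """Stand-in resource with a close() cleanup method."""
    def __init__(self):
        self.open = True
    def close(self):
        self.open = False

__auto_list__ = []

def auto(obj):
    __auto_list__.append(obj)
    return obj

def __auto_exit__(auto_list):
    # clean up in reverse registration order; one failing cleanup
    # must not suppress the others
    for obj in reversed(auto_list):
        try:
            obj.close()
        except Exception:
            pass

try:
    t1 = auto(Thing())
    t2 = auto(Thing())
    # code_for_block
finally:
    __auto_exit__(__auto_list__)

print(t1.open, t2.open)  # False False
```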

Isaac To

unread,
Dec 30, 2002, 12:18:44 AM12/30/02
to
>>>>> "Stuart" == Stuart D Gathman <stu...@bmsi.com> writes:

Stuart> Can you think of a way to register a function to get called when
Stuart> the current block exits? If so, then it is pretty
Stuart> straightforward to implement the proposal in pure python. The
Stuart> auto function simply adds the object to a list, and the block
Stuart> exit function calls the cleanup function for objects in the list
Stuart> with all the appropriate try..finally wrappers.

Perhaps there is no need to do it at the block level. After all, there are
no block-level variables in Python either. (Even "[a+1 for a in [1]]"
leaves a local variable a in the function.) It might be simpler if
variable management were done when function, class and file scopes exit.

Stuart> Actually, you would want to register a function for the calling
Stuart> block, or perhaps an arbitrary stack frame. It would also help
Stuart> to be able to retrieve the current callable object for a frame
Stuart> to make the system thread-safe.

No comment, since I don't quite understand this.

Stuart> I can't see anyway around having to patch the interpreter to get
Stuart> the hook in there. And I think the compiler needs to be
Stuart> involved to cleanup at block exit (instead of function exit).

Yes, any way to mimic that without modifying the language itself would lead
to ugly code. So I ask about a PEP.

Regards,
Isaac.

holger krekel

unread,
Dec 30, 2002, 8:26:34 AM12/30/02
to

I am currently experimentally implementing something along these lines.
It does require introducing new syntax along with a new object protocol.
I currently call it 'Indented Execution' as it allows an object to define
__enter__ and __leave__ methods wrapping execution of 'its' indented block.

The design is completely orthogonal to any refcount or memory management details.

Though it is beginning to work, it's still in flux as I get to know the
Python core internals better. Btw, it's a pleasure reading and modifying
the core Python C code. I hadn't touched C/C++/Java much since I got to
know Python. I was surprised at the readability of the core code. I try
to keep the changes as minimal as possible.

A preview example (could still change) of what my experiments accomplish:

<autoclose f1=open('/etc/passwd') f2=open('/etc/shadow')>
    data1 = f1.read()
    data2 = f2.read()

# f1 and f2 will be closed here, even if there was an exception
# in the indented code block. Any exception will be propagated
# of course.

'autoclose' is the name of an object which provides

def __enter__(self, **kw):
    "keyword arguments containing e.g. the above 'f1' and 'f2'"

def __leave__(self):
    """called if execution leaves the indented block"""

I didn't see an easy way to reuse the 'SETUP_FINALLY/END_FINALLY'
bytecodes, so I introduced new bytecodes. They might go away later.
But more about this when I have figured out the gory details of
local namespaces & compiler passes, among other things. First I want
to get my unit tests to pass :-)

It's not just a coincidence that this resembles XML syntax. Actually I
started out using the Pythonic ':' syntax, but that had a few problems
(syntax ambiguity among others). Besides, 'Indented Execution' could allow
defining XML documents and conversions with a very familiar (Python + XML)
syntax.

There will be no closing tags, though.

The whole idea is to build on indentation as opposed to funny closing
characters or tags. The open/close and acquire/release protocols are
a variation of this theme.

OK, that's enough for now. I couldn't stop myself from telling you about
this fun project :-) It's actually motivated by recent discussions here
on c.l.py and a suggestion by Michael Hudson (who aims to provide a
new 'with' keyword achieving a significant subset of my goals).

cheers,

holger
