
Problem with apsw and garbage collection


Nikolaus Rath
Jun 11, 2009, 10:28:05 PM
Hi,

Please consider this example:

vacuum.py

Nikolaus Rath
Jun 12, 2009, 6:33:13 PM
Nikolaus Rath <Niko...@rath.org> writes:
> Hi,
>
> Please consider this example:
[....]

I think I managed to narrow down the problem a bit. It seems that when
a function returns normally, its local variables are immediately
destroyed. However, if the function is left due to an exception, the
local variables remain alive:

---------snip---------
#!/usr/bin/env python
import gc

class testclass(object):
    def __init__(self):
        print "Initializing"

    def __del__(self):
        print "Destructing"

def dostuff(fail):
    obj = testclass()

    if fail:
        raise TypeError

print "Calling dostuff"
dostuff(fail=False)
print "dostuff returned"

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

gc.collect()
print "dostuff returned"
---------snip---------


Prints out:


---------snip---------
Calling dostuff
Initializing
Destructing
dostuff returned
Calling dostuff
Initializing
dostuff returned
Destructing
---------snip---------


Is there a way to have the obj variable (that is created in dostuff())
destroyed earlier than at the end of the program? As you can see, I
already tried to explicitly call the garbage collector, but this does
not help.


Best,


-Nikolaus

--
»Time flies like an arrow, fruit flies like a Banana.«

PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6 02CF A9AD B7F8 AE4E 425C

MRAB
Jun 12, 2009, 9:26:50 PM
Are the objects retained because there's a reference to the stack
frame(s) in the traceback?
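
MRAB's hunch can be checked directly. Here is a small sketch of my own (not from the thread, CPython-specific, using a hypothetical `Tracked` class): the traceback object chains to the frame of the failed call, and that frame keeps the function's locals alive until the traceback itself is released:

```python
import sys

class Tracked(object):
    deleted = False
    def __del__(self):
        Tracked.deleted = True

def dostuff():
    obj = Tracked()  # the local we expect to die with the frame
    raise TypeError

try:
    dostuff()
except TypeError:
    tb = sys.exc_info()[2]
    # the traceback chains to the frame of dostuff(), and that
    # frame still holds obj alive in its locals
    assert 'obj' in tb.tb_next.tb_frame.f_locals
    del tb

# once the handler exits and nothing references the traceback any
# more, the frame, and obj with it, is reclaimed immediately by
# CPython's reference counting
assert Tracked.deleted
```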

Piet van Oostrum
Jun 13, 2009, 9:06:09 AM
>>>>> Nikolaus Rath <Niko...@rath.org> (NR) wrote:

>NR> Is there a way to have the obj variable (that is created in dostuff())
>NR> destroyed earlier than at the end of the program? As you can see, I
>NR> already tried to explicitly call the garbage collector, but this does
>NR> not help.

The exact time of the destruction of objects is an implementation detail
and should not be relied upon.
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org

Aahz
Jun 13, 2009, 8:05:09 PM
In article <873aa5m...@vostro.rath.org>,

Nikolaus Rath <Niko...@rath.org> wrote:
>
>I think I managed to narrow down the problem a bit. It seems that when
>a function returns normally, its local variables are immediately
>destroyed. However, if the function is left due to an exception, the
>local variables remain alive:

Correct. You need to get rid of the stack trace somehow; the simplest
way is to wrap things in layers of functions (i.e. return from the
function with try/except and *don't* save the traceback). Note that if
your goal is to ensure finalization rather than recovering memory, you
need to do that explicitly rather than relying on garbage collection.
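
Aahz's suggestion can be sketched as follows (my own illustration, assuming CPython's reference counting; `Tracked` and `dostuff` are hypothetical stand-ins): do the try/except inside a small wrapper function and do not save the traceback, so the exception state dies together with the wrapper's frame:

```python
class Tracked(object):
    deleted = False
    def __del__(self):
        Tracked.deleted = True

def dostuff():
    obj = Tracked()  # local that would be pinned by a saved traceback
    raise TypeError

def call_dostuff():
    # catch here and do NOT save the traceback; when this wrapper
    # returns, the exception state (and the frames it pins) goes away
    try:
        dostuff()
    except TypeError:
        pass

call_dostuff()
# back at the caller, nothing references dostuff()'s frame any more,
# so its local obj has already been finalized
assert Tracked.deleted
```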
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer

Mike Kazantsev
Jun 14, 2009, 1:55:09 AM
On Fri, 12 Jun 2009 18:33:13 -0400
Nikolaus Rath <Niko...@rath.org> wrote:

> Nikolaus Rath <Niko...@rath.org> writes:
> > Hi,
> >
> > Please consider this example:
> [....]
>
> I think I managed to narrow down the problem a bit. It seems that when
> a function returns normally, its local variables are immediately
> destroyed. However, if the function is left due to an exception, the
> local variables remain alive:
>

...


>
> Is there a way to have the obj variable (that is created in dostuff())
> destroyed earlier than at the end of the program? As you can see, I
> already tried to explicitly call the garbage collector, but this does
> not help.

Strange that no one has suggested contextlib, which was made _exactly_
for this purpose:


#!/usr/bin/env python
import gc

class testclass(object):
    def __init__(self):
        self.alive = True # just for example
        print "Initializing"

    def __del__(self):
        # guard so destruction won't run twice if __exit__
        # has already done the cleanup
        if self.alive:
            print "Destructing"
            self.alive = False

    def __enter__(self):
        return self  # bind the instance to the "as" target

    def __exit__(self, ex_type, ex_val, ex_trace):
        self.__del__()
        return False  # propagate the original exception, if any


def dostuff(fail):
    with testclass() as obj:
        # some stuff
        if fail:
            raise TypeError
        # some more stuff
        print "success"


print "Calling dostuff"
dostuff(fail=False)
print "dostuff returned"

try:
    print "Calling dostuff"
    dostuff(fail=True)
except TypeError:
    pass

gc.collect()
print "dostuff returned"


And it doesn't matter where you use "with": it creates a scoped
context whose cleanup runs before anything else happens at the higher
level.

Another simplified case, similar to yours, is file objects:


with open(tmp_path, 'w') as file:
    pass  # write_ops
os.rename(tmp_path, path)

So whatever happens inside "with", the file ends up closed; otherwise
os.rename might replace a valid path with a zero-length file.

It should be easy to wrap a cursor with contextlib; consider the
contextmanager decorator:


from contextlib import contextmanager

@contextmanager
def get_cursor():
    # create the cursor before the try-block, so that "finally"
    # never sees an unbound name
    cursor = conn.cursor()
    try:
        yield cursor
    finally:
        cursor.close()

with get_cursor() as cursor:
    pass  # whatever ;)

--
Mike Kazantsev // fraggod.net


Lawrence D'Oliveiro
Jun 16, 2009, 12:45:43 AM
In message <m2eitow...@cs.uu.nl>, Piet van Oostrum wrote:

> The exact time of the destruction of objects is an implementation detail
> and should not be relied upon.

That may be true in Java and other corporate-herd-oriented languages, but we
know that dynamic languages like Perl and Python make heavy use of
reference-counting wherever they can. If it's easy to satisfy yourself that
the lifetime of an object will be delimited in this way, I don't see why you
can't rely upon it.

Steven D'Aprano
Jun 16, 2009, 3:34:35 AM

Reference counting is an implementation detail used by CPython but not
IronPython or Jython. I don't know about the dozen or so other minor/new
implementations, like CLPython, PyPy, Unladen Swallow or CapPython.

In other words, if you want to write *Python* code rather than CPython
code, don't rely on ref-counting.


--
Steven

Lawrence D'Oliveiro
Jun 17, 2009, 1:52:30 AM
In message <pan.2009.06...@REMOVE.THIS.cybersource.com.au>, Steven
D'Aprano wrote:

> On Tue, 16 Jun 2009 16:45:43 +1200, Lawrence D'Oliveiro wrote:
>
>> In message <m2eitow...@cs.uu.nl>, Piet van Oostrum wrote:
>>
>>> The exact time of the destruction of objects is an implementation
>>> detail and should not be relied upon.
>>
>> That may be true in Java and other corporate-herd-oriented languages,
>> but we know that dynamic languages like Perl and Python make heavy use
>> of reference-counting wherever they can. If it's easy to satisfy
>> yourself that the lifetime of an object will be delimited in this way, I
>> don't see why you can't rely upon it.
>
> Reference counting is an implementation detail used by CPython but not

> [implementations built on runtimes designed for corporate-herd-oriented
> languages, like] IronPython or Jython.

I rest my case.

Paul Rubin
Jun 17, 2009, 2:13:41 AM
Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> writes:
> > Reference counting is an implementation detail used by CPython but not
> > [implementations built on runtimes designed for corporate-herd-oriented
> > languages, like] IronPython or Jython.
>
> I rest my case.

You're really being pretty ignorant. I don't know of any serious Lisp
system that uses reference counting, both for performance reasons and
to make sure cyclic structures are reclaimed properly. Lisp is
certainly not a corporate herd language.

Even CPython doesn't rely completely on reference counting (it has a
fallback gc for cyclic garbage). Python introduced the "with"
statement to get away from the kludgy CPython programmer practice of
opening files and relying on the file being closed when the last
reference went out of scope.
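
That fallback collector is easy to observe (a sketch of mine, not from the thread): a reference cycle whose external references are dropped is invisible to pure reference counting, but gc.collect() reclaims it. Note that in the Python 2 of this thread, cycles containing objects with __del__ were parked in gc.garbage rather than freed; since Python 3.4 (PEP 442) they are collected as well, which the final assertion assumes:

```python
import gc

class Node(object):
    finalized = 0
    def __del__(self):
        Node.finalized += 1

# build a two-node cycle and drop all external references to it
a, b = Node(), Node()
a.partner, b.partner = b, a
del a, b

# reference counting alone never frees the cycle: each node still
# holds the other. The cyclic collector finds and reclaims both.
gc.collect()
assert Node.finalized == 2
```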

Steven D'Aprano
Jun 17, 2009, 2:16:54 AM

CLPython and Unladen Swallow do not use reference counting. I suppose you
might successfully argue that Lisp is a corporate-herd-oriented language,
and that Google (the company behind Unladen Swallow) is a corporate-herd.
But PyPy doesn't use reference counting either. Perhaps you think that
Python is a language designed for corporate-herds too?

--
Steven

Lawrence D'Oliveiro
Jun 17, 2009, 7:29:48 AM
In message <7x7hzbv...@ruckus.brouhaha.com>, Paul Rubin wrote:

> Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> writes:
>
>> > Reference counting is an implementation detail used by CPython but not
>> > [implementations built on runtimes designed for corporate-herd-oriented
>> > languages, like] IronPython or Jython.
>>
>> I rest my case.
>
> You're really being pretty ignorant. I don't know of any serious Lisp
> system that uses reference counting, both for performance reasons and
> to make sure cyclic structures are reclaimed properly.

Both of which, oddly enough, more modern dynamic languages like Python
manage perfectly well.

Charles Yeomans
Jun 17, 2009, 7:49:52 AM

I'm curious as to why you consider this practice to be kludgy; my
experience with RAII is pretty good.

Charles Yeomans

Steven D'Aprano
Jun 17, 2009, 9:43:37 PM
On Wed, 17 Jun 2009 07:49:52 -0400, Charles Yeomans wrote:

>> Even CPython doesn't rely completely on reference counting (it has a
>> fallback gc for cyclic garbage). Python introduced the "with"
>> statement to get away from the kludgy CPython programmer practice of
>> opening files and relying on the file being closed when the last
>> reference went out of scope.
>
> I'm curious as you why you consider this practice to be kludgy; my
> experience with RAII is pretty good.

Because it encourages harmful laziness. Laziness is only a virtue when it
leads to good code for little effort, but in this case, it leads to
non-portable code. Worse, if your data structures include cycles, it also
leads to resource leaks.

--
Steven

Steven D'Aprano
Jun 17, 2009, 9:44:03 PM

*Python* doesn't have a ref counter. That's an implementation detail of
*CPython*. There is nothing in the specifications for the language Python
which requires a ref counter.

CPython's ref counter is incapable of dealing with cyclic structures, and
so it has a second garbage collector specifically for that purpose. The
only reason Python manages perfectly well is by NOT relying on a ref
counter: some implementations don't have one at all, and the one which
does, uses a second gc.

Additionally, while I'm a fan of the simplicity of CPython's ref counter,
one serious side effect of it is that it requires the GIL, which
essentially means CPython is crippled on multi-core CPUs compared to
non-ref-counting implementations.

--
Steven

Charles Yeomans
Jun 17, 2009, 10:58:27 PM


Memory management may be an "implementation detail", but it is
unfortunately one that illustrates the so-called law of leaky
abstractions. So I think that one has to write code that follows the
memory management scheme of whatever language one uses. For code
written for CPython only, as mine is, RAII is an appropriate idiom and
not kludgy at all. Under your assumptions, its use would be wrong, of
course.

Charles Yeomans

Steven D'Aprano
Jun 17, 2009, 11:41:55 PM


CPython isn't a language, it's an implementation.

I'm unable to find anything in the Python Reference which explicitly
states that files will be closed when garbage collected, except for one
brief mention in tempfile.TemporaryFile:

"Return a file-like object that can be used as a temporary storage area.
The file is created using mkstemp(). It will be destroyed as soon as it
is closed (including an implicit close when the object is garbage
collected)."

http://docs.python.org/library/tempfile.html

In practical terms, it's reasonably safe to assume Python will close
files when garbage collected (it would be crazy not to!) but that's not
explicitly guaranteed anywhere I can see. In any case, there is no
guarantee *when* files will be closed -- for long-lasting processes that
open and close a lot of files, or for data structures with cycles, you
might easily run out of file descriptors.


The docs for file give two recipes for recommended ways of dealing with
files:

http://docs.python.org/library/stdtypes.html#file.close

Both of them close the file, one explicitly, the other implicitly. In
both cases, they promise to close the file as soon as you are done with
it. Python the language does not.


The tutorials explicitly recommends closing the file when you're done:

"When you’re done with a file, call f.close() to close it and free up any
system resources taken up by the open file."

http://docs.python.org/tutorial/inputoutput.html#reading-and-writing-files


In summary: relying on immediate closure of files is
implementation-specific behaviour. By all means do so, but with your eyes
open and with full understanding that you're relying on
implementation-specific behaviour with no guarantee of when, or even if,
files will be closed.

--
Steven

Piet van Oostrum
Jun 18, 2009, 8:08:23 AM
>>>>> Charles Yeomans <cha...@declareSub.com> (CY) wrote:

>CY> Memory management may be an "implementation detail", but it is
>CY> unfortunately one that illustrates the so-called law of leaky
>CY> abstractions. So I think that one has to write code that follows the
>CY> memory management scheme of whatever language one uses. For code written
>CY> for CPython only, as mine is, RAII is an appropriate idiom and not kludgy
>CY> at all. Under your assumptions, its use would be wrong, of course.

I dare say that, even in CPython, it is doomed to disappear, but we
don't know yet on what timescale.

Aahz
Jun 18, 2009, 12:09:24 PM
In article <pan.2009.06...@REMOVE.THIS.cybersource.com.au>,

Steven D'Aprano <ste...@REMOVE.THIS.cybersource.com.au> wrote:
>
>Additionally, while I'm a fan of the simplicity of CPython's ref counter,
>one serious side effect of it is that it requires the GIL, which
>essentially means CPython is crippled on multi-core CPUs compared to
>non-ref-counting implementations.

Your bare "crippled" is an unfair overstatement. What you meant to
write was that computational multi-threaded applications that don't use
NumPy are crippled. Otherwise you're simply spreading FUD.
