Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Q on explicitly calling file.close

2 views
Skip to first unread message

kj

unread,
Sep 5, 2009, 12:14:02 PM9/5/09
to

There's something wonderfully clear about code like this:

# (1)
def spam(filename):
for line in file(filename):
do_something_with(line)

It is indeed pseudo-codely beautiful. But I gather that it is not
correct to do this, and that instead one should do something like

# (2)
def spam(filename):
fh = file(filename)
try:
for line in fh:
do_something_with(line)
finally:
fh.close()

...or alternatively, if the with-statement is available:

# (3)
def spam(filename):
with file(filename) as fh:
for line in fh:
do_something_with(line)

Mind you, (3) is almost as simple as (1) (only one additional line),
but somehow it lacks (1)'s direct simplicity. (And it adds one
more indentation level, which I find annoying.) Furthermore, I
don't recall ever coming across either (2) or (3) "in the wild",
even after reading a lot of high-quality Python code (e.g. standard
library modules).

Finally, I was under the impression that Python closed filehandles
automatically when they were garbage-collected. (In fact (3)
suggests as much, since it does not include an implicit call to
fh.close.) If so, the difference between (1) and (3) does not seem
very big. What am I missing here?

kynn

MRAB

unread,
Sep 5, 2009, 1:01:59 PM9/5/09
to pytho...@python.org
CPython uses reference counting, so an object is garbage collected as
soon as there are no references to it, but that's just an implementation
detail.

Other implementations, such as Jython and IronPython, don't use
reference counting, so you don't know when an object will be garbage
collected, which means that the file might remain open for an unknown
time afterwards in case 1 above.

Most people use CPython, so it's not surprising that case 1 is so
common.

Dave Angel

unread,
Sep 5, 2009, 2:17:44 PM9/5/09
to kj, pytho...@python.org
We have to distinguish between reference counted and garbage collected.
As MRAB says, when the reference count goes to zero, the file is
immediately closed, in CPython implementation. So all three are
equivalent on that platform.

But if you're not sure the code will run on CPython, then you have to
have something that explicitly catches the out-of-scopeness of the file
object. Both your (2) and (3) do that, with different syntaxes.

DaveA

r

unread,
Sep 5, 2009, 2:54:07 PM9/5/09
to

Stop being lazy and close the file. You don't want open file objects
just floating around in memory. Even the docs says something like
"yes, python will free the memory associated with a file object but
you can never *really* be sure *when* this will happen, so just
explicitly close the damn thing!". Besides, you can't guarantee that
any data has been written without calling f.flush() or f.close()
first. What if your program crashes and no data is written? huh?

I guess i could put my pants on by jumping into both legs at the same
time thereby saving one step, but i my fall down and break my arm. I
would much rather just use the one leg at a time approach...

Message has been deleted

Tim Chase

unread,
Sep 5, 2009, 4:04:55 PM9/5/09
to MRAB, pytho...@python.org
> CPython uses reference counting, so an object is garbage collected as
> soon as there are no references to it, but that's just an implementation
> detail.
>
> Other implementations, such as Jython and IronPython, don't use
> reference counting, so you don't know when an object will be garbage
> collected, which means that the file might remain open for an unknown
> time afterwards in case 1 above.
>
> Most people use CPython, so it's not surprising that case 1 is so
> common.

Additionally, many scripts just use a small number of files (say,
1-5 files) so having a file-handle open for the duration of the
run it minimal overhead.

On the other hand, when processing thousands of files, I always
explicitly close each file to make sure I don't exhaust some
file-handle limit the OS or interpreter may enforce.

-tkc


r

unread,
Sep 5, 2009, 5:52:23 PM9/5/09
to
On Sep 5, 2:47 pm, Dennis Lee Bieber <wlfr...@ix.netcom.com> wrote:
(snip)

> > Finally, I was under the impression that Python closed filehandles
> > automatically when they were garbage-collected.  (In fact (3)
> > suggests as much, since it does not include an implicit call to
> > fh.close.) If so, the difference between (1) and (3) does not seem
> > very big.  What am I missing here?

True, but i find the with statement (while quite useful in general
practice) is not a "cure all" for situations that need and exception
caught. In that case the laborious finger wrecking syntax of "f.close
()" must be painstaking typed letter by painful letter.

f-.-c-l-o-s-e-(-)

It's just not fair ;-(

Steven D'Aprano

unread,
Sep 5, 2009, 7:41:08 PM9/5/09
to
On Sat, 05 Sep 2009 16:14:02 +0000, kj wrote:

> Finally, I was under the impression that Python closed filehandles
> automatically when they were garbage-collected. (In fact (3) suggests
> as much, since it does not include an implicit call to fh.close.) If so,
> the difference between (1) and (3) does not seem very big. What am I
> missing here?

(1) Python the language will close file handles, but doesn't guarantee
when. Some implementations (e.g. CPython) will close them immediately the
file object goes out of scope. Others (e.g. Jython) will close them
"eventually", which may be when the program exists.

(2) If the file object never goes out of scope, say because you've stored
a reference to it somewhere, the file will never be closed and you will
leak file handles. Since the OS only provides a finite number of them,
any program which uses large number of files is at risk of running out.

(3) For quick and dirty scripts, or programs that only use one or two
files, relying on the VM to close the file is sufficient (although lazy
in my opinion *wink*) but for long-running applications using many files,
or for debugging, you may want more control over what happens when.


--
Steven

kj

unread,
Sep 5, 2009, 9:51:50 PM9/5/09
to
In <02b2e6ca$0$17565$c3e...@news.astraweb.com> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> writes:

>(3) For quick and dirty scripts, or programs that only use one or two
>files, relying on the VM to close the file is sufficient (although lazy
>in my opinion *wink*)

It's not a matter of laziness or industriousness, but rather of
code readability. The real problem here is not the close() per
se, but rather all the additional machinery required to ensure that
the close happens. When the code is working with multiple file
handles simultaneously, one ends up with a thicket of try/finally's
that makes the code just *nasty* to look at. E.g., even with only
two files, namely an input and an output file, compare:

def nice(from_, to_):
to_h = file(to_, "w")
for line in file(from_):
print >> to_h, munge(line)

def nasty(from_, to_):
to_h = file(to_, "w")
try:
from_h = file(from_)
try:
for line in from_h:
print >> to_h, munge(line)
finally:
from_h.close()
finally:
to_h.close()

I leave to your imagination the joys of reading the code for
hairy(from_, to_, log_), where log_ is a third file to collect
warning messages.

kynn

Steven D'Aprano

unread,
Sep 5, 2009, 11:56:34 PM9/5/09
to
On Sun, 06 Sep 2009 01:51:50 +0000, kj wrote:

> In <02b2e6ca$0$17565$c3e...@news.astraweb.com> Steven D'Aprano
> <st...@REMOVE-THIS-cybersource.com.au> writes:
>
>>(3) For quick and dirty scripts, or programs that only use one or two
>>files, relying on the VM to close the file is sufficient (although lazy
>>in my opinion *wink*)
>
> It's not a matter of laziness or industriousness, but rather of code
> readability. The real problem here is not the close() per se, but
> rather all the additional machinery required to ensure that the close
> happens. When the code is working with multiple file handles
> simultaneously, one ends up with a thicket of try/finally's that makes
> the code just *nasty* to look at.

Yep, that's because dealing with the myriad of things that *might* (but
probably won't) go wrong when dealing with files is *horrible*. Real
world code is almost always much nastier than the nice elegant algorithms
we hope for.

Most people know they have to deal with errors when opening files. The
best programmers deal with errors when writing to files. But only a few
of the most pedantic coders even attempt to deal with errors when
*closing* the file. Yes, closing the file can fail. What are you going to
do about it? At the least, you should notify the user, then continue.
Dying with an uncaught exception in the middle of processing millions of
records is Not Cool. But close failures are so rare that we just hope
we'll never experience one.

It really boils down to this... do you want to write correct code, or
elegant code?

--
Steven

Terry Reedy

unread,
Sep 6, 2009, 10:03:43 AM9/6/09
to pytho...@python.org
Stephen Hansen wrote:

> This is precisely why the with statement exists; to provide a cleaner
> way to wrap a block in setup and teardown functions. Closing is one.
> Yeah, you get some extra indentation-- but you sorta have to live with
> it if you're worried about correct code. I think it's a good compromise
> between your examples of nasty and nice :)
>
> def compromise(from_, to_):
> with file(to_) as to_h:
> with file(from_) as from_h:


> for line in from_h:
> print >> to_h, munge(line)
>

> It's just too bad that 'with' doesn't support multiple separate "x as y"
> clauses.

The developers already agreed with you ;-).

"With more than one item, the context managers are processed as if
multiple with statements were nested:

with A() as a, B() as b:
suite
is equivalent to

with A() as a:
with B() as b:
suite
Changed in version 3.1: Support for multiple context expressions.
"

(I suspect this will also be in 2.7)

Terry Jan Reedy

Jan Kaliszewski

unread,
Sep 6, 2009, 2:14:02 PM9/6/09
to pytho...@python.org
05-09-2009 r <rt8...@gmail.com> wrote:

> i find the with statement (while quite useful in general
> practice) is not a "cure all" for situations that need and exception
> caught.

In what sense?

I think that:

with open(...) as f:
foo...

is equivalent to:

f = open(...)
try:
foo...
finally:
f.close()

Obviously it doesn't substitute catching with 'except', but I don't
see how it could disturb that.

Cheers,
*j

--
Jan Kaliszewski (zuo) <z...@chopin.edu.pl>

r

unread,
Sep 6, 2009, 7:31:54 PM9/6/09
to
On Sep 6, 1:14 pm, "Jan Kaliszewski" <z...@chopin.edu.pl> wrote:
> 05-09-2009 r <rt8...@gmail.com> wrote:
> > i find the with statement (while quite useful in general
> > practice) is not a "cure all" for situations that need and exception
> > caught.
>
> In what sense?

*ahem*! in the sense that the with statement (while quite useful in


general practice) is not a "cure all" for situations that need and

exception caught ;-)

> I think that:
>
>     with open(...) as f:
>        foo...
>
> is equivalent to:
>
>     f = open(...)
>     try:
>         foo...
>     finally:
>         f.close()
>
> Obviously it doesn't substitute catching with 'except'

My sentiments exactly...?


Get Enlightened -> http://jjsenlightenments.blogspot.com/

Gabriel Genellina

unread,
Sep 7, 2009, 10:57:10 PM9/7/09
to pytho...@python.org
En Sun, 06 Sep 2009 21:20:52 -0300, Stephen Hansen <apt.s...@gmail.com>
escribi�:

> On Sun, Sep 6, 2009 at 4:31 PM, r <rt8...@gmail.com> wrote:
>> On Sep 6, 1:14 pm, "Jan Kaliszewski" <z...@chopin.edu.pl> wrote:
>> > 05-09-2009 r <rt8...@gmail.com> wrote:

>> > > i find the with statement (while quite useful in general
>> > > practice) is not a "cure all" for situations that need and exception
>> > > caught.
>

> But what does that even mean? What's the -problem- that it isn't
> 'curing'?
> That 'with' doesn't completely replace all uses of 'try'? It was never
> meant
> to-- that seems like a completely different problem area. But, if you do
> want it to handle exception catching, that's just a question of making a
> context manager that does so.
>
> import contextlib
>
> @contextlib.contextmanager
> def file_with_errorhandling(filename, mode,
> exceptions=(IOError,OSError)):
> fp = file(filename, mode)
> try:
> yield fp
> except exceptions:
> print "Handle file-operation errors gracefully here."
> fp.close()
>
> with file_with_errorhandling(filename, 'r') as fp:
> fp.read() # or whatever
>
> True, the context manager provided by file objects just lets exceptions
> propagate up but that's usually the desired behavior. You can make
> context
> managers for particular problems you run across that handle exceptions if
> you want.

Note that to correctly emulate the file built-in context manager,
fp.close() should be inside a finally clause.

> The with statement isn't about never having to type try again, I don't
> think.

The with statement is intended as a replacement for common try/finally
blocks, not try/except blocks as the OP seems to imply.

--
Gabriel Genellina

David C Ullrich

unread,
Sep 9, 2009, 4:18:46 PM9/9/09
to
On Sat, 05 Sep 2009 23:41:08 +0000, Steven D'Aprano wrote:

> On Sat, 05 Sep 2009 16:14:02 +0000, kj wrote:
>
>> Finally, I was under the impression that Python closed filehandles
>> automatically when they were garbage-collected.

[...]


>
> (3) For quick and dirty scripts, or programs that only use one or two
> files, relying on the VM to close the file is sufficient (although lazy
> in my opinion *wink*)

Many years ago I read some big Python guy say something to the
effect that you didn't "need" to close files, but if you don't
it will bite you eventually.

So I learned to type "close()". Of course I wouldn't worry about
that in quick and dirty scripts. Then there was that quick and
dirty script that gradually mutated into a not-so-quick and filthy
monstrosity that a certain project depended on. Stopped working
one day.

Me being dim, it took almost a day to find the problem... (I don't
recall how exactly an open file caused a problem, it was still just
three or four files altogether.)

These days I've actually got the syntax and spelling memorized -
I can type "close()" without needing to look it up!

r

unread,
Sep 9, 2009, 4:50:24 PM9/9/09
to
On Sep 9, 3:18 pm, David C Ullrich <dullr...@sprynet.com> wrote:
(snip)

> These days I've actually got the syntax and spelling memorized -
> I can type "close()" without needing to look it up!

+1

You are so right David! I think some people around here need to look
up "code reuse". Here are a couple of simple templates for our friends
to study...

def read_file(fname, mode='rb'):
'''open file and return contents'''
try:
f = open(fname, mode)
s = f.read()
return s
except:
return 0
finally:
f.close()

def write_file(fname, s, mode='wb'):
'''open file, truncate, and write string'''
try:
f = open(fname, mode)
f.write(s)
return 1
except:
return 0
finally:
f.close()


>>> s = read_file(fname)
>>> if s:
... s += 'morecrap'
>>> write_file(s)

#-- Extra Credit --#
Create an append_file() function that takes a <filename> and <string>
as args and appends to the file returning 1 on success, and 0 on
failure.

#-- Double Extra Creidit --#
Create a backup_file() function that takes <filename> as arg and
creates a copy of the file with the extension ".bak"...
>>> backup_file('C:\test.txt') -> 'C:\test.bak'


--
def get_enlightened():
import webbrowser
url = 'http://jjsenlightenments.blogspot.com/'
webbrowser.open(url)

MRAB

unread,
Sep 9, 2009, 5:10:01 PM9/9/09
to pytho...@python.org
r wrote:
[snip]

> #-- Double Extra Creidit --#
> Create a backup_file() function that takes <filename> as arg and
> creates a copy of the file with the extension ".bak"...
>>>> backup_file('C:\test.txt') -> 'C:\test.bak'
>
You should've used raw strings. :-)

Charles Yeomans

unread,
Sep 9, 2009, 5:19:29 PM9/9/09
to Python


Unfortunately, both of these simple templates have the following
problem -- if open fails, a NameError will be raised from the finally
block.


def read_file(fname, mode='rb'):
'''open file and return contents'''

f = open(fname, mode)
try:
return f.read()

finally:
f = f.close()

I removed the except block because I prefer exceptions to error codes.

def write_file(fname, s, mode='wb'):
'''open file, truncate, and write string'''

f = open(fname, mode)
try:
f.write(s)

finally:
f = f.close()

In addition to fixing the latent bug in the second simple template, I
took the opportunity to correct your heinous violation of command-
query separation.


Charles Yeomans

r

unread,
Sep 9, 2009, 6:13:49 PM9/9/09
to
On Sep 9, 4:19 pm, Charles Yeomans <char...@declareSub.com> wrote:
(snip:)

> Unfortunately, both of these simple templates have the following  
> problem -- if open fails, a NameError will be raised from the finally  
> block.

(snip)


> I removed the except block because I prefer exceptions to error codes.

how will the caller know an exception has occurred? What if logic
depends on the validation that a file *had* or *had not* been written
too, huh?

> In addition to fixing the latent bug in the second simple template, I  
> took the opportunity to correct your heinous violation of command-
> query separation.
>
> Charles Yeomans

Oh I see! But what happens if the filename does not exist? What then?
"open" will blow chucks thats what! Here is a version for our paranoid-
schizophrenic-sadomasochist out there...

def egor_read_file(fname, mode='rb'):
print 'yes, master'
try:
f = open(fname, mode=mode)
except IOError:
return (0, 'But, the file no open master!')

try:
s = f.read()
except NameError:
return (0, 'the file still no open master!')

try:
f.close()
except:
print 'That file sure is tricky master!

return (s, 'Whew! here is the file contents, master')


> MRAB wrote:
> You should've used raw strings. :-)

rats!, you got me on that one :-)

--
free beer tomorrow @ http://jjsenlightenments.blogspot.com/


David C. Ullrich

unread,
Sep 10, 2009, 7:26:16 AM9/10/09
to
On Wed, 9 Sep 2009 15:13:49 -0700 (PDT), r <rt8...@gmail.com> wrote:

>On Sep 9, 4:19�pm, Charles Yeomans <char...@declareSub.com> wrote:
>(snip:)
>> Unfortunately, both of these simple templates have the following �
>> problem -- if open fails, a NameError will be raised from the finally �
>> block.
>
>(snip)
>> I removed the except block because I prefer exceptions to error codes.
>
>how will the caller know an exception has occurred? What if logic
>depends on the validation that a file *had* or *had not* been written
>too, huh?
>
>> In addition to fixing the latent bug in the second simple template, I �
>> took the opportunity to correct your heinous violation of command-
>> query separation.
>>
>> Charles Yeomans
>
>Oh I see! But what happens if the filename does not exist? What then?
>"open" will blow chucks thats what! Here is a version for our paranoid-
>schizophrenic-sadomasochist out there...

Well first, we agree that putting the open() in the try part of a
try-finally is wrong. try-finally is supposed to ensure that
_allocated_ resources are cleaned up.

What you do below may work. But it's essentially throwing
out exception handling and using error codes instead. There
are plenty of reasons why exceptions are preferred. The
standard thing is this:

def UseResource(rname):
r = get(rname)
try:
r.use()
finally
r.cleanup()

Now if the get() fails, UseResource will indeed blow chunks.
That's what it _should_ do. Generally speaking, you should use
try-except very rarely, making certain you're catching the
exception at the appropriate level. Here for example in
_some_ uses of UseResource() you _want_ the application
to terminate if the get() fails. In that case you'd just call
UseResource() without any try-excpt. In some other
situation (maybe the user provided a filename that may
or may not exist) you want to trap the exception when
you _call_ UseResource():

try:
UseResource(rname)
except WhateverError:
print error message


>def egor_read_file(fname, mode='rb'):
> print 'yes, master'
> try:
> f = open(fname, mode=mode)
> except IOError:
> return (0, 'But, the file no open master!')
>
> try:
> s = f.read()
> except NameError:
> return (0, 'the file still no open master!')
>
> try:
> f.close()
> except:
> print 'That file sure is tricky master!
>
> return (s, 'Whew! here is the file contents, master')

What's above seems simpler. More important, if you do
it this way then you _always_ have to check the return
value of egor_read_file and take appropriate action -
complicates the code everywhere the function is called
as well as making the function more complicated.
Doing it as in UseResource() above you don't need
to worry about whether UseResource() failed
_except_ in situations where you're certain that
that's the right level to catch the error.


>
>> MRAB wrote:
>> You should've used raw strings. :-)
>
>rats!, you got me on that one :-)

David C. Ullrich

"Understanding Godel isn't about following his formal proof.
That would make a mockery of everything Godel was up to."
(John Jones, "My talk about Godel to the post-grads."
in sci.logic.)

Gabriel Genellina

unread,
Sep 11, 2009, 4:21:35 AM9/11/09
to pytho...@python.org
En Thu, 10 Sep 2009 08:26:16 -0300, David C. Ullrich
<dull...@sprynet.com> escribiᅵ:

> On Wed, 9 Sep 2009 15:13:49 -0700 (PDT), r <rt8...@gmail.com> wrote:
>> On Sep 9, 4:19ï¿œpm, Charles Yeomans <char...@declareSub.com> wrote:

>>> I removed the except block because I prefer exceptions to error codes.
>>
>> how will the caller know an exception has occurred? What if logic
>> depends on the validation that a file *had* or *had not* been written
>> too, huh?
>>

>> Oh I see! But what happens if the filename does not exist? What then?
>> "open" will blow chucks thats what! Here is a version for our paranoid-
>> schizophrenic-sadomasochist out there...
>
> Well first, we agree that putting the open() in the try part of a
> try-finally is wrong. try-finally is supposed to ensure that
> _allocated_ resources are cleaned up.
>
> What you do below may work. But it's essentially throwing
> out exception handling and using error codes instead. There
> are plenty of reasons why exceptions are preferred. The
> standard thing is this:
>
> def UseResource(rname):
> r = get(rname)
> try:
> r.use()
> finally
> r.cleanup()

And it is so widely used that it got its own syntax (the with statement)
and library support (contextlib, for creating custom context managers). In
any decent version of Python this idiom becomes:

with get(rname) as r:
r.use

assuming get(rname) returns a suitable object (that defines __enter__ and
__exit__)

>> def egor_read_file(fname, mode='rb'):
>> print 'yes, master'
>> try:
>> f = open(fname, mode=mode)
>> except IOError:
>> return (0, 'But, the file no open master!')
>>
>> try:
>> s = f.read()
>> except NameError:
>> return (0, 'the file still no open master!')
>>
>> try:
>> f.close()
>> except:
>> print 'That file sure is tricky master!
>>
>> return (s, 'Whew! here is the file contents, master')
>
> What's above seems simpler. More important, if you do
> it this way then you _always_ have to check the return
> value of egor_read_file and take appropriate action -
> complicates the code everywhere the function is called
> as well as making the function more complicated.
> Doing it as in UseResource() above you don't need
> to worry about whether UseResource() failed
> _except_ in situations where you're certain that
> that's the right level to catch the error.

That's the standard argument showing why structured exceptions are a Good
Thing. I'd say everyone should be aware of that, giving the ample usage of
exceptions in Python, but looks like people requires a reminder from time
to time.

--
Gabriel Genellina

jhermann

unread,
Sep 23, 2009, 6:44:20 AM9/23/09
to
>     except:
>         return 0

So wrong on so many levels...

0 new messages