
Re: when should I explicitly close a file?


Chris Rebert

Apr 13, 2010, 6:24:34 PM
to gelonida, pytho...@python.org
On Tue, Apr 13, 2010 at 3:01 PM, gelonida <gelo...@gmail.com> wrote:
> Hi,
>
> I've been told that the following code snippet is not good.
>
> open("myfile","w").write(astring) , because I'm neither explicitly
> closing nor using the new 'with' syntax.
>
> What exactly is the impact of not closing the file explicitly
> (or implicitly with a 'with' block)?
>
> Even with my example
> I'd expected to get an exception raised if not all data could have
> been written.
>
> I'd also expected that all written data is flushed as soon as the
> filehandle is out of scope (meaning in the next line of my source
> code).

That extremely-quick responsiveness of the garbage-collection
machinery is only guaranteed by CPython, not the language
specification itself, and indeed some of the other implementations
*explicitly don't* make that guarantee (and hence the data may not get
flushed in a timely manner on those implementations). And portability
of code is encouraged, hence the admonishment you encountered.
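
For reference, the two portable forms being recommended can be sketched
roughly as follows. The temporary path and the value of `astring` are
placeholders chosen for illustration, not anything from the original code.

```python
import os
import tempfile

astring = "some text\n"                              # placeholder data
path = os.path.join(tempfile.mkdtemp(), "myfile")    # placeholder location

# Form 1: the 'with' statement closes (and therefore flushes) the file
# when the block ends, even if write() raises.
with open(path, "w") as f:
    f.write(astring)

# Form 2: an explicit close() in a finally clause, equivalent in effect.
f = open(path, "w")
try:
    f.write(astring)
finally:
    f.close()

# Read the file back to confirm the data reached it.
with open(path) as f:
    contents = f.read()
```

Neither form depends on when the garbage collector gets around to the file
object, so both should behave the same on any implementation.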

Cheers,
Chris
--
http://blog.rebertia.com

gelonida

Apr 13, 2010, 6:01:25 PM
Hi,


I've been told that the following code snippet is not good.


open("myfile","w").write(astring) , because I'm neither explicitly
closing nor using the new 'with' syntax.

What exactly is the impact of not closing the file explicitly
(or implicitly with a 'with' block)?


Even with my example
I'd expected to get an exception raised if not all data could have
been written.

I'd also expected that all written data is flushed as soon as the
filehandle is out of scope (meaning in the next line of my source
code).


Thanks for explaining to me exactly what kind of evil I could encounter
by not explicitly closing.

Steven D'Aprano

Apr 13, 2010, 8:34:23 PM
On Tue, 13 Apr 2010 15:01:25 -0700, gelonida wrote:

> Hi,
>
>
> I've been told that the following code snippet is not good.
>
>
> open("myfile","w").write(astring) , because I'm neither explicitly
> closing nor using the new 'with' syntax.
>
> What exactly is the impact of not closing the file explicitly
> (or implicitly with a 'with' block)?


Your data may not be actually written to disk until the file closes. If
you have a reference loop, and Python crashes badly enough, the garbage
collector may never run and the file will never be closed, hence you will
get data loss.

If you are running something other than CPython (e.g. IronPython or
Jython) then the file might not be closed until your program exits. If
you have a long-running program that opens many, many files, it is
possible for you to run out of system file handles and be unable to open
any more.
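
A minimal sketch of how a loop over many files can avoid accumulating open
handles, regardless of how the implementation collects garbage. The function
name `write_many`, the file naming scheme and the count are made up for
illustration.

```python
import os
import tempfile


def write_many(directory, count):
    """Write `count` small files, holding at most one handle open at a time."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, "part%04d.txt" % i)
        with open(path, "w") as f:      # closed at the end of each iteration
            f.write("chunk %d\n" % i)
        paths.append(path)
    return paths


paths = write_many(tempfile.mkdtemp(), 50)
```

Because each handle is released before the next file is opened, the number
of files processed is not limited by the number of handles the system allows.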

Best practice is to explicitly close the file when you are done with it,
but for short scripts, I generally don't bother. Laziness is a virtue :)

But for library code and larger applications, I always explicitly close
the file, because I want to control exactly when the file is closed
rather than leave it up to the interpreter. I don't know if my code might
one day end up in a long-running Jython application so I try to code
defensively and avoid even the possibility of a problem.


> Even with my example
> I'd expected to get an exception raised if not all data could have been
> written.

Generally if you get an exception while trying to *close* a file, you're
pretty much stuffed. What are you going to do? How do you recover?

My feeling is that you're probably safe with something as simple as

open("myfile", "w").write("my data\n")

but if you do something like

some_data_structure.filehandle = open("myfile", "w")
some_data_structure.filehandle.write("my data\n")
# ... lots more code here

and some_data_structure keeps the file open until the interpreter shuts
down, there *might* be rare circumstances where you won't be notified of
an exception, depending on the exact circumstances of timing of when the
file gets closed. In the worst case, the file might not be closed until
the interpreter is shutting down *and* has already dismantled the
exception infrastructure, and so you can't get an exception. I don't know
enough about the Python runtime (particularly about how it works during
shutdown) to know how real this danger is, but if it is a danger, I bet
it involves __del__ methods.
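
One way to sidestep that uncertainty is to give the object that owns the
file an explicit close() and context-manager support, so the file is closed
at a known point while exceptions can still be reported. This is only a
sketch: the `Recorder` class is a hypothetical stand-in for
some_data_structure, and the path is a placeholder.

```python
import os
import tempfile


class Recorder:
    """Hypothetical stand-in for 'some_data_structure'."""

    def __init__(self, path):
        self.filehandle = open(path, "w")

    def record(self, text):
        self.filehandle.write(text)

    def close(self):
        # Closing here flushes buffered data while exceptions can still
        # propagate normally to the caller.
        self.filehandle.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False    # do not swallow exceptions from the with-block


path = os.path.join(tempfile.mkdtemp(), "myfile")
with Recorder(path) as rec:
    rec.record("my data\n")
    # ... lots more code here
```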

> I'd also expected that all written data is flushed as soon as the
> filehandle is out of scope (meaning in the next line of my source code).

This is only guaranteed with CPython, not other implementations.

My feeling is that explicit closing is pedantic and careful, implicit
closing is lazy and easy. You make your choice and take your chance :)


--
Steven

Giampaolo Rodola'

Apr 13, 2010, 8:45:25 PM
What about open('foo', 'w').close().
Does it have the same problems?


--- Giampaolo
http://code.google.com/p/pyftpdlib
http://code.google.com/p/psutil

Chris Rebert

Apr 13, 2010, 9:19:59 PM
to Giampaolo Rodola', pytho...@python.org
On Tue, Apr 13, 2010 at 5:45 PM, Giampaolo Rodola' <gne...@gmail.com> wrote:
> What about open('foo', 'w').close().
> Does it have the same problems?

Well, no, but that's only because it's a pointless no-op that doesn't
really do anything besides possibly throwing an exception (e.g. if the
script didn't have write access to the current directory).

Dave Angel

Apr 13, 2010, 10:09:50 PM
to gelonida, pytho...@python.org
Evil? No. Just undefined behavior.

The language does NOT guarantee that a close or even a flush will occur
when an object "goes out of scope." This is the same in Python as it is
in Java. There's also no exception for data not being flushed.

In one particular implementation of Python, called CPython, there are
some things that tend to help. So if you're sure you're always going to
be using this particular implementation, and understand what the
restrictions are, then go ahead and be sloppy. Similarly, on some
operating systems, files are flushed when a process ends. So if you know your
application is only going to run on those environments, you might not
bother closing files at the end of execution.

It all depends on how restrictive your execution environment is going to be.

DaveA

Ryan Kelly

Apr 13, 2010, 11:10:25 PM
to Chris Rebert, pytho...@python.org, Giampaolo Rodola'
On Tue, 2010-04-13 at 18:19 -0700, Chris Rebert wrote:
> On Tue, Apr 13, 2010 at 5:45 PM, Giampaolo Rodola' <gne...@gmail.com> wrote:
> > What about open('foo', 'w').close().
> > Does it have the same problems?
>
> Well, no, but that's only because it's a pointless no-op that doesn't
> really do anything besides possibly throwing an exception (e.g. if the
> script didn't have write access to the current directory).

Actually, it will create the file if it doesn't exist, and truncate it
to zero length if it does.
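
A quick way to check this, using a throwaway path in a temporary directory
(the path and the sample contents are placeholders):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo")

existed_before = os.path.exists(path)
open(path, "w").close()                 # creates the file if it is missing
created = os.path.exists(path)

with open(path, "w") as f:
    f.write("some contents")
size_before = os.path.getsize(path)

open(path, "w").close()                 # truncates the existing file
size_after = os.path.getsize(path)
```

So the one-liner is not a no-op: opening in "w" mode has side effects on the
file system even if nothing is written.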


Ryan


--
Ryan Kelly
http://www.rfk.id.au | This message is digitally signed. Please visit
ry...@rfk.id.au | http://www.rfk.id.au/ramblings/gpg/ for details


Lawrence D'Oliveiro

Apr 17, 2010, 7:23:15 AM
In message
<d48b70da-5384-4dc6...@r1g2000yqb.googlegroups.com>, gelonida
wrote:

> I've been told, that following code snippet is not good.
>

> open("myfile","w").write(astring) ...

I do that for reads, but never for writes.

For writes, you want to give a chance for write errors to raise an exception
and alert the user, instead of failing silently, to avoid inadvertent data
loss. Hence the explicit close.
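
As a rough sketch of that idea: buffered data is flushed when close() is
called, so closing explicitly inside a try block gives write errors a place
to surface. The helper name `save` and its return convention are assumptions
made for this example.

```python
import os
import tempfile


def save(path, data):
    """Return None on success, or the OSError that prevented the write."""
    try:
        f = open(path, "w")
        try:
            f.write(data)
        finally:
            f.close()   # buffered data is flushed here; failures raise OSError
    except OSError as err:
        return err
    return None


workdir = tempfile.mkdtemp()
ok = save(os.path.join(workdir, "myfile"), "my data\n")
# Opening a file inside a directory that does not exist fails with OSError.
failed = save(os.path.join(workdir, "no_such_dir", "myfile"), "my data\n")
```

A real program would presumably report the error to the user rather than
return it, but the point is that the failure is observable at a known place.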

Lie Ryan

Apr 17, 2010, 8:33:18 AM

In short, in case of doubt, just be explicit.

Since in python nothing is guaranteed about implicit file close, you
must always explicitly close it.

Lawrence D'Oliveiro

Apr 21, 2010, 8:53:51 PM
In message <4bc9aadb$1...@dnews.tpgi.com.au>, Lie Ryan wrote:

> Since in python nothing is guaranteed about implicit file close ...

It is guaranteed that objects with a reference count of zero will be
disposed. In my experiments, this happens immediately.

Chris Rebert

Apr 21, 2010, 9:03:29 PM
to pytho...@python.org

Experiment with an implementation other than CPython and prepare to be
surprised.

Steven D'Aprano

Apr 22, 2010, 12:03:33 AM
On Thu, 22 Apr 2010 12:53:51 +1200, Lawrence D'Oliveiro wrote:

> In message <4bc9aadb$1...@dnews.tpgi.com.au>, Lie Ryan wrote:
>
>> Since in python nothing is guaranteed about implicit file close ...
>
> It is guaranteed that objects with a reference count of zero will be
> disposed.

Not all Python implementations have reference counts at all, e.g. Jython
and IronPython. Neither of those closes files immediately.


> In my experiments, this happens immediately.

Are your experiments done under PyPy, CLPython, or Pynie?

--
Steven

Alf P. Steinbach

Apr 22, 2010, 1:48:47 AM
* Lawrence D'Oliveiro:

> In message <4bc9aadb$1...@dnews.tpgi.com.au>, Lie Ryan wrote:
>
>> Since in python nothing is guaranteed about implicit file close ...
>
> It is guaranteed that objects with a reference count of zero will be
> disposed.

Only in current CPython.


> In my experiments, this happens immediately.

Depends what you mean, but even in current CPython destruction of a local can be
postponed indefinitely if a reference to the stack frame is kept somewhere.

And that happens, for example, when an exception is raised (until the handler
completes, but it doesn't necessarily complete for a Very Long Time).
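
A small sketch of that effect: while an exception object is kept around, its
traceback refers to the frame of the function that raised, and that frame
keeps the function's locals alive. The `Resource` class is a hypothetical
stand-in for a file object; a weak reference is used only to observe whether
the local still exists.

```python
import weakref


class Resource:
    """Hypothetical stand-in for an open file or similar object."""


def work(refs):
    res = Resource()                  # a local one might expect to die on exit
    refs.append(weakref.ref(res))     # observe it without keeping it alive
    raise ValueError("boom")


refs = []
saved = None
try:
    work(refs)
except ValueError as exc:
    saved = exc     # exception -> traceback -> frame of work() -> res

# As long as 'saved' is referenced, the local 'res' cannot be destroyed.
still_alive = refs[0]() is not None
```

If `res` were a file opened for writing, it would stay open (and possibly
unflushed) for as long as the exception object is held.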


Cheers & hth.,

- Alf

Adam Tauno Williams

Apr 22, 2010, 5:59:33 AM
to pytho...@python.org

A current implementation-specific detail. Always close files.
Otherwise, in the future, or on a different run-time, your code will
break.


Lawrence D'Oliveiro

Apr 23, 2010, 12:29:46 AM
In message <mailman.2119.1271898...@python.org>, Chris
Rebert wrote:

Any implementation that doesn’t do reference-counting is brain-damaged.

Steven D'Aprano

Apr 23, 2010, 12:49:14 AM
On Fri, 23 Apr 2010 16:29:46 +1200, Lawrence D'Oliveiro wrote:

> Any implementation that doesn’t do reference-counting is brain-damaged.

Funny, that's exactly what other people say about implementations that
*do* use reference counting.

--
Steven

Adam Tauno Williams

Apr 23, 2010, 6:14:23 AM
to pytho...@python.org

Why? There are much better ways to do memory management / garbage
collection; especially when dealing with large applications.

Alf P. Steinbach

Apr 23, 2010, 7:19:41 AM
* Adam Tauno Williams:

> On Fri, 2010-04-23 at 16:29 +1200, Lawrence D'Oliveiro wrote:
>> In message <mailman.2119.1271898...@python.org>, Chris
>> Rebert wrote:
>>> On Wed, Apr 21, 2010 at 5:53 PM, Lawrence D'Oliveiro wrote:
>>>> In message <4bc9aadb$1...@dnews.tpgi.com.au>, Lie Ryan wrote:
>>>>> Since in python nothing is guaranteed about implicit file close ...
>>>> It is guaranteed that objects with a reference count of zero will be
>>>> disposed.
>>>> In my experiments, this happens immediately.
>>> Experiment with an implementation other than CPython and prepare to be
>>> surprised.
>> Any implementation that doesn’t do reference-counting is brain-damaged.
>
> Why?

Depends on what the statement was meant to mean.

But for a literal context-free interpretation e.g. the 'sys.getrefcount'
function is not documented as CPython only and thus an implementation that
didn't do reference counting would not be a conforming Python implementation.

Whether it uses reference counting to destroy objects at earliest opportunity is
another matter.


> There are much better ways to do memory management / garbage
> collection; especially when dealing with large applications.

Depends on whether you're talking about Python implementations or as a matter of
general principle, and depends on how you define "better", "large" and so on.

On its own it's a pretty meaningless statement.

But although a small flame war erupted the last time I mentioned this, I think a
case can be made that Python is not designed for programming-in-the-large. And
that the current CPython scheme is eminently suitable for small scripts. But it
has its drawbacks, especially considering the various ways that stack frames can
be retained, and considering the documentation of 'gc.garbage', ...

"Objects that have __del__() methods and are part of a reference cycle cause
the entire reference cycle to be uncollectable, including objects not
necessarily in the cycle but reachable only from it."

... which means that a programming style assuming current CPython semantics and
employing RAII can be detrimental in a sufficiently large system.

Steven D'Aprano

Apr 23, 2010, 6:04:51 PM
On Fri, 23 Apr 2010 13:19:41 +0200, Alf P. Steinbach wrote:

> But for a literal context-free interpretation e.g. the 'sys.getrefcount'
> function is not documented as CPython only and thus an implementation
> that didn't do reference counting would not be a conforming Python
> implementation.

Since Jython and IronPython are conforming Python implementations, and
Guido has started making policy decisions specifically to support these
other implementations (e.g. the language feature moratorium, PEP 3003), I
think we can assume that this is a documentation bug.

However, a Python implementation that always returned 0 for
sys.getrefcount would technically satisfy the word of the documentation,
if not the spirit.
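
Code that wants to consult sys.getrefcount without assuming it exists could
guard the lookup, along these lines. The helper name `refcount_or_none` is
made up for this sketch, and the value returned is only meaningful on
implementations that actually count references.

```python
import platform
import sys


def refcount_or_none(obj):
    """Return sys.getrefcount(obj) where available, otherwise None."""
    getrefcount = getattr(sys, "getrefcount", None)
    if getrefcount is None:
        return None             # implementation does not provide the function
    return getrefcount(obj)


implementation = platform.python_implementation()   # e.g. "CPython"
count = refcount_or_none(object())
```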

--
Steven

Alf P. Steinbach

Apr 24, 2010, 3:30:26 AM
* Steven D'Aprano:

> On Fri, 23 Apr 2010 13:19:41 +0200, Alf P. Steinbach wrote:
>
>> But for a literal context-free interpretation e.g. the 'sys.getrefcount'
>> function is not documented as CPython only and thus an implementation
>> that didn't do reference counting would not be a conforming Python
>> implementation.
>
> Since Jython and IronPython are conforming Python implementations, and
> Guido has started making policy decisions specifically to support these
> other implementations (e.g. the language feature moratorium, PEP 3003), I
> think we can assume that this is a documentation bug.

The documentation for Jython specifies the same for 'sys.getrefcount'.

However, testing:

<output>
*sys-package-mgr*: processing new jar, 'C:\Program Files\jython2.5.1\jython.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\resources.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\rt.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\jsse.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\jce.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\charsets.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\ext\dnsns.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\ext\localedata.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\ext\sunjce_provider.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\ext\sunmscapi.jar'
*sys-package-mgr*: processing new jar, 'C:\Program Files\Java\jre6\lib\ext\sunpkcs11.jar'
A created
Traceback (most recent call last):
  File "c:\test\refcount.py", line 17, in <module>
    writeln( str( sys.getrefcount( a ) - 1 ) )
AttributeError: 'systemstate' object has no attribute 'getrefcount'
</output>


> However, a Python implementation that always returned 0 for
> sys.getrefcount would technically satisfy the word of the documentation,
> if not the spirit.

Yes.

OK, learned something new: I thought Jython actually implemented getrefcount.

The Jython docs say it does...


Cheers,

- Alf


Lawrence D'Oliveiro

Apr 27, 2010, 6:06:07 AM
In message <mailman.2162.1272018...@python.org>, Adam
Tauno Williams wrote:

> On Fri, 2010-04-23 at 16:29 +1200, Lawrence D'Oliveiro wrote:
>
>> Any implementation that doesn’t do reference-counting is brain-damaged.
>
> Why?

Because a) it uses extra memory needlessly, and b) waiting until an object
has dropped out of cache before touching it again just slows things down.

> There are much better ways to do memory management / garbage
> collection; especially when dealing with large applications.

Especially with large applications, the above considerations apply even more
so.

If you don’t agree, you might as well stick to Java.
