should we close in-memory file-like objects (StringIO, BytesIO, etc.)?

4,647 views
Skip to first unread message

Tim Graham

unread,
Jul 3, 2015, 9:59:11 AM7/3/15
to django-d...@googlegroups.com
Andriy proposed a patch to close objects like StringIO and BytesIO in our test suite [1]. I am not sure how much benefit this gives (frees "a few bytes of RAM" according to [2]) and it seems to add a lot of verbosity. Looking at Python's own test suite, it doesn't appear these objects are closed there (I spot checked BytesIO and StringIO). Any thoughts on this?

[1] https://github.com/django/django/pull/4938
[2] https://github.com/django/django/pull/4248

Andriy Sokolovskiy

unread,
Jul 3, 2015, 10:09:49 AM7/3/15
to django-d...@googlegroups.com
My vision of that was to have clean code (even if these are few bytes).
Python allows to have memory leaks like this, without throwing warnings,
but if we know that we can close it, I think we should do that.

We're usually writing new tests by copying logic from sibling test (if
new test need to do similar thing).
If that test have unclosed stream or file, new test likely also will not
close streams and files, so this old test is a "bad example",
which causing more unclean code.

So, it question about code style than about bytes savings.
> --
> You received this message because you are subscribed to the Google
> Groups "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to django-develop...@googlegroups.com
> <mailto:django-develop...@googlegroups.com>.
> To post to this group, send email to django-d...@googlegroups.com
> <mailto:django-d...@googlegroups.com>.
> Visit this group at http://groups.google.com/group/django-developers.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/7579ee1c-ecc6-48d1-9002-733810e64754%40googlegroups.com?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout.

Łukasz Rekucki

unread,
Jul 3, 2015, 11:01:23 AM7/3/15
to django-developers
Hello ,

If you do mean CPython test suite then it's probably not the best
place to look at because the reference counting *is there* and those
object will be closed at the end of test method. Django doesn't have
this comfort when run with PyPy.

That said, I wouldn't recommend using contextlib.closing() all over
the place. StringIO seems to implement the context manager protocol
just fine[1]. And in Python 3.4+ so does urlopen().

[1]: http://bugs.python.org/issue4039
> To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
> To post to this group, send email to django-d...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/559697A3.2010002%40asokolovskiy.com.
> For more options, visit https://groups.google.com/d/optout.



--
Łukasz Rekucki

Berker Peksağ

unread,
Jul 3, 2015, 11:11:23 AM7/3/15
to django-d...@googlegroups.com
On Fri, Jul 3, 2015 at 4:59 PM, Tim Graham <timog...@gmail.com> wrote:
> Andriy proposed a patch to close objects like StringIO and BytesIO in our
> test suite [1]. I am not sure how much benefit this gives (frees "a few
> bytes of RAM" according to [2]) and it seems to add a lot of verbosity.
> Looking at Python's own test suite, it doesn't appear these objects are
> closed there (I spot checked BytesIO and StringIO). Any thoughts on this?

I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

--Berker

Aymeric Augustin

unread,
Jul 3, 2015, 1:55:55 PM7/3/15
to django-d...@googlegroups.com
2015-07-03 17:10 GMT+02:00 Berker Peksağ <berker...@gmail.com>:
I agree with you on the StringIO and BytesIO cases, but I agree with
Andriy on the other usages (e.g. ZipFile, GzipFile). Many contributors
follow the existing practices to write new tests, so it would be nice
to use best practices if they don't produce too much noise.

StringIO and BytesIO are different from files and sockets. The former only use
RAM while the latter use file descriptors, which are in much scarcer supply
(perhaps 1024 per process).

Leaking file descriptors can cause the process to crash because it exceed the
systems limit, so it's really bad.

Leaking StringIO or BytesIO doesn't change anything. They'll be closed when
the garbage collector releases them.

That's why I don't believe the analogy holds.

--
Aymeric.

Ned Batchelder

unread,
Jul 4, 2015, 8:52:52 AM7/4/15
to django-d...@googlegroups.com
Many of the StringIO and BytesIO changes in the pull requests won't change anything.  The with-statement extends to the end of the test method, which is when the object would be reclaimed anyway:
@@ -31,10 +32,10 @@ def write(self, data):
             self.bytes_sent += len(data)
 
         # XXX check Content-Length and truncate if too many bytes written?
-        data = BytesIO(data)
-        for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
-            self._write(chunk)
-            self._flush()
+        with contextlib.closing(BytesIO(data)) as data:
+            for chunk in iter(lambda: data.read(MAX_SOCKET_CHUNK_SIZE), b''):
+                self._write(chunk)
+                self._flush()
 
     def error_output(self, environ, start_response):
         super(ServerHandler, self).error_output(environ, start_response)

I don't think it makes sense to blindly use contextlib.closing on anything that acts like a file, even when we know it is a harmless StringIO.  Better to have developers think about what is happening, and use the appropriate construct.

When I see " contextlib.closing(BytesIO(data))", I think, this person is following a pattern, and doesn't understand.

--Ned.

--
Aymeric.
--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.

Tim Graham

unread,
Jul 6, 2015, 11:53:27 AM7/6/15
to django-d...@googlegroups.com
Thanks for the info and feedback. We'll omit these changes for StringIO and Bytes IO then.
Reply all
Reply to author
Forward
0 new messages