This implementation has largely been untouched for 12 years and, inspired
by a simpler implementation in `werkzeug`, I was able to achieve the
following performance improvement:
{{{
LimitedStream.read() (single line): Mean +- std dev: [bench_limitedstream-
main] 286 ns +- 6 ns -> [bench_limitedstream-patch] 227 ns +- 6 ns: 1.26x
faster
LimitedStream.readline() (single line): Mean +- std dev:
[bench_limitedstream-main] 507 ns +- 11 ns -> [bench_limitedstream-patch]
232 ns +- 8 ns: 2.18x faster
LimitedStream.read(8192) (single line): Mean +- std dev:
[bench_limitedstream-main] 360 ns +- 8 ns -> [bench_limitedstream-patch]
297 ns +- 6 ns: 1.21x faster
LimitedStream.readline(8192) (single line): Mean +- std dev:
[bench_limitedstream-main] 602 ns +- 10 ns -> [bench_limitedstream-patch]
305 ns +- 10 ns: 1.98x faster
LimitedStream.read() (multiple lines): Mean +- std dev:
[bench_limitedstream-main] 290 ns +- 5 ns -> [bench_limitedstream-patch]
236 ns +- 6 ns: 1.23x faster
LimitedStream.readline() (multiple lines): Mean +- std dev:
[bench_limitedstream-main] 517 ns +- 19 ns -> [bench_limitedstream-patch]
239 ns +- 7 ns: 2.16x faster
LimitedStream.read(8192) (multiple lines): Mean +- std dev:
[bench_limitedstream-main] 363 ns +- 8 ns -> [bench_limitedstream-patch]
311 ns +- 11 ns: 1.17x faster
LimitedStream.readline(8192) (multiple lines): Mean +- std dev:
[bench_limitedstream-main] 601 ns +- 12 ns -> [bench_limitedstream-patch]
308 ns +- 7 ns: 1.95x faster
Geometric mean: 1.59x faster
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33865>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* has_patch: 0 => 1
Comment:
[https://github.com/django/django/pull/15872 PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:1>
* Attachment "bench_limitedstream.py" added.
--
Ticket URL: <https://code.djangoproject.com/ticket/33865>
* Attachment "bench_limitedstream-a46dfa87d0.json" added.
* Attachment "bench_limitedstream-patch.json" added.
Comment (by Nick Pope):
Attached benchmark script and output.
(Although output attached is from a different run/machine to that in the
description, it's in the same ballpark. Very much an improvement.)
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:2>
* stage: Unreviewed => Accepted
Comment:
Tentatively accepted for future investigation.
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:3>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:4>
* needs_better_patch: 1 => 0
Comment:
Issues should now be resolved.
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:5>
* cc: Ivan Sagalaev, Florian Apolloner (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:6>
* needs_better_patch: 0 => 1
Comment:
Per Florian's and Nick's comments.
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:7>
* cc: Anvesh Mishra (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:8>
* needs_better_patch: 1 => 0
Comment:
Marking ready for review again. See my comments on the ticket. I don't
''think'' we need to do anything further, and if we do it should be a
separate ticket.
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:9>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:10>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"b47f2f5b907732d80b164f1f361ae39da94a3fa6" b47f2f5]:
{{{
#!CommitTicketReference repository=""
revision="b47f2f5b907732d80b164f1f361ae39da94a3fa6"
Fixed #33865 -- Optimized LimitedStream wrapper.
The current implementation of LimitedStream is slow because .read()
performs an extra copy into a buffer and .readline() performs two
extra copies. The stream being wrapped is already typically a BytesIO
object so this is unnecessary.
This implementation has largely been untouched for 12 years and,
inspired by a simpler implementation in werkzeug, it was possible to
achieve the following performance improvement:
LimitedStream.read() (single line):
Mean +- std dev: [bench_limitedstream-main] 286 ns +- 6 ns
-> [bench_limitedstream-patch] 227 ns +- 6 ns: 1.26x faster
LimitedStream.readline() (single line):
Mean +- std dev: [bench_limitedstream-main] 507 ns +- 11 ns
-> [bench_limitedstream-patch] 232 ns +- 8 ns: 2.18x faster
LimitedStream.read(8192) (single line):
Mean +- std dev: [bench_limitedstream-main] 360 ns +- 8 ns
-> [bench_limitedstream-patch] 297 ns +- 6 ns: 1.21x faster
LimitedStream.readline(8192) (single line):
Mean +- std dev: [bench_limitedstream-main] 602 ns +- 10 ns
-> [bench_limitedstream-patch] 305 ns +- 10 ns: 1.98x faster
LimitedStream.read() (multiple lines):
Mean +- std dev: [bench_limitedstream-main] 290 ns +- 5 ns
-> [bench_limitedstream-patch] 236 ns +- 6 ns: 1.23x faster
LimitedStream.readline() (multiple lines):
Mean +- std dev: [bench_limitedstream-main] 517 ns +- 19 ns
-> [bench_limitedstream-patch] 239 ns +- 7 ns: 2.16x faster
LimitedStream.read(8192) (multiple lines):
Mean +- std dev: [bench_limitedstream-main] 363 ns +- 8 ns
-> [bench_limitedstream-patch] 311 ns +- 11 ns: 1.17x faster
LimitedStream.readline(8192) (multiple lines):
Mean +- std dev: [bench_limitedstream-main] 601 ns +- 12 ns
-> [bench_limitedstream-patch] 308 ns +- 7 ns: 1.95x faster
Geometric mean: 1.59x faster
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:14>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"95182a8593f87046999fc50c4f4161843d68273c" 95182a85]:
{{{
#!CommitTicketReference repository=""
revision="95182a8593f87046999fc50c4f4161843d68273c"
Refs #33865 -- Corrected signature of ExplodingBytesIO.read().
These subclasses of io.BytesIO should inherit the correct signature.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:12>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"57f5669d23fe17d940887e1e3ce694c7366a6330" 57f5669d]:
{{{
#!CommitTicketReference repository=""
revision="57f5669d23fe17d940887e1e3ce694c7366a6330"
Refs #33865 -- Improved implementation of FakePayload.
FakePayload is a wrapper around io.BytesIO and is expected to
masquerade as though it is a file-like object. For that reason it makes
sense that it should inherit the correct signatures from io.BytesIO
methods.
Crucially an implementation of .readline() is added which will be
necessary for this to behave more like the expected file-like objects as
LimitedStream will be changed to defer to the wrapped stream object
rather than rolling its own implementation for improved performance.
It should be safe to adjust these signatures because FakePayload is
only used internally within test client helpers, is undocumented, and
thus private.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:13>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"7a1543d9f6dd3cac32f0579d8d3b9128c427f531" 7a1543d]:
{{{
#!CommitTicketReference repository=""
revision="7a1543d9f6dd3cac32f0579d8d3b9128c427f531"
Refs #33865 -- Made RequestsTests.test_set_encoding_clears_GET use
FakePayload.
The input stream, wsgi.input, must be a file-like object. The existing
implementation of LimitedStream was lax and allowed an empty string to
be passed incorrectly.
See https://wsgi.readthedocs.io/en/latest/definitions.html#envvar-
wsgi.input
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33865#comment:11>