==== Initial Issue
Using Daphne as a server for a Django application seems to cause a
malfunction in streaming responses.
When you pass an iterator into a response, the contents of the response
are not retrieved chunk by chunk, but in a single block once the
iterator has finished iterating.
I have a minimal project that reproduces the behavior:
[https://github.com/florianvazelle/minimal-daphne-stream
florianvazelle/minimal-daphne-stream].
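For reference, a minimal sketch of the kind of view that triggers this
(hypothetical names, approximating the minimal project above): a plain sync
generator that blocks between chunks.
{{{#!python
import time

from django.http.response import StreamingHttpResponse


def blocking_content():
    # A sync iterator that does blocking work between chunks.
    for _ in range(5):
        time.sleep(1)
        yield b'a' * 10000


def test_stream_view(request):
    # Expected: one chunk per second. Observed under Daphne: all chunks
    # arrive together after ~5 seconds.
    return StreamingHttpResponse(blocking_content())
}}}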
==== Solution
After some research: in Daphne, Twisted is used with asyncioreactor. I
managed to reproduce the bug in a minimal way:
[https://gist.github.com/florianvazelle/c7d8c49fb005a72ed5962ee55a83b000
Minimal example of an "ASGI-like" streaming server with Twisted].
The streaming issue occurs when we call a blocking method in the iterator
(I/O, sleep, request, ...), which blocks the reactor. That's why the
result is not delivered progressively: the reactor does not regain
control of execution until the iteration has finished.
To correct this behavior we need to add an asynchronous layer and use
async / non-blocking alternatives.
==== Proposition
In my minimal project, the view will become:
{{{#!python
import asyncio
from django.http.response import StreamingHttpResponse
async def iterable_content():
    for _ in range(5):
        await asyncio.sleep(1)
        print('Returning chunk')
        yield b'a' * 10000


def test_stream_view(request):
    return StreamingHttpResponse(iterable_content())
}}}
But Django does not handle asynchronous generators.
Some work has already been done in
[https://code.djangoproject.com/ticket/32798 #32798 (StreamingHttpResponse
Raises SynchronousOnlyOperation in ASGI Server) – Django], but considering
the performance, we could manage it differently here.
I propose adding asynchronous responses, to be used in an async context,
i.e. with an ASGI server.
I made a PoC, in a very basic way:
[https://github.com/florianvazelle/django/commit/4a52f975fa39ae1f52e015a973f9d146fc1cb561
florianvazelle/django].
--
Ticket URL: <https://code.djangoproject.com/ticket/33735>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
==== Initial Issue
==== Solution
==== Proposition
{{{#!python
import asyncio
from django.http.response import StreamingHttpResponse
def test_stream_view(request):
    return StreamingHttpResponse(iterable_content())
}}}
[https://github.com/django/django/pull/15727 PoC Fixed #33735 -- Handle
async streaming response].
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:1>
* cc: Carlton Gibson (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:2>
Old description:
> [https://github.com/django/django/pull/15727 PoC Fixed #33735 -- Handle
> async streaming response].
New description:
This ticket follows the one I opened on
[https://github.com/django/daphne/issues/413 django/daphne#413].
==== Initial Issue
==== Solution
==== Proposition
{{{#!python
import asyncio
from django.http.response import StreamingHttpResponse
def test_stream_view(request):
    return StreamingHttpResponse(iterable_content())
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:3>
* type: Uncategorized => New feature
* stage: Unreviewed => Accepted
Comment:
Accepting for review.
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:4>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:5>
* owner: nobody => florianvazelle
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:6>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:7>
* cc: Andrew Godwin, Michael Brown (added)
Comment:
Hi Florian — Thanks again for this.
I **finally** got the time to sit with the PR and that for #32798 to try
and think through the right way forward.
I'm commenting here, rather than the PR, because I think we need a
slightly different approach.
Also cc-ing Andrew and Michael to ask for their input too.
So what follows is an RFC...
Looking at the [https://github.com/django/django/pull/15727 PR here], and
that for #32798 (either the [https://github.com/django/django/pull/14526
complex first suggestion], or the
[https://github.com/django/django/pull/14652 simpler but slower
alternative]) I think we should consciously opt out of trying to map
between sync and async generators. **Maybe** somebody writes a queue
implementation that is sync at one end and async at the other, and handles
thread-safety, and back-pressure, and who knows what. **Maybe** we could
use that in Django if we felt it trustworthy enough. But I don't think we
should try and write one ourselves.
Rather, I think we should say that, in the case of streaming responses,
users need to adjust their code for whether they're planning to run under
WSGI or ASGI. (We will handle the **other** case both ways, but at a
performance and memory cost.)
To wit:
We add `__aiter__` to `StreamingHttpResponse` and adapt `ASGIHandler` to
use `async for` in the `if response.streaming` block. Under ASGI you would
provide `StreamingHttpResponse` with an async iterator, still yielding
bytestrings as content. This should allow the async streaming-responses
use case.
Under WSGI you'd continue to provide a sync iterator and everything would
work as it is already.
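As a rough sketch (not the actual `ASGIHandler` code; `send` is the
standard ASGI send callable), the handler side could look something like
this:
{{{#!python
# Sketch only: stream each chunk of an async-iterable response as an ASGI
# http.response.body message, keeping the connection open until the final
# empty message.
async def send_streaming_response(response, send):
    async for chunk in response:
        await send(
            {"type": "http.response.body", "body": chunk, "more_body": True}
        )
    await send({"type": "http.response.body", "body": b"", "more_body": False})
}}}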
For each case, `__iter__` and `__aiter__`, we handle the **wrong case** by
consuming the iterator in a single step (using the `asgiref.sync`
utilities, with `thread_sensitive` for `sync_to_async`) and then yielding
it out in parts as expected. (This is the performance and memory cost.) We
log a **warning** (which may be filtered in the logging config if not
desired) saying that folks should use an appropriate generator type in
this case.
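For illustration, a rough sketch of that fallback (the helper name and
warning text here are mine, not necessarily what a final patch would use):
{{{#!python
# Sketch of the "wrong case" for __aiter__: a sync iterator being streamed
# under ASGI. asgiref's sync_to_async is real; the helper is hypothetical.
import warnings

from asgiref.sync import sync_to_async


def aiter_streaming_content(streaming_content):
    # Happy path: already an async iterator, hand it back unchanged.
    if hasattr(streaming_content, "__aiter__"):
        return streaming_content.__aiter__()

    # Wrong case: warn, consume the sync iterator in a single
    # sync_to_async() hop (the performance/memory cost), then yield the
    # buffered chunks out asynchronously.
    warnings.warn(
        "Streaming a synchronous iterator under ASGI buffers the whole "
        "response; use an asynchronous iterator instead."
    )

    async def generator():
        chunks = await sync_to_async(list, thread_sensitive=True)(
            streaming_content
        )
        for chunk in chunks:
            yield chunk

    return generator()
}}}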
Note that this is similar to the `__aiter__` implementation on QuerySet,
which
[https://github.com/django/django/blob/f30c7e381c94f41d361877d8a3e90f8cfb391709/django/db/models/query.py#L397-L405
fetches all the records once inside a single sync_to_async()] to avoid
multiple (costly) context switches there:
{{{
def __aiter__(self):
    # Remember, __aiter__ itself is synchronous, it's the thing it returns
    # that is async!
    async def generator():
        await sync_to_async(self._fetch_all)()
        for item in self._result_cache:
            yield item

    return generator()
}}}
I think this should be both good enough and maintainable. In the happy
case, where you provide the right kind of iterator for your chosen
protocol (ASGI/WSGI), you get the result you want, without us needing anything
overly complex. In the deviant case it'll work (assuming you're not trying
to stream ''Too much™''). It means we're not branching in the handlers to
handle each of the different possible iterator types. The abstraction
leaks a little bit, but I kind of think that's reasonable at this point.
Maybe: If it proves impossible to get the WSGI+async/ASGI+sync fallback to
work correctly, just erroring at that point would maybe be acceptable. 🤔
(That you could provide an async iterator to ASGIHandler would allow long-
lived responses, which aren't currently feasible.)
Thoughts? Thanks!
**Assuming** all that makes sense, would you (or Michael maybe) want to
take this on?
If not I can pick it up, since it would be good to hit Django 4.2 (feature
freeze Jan '23) for this. 🎯
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:8>
Comment (by Andrew Godwin):
Hmm, I think that makes sense, as long as we put appropriately large
warnings in the docs around it. It still works, it's simpler, and if
someone feels brave enough to try and map between iterator types in an
efficient fashion, they can come around and try that work separately later
on.
I can probably have a crack at doing it too, if nobody else wants to rush
in and pick it up :)
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:9>
* cc: Jon Janzen (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:10>
* owner: florianvazelle => (none)
* needs_better_patch: 1 => 0
* has_patch: 1 => 0
* status: assigned => new
Comment:
Hey Florian. Thinking about the 4.2 feature freeze after the new year, I'm
going to deassign you so others can pick up if they're interested. (Of
course that other may be you, if you still have bandwidth :) Thanks.
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:11>
Comment (by Carlton Gibson):
#32798 was the related duplicate.
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:12>
* cc: Carles Pina Estany (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:13>
Comment (by Carles Pina Estany):
Replying to [comment:8 Carlton Gibson]:
> **Assuming** all that makes sense, would you (or Michael maybe) want to
take this on?
> If not I can pick it up, since it would be good to hit Django 4.2
(feature freeze Jan '23) for this. 🎯
We encountered exactly this problem in an application. I don't think that
I will have enough bandwidth to prepare a PR (if I did, well, you would
see it :-) ) but I will certainly test and comment on any code changes
around here (I've spent quite some time looking at this code over the last
two days and have an easy-to-reproduce test, a real case, with Daphne and
without, etc.).
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:14>
* owner: (none) => Carlton Gibson
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:15>
* has_patch: 0 => 1
Comment:
I've opened [https://github.com/django/django/pull/16384 an initial PR
here], outlining the approach from comment:8.
For the basic example, streaming an async generator, it works well, with
both Daphne and Uvicorn.
Still need to think about `streaming_content` (for wrapping by middleware)
and `aclosing` usage, but I wanted to open sooner to enable feedback.
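For context, a generic sketch of `aclosing` usage (not Django's handler
code; `send_chunk` is a hypothetical callback): `contextlib.aclosing`
(Python 3.10+) guarantees the async generator's cleanup runs even if we
stop consuming it early, e.g. on client disconnect.
{{{#!python
from contextlib import aclosing


async def consume(aiterator, send_chunk):
    # aclosing() calls aiterator.aclose() on exit, so finally blocks inside
    # the async generator run even if iteration stops early.
    async with aclosing(aiterator) as stream:
        async for chunk in stream:
            await send_chunk(chunk)
}}}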
(`python -X dev ...` FTW :)
All eyes welcome. 🙏
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:16>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:17>
* needs_better_patch: 1 => 0
Comment:
PR should now be ready for review.
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:18>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:19>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"0bd2c0c9015b53c41394a1c0989afbfd94dc2830" 0bd2c0c]:
{{{
#!CommitTicketReference repository=""
revision="0bd2c0c9015b53c41394a1c0989afbfd94dc2830"
Fixed #33735 -- Added async support to StreamingHttpResponse.
Thanks to Florian Vazelle for initial exploratory work, and to Nick
Pope and Mariusz Felisiak for review.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:20>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"52b054824e899db40ba48f908a9a00dadc56cb89" 52b05482]:
{{{
#!CommitTicketReference repository=""
revision="52b054824e899db40ba48f908a9a00dadc56cb89"
Fixed #34342, Refs #33735 -- Fixed test client handling of async streaming
responses.
Bug in 0bd2c0c9015b53c41394a1c0989afbfd94dc2830.
Co-authored-by: Carlton Gibson <carlton...@noumenal.es>
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:21>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"610cd06c3ffe9ef49554f7f7b1f9ff4aa9c2b879" 610cd06c]:
{{{
#!CommitTicketReference repository=""
revision="610cd06c3ffe9ef49554f7f7b1f9ff4aa9c2b879"
[4.2.x] Fixed #34342, Refs #33735 -- Fixed test client handling of async
streaming responses.
Bug in 0bd2c0c9015b53c41394a1c0989afbfd94dc2830.
Co-authored-by: Carlton Gibson <carlton...@noumenal.es>
Backport of 52b054824e899db40ba48f908a9a00dadc56cb89 from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33735#comment:22>