request for API review of streaming responses additions

291 views
Skip to first unread message

Tim Graham

unread,
Aug 18, 2015, 6:18:14 PM8/18/15
to Django developers (Contributions to Django itself)
I'd like to ask for a high-level API review of some proposed streaming API
additions (I have already given the patch a couple of detailed reviews, but
other eyes would be welcome on the details as well).

Summary:

* django.views.generic.base.StreamingTemplateView to stream a template
  rather than render it.

* A Template.stream() method, a django.template.loader.stream() function,
  and django.shortcuts.stream_to_response() and
  django.shortcuts.stream() shortcuts.

* django.template.response.StreamingTemplateResponse and
  django.template.response.SimpleStreamingTemplateResponse classes to
  handle streaming of templates.

Pull request:
https://github.com/django/django/pull/4783

Thanks!

Aymeric Augustin

unread,
Aug 23, 2015, 1:18:20 PM8/23/15
to django-d...@googlegroups.com
Hello,

I have three main comments on this patch.

1) About the general API design

This patch adds 7 new “stream” public APIs that duplicate the current “render”
APIs. Adding a stream=False argument to the current APIs would be a more
lightweight alternative. Passing stream=True would enable the streaming
behavior. Was this alternative considered and if so, why was it rejected?

I think it would make the patch less intrusive and keep the documentation more
readable. Specifically I'm concerned about making several parts of the
documentation twice longer for a relatively uncommon need.

It's unclear to me why TemplateView gets a StreamingTemplateView sibling while
other generic class base views that inherit TemplateResponseMixin don't. All
GCBVs except View and RedirectView could get a Streaming version. Even if the
initial patch doesn't do it, someone will do it eventually, which amounts to
13 additional classes.

I'm -1 on the concept of duplicating every Django API that deals with HTTP
responses. That design doesn't strike a good balance between simplicity for
newcomers and power for advanced users.

2) About third-party template engines

How should third-party template engines that don't support streaming rendering
behave? Neither the code nor the release notes provide any guidance on this
issue. I suggest to add a stream() method that raises NotImplementedError to
the base class for template backends.

The current patch implements a stream() method that doesn't actually stream in
the DummyBackend and a boilerplate render() method that concatenates a list of
one element. This is bad, all the more since the dummy backend is intended to
be a template for third-party engines.

All template backends have the same copy-pasted render() method. Perhaps it's
a symptom of a problem with the API. Perhaps it should just be pulled to the
base class.

3) About performance

I think benchmarks are required:

a. to ensure that this change doesn't degrade performance for the traditional
  rendering mode

Yann Malet's recent bug report shows that even modest performance regressions
in template rendering can be a real problem for users.

b. to assess the performance of streaming rendering which could be
  unexpectedly slow on a realistic network (try over 3G tethering)

I'm bringing this up because streaming rendering will often yield many small
chunks. If each of these ends up in its own TCP packet -- which is going to
happen unless the application server provides additional buffering -- I expect
poor performance. If that's confirmed, it should be mentioned in the docs.
Streaming rendering will still be useful for rendering gigantic amounts of
data. But it isn't the performance optimization it may look like.

-- 
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/772cae79-1941-4a6a-94a8-b82d5e767aa1%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Tom Christie

unread,
Aug 24, 2015, 9:25:11 AM8/24/15
to Django developers (Contributions to Django itself)
Potential there to treat these as separately reviewable pull requests.

For example - needing streaming template generation doesn't *necessarily* imply needing streaming responses. The node-by-node rendering might mean imply that less memory is used during the template rendering, even if the complete response content all ends up in memory. (I haven't got a handle from the pull request on if that'd be the case or not, and it's possible that I've misrepresented it, and there's no benefit other than being able to stream the output bytes)

More generally: As presented there's lots of technical change, with very little simple end-user presentable "why and when this is a benefit?".

* Template.stream()

What's the data to back this up?
Does this result in lower memory usage during the template generation, or is there no difference aside from allowing the output to be streamed? If there is an internal benefit then how large do those templates need to be before that's significant?

* StreamingTemplateResponse

Again, where's the data to back this up? We're introducing a decision point for our users without giving them any basis on which to make that decision. At what point would I likely want to use this, and what real-world cases for our users are driving this? (For really large CSV responses are templates a good idea in any case?)

I also don't really understand from the documentation how the behavior for SimpleStreamingTemplateResponse and StreamingTemplateResponse differs - "does not handle any rendering attributes/methods (including callbacks)" - I understand what the callbacks refer to there, but what does the rest of that mean?

* django.views.generic.base.StreamingTemplateView

Unclear to me that the decision point this view introduces to users is worth the benefit it provides.
Writing a view that streams a simple template is just about as simple a view as is possible to write in Django - would it be better if we simply took an informed decision on which of TemplateView / StreamingTemplateView is likely to be a better default on behalf of our users?

Also lukewarm on introducing a new GCBV that actually just requires a one-line addition to an existing GCBV. Should we prefer to instead document the usage of `response_class` more fully, rather than extending the API surface area?

Apologies for jumping in with the criticisms - intended positively! Understand the intent behind this, but a bit wary of changes that come across as purely technical with very little in the way of concrete demonstration of benefit. Think it'd be good to see it approached from a perspective of "what are we actually recommending our users do here?"

Cheers & thanks to everyone for their hard work!

Yann Fouillat

unread,
Sep 7, 2015, 9:43:03 AM9/7/15
to Django developers (Contributions to Django itself)
Hello and thanks for the review,


On Monday, 24 August 2015 15:25:11 UTC+2, Tom Christie wrote:
Potential there to treat these as separately reviewable pull requests.

For example - needing streaming template generation doesn't *necessarily* imply needing streaming responses. The node-by-node rendering might mean imply that less memory is used during the template rendering, even if the complete response content all ends up in memory. (I haven't got a handle from the pull request on if that'd be the case or not, and it's possible that I've misrepresented it, and there's no benefit other than being able to stream the output bytes)

More generally: As presented there's lots of technical change, with very little simple end-user presentable "why and when this is a benefit?".

* Template.stream()

What's the data to back this up?
Does this result in lower memory usage during the template generation, or is there no difference aside from allowing the output to be streamed? If there is an internal benefit then how large do those templates need to be before that's significant?


As said in the other review, benchmarking is necessary to determine that.
 
* StreamingTemplateResponse

Again, where's the data to back this up? We're introducing a decision point for our users without giving them any basis on which to make that decision. At what point would I likely want to use this, and what real-world cases for our users are driving this? (For really large CSV responses are templates a good idea in any case?)

I also don't really understand from the documentation how the behavior for SimpleStreamingTemplateResponse and StreamingTemplateResponse differs - "does not handle any rendering attributes/methods (including callbacks)" - I understand what the callbacks refer to there, but what does the rest of that mean?


The methods defined by SimpleTemplateResponse (https://github.com/Gagaro/django/blob/ticket_13910/django/template/response.py#L112) are not available to SimpleStreamingTemplateResponse. 
 
* django.views.generic.base.StreamingTemplateView

Unclear to me that the decision point this view introduces to users is worth the benefit it provides.
Writing a view that streams a simple template is just about as simple a view as is possible to write in Django - would it be better if we simply took an informed decision on which of TemplateView / StreamingTemplateView is likely to be a better default on behalf of our users?

Also lukewarm on introducing a new GCBV that actually just requires a one-line addition to an existing GCBV. Should we prefer to instead document the usage of `response_class` more fully, rather than extending the API surface area?


As I said in the other answer, I did port this blindly from the old PR. It could be better off in the documentation instead.
 
Apologies for jumping in with the criticisms - intended positively! Understand the intent behind this, but a bit wary of changes that come across as purely technical with very little in the way of concrete demonstration of benefit. Think it'd be good to see it approached from a perspective of "what are we actually recommending our users do here?"


Thanks for the criticisms, it helps going forward :).

Yann Fouillat

unread,
Sep 7, 2015, 9:43:07 AM9/7/15
to Django developers (Contributions to Django itself)
Hello and thanks for the review,

First I will say that most of this pull request is a port of https://github.com/django/django/pull/1037 which I ported more or less blindly.

On Sunday, 23 August 2015 19:18:20 UTC+2, Aymeric Augustin wrote:

1) About the general API design

This patch adds 7 new “stream” public APIs that duplicate the current “render”
APIs. Adding a stream=False argument to the current APIs would be a more
lightweight alternative. Passing stream=True would enable the streaming
behavior. Was this alternative considered and if so, why was it rejected?


I'm not sure it was considered, but I don't think it would really be less intrusive. The documentation would be better, but the code would be less readable with a lot more conditions in the rendering logic (as opposed to keeping the same logic in stream and joining the streams in render).
 
I think it would make the patch less intrusive and keep the documentation more
readable. Specifically I'm concerned about making several parts of the
documentation twice longer for a relatively uncommon need.

It's unclear to me why TemplateView gets a StreamingTemplateView sibling while
other generic class base views that inherit TemplateResponseMixin don't. All
GCBVs except View and RedirectView could get a Streaming version. Even if the
initial patch doesn't do it, someone will do it eventually, which amounts to
13 additional classes.


As I said, it was done on the original PR so I did it too. Maybe putting it as an example in the doc would be enough, so users know how to use streaming templates in GCBVs.
 
I'm -1 on the concept of duplicating every Django API that deals with HTTP
responses. That design doesn't strike a good balance between simplicity for
newcomers and power for advanced users.

2) About third-party template engines

How should third-party template engines that don't support streaming rendering
behave? Neither the code nor the release notes provide any guidance on this
issue. I suggest to add a stream() method that raises NotImplementedError to
the base class for template backends.


The Template class in the backends doesn't inherits from another class though, unless I'm missing something ?
 
The current patch implements a stream() method that doesn't actually stream in
the DummyBackend and a boilerplate render() method that concatenates a list of
one element. This is bad, all the more since the dummy backend is intended to
be a template for third-party engines.


Should the NotImplemented be in the DummyBackend stream method then ?
 
All template backends have the same copy-pasted render() method. Perhaps it's
a symptom of a problem with the API. Perhaps it should just be pulled to the
base class.


So we should have a base class for backends' Template class ?
 
3) About performance

I think benchmarks are required:

a. to ensure that this change doesn't degrade performance for the traditional
  rendering mode

Yann Malet's recent bug report shows that even modest performance regressions
in template rendering can be a real problem for users.
 
b. to assess the performance of streaming rendering which could be
  unexpectedly slow on a realistic network (try over 3G tethering)

I'm bringing this up because streaming rendering will often yield many small
chunks. If each of these ends up in its own TCP packet -- which is going to
happen unless the application server provides additional buffering -- I expect
poor performance. If that's confirmed, it should be mentioned in the docs.
Streaming rendering will still be useful for rendering gigantic amounts of
data. But it isn't the performance optimization it may look like.


I agree, do you know what tools could I use to emulate 3G ?

Aymeric Augustin

unread,
Sep 7, 2015, 10:05:35 AM9/7/15
to django-d...@googlegroups.com
Hello,

2015-09-07 10:00 GMT+02:00 Yann Fouillat <gaga...@gmail.com>:
First I will say that most of this pull request is a port of https://github.com/django/django/pull/1037 which I ported more or less blindly.

As far as I can tell, this previous PR was never discussed on django-developers. Someone threw the code on GitHub, that's all; you can't make any assumptions about its suitability for inclusion in Django. Usually we discuss how a feature should be implement, look for consensus, and then submit the code ;-)
 

1) About the general API design

This patch adds 7 new “stream” public APIs that duplicate the current “render”
APIs. Adding a stream=False argument to the current APIs would be a more
lightweight alternative. Passing stream=True would enable the streaming
behavior. Was this alternative considered and if so, why was it rejected?

I'm not sure it was considered, but I don't think it would really be less intrusive. The documentation would be better, but the code would be less readable with a lot more conditions in the rendering logic (as opposed to keeping the same logic in stream and joining the streams in render).

The point of Django is to handle the painful stuff in the framework to make our users' lives easier. If we need to write slightly more complicated code to improve the API, that's just fine.

Also I doubt there will be many more conditions. The higher levels will just pass the stream keyword argument to the lower levels and eventually to the template engine. The only conditional should be be `response_class = StreamingHttpResponse if stream else HttpResponse`.

I'm aware of the irony of me making this argument. I set a bad precedent when I added StreamingHttpResponse. I didn't think of HttpResponse(stream=True) at the time. The prospect of having 25 StreamingFooBar classes in Django makes me realize this wasn't a great idea. Let's learn from this mistake. (The stakes are a bit low to consider fixing it through a deprecation path, though.)
  

I think it would make the patch less intrusive and keep the documentation more
readable. Specifically I'm concerned about making several parts of the
documentation twice longer for a relatively uncommon need.

It's unclear to me why TemplateView gets a StreamingTemplateView sibling while
other generic class base views that inherit TemplateResponseMixin don't. All
GCBVs except View and RedirectView could get a Streaming version. Even if the
initial patch doesn't do it, someone will do it eventually, which amounts to
13 additional classes.

As I said, it was done on the original PR so I did it too. Maybe putting it as an example in the doc would be enough, so users know how to use streaming templates in GCBVs.

I'm proposing to add a `stream = False` attribute to TemplateResponseMixin.


2) About third-party template engines

How should third-party template engines that don't support streaming rendering
behave? Neither the code nor the release notes provide any guidance on this
issue. I suggest to add a stream() method that raises NotImplementedError to
the base class for template backends.


The Template class in the backends doesn't inherits from another class though, unless I'm missing something ?


 
The current patch implements a stream() method that doesn't actually stream in
the DummyBackend and a boilerplate render() method that concatenates a list of
one element. This is bad, all the more since the dummy backend is intended to
be a template for third-party engines.


Should the NotImplemented be in the DummyBackend stream method then ?

Yes, in order to keep https://github.com/django/django/blob/adff499e/django/template/backends/dummy.py#L17 a simple example of how to implement a Django template backend.

 
All template backends have the same copy-pasted render() method. Perhaps it's
a symptom of a problem with the API. Perhaps it should just be pulled to the
base class.


So we should have a base class for backends' Template class ?

You're mixing template backends (instantiated once per template engine configured in settings.TEMPLATES) and templates (instanciated once per call to render() or equivalent).

  
3) About performance

I think benchmarks are required:

a. to ensure that this change doesn't degrade performance for the traditional
  rendering mode

Yann Malet's recent bug report shows that even modest performance regressions
in template rendering can be a real problem for users.
 
b. to assess the performance of streaming rendering which could be
  unexpectedly slow on a realistic network (try over 3G tethering)

I agree, do you know what tools could I use to emulate 3G ?

Tom Evans

unread,
Sep 7, 2015, 11:43:48 AM9/7/15
to django-d...@googlegroups.com
As well as these tools, there is similar functionality built in to the
chrom(e|ium) browser.

Inspect the page, toggle "Device mode" (click the phone icon next to
"Elements" tab), and options to throttle the network (with various
presets) will appear on the page.

I don't know a way to throttle network in chrome without also toggling
device emulation however, so the page will look different (within a
viewport).

Cheers

Tom

Matthew Somerville

unread,
Sep 14, 2015, 10:13:39 AM9/14/15
to Django developers (Contributions to Django itself)
On Monday, 7 September 2015 15:05:35 UTC+1, Aymeric Augustin wrote:
2015-09-07 10:00 GMT+02:00 Yann Fouillat <gaga...@gmail.com>:
First I will say that most of this pull request is a port of https://github.com/django/django/pull/1037 which I ported more or less blindly.

As far as I can tell, this previous PR was never discussed on django-developers. Someone threw the code on GitHub, that's all; you can't make any assumptions about its suitability for inclusion in Django.

The original ticket is at https://code.djangoproject.com/ticket/13910 where it was Accepted 5 years ago, that original PR done two years ago, and then rebased/worked on more this summer. That ticket also suggested the creation of "StreamingTemplateResponse" ;-) There was a brief discussion at https://groups.google.com/forum/#!searchin/django-developers/13910/django-developers/bpu5EHjnjDs/locX-gWm8PoJ and brief side mention at https://groups.google.com/forum/#!searchin/django-developers/13910/django-developers/OubwBVo_gxw/Avh1pdbpPMYJ – certainly nothing major, no.
 
I think stream=False sounds like a good idea, I definitely wouldn't want to duplicate all the Mixin/Views for streaming/non-streaming.

From a performance point of view, I wrote a blog post about how not using streaming templates caused our entire server to fall over and what I wrote to work around the lack of streaming templates: https://www.mysociety.org/2015/06/01/django-streaminghttpresponse-json-html/ – it's certainly not a magic bullet but in certain circumstances would be very useful :-)

ATB,
Matthew

Yann Fouillat

unread,
Sep 21, 2015, 4:25:35 AM9/21/15
to Django developers (Contributions to Django itself)
Hi, sorry for the delay,


On Monday, 7 September 2015 16:05:35 UTC+2, Aymeric Augustin wrote:
Hello,

2015-09-07 10:00 GMT+02:00 Yann Fouillat <gaga...@gmail.com>:
First I will say that most of this pull request is a port of https://github.com/django/django/pull/1037 which I ported more or less blindly.

As far as I can tell, this previous PR was never discussed on django-developers. Someone threw the code on GitHub, that's all; you can't make any assumptions about its suitability for inclusion in Django. Usually we discuss how a feature should be implement, look for consensus, and then submit the code ;-)
 

1) About the general API design

This patch adds 7 new “stream” public APIs that duplicate the current “render”
APIs. Adding a stream=False argument to the current APIs would be a more
lightweight alternative. Passing stream=True would enable the streaming
behavior. Was this alternative considered and if so, why was it rejected?

I'm not sure it was considered, but I don't think it would really be less intrusive. The documentation would be better, but the code would be less readable with a lot more conditions in the rendering logic (as opposed to keeping the same logic in stream and joining the streams in render).

The point of Django is to handle the painful stuff in the framework to make our users' lives easier. If we need to write slightly more complicated code to improve the API, that's just fine.

Also I doubt there will be many more conditions. The higher levels will just pass the stream keyword argument to the lower levels and eventually to the template engine. The only conditional should be be `response_class = StreamingHttpResponse if stream else HttpResponse`.

I'm aware of the irony of me making this argument. I set a bad precedent when I added StreamingHttpResponse. I didn't think of HttpResponse(stream=True) at the time. The prospect of having 25 StreamingFooBar classes in Django makes me realize this wasn't a great idea. Let's learn from this mistake. (The stakes are a bit low to consider fixing it through a deprecation path, though.)
  

I think it would make the patch less intrusive and keep the documentation more
readable. Specifically I'm concerned about making several parts of the
documentation twice longer for a relatively uncommon need.

It's unclear to me why TemplateView gets a StreamingTemplateView sibling while
other generic class base views that inherit TemplateResponseMixin don't. All
GCBVs except View and RedirectView could get a Streaming version. Even if the
initial patch doesn't do it, someone will do it eventually, which amounts to
13 additional classes.

As I said, it was done on the original PR so I did it too. Maybe putting it as an example in the doc would be enough, so users know how to use streaming templates in GCBVs.

I'm proposing to add a `stream = False` attribute to TemplateResponseMixin.


I think documenting that changing the `response_class` attribute should be enough. Adding a `stream` attribute would make two attributes for selecting the response class (the stream attribute would not change anything else).
 

2) About third-party template engines

How should third-party template engines that don't support streaming rendering
behave? Neither the code nor the release notes provide any guidance on this
issue. I suggest to add a stream() method that raises NotImplementedError to
the base class for template backends.


The Template class in the backends doesn't inherits from another class though, unless I'm missing something ?



These backends don't have any render method. The render method is on the Template class: https://github.com/django/django/blob/aaacaeb0963c406c4fe6f68c6ae51f4a65878250/django/template/backends/django.py#L54
 
 
The current patch implements a stream() method that doesn't actually stream in
the DummyBackend and a boilerplate render() method that concatenates a list of
one element. This is bad, all the more since the dummy backend is intended to
be a template for third-party engines.


Should the NotImplemented be in the DummyBackend stream method then ?

Yes, in order to keep https://github.com/django/django/blob/adff499e/django/template/backends/dummy.py#L17 a simple example of how to implement a Django template backend.

 
All template backends have the same copy-pasted render() method. Perhaps it's
a symptom of a problem with the API. Perhaps it should just be pulled to the
base class.


So we should have a base class for backends' Template class ?

You're mixing template backends (instantiated once per template engine configured in settings.TEMPLATES) and templates (instanciated once per call to render() or equivalent).

  
3) About performance

I think benchmarks are required:

a. to ensure that this change doesn't degrade performance for the traditional
  rendering mode

Yann Malet's recent bug report shows that even modest performance regressions
in template rendering can be a real problem for users.
 
b. to assess the performance of streaming rendering which could be
  unexpectedly slow on a realistic network (try over 3G tethering)

I agree, do you know what tools could I use to emulate 3G ?
 
As far as I know, the canonical tools are:



Thanks I'll take a look at them.
 
Best regards,

--
Aymeric.
Reply all
Reply to author
Forward
0 new messages