#12826 proposed adding __slots__ to many classes but was closed with the
proposal to do so on a case-by-case basis.
#33474 added __slots__ to `Variable` and related classes, showing memory
reductions and performance gains.
I propose adding __slots__ to all template Node classes to save further
memory. This leads to about a 1% improvement in template rendering time,
using this benchmark script:
{{{
import django
from django.conf import settings
from django.template import Context
from django.template import Template
settings.configure(
TEMPLATES=[{"BACKEND":
"django.template.backends.django.DjangoTemplates"}]
)
django.setup()
template = Template("{% if x %}{{ x }}{% endif %}" * 1_000)
context = Context({"x": "abc"})
for i in range(1_000):
template.render(context)
}}}
And invoking [hyperfine](https://github.com/sharkdp/hyperfine) like so:
{{{
$ hyperfine \
-N --warmup 1 \
--prepare 'git switch -d 7d0e566208' \
'python benchmark.py' \
--prepare 'git switch optimize_templates' \
'python benchmark.py'
Benchmark 1: python benchmark.py
Time (mean ± σ): 2.006 s ± 0.007 s [User: 1.968 s, System:
0.035 s]
Range (min … max): 1.995 s … 2.013 s 10 runs
Benchmark 2: python benchmark.py
Time (mean ± σ): 1.993 s ± 0.008 s [User: 1.958 s, System:
0.034 s]
Range (min … max): 1.984 s … 2.012 s 10 runs
Summary
'python benchmark.py' ran
1.01 ± 0.01 times faster than 'python benchmark.py'
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34521>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
Declaring __slots__ reduces the memory footprint of a class, and can lead
to performance gains due to less data fetching by the CPU.
#12826 proposed adding __slots__ to many classes but was closed with the
proposal to do so on a case-by-case basis.
#33474 added __slots__ to `Variable` and related classes, showing memory
reductions and performance gains.
I propose adding __slots__ to more template classes to further save memory
(Template, Token, Lexer, and the Node classes). This change leads to about
a 1% improvement in template rendering time, using this benchmark script:
{{{
import django
from django.conf import settings
from django.template import Context
from django.template import Template
settings.configure(
TEMPLATES=[{"BACKEND":
"django.template.backends.django.DjangoTemplates"}]
)
django.setup()
template = Template("{% if x %}{{ x }}{% endif %}" * 1_000)
context = Context({"x": "abc"})
for i in range(1_000):
template.render(context)
}}}
And invoking [hyperfine](https://github.com/sharkdp/hyperfine) like so,
with Python 3.11.2:
{{{
$ hyperfine \
-N --warmup 1 \
--prepare 'git switch -d 7d0e566208' \
'python benchmark.py' \
--prepare 'git switch optimize_templates' \
'python benchmark.py'
Benchmark 1: python benchmark.py
Time (mean ± σ): 2.006 s ± 0.007 s [User: 1.968 s, System:
0.035 s]
Range (min … max): 1.995 s … 2.013 s 10 runs
Benchmark 2: python benchmark.py
Time (mean ± σ): 1.993 s ± 0.008 s [User: 1.958 s, System:
0.034 s]
Range (min … max): 1.984 s … 2.012 s 10 runs
Summary
'python benchmark.py' ran
1.01 ± 0.01 times faster than 'python benchmark.py'
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:1>
Old description:
> Declaring __slots__ reduces the memory footprint of a class, and can lead
> to performance gains due to less data fetching by the CPU.
>
> #12826 proposed adding __slots__ to many classes but was closed with the
> proposal to do so on a case-by-case basis.
>
> #33474 added __slots__ to `Variable` and related classes, showing memory
> reductions and performance gains.
>
New description:
(Template, Token, Lexer, and the Node classes). This change leads to 1-2%
improvement in template rendering time, using this benchmark script:
{{{
import django
from django.conf import settings
from django.template import Context
from django.template import Template
settings.configure(
TEMPLATES=[{"BACKEND":
"django.template.backends.django.DjangoTemplates"}]
)
django.setup()
template = Template("{% if x %}{{ x }}{% endif %}" * 1_000)
context = Context({"x": "abc"})
for i in range(1_000):
template.render(context)
}}}
And invoking [hyperfine](https://github.com/sharkdp/hyperfine) like so,
with Python 3.11.2:
{{{
$ hyperfine \
-N --warmup 1 \
--prepare 'git switch -d 7d0e566208' \
'python benchmark.py' \
--prepare 'git switch ticket_34521_optimize_templates' \
'python benchmark.py'
Benchmark 1: python benchmark.py
Time (mean ± σ): 2.028 s ± 0.042 s [User: 1.990 s, System:
0.035 s]
Range (min … max): 1.997 s … 2.116 s 10 runs
Benchmark 2: python benchmark.py
Time (mean ± σ): 1.985 s ± 0.007 s [User: 1.950 s, System:
0.033 s]
Range (min … max): 1.976 s … 1.997 s 10 runs
Summary
'python benchmark.py' ran
1.02 ± 0.02 times faster than 'python benchmark.py'
}}}
(I decided not to use djangobench since it's not well maintained, and
doesn't do such great stats as hyperfine.)
--
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:2>
* owner: nobody => Adam Johnson
* status: new => assigned
* stage: Unreviewed => Accepted
Comment:
Thanks. Tentatively accepted.
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:3>
* cc: Keryn Knight (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:4>
* cc: Carlton Gibson (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:5>
* needs_better_patch: 0 => 1
Comment:
Marking as "Needs improvement" while waiting for
[https://github.com/django/django/pull/16805#issuecomment-1543276511
follow up].
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:6>
Comment (by Carlton Gibson):
I'd like to argue for wontfixing this, at least for a couple of release
cycles...
Copying my
[https://github.com/django/django/pull/16805#pullrequestreview-1430606122
comment from the PR]:
> Can I ask that we pause on this please?
>
> I'm currently investigating possible implementations for _Template
Fragments_ (or _Partials_), as per the [HTMX essay on that
theme](https://htmx.org/essays/template-fragments/).
>
> Whilst custom template tags have long been able to write to the
`context`, this route isn't available for (an as yet in-development)
_partial nodes_ that are not rendered, for example because they're
declared outside a `block`, or are rendered **after** they're required for
output, because of varying rendering order due to inheritance, say.
>
> In order to work around this, and in order to allow partials to
resolvable from a specific template, I'm currently assigning a mapping to
the template Origin to serve as a lookup store.
>
> This isn't particularly _nice™_, but it functions, and allows work on
the prototype to continue.
>
> My current implementation works on Django 4/2, but is broken by this PR.
>
> * Can no longer create weak references to template objects. (This could
be worked around but...)
> * Can no longer assign a mapping to any object that's available at both
parsing and rendering time, and so there's no (immediately) clear way to
pass partial references out from the parser.
> * A module global might work here, but it's difficult to maintain the
correct per-template mapping, and (more) even using weakref for the storer
making sure other references are dropped has proved non-trivial in my
experiments thus far.
>
> Beyond seriously putting in doubt my current work, I think forcing slots
here makes experimenting with the DTL much harder, and I don't think
that's paid for by the speed increase on offer. We're only now beginning
to come back to _What's possible with templates?_, after a long period of
JSON APIs being primary. I'd like to ask that we don't handicap that by
overly restricting what's easy to do with template objects.
>
> I also note that in researching this project, there is a long history of
people trying all sorts of customisations (and I was only looking at
output capture). Given that this PR breaks my WIP, I'd be surprised if it
didn't lead to breakages more widely.
>
> I appreciate that it's not part of the official contract for template
object **not** to use slots, but I think it's a pretty major change.
tl;dr: With fresh attention, there are plenty of gains to be made to the
DTL. Making everything use slots make experimenting with that harder.
Thanks.
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:7>
* status: assigned => closed
* resolution: => wontfix
* stage: Accepted => Unreviewed
Comment:
Closing as "wontifx", as a deeper discussion is needed due to the backward
incompatibility of this change.
Thanks y'all.
--
Ticket URL: <https://code.djangoproject.com/ticket/34521#comment:8>