[Django] #31582: Django template backend allocates model cache even iterator() is used

10 views
Skip to first unread message

Django

unread,
May 13, 2020, 5:40:43 PM5/13/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
-------------------------------------------+------------------------
Reporter: sumerc | Owner: nobody
Type: Bug | Status: new
Component: Template system | Version: 3.0
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------------+------------------------
Hi,

I was testing template engines with a memory profiler we have written and
I have seen an interesting issue with Django template engine. Still not
sure if this is an issue or intended.

The database backend I use is django.db.backends.sqlite3. I am using
following code to retrieve a huge list of objects:

{{{
comments = Comment.objects.all().iterator()
template = loader.get_template('polls/comments.html')
context = {
'comments': comments,
}
return HttpResponse(template.render(context, request))
}}}

And the template code:

{{{
{% for comment in comments %}
a
{% endfor %}
}}}

When I ran the view, I see the
{{{django.db.models.base.Comment.__init__}}} caches the results. If I
change the TEMPLATE backend to {{{Jinja2}}} I verified it is not being
cached.

Is this somewhat intended behaviour? If so, how can we disable Model
caching for Django templates?

--
Ticket URL: <https://code.djangoproject.com/ticket/31582>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
May 14, 2020, 5:44:01 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: Bug | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------
Changes (by Carlton Gibson):

* status: new => closed
* resolution: => needsinfo


Comment:

Hi. I'm not sure this is the right place (as yet) for this query.

It seems like a usage question, rather than a specific bug reports, so I'd
point you to TicketClosingReasons/UseSupportChannels.

**Maybe** there's an issue there... if there's a concrete behaviour that's
wrong but I can't see that from the report.

> I see the django.db.models.base.Comment.__init__ caches the results.

`QuerySet` has an internal results cache. `Model.__init__()` creates the
instance in memory yes, but `ModelIterable` (which is what `.iterator()`
gives you) instantiates the objects when you iterate it, so that's
expected behaviour, independent of Jinja vs the DTL.

If you can narrow this down to something specific that Django is doing
wrong, we're very happy to have a look.

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:1>

Django

unread,
May 14, 2020, 9:07:31 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: Bug | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

Comment (by Sümer Cip):

Let me clarify this a bit more.

I have modified {{{ModelIterable.__iter__}}} like following:

{{{

class ModelIterable(BaseIterable):
"""Iterable that yields a model instance for each row."""

def __iter__(self):
print("Enter ModelIterable.__iter__",
tracemalloc.get_traced_memory())
...
print("Leave ModelIterable.__iter__",
tracemalloc.get_traced_memory())
}}}

Above code prints out the current used memory in that function + the peak
memory reached.
When I run it without using an {{{iterator()}}} call I get following
values:

{{{
DTL no iterator (7K items)
Enter ModelIterable.__iter__ (1340851, 1342301)
Leave ModelIterable.__iter__ (5063404, 5063988)
}}}

Now, as you indicated this seems OK because the {{{Comment}}} object is
created and its values initialized and cached. Let's see what happens if
we run this with iterator:

{{{
DTL iterator (7K items)
Enter ModelIterable.__iter__ (92488, 92995)
Leave ModelIterable.__iter__ (5113261, 5219668)
}}}

Now, the memory usage and peak starts with a smaller value and somehow the
memory usage+peak values go from 92KB to 5MB. We have confirmed from
previous run that 5MB is the memory needed to hold the objects in cache.
So, either memory is being cached somewhere or we are not freeing memory
in this function.

After returning from this function, the memory used+peak returns to
normal, but we are peaking memory with the growth of items which should
not be the case for {{{iterator()}}}, right?

I have verified above code is working fine with Jinja2 engine. See the
results:

{{{
Jinja2 no iterator (7K ITEMS)
Enter ModelIterable.__iter__ (1362882, 1364332)
Leave ModelIterable.__iter__ (5080351, 5080935)

Jinja2 iterator (7K ITEMS)
Enter ModelIterable.__iter__ (115616, 116123)
Leave ModelIterable.__iter__ (878651, 1343215)
}}}

You can see that when iterator is being used, the memory usage numbers do
not grow by a factor of 50. It is not reaching the value of 5080351 like
we have in DTL.


I am suspecting, somehow the loop in ModelIterable.__init__ is holding the
reference somewhere else in DTL case, maybe?

{{{
for row in compiler.results_iter(results):
obj = model_cls.from_db(db, init_list,
row[model_fields_start:model_fields_end])
....
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:2>

Django

unread,
May 14, 2020, 10:12:32 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: Bug | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

Comment (by Keryn Knight):

If you can observe a difference between DTL & Jinja2, I'm inclined to
think it's due to the `defaulttags.ForNode` implementation, and a
''potential'' candidate for the 'leak' would be:
{{{
if not hasattr(values, "__len__"):
values = list(values)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:3>

Django

unread,
May 14, 2020, 11:00:48 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: Bug | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

Comment (by Sümer Cip):

Hi Keryn,

I think you are exactly right about that:

I traced memory allocation before/after that function call:

{{{
if not hasattr(values, '__len__'):

print("render len enter", tracemalloc.get_traced_memory())
values = list(values)
print("render len exit", tracemalloc.get_traced_memory())
'''
render len enter (48421, 48653)
render len exit (9390626, 9462231)
'''
}}}

But we also have this afterwards in that function which complicates things
more:

{{{
for node in self.nodelist_loop:
nodelist.append(node.render_annotated(context))
}}}

{{{context}}} is pointing to the {{{item}}} which is the model itself!
Even we fix above {{{len()}}} call somehow, this will hurt us again.

I am not sure how to proceed further on this as this seems to require more
internal Django knowledge than I have right now :)

Any ideas?

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:4>

Django

unread,
May 14, 2020, 11:01:29 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: Bug | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: needsinfo

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

Comment (by Sümer Cip):

And I think we should reopen this?

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:5>

Django

unread,
May 14, 2020, 11:14:56 AM5/14/20
to django-...@googlegroups.com
#31582: Django template backend allocates model cache even iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody

Type: Bug | Status: new
Component: Template system | Version: 3.0
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------
Changes (by Sümer Cip):

* status: closed => new
* resolution: needsinfo =>


--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:6>

Django

unread,
May 14, 2020, 1:11:40 PM5/14/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+--------------------------------------
Reporter: Sümer Cip | Owner: nobody

Type: Bug | Status: new
Component: Template system | Version: 3.0
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:7>

Django

unread,
May 14, 2020, 1:55:05 PM5/14/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: New feature | Status: new

Component: Template system | Version: 3.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+------------------------------------
Changes (by Carlton Gibson):

* type: Bug => New feature
* stage: Unreviewed => Accepted


Comment:

Good follow up both, that's super!

At first glance, certainly :), consuming the iterator into a list isn't
ideal.


Question is, what can we do about it?

The next line is:

{{{
len_values = len(values)
}}}

Then `len_values` is used to populate the various loop counter variables.
`revcounter`, `last` and so on.

I haven't thought it through but, I could see these possibly being made
optional, maybe along the lines of adding an extra option like `reversed`?
(It would require editing the `do_for()` function to parse another option.
Not sure what it should be: ...`iterator`? 😬)

I think we'd have to call that a new feature. The behaviour isn't
**incorrect**, it just uses more memory than we'd like in this case.

Let's provisionally accept on that basis. It might be worth pinging the
DevelopersMailingList to seek advice on how to proceed, particularly with
regard to the possible API for such a change.

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:8>

Django

unread,
May 14, 2020, 2:01:54 PM5/14/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: New feature | Status: new

Component: Template system | Version: 3.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+------------------------------------

Comment (by Carlton Gibson):

Ah, I didn't properly read this bit:

> context is pointing to the item which is the model itself! Even we fix
above len() call somehow, this will hurt us again.

There's an effort to allow a generator based approach to template
rendering in #13910. There's a PR for that which got so far but needs
pushing over the line.
I don't know if that would solve second issues here...?

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:9>

Django

unread,
May 22, 2020, 5:02:27 AM5/22/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: New feature | Status: new

Component: Template system | Version: 3.0
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+------------------------------------

Comment (by Keryn Knight):

Yielding for all rendering rather than build up the `nodelist` would
presumably assist in fixing the problem, but the problem itself would
still have to remain as long as `revcounter` and `revcounter0` are
documented as being there (I've no idea who'd use those, but presumably
they exist for a reason). For `last` you can probably rework the loop to
be based around using `next(...)` to find out if there *is* another item
to yield, if there's not. That's last. Similarly for empty nodes rendering
you can check whether the first `next()` raised `StopIteration`

But until `revcounter` and `revcounter0` become optional (ie: the
guarantee is only made if the `items` has a `__len__`) you can't avoid
consuming the iterator to find out the overall span. I certainly don't
think adding another argument is a good idea, because you could then end
up with something absurd like `{% for x in mymodel.objects.all.iterator
iterator %}` or `{% for x in mymodel.objects.all iterator %}`, right?
which both look like typos or accidents.

I don't know if the exception made by doing
`reversed(mymodel.objects.all().iterator())` would get swallowed by DTL or
surfaced, but that'd be another consideration. If it's surfaced I think
that's fine, if it's not that's ''another'' issue.

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:10>

Django

unread,
May 23, 2020, 3:35:55 AM5/23/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+------------------------------------
Reporter: Sümer Cip | Owner: nobody
Type: New feature | Status: closed

Component: Template system | Version: 3.0
Severity: Normal | Resolution: wontfix
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+------------------------------------
Changes (by Carlton Gibson):

* status: new => closed
* resolution: => wontfix


Comment:

> I certainly don't think adding another argument is a good idea, because

you could then end up with something absurd like...

OK. I think Keryn's analysis is correct. I agree the extra param doesn't
look nice, but I can't see another way.

So short of a concrete proposal, I think we have to say `wontfix`. (Happy
to reopen if an acceptable proposal is forthcoming.)

--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:11>

Django

unread,
May 23, 2020, 3:37:25 AM5/23/20
to django-...@googlegroups.com
#31582: Django template engine render pre-fetches data even when iterator() is used
---------------------------------+--------------------------------------

Reporter: Sümer Cip | Owner: nobody
Type: New feature | Status: closed
Component: Template system | Version: 3.0
Severity: Normal | Resolution: wontfix

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
---------------------------------+--------------------------------------
Changes (by Carlton Gibson):

* stage: Accepted => Unreviewed


--
Ticket URL: <https://code.djangoproject.com/ticket/31582#comment:12>

Reply all
Reply to author
Forward
0 new messages