[Django] #28312: ModelChoiceIterator uses cached length of queryset.

18 views
Skip to first unread message

Django

unread,
Jun 15, 2017, 6:51:04 AM6/15/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
----------------------------------------------+------------------------
Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Uncategorized | Version: 1.8
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
----------------------------------------------+------------------------
In Django 1.8 (but also master), `ModelChoiceIterator` has the following
implementation:

{{{
class ModelChoiceIterator:
def __init__(self, field):
self.field = field
self.queryset = field.queryset

def __iter__(self):
if self.field.empty_label is not None:
yield ("", self.field.empty_label)
queryset = self.queryset.all()
# Can't use iterator() when queryset uses prefetch_related()
if not queryset._prefetch_related_lookups:
queryset = queryset.iterator()
for obj in queryset:
yield self.choice(obj)

def __len__(self):
return (len(self.queryset) + (1 if self.field.empty_label is not
None else 0))

def choice(self, obj):
return (self.field.prepare_value(obj),
self.field.label_from_instance(obj))
}}}

As can be seen, `__iter__` actually does a `.all()` to make sure it gets
fresh results. However, `__len__` does not. Also: it currently caches all
objects inside `self.queryset`, only releasing them again on shutdown
which might be problematic if there are a lot of results (or even a few
"large" results).

Suggested implementation

{{{
def __len__(self):
return self.queryset.count() + (1 if self.field.empty_label is not
None else 0))
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28312>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 15, 2017, 8:50:02 AM6/15/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
------------------------------------+--------------------------------------

Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------------------+--------------------------------------
Changes (by Tim Graham):

* component: Uncategorized => Forms


Comment:

I don't understand the "bug" aspect to the report. Can you give a test
case that demonstrates incorrect behavior?

I understand the point about memory consumption. I guess switching to
`count()` might be better, although it adds a query and could require some
adapting of `assertNumQueries()` tests (there is one test in Django's
`model_forms` tests that breaks in this way).

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:1>

Django

unread,
Jun 15, 2017, 1:56:20 PM6/15/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
------------------------------------+--------------------------------------

Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------------------+--------------------------------------

Comment (by Sjoerd Job Postmus):

The following two tests that show how I would expect the
`ModelMultipleChoiceField` to work. The first tests `__iter__` (and
works), the second tests `__len__` (and fails: `AssertionError: 3 != 4`).

{{{
diff --git a/tests/model_forms/tests.py b/tests/model_forms/tests.py
index c85eb2a..3c6241d 100644
--- a/tests/model_forms/tests.py
+++ b/tests/model_forms/tests.py
@@ -1992,6 +1992,27 @@ class ModelMultipleChoiceFieldTests(TestCase):
form = ArticleCategoriesForm(instance=article)
self.assertCountEqual(form['categories'].value(), [self.c2.slug,
self.c3.slug])

+ def test_added_fields_gets_rendered(self):
+ f = forms.ModelMultipleChoiceField(Category.objects.all())
+ self.assertEqual(list(f.choices), [
+ (self.c1.pk, 'Entertainment'),
+ (self.c2.pk, "It's a test"),
+ (self.c3.pk, 'Third')])
+ c4 = Category.objects.create(
+ name="Now you see me", slug="now", url="added")
+ self.assertEqual(list(f.choices), [
+ (self.c1.pk, 'Entertainment'),
+ (self.c2.pk, "It's a test"),
+ (self.c3.pk, 'Third'),
+ (c4.pk, 'Now you see me')])
+
+ def test_added_fields_gets_counted(self):
+ f = forms.ModelMultipleChoiceField(Category.objects.all())
+ self.assertEqual(len(f.choices), 3)
+ c4 = Category.objects.create(
+ name="Now you see me", slug="now", url="added")
+ self.assertEqual(len(f.choices), 4)
+

class ModelOneToOneFieldTests(TestCase):
def test_modelform_onetoonefield(self):
}}}

This means that the "invariant" that `len(f.choices) ==
len(list(f.choices))` does not hold.

In particular: note that (currently) the value is calculated just once:
the first time `__len__` is called. This caching is not limited to 'per
request', but is cached for the lifetime of the process handling requests.

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:2>

Django

unread,
Jun 15, 2017, 2:36:54 PM6/15/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
------------------------------------+--------------------------------------

Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------------------+--------------------------------------

Comment (by Sjoerd Job Postmus):

Perhaps this can be closed, as (our specific case) is resolved by #27563.
My apologies.

Extra context:

We ran into this problem as our form contained logic as follows:

{{{
if len(self.fields['status'].choices) <= 1:
self.fields.pop('status')
}}}

I tested this: count = 1: no field. Add a `Status`, reload page: still no
field. Turns out the count was being "cached". Since it is a Django 1.8
project, above fix is not taken into account, and the cached value did
indeed live longer than the current request. Looking at the changes, I
expect the behaviour to be correct in Django 2.0.

Even so: the `ModelChoiceIterator` uses `.all()` or `.iterator()` in the
`__iter__` method, and thus forces obtaining a fresh value. But `__len__`
does not do so, and that is an inconsistency that might still qualify as a
bug.

Are there any plans to backport #27563 to Django 1.8? I expect not, as
it's past the "End of mainstream support".

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:3>

Django

unread,
Jun 16, 2017, 12:31:56 PM6/16/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
------------------------------------+------------------------------------

Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------------------+------------------------------------
Changes (by Tim Graham):

* has_patch: 0 => 1
* stage: Unreviewed => Accepted


Comment:

Yes, I think the current behavior of `__len__()` seems inappropriate.
[https://github.com/django/django/pull/8649 PR]

You are correct that 1.8 only receives security/data loss fixes at this
time. This fix does not qualify for a backport to any stable branch based
on our [https://docs.djangoproject.com/en/dev/internals/release-process
/#supported-versions supported versions policy].

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:4>

Django

unread,
Jun 16, 2017, 1:40:51 PM6/16/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
------------------------------------+------------------------------------

Reporter: Sjoerd Job Postmus | Owner: nobody
Type: Bug | Status: new
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
------------------------------------+------------------------------------
Changes (by Tim Graham):

* needs_better_patch: 0 => 1


Comment:

I've closed my PR, someone else is welcome to continue this.

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:5>

Django

unread,
Aug 12, 2017, 1:18:30 PM8/12/17
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: Srinivas
| Reddy Thatiparthy
Type: Bug | Status: assigned

Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Srinivas Reddy Thatiparthy):

* owner: nobody => Srinivas Reddy Thatiparthy
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:6>

Django

unread,
Apr 10, 2018, 2:42:29 AM4/10/18
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: François
| Freitag

Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by François Freitag):

* owner: Srinivas Reddy Thatiparthy => François Freitag
* needs_better_patch: 1 => 0


Comment:

[,https://github.com/django/django/pull/9867 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:7>

Django

unread,
Apr 11, 2018, 3:49:02 AM4/11/18
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: François
| Freitag
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* has_patch: 1 => 0


Comment:

Related discussion on [https://groups.google.com/d/topic/django-
developers/7UZyY0IX5aw/discussion mailing list]

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:8>

Django

unread,
Apr 20, 2018, 8:07:22 PM4/20/18
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: François
| Freitag
Type: Bug | Status: assigned
Component: Forms | Version: 1.8
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by François Freitag):

* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/9891 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:9>

Django

unread,
Apr 23, 2018, 1:03:22 PM4/23/18
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: François
| Freitag
Type: Bug | Status: closed
Component: Forms | Version: 1.8
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"3fca95e1ad5b355f7813b98c97a194a30f2ab47b" 3fca95e]:
{{{
#!CommitTicketReference repository=""
revision="3fca95e1ad5b355f7813b98c97a194a30f2ab47b"
Fixed #28312 -- Made ModelChoiceIterator.__len__() more memory-efficient.

Instead of loading all QuerySet results in memory, count the number of
entries. This adds an extra query when list() or tuple() is called on the
choices (because both call __len__() then __iter__()) but uses less
memory since the QuerySet results won't be cached. In most cases, the
choices will only be iterated on, meaning that __len__() won't be called
and only one query will be executed.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:10>

Django

unread,
Apr 23, 2018, 1:27:15 PM4/23/18
to django-...@googlegroups.com
#28312: ModelChoiceIterator uses cached length of queryset.
-------------------------------------+-------------------------------------
Reporter: Sjoerd Job Postmus | Owner: François
| Freitag
Type: Bug | Status: closed
Component: Forms | Version: 1.8
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"d1413c5d703c60dfb9e2a418c79b3e4aed32ffac" d1413c5]:
{{{
#!CommitTicketReference repository=""
revision="d1413c5d703c60dfb9e2a418c79b3e4aed32ffac"
Refs #28312 -- Added an optimized __bool__() to ModelChoiceIterator.

COUNT is more expensive than EXISTS; use the latter when possible.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/28312#comment:11>

Reply all
Reply to author
Forward
0 new messages