Hi,
I'm having mixed feelings toward ModelChoiceIterator's implementation.
That custom iterator intends to fetch choices defined by the queryset
argument of a ChoiceField.
Issue
=====
Currently, the expectation seems to be that the latest data is queried
from the database each time the choices attribute of a ChoiceField
instance is evaluated. That's best described by
ModelFormBasicTests.test_runtime_choicefield_populated [1]. Besides,
#28312 reports reusing [...] the cached QuerySet in ModelChoiceIterator
as a bug [2].
My expectation is that once evaluated, the QuerySet stays cached for the
ModelChoiceIterator instance's lifetime (i.e. in
test_runtime_choicefield_populated the w_bernstein entry would not be
visible in the second f.as_ul() output). Unaware of #28312, I created
#29159 to avoid making a duplicate query when calling list() on a
ModelChoiceIterator's choices. The fix for #29159 [3] makes a step
toward caching the QuerySet results on the ModelChoiceIterator instance,
which goes against the existing expectation.
Questions
=========
Before to break the existing assumption further, I would like to gather
opinions on the following two options.
1. Should a new patch be submitted for #29159 [4] to use `.count()` for
`__len__`, thus reverting back to the original behaviour?
2. Should ModelChoiceIterator be updated to cache its QuerySet? (reverts
results of #23623 [5])
2a. Removing test_runtime_choicefield_populated
2b. Marking #28312 as fixed (because the len() and list() content would
be consistent)
My opinion (option 2)
=====================
AFAICT, reusing the ModelChoiceIterator queryset results, instead of
issuing a new query whenever the ModelChoiceIterator instance is
evaluated, is the only way to guarantee results consistency across calls
to the choices attribute of a ChoiceField instance.
If the QuerySet results were to change, there could be mismatches, for
example between the choices count and the actual number of choices
offered to the user. That would lead to subtle bugs under concurrent
activity, where a new entry is inserted between the counting and the
rendering.
However, this option means breaking the current assumption that choices
will be as "fresh" as possible. For instance, a choice created after the
ModelChoiceIterator instance was evaluated the first time would not be
available for selection, making the following pattern incorrect (untested):
```
f = MyChoiceField(Category.objects.all())
if 'form' not in {val for val, label in f.choices}: # Evaluate choices
Category.objects.create('Django')
f.as_ul() # Won't display Django, because the first choices evaluation
# created a cached queryset.
```
IMHO, the freshness isn't so much of an issue for most use cases,
because the ModelChoiceIterator will likely be consumed during the
template rendering, so the choices should be just as fresh.
This option also means consuming more memory, because Python objects
would stay cached on the QuerySet (we currently use .iterator() to load
the choices when possible). It don't think that's too bad, since
ModelChoiceIterator instances evaluation is lazy, thus the lifespan of
the choices cache should be short.
I don't expect that loading even thousands of choices would lead to a
major memory overhead. SQLite and MariaDB/MySQL adapters will store
database results in memory anyway, so the overhead would be on storing
Python instances versus storing text for these backends.
If memory really is a concern, I think significant savings can be
achieved using `.defer()` to reduce the size of the Python objects or
filtering/limiting the choices QuerySet. Another option is to use a
custom ModelChoiceIterator subclass to fetch the results using
`.iterator()`.
I would prefer Django to have consistent results by default, even though
that means more memory usage. What do you think?
Kind regards,
François Freitag
[1]
https://github.com/django/django/blob/73cb62a33197652a3c8261dbf052d7eb75e26139/tests/model_forms/tests.py#L1483-L1538
[2]
https://code.djangoproject.com/ticket/28312
[3]
https://code.djangoproject.com/ticket/29159
[4]
https://github.com/django/django/commit/a2e97abd8149e78071806a52282a24c27fe8236b
[5]
https://code.djangoproject.com/ticket/23623