[Django] #33007: `form_clean()` performance

Django

unread,

Aug 9, 2021, 1:54:05 PM8/9/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
-----------------------------------------+------------------------
Reporter: David Smith | Owner: nobody
Type: Uncategorized | Status: new
Component: Uncategorized | Version: 3.2
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-----------------------------------------+------------------------
I ran the ran the benchmark suite to see if anything had changed in the
last few months. See [https://smithdc1.github.io/django-
asv/#forms.FormBenchmarks.time_form_clean link]

This highlighted a regression (c.15% increase) in the performance
`form_clean()` following this
[https://github.com/django/django/commit/90a33ab2 commit], see
[https://smithdc1.github.io/django-
asv/#forms.FormBenchmarks.time_form_clean benchmark].

I'm not certain what to suggest here, or if it's even that much of an
issue and should be `wontfix`ed, but thought it best to atleast raise.

--
Ticket URL: <https://code.djangoproject.com/ticket/33007>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,

Aug 9, 2021, 4:25:21 PM8/9/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------
Changes (by Mariusz Felisiak):

* cc: Chris Jerdonek (added)
* type: Uncategorized => Cleanup/optimization
* component: Uncategorized => Forms
* stage: Unreviewed => Accepted

Comment:

Thanks! It's definitely worth investigating.

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:1>

Django

unread,

Aug 9, 2021, 5:28:15 PM8/9/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------

Comment (by Chris Jerdonek):

That ticket (#32920) was primarily about correctness rather than trying to
do anything about performance. Nevertheless, one possible optimization
that occurred to me while working on it was adding a flag to `BaseForm`
called something like `self._bound_fields_cache_filled`. Then, the
[https://github.com/django/django/blob/b64db05b9cedd96905d637a2d824cbbf428e40e7/django/forms/forms.py#L147-L150
BaseForm._bound_items()] and
[https://github.com/django/django/blob/b64db05b9cedd96905d637a2d824cbbf428e40e7/django/forms/forms.py#L152-L155
BaseForm.__iter__()] iterators could consult that flag at their outset. If
set, the `BoundField` items could be served directly from
`_bound_fields_cache` instead of going through the extra layer of
indirection of `BoundField.__getitem__()`. However, I don't know what's
contributing most in terms of the slow-down. Maybe it's that
`BoundField.initial`
[https://github.com/django/django/blob/b64db05b9cedd96905d637a2d824cbbf428e40e7/django/forms/boundfield.py#L228-L230
uses @cached_property]?

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:2>

Django

unread,

Aug 9, 2021, 8:29:18 PM8/9/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)

Comment:

While this particular commit was made to address a correctness issue the
recent influx in optimization PRs merged in the past weeks made me wonder
if we'd rather invest in having some form of CI to confirm their benefit
in the grand scheme of things.

I greatly appreciate having David run these benchmark from time to time
but it'd be great to back optimization PRs with addition to the
performance suite to make sure we don't trade one improvement for another
over time. It would seem like a reasonable ask given how invested we are
in writing regression tests for all bug fixes and feature additions?

Thoughts? Maybe I should bring that up on the mailing list instead?

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:3>

Django

unread,

Aug 9, 2021, 11:46:18 PM8/9/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------

Comment (by Chris Jerdonek):

> Thoughts?

If a PR's purpose is to optimize, but it adds complexity (like adding
caching), then it seems reasonable to me that there be an objective
measure that people can look to that shows improvement. A related example:
just in the last couple days, it was pointed out in
[https://github.com/django/django/pull/14748 this PR] that
`_lazy_re_compile()` (which was originally added as an optimization) was
actually slower in one instance. How many more are like that?

On the other hand, if a PR is doing something like removing a redundant
operation or eliminating an unneeded argument, I think it should be okay
even if a benchmark doesn't show a noticeable improvement. I know in some
cases like that, the commit message wound up saying something like
"Optimized ...," even if the impact might not really be noticeable. (I
think a message like "Simplified ..." or "Removed redundant ..." might be
more accurate in cases like that, so it doesn't overstate the impact.)

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:4>

Django

unread,

Aug 10, 2021, 1:35:39 AM8/10/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------

Comment (by Carlton Gibson):

+1 to building on David's work: I believe the next stage is that we need
to provision the benchmarks to run on small but otherwise empty instance,
that isn't David's own machine — But I guess details can be discussed on
the django-bench repo… — general point is it would be great to have these
running regularly (maybe daily on main, and PRs on request?)

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:5>

Django

unread,

Aug 10, 2021, 2:23:14 AM8/10/21

to django-...@googlegroups.com

#33007: `form_clean()` performance
--------------------------------------+------------------------------------

Reporter: David Smith | Owner: nobody

Type: Cleanup/optimization | Status: new
Component: Forms | Version: 3.2
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0

--------------------------------------+------------------------------------

Comment (by David Smith):

Replying to [comment:5 Carlton Gibson]:

> But I guess details can be discussed on the django-bench repo…

I ''finally'' got round to finishing my thoughts on benchmarking more
generally. [https://github.com/django/djangobench/issues/38 see post].

--
Ticket URL: <https://code.djangoproject.com/ticket/33007#comment:6>

Reply all

Reply to author

Forward