[Django] #32980: Improve performance of related manager attribute access

40 views
Skip to first unread message

Django

unread,
Aug 1, 2021, 9:51:18 AM8/1/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn | Owner: Keryn Knight
Knight |
Type: | Status: assigned
Cleanup/optimization |
Component: Database | Version: dev
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Currently, getting related managers from an instance is slower than I
think it ought to be. Some prelude indicating a relatively expected
baseline:
{{{
In [1]: from django.contrib.auth.models import User
In [2]: %timeit User.objects
1.26 µs ± 25.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops
each)
In [3]: x = User.objects.first()
In [4]: %timeit x.username
34.2 ns ± 0.574 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops
each)
In [5]: %timeit x.get_short_name()
110 ns ± 2.02 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops
each)
In [7]: %timeit x.pk
201 ns ± 4.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
}}}
So far, all making sense mostly, but then:
{{{
In [8]: %timeit x.groups
7.94 µs ± 40.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [9]: %timeit x.user_permissions
8.16 µs ± 182 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
}}}
Crikey that's a big difference. What's going on here?
{{{
In [10]: x.groups
Out[10]:
<django.db.models.fields.related_descriptors.create_forward_many_to_many_manager.<locals>.ManyRelatedManager
at 0x10dc58d60>
In [11]: x.groups
Out[11]:
<django.db.models.fields.related_descriptors.create_forward_many_to_many_manager.<locals>.ManyRelatedManager
at 0x10da8b490>
In [12]: x.groups is x.groups
Out[12]: False
In [13]: x.user_permissions is x.user_permissions
Out[13]: False
}}}
Aha, it's one of those rare points where seeing the `0x` notation in a
repr is useful! The ''manager'' is being re-created on ''every'' access,
so even though repeated uses might look innocuous because they're "the
same" attribute (eg: `x.groups.count()` and then `x.groups.filter(...)`),
they're much slower to work with.

{{{
In [13]: %%timeit -n1000
...: users = User.objects.prefetch_related('groups')
...: for user in users:
...: user.groups
...:
8.03 ms ± 108 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
}}}

If we instead cache the returned manager (as is done for the
`related_manager_cls`) somewhere, we can amortize the cost of creating a
per-instance manager on repeated access:
{{{
In [1]: from django.contrib.auth.models import User
In [2]: x = User.objects.first()
In [4]: %timeit x.groups
554 ns ± 9.86 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [5]: %timeit x.user_permissions
534 ns ± 8.09 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [6]: x.groups is x.groups
Out[6]: True
In [7]: x.user_permissions is x.user_permissions
Out[7]: True
In [8]: %%timeit -n1000
...: users = User.objects.prefetch_related('groups')
...: for user in users:
...: user.groups
...:
6.93 ms ± 207 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
}}}
Because `prefetch_one_level` will use `getattr` to get the manager for
every instance seen, the cost is already paid by the time the user wants
to then ''use'' the fetched data - even though the manager's underlying
queryset will just proxy straight to the prefetched objects cache, it is
currently paying a high price to do so.

The how and where and the final implementation (if accepted) are still a
bit up for grabs; we cannot cache it on the ''descriptor'' as that would
outlive the instance and grow unbounded as instances are seen, similarly
using a weakref dict on the descriptor would stop the cached manager
outliving the instance but because it's weakly held, it disappears
immediately, so doesn't effectively work as a cache.
That effectively leaves ''somewhere'' as "on the instance" as either a
private item pushed into `instance.__dict__` or re-using the `field_cache`
on the instance's `ModelState`. I have a branch which I'll push shortly
showing a proof of concept of how it could be done, having taken some
vague guesses on bits - I guarantee it's not ''good enough'' right now,
but I think it 'could'' be.

I think this generally only applies to reverse many-to-one and many-to-
many managers because the one-to-one and forward many-to-one descriptors
are already making use of the `field_cache` in their `__get__`
implementation. So I've only concerned myself with those where I've noted
that's not the case...

--
Ticket URL: <https://code.djangoproject.com/ticket/32980>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Aug 1, 2021, 9:59:07 AM8/1/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Keryn Knight):

* needs_better_patch: 0 => 1
* has_patch: 0 => 1


Comment:

PR is https://github.com/django/django/pull/14724 ... so far tests are
still running, so who knows if it even works :)
Pre-emptively marked as patch needs improvement, and I guarantee there'll
be discussion on the PoC if the principle idea is accepted.

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:1>

Django

unread,
Aug 3, 2021, 4:27:26 AM8/3/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* stage: Unreviewed => Accepted


Comment:

OK, sounds plausible. Let's accept for discussion/review.

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:2>

Django

unread,
Oct 9, 2021, 10:39:31 AM10/9/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Keryn Knight):

* needs_better_patch: 1 => 0


Comment:

Refactored the PoC into something tidier, without having to hard-code
strings and sprinkle those through-out various places, so now the focus
can solely be on things like:
- am I using the right attributes/methods?
- are there test cases missing for completeness?

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:3>

Django

unread,
Oct 26, 2021, 2:28:16 AM10/26/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:4>

Django

unread,
Nov 4, 2021, 9:46:57 AM11/4/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Keryn Knight):

* needs_better_patch: 1 => 0


Comment:

Marking as ready to have eyes on it once again, naturally can revert back
to needs improvement.

A question I've raised on the ticket, which I reproduce for historical
purposes here, is:

> There's one thing that isn't handled, because I'm not sure if/how it can
actually happen, and that's the case of a ManyToManyField which is
symmetrical and the branch entered is self.reverse is True. As far as I
know, it's impossible to get a reverse manager for symmetrical m2ms, but
if it's possible, it would currently pose a problem because get_cache_name
defers to get_accessor_name and that returns None for symmetries, so there
would be the possibility of a collision.

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:5>

Django

unread,
Nov 8, 2021, 2:51:02 AM11/8/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:6>

Django

unread,
Nov 8, 2021, 8:58:52 AM11/8/21
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: closed

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6" 4f8c7fd9]:
{{{
#!CommitTicketReference repository=""
revision="4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6"
Fixed #32980 -- Made models cache related managers.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:7>

Django

unread,
Sep 30, 2022, 12:18:57 PM9/30/22
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by GitHub <noreply@…>):

In [changeset:"5e0aa362d91d000984995ce374c2d7547d8d107f" 5e0aa362]:
{{{
#!CommitTicketReference repository=""
revision="5e0aa362d91d000984995ce374c2d7547d8d107f"
Fixed #33984 -- Reverted "Fixed #32980 -- Made models cache related
managers."

This reverts 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 and adds
two regression tests:
- test_related_manager_refresh(), and
- test_create_copy_with_m2m().

Thanks joeli for the report.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:8>

Django

unread,
Sep 30, 2022, 12:21:50 PM9/30/22
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"7a1675806a37375698df208c00f892cc81afe1b9" 7a167580]:
{{{
#!CommitTicketReference repository=""
revision="7a1675806a37375698df208c00f892cc81afe1b9"
[4.1.x] Fixed #33984 -- Reverted "Fixed #32980 -- Made models cache
related managers."

This reverts 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 and adds
two regression tests:
- test_related_manager_refresh(), and
- test_create_copy_with_m2m().

Thanks joeli for the report.

Backport of 5e0aa362d91d000984995ce374c2d7547d8d107f from main
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:9>

Django

unread,
Sep 30, 2022, 12:23:28 PM9/30/22
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: new

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: closed => new
* has_patch: 1 => 0
* resolution: fixed =>
* stage: Ready for checkin => Accepted


Comment:

4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 was reverted (see #33984).

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:10>

Django

unread,
Mar 18, 2024, 2:43:11 AM3/18/24
to django-...@googlegroups.com
#32980: Improve performance of related manager attribute access
-------------------------------------+-------------------------------------
Reporter: Keryn Knight | Owner: Keryn
Type: | Knight
Cleanup/optimization | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Ülgen Sarıkavak):

* cc: Ülgen Sarıkavak (added)

--
Ticket URL: <https://code.djangoproject.com/ticket/32980#comment:11>
Reply all
Reply to author
Forward
0 new messages