[Django] #35270: Optimize Model._meta._property_names

Django

Mar 4, 2024, 1:27:49 PM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 1
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Optimize Model._meta._property_names

Continuing my project to optimize the system checks, I found some
optimizations for `Model._meta._property_names`, which took ~4% of the
total runtime for checks.

Most of this function’s runtime was being spent running
`inspect.getattr_static()`. This is not surprising as it
[https://github.com/python/cpython/blob/9b9e819b5116302cb4e471763feb2764eb17dde8/Lib/inspect.py#L1852
jumps through many hoops] in order to avoid triggering attribute access.

I added use of `getattr_static()` back in #28269 /
ed244199c72f5bbf33ab4547e06e69873d7271d0 to fix a bug with instance
descriptors. But I think it’s overly cautious, and we can assume that
accessing the `__dict__` of the model class will work fine.
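
To illustrate the difference between the lookups discussed above, here is a minimal sketch. The `InstanceOnly` descriptor and the `Example` class are hypothetical stand-ins, not code from Django or from #28269; they only show that normal attribute access on a class runs the descriptor, while `inspect.getattr_static()` and a direct `__dict__` lookup both return the raw object without calling it.

```python
import inspect


class InstanceOnly:
    """Hypothetical descriptor that refuses access through the class."""

    def __get__(self, instance, owner=None):
        if instance is None:
            raise AttributeError("instance access only")
        return 42


class Example:
    value = InstanceOnly()

    @property
    def label(self):
        return "example"


# Normal attribute access on the class invokes the descriptor and fails.
try:
    Example.value
    normal_access_failed = False
except AttributeError:
    normal_access_failed = True

# Both static lookups return the descriptor objects without invoking them.
static_value = inspect.getattr_static(Example, "value")
dict_value = Example.__dict__["value"]
static_label = inspect.getattr_static(Example, "label")
dict_label = Example.__dict__["label"]
```

Note that `Example.__dict__` only covers names defined directly on `Example`, so a `__dict__`-based approach has to visit each class in the MRO itself.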

Two optimizations make the function run in negligible time:

1. Changing the function to use `__dict__` directly
2. Caching on a per-class basis. This requires using weak references to
classes, as we shouldn’t mutate base classes in the MRO, some of which can
be non-model classes, like `Model` itself for the `pk` property,
`object`, or any mixins.
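
The two optimizations listed above could be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: the body of `weak_key_cache`, the helper names `class_property_names` and `property_names`, and the plain classes `Base`, `Mixin`, `A` and `B` are all hypothetical. The `scanned` list exists only to show how often a class is actually inspected.

```python
import functools
import weakref


def weak_key_cache(func):
    """Cache func(key) in a WeakKeyDictionary (hypothetical helper)."""
    cache = weakref.WeakKeyDictionary()

    @functools.wraps(func)
    def wrapper(key):
        try:
            return cache[key]
        except KeyError:
            result = cache[key] = func(key)
            return result

    return wrapper


scanned = []  # records every class whose __dict__ was really scanned


@weak_key_cache
def class_property_names(klass):
    """Names of properties defined directly on klass, not on its bases."""
    scanned.append(klass)
    return frozenset(
        name for name, value in klass.__dict__.items() if isinstance(value, property)
    )


def property_names(klass):
    """Union of the cached per-class results over the whole MRO."""
    names = set()
    for base in klass.__mro__:
        names |= class_property_names(base)
    return frozenset(names)


class Base:
    @property
    def pk(self):
        return 1


class Mixin:
    @property
    def slug(self):
        return "x"


class A(Mixin, Base):
    @property
    def a(self):
        return "a"


class B(Mixin, Base):
    pass


names_a = property_names(A)
names_b = property_names(B)
```

In this sketch the second call only scans `B`, because `Mixin`, `Base` and `object` were already cached while handling `A`, and none of the classes gained any extra attributes.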

Before optimization stats:

106 calls to `_property_names` took 26ms, or ~4% of the total runtime of
system checks.

After optimization:

The same calls take 1ms, or ~0.2% of the total runtime. (The real runtime
may be <1ms, but shows as 1 due to rounding up by cProfile.)
--
Ticket URL: <https://code.djangoproject.com/ticket/35270>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Mar 4, 2024, 1:32:43 PM

to django-...@googlegroups.com

Description changed by Adam Johnson:

New description:

Continuing my project to optimize the system checks, I found some
optimizations for `Model._meta._property_names`, which took ~4% of the
total runtime for checks.

Most of this function’s runtime was being spent running
`inspect.getattr_static()`. This is not surprising as it
[https://github.com/python/cpython/blob/9b9e819b5116302cb4e471763feb2764eb17dde8/Lib/inspect.py#L1852
jumps through many hoops] in order to avoid triggering attribute access.

I added use of `getattr_static()` back in #28269 /
ed244199c72f5bbf33ab4547e06e69873d7271d0 to fix a bug with instance
descriptors. But I think it’s overly cautious, and we can assume that
accessing the `__dict__` of the model class will work fine.

Two optimizations make the function run in negligible time:

1. Changing the function to use `__dict__` directly
2. Caching on a per-class basis. This requires using weak references to
classes, as we shouldn’t mutate base classes in the MRO, some of which can
be non-model classes, like `Model` itself for the `pk` property,
`object`, or any mixins.

Before optimization stats:

106 calls to `_property_names` took 26ms, or ~4% of the total runtime of
system checks.

After optimization:

The same calls take 1ms, or ~0.2% of the total runtime. (The real runtime
may be <1ms, but shows as 1 due to rounding up by cProfile.)

--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:1>

Django

Mar 4, 2024, 8:26:20 PM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Natalia Bidart):

I'm not so sure about this one, particularly after having read the history
in the relevant PRs ([https://github.com/django/django/pull/7598 the
original optimization in this code] and
[https://github.com/django/django/pull/8599 its regression fix]).

I wonder, would using [https://github.com/django/django/pull/8601 the
solution proposed for "1.11.x"] be an option for getting rid of
`inspect.getattr_static`? I'm not a fan of the custom weak key cache; it
feels like an unnecessary addition to the framework only for optimization
purposes.
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:2>

Django

Mar 6, 2024, 7:09:39 AM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Changes (by Keryn Knight):

* cc: Keryn Knight (added)

Comment:

👋 Adam, do you happen to know what share of the overall "win" each of
the 2 optimisations accounts for? i.e. is 80% of it the change to use
`klass.__dict__` or is 95% of it the `weak_key_cache`, etc?

Is there a world where we get enough of a benefit from ''just'' `Changing
the function to use __dict__ directly` that we don't need the
`weak_key_cache`?

I ask because the `weak_key_cache` is the kind of deep magic that **I**
don't fully understand immediately, and because you mentioned caching on a
''per-class basis'' but I'd have ''assumed'' (from my naive understanding
of these innards) that was already approaching done, by virtue of the
`cached_property` existing on the `Options` per model? (i.e. `Options` is
a singleton per class)
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:3>

Django

Mar 6, 2024, 3:45:38 PM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam Johnson):

I didn’t fully explain the caching. What I meant by “per-class” is
per-*any*-class, not per *model class* - that means caching values for all
the property names in `object` (none), in `models.Model` (just `pk`,
currently), any mixin classes, and so on. Yes, the caching on `Options`
means it’s cached per *model class*, but just relying on that caching
means we don’t get to reuse the work done checking everything defined on
mixins, `Model`, or `object`.

`weak_key_cache` implements a pattern I’ve played with before to associate
extra data with an object without attaching arbitrary attributes, since
they might clash or affect operations like serialization or repr.
django-upgrade uses
[https://github.com/search?q=repo%3Aadamchainz%2Fdjango-upgrade%20weakkeydictionary&type=code
a bunch of WeakKeyDictionary instances] to keep fixers modular to their
own files.
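
As a minimal sketch of the side-table pattern described above, assuming nothing about how django-upgrade or the patch actually use it: the `Node` class and the `_notes` table are hypothetical. The point is that the extra data lives outside the object, and the entry goes away once the object does.

```python
import gc
import weakref

# Side table: extra data keyed by object, without touching the object itself.
_notes = weakref.WeakKeyDictionary()


class Node:
    """Hypothetical stand-in for any weak-referenceable object."""


node = Node()
_notes[node] = {"visited": True}

has_attribute = hasattr(node, "visited")  # the object itself is unchanged
entries_while_alive = len(_notes)

del node
gc.collect()  # force collection so this also holds without refcounting
entries_after_delete = len(_notes)
```

Because the table holds only weak references to its keys, it does not keep the objects alive and does not need explicit cleanup.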

It doesn’t make sense to use `@weak_key_cache` without the `__dict__`
optimization, because it requires one call per class, whilst the old
`dir()` approach checks attributes defined in the class *and* all
superclasses.

I profiled `__dict__` without `@weak_key_cache` though, using this
implementation:

{{{
@cached_property
def _property_names(self):
    """Return a set of the names of the properties defined on the model."""
    names = set()
    for klass in self.model.__mro__:
        names.update(
            {
                name
                for name, value in klass.__dict__.items()
                if isinstance(value, property)
            }
        )
    return frozenset(names)
}}}

The result was that the calls took 2ms, keeping most of the savings. That
said, the project I’m using doesn’t have deep model inheritance or many
mixins, so we wouldn’t expect the caching to do so much.
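
For anyone wanting to try the comparison outside Django, here is a standalone sketch that contrasts the `dir()` plus `getattr_static()` approach with the `__dict__` walk shown above. The function names and the plain classes `Base`, `Child` and `Shadowing` are hypothetical, and the timing helper prints whatever your machine measures; no numbers from the ticket are reproduced here.

```python
import inspect
import timeit


def names_via_getattr_static(cls):
    """Check every name from dir() with inspect.getattr_static()."""
    return frozenset(
        name
        for name in dir(cls)
        if isinstance(inspect.getattr_static(cls, name), property)
    )


def names_via_dict(cls):
    """Walk the MRO and look only at each class's own __dict__."""
    names = set()
    for klass in cls.__mro__:
        names.update(
            name for name, value in klass.__dict__.items() if isinstance(value, property)
        )
    return frozenset(names)


class Base:
    @property
    def pk(self):
        return 1


class Child(Base):
    @property
    def title(self):
        return "t"


class Shadowing(Base):
    pk = 5  # plain attribute hiding the inherited property


def compare_timings(number=1000):
    """Print rough timings for both approaches on Child."""
    for func in (names_via_getattr_static, names_via_dict):
        seconds = timeit.timeit(lambda: func(Child), number=number)
        print(f"{func.__name__}: {seconds:.4f}s for {number} calls")


if __name__ == "__main__":
    compare_timings()
```

The two functions agree on `Child`, but not on `Shadowing`: the `dir()` version sees only the attribute that wins the lookup (the plain `pk = 5`), while this naive `__dict__` walk still reports the `pk` property from `Base`. A real implementation may want to account for that kind of shadowing.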

If you’d both prefer this version, sure, we can go for it. Best to keep
things maintainable for all, and we can always add `@weak_key_cache` or
similar in the future.
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:4>

Django

Mar 7, 2024, 1:03:53 PM

to django-...@googlegroups.com

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Changes (by Natalia Bidart):

* stage: Unreviewed => Accepted

Comment:

Replying to [comment:4 Adam Johnson]:
[...]

> The result was that the calls took 2ms, keeping most of the savings.
> That said, the project I’m using doesn’t have deep model inheritance or
> many mixins, so we wouldn’t expect the caching to do so much.
>
> If you’d both prefer this version, sure, we can go for it. Best to keep
> things maintainable for all, and we can always add `@weak_key_cache` or
> similar in the future.

I'm very much in favor of a simpler optimization. I agree with Keryn that
`@weak_key_cache` is the kind of deep magic that is not fully understood
immediately.
Accepting, following this simplification proposal.
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:5>

Django

Mar 7, 2024, 5:47:47 PM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam Johnson):

Alright, updated.
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:6>

Django

Mar 8, 2024, 12:48:00 AM

to django-...@googlegroups.com

Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Changes (by Mariusz Felisiak):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:7>

Django

Mar 11, 2024, 12:32:55 AM

to django-...@googlegroups.com

#35270: Optimize Model._meta._property_names
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam Johnson
Type: Cleanup/optimization | Status: closed
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* resolution: => fixed
* status: assigned => closed

Comment:

Fixed by faeb92ea13f0c1b2cc83f45b512f2c41cfb4f02d.
--
Ticket URL: <https://code.djangoproject.com/ticket/35270#comment:8>
