[Django] #33441: model Field.__hash__ should be immutable

3 views
Skip to first unread message

Django

unread,
Jan 13, 2022, 1:49:34 PMJan 13
to django-...@googlegroups.com
#33441: model Field.__hash__ should be immutable
-------------------------------------+-------------------------------------
Reporter: Adam | Owner: nobody
Johnson |
Type: Bug | Status: assigned
Component: Database | Version: dev
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
`Field.__hash__` changes value when a field is assigned to a model class.

This code crashes with an `AssertionError`:

{{{
from django.db import models


f = models.CharField(max_length=200)
d = {f: 1}


class Book(models.Model):
title = f


assert f in d
}}}

The bug was introduced in #31750.

It's unlikely to have been encountered because there are few use cases to
put a field in a dict *before* it's assigned to a model class. But I found
a reason to do so whilst implementing #26472 and the behaviour had me
stumped for a little.

IMO we can revert the `__hash__` change from #31750. Objects with the same
hash are still checked for equality, which was fixed in that ticket. But
it's bad if an object's hash changes, since it breaks its use in dicts.

--
Ticket URL: <https://code.djangoproject.com/ticket/33441>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 13, 2022, 1:52:18 PMJan 13
to django-...@googlegroups.com
#33441: model Field.__hash__ should be immutable
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam

| Johnson
Type: Bug | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Adam Johnson):

* owner: nobody => Adam Johnson
* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:1>

Django

unread,
Jan 13, 2022, 2:20:40 PMJan 13
to django-...@googlegroups.com
#33441: model Field.__hash__ should be immutable
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: closed

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* cc: Simon Charette, Ryan Hiebert (added)
* status: assigned => closed
* resolution: => invalid


Comment:

As far as I'm aware, fields are quite specific, in a real life they don't
live without models. Field's hash should change when it's assigned to a
model, comparing `creation_counter` is not enough. We shouldn't consider
the same field instance assigned to a different models as equal. IMO the
current behavior is expected, we don't want to reintroduce an issue fixed
in 502e75f9ed5476ffe8229109acf0c23999d4b533.

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:2>

Django

unread,
Jan 13, 2022, 3:05:44 PMJan 13
to django-...@googlegroups.com
#33441: model Field.__hash__ should be immutable
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam Johnson):

> As far as I'm aware, fields are quite specific, in a real life they
don't live without models.

Yes the use case is contrived. But there are plenty of uses for fields in
ORM expressions that *might* lead one to create references to a field
before/after they are assigned to a model class.

> We shouldn't consider the same field instance assigned to a different
models as equal.

`__hash__` is never used for equality. The only requirement is that *IF*
two objects are equal, they have the same hash. There's no problem with
creating a `__hash__` that always does `return 1` - it will only make its
use in dictionaries slow.

Changing `__hash__` in the PR was unnecessary for the previous fix. I
believe it came from a misunderstanding around what `__hash__` means.

> IMO the current behavior is expected, we don't want to reintroduce an
issue fixed in 502e75f9ed5476ffe8229109acf0c23999d4b533.

I don't think there's any risk. The PR passes all tests, including the one
introduced, minus the unnecessary `hash()` assertions.

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:3>

Django

unread,
Jan 13, 2022, 4:28:48 PMJan 13
to django-...@googlegroups.com
#33441: model Field.__hash__ should be immutable
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: closed
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> But there are plenty of uses for fields in ORM expressions that *might*

lead one to create references to a field before/after they are assigned to
a model class.

Could you go into more detail about this statement? Fields meant to be
assigned/bound to a model class should all be set after the app setup
phase and no fields should be bound to a model class after this point so
I'm struggling to come up with the plenty of uses in ORM expressions you
are referring to. Just saying that providing concrete examples might help
your case here.

Ideally, we'd have a notion of `BoundField(model: Model, field: Field)` to
avoid having to differentiate between fields attached to models and ones
that aren't based of `.model` (that's what we do in django.forms for
example) but that's a large change for sure.

As for reverting the patch for the `__hash__` part I'm not against it,
from what I understand and what Adam pointed out it should not break
anything. If believe strongly that inherited fields should have they own
`__hash__` then maybe we could simplify this whole thing by having the
interitance logic ''bump'' `creation_counter` instead?

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:4>

Django

unread,
Jan 14, 2022, 12:50:41 AMJan 14
to django-...@googlegroups.com
#33441: Model Field.__hash__() should be immutable.

-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: new

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: closed => new
* resolution: invalid =>
* stage: Unreviewed => Accepted


Comment:

Replying to [comment:3 Adam Johnson]:


> `__hash__` is never used for equality. The only requirement is that *IF*
two objects are equal, they have the same hash. There's no problem with
creating a `__hash__` that always does `return 1` - it will only make its
use in dictionaries slow.
>
> Changing `__hash__` in the PR was unnecessary for the previous fix.

Right, I missed it. I always find it suspicious when `__hash__()` does not
match `__eq__()`, but it's probably just me 🤷, and we have a similar
solution for `Model`.

I'm also struggling with finding ''"plenty of uses"'', nevertheless it
should not block this change .

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:5>

Django

unread,
Jan 14, 2022, 1:05:03 AMJan 14
to django-...@googlegroups.com
#33441: Model Field.__hash__() should be immutable.
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => assigned
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:6>

Django

unread,
Jan 14, 2022, 3:31:27 AMJan 14
to django-...@googlegroups.com
#33441: Model Field.__hash__() should be immutable.
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: assigned
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam Johnson):

> Could you go into more detail about this statement? Fields meant to be
assigned/bound to a model class should all be set after the app setup
phase and no fields should be bound to a model class after this point so
I'm struggling to come up with the plenty of uses in ORM expressions you
are referring to. Just saying that providing concrete examples might help
your case here.

> I'm also struggling with finding "plenty of uses", nevertheless it


should not block this change .

Sorry, I wasn't clear. I meant: There are plenty of uses for fields in ORM
expressions. These *might* lead a user to create references to a field
before/after they are assigned to a model class. I don't have a concrete
example, and clearly it must be rare, since no one has reported a bug
since Django 3.2. But, I don't think it's inconcievable that someone will
try this at some point.

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:7>

Django

unread,
Jan 14, 2022, 3:40:02 AMJan 14
to django-...@googlegroups.com
#33441: Model Field.__hash__() should be immutable.
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: Adam
| Johnson
Type: Bug | Status: closed

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed

* resolution: => fixed


Comment:

In [changeset:"fdfa97fb166ef5065aa2b229f19cb4ce303084e5" fdfa97f]:
{{{
#!CommitTicketReference repository=""
revision="fdfa97fb166ef5065aa2b229f19cb4ce303084e5"
Fixed #33441 -- Restored immutability of models.Field.__hash__().

Regression in 502e75f9ed5476ffe8229109acf0c23999d4b533.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/33441#comment:8>

Reply all
Reply to author
Forward
0 new messages