The main problem is that when you have fetched a model containing a m2m
field from the database, and access its m2m field the manager gets cached.
If you then set the model's `.pk` to `None` and do a `.save()` to save it
as a copy of the old one, the related manager cache is not cleared. Here's
some code with inline comments to demonstrate:
{{{
# models.py
from django.db import models
class Tag(models.Model):
tag = models.SlugField(max_length=64, unique=True)
def __str__(self):
return self.tag
class Thing(models.Model):
tags = models.ManyToManyField(Tag, blank=True)
def clone(self) -> 'Thing':
# To clone a thing, we first save a list of the tags it has
tags = list(self.tags.all())
# Then set its pk to none and save, creating the copy
self.pk = None
self.save()
# In django 3.2 the following sets the original tags for the new
instance.
# In 4.x it's a no-op because self.tags returns the old instance's
manager.
self.tags.set(tags)
return self
@staticmethod
def has_bug():
one, _ = Tag.objects.get_or_create(tag='one')
two, _ = Tag.objects.get_or_create(tag='two')
obj = Thing.objects.create()
obj.tags.set([one, two])
new_thing = obj.clone()
# new_thing.tags.all() returns the expected tags, but it is
actually returning obj.tags.all()
# If we fetch new_thing again it returns the actual tags for
new_thing, which is empty.
#
# Even `new_thing.refresh_from_db()` -- which is not required with
3.x -- does not help.
# `new_thing._state.related_managers_cache.clear()` works, but
feels like something I
# shouldn't have to do.
return list(new_thing.tags.all()) !=
list(Thing.objects.get(pk=new_thing.pk).tags.all())
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33984>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
I'm upgrading our codebase from Django 3.2 to 4.1, and came upon a
regression when running our test suite. I bisected it down to commit
4f8c7fd9d9 Fixed #32980 -- Made models cache related managers:
[https://github.com/django/django/commit/4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6]
def __str__(self):
return self.tag
--
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:1>
* cc: Keryn Knight (added)
* severity: Normal => Release blocker
* stage: Unreviewed => Accepted
Comment:
This is a documented breaking change, however it also broke a
[https://docs.djangoproject.com/en/4.1/topics/db/queries/#copying-model-
instances documented way for copying model instances], so we need to
update docs or fix it.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:2>
Comment (by Bhuvnesh):
I am a bit confuse here, Using clone method all the tags are being copied
from the old instance still the actual tags for the new instance (
new_thing ) are empty after fetching the new instance again? why the tags
of old instance are not cloned ?
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:3>
Comment (by joeli):
Replying to [comment:3 Bhuvnesh]:
> I am a bit confuse here, Using clone method all the tags are being
copied from the old instance still the actual tags for the new instance (
new_thing ) are empty after fetching the new instance again? why the tags
of old instance are not cloned ?
That is the bug here: in `.clone()` after doing `self.pk = None` and
`self.save()` to create a new copy, `self.tags` still points to the old
thing's related manager. So `.clone()` ends up setting the tags for the
old instance (that already has said tags). Then it returns a thing that
pretends to be the `new_thing` but actually returns the tags of the old
thing unless you completely refetch it from database.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:4>
Comment (by joeli):
Replying to [comment:2 Mariusz Felisiak]:
> This is a documented breaking change, however it also broke a
[https://docs.djangoproject.com/en/4.1/topics/db/queries/#copying-model-
instances documented way for copying model instances], so we need to
update docs or fix it.
Thanks, I was looking for that exact link to include in my original report
but couldn't find where it was mentioned. Maybe worth noting here that
adding `self._state.adding = True` before the save in .clone() does not
alter the result of this bug report.
FWIW I'd really like to see this fixed rather than addressed in the
documentation, as we are relying on this quite heavily in production code
to clone objects with deeply nested relations, so changing it to another
style would be a pain. Not only that but I don't think there's any other
documented way to duplicate objects in the ORM?
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:5>
Comment (by Keryn Knight):
In ''theory'' the "simplest" solution (to keep the caching behaviour and
allow the documented duplication method) is to special-case `None` as a pk
value and always instantiate a fresh manager, circumventing the cache
entirely.
I think though that the problem ''technically'' extends further, any
change to the `pk` -- or more specifically, the manager's core filter
dependencies -- would become stale subsequently. Unsupported those other
cases may be from a documentation standpoint, but I'm sure someone
somewhere is doing it...
If there's a path where we (I) can invalidate correctly, and it keeps the
overall improvement of #32980, and it's not too invasive or complex, we
can try and proceed on that basis. If the change restores the ''old''
performance behaviour in the process, there's not much argument (except
maybe memory pressure/garbage collection) to keeping the cache
functionality, off the top of my head.
Basically there's two things to check there above, and if either of them
is a miss, I guess we'd revert #32980 because I can envisage ''plenty'' of
cases where people use the `pk=None` trick to clone an instance,
documented as it is.
I may assign this out to myself in the near future if it's otherwise not
picked up, assuming I can find the availability.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:6>
Comment (by Mariusz Felisiak):
> In theory the "simplest" solution (to keep the caching behaviour and
allow the documented duplication method) is to special-case None as a pk
value and always instantiate a fresh manager, circumventing the cache
entirely.
`None` is already
[https://github.com/django/django/blob/32536b1324e98768dd892980408a8c6b26c23fd9/django/db/models/base.py#L931-L940
handled separately for related objects], so maybe it's enough to refresh
`related_managers_cache` when `self.pk is None` and they're cached 🤔 I'm
also happy to revert 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 and re-open
#32980.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:7>
Comment (by Bhuvnesh):
Can anyone write a regression test for this one?
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:8>
Comment (by Mariusz Felisiak):
Replying to [comment:8 Bhuvnesh]:
> Can anyone write a regression test for this one?
A regression test is already in
[https://docs.djangoproject.com/en/stable/topics/db/queries/#copying-
model-instances docs]:
{{{#!python
entry = Entry.objects.all()[0] # some previous entry
old_authors = entry.authors.all()
entry.pk = None
entry._state.adding = True
entry.save()
entry.authors.set(old_authors)
self.assertCountEqual(entry.authors.all(), [...])
}}}
Have you check previous comments?
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:9>
Comment (by Bhuvnesh):
Yes , i read the previous comments! Sorry, i didn't realize that i could
make out regression tests from documentation too :)
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:10>
* has_patch: 0 => 1
Comment:
[https://github.com/django/django/pull/16047 PR]
Added a condition to refresh {{{related_managers_cache}}} when {{{self.pk
is None}}} and {{{self._state.adding is True}}}. If {{{self.pk = None}}}
will be missing then the instance would not be cloned and if
{{{self._state.adding = True}}} is missing then related_manangers_cache
would not be refreshed generating inconsistent results, hence both the
conditions are necessary. It is also mentioned in
[https://docs.djangoproject.com/en/4.1/topics/db/queries/#copying-model-
instances docs.]
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:11>
* owner: nobody => Bhuvnesh
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:12>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:13>
Comment (by Mariusz Felisiak):
I found another regression in caching related manager. The following test
works on Django 4.0 and crashes with
4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6:
{{{#!diff
diff --git a/tests/many_to_many/tests.py b/tests/many_to_many/tests.py
index 53e870ddad..770808a92c 100644
--- a/tests/many_to_many/tests.py
+++ b/tests/many_to_many/tests.py
@@ -92,6 +92,27 @@ class ManyToManyTests(TestCase):
a5.authors.remove(user_2.username)
self.assertSequenceEqual(a5.authors.all(), [])
+ def test_related_manager_refresh(self):
+ user_1 = User.objects.create(username="Jean")
+ user_2 = User.objects.create(username="Joe")
+ a5 = Article.objects.create(headline="Django lets you create web
apps easily")
+ a5.authors.add(user_1.username)
+ self.assertSequenceEqual(user_1.article_set.all(), [a5])
+ # Change the username on a different instance of the same user.
+ user_1_from_db = User.objects.get(pk=user_1.pk)
+ self.assertSequenceEqual(user_1_from_db.article_set.all(), [a5])
+ user_1_from_db.username = "Paul"
+ user_1_from_db.save()
+ user_2.username = "Jean"
+ user_2.save()
+ # Add a different article.
+ self.a4.authors.add(user_1_from_db.username)
+ self.assertSequenceEqual(user_1_from_db.article_set.all(),
[self.a4])
+ # Refresh the first instance from DB.
+ user_1.refresh_from_db()
+ self.assertEqual(user_1.username, "Paul")
+ self.assertSequenceEqual(user_1.article_set.all(), [self.a4])
+
def test_add_remove_invalid_type(self):
msg = "Field 'id' expected a number but got 'invalid'."
for method in ["add", "remove"]:
}}}
It shows two issues:
- the second call of `user_1_from_db.article_set.all()` should use a new
username (`Paul`) but it's still using the old one (`Jean`),
- `user_1.refresh_from_db()` doesn't clear cached managers.
I'm going to prepare revert of 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6
and reopen #32980.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:14>
* owner: Bhuvnesh => Mariusz Felisiak
* needs_better_patch: 1 => 0
Comment:
[https://github.com/django/django/pull/16140 New PR]
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:15>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"5e0aa362d91d000984995ce374c2d7547d8d107f" 5e0aa362]:
{{{
#!CommitTicketReference repository=""
revision="5e0aa362d91d000984995ce374c2d7547d8d107f"
Fixed #33984 -- Reverted "Fixed #32980 -- Made models cache related
managers."
This reverts 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 and adds
two regression tests:
- test_related_manager_refresh(), and
- test_create_copy_with_m2m().
Thanks joeli for the report.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:16>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"7a1675806a37375698df208c00f892cc81afe1b9" 7a167580]:
{{{
#!CommitTicketReference repository=""
revision="7a1675806a37375698df208c00f892cc81afe1b9"
[4.1.x] Fixed #33984 -- Reverted "Fixed #32980 -- Made models cache
related managers."
This reverts 4f8c7fd9d91b35e2c2922de4bb50c8c8066cbbc6 and adds
two regression tests:
- test_related_manager_refresh(), and
- test_create_copy_with_m2m().
Thanks joeli for the report.
Backport of 5e0aa362d91d000984995ce374c2d7547d8d107f from main
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:17>
Comment (by Steven Mapes):
Is this why model forms now, since Django 4.1.0 raise ValueErrors if you
try access Related Managers within the __init__ even if you check for the
the instance being set along with the relationship?
I.E If you have models and a Model form like this
{{{
# models.py
from django.db import models
class Tag(models.Model):
tag = models.SlugField(max_length=64, unique=True)
class Thing(models.Model):
tag = models.ForeignKey(Tag, on_delete=models.CASCADE,
related_name="things")
active = models.BooleanField(default=True)
}}}
{{{
# forms.py
from django import forms
from example.models import Tag
class TagForm(forms.ModelForm):
class Meta:
model = Tag
fields = ["tag"]
def __init__(self, *args, **kwargs):
super(TagForm, self).__init__(*args, **kwargs)
if self.instance and self.instance.things:
inactive_things = self.instance.things.filter(active=False)
# Do something with it
}}}
Then have the following test
{{{
from unittest.case import TestCase
from example.forms import TagForm
class TagFormCase(TestCase):
def test_required(self):
"""Test required fields"""
form = TagForm({})
self.assertFalse(form.is_valid())
self.assertEqual(
{
"tag": ["This field is required."],
},
form.errors,
)
}}}
The test would pass in all Django versions up to 4.1. From 4.1.0 onwards
it raises a ValueError of ```ValueError: 'Tag' instance needs to have a
primary key value before this relationship can be used.```
In order for the test to pass you need to change the flow control to ```if
self.instance.pk and self.instance.things```. You can't use
```self.instance.things.count()``` as it raises the exception.
However if you overload the primary key on the model to use a UUID such
as:
{{{
# models.py
import uuid
from django.db import models
class Tag(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4,
editable=False)
tag = models.SlugField(max_length=64, unique=True)
class Thing(models.Model):
tag = models.ForeignKey(Tag, on_delete=models.CASCADE,
related_name="things")
active = models.BooleanField(default=True)
}}}
then the original test will pass as the uuid will be set prior to saving
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:18>
Comment (by Keryn Knight):
Replying to [comment:18 Steven Mapes]:
> Is this why model forms now, since Django 4.1.0 raise ValueErrors if you
try access Related Managers within the `__init__` even if you check for
the the instance being set along with the relationship?
I ''believe'' that's #33952 -- that was fixed in `4.1.1` and this ticket
in `4.1.2`.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:19>
Comment (by Steven Mapes):
Replying to [comment:19 Keryn Knight]:
> Replying to [comment:18 Steven Mapes]:
> > Is this why model forms now, since Django 4.1.0 raise ValueErrors if
you try access Related Managers within the `__init__` even if you check
for the the instance being set along with the relationship?
>
> I ''believe'' that's #33952 -- that was fixed in `4.1.1` and this ticket
in `4.1.2`.
Ah okay, I just tested it with 3.2.16, 4.0.8, 4.1.0, 4.1.1, 4.1.2 and the
tests pass in 3.2.16 and 4.0.8 but still fail in the others
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:20>
Comment (by Alexandr Bezenkov):
Replying to [comment:19 Keryn Knight]:
> Replying to [comment:18 Steven Mapes]:
> > Is this why model forms now, since Django 4.1.0 raise ValueErrors if
you try access Related Managers within the `__init__` even if you check
for the the instance being set along with the relationship?
>
> I ''believe'' that's #33952 -- that was fixed in `4.1.1` and this ticket
in `4.1.2`.
Confirming, exists in `4.1.2`.
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:21>
Comment (by GitHub <noreply@…>):
In [changeset:"57c2e5da710b583110d8c7c20e91774a61d7472f" 57c2e5d]:
{{{
#!CommitTicketReference repository=""
revision="57c2e5da710b583110d8c7c20e91774a61d7472f"
Refs #33984 -- Added test for creating copies of model instances with
inherited m2m fields.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/33984#comment:22>