[Django] #18556: .remove() on a reverse foreign key executes too many queries

57 views
Skip to first unread message

Django

unread,
Jul 3, 2012, 12:28:10 PM7/3/12
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database | Keywords:
layer (models, ORM) | Has patch: 0
Severity: Normal | Needs tests: 0
Triage Stage: Accepted | Easy pickings: 0
Needs documentation: 0 |
Patch needs improvement: 0 |
UI/UX: 0 |
-------------------------------------+-------------------------------------
It does one query per object received, instead of just one for the entire
batch. Attached is a patch which fixes this. It's technically not
backwards compatible because signals are no longer sent, however one could
artificially send them. Also, in the event of an error, none of the
objects will be modified, whereas currently some of them will be.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jul 3, 2012, 2:39:26 PM7/3/12
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* has_patch: 0 => 1


Comment:

+1 for this change. There are already situations where the signals
framework doesn't catch all changes. Using .update() in this situation
seems natural. On the other hand, I do not actually use signals...

If we wanted to keep signals for this, then we should have some sort of
pre/post update signal which one could use here. The signal would likely
have an argument of "objs" which would be lazily fetched from the DB - if
nobody access the objs then no extra work done.

The patch itself looks good to me.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:1>

Django

unread,
Feb 12, 2013, 4:30:27 PM2/12/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by timo):

I think it makes sense not to send any signals for consistency with how
`clear()` works, plus it's going to be backwards-incompatible to some
extent with `save()` no longer being called. I've added documentation,
including a note to the backwards-incompatible changes for 1.6.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:2>

Django

unread,
Feb 12, 2013, 4:50:42 PM2/12/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by carljm):

The closest parallel to `remove()` in API terms is not `clear()`, it's
`add()`. There is no consistency gained by having `add()` continue to send
signals and `remove()` not. If we're going to do this, the same approach
should be used for both `add()` and `remove()` (there's also no reason for
having an efficiency difference between them).

There's also another backwards incompatibility here; the passed-in object
instances themselves are no longer updated. This looks trivial to fix;
just restore the `setattr` line in the loop.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:3>

Django

unread,
Feb 12, 2013, 5:32:46 PM2/12/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by timo):

Thanks Carl, good point. I'll be happy to update `add()` as well. Are you
+1 on removing signals for these operations or do you think it needs a
discussion on django-developers?

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:4>

Django

unread,
Feb 12, 2013, 5:43:25 PM2/12/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by carljm):

Yeah, I didn't comment on that in my first comment because I'm not sure
:-) I think it will in all likelihood break real code if we stop sending
these signals here. On the other hand, I also agree with Alex and Anssi
that the current implementation is dumb, `update()` makes way more sense,
and that it would be better to have an update signal.

I wouldn't say I'm a +1, but I'm not -1 either. Somewhere in the zero
range, I guess :-) I'm not gonna stand in the way.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:5>

Django

unread,
Feb 12, 2013, 5:51:35 PM2/12/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

There is also the possibility to have fast-path of .update() for no
listeners case, and loop when there are listeners. The complexity of dual
paths isn't big at all, and it is worth some complexity when one can save
possibly large amounts of DB resources.

The code as written doesn't do just one query per obj, it does two.
.save() will select, then update for each object. This is typical example
case where either using update_fields, or applying #16649 would help.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:6>

Django

unread,
Feb 19, 2013, 8:50:16 AM2/19/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by timo):

For `add()`, it looks like we need to at least keep the possibility of
executing `save()` for an object that hasn't been saved yet. See this
test
https://github.com/django/django/blob/master/tests/modeltests/many_to_one/tests.py#L49-L65

Here's what I came up with so far. Tests pass except for
multiple_database. I'm guessing
`self.model.objects.filter(pk__in=ids).update(**{rel_field.name:
self.instance})` needs a slightly different implementation to work with
multi-db, but I'm not sure what that would be.

https://github.com/timgraham/django/commit/d6b2720138e53cde0e8f60471e8d92f444e87474

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:7>

Django

unread,
Feb 19, 2013, 12:03:18 PM2/19/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 0
Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

The multidb issue is likely due to having the object saved (thus PK set),
but into different DB.

Just checking the pk is set isn't enough, consulting the model._state.db +
_state.adding would likely yield the correct result. If different than the
current DB or adding, then save, else go to update() directly.

This seems to also raise a possible race condition. Assume thread T1
fetches the object from DB, then T2 deletes it, and then T1 issues add().
The result was that the object was resaved to DB, after patch it is that
it remains deleted. In the add() case the correct behavior is resave so
that after add you can trust that relation actually contains all the
objects in the DB.

The race could most reliably be resolved by doing an UPDATE ... RETURNING
PK. Check which PKs were updated, those that weren't must be resaved.
Unfortunately RETURNING isn't available in MySQL or SQLite, so this idea
would only apply to those DBs supporting returning. Others could of course
do update(); values_list('pk'); save those not existing in the values_list
separately.

Completely another matter is the race condition actually matters in real
world situations...

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:8>

Django

unread,
Aug 29, 2013, 7:55:56 AM8/29/13
to django-...@googlegroups.com
#18556: .remove() on a reverse foreign key executes too many queries
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: nobody
Type: | Status: new
Cleanup/optimization | Version: 1.4
Component: Database layer | Resolution:
(models, ORM) | Triage Stage: Accepted
Severity: Normal | Needs documentation: 0
Keywords: | Patch needs improvement: 1

Has patch: 1 | UI/UX: 0
Needs tests: 0 |
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by timo):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:9>

Django

unread,
Mar 14, 2015, 9:24:44 PM3/14/15
to django-...@googlegroups.com
#18556: Improve query efficiency of .add() on a reverse foreign key
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: timgraham
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* owner: nobody => timgraham
* needs_better_patch: 1 => 0
* version: 1.4 => master
* status: new => assigned


Comment:

It seems the issue with `remove()` has since been dealt with by adding
[https://github.com/django/django/commit/f450bc9f a bulk parameter]. I
[https://github.com/django/django/pull/4319 updated my original patch] and
fixed the multidb problems as suggested by Anssi.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:10>

Django

unread,
Mar 26, 2015, 9:45:26 AM3/26/15
to django-...@googlegroups.com
#18556: Improve query efficiency of .add() on a reverse foreign key
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: timgraham
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1


Comment:

Loic would like the same `bulk` parameter for `add()`. Also
`GenericForeignKey` needs to be updated with the same behavior.

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:11>

Django

unread,
Apr 6, 2015, 8:01:57 PM4/6/15
to django-...@googlegroups.com
#18556: Improve query efficiency of .add() on a reverse foreign key
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: timgraham
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:12>

Django

unread,
Jul 27, 2015, 10:31:40 PM7/27/15
to django-...@googlegroups.com
#18556: Improve query efficiency of .add() on a reverse foreign key
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: timgraham
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Loïc Bistuer <loic.bistuer@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"adc0c4fbac98f9cb975e8fa8220323b2de638b46" adc0c4fb]:
{{{
#!CommitTicketReference repository=""
revision="adc0c4fbac98f9cb975e8fa8220323b2de638b46"
Fixed #18556 -- Allowed RelatedManager.add() to execute 1 query where
possible.

Thanks Loic Bistuer for review.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:13>

Django

unread,
Jul 27, 2015, 10:53:52 PM7/27/15
to django-...@googlegroups.com
#18556: Improve query efficiency of .add() on a reverse foreign key
-------------------------------------+-------------------------------------
Reporter: Alex | Owner: timgraham
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette <charette.s@…>):

In [changeset:"6a46f239573369d0f701a1caf037f62b25adec02" 6a46f23]:
{{{
#!CommitTicketReference repository=""
revision="6a46f239573369d0f701a1caf037f62b25adec02"
Refs #18556 -- Fixed a typo in the related manager add() method docs.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/18556#comment:14>

Reply all
Reply to author
Forward
0 new messages