|More efficient m2m assignment SQL||jdunck||7/26/12 10:15 PM|
I have 2 models:
districts = ManyToManyField(District)
Sometimes I want to reset the m2m list:
voter.districts = other_districts
I noticed that this uses ManyRelatedManager._clear_items, which, among
other things, boils down to:
I would have expected something along the lines of :
DELETE FROM `api_voter_districts` WHERE `voter_id` = 1
But instead it's 2 queries:
SELECT `api_voter_districts`.`id`, `api_voter_districts`.`voter_id`,
`api_voter_districts`.`district_id` FROM `api_voter_districts` WHERE
`api_voter_districts`.`voter_id` = 1
DELETE FROM `api_voter_districts` WHERE `id` IN (2, 3, 4, 5, 6,...)
Was it intentional to take 2 queries? That is, is there a use case
where avoiding the single DELETE query is preferable?
|Re: More efficient m2m assignment SQL||Anssi Kääriäinen||7/26/12 11:26 PM|
To me it seems there is no reason to do two queries in delete if the
following conditions hold:
- there are no signals sent for the deleted objects
- the delete does not cascade further (actually you don't need to
collect even in this case - you can do the cascade query completely in
the DB without fetching the IDs first).
For automatic M2M tables the above conditions always hold, but it is
not guaranteed if you are using the "through" argument.
To me it seems the correct fix is checking the above conditions. If
they hold, issue immediate delete query, if not, you need to collect
the objects, then delete. Doing the delete in single query can be a
huge win if you are deleting a lot of objects in one go. Actually, if
you are deleting too many objects, you will likely go OOM.
I am not objecting fixing this just for automatic m2m tables if
checking the conditions mentioned above turns out to be complicated.
|Re: More efficient m2m assignment SQL||jdunck||7/27/12 8:50 AM|
On Thu, Jul 26, 2012 at 11:26 PM, Anssi Kääriäinen...
>> I would have expected something along the lines of :...
> To me it seems there is no reason to do two queries in delete if theAh, yes, hmm. Actually, it *is* possible to get signals attached to
from django.db.models import signals
So I don't think we can even take that shortcut on just the auto m2m
tables. As the "signals maintainer", let me just say: damn, they
cause a lot of trouble. :-)
OK, it would be easy enough to add a signal method to see if there are
signals attached for a given sender (model in this case).
I'll work up a patch for the simplified case of auto-intermediates and
no receivers for pre/post delete and see where that gets us in
Do we already have machinery in place for "does this model have the
potential to cascade"? I see Collector uses
._meta.get_all_related_objects, but that seems pretty heavy for the
simple need to branch depending on "yes or no".
|Re: More efficient m2m assignment SQL||Anssi Kääriäinen||9/20/12 9:01 AM|
On 27 heinä, 18:50, Jeremy Dunck <jdu...@gmail.com> wrote:This issue came up in #django-dev recently and I decided to do
something about this. I think I have something commit-quality done for
this, see: https://code.djangoproject.com/ticket/18676#comment:2 and
The code has full fast path support - we can bulk delete in the DB if
there are no signals, and no cascades further (including parent models
and generic foreign keys).
I will be pushing this into 1.5 before feature freeze if no blocker