While sorting through queries, I noticed something unusual: let's say you
have non-abstract models inheritance:
{{{
class Parent(models.Model):
parent_field = models.BooleanField()
class DirectChild(Parent):
direct_child_field = models.BooleanField()
class IndirectChild(DirectChild):
indirect_child_field = models.BooleanField()
}}}
When trying to create IndirectChild, I would expect 3 queries to be done:
* INSERT Parent
* INSERT DirectChild
* INSERT IndirectChild
Instead, since they all share the same PK, 5 queries are done:
* INSERT Parent
* UPDATE DirectChild (since it has a PK)
* INSERT DirectChild (because previous UPDATE failed)
* UPDATE IndirectChild (since it has a PK)
* INSERT IndirectChild (because previous UPDATE failed)
I found a fix that worked for me by modifying _save_parents in
django/db/models/base.py to return if something was inserted (default to
fAlse if there is no parent), and force_insert if the parent was inserted
(because if there parent was inserted, the child can't possibly exist).
Being a beginner, I'm really wary of breaking something. Could someone if
it's something wanted by Django and check if there is obvious mistake?
Here is the modified method:
{{{
def _save_parents(self, cls, using, update_fields):
"""Save all the parents of cls using values from self."""
meta = cls._meta
inserted = False
for parent, field in meta.parents.items():
# Make sure the link fields are synced between parent and
self.
if (field and getattr(self, parent._meta.pk.attname) is None
and
getattr(self, field.attname) is not None):
setattr(self, parent._meta.pk.attname, getattr(self,
field.attname))
parent_inserted = self._save_parents(cls=parent, using=using,
update_fields=update_fields)
updated = self._save_table(cls=parent, using=using,
update_fields=update_fields, force_insert=parent_inserted)
if not updated:
inserted = True
# Set the parent's PK value to self.
if field:
setattr(self, field.attname,
self._get_pk_val(parent._meta))
# Since we didn't have an instance of the parent handy set
# attname directly, bypassing the descriptor. Invalidate
# the related object cache, in case it's been accidentally
# populated. A fresh instance will be re-built from the
# database if necessary.
if field.is_cached(self):
field.delete_cached_value(self)
return inserted
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29568>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
Hello,
While sorting through queries, I noticed something unusual: let's say you
have non-abstract models inheritance:
{{{
class Parent(models.Model):
parent_field = models.BooleanField()
class DirectChild(Parent):
direct_child_field = models.BooleanField()
class IndirectChild(DirectChild):
indirect_child_field = models.BooleanField()
}}}
When trying to create IndirectChild, I would expect 3 queries to be done:
* INSERT Parent
* INSERT DirectChild
* INSERT IndirectChild
Instead, since they all share the same PK, 5 queries are done:
* INSERT Parent
* UPDATE DirectChild (since it has a PK)
* INSERT DirectChild (because previous UPDATE failed)
* UPDATE IndirectChild (since it has a PK)
* INSERT IndirectChild (because previous UPDATE failed)
I found a fix that worked for me by modifying _save_parents and save_base
in django/db/models/base.py: _save_parents return if something was
inserted (default to fAlse if there is no parent), and force_insert if the
parent was inserted in both methods (because if there parent was inserted,
the child can't possibly exist).
Being a beginner, I'm really wary of breaking something. Could someone if
it's something wanted by Django and check if there is obvious mistake?
Here is the modified method (without irrelevant code, materialized as
[...]):
{{{
def _save_parents([...]):
[...]
inserted = False
for parent, field in meta.parents.items():
[...]
parent_inserted = self._save_parents(cls=parent, using=using,
update_fields=update_fields)
updated = self._save_table(cls=parent, using=using,
update_fields=update_fields, force_insert=parent_inserted)
if not updated:
inserted = True
[...]
return inserted
def save_base([...]):
[...]
with transaction.atomic(using=using, savepoint=False):
parent_inserted = False
if not raw:
parent_inserted = self._save_parents(cls, using,
update_fields)
updated = self._save_table(raw, cls, force_insert or
parent_inserted, force_update, using, update_fields)
[...]
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:1>
Old description:
> Hello,
>
> While sorting through queries, I noticed something unusual: let's say you
> have non-abstract models inheritance:
> {{{
> class Parent(models.Model):
> parent_field = models.BooleanField()
>
> class DirectChild(Parent):
> direct_child_field = models.BooleanField()
>
> class IndirectChild(DirectChild):
> indirect_child_field = models.BooleanField()
> }}}
>
> When trying to create IndirectChild, I would expect 3 queries to be done:
> * INSERT Parent
> * INSERT DirectChild
> * INSERT IndirectChild
>
> Instead, since they all share the same PK, 5 queries are done:
> * INSERT Parent
> * UPDATE DirectChild (since it has a PK)
> * INSERT DirectChild (because previous UPDATE failed)
> * UPDATE IndirectChild (since it has a PK)
> * INSERT IndirectChild (because previous UPDATE failed)
>
> I found a fix that worked for me by modifying _save_parents and save_base
> in django/db/models/base.py: _save_parents return if something was
> inserted (default to fAlse if there is no parent), and force_insert if
> the parent was inserted in both methods (because if there parent was
> inserted, the child can't possibly exist).
>
> Being a beginner, I'm really wary of breaking something. Could someone if
> it's something wanted by Django and check if there is obvious mistake?
> Here is the modified method (without irrelevant code, materialized as
> [...]):
> {{{
> def _save_parents([...]):
> [...]
> inserted = False
> for parent, field in meta.parents.items():
> [...]
> parent_inserted = self._save_parents(cls=parent, using=using,
> update_fields=update_fields)
> updated = self._save_table(cls=parent, using=using,
> update_fields=update_fields, force_insert=parent_inserted)
> if not updated:
> inserted = True
> [...]
> return inserted
> def save_base([...]):
> [...]
> with transaction.atomic(using=using, savepoint=False):
> parent_inserted = False
> if not raw:
> parent_inserted = self._save_parents(cls, using,
> update_fields)
> updated = self._save_table(raw, cls, force_insert or
> parent_inserted, force_update, using, update_fields)
> [...]
> }}}
New description:
Hello,
While sorting through queries, I noticed something unusual: let's say you
have non-abstract models inheritance:
{{{
class Parent(models.Model):
parent_field = models.BooleanField()
class DirectChild(Parent):
direct_child_field = models.BooleanField()
class IndirectChild(DirectChild):
indirect_child_field = models.BooleanField()
}}}
When trying to create IndirectChild, I would expect 3 queries to be done:
* INSERT Parent
* INSERT DirectChild
* INSERT IndirectChild
Instead, since they all share the same PK, 5 queries are done:
* INSERT Parent
* UPDATE DirectChild (since it has a PK)
* INSERT DirectChild (because previous UPDATE failed)
* UPDATE IndirectChild (since it has a PK)
* INSERT IndirectChild (because previous UPDATE failed)
I found a fix that worked for me by modifying _save_parents and save_base
in django/db/models/base.py: _save_parents return if something was
inserted (default to fAlse if there is no parent), and force_insert if the
parent was inserted in both methods (because if there parent was inserted,
the child can't possibly exist).
Being a beginner, I'm really wary of breaking something. Could someone if
it's something wanted by Django and check if there is obvious mistake?
Here is the modified method (without irrelevant code, materialized as
[...]):
{{{
def _save_parents([...]):
[...]
inserted = False
for parent, field in meta.parents.items():
[...]
parent_inserted = self._save_parents(cls=parent, using=using,
update_fields=update_fields)
updated = self._save_table(cls=parent, using=using,
update_fields=update_fields, force_insert=parent_inserted)
if not updated:
inserted = True
[...]
return inserted
def save_base([...]):
[...]
with transaction.atomic(using=using, savepoint=False):
parent_inserted = False
if not raw:
parent_inserted = self._save_parents(cls, using,
update_fields)
updated = self._save_table(raw, cls, force_insert or
parent_inserted, force_update, using, update_fields)
[...]
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:2>
Comment (by Claude Paroz):
I guess a first experiment would be to make the change and run the test
suite to see the outcome.
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:3>
* stage: Unreviewed => Accepted
Comment:
With`IndirectChild.objects.create(parent_field=True,
direct_child_field=True, indirect_child_field=True)`, I see four queries
at 4d48ddd8f93800a80330ec1dee7b7d4afe6ea95a:
{{{
1. INSERT INTO "t29568_parent" ("parent_field") VALUES (true) RETURNING
"t29568_parent"."id"
2. UPDATE "t29568_directchild" SET "direct_child_field" = true WHERE
"t29568_directchild"."parent_ptr_id" = 1
3. INSERT INTO "t29568_directchild" ("parent_ptr_id",
"direct_child_field") VALUES (1, true)
4. INSERT INTO "t29568_indirectchild" ("directchild_ptr_id",
"indirect_child_field") VALUES (1, true)
}}}
Accepting for investigation.
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:4>
Old description:
> Hello,
>
> While sorting through queries, I noticed something unusual: let's say you
> have non-abstract models inheritance:
> {{{
> class Parent(models.Model):
> parent_field = models.BooleanField()
>
> class DirectChild(Parent):
> direct_child_field = models.BooleanField()
>
> class IndirectChild(DirectChild):
> indirect_child_field = models.BooleanField()
> }}}
>
> When trying to create IndirectChild, I would expect 3 queries to be done:
> * INSERT Parent
> * INSERT DirectChild
> * INSERT IndirectChild
>
> Instead, since they all share the same PK, 5 queries are done:
> * INSERT Parent
> * UPDATE DirectChild (since it has a PK)
> * INSERT DirectChild (because previous UPDATE failed)
> * UPDATE IndirectChild (since it has a PK)
> * INSERT IndirectChild (because previous UPDATE failed)
>
> I found a fix that worked for me by modifying _save_parents and save_base
> in django/db/models/base.py: _save_parents return if something was
> inserted (default to fAlse if there is no parent), and force_insert if
> the parent was inserted in both methods (because if there parent was
> inserted, the child can't possibly exist).
>
> Being a beginner, I'm really wary of breaking something. Could someone if
> it's something wanted by Django and check if there is obvious mistake?
> Here is the modified method (without irrelevant code, materialized as
> [...]):
> {{{
> def _save_parents([...]):
> [...]
> inserted = False
> for parent, field in meta.parents.items():
> [...]
> parent_inserted = self._save_parents(cls=parent, using=using,
> update_fields=update_fields)
> updated = self._save_table(cls=parent, using=using,
> update_fields=update_fields, force_insert=parent_inserted)
> if not updated:
> inserted = True
> [...]
> return inserted
>
> def save_base([...]):
> [...]
> with transaction.atomic(using=using, savepoint=False):
> parent_inserted = False
> if not raw:
> parent_inserted = self._save_parents(cls, using,
> update_fields)
> updated = self._save_table(raw, cls, force_insert or
> parent_inserted, force_update, using, update_fields)
> [...]
> }}}
New description:
Hello,
While sorting through queries, I noticed something unusual: let's say you
have non-abstract models inheritance:
{{{
class Parent(models.Model):
parent_field = models.BooleanField()
class DirectChild(Parent):
direct_child_field = models.BooleanField()
class IndirectChild(DirectChild):
indirect_child_field = models.BooleanField()
}}}
When trying to create IndirectChild, I would expect 3 queries to be done:
* INSERT Parent
* INSERT DirectChild
* INSERT IndirectChild
Instead, since they all share the same PK, when trying to save a
IndirectChild not yet persisted to DB, 5 queries are done:
* INSERT Parent
* UPDATE DirectChild (since it has a PK)
* INSERT DirectChild (because previous UPDATE failed)
* UPDATE IndirectChild (since it has a PK)
* INSERT IndirectChild (because previous UPDATE failed)
Note: when trying to use IndirectChild.objects.create, only 4 queries are
made, since QuerySet.create use force_insert=True (thanks to Tim Graham)
I found a fix that worked for me by modifying _save_parents and save_base
in django/db/models/base.py: _save_parents return if something was
inserted (default to fAlse if there is no parent), and force_insert if the
parent was inserted in both methods (because if there parent was inserted,
the child can't possibly exist).
Being a beginner, I'm really wary of breaking something. Could someone if
it's something wanted by Django and check if there is obvious mistake?
Here is the modified method (without irrelevant code, materialized as
[...]):
{{{
def _save_parents([...]):
[...]
inserted = False
for parent, field in meta.parents.items():
[...]
parent_inserted = self._save_parents(cls=parent, using=using,
update_fields=update_fields)
updated = self._save_table(cls=parent, using=using,
update_fields=update_fields, force_insert=parent_inserted)
if not updated:
inserted = True
[...]
return inserted
def save_base([...]):
[...]
with transaction.atomic(using=using, savepoint=False):
parent_inserted = False
if not raw:
parent_inserted = self._save_parents(cls, using,
update_fields)
updated = self._save_table(raw, cls, force_insert or
parent_inserted, force_update, using, update_fields)
[...]
}}}
--
Comment (by François Dupayrat):
Replying to [comment:3 Claude Paroz]:
> I guess a first experiment would be to make the change and run the test
suite to see the outcome.
Thanks, I'll get that done.
Replying to [comment:4 Tim Graham]:
> With`IndirectChild.objects.create(parent_field=True,
direct_child_field=True, indirect_child_field=True)`, I see four queries
at 4d48ddd8f93800a80330ec1dee7b7d4afe6ea95a:
> {{{
> 1. INSERT INTO "t29568_parent" ("parent_field") VALUES (true) RETURNING
"t29568_parent"."id"
> 2. UPDATE "t29568_directchild" SET "direct_child_field" = true WHERE
"t29568_directchild"."parent_ptr_id" = 1
> 3. INSERT INTO "t29568_directchild" ("parent_ptr_id",
"direct_child_field") VALUES (1, true)
> 4. INSERT INTO "t29568_indirectchild" ("directchild_ptr_id",
"indirect_child_field") VALUES (1, true)
> }}}
> Accepting for investigation.
Indeed, I didn't test with IndirectChild.objects.create, but with save()
on a IndirectChild not yet saved to DB, since it's done that way by a
library (django-tastypie), and outside my control.
I edited the description to reflect that.
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:5>
* has_patch: 0 => 1
Comment:
Pull request created: https://github.com/django/django/pull/10197
Tests showed a different behavior than the one listed in the ticket:
instead of doing UPDATE queries, SELECT queries were made in-between
INSERT INTO queries. Editing Parent, Child and GrandChild to add a Boolean
field revert to UPDATE queries.
In both cases, only INSERT INTO queries should be made anyway, so I don't
think it's necessary to add fields to Parent, Child or GrandChild.
Summary of queries:
- with existing I see 6 queries on GrandChild.objects.create()
{{{
INSERT INTO "model_inheritance_grandparent" ("first_name", "last_name",
"email", "place_id") VALUES ('', '', '', NULL)
SELECT (1) AS "a" FROM "model_inheritance_parent" WHERE
"model_inheritance_parent"."grandparent_ptr_id" = 1 LIMIT 1
INSERT INTO "model_inheritance_parent" ("grandparent_ptr_id") SELECT 1
SELECT (1) AS "a" FROM "model_inheritance_child" WHERE
"model_inheritance_child"."parent_ptr_id" = 1 LIMIT 1
INSERT INTO "model_inheritance_child" ("parent_ptr_id") SELECT 1
INSERT INTO "model_inheritance_grandchild" ("child_ptr_id") SELECT 1
}}}
- by adding a BooleanField with a default value to each of Parent, Child
and GrandChild, I still see 6 queries
{{{
INSERT INTO "model_inheritance_grandparent" ("first_name", "last_name",
"email", "place_id") VALUES ('', '', '', NULL)
UPDATE "model_inheritance_parent" SET "parent_field" = 1 WHERE
"model_inheritance_parent"."grandparent_ptr_id" = 1
INSERT INTO "model_inheritance_parent" ("grandparent_ptr_id",
"parent_field") SELECT 1, 1
UPDATE "model_inheritance_child" SET "child_field" = 1 WHERE
"model_inheritance_child"."parent_ptr_id" = 1
INSERT INTO "model_inheritance_child" ("parent_ptr_id", "child_field")
SELECT 1, 1
INSERT INTO "model_inheritance_grandchild" ("child_ptr_id",
"grand_child_field") SELECT 1, 1
}}}
- with patch I see 4 queries, both for modified and unmodified Models,
provided the new fields have default values.
{{{
INSERT INTO "model_inheritance_grandparent" ("first_name", "last_name",
"email", "place_id") VALUES ('', '', '', NULL)
INSERT INTO "model_inheritance_parent" ("grandparent_ptr_id") SELECT 1
INSERT INTO "model_inheritance_child" ("parent_ptr_id") SELECT 1
INSERT INTO "model_inheritance_grandchild" ("child_ptr_id") SELECT 1
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:6>
* status: new => closed
* resolution: => fixed
Comment:
In [changeset:"861638a3074f289faf729d739edf6326332a484f" 861638a]:
{{{
#!CommitTicketReference repository=""
revision="861638a3074f289faf729d739edf6326332a484f"
Fixed #29568 -- Prevented unnecessary UPDATE queries creating child
models.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29568#comment:7>