[Django] #35751: Ordering a model via a m2m field creates unintended side effect for ForeignKeys

22 views
Skip to first unread message

Django

unread,
Sep 10, 2024, 2:31:22 PM9/10/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: capital-G | Type: Bug
Status: new | Component: Database
| layer (models, ORM)
Version: 5.1 | Severity: Normal
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Python 3.10 @ Django 5.1.1

Using a many to many relation for ordering (which is something you
shouldn't do?) will affect the traversing when the object in question is
accessed via a foreign key - the necessary left join for the ordering
"spills" into the ORM results, yielding n (number of foreign key
references) times m (number of many-to-many relations within the object)
times n objects instead of just the actual n objects.

If you comment out the ordering the ORM behaves as expected.

Th ORM results should not "multiply" due to an ordering configuration.

== How to reproduce

Given a toy `models.py` on a new project which looks like

{{{#!python
from django.db import models

class Order(models.Model):
pass

class BookingTime(models.Model):
date = models.DateTimeField(auto_now=True)


class OrderItem(models.Model):
order = models.ForeignKey(
Order,
on_delete=models.CASCADE,
related_name="order_items",
)

booking_times = models.ManyToManyField(
"BookingTime",
related_name="order_items",
)

class Meta:
ordering = [
# this is the problem!
'booking_times__date',
]

}}}

Then on a shell do

{{{#!python
In [1]: from foo.models import *

In [2]: booking_times = [BookingTime() for _ in range(4)]

In [3]: [b.save() for b in booking_times]
Out[3]: [None, None, None, None]

In [4]: order = Order()

In [5]: order.save()

In [6]: order_item = OrderItem(order=order)

In [7]: order_item.save()

In [8]: order_item.booking_times.add(*booking_times)

In [9]: order.order_items.count()
Out[9]: 1

In [10]: order.order_items.all()
Out[10]: <QuerySet [<OrderItem: OrderItem object (1)>, <OrderItem:
OrderItem object (1)>, <OrderItem: OrderItem object (1)>, <OrderItem:
OrderItem object (1)>]>

In [11]: print(order.order_items.all().query)
SELECT "foo_orderitem"."id", "foo_orderitem"."order_id" FROM
"foo_orderitem" LEFT OUTER JOIN "foo_orderitem_booking_times" ON
("foo_orderitem"."id" = "foo_orderitem_booking_times"."orderitem_id") LEFT
OUTER JOIN "foo_bookingtime" ON
("foo_orderitem_booking_times"."bookingtime_id" = "foo_bookingtime"."id")
WHERE "foo_orderitem"."order_id" = 2 ORDER BY "foo_bookingtime"."date" ASC

In [12]: order.order_items.all().explain()
Out[11]: '5 0 0 SEARCH foo_orderitem USING COVERING INDEX
foo_orderitem_order_id_e19c2dbd (order_id=?)\n11 0 0 SEARCH
foo_orderitem_booking_times USING COVERING INDEX
foo_orderitem_booking_times_orderitem_id_bookingtime_id_49667055_uniq
(orderitem_id=?) LEFT-JOIN\n17 0 0 SEARCH foo_bookingtime USING INTEGER
PRIMARY KEY (rowid=?) LEFT-JOIN\n35 0 0 USE TEMP B-TREE FOR ORDER BY'
}}}

This also happens in a template, e.g.

{{{#!python
{% for item in order.order_items.all %}
{{ item }}
{% endfor %}
}}}

also yields 4 items instead of 1 due to the left join.
--
Ticket URL: <https://code.djangoproject.com/ticket/35751>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 10, 2024, 2:32:42 PM9/10/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: capital-G | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by capital-G:

Old description:
New description:
Out[12]: '5 0 0 SEARCH foo_orderitem USING COVERING INDEX
foo_orderitem_order_id_e19c2dbd (order_id=?)\n11 0 0 SEARCH
foo_orderitem_booking_times USING COVERING INDEX
foo_orderitem_booking_times_orderitem_id_bookingtime_id_49667055_uniq
(orderitem_id=?) LEFT-JOIN\n17 0 0 SEARCH foo_bookingtime USING INTEGER
PRIMARY KEY (rowid=?) LEFT-JOIN\n35 0 0 USE TEMP B-TREE FOR ORDER BY'
}}}

This also happens in a template, e.g.

{{{#!python
{% for item in order.order_items.all %}
{{ item }}
{% endfor %}
}}}

also yields 4 items instead of 1 due to the left join.

--
--
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:1>

Django

unread,
Sep 10, 2024, 2:36:48 PM9/10/24
to django-...@googlegroups.com
> Out[12]: '5 0 0 SEARCH foo_orderitem USING COVERING INDEX
> foo_orderitem_order_id_e19c2dbd (order_id=?)\n11 0 0 SEARCH
> foo_orderitem_booking_times USING COVERING INDEX
> foo_orderitem_booking_times_orderitem_id_bookingtime_id_49667055_uniq
> (orderitem_id=?) LEFT-JOIN\n17 0 0 SEARCH foo_bookingtime USING INTEGER
> PRIMARY KEY (rowid=?) LEFT-JOIN\n35 0 0 USE TEMP B-TREE FOR ORDER BY'
> }}}
>
> This also happens in a template, e.g.
>
> {{{#!python
> {% for item in order.order_items.all %}
> {{ item }}
> {% endfor %}
> }}}
>
> also yields 4 items instead of 1 due to the left join.

New description:

Python 3.10 @ Django 5.1.1

Using a many to many relation for ordering (which is something you
shouldn't do?) will affect the traversing when the object in question is
accessed via a foreign key - the necessary left join for the ordering
"spills" into the ORM results, yielding n (number of foreign key
references) times m (number of many-to-many relations within the object)
objects instead of just the actual n objects referencing the foreign key.
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:2>

Django

unread,
Sep 10, 2024, 2:38:42 PM9/10/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: Dennis Scheiba | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by Dennis Scheiba:

Old description:

> Python 3.10 @ Django 5.1.1
>
> Using a many to many relation for ordering (which is something you
> shouldn't do?) will affect the traversing when the object in question is
> accessed via a foreign key - the necessary left join for the ordering
> "spills" into the ORM results, yielding n (number of foreign key
> references) times m (number of many-to-many relations within the object)
also yields 4 times the same item instead of just once (due to the left
join?).

--
--
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:3>

Django

unread,
Sep 10, 2024, 5:10:13 PM9/10/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: Dennis Scheiba | Owner: (none)
Type: Bug | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* resolution: => invalid
* status: new => closed

Comment:

[https://docs.djangoproject.com/en/5.1/ref/models/querysets/#django.db.models.query.QuerySet.order_by
This behaviour is described in the documentation] under the note that
covers ''It is permissible to specify a multi-valued field to order the
results by''.

> Using a many to many relation for ordering (which is something you
shouldn't do?)

Could you describe wha are you expecting to happen when specifying
`order_by(booking_times__date)` if not for each item to be returned
multiple times? Are you looking for `order_by(Min("booking_times__date"))`
instead?
--
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:4>

Django

unread,
Sep 11, 2024, 6:07:42 AM9/11/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: Dennis Scheiba | Owner: (none)
Type: Bug | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Dennis Scheiba):

I don't have enough expertise in the construction of an ORM to fully
understand all implications, but as you mentioned, an ordering by a m2m
relation is not explicit enough to be determined, and actually the LOC in
question was an overlook from my side. I did not expect that the ordering
of an item can be the cause for the multiplication of items and I instead
digged through the Manager implementation to find the cause.

Maybe a warning could have been helpful here as it fundamentally changes
the way the ORM behaves? I'd think that the edge cases where this actual
behavior is wanted can be done in a way which is more canonical within the
ORM.

In the end, a warning during boot procedure would have been nice for me
and would have saved me some headache - maybe even this issue is already
helpful enough for people.
--
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:5>

Django

unread,
Sep 11, 2024, 8:19:43 AM9/11/24
to django-...@googlegroups.com
#35751: Ordering a model via a m2m field creates unintended side effect for
ForeignKeys
-------------------------------------+-------------------------------------
Reporter: Dennis Scheiba | Owner: (none)
Type: Bug | Status: closed
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: ORM ordering | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

I personally think that `order_by(multivalued__value)` is a foot-gun due
to its ambiguity and duplicate row spanning behaviour and should be
deprecated but it poses some challenges as it is a behavior that was
supported and even documented for over a decade so it's safe to assume
that some users might be legitimately depend on it for some nice query
needs (e.g. adding `distinct()` afterwards to deal with the duplicate row
issue).

If we were to raise warnings or consider deprecating this feature we
should have an upgrade path that covers these use cases. In your
particular case


{{{#!python
OrderItem.objects.order_by("booking_times__date").distinct()
}}}

which results in

{{{#!sql
SELECT DISTINCT orderitem.*
FROM orderitem
LEFT JOIN orderitem_booking_times ON (orderitem_booking_times.orderitem_id
= orderitem.id)
LEFT JOIN bookingtime ON (orderitem_booking_times.bookingtime_id =
bookingtime.id)
ORDER BY bookingtime.date
}}}

can be replaced with

{{{#!python
OrderItem.objects.order_by(Min("booking_times__date"))
}}}

which results in

{{{#!sql
SELECT orderitem.*
FROM orderitem
LEFT JOIN orderitem_booking_times ON (orderitem_booking_times.orderitem_id
= orderitem.id)
LEFT JOIN bookingtime ON (orderitem_booking_times.bookingtime_id =
bookingtime.id)
GROUP BY orderitem.id
ORDER BY MIN(bookingtime.date)
}}}

Since `order_by` is resolved late in the query compilation process it
should be doable to present a deprecation warning that suggests the usage
of `Min` and `Max` instead.
--
Ticket URL: <https://code.djangoproject.com/ticket/35751#comment:6>
Reply all
Reply to author
Forward
0 new messages