[Django] #31004: Using FilteredRelation on M2M relationship duplicates result rows.

118 views
Skip to first unread message

Django

unread,
Nov 19, 2019, 2:29:38 PM11/19/19
to django-...@googlegroups.com
#31004: Using FilteredRelation on M2M relationship duplicates result rows.
-------------------------------------+-------------------------------------
Reporter: | Owner: nobody
idemchenko-wrk |
Type: Bug | Status: new
Component: Database | Version: 2.2
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Using {{{FilteredRelation}}} on many to many relationship could
unexpectedly duplicate result rows.

Models:
{{{
class Discount(models.Model):
name = models.CharField(max_length=255)
active_till = models.DateTimeField()


class Item(models.Model):
name = models.CharField(max_length=255)
discounts = models.ManyToManyField(Discount)
}}}

Query:
{{{
Item.objects
.annotate(
available_discounts=FilteredRelation(
'discounts', condition=Q(discounts__active_till__gte=now)
)
).annotate(
name_with_discount=Concat(F('name'), F('available_discounts__name'))
)
}}}

SQL:
{{{
SELECT
"sales_item"."id",
"sales_item"."name",
CONCAT("sales_item"."name", available_discounts."name") AS
"name_with_discount"
FROM "sales_item"
LEFT OUTER JOIN "sales_item_discounts"
ON ("sales_item"."id" = "sales_item_discounts"."item_id")
LEFT OUTER JOIN "sales_discount" available_discounts
ON (
"sales_item_discounts"."discount_id" =
available_discounts."id"
AND (available_discounts."active_till" >=
'2019-11-17T21:32:42.501283+00:00'::timestamptz)
);
}}}

The problem is that the intermediate table (sales_item_discounts) joins
with the target table (sales_discount) using the left join.
So even if rows of the target table are filtered out, rows of the
intermediate table are still there, and multiply the resulting rows.

Test:
{{{
class DuplicationTest(TestCase):

def test_row_duplication(self):
now = timezone.now()

active_discounts = [
Discount.objects.create(name='-5%', active_till=now +
timedelta(days=1)),
Discount.objects.create(name='-5$', active_till=now +
timedelta(days=1)),
]

old_discounts = [
Discount.objects.create(name='Whatever', active_till=now -
timedelta(days=1)),
Discount.objects.create(name='Whatever', active_till=now -
timedelta(days=1)),
]

item_w_discounts = Item.objects.create(name='item1')
item_w_discounts.discounts.set(active_discounts + old_discounts)

item_w_old_discounts = Item.objects.create(name='item2')
item_w_old_discounts.discounts.set(old_discounts)

item_wo_discounts = Item.objects.create(name='item3')

items = list(
Item.objects
.annotate(
available_discounts=FilteredRelation(
'discounts',
condition=Q(discounts__active_till__gte=now)
)
).annotate(
name_with_discount=Concat(F('name'),
F('available_discounts__name'))
)
)

self.assertEqual(items.count(item_wo_discounts), 1) # Passed
# expected to see one item, since it does not have active
discounts
self.assertEqual(items.count(item_w_old_discounts), 1) #
AssertionError: 2 != 1
# expected to see two items, one for each active discount
self.assertEqual(items.count(item_w_discounts),
len(active_discounts)) # AssertionError: 4 != 2
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31004>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 19, 2019, 2:30:49 PM11/19/19
to django-...@googlegroups.com
#31004: Using FilteredRelation on M2M relationship duplicates result rows.
-------------------------------------+-------------------------------------
Reporter: idemchenko-wrk | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: FilteredRelation | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by idemchenko-wrk):

* keywords: => FilteredRelation


--
Ticket URL: <https://code.djangoproject.com/ticket/31004#comment:1>

Django

unread,
Nov 20, 2019, 3:47:41 AM11/20/19
to django-...@googlegroups.com
#31004: Using FilteredRelation on M2M relationship duplicates result rows.
-------------------------------------+-------------------------------------
Reporter: idemchenko-wrk | Owner: nobody
Type: Bug | Status: closed

Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution: invalid

Keywords: FilteredRelation | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* status: new => closed
* resolution: => invalid


Comment:

Thanks for this report. You're using a nullable relations i.e. you don't
filter by `available_discounts` that's why we need to use `LEFT OUTER
JOIN` because you want to get items with and without matching discounts.
You can use `distinct()` or split your query. I don't think that's
anything that we can change in Django to make it works better.

Closing per TicketClosingReasons/UseSupportChannels.

--
Ticket URL: <https://code.djangoproject.com/ticket/31004#comment:2>

Reply all
Reply to author
Forward
0 new messages