Example:
{{{similar_recipes = at_least_four_shared_ingredients |
same_tag_and_at_least_two_shared_ingredients}}}
{{{same_tag_and_at_least_two_shared_ingredients}}} contains entries but
{{{similar_recipes}}} doesn't.
You can reproduce the problem here: https://github.com/Inglorious-
Inge/Kochbuch2/tree/queryset_problem
The test in {{{recipes/tests/test_models.py}}} is failing.
The code under test is in {{{recipes/models.py}}} line 30-45.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Comment (by René Fleschenberg):
As you can see on the main branch of the repository linked above by the
reporter, the query works as expected if it is rewritten to use {{{Q}}}
objects:
{{{#!python
return Recipe.objects.annotate(
shared_ingredients=Count("ingredients")
).filter(has_shared_ingredients).filter(
Q(same_tag, shared_ingredients__gte=2)
| Q(shared_ingredients__gte=4)
).exclude(id=self.id)
}}}
Thanks to CodenameTim on the Django Discord for helping to debug this!
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:1>
* cc: Kbleser (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:2>
* status: new => closed
* resolution: => invalid
Comment:
Following the latest comment and after confirming the test from the repo
is no longer failing, I'll mark this bug as invalid since it was really a
support request.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:3>
* status: closed => new
* resolution: invalid =>
Comment:
Natalia, I disagree. It is not a support request (the reporters problem
was solved by using a different API), but a possible bug in the ORM.
Combining two querysets with {{{|}}} where at least one of them is non-
empty should not result in an empty queryset, IMO. If it can (when?), we
should probably document this.
It also seems to me that the behaviour does not match the note in the
[https://docs.djangoproject.com/en/4.2/ref/models/querysets/#or docs]
about equivalency with {{{Q}}} objects?
If we can't agree here on whether this is a bug or not, should we start an
ML discussion?
BTW, on Discord, there has been another report about {{{qs1 | qs2}}} and
{{{qs2 | qs1}}} not being equivalent, which they should be according to
the docs. This is probably outside of the scope of this ticket, but I
thought it's worth mentioning here (maybe we can fix both issues in one
go).
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:4>
Comment (by Mariusz Felisiak):
Replying to [comment:4 René Fleschenberg]:
> BTW, on Discord, there has been another report about {{{qs1 | qs2}}} and
{{{qs2 | qs1}}} not being equivalent, which they should be according to
the docs. This is probably outside of the scope of this ticket, but I
thought it's worth mentioning here (maybe we can fix both issues in one
go).
Queries may **not** be the same according to
[https://docs.djangoproject.com/en/dev/ref/models/querysets/#or docs],
check out #33319:
> ''"| is not a commutative operation, as different (though equivalent)
queries may be generated."''
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:5>
Comment (by René Fleschenberg):
Replying to [comment:5 Mariusz Felisiak]:
> Replying to [comment:4 René Fleschenberg]:
> > BTW, on Discord, there has been another report about {{{qs1 | qs2}}}
and {{{qs2 | qs1}}} not being equivalent, which they should be according
to the docs. This is probably outside of the scope of this ticket, but I
thought it's worth mentioning here (maybe we can fix both issues in one
go).
>
> Queries may **not** be the same according to
[https://docs.djangoproject.com/en/dev/ref/models/querysets/#or docs],
check out #33319:
> > ''"| is not a commutative operation, as different (though equivalent)
queries may be generated."''
I understand "equivalent" as "they will return the same set of results".
Is that not what the docs mean here?
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:6>
Comment (by Mariusz Felisiak):
Replying to [comment:4 René Fleschenberg]:
> Natalia, I disagree. It is not a support request (the reporters problem
was solved by using a different API), but a possible bug in the ORM.
Combining two querysets with {{{|}}} where at least one of them is non-
empty should not result in an empty queryset, IMO. If it can (when?), we
should probably document this.
>
> It also seems to me that the behaviour does not match the note in the
[https://docs.djangoproject.com/en/4.2/ref/models/querysets/#or docs]
about equivalency with {{{Q}}} objects?
Queryset that you
[https://code.djangoproject.com/ticket/34728?replyto=4#comment:1 proposed]
is not the same query that user used in a sample project, I'm not sure how
the ORM could figure it out from:
{{{
Recipe.objects.filter(
Q(ingredients__in=self.ingredients.all()
).annotate(
shared_ingredients=Count("ingredients")
).filter(shared_ingredients__gte=4)
|
Recipe.objects.filter(
Q(tags__in=self.tags.all()) & Q(ingredients__in=self.ingredients.all()
).annotate(
shared_ingredients=Count("ingredients")
).filter(shared_ingredients__gte=2)
}}}
Docs only describe how the OR operator works by rule. The ORM cannot
analyze user queries and rewrite them.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:7>
Comment (by René Fleschenberg):
Replying to [comment:7 Mariusz Felisiak]:
> Docs only describe how the OR operator works by rule. The ORM cannot
analyze user queries and rewrite them.
I understand that this is probably not easy (maybe not possible) for the
ORM to solve. Maybe I am wrong here, but I think you would expect {{{qs1 |
qs2}}} to return the same set of results as {{{set(qs1) | set(qs2)}}}? Not
sure what to do about it, though.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:8>
Comment (by Mariusz Felisiak):
Replying to [comment:8 René Fleschenberg]:
> I understand that this is probably not easy (maybe not possible) for the
ORM to solve. Maybe I am wrong here, but I think you would expect {{{qs1 |
qs2}}} to return the same set of results as {{{set(qs1) | set(qs2)}}}? Not
sure what to do about it, though.
For this we'd have to use `UNION` instead of combining filters. Queryset
from the ticket description is really complicated (from the structure
point of view) with overlap filters, aggregations, filters on aggregated
values, etc. My point is that you cannot describe it as `Model.filter(x) |
Model.filter(y)` as it contains aggregations and filters on them. Docs
don't precise what would happen when you use the `OR` operator on
`Model.filter(x1).annotate(aggregation).filter(x2) |
Model.filter(y1).annotate(aggregation).filter(y2)`. If you want to merge
results of two independent querysets you should use `.union()`.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:9>
* status: new => closed
* resolution: => needsinfo
Comment:
I'm taking a look at this again. For the sake of clarity, René or the
reporter, could you please paste here the test method (or any other needed
diff) that produce a failure? I've cloned the repo and played a bit with
the existing `find_similar_recipes` and I'm not being able to reproduce
the issue. Thanks!
Regarding:
> If we can't agree here on whether this is a bug or not, should we start
an ML discussion?
I think this may be the best path forward!
>BTW, on Discord, there has been another report about qs1 | qs2 and qs2 |
qs1 not being equivalent, which they should be according to the docs.
Do you have an example or link to the discussion at hand? Perhaps we could
clarify in the docs what "equivalent" mean, in that, given that `|` is not
commutative, it may be the case that the generated `SQL` return different
results but still equivalent, no? :thinking:
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:10>
Comment (by René Fleschenberg):
Hi,
Replying to [comment:10 Natalia Bidart]:
> I'm taking a look at this again. For the sake of clarity, René or the
reporter, could you please paste here the test method (or any other needed
diff) that produce a failure? I've cloned the repo and played a bit with
the existing `find_similar_recipes` and I'm not being able to reproduce
the issue. Thanks!
I suspect you looked at the {{{main}}} branch, which doesn't show the
problem :) Sorry, I should have explicitly mentioned the branch here. If
you checkout the branch {{{queryset_problem}}} and run the tests, you
should see that the test in {{{recipes/tests/test_models.py}}} is failing:
https://github.com/Inglorious-
Inge/Kochbuch2/blob/queryset_problem/recipes/tests/test_models.py
(Not pasting the test setup here for the sake of brevity).
{{{#!python
def test_find_similar_recipes(self):
similar_recipes = self.recipe.find_similar_recipes()
self.assertEquals(len(similar_recipes), 1)
}}}
The reason is that in {{{recipes/models.py}}} on line 39
(https://github.com/Inglorious-
Inge/Kochbuch2/blob/queryset_problem/recipes/models.py#L39C4-L39C106), {{{
at_least_four_shared_ingredients |
same_tag_and_at_least_two_shared_ingredients}}} results in a queryset that
turns out empty when evaluated. If you use a debugger to take a look at
the two querysets being ORed together, you should see that
{{{same_tag_and_at_least_two_shared_ingredients}}} by itself evaluates to
a non-empty queryset.
> >BTW, on Discord, there has been another report about qs1 | qs2 and qs2
| qs1 not being equivalent, which they should be according to the docs.
>
> Do you have an example or link to the discussion at hand? Perhaps we
could clarify in the docs what "equivalent" mean, in that, given that `|`
is not commutative, it may be the case that the generated `SQL` return
different results but still equivalent, no? :thinking:
I really don't know what "equivalent" would mean in regards to SQL queries
if they do not return the same set of rows :) I suspect we thought so far
that equivalency would be guaranteed but it is actually not (if the report
on Discord is correct). I'll try to find the discussion on Discord and
open a separate ticket for this, since I think it is a separate different
issue.
Thanks!
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:11>
Comment (by René Fleschenberg):
Just for the record, the Discord discussion is here:
https://discord.com/channels/856567261900832808/857642132423704577/1126286062642266223
I'll try to contact isik-kaplan and ask them if they already filed a
ticket, or if they can provide a reproducible example.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:12>
* cc: Simon Charette (added)
Comment:
Replying to [comment:11 René Fleschenberg]:
> I suspect you looked at the {{{main}}} branch, which doesn't show the
problem :) Sorry, I should have explicitly mentioned the branch here. If
you checkout the branch {{{queryset_problem}}} and run the tests, you
should see that the test in {{{recipes/tests/test_models.py}}} is failing:
https://github.com/Inglorious-
Inge/Kochbuch2/blob/queryset_problem/recipes/tests/test_models.py
Excellent! I was able to reproduce the failure as reported. What I also
noticed is that without the annotation, the OR behaves as expected (though
I understand the goal of the annotation in this case).
I've been trying to debug further by re-writting the queries in
`find_similar_recipes`, and then checking results and the generated SQL. I
can't detect anything obviously wrong so my next step would be to craft
the SQL I would use if I'd be building this by hand, but I ran out of time
today.
I'll CC Simon in this ticket, I believe they may have a clearer view of
what's expected/supported and what's not. Personally, I do find the
outcome a bit unexpected, so I share the feeling, but the experts will
know more.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:13>
Comment (by Simon Charette):
TL;DR not an issue with queryset combination but another ticket that
relates to how the ORM deals with aggregation when more than one multi-
valued relationship is involved. Either ''invalid'' or ''duplicate'' to
me.
---
> As you can see on the main branch of the repository linked above by the
reporter, the query works as expected if it is rewritten to use Q objects
This happens to work because you perform the aggregation ''before''
filtering the multi-valued relationship which prevents JOIN reuse
[https://docs.djangoproject.com/en/4.2/topics/db/aggregation/#order-of-
annotate-and-filter-clauses as documented] and results in you joining
twice against against the same multi-valued relationship
{{{#!sql
SELECT "recipes_recipe"."id",
"recipes_recipe"."title",
"recipes_recipe"."date_posted",
"recipes_recipe"."created_by_id",
"recipes_recipe"."image",
"recipes_recipe"."preparation_time_in_minutes",
"recipes_recipe"."instructions",
"recipes_recipe"."level",
"recipes_recipe"."serving_size",
COUNT("recipes_ingredienttorecipe"."ingredient_id_id") AS
"shared_ingredients"
FROM "recipes_recipe"
LEFT OUTER JOIN "recipes_ingredienttorecipe" ON ("recipes_recipe"."id" =
"recipes_ingredienttorecipe"."recipe_id_id")
INNER JOIN "recipes_ingredienttorecipe" T4 ON ("recipes_recipe"."id" =
T4."recipe_id_id")
LEFT OUTER JOIN "recipes_tagtorecipe" ON ("recipes_recipe"."id" =
"recipes_tagtorecipe"."recipe_id_id")
WHERE (T4."ingredient_id_id" IN
(SELECT U0."id"
FROM "recipes_ingredient" U0
INNER JOIN "recipes_ingredienttorecipe" U1 ON (U0."id" =
U1."ingredient_id_id")
WHERE U1."recipe_id_id" = 1)
AND NOT ("recipes_recipe"."id" = 1))
GROUP BY "recipes_recipe"."id",
"recipes_recipe"."title",
"recipes_recipe"."date_posted",
"recipes_recipe"."created_by_id",
"recipes_recipe"."image",
"recipes_recipe"."preparation_time_in_minutes",
"recipes_recipe"."instructions",
"recipes_recipe"."level",
"recipes_recipe"."serving_size",
"recipes_tagtorecipe"."tag_id_id"
HAVING (("recipes_tagtorecipe"."tag_id_id" IN
(SELECT U0."id"
FROM "recipes_tag" U0
INNER JOIN "recipes_tagtorecipe" U1 ON (U0."id" =
U1."tag_id_id")
WHERE U1."recipe_id_id" = 1)
AND COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 2)
OR COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 4);
}}}
Notice how `recipes_ingredienttorecipe` is joined twice which means
`COUNT("recipes_ingredienttorecipe"."ingredient_id_id") AS
"shared_ingredients"` will actually be the `COUNT` of the
[http://charettes.name/djangoconus2022/slides.html#21 product of rows]
`2X2=4` (see #10060). In other words, aggregate annotations when many
multi-valued relationships are involved are broken and have been for while
and in your case it's the
`COUNT("recipes_ingredienttorecipe"."ingredient_id_id") >= 4)` that is
allowing your test to pass on `main`. The fact this problems manifests
itself in queryset combination when filtering against broken aggregate
annotation is just a symptom of a much larger problem.
If you define one of your aggregate annotations as a subquery to avoid
joining more than one multi-valued relationship at once (#28296) things
properly work
{{{#!python
def find_similar_recipes(self):
return (
Recipe.objects.annotate(
shared_ingredients_cnt=Count(
"ingredients",
filter=Q(ingredients__in=self.ingredients.all()),
),
# Using a subquery to target the _other_ multi-valued
relationship avoids X product of rows
# with ingredients
shares_tag=Exists(
TagToRecipe.objects.filter(
tag_id__in=self.tags.all(),
recipe_id=OuterRef("pk"),
)
),
)
.filter(
Q(shared_ingredients_cnt__gte=4)
| Q(shares_tag=True, shared_ingredients_cnt__gte=2)
)
.exclude(id=self.id)
)
}}}
Which can also use queryset combination without issues
{{{#!python
def find_similar_recipes(self):
base = Recipe.objects.annotate(
shared_ingredients_cnt=Count(
"ingredients",
filter=Q(ingredients__in=self.ingredients.all()),
)
).exclude(id=self.id)
same_tag_and_at_least_two_shared_ingredients = base.annotate(
shares_tag=Exists(
TagToRecipe.objects.filter(
tag_id__in=self.tags.all(),
recipe_id=OuterRef("pk"),
)
)
).filter(Q(shares_tag=True, shared_ingredients_cnt__gte=2))
at_least_four_shared_ingredients =
base.filter(shared_ingredients_cnt__gte=4)
return (
same_tag_and_at_least_two_shared_ingredients |
at_least_four_shared_ingredients
)
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:14>
* resolution: needsinfo => duplicate
Comment:
Thank you Simon for the detailed and clear explanation. I think this is
more accurately closed as duplicate of #10060.
--
Ticket URL: <https://code.djangoproject.com/ticket/34728#comment:15>