I would expect `qs1` and `qs2` to be equivalent in the following code:
{{{#!python
class Application(models.Model):
pass
class Score(models.Model):
application = models.ForeignKey(Application, on_delete=models.PROTECT)
reviewed = models.BooleanField()
a1 = Application.objects.create()
Score.objects.create(reviewed=False, application=a1)
Score.objects.create(reviewed=True, application=a1)
a2 = Application.objects.create()
qs1 = Application.objects.annotate(
needs_review=Case(
When(~Q(score__reviewed=True), then=V(True)),
default=V(False),
output_field=BooleanField()
)
).filter(needs_review=True)
qs2 = Application.objects.filter(
~Q(score__reviewed=True)
)
print(qs1) # <QuerySet [<Application: Application object (1)>,
<Application: Application object (2)>]>
print(qs2) # <QuerySet [<Application: Application object (2)>]>
assert set(qs1) == set(qs2)
}}}
The identical `~Q` expression is behaving differently in the context of
`Case/When` and `filter`. Am I completely missing something and this is
expected behavior?
--
Ticket URL: <https://code.djangoproject.com/ticket/29291>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
Old description:
New description:
{{{#!python
class Application(models.Model):
pass
a2 = Application.objects.create()
assert set(qs1) == set(qs2)
}}}
This is using Django 2.0.4.
--
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:1>
Comment (by Simon Charette):
Could you possible provide the SQL queries generated by these querysets
and which database you a querying against? You can retrieve them by doing
`str(queryset.query)`.
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:2>
Comment (by Bo Marchman):
This is on Postgres 9.6.8.
For `qs1`:
{{{
SELECT "test_app_application"."id",
CASE
WHEN NOT ("test_app_score"."reviewed" = TRUE
AND "test_app_score"."reviewed" IS NOT NULL) THEN
TRUE
ELSE FALSE
END AS "needs_review"
FROM "test_app_application"
LEFT OUTER JOIN "test_app_score" ON ("test_app_application"."id" =
"test_app_score"."application_id")
WHERE CASE
WHEN NOT ("test_app_score"."reviewed" = TRUE
AND "test_app_score"."reviewed" IS NOT NULL) THEN TRUE
ELSE FALSE
END = TRUE
}}}
And `qs2`:
{{{
SELECT "test_app_application"."id"
FROM "test_app_application"
WHERE NOT ("test_app_application"."id" IN
(SELECT U1."application_id"
FROM "test_app_score" U1
WHERE U1."reviewed" = TRUE))
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:3>
* version: 2.0 => master
* stage: Unreviewed => Accepted
Comment:
I feel like `qs2` is correct and that `q1` should be using a subquery as
well to correctly deal with SQL's boolean logic with regards to `NULL`.
{{{#!python
SELECT "test_app_application"."id",
CASE
WHEN NOT ("test_app_application"."id" IN
(SELECT U1."application_id"
FROM "test_app_score" U1
WHERE U1."reviewed" = TRUE)) THEN TRUE
ELSE FALSE
END as "needs_review"
FROM "test_app_application"
WHERE CASE
WHEN NOT ("test_app_application"."id" IN
(SELECT U2."application_id"
FROM "test_app_score" U2
WHERE U2."reviewed" = TRUE)) THEN TRUE
ELSE FALSE
END = TRUE
}}}
I'm not sure how to convert it into a subquery pushdown though.
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:4>
Comment (by rjsparks):
I think this is just a consequence of the way annotate works on a reverse
ForeignKey rather than being a bug in the conditional expression.
{{{
In [53]: Application.objects.annotate(
...: needs_review=Case(
...: When(~Q(score__reviewed=True), then=V(True)),
...: default=V(False),
...: output_field=BooleanField()
...: )
...: )
Out[53]: <QuerySet [<Application: Application object (1)>, <Application:
Application object (1)>, <Application: Application object (2)>]>
In [54]: Application.objects.annotate(
...: needs_review=Case(
...: When(~Q(score__reviewed=True), then=V(True)),
...: default=V(False),
...: output_field=BooleanField()
...: )
...: ).values_list('needs_review',flat=True)
Out[54]: <QuerySet [True, False, True]>
}}}
Or more simply:
{{{
In [55]:
Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
...: '))
Out[55]: <QuerySet [<Application: Application object (1)>, <Application:
Application object (1)>, <Application: Application object (2)>]>
In [56]:
Application.objects.annotate(score_has_been_reviewed=F('score__reviewed
...: ')).values_list('score_has_been_reviewed',flat=True)
Out[56]: <QuerySet [False, True, None]>
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:5>
Comment (by Can Sarıgöl):
as I can see, the problem is null db objects can use in where clause
properly. If your columns is nullable, you have to manage it when you use
in select or where clauses. q1 query might be changed like this:
{{{
qs1 = Application.objects.annotate(
needs_review=Case(
When(~Q(Q(score__reviewed=True) |
Q(score__reviewed__isnull=False)), then=V(True)),
default=V(False),
output_field=BooleanField()
)
).filter(needs_review=True)
}}}
if we want to make a query like q2, it could be like this:
{{{
qs1 = Application.objects.filter(
~Q(id__in=Score.objects.annotate(
needs_review = Case(
When(Q(reviewed=True) | Q(reviewed__isnull=False),
then=F('application_id')),
output_field=IntegerField()
)
).values_list('needs_review', flat=True))
)
}}}
at this case, we can get rid of the score table's column or join query.
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:6>
Comment (by Simon Charette):
It's also interesting to see how `.filter(~(Q(score__reviewed=True) |
Q(score__reviewed=None))` behaves
{{{#!sql
SELECT "application"."id"
FROM "application"
WHERE NOT (("application"."id" IN
(SELECT U1."application_id"
FROM "score" U1
WHERE U1."reviewed" = TRUE)
OR "application"."id" IN
(SELECT U0."id"
FROM "application" U0
LEFT OUTER JOIN "score" U1 ON (U0."id" =
U1."application_id")
WHERE (U1."reviewed" IS NULL
AND U0."id" = ("application"."id")))))
}}}
I think the most appropriate solutions at this point is to adjust `When`
SQL compilation to reuse the `Query._add_q` logic. That will make sure the
subquery pushdown is performed if required and make sure implementations
don't diverge.
This should be favorable over implementing a new algorithm specific to
`When` given how `.filter` is already heavily testing the handling of
exclusions across nullable relationships
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:7>
Comment (by Can Sarıgöl):
I dont agree with you. With one time left join and without subquery
solution is pretty performd and enough. Orm doesnt have to solve Null case
clauses but user have to.
--
Ticket URL: <https://code.djangoproject.com/ticket/29291#comment:8>