{{{
#!python
from django.db import models
class Rectangle(models.Model):
length = models.IntegerField(null=True)
width = models.IntegerField(null=True)
}}}
We make the following queries: Q1) Rectangles that are squares. Q2a)
Rectangles that are not squares (length != width). Q2b) Rectangles that
are not squares (width != length). Queries Q2a and Q2b semantically mean
the same. However, the ORM generates different SQL for these two queries.
{{{
#!python
import django
django.setup()
from django.db.models import F
from testapp.models import Rectangle
print '--- Q1: Get the rectangles that are squares'
print Rectangle.objects.filter(length=F('width')).values('pk').query
print '--- Q2a: Get the rectangles that are not squares'
print Rectangle.objects.exclude(length=F('width')).values('pk').query
print '--- Q2b: Get the rectangles that are not squares'
print Rectangle.objects.exclude(width=F('length')).values('pk').query
}}}
The generated SQL is
{{{
#!sql
--- Q1: Get the rectangles that are squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE
"testapp_rectangle"."length" = ("testapp_rectangle"."width")
--- Q2a: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT
("testapp_rectangle"."length" = ("testapp_rectangle"."width") AND
"testapp_rectangle"."length" IS NOT NULL)
--- Q2b: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT
("testapp_rectangle"."width" = ("testapp_rectangle"."length") AND
"testapp_rectangle"."width" IS NOT NULL)
}}}
Note the asymmetry between Q2a and Q2b. These queries will return
different results.
Reddit user /u/charettes set up this useful SQL fiddle with the above
mentioned schema and test values: http://sqlfiddle.com/#!12/c8283/4 Here's
my reddit post on this issue:
http://www.reddit.com/r/django/comments/2lxjcc/unintuitive_behavior_of_f_expression_with_exclude/
--
Ticket URL: <https://code.djangoproject.com/ticket/23797>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted
Comment:
Unfortunately this will be messy to fix. It is true that .exclude() should
produce the complement of .filter(), and that doesn't happen here.
One possible solution is to use `WHERE ("testapp_rectangle"."length" =
("testapp_rectangle"."width")) IS NOT true`. This should match both null
and false values (the .filter() query could be written as `WHERE
("testapp_rectangle"."length" = ("testapp_rectangle"."width")) IS true`. I
have no idea how well IS NOT true is optimized, and if it is available for
all backends.
Another possibility is to also filter F() NULL values, but as said that
will be messy.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:1>
Comment (by akaariai):
After a quick test, it seems the `WHERE ("testapp_rectangle"."length" =
("testapp_rectangle"."width")) IS NOT true` approach will not work that
well. At least PostgreSQL wont use indexes for a query `select * from
foobar where (id > 1 and id < 20) is true`, but will use the index for
`select * from foobar where (id > 1 and id < 20)`. This tells me
PostgreSQL's optimizer will not handle the is true / is not true
conditions well.
If somebody is able to provide a somewhat clean solution to this, I am all
for fixing this.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:2>
Comment (by mssnlayam):
Is it the case that `filter()` and `exclude()` should produce the
complement of each other? If that is not the case, the obvious queries
would be `WHERE (length = width)` and `WHERE NOT (length = width)`.
If the results should be complimentary, couldn't we use `WHERE (length =
width)` for the filter and `WHERE (length = width)) IS NOT true` for the
exclude. The exclude would be slower, but at least the results would be
consistent.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:3>
Comment (by charettes):
@akaariai what about adding another `IS NOT NULL` clause
[https://github.com/django/django/blob/master/django/db/models/sql/query.py#L1255-L1270
here] based on `value` we should end up with `NOT ((length = width) AND
(length IS NULL) AND (width IS NULL))`?
If the query planer use the index when the generated constraint is `NOT
((length = width) AND (length IS NULL))` then it should also be able to
use it in this case.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:4>
* cc: charettes (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:5>
Comment (by akaariai):
Yes, .exclude() should be the complement of .filter(). As said in
comment:2 the `WHERE (length = width)) IS NOT true` way is just way too
slow at least on PostgreSQL (always a sequential scan according to my
tests). Going for the NOT ((length = width) AND (length IS NULL) AND
(width IS NULL)) should likely work.
I'd like to do this in the Lookup class. We can't do this directly in
as_sql() as we don't have enough context there. Maybe a new method
`get_extra_restriction(self, is_negated)` could work. The method returns
an expression (a Lookup instance or WhereNode instance) that will be ANDed
to the lookup's main condition. I'll try to find the time to write a small
proof of concept.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:6>
Comment (by akaariai):
I have to take back what I said earlier. Unfortunately we have backwards
compatibility requirements for the pre-Lookup classes way of writing
custom lookups. So, we actually don't necessarily have a lookup class at
the point where we want to add the (width IS NULL) condition. So, for now
we have to do this directly in build_filter() method.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:7>
Comment (by shaib):
Just for the record, the `(x=y) is true` syntax is not supported on Oracle
(or Mysql, IIRC). It's a database, it doesn't really handle logic.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:8>
* owner: nobody => Jacob Walls
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:9>
* has_patch: 0 => 1
Comment:
Looks like there was consensus here around the proper query:
NOT ((length = width) AND (length IS NULL) AND (width IS NULL))
as well as its location in `build_query`.
I submitted a patch that builds this query after checking if the
comparison value is an instance of `Col`.
https://github.com/django/django/pull/13118
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:10>
* needs_better_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:11>
* needs_better_patch: 1 => 0
Comment:
Updated patch to address review comments.
It is worth mentioning the patch only addresses the case where both
columns are nullable. If the source column is not nullable and the target
column of the F() expression is nullable, exclude() still does not return
records where the F() expression evaluates to NULL.
I'm open to thoughts on whether this ticket should be left open to address
that unhandled case, or whether another ticket should track it.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:12>
Comment (by Jacob Walls):
Updated the patch to handle the case I mentioned above, so no need to
split tickets.
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:13>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"512da9d585513bcf4aaf977e4ad82c208897a7b1" 512da9d5]:
{{{
#!CommitTicketReference repository=""
revision="512da9d585513bcf4aaf977e4ad82c208897a7b1"
Fixed #23797 -- Fixed QuerySet.exclude() when rhs is a nullable column.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:14>