[Django] #23797: SQL generated for negated F() expressions is incorrect

22 views
Skip to first unread message

Django

unread,
Nov 11, 2014, 4:40:30 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
----------------------------------------------+--------------------
Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 1.7
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Consider the following model definition.

{{{
#!python
from django.db import models

class Rectangle(models.Model):
length = models.IntegerField(null=True)
width = models.IntegerField(null=True)
}}}

We make the following queries: Q1) Rectangles that are squares. Q2a)
Rectangles that are not squares (length != width). Q2b) Rectangles that
are not squares (width != length). Queries Q2a and Q2b semantically mean
the same. However, the ORM generates different SQL for these two queries.

{{{
#!python
import django
django.setup()

from django.db.models import F
from testapp.models import Rectangle

print '--- Q1: Get the rectangles that are squares'
print Rectangle.objects.filter(length=F('width')).values('pk').query

print '--- Q2a: Get the rectangles that are not squares'
print Rectangle.objects.exclude(length=F('width')).values('pk').query

print '--- Q2b: Get the rectangles that are not squares'
print Rectangle.objects.exclude(width=F('length')).values('pk').query
}}}

The generated SQL is

{{{
#!sql
--- Q1: Get the rectangles that are squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE
"testapp_rectangle"."length" = ("testapp_rectangle"."width")
--- Q2a: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT
("testapp_rectangle"."length" = ("testapp_rectangle"."width") AND
"testapp_rectangle"."length" IS NOT NULL)
--- Q2b: Get the rectangles that are not squares
SELECT "testapp_rectangle"."id" FROM "testapp_rectangle" WHERE NOT
("testapp_rectangle"."width" = ("testapp_rectangle"."length") AND
"testapp_rectangle"."width" IS NOT NULL)
}}}

Note the asymmetry between Q2a and Q2b. These queries will return
different results.

Reddit user /u/charettes set up this useful SQL fiddle with the above
mentioned schema and test values: http://sqlfiddle.com/#!12/c8283/4 Here's
my reddit post on this issue:
http://www.reddit.com/r/django/comments/2lxjcc/unintuitive_behavior_of_f_expression_with_exclude/

--
Ticket URL: <https://code.djangoproject.com/ticket/23797>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 11, 2014, 4:54:39 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted


Comment:

Unfortunately this will be messy to fix. It is true that .exclude() should
produce the complement of .filter(), and that doesn't happen here.

One possible solution is to use `WHERE ("testapp_rectangle"."length" =
("testapp_rectangle"."width")) IS NOT true`. This should match both null
and false values (the .filter() query could be written as `WHERE
("testapp_rectangle"."length" = ("testapp_rectangle"."width")) IS true`. I
have no idea how well IS NOT true is optimized, and if it is available for
all backends.

Another possibility is to also filter F() NULL values, but as said that
will be messy.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:1>

Django

unread,
Nov 11, 2014, 4:59:19 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

After a quick test, it seems the `WHERE ("testapp_rectangle"."length" =
("testapp_rectangle"."width")) IS NOT true` approach will not work that
well. At least PostgreSQL wont use indexes for a query `select * from
foobar where (id > 1 and id < 20) is true`, but will use the index for
`select * from foobar where (id > 1 and id < 20)`. This tells me
PostgreSQL's optimizer will not handle the is true / is not true
conditions well.

If somebody is able to provide a somewhat clean solution to this, I am all
for fixing this.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:2>

Django

unread,
Nov 11, 2014, 9:32:32 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by mssnlayam):

Is it the case that `filter()` and `exclude()` should produce the
complement of each other? If that is not the case, the obvious queries
would be `WHERE (length = width)` and `WHERE NOT (length = width)`.

If the results should be complimentary, couldn't we use `WHERE (length =
width)` for the filter and `WHERE (length = width)) IS NOT true` for the
exclude. The exclude would be slower, but at least the results would be
consistent.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:3>

Django

unread,
Nov 11, 2014, 9:35:32 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by charettes):

@akaariai what about adding another `IS NOT NULL` clause
[https://github.com/django/django/blob/master/django/db/models/sql/query.py#L1255-L1270
here] based on `value` we should end up with `NOT ((length = width) AND
(length IS NULL) AND (width IS NULL))`?

If the query planer use the index when the generated constraint is `NOT
((length = width) AND (length IS NULL))` then it should also be able to
use it in this case.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:4>

Django

unread,
Nov 11, 2014, 9:37:43 AM11/11/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by charettes):

* cc: charettes (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:5>

Django

unread,
Nov 12, 2014, 1:41:59 AM11/12/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

Yes, .exclude() should be the complement of .filter(). As said in
comment:2 the `WHERE (length = width)) IS NOT true` way is just way too
slow at least on PostgreSQL (always a sequential scan according to my
tests). Going for the NOT ((length = width) AND (length IS NULL) AND
(width IS NULL)) should likely work.

I'd like to do this in the Lookup class. We can't do this directly in
as_sql() as we don't have enough context there. Maybe a new method
`get_extra_restriction(self, is_negated)` could work. The method returns
an expression (a Lookup instance or WhereNode instance) that will be ANDed
to the lookup's main condition. I'll try to find the time to write a small
proof of concept.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:6>

Django

unread,
Nov 15, 2014, 9:24:41 AM11/15/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by akaariai):

I have to take back what I said earlier. Unfortunately we have backwards
compatibility requirements for the pre-Lookup classes way of writing
custom lookups. So, we actually don't necessarily have a lookup class at
the point where we want to add the (width IS NULL) condition. So, for now
we have to do this directly in build_filter() method.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:7>

Django

unread,
Nov 15, 2014, 9:41:11 AM11/15/14
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------

Reporter: mssnlayam | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by shaib):

Just for the record, the `(x=y) is true` syntax is not supported on Oracle
(or Mysql, IIRC). It's a database, it doesn't really handle logic.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:8>

Django

unread,
Jun 27, 2020, 5:15:46 PM6/27/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: assigned

Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):

* owner: nobody => Jacob Walls
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:9>

Django

unread,
Jun 27, 2020, 5:57:18 PM6/27/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: assigned
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):

* has_patch: 0 => 1


Comment:

Looks like there was consensus here around the proper query:


NOT ((length = width) AND (length IS NULL) AND (width IS NULL))

as well as its location in `build_query`.

I submitted a patch that builds this query after checking if the
comparison value is an instance of `Col`.
https://github.com/django/django/pull/13118

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:10>

Django

unread,
Jul 2, 2020, 12:53:37 AM7/2/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: assigned
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:11>

Django

unread,
Jul 3, 2020, 10:21:33 AM7/3/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: assigned
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):

* needs_better_patch: 1 => 0


Comment:

Updated patch to address review comments.

It is worth mentioning the patch only addresses the case where both
columns are nullable. If the source column is not nullable and the target
column of the F() expression is nullable, exclude() still does not return
records where the F() expression evaluates to NULL.

I'm open to thoughts on whether this ticket should be left open to address
that unhandled case, or whether another ticket should track it.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:12>

Django

unread,
Jul 3, 2020, 6:04:28 PM7/3/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: assigned
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Jacob Walls):

Updated the patch to handle the case I mentioned above, so no need to
split tickets.

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:13>

Django

unread,
Jul 6, 2020, 4:41:48 AM7/6/20
to django-...@googlegroups.com
#23797: SQL generated for negated F() expressions is incorrect
-------------------------------------+-------------------------------------
Reporter: Suriya Subramanian | Owner: Jacob
| Walls
Type: Bug | Status: closed

Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"512da9d585513bcf4aaf977e4ad82c208897a7b1" 512da9d5]:
{{{
#!CommitTicketReference repository=""
revision="512da9d585513bcf4aaf977e4ad82c208897a7b1"
Fixed #23797 -- Fixed QuerySet.exclude() when rhs is a nullable column.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23797#comment:14>

Reply all
Reply to author
Forward
0 new messages