[Django] #27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.

21 views
Skip to first unread message

Django

unread,
Mar 27, 2017, 5:43:47 AM3/27/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek | Owner: nobody
Glowacki |
Type: Bug | Status: new
Component: Database | Version: master
layer (models, ORM) | Keywords: sql None NULL
Severity: Normal | transform
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
This is a little similar to #25946, but different angle.

Case Study:

We define a custom Field which converts emptystring to `None` before
sending to DB and then `None` back to emptystring when reading from DB:

WHY (optional reading):
For want of a CharField that enforces uniqueness on non-empty elements
only. `'' == ''` in the DB, but `NULL != NULL`. We don't want to just
allow passing None into it in the code though. Hence NulledCharField.

Example Implementation:
{{{
class NulledCharField(models.CharField):
description = _("CharField that stores NULL for emptystrings but
returns ''.")

def from_db_value(self, value, expression, connection, context):
if value is None:
return ''
else:
return value

def to_python(self, value):
if isinstance(value, models.CharField):
return value
if value is None:
return ''
return value

def get_prep_value(self, value):
if value is '':
return None
else:
return value
}}}

This works for the most part, but not with filtering!
I took a look at the SQL, and `Foo.objects.filter(bar='')` resolves to
`... WHERE "Foo"."bar" = NULL; args=(None,)`
Compare this to `Foo.objects.filter(bar=None)`, which resolves to `...
WHERE "Foo"."bar" IS NULL', ()`

The reason for this is that the `=None` -> `__isnull=True` conversion
happens before any transformation of the value.

Specifically, in django/db/models/sql/query.py,
`Query.build_filter()` calls on `self.prepare_lookup_value()` (line1139)
before it calls `self.build_lookup()` (line 1187).
The former does the isnull replacement, while the latter transforms the
value.

This also applies (more severely) to the converse: If my
`get_prep_value()` were to for some reason convert `None` to something
else, then this would get ignored as Django would just decide to use the
`IsNull` lookup off the bat.

Solution seems like we just need to rearrange the order here a little.
Happy to give it a go, but if anyone has context as to whether it's maybe
this way by design, do tell. The above cases would at the very least serve
as good failing tests if we think this kind of custom field support is
appropriate.

--
Ticket URL: <https://code.djangoproject.com/ticket/27985>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Mar 27, 2017, 1:03:03 PM3/27/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham):

* stage: Unreviewed => Accepted


Comment:

I haven't investigated, but please provide a patch and I'll take a closer
look at it.

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:1>

Django

unread,
Mar 28, 2017, 10:50:01 AM3/28/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Jarek Glowacki):

[https://github.com/django/django/compare/master...jarekwg:ticket_27985
Here's a failing testcase] for now; will have a crack at writing a patch
soon..

Also realised it gets even messier for the `.exclude()` case, where it
[https://github.com/django/django/blob/master/django/db/models/sql/query.py#L1214-L1229
adds an extra query].
This causes a mutually exclusive pair of conditions, that'll always
evaluate to false.
It does this:
`Foo.objects.exclude(bar='')` -> `... WHERE NOT ("foo"."bar" = NULL AND
"foo"."bar" IS NOT NULL)`
Instead of just
`Foo.objects.exclude(bar=None)` -> `... WHERE NOT ("foo"."bar" IS NULL)`

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:2>

Django

unread,
Mar 28, 2017, 10:50:50 AM3/28/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Jarek
| Glowacki
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jarek Glowacki):

* owner: nobody => Jarek Glowacki
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:3>

Django

unread,
Apr 5, 2017, 7:41:28 PM4/5/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: (none)
Type: Bug | Status: new

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jarek Glowacki):

* owner: Jarek Glowacki => (none)
* status: assigned => new


Comment:

Yeahh, nope.
The current workflow is too convoluted to just slot in a patch for this.

I couldn't find a way to untangle the ordering such that it'd cover this
use case without breaking multiple other use cases.
I'd suggest taking this ticket under consideration when someone comes
round to refactor this file.

Specifically, the functions `build_filter` and `prepare_lookup_value` need
to be broken down into more managable bits. It's evident they've had extra
conditional steps shoehorned in over time, which has made them very
difficult to work with.


For anyone who comes across this in the future, all we need to address
this case is ensure that this snippet of code from `prepare_lookup_value`:
{{{
# Interpret '__exact=None' as the sql 'is NULL'; otherwise, reject
all
# uses of None as a query value.
if value is None:
if lookups[-1] not in ('exact', 'iexact'):
raise ValueError("Cannot use None as a query value")
return True, ['isnull'], used_joins
}}}
is evaluated AFTER the field's `get_prep_value` has processed the value.
Here are the failing tests again:
[https://github.com/django/django/compare/master...jarekwg:ticket_27985
Failing tests]

And here's a bandaid workaround that hacks two lookups and the form field
to get around the problem.
{{{
class NulledCharField(models.CharField):
"""
This is required when we want uniqueness on a CharField whilst also
allowing it to be empty.
In the DB, '' == '', but NULL != NULL. So we store emptystring as NULL
in the db, but treat it
as emptystring in the code.
"""
description = _("CharField that stores emptystring as NULL but returns
''.")

def from_db_value(self, value, expression, connection, context):
if value is None:
return ''
else:
return value

def to_python(self, value):
if isinstance(value, models.CharField):
return value
if value is None:
return ''
return value

def get_prep_value(self, value):
if value is '':
return None
else:
return value

class NulledCharField(forms.CharField):
""" A form field for the encompassing model field. """
def clean(self, value):
if value == '':
return None
return value

def formfield(self, form_class=None, **kwargs):
return
super().formfield(form_class=self.__class__.NulledCharField, **kwargs)


@NulledCharField.register_lookup
class NulledCharExactLookup(Exact):
""" Generate ISNULL in the `exact` lookup, because Django won't use
the `isnull` lookup correctly. """
lookup_name = 'exact'

def as_sql(self, compiler, connection):
sql, params = compiler.compile(self.lhs)
if self.rhs in ('', None):
return "%s IS NULL" % sql, params
else:
return super().as_sql(compiler, connection)


@NulledCharField.register_lookup
class NulledCharIsNullLookup(IsNull):
"""
Doing something like Foo.objects.exclude(bar='') has django trying to
shove
an extra isnull check where it doesn't need one. We solve this by just
crippling
this lookup altogether. We don't need it as all ISNULL checks that we
actually
do want, we generate in the `exact` lookup. :(
"""
lookup_name = 'isnull'
prepare_rhs = False

def as_sql(self, compiler, connection):
sql, params = compiler.compile(self.lhs)
return "TRUE", params
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:4>

Django

unread,
Jul 28, 2017, 3:10:36 PM7/28/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sergey Fedoseev):

* status: new => assigned

* owner: (none) => Sergey Fedoseev


--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:5>

Django

unread,
Jul 31, 2017, 2:32:33 PM7/31/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sergey Fedoseev):

* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/8831 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:6>

Django

unread,
Jul 31, 2017, 5:51:28 PM7/31/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"58da81a5a372a69f0bac801c412b57f3cce5f188" 58da81a5]:
{{{
#!CommitTicketReference repository=""
revision="58da81a5a372a69f0bac801c412b57f3cce5f188"
Fixed #27985 -- Fixed query for __exact=value when get_prep_value()
converts value to None.

Also fixed crash of .filter(field__transform=None).
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:7>

Django

unread,
Dec 12, 2017, 10:40:05 PM12/12/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"10bfa876be59feec24bb6a40fa11bece808ee405" 10bfa876]:
{{{
#!CommitTicketReference repository=""
revision="10bfa876be59feec24bb6a40fa11bece808ee405"
Refs #27985 -- Reallowed using __exact=None as an alias for __isnull=True
if a custom lookup class with lookup_name != None is registered as the
exact lookup.

Regression in 58da81a5a372a69f0bac801c412b57f3cce5f188 and prerequisite
for refs #28896.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:8>

Django

unread,
Dec 12, 2017, 10:40:17 PM12/12/17
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tim Graham <timograham@…>):

In [changeset:"a5c60404476124c682c996bfb1eb077d59f1ec53" a5c6040]:
{{{
#!CommitTicketReference repository=""
revision="a5c60404476124c682c996bfb1eb077d59f1ec53"
[2.0.x] Refs #27985 -- Reallowed using __exact=None as an alias for


__isnull=True if a custom lookup class with lookup_name != None is
registered as the exact lookup.

Regression in 58da81a5a372a69f0bac801c412b57f3cce5f188 and prerequisite
for refs #28896.

Backport of 10bfa876be59feec24bb6a40fa11bece808ee405 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:9>

Django

unread,
Feb 19, 2018, 5:38:09 PM2/19/18
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Jon Dufresne):

This change caused a regression for my project. Here is a simplified
example:

{{{
class Group(models.Model):
pass

class User(models.Model):
groups = models.ManyToManyField('Group')

class MyTest(TestCase):
def test_unsaved(self):
Group.objects.create()
u = User()
self.assertFalse(Group.objects.filter(user=u).exists())
}}}

This test passes on 1.11 but fails on 2.0.

On 1.11, passing an ''unsaved'' model to a M2M filter would result in an
`INNER JOIN`. A bisect shows that since commit
a5c60404476124c682c996bfb1eb077d59f1ec53, it now results in a `LEFT OUTER
JOIN`.

At this stage, only looking for clarification on what is correct to see if
this change in behavior is intentional. Is this now correct as a `LEFT
JOIN` or should this be considered a regression for this specific use?

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:10>

Django

unread,
Feb 19, 2018, 6:55:09 PM2/19/18
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


Comment:

John, I feel like using a `LOUTER` join is the appropriate behavior here.

I assume the previous query was

{{{#!sql
SELECT *
FROM group
INNER JOIN user_groups ON (user_groups.group_id = group.id)
WHERE user_groups.user_id = NULL
}}}

And the new one is

{{{#!sql
SELECT *
FROM group
LEFT OUTER JOIN user_groups ON (user_groups.group_id = group.id)
WHERE user_groups.user_id IS NULL
}}}

Using `m2m__isnull=True` (or `reverse_fk__isnull=True`) always triggered
this `LOUTER` AFAIK. I guess what broke here was to make `m2m=None` ->
`m2m__isnull=True` which seems more appropriate given how SQL deals with
`= NULL`.

I guess we could get related and reverse related field `__exact` lookups
to warn/error out when an unsaved object is passed as a value though, it
looks like this was undefined behavior if it wasn't caught by the test
suite. An other option would be to make the related
`__exact=unsaved_object` lookup raise `EmptyResultSet` on compilation to
prevent any query from taking place and restore the 1.11 behavior.

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:11>

Django

unread,
Feb 20, 2018, 12:00:46 AM2/20/18
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Here's a PR implementing the `EmptyResultSet` approach that could be used
in a 2.0 backport.

https://github.com/django/django/compare/master...charettes:ticket-27985

I really think we should deprecating this path in 2.1+ though given the
ambiguity of the lookup when `instance._state.adding`, I feel like this
would be inline with the changes to prevent
`instance.m2m.add(unsaved_instance)`.

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:12>

Django

unread,
Feb 20, 2018, 10:42:34 AM2/20/18
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Jon Dufresne):

Thanks Simon.

> I assume the previous query was ...

Yes, that is right. And looking at this again, now, I think the `=` in `=
NULL` was certainly wrong or unintended. So it would seem I was relying on
undefined behavior and got lucky.

> Here's a PR implementing the EmptyResultSet approach that could be used

in a 2.0 backport. ... I really think we should deprecating this path in
2.1+ though

Please don't feel the need to add a workaround or hack for my project. Now
that it is identified, I'm happy to fix my buggy code to rely only on
defined behavior.

Or are you suggesting that in a future version of Django, `RelatedExact`
raise an exception when used with an unsaved object?

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:13>

Django

unread,
Feb 20, 2018, 11:06:55 AM2/20/18
to django-...@googlegroups.com
#27985: Converting `Foo.objects.filter(bar=None)` to an `IsNull` too early.
-------------------------------------+-------------------------------------
Reporter: Jarek Glowacki | Owner: Sergey
| Fedoseev
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: sql None NULL | Triage Stage: Accepted
transform |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> Please don't feel the need to add a workaround or hack for my project.
Now that it is identified, I'm happy to fix my buggy code to rely only on
defined behavior.
>
> Or are you suggesting that in a future version of Django, `RelatedExact`
raise an exception when used with an unsaved object?

While it was undefined I'm kind of shocked we didn't have any tests for
this. I'm not sure if you added this test on purpose in your suite but I
could see it happening often in cases where users mix ''model factories''
with the ORM in tests.

I feel like preventing unsaved objects from being used in all related
filters in future versions of Django would be the best course of action
here. Surprisingly erroring on `if value._state.adding` in
[https://github.com/django/django/blob/master/django/db/models/fields/related_lookups.py#L28
get_normalized_values] yields quite a few test failures in the suite.
Nothing that looks unsolvable though.

--
Ticket URL: <https://code.djangoproject.com/ticket/27985#comment:14>

Reply all
Reply to author
Forward
0 new messages