#36494: Various failures in JSONField lookups when using expressions in right-hand
side
-------------------------------------+-------------------------------------
Reporter: Jacob Walls | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):
* resolution: wontfix =>
* status: closed => new
* summary: Various failures for JSONField lookups and right-hand side
subqueries =>
Various failures in JSONField lookups when using expressions in right-
hand side
Old description:
> After #35972, I worked on a
> [
https://github.com/jacobtylerwalls/django/pull/2/files test] to verify
> support for `JSONField` lookups against right-hand side subqueries.
> Support is mostly complete on Postgres, but largely incomplete on other
> backends.
>
> I'm suggesting we close this ticket (in separate PRs) with a combination
> of adding support where possible (perhaps for `__range`?), with failing
> at the python layer with more appropriate exceptions, or possibly adding
> a feature flag to guard some failures. I agree with the
> [
https://code.djangoproject.com/ticket/31779#comment:3 sentiment] that we
> cannot maintain endless feature flags for database quirks or insulate
> users from every bug in their database vendor, but some of these failures
> seem to me to be on the Django side.
>
> Postgres:
> 1. `__range`
> ([
https://github.com/django/django/blob/6df19412aabb7d969f5eab4b2ff41269de89b233/django/db/models/lookups.py#L649
> source]):
> {{{#!py
> django.db.utils.ProgrammingError: the query has 2 placeholders but 3
> parameters were passed
> }}}
> 2. `__has_keys` or `__has_any_keys`:
> {{{#!py
> File "/django/source/django/db/models/fields/json.py", line 278, in
> get_prep_lookup
> return [str(item) for item in self.rhs]
> ^^^^^^^^
> TypeError: 'Subquery' object is not iterable
> }}}
> ----
> SQLite & MySQL:
> 1. `__range`: same as Postgres
> 2. `__has_keys` or `__has_any_keys`: same as Postgres
> 3. `__has_key`:
> {{{#!py
> File "/django/source/django/db/models/fields/json.py", line 186, in
> compile_json_path_final_key
> return ".%s" % json.dumps(key_transform)
> ^^^^^^^^^^^^^^^^^^^^^^^^^
> TypeError: Object of type Subquery is not JSON serializable
> }}}
> 4. `__gt`, `__gte`, `__lt`, `__lte`:
> {{{#!py
> File "/django/source/django/db/models/fields/json.py", line 610, in
> process_rhs
> rhs_params = [json.loads(value) for value in rhs_params]
> ^^^^^^^^^^^^^^^^^
> json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
> }}}
> 5. `__exact`:
> {{{#!py
> sqlite3.OperationalError: malformed JSON
> }}}
> ----
> Additional failure on MySQL:
> 6. `__in`:
> {{{#!py
> django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't
> yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
> }}}
> ----
> Oracle:
> 1. `__range`: same as others (database exception)
> 2. `__has_keys` or `__has_any_keys`: same as others
> 3. `__has_key`: same as SQLite & MySQL
> 4. Certain text lookups (`__icontains`, `__startswith`, `__istartswith`,
> `__endswith`, `__iendswith`): return wrong results, they appear to return
> any results having the key on the left-hand side regardless of the
> contents of the right-hand side, refer to the modifications commented out
> in the linked test.
> ----
> This is all pretty niche, but if the context helps, I could envision
> realistic use cases for `__has_keys=queryset`, as I work on a project
> that uses stringified UUIDs identifying other entities as JSON keys.
New description:
After #35972, I worked on a
[
https://github.com/jacobtylerwalls/django/commit/3c56106e4c60639f274c4acbfb582f6d1251051b
test] to verify support for `JSONField` lookups against right-hand side
subqueries. Support is mostly complete on Postgres, but largely incomplete
on other backends.
I'm suggesting we close this ticket (in separate PRs) with a combination
of adding support where possible (perhaps for `__range`?), with failing at
the python layer with more appropriate exceptions, or possibly adding a
feature flag to guard some failures. I agree with the
[
https://code.djangoproject.com/ticket/31779#comment:3 sentiment] that we
cannot maintain endless feature flags for database quirks or insulate
users from every bug in their database vendor, but some of these failures
seem to me to be on the Django side.
Postgres:
1. `__range`
([
https://github.com/django/django/blob/6df19412aabb7d969f5eab4b2ff41269de89b233/django/db/models/lookups.py#L649
source]):
{{{#!py
django.db.utils.ProgrammingError: the query has 2 placeholders but 3
parameters were passed
}}}
2. `__has_keys` or `__has_any_keys`:
{{{#!py
File "/django/source/django/db/models/fields/json.py", line 278, in
get_prep_lookup
return [str(item) for item in self.rhs]
^^^^^^^^
TypeError: 'Subquery' object is not iterable
}}}
----
SQLite & MySQL:
1. `__range`: same as Postgres
2. `__has_keys` or `__has_any_keys`: same as Postgres
3. `__has_key`:
{{{#!py
File "/django/source/django/db/models/fields/json.py", line 186, in
compile_json_path_final_key
return ".%s" % json.dumps(key_transform)
^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: Object of type Subquery is not JSON serializable
}}}
4. `__gt`, `__gte`, `__lt`, `__lte`:
{{{#!py
File "/django/source/django/db/models/fields/json.py", line 610, in
process_rhs
rhs_params = [json.loads(value) for value in rhs_params]
^^^^^^^^^^^^^^^^^
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
}}}
5. `__exact`:
{{{#!py
sqlite3.OperationalError: malformed JSON
}}}
----
Additional failure on MySQL:
6. `__in`:
{{{#!py
django.db.utils.NotSupportedError: (1235, "This version of MySQL doesn't
yet support 'LIMIT & IN/ALL/ANY/SOME subquery'")
}}}
----
Oracle:
1. `__range`: same as others (database exception)
2. `__has_keys` or `__has_any_keys`: same as others
3. `__has_key`: same as SQLite & MySQL
4. Certain text lookups (`__icontains`, `__startswith`, `__istartswith`,
`__endswith`, `__iendswith`): return wrong results, they appear to return
any results having the key on the left-hand side regardless of the
contents of the right-hand side, refer to the modifications commented out
in the linked test.
----
This is all pretty niche, but if the context helps, I could envision
realistic use cases for `__has_keys=queryset`, as I work on a project that
uses stringified UUIDs identifying other entities as JSON keys.
--
Comment:
Simon suggested bracketing `Subquery` for the moment and just focusing on
expression support generally.
With just `F()`...
{{{#!py
def test_lookups_using_expression(self):
contains_lookups = {"contains", "contained_by"}
lookups_not_matching = {"gt", "lt", "has_key"}
text_lookups = {
"has_key",
"iexact",
"icontains",
"startswith",
"istartswith",
"endswith",
"iendswith",
"regex",
"iregex",
}
for lookup in JSONField.get_lookups():
with self.subTest(lookup=lookup):
if lookup == "isnull":
continue # not allowed, rhs must be a literal
boolean.
if (
lookup in contains_lookups
and not
connection.features.supports_json_field_contains
):
continue
if lookup in text_lookups:
rhs = KT("value__bar")
else:
rhs = F("value__bar")
qs =
NullableJSONModel.objects.filter(**{f"value__bar__{lookup}": rhs})
if lookup in lookups_not_matching:
self.assertIs(qs.exists(), False)
else:
# might need tweaking
self.assertQuerySetEqual(qs,
NullableJSONModel.objects.filter(value__has_key="bar"))
}}}
... we see three classes of failures:
1. `gt`/`gte`/`lt`/`lte`
{{{#!py
File "/Users/jwalls/django/django/db/models/fields/json.py", line 614,
in process_rhs
rhs_params = [json.loads(value) for value in rhs_params]
~~~~~~~~~~^^^^^^^
File "/Users/jwalls/cpython/Lib/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
~~~~~~~~~~~~~~~~~~~~~~~^^^
File "/Users/jwalls/cpython/Lib/json/decoder.py", line 345, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/jwalls/cpython/Lib/json/decoder.py", line 363, in
raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
}}}
2. `range`
{{{#!py
django.db.utils.OperationalError: near "AND": syntax error
}}}
3. `has_keys`/`has_any_keys`
{{{#!py
File "/Users/jwalls/django/django/db/models/fields/json.py", line 278,
in get_prep_lookup
return [str(item) for item in self.rhs]
^^^^^^^^
TypeError: 'KeyTransform' object is not iterable
}}}
----
> I'm suggesting we close this ticket (in separate PRs) with a combination
of adding support where possible (perhaps for __range?), with failing at
the python layer with more appropriate exceptions,
I think we should invite PRs per lookup to either raise a better exception
(case 2 above is especially poor, sending malformed SQL) or add support if
the effort is not massive. (At Django on the Med 2025 I made a small
experiment with adding `has_any_keys` support on PostgreSQL: seemed
slightly promising?)
All of the commits would probably be "Refs ...", and then I imagine
closing this ticket manually once we decide the remaining edge cases are
too difficult to implement.
Going through the new features features process *could* help ensure that
"reviewer resources" are equitably competed for in this case, but I can't
think of many other reasons to go through that process. I think in the
absence of any glaring warning in the docs, my expectation is that all of
JSONField's registered lookups work with expressions. I doubt we would add
a release note.
--
Ticket URL: <
https://code.djangoproject.com/ticket/36494#comment:2>
Django <
https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.