Re: [Django] #35194: Postgres 16.2 with _iexact leads to IndeterminateCollation

Django

Apr 11, 2024, 7:00:20 PM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution: needsinfo
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)
* severity: Normal => Release blocker
* stage: Unreviewed => Accepted

Comment:

Re-opening as a release blocker for 5.0 as it's a bug in a new feature
(`GeneratedField`), after some sleuthing on #35368.

The culprit is [https://www.postgresql.org/docs/release/15.6/ this change
in Postgres >= 12.18, 13.14, 14.11, 15.6, 16.2, released on 2024-02-08].

> Fix function volatility checking for GENERATED and DEFAULT expressions
(Tom Lane)
>
> These places could fail to detect insertion of a volatile function
default-argument expression, or decide that a polymorphic function is
volatile although it is actually immutable on the datatype of interest.
**This could lead to improperly rejecting or accepting a GENERATED clause,
or to mistakenly applying the constant-default-value optimization in ALTER
TABLE ADD COLUMN**.
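
For anyone checking whether a given server includes this change, here is a
minimal sketch based only on the version list above. The helper name and the
lookup table are hypothetical. It assumes the integer `server_version_num`
format used by PostgreSQL 10 and later (major * 10000 + minor), and it assumes
that majors newer than 16 include the change while majors older than 12 do not.

```python
# First minor release per major that ships the volatility-check change,
# taken from the list quoted in this ticket.
FIRST_MINOR_WITH_CHANGE = {12: 18, 13: 14, 14: 11, 15: 6, 16: 2}


def has_volatility_check_change(server_version_num: int) -> bool:
    """Return True if the server version is at or past the listed releases.

    server_version_num is the integer form used by PostgreSQL 10+,
    e.g. 160002 for 16.2 or 140011 for 14.11.
    """
    major, minor = divmod(server_version_num, 10000)
    if major in FIRST_MINOR_WITH_CHANGE:
        return minor >= FIRST_MINOR_WITH_CHANGE[major]
    # Assumption: newer majors include the change, older ones never got it.
    return major > max(FIRST_MINOR_WITH_CHANGE)
```

For example, `has_volatility_check_change(140010)` is false for 14.10 and
`has_volatility_check_change(140011)` is true for 14.11.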

In essence the problem seems similar to #34955: `UPPER` can return
different values depending on the collation and thus is not immutable
per se?

I've tried to come up with a workaround but I'm not sure what should be
done. The following doesn't work either:

`UPPER("text" COLLATE "C") COLLATE "C" = UPPER('test' COLLATE "C") COLLATE
"C"`

so it's possible there might be a bug on the Postgres side as well? In any
case, keeping this ticket open should bring visibility to the issue.

A workaround in the meantime is likely to use explicit collations.
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

Apr 12, 2024, 4:20:29 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* resolution: needsinfo =>
* status: closed => new

--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:5>

Django

Apr 12, 2024, 4:23:13 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"73b62a21265c4a417004d64d13a896469e2558f3" 73b62a21]:
{{{#!CommitTicketReference repository=""
revision="73b62a21265c4a417004d64d13a896469e2558f3"
Refs #35194 -- Adjusted a generated field test to work on Postgres 15.6+.

Postgres >= 12.18, 13.14, 14.11, 15.6, 16.2 changed the way the
immutability
of generated and default expressions is detected in
postgres/postgres@743ddaf.

The adjusted test semantic is preserved by switching from __icontains to
__contains as both make use of a `%` literal which requires proper
escaping.

Refs #35336.

Thanks bcail for the report.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:6>

Django

Apr 12, 2024, 9:01:17 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce <42296566+sarahboyce@…>):

In [changeset:"5d95a1c35ef1375a7badcb217c36c5974d1e57ee" 5d95a1c]:
{{{#!CommitTicketReference repository=""
revision="5d95a1c35ef1375a7badcb217c36c5974d1e57ee"
[5.0.x] Refs #35194 -- Adjusted a generated field test to work on Postgres
15.6+.

Postgres >= 12.18, 13.14, 14.11, 15.6, 16.2 changed the way the
immutability
of generated and default expressions is detected in
postgres/postgres@743ddaf.

The adjusted test semantic is preserved by switching from __icontains to
__contains as both make use of a `%` literal which requires proper
escaping.

Refs #35336.

Thanks bcail for the report.

Backport of 73b62a21265c4a417004d64d13a896469e2558f3 from main.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:7>

Django

Apr 17, 2024, 8:38:00 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Sarah Boyce):

One option I have found (could be a bad idea) is to revert some of #3575.
That was an optimisation where `ILIKE` was dropped in favour of
`UPPER(field) LIKE UPPER('blah')`.
If we use `ILIKE`, I no longer get an error here. I guess the question is
whether that optimisation still holds: #3575 was implemented many years ago
and the performance of Postgres in this case may have moved on.

Looks a bit like:

{{{
diff --git a/django/db/backends/postgresql/base.py b/django/db/backends/postgresql/base.py
index e97ab6aa89..4e3f7b3658 100644
--- a/django/db/backends/postgresql/base.py
+++ b/django/db/backends/postgresql/base.py
@@ -154,7 +154,7 @@ class DatabaseWrapper(BaseDatabaseWrapper):
         "exact": "= %s",
         "iexact": "= UPPER(%s)",
         "contains": "LIKE %s",
-        "icontains": "LIKE UPPER(%s)",
+        "icontains": "ILIKE %s",
         "regex": "~ %s",
         "iregex": "~* %s",
         "gt": "> %s",
@@ -163,8 +163,8 @@ class DatabaseWrapper(BaseDatabaseWrapper):
         "lte": "<= %s",
         "startswith": "LIKE %s",
         "endswith": "LIKE %s",
-        "istartswith": "LIKE UPPER(%s)",
-        "iendswith": "LIKE UPPER(%s)",
+        "istartswith": "ILIKE %s",
+        "iendswith": "ILIKE %s",
     }
 
     # The patterns below are used to generate SQL pattern lookup clauses when
diff --git a/django/db/backends/postgresql/operations.py b/django/db/backends/postgresql/operations.py
index 4b179ca83f..af2463b1d6 100644
--- a/django/db/backends/postgresql/operations.py
+++ b/django/db/backends/postgresql/operations.py
@@ -172,10 +172,6 @@ class DatabaseOperations(BaseDatabaseOperations):
             else:
                 lookup = "%s::text"
 
-        # Use UPPER(x) for case-insensitive lookups; it's faster.
-        if lookup_type in ("iexact", "icontains", "istartswith", "iendswith"):
-            lookup = "UPPER(%s)" % lookup
-
         return lookup
 
     def no_limit_value(self):
diff --git a/tests/schema/tests.py b/tests/schema/tests.py
index 3a2947cf43..182e3486e0 100644
--- a/tests/schema/tests.py
+++ b/tests/schema/tests.py
@@ -913,7 +913,7 @@ class SchemaTests(TransactionTestCase):
             editor.create_model(GeneratedFieldContainsModel)
 
         field = GeneratedField(
-            expression=Q(text__contains="foo"),
+            expression=Q(text__icontains="FOO"),
             db_persist=True,
             output_field=BooleanField(),
         )
}}}
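
To make the effect of the diff above concrete, here is a rough sketch of the
SQL shape before and after it for the three lookups whose operators change.
It is a simplified stand-in and not the backend's actual code path: the
function and dictionary names are hypothetical, and only the `%s::text` branch
of the cast is modelled.

```python
# Operator templates copied from the diff above (before and after).
CURRENT_OPERATORS = {
    "icontains": "LIKE UPPER(%s)",
    "istartswith": "LIKE UPPER(%s)",
    "iendswith": "LIKE UPPER(%s)",
}
PROPOSED_OPERATORS = {
    "icontains": "ILIKE %s",
    "istartswith": "ILIKE %s",
    "iendswith": "ILIKE %s",
}


def render_lookup(column, lookup_type, operators, wrap_in_upper):
    """Render '<lhs> <operator> <placeholder>' for a case-insensitive lookup."""
    lhs_template = "%s::text"
    if wrap_in_upper:
        # Mirrors the block removed from operations.py in the diff.
        lhs_template = "UPPER(%s)" % lhs_template
    lhs = lhs_template % column
    # Keep the parameter placeholder in the output instead of inlining a value.
    rhs = operators[lookup_type] % "%s"
    return f"{lhs} {rhs}"


before = render_lookup('"text"', "icontains", CURRENT_OPERATORS, True)
after = render_lookup('"text"', "icontains", PROPOSED_OPERATORS, False)
```

Here `before` is `UPPER("text"::text) LIKE UPPER(%s)` and `after` is
`"text"::text ILIKE %s`, so the generated column expression no longer calls
`UPPER` at all.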
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:8>

Django

Apr 18, 2024, 4:09:39 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* has_patch: 0 => 1

Comment:

It's an idea, we might get a better idea 👍
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:9>

Django

Apr 18, 2024, 8:15:03 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

It is effectively a solution but I'm not convinced this will do more good
than harm.

#3575 was merged 16 years ago. This means that between then and now
thousands of projects were created that added a functional index on
`UPPER("col")` to make `i(exact|contains|startswith)` use an index, and the
moment they upgrade to a minor version of 5.0 their database will start
running slow queries as their indices will be unusable.
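
The index concern can be illustrated with a small, runnable sketch. It uses
SQLite from the Python standard library purely as a stand-in, since it also
supports indexes on expressions; PostgreSQL's planner and its `ILIKE` operator
behave differently in detail, so treat this as an analogy and not as evidence
about Postgres. The table and index names are made up.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
# Functional index of the kind projects add for case-insensitive lookups.
conn.execute("CREATE INDEX book_upper_title ON book (upper(title))")

# The query repeats the indexed expression, so the index can be used.
upper_plan = query_plan(
    conn, "SELECT id FROM book WHERE upper(title) = upper('django')"
)
# The query no longer contains upper(title), so that index cannot match.
like_plan = query_plan(conn, "SELECT id FROM book WHERE title LIKE 'django'")
```

Comparing `upper_plan` and `like_plan` shows the first one naming
`book_upper_title` while the second falls back to scanning the table.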

On the other hand we have a bug in a newly introduced feature for a very
particular use case that might be affecting only a few users (must use
generated field, must be on a latest version of Postgres, must use
`i(exact|contains|startswith)`).

I appreciate the intent to solve this issue but I think we need to dig
deeper to truly understand ''why'' this is happening before jumping to
conclusions, as there is no true urgency to get things right here. The
''release blocker'' assignment is self-imposed and nothing prevents us
from deferring a solution to this problem to a future 5.0 release if we
can't understand why this is happening before the May release. For all
we know it might be a bug in Postgres itself.

I tried reaching out on libera.chat#postgres IRC to get an answer but no
one could answer me there (first time this happens), so I was planning to
reach out to their mailing list this week. I might run out of time, so
if someone feels comfortable doing so please do.

To summarize, I think we should understand why this is happening before
taking any potentially harmful action here. For all we know many other
functions and lookups could be affected and this is just the tip of the
iceberg.
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:10>

Django

Apr 18, 2024, 9:08:13 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* cc: David Sanders (added)

Comment:

David, I think you have successfully engaged with the PostgreSQL team/devs
in the past, leading to productive conversations. Would you have some
availability to reach out to them again to seek their assistance in
debugging this specific issue we're encountering with PostgreSQL >= 12.18,
13.14, 14.11, 15.6, 16.2?
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:11>

Django

Apr 18, 2024, 9:16:57 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* has_patch: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:12>

Django

May 3, 2024, 3:45:07 AM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* resolution: => needsinfo
* severity: Release blocker => Normal
* stage: Accepted => Unreviewed
* status: new => closed

Comment:

It is currently not clear whether the bug is on the Postgres side or not.
Until this is clear, we can keep the ticket in the "needsinfo" state as it
best represents its current status.
Reopening the ticket did bring it to the attention of others and has
helped us look into this a little more, but even in a closed state, the
ticket is still "visible"/searchable.
We can reopen once we better understand the root of the problem and it's
clear that Django needs to implement changes.
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:13>

Django

Sep 17, 2024, 1:57:38 PM
to django-...@googlegroups.com
#35194: Postgres 16.2 with _iexact leads to IndeterminateCollation
-------------------------------------+-------------------------------------
Reporter: Aldalen | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 5.0
(models, ORM) |
Severity: Normal | Resolution: needsinfo
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Adrian Garcia):

We are also running into this issue after upgrading from Postgres 14.10 to
14.11. As a workaround, we replaced `iexact` and `icontains` in generated
fields with similar `iregex` queries.
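
A minimal sketch of what such a replacement could look like, not necessarily
what was done in the project above: the helper names are hypothetical, and
`re.escape` targets Python's regex syntax, so the escaped pattern should be
double-checked against PostgreSQL's regex rules for values containing
punctuation.

```python
import re


def iexact_as_iregex(value: str) -> str:
    """Build a pattern for an `iregex` lookup that mimics `iexact`."""
    # Anchor both ends so only the whole string matches.
    return "^" + re.escape(value) + "$"


def icontains_as_iregex(value: str) -> str:
    """Build a pattern for an `iregex` lookup that mimics `icontains`."""
    # No anchors: the escaped literal may appear anywhere in the string.
    return re.escape(value)
```

The resulting strings would then be passed to an `iregex` lookup in place of
the original value, for example `Q(text__iregex=icontains_as_iregex("foo"))`
instead of `Q(text__icontains="foo")`.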
--
Ticket URL: <https://code.djangoproject.com/ticket/35194#comment:14>