Re: [Django] #34898: Adding non-deterministic collations to unique CharFields crashes on PostgreSQL.

6 просмотров
Перейти к первому непрочитанному сообщению

Django

не прочитано,
26 окт. 2023 г., 12:25:1326.10.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
--------------------------------------+------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL collation | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
--------------------------------------+------------------------------------

Comment (by Tom Carrick):

Interesting. I looked into it and it is indeed only this case that breaks.

If you add them both at once, it's fine. If you add the collation, then
the unique, it's fine. But if it's unique, then you try to add the
collation, boom.

As you might expect, this also happens for `db_index`.

However, it doesn't happen when using `Meta` `indexes` / `constraints`
unless you have `opclasses=["varchar_pattern_ops"]`, and if you remove
this in the same migration as adding the collation, this also seems to
work fine (I'm not sure if by luck of because Django first removes the
constrain, then does other stuff, then adds the constraint purposefully).

Given that `db_index` is discouraged in the docs... and I'd make the same
argument for doing the same for `unique=True`, maybe a note in the docs
about this is enough?

I didn't look very hard to find a solution, but I think it should be
possible by adding stuff to
`django.db.backends.postgresql.schema.DatabaseSchemaEditor._alter_field`.
I'll take a look.

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

не прочитано,
26 окт. 2023 г., 12:25:1926.10.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned

Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Tom Carrick):

* owner: nobody => Tom Carrick
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:2>

Django

не прочитано,
30 окт. 2023 г., 14:30:1430.10.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tom Carrick):

I looked into it a bit.

I think handling this in the `schema` module is a really bad idea, as we
would be dropping and creating an index or constraint without the user's
knowledge. I don't see another way to handle it.

The next option is to detect this when running `makemigrations`. I know
very little about how `makemigrations` works so I may be totally wrong
here but it seems like at the moment it doesn't really know anything about
the database state. If we were to fix this bug there, we'd have to add a
query to determine if the collation is deterministic. It would also need
to have some knowledge of Postgres that seems not to be known anywhere
else in this part of the code.

So I again come back to this: I think we should document this and not try
to fix it in code, and perhaps also further discourage use of
`unique=True` and `db_index=True`.

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:3>

Django

не прочитано,
15 нояб. 2023 г., 03:36:2515.11.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:3 Tom Carrick]:


> I think handling this in the `schema` module is a really bad idea, as we
would be dropping and creating an index or constraint without the user's
knowledge. I don't see another way to handle it.

I don't think it's that bad. We've already don't create
`varchar_pattern_ops`/`text_pattern_ops` indexes when non-deterministic
`db_collation` is set, so removing them before adding a collation would be
somehow consistent with the current behavior.

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:4>

Django

не прочитано,
15 нояб. 2023 г., 03:51:3415.11.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tom Carrick):

I agree that it's consistent. What I don't think is a good idea though, is
dropping and recreating the entire index. This could take a fairly long
time on some databases, and it can be quite surprising that a change to
the collation would cause another operation that takes some time.

I can have a look though. Maybe we can at least do it concurrently.

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:5>

Django

не прочитано,
15 нояб. 2023 г., 03:57:0715.11.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:5 Tom Carrick]:


> I agree that it's consistent. What I don't think is a good idea though,
is dropping and recreating the entire index. This could take a fairly long
time on some databases, and it can be quite surprising that a change to
the collation would cause another operation that takes some time.
>
> I can have a look though. Maybe we can at least do it concurrently.

Can we recreate them? For now, we completely skip


`varchar_pattern_ops`/`text_pattern_ops` indexes when non-deterministic

`db_collation` is set. Is it not enough to remove them before adding a
collation?

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:6>

Django

не прочитано,
15 нояб. 2023 г., 05:19:4815.11.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Tom Carrick):

The opposite case works fine because there's no index yet. The user is
adding an index, so it's expected that the index gets created, and of
course we need it without `varchar_pattern_ops`.

In this case though, the user already has a `varchar_pattern_ops` index.
Postgres offers no way to alter an existing index, it's drop and recreate
only, so we need to drop the index, change the collation, then add the
index again but without `varchar_patterm_ops`. My concern here is not that
it's difficult, it's that we're performing potentially expensive
operations without informing the user about them (as they don't show up in
the migration file, or anywhere else). Ideally the user should know they
can't use `varchar_pattern_ops` with a non-deterministic collation, but
`db_index` and `unique` are quite opaque about what is actually happening,
relative to the more explicit `Meta.indexes` and `Meta.constraints`, so
it's hard for them to know (unless it's very explicitly documented
somewhere) what the consequences are of changing the collation in this
scenario.

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:7>

Django

не прочитано,
15 нояб. 2023 г., 06:18:5315.11.2023
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Ah, right. TBH, the current `NotSupportedError: nondeterministic
collations are not supported for operator class "varchar_pattern_ops"`
error is not so bad. Maybe it's enough to add a warning in docs (as you
suggested).

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:8>

Django

не прочитано,
3 мая 2024 г., 10:03:293 мая
– django-...@googlegroups.com
#34898: Adding non-deterministic collations to unique CharFields crashes on
PostgreSQL.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Tom
| Carrick
Type: Bug | Status: assigned
Component: Migrations | Version: 4.2
Severity: Normal | Resolution:
Keywords: PostgreSQL | Triage Stage: Accepted
collation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jacob Walls):

* cc: Jacob Walls (added)

--
Ticket URL: <https://code.djangoproject.com/ticket/34898#comment:9>
Ответить всем
Отправить сообщение автору
Переслать
0 новых сообщений