#35539: SearchVector GinIndex raises IMMUTABLE error
----------------------------------+------------------------------------
Reporter: Alastair D'Silva | Owner: (none)
Type: Bug | Status: closed
Component: contrib.postgres | Version: 5.0
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------+------------------------------------
Changes (by Simon Charette):
* resolution: => invalid
* status: new => closed
Comment:
Blaming the surrounding code leads to #30488 and #30385 which have plenty
of context. It also has ties to #34955.
The summary is that `SearchVector` was designed to support any field and
expression thrown at it even the ones that are not of textual type and /
or are nullable. This encouraged the usage of Postgres `CONCAT` function
over the `||` operator as the former properly deals with both
{{{#!sql
psql (16.3, server 15.6 (Debian 15.6-1.pgdg120+2))
Type "help" for help.
django=# SELECT CONCAT('foo', NULL, 1);
concat
--------
foo1
(1 row)
django=# SELECT 'foo' || NULL || 1;
?column?
----------
(1 row)
}}}
The problem with `CONCAT` however is that it produces non-`IMMUTABLE`
output (the resulting data can change based on some connection settings
that relate to collation for example) and thus is not suitable in index
creation which creates a pickle about how `NULL`-able values should be
dealt with.
----
With all that said this ticket has little to do with this complexity. If
you comment out the code reported to ''solve the issue'' by the reporter
your test still fails Sarah and the reason is simple: `to_tsvector` is not
`IMMUTABLE` unless a `regconfig` is specified
[
https://www.postgresql.org/docs/current/textsearch-tables.html
#TEXTSEARCH-TABLES-INDEX which is covered at length in the PostgreSQL
documentation].
> Notice that the 2-argument version of `to_tsvector` is used. '''Only
text search functions that specify a configuration name can be used in
expression indexes''' (Section 11.7). This is because the index contents
must be unaffected by `default_text_search_config`. If they were affected,
the index contents might be inconsistent because different entries could
contain `tsvectors` that were created with different text search
configurations, and there would be no way to guess which was which. It
would be impossible to dump and restore such an index correctly.
So for the same reason `Concat` cannot be used in indices because it
relies on alterable global configuration `SearchVector` cannot be used in
indices without specifying a `config` because they do not use the same
''stop-words''
{{{#!sql
django=# SELECT to_tsvector('english', 'The Web framework for
perfectionists with deadlines.');
to_tsvector
-----------------------------------------------------
'deadlin':7 'framework':3 'perfectionist':5 'web':2
(1 row)
django=# SELECT to_tsvector('french', 'The Web framework for
perfectionists with deadlines.');
to_tsvector
------------------------------------------------------------------------------
'deadlin':7 'for':4 'framework':3 'perfectionist':5 'the':1 'web':2
'with':6
(1 row)
}}}
I'm not sure how the reporter got that removing these particular lines of
code would help but in its current form the report appears to me to be
invalid.
--
Ticket URL: <
https://code.djangoproject.com/ticket/35539#comment:3>