[Django] #34955: Make available the string concatenation operator `||` for PostgreSQL

44 views
Skip to first unread message

Django

unread,
Nov 8, 2023, 3:43:50 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo | Owner: nobody
Melchiorre |
Type: New | Status: new
feature |
Component: Database | Version: dev
layer (models, ORM) | Keywords: field, database,
Severity: Normal | generated, output_field
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
The `django.db.models.functions.Concat` database ORM function in the
PostgreSQL database backend use the `CONCAT` SQL function.

Unfortunately, the `CONCAT` function in PostgreSQL is not `IMMUTABLE` and
cannot be used in functional indexes or generated fields.

For example, if you use `Concat("first_name", Value(" "), "last_name")` in
a `GeneratedField` you get the following error:

{{{
#!python
psycopg.errors.InvalidObjectDefinition: generation expression is not
immutable
}}}

I propose to make the string concatenation operator `||` available in the
PostgreSQL database backend in addition to the already available string
concatenation function `CONCAT`.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 8, 2023, 5:21:21 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

I'm not convinced, there are other built-in functions that could be
implemented differently and currently cannot be used in `GeneratedField`s,
e.g. `SHA512`, `SHA384`, `SHA224`, `SHA1`, or `MD5` on Oracle. I'm
pessimistic about adding multiple new functions for such cases. This may
open a can of worms. The current thread is to keep Django a core
framework, not providing every utility which might be useful.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:1>

Django

unread,
Nov 8, 2023, 6:28:18 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: Bug | Status: new

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Paolo Melchiorre):

* type: New feature => Bug


Comment:

Replying to [comment:1 Mariusz Felisiak]:


> I'm not convinced, there are other built-in functions that could be
implemented differently and currently cannot be used in `GeneratedField`s

The string concatenation operator `||` is not an exotic function specific
to PostgreSQL, but an operator already used in other database backends, so
your concern about the possible increase in requests to add new database
functions to databases does not apply.

I would add that I would have liked to open this issue since the
introduction of functional indexes in Django 3.2, but I never did so due
to lack of time.

I decided to open this issue now after seeing that the choice to use the
non-immutable concatenation function `CONCAT` instead of the immutable
concatenation operator `||` in the PostgreSQL backend also affects the
case of `GeneratedFields`.

> I'm pessimistic about adding multiple new functions for such cases. This
may open a can of worms. The current thread is to keep Django a core
framework, not providing every utility which might be useful.

Among other things, the string concatenation example is the most used to
explain Generated Fields (e.g [https://fly.io/django-beats/new-goodies-in-
django-50/#generated-fields fly.io], [https://djangochat.com/episodes/git-
and-django-50-adam-johnson djangochat.com],
[https://www.paulox.net/2023/11/07/database-generated-columns-part-1
-django-and-sqlite/#the-user-model paulox.net], ...), but, due to this bug
in the PostgreSQL backend database, it only works on a subset of Django
backend databases, unlike the current thread of keeping Django a core
framework, as you reiterated.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:2>

Django

unread,
Nov 8, 2023, 7:54:50 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new

Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* type: Bug => New feature


Comment:

> ... databases does not apply.

I don't understand why TBH. Do you want to say that it doesn't apply
because it's PostgreSQL and we should treat it specially? I would argue
with that.

This is definitively not a bug. `Concat()` is implemented this way from
the very beginning and database limitations for `GeneratedField` are
[https://docs.djangoproject.com/en/dev/ref/models/fields/#generatedfield
documented].

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:3>

Django

unread,
Nov 8, 2023, 8:03:26 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Moreover, there is no easy way to implement `Concat()` on PostgreSQL such
that it uses ` || `, is immutable, and work the same way as now, because
it uses `COALESCE()` which is also considered mutable by PostgreSQL. I
understand that you want to use new goodies on PostgreSQL but it's a
database limitation, not a Django fault.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:4>

Django

unread,
Nov 8, 2023, 9:16:06 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Paolo Melchiorre):

Replying to [comment:4 Mariusz Felisiak]:
> ... I understand that you want to use new goodies on PostgreSQL ...

This isn't about **me** wanting to play with the latest Django features
with PostgreSQL, I solved the problem in my code some time ago by writing
a function that uses `||` instead of `CONCAT`.

The problem, however, remains non-expert users to whom we reiterate that
the generated fields are compatible with all supported databases, and then
the first example we do, uses strings concatenations, that we already know
will not work with at least one of these databases out-of-the-box.

> Moreover, there is no easy way to implement `Concat()` on PostgreSQL
such that it uses ` || `, is immutable, and work the same way as now,
because it uses `COALESCE()` which is also considered mutable by

PostgreSQL ... it's a database limitation, not a Django fault.

On the contrary, in PostgreSQL, you can concatenate strings into a
generated column without any problem.

See for example:

{{{
#!sql
CREATE TABLE "samples_fullnames" (
"id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
"first_name" varchar(150) NOT NULL,
"last_name" varchar(150) NOT NULL,
"full_name" text GENERATED ALWAYS AS (
"first_name"::text || ' '::text || "last_name"::text
) STORED
);

}}}

But also: ''(mimicking the SQL code generated by the ORM for SQLite)''

{{{
#!sql
CREATE TABLE "samples_names" (
"id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
"first_name" varchar(150) NOT NULL,
"last_name" varchar(150) NOT NULL,
"full_name" text GENERATED ALWAYS AS (
COALESCE(("first_name")::text, '') || COALESCE(COALESCE(('
')::text, '') || COALESCE(("last_name")::text, ''), '')
) STORED
);
}}}

If there is a limitation I would say it lies in how the Django ORM
generates the SQL code for PostgreSQL.

{{{
#!sql
CREATE TABLE "samples_user" (
"id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
"first_name" varchar(150) NOT NULL,
"last_name" varchar(150) NOT NULL,
"full_name" text GENERATED ALWAYS AS (
CONCAT(("first_name")::text, (CONCAT((' ')::text,
("last_name")::text))::text)
) STORED
);
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:5>

Django

unread,
Nov 8, 2023, 9:26:40 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Paolo Melchiorre):

Replying to [comment:3 Mariusz Felisiak]:


> > ... databases does not apply.
>
> I don't understand why TBH.

''You quoted my sentence so much that I had to go and reread it :-D''

> Do you want to say that it doesn't apply because it's PostgreSQL and we
should treat it specially? I would argue with that.

My intention was not to ask for special treatment for PostgreSQL, but on
the contrary to ask that it not be the only one treated differently in
this particular case.

I tried to explain that the `||` operator is not a function, but an
operator, and that among other things it has already been used in other
supported backend databases, so requesting that it ALSO be used in
PostgreSQL doesn't seem right to me. a special request.


> This is definitively not a bug. `Concat()` is implemented this way from
the very beginning and database limitations for `GeneratedField` are
[https://docs.djangoproject.com/en/dev/ref/models/fields/#generatedfield
documented].

I don't know the reasons for using `CONCAT` instead of `||`, but I
reiterate that this choice was already a problem for functional indexes,
and now generated fields have been added. It therefore seems to me that
that choice can at least be reevaluated.

''Note: I realize that explaining myself in writing in English could be a
limitation of mine, and in case what I write is misunderstood or arrogant,
I reiterate here that this is not my intention. :-)''

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:6>

Django

unread,
Nov 8, 2023, 10:01:14 AM11/8/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by David Sanders):

Throwing some fuel on the fire with a few fyi points

- Firstly just want to make it clear that pg marking concat() immutable
is a feature (by design) not a bug :D (it's "stable" but "immutable"
because settings can still affect the output)
- Apparently `||` is ANSI (TIL)
- Even though it is ANSI you still have to set mode `PIPES_AS_CONCAT` [1]
on MySQL
- `Concat()` doesn't actually use `Coalesce()` for pg because it doesn't
need it, NULLs are ignored with `CONCAT()` on pg [2]
- Unfortunately we can't change `Concat()` to use `||` because on pg it
_does_ evaluate NULLs. If it were to use `||` then we _would_ need to make
use of `Coalesce()` to make it consistent.

I'm kinda hoping though that we do get an immutable concat for pg because
concatenating strings seems to be a common desire, at least for me :D

Lastly can I just say that (somewhat jokingly but semi-serious)... we may
be missing the perfect opportunity to make use of the `__floordiv__()`
operator method [3] ;D

{{{
GeneratedField(expression=F('first_name') // F('last_name'))
}}}


[1] https://dev.mysql.com/doc/refman/8.0/en/sql-
mode.html#sqlmode_pipes_as_concat
[2] https://www.postgresql.org/docs/current/functions-string.html
[3] https://docs.python.org/3/reference/datamodel.html#emulating-numeric-
types

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:7>

Django

unread,
Nov 9, 2023, 5:38:50 AM11/9/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

If casting to text data-type fixes the issue then I'd consider documenting
this workaround instead of adding a PostgreSQL-specific function (where
text casting would still be required). For example, adding the following
sentence:

''"In order to make an `IMMUTABLE` expression `MUTABLE` on PostgreSQL,
you can wrap the expression with `Cast()`, e.g.
`Cast(Concat(F("first_name"), Value(" "), F("last_name")),
TextField())`."''

in the warning about `IMMUTABLE` functions on PostgreSQL.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:8>

Django

unread,
Nov 9, 2023, 6:19:40 AM11/9/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Paolo Melchiorre):

Replying to [comment:8 Mariusz Felisiak]:


> If casting to text data-type fixes the issue then I'd consider
documenting this workaround instead of adding a PostgreSQL-specific
function (where text casting would still be required). For example, adding
the following sentence:
>
> ''"In order to make an `IMMUTABLE` expression `MUTABLE` on PostgreSQL,
you can wrap the expression with `Cast()`, e.g.
`Cast(Concat(F("first_name"), Value(" "), F("last_name")),
TextField())`."''
>
> in the warning about `IMMUTABLE` functions on PostgreSQL.


Unfortunately the SQL code generate by the Django ORM is still broken in
PostgreSQL.

I wrote the class as susggested by Mariusz:
{{{
#!python
class Person(models.Model):
first_name = models.CharField()
last_name = models.CharField()
full_name = models.GeneratedField(
expression=Cast(
Concat("first_name", V(" "), "last_name"),
models.TextField(),
),
db_persist=True,
output_field=models.TextField(),
)
}}}

But the migrations generate this SQL code:
{{{
#!sql
CREATE TABLE "samples_person" (


"id" bigint NOT NULL PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,

"first_name" varchar NOT NULL,
"last_name" varchar NOT NULL,


"full_name" text GENERATED ALWAYS AS (

(
CONCAT(


("first_name")::text,
(CONCAT((' ')::text, ("last_name")::text))::text
)

)::text
) STORED
);
}}}

that generated this error message `psycopg.errors.InvalidObjectDefinition:
generation expression is not immutable`

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:9>

Django

unread,
Nov 9, 2023, 6:23:56 AM11/9/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by David Sanders):

I need to clarify an incorrect assumption I was making in my earlier
comment.

COALESCE() is acceptable for use in generated fields & function indexes.
This is because it's not a "real" function, it's an expression.

This means that Paolo's proposal to make `Concat()` on Postgres more like
it does on SQLite an acceptable change keeping behaviour consistent.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:10>

Django

unread,
Nov 9, 2023, 10:34:51 AM11/9/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


Comment:

> Unfortunately we can't change Concat() to use `||` because on pg it
_does_ evaluate NULLs. If it were to use `||` then we _would_ need to make
use of Coalesce() to make it consistent.

That's the crux of the issue and why we had to go back and forth on how it
was implemented on Postgres when adding support for functional index and
constraints. There are few tickets related to this such one such as #29582
and #30385.

> the migrations generate this SQL code (code snipped with many
unnecessary `::text`)

This is a side effect of 09ffc5c1212d4ced58b708cbbf3dfbfb77b782ca (#33308)
to accommodate support for `psycopg>3` server-side bindings. See
779cd28acb1f7eb06f629c0ea4ded99b5ebb670a (#34840) which removed many of
them but
[https://github.com/django/django/blob/f7389c4b07ceeb036436e065898e411b247bca78/django/db/models/functions/text.py#L90
some still remain].

Based on #30385 I think that the best way forward here is to stop using
`CONCAT` entirely and use a strategy where it relies on `||` with
`Coalesce` and `Cast` appropriately when dealing with expressions that are
nullable and/or non-text. The challenge here is that we can't trust
`.null` as the our output field resolving strategy doesn't carry `null`
affinity. If we want to make sure `Concat` maintains is previous behaviour
on Postgres we ''must'' wrap every source expression in `Coalesce`.

Normally when we change the SQL generated by an expression it leaves all
index generated with the previous implementation unusable and forces users
to re-create them but in this case it wasn't possible to even create such
index as `Concat` is not `IMMUTABLE` so I don't think that's an issue.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:11>

Django

unread,
Nov 9, 2023, 11:35:03 AM11/9/23
to django-...@googlegroups.com
#34955: Make available the string concatenation operator `||` for PostgreSQL
-------------------------------------+-------------------------------------
Reporter: Paolo Melchiorre | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: dev
(models, ORM) |
Severity: Normal | Resolution:
Keywords: field, database, | Triage Stage:
generated, output_field | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Paolo Melchiorre):

Replying to [comment:11 Simon Charette]:


> > Unfortunately we can't change Concat() to use `||` because on pg it
_does_ evaluate NULLs. If it were to use `||` then we _would_ need to make
use of Coalesce() to make it consistent.
>
> That's the crux of the issue and why we had to go back and forth on how
it was implemented on Postgres when adding support for functional index
and constraints. There are few tickets related to this such one such as
#29582 and #30385.

Thanks, for pointing us to these old issues.

> > the migrations generate this SQL code (code snipped with many
unnecessary `::text`)
>
> This is a side effect of 09ffc5c1212d4ced58b708cbbf3dfbfb77b782ca
(#33308) to accommodate support for `psycopg>3` server-side bindings. See
779cd28acb1f7eb06f629c0ea4ded99b5ebb670a (#34840) which removed many of
them but
[https://github.com/django/django/blob/f7389c4b07ceeb036436e065898e411b247bca78/django/db/models/functions/text.py#L90
some still remain].

Are you suggesting opening an issue to remove all the remaining
unnecessary `CAST`?

> Based on #30385 I think that the best way forward here is to stop using
`CONCAT` entirely and use a strategy where it relies on `||` with
`Coalesce` and `Cast` appropriately when dealing with expressions that are
nullable and/or non-text. The challenge here is that we can't trust
`.null` as the our output field resolving strategy doesn't carry `null`
affinity. If we want to make sure `Concat` maintains is previous behaviour
on Postgres we ''must'' wrap every source expression in `Coalesce`.

I understand that `GeneratedField` is only the last one affected by the
fact that `CONCAT` is not `IMMUTABLE`

I agree with the plan to replace `CONCAT` everywhere with `||` given how
many problems it would solve.

> Normally when we change the SQL generated by an expression it leaves all
index generated with the previous implementation unusable and forces users
to re-create them but in this case it wasn't possible to even create such
index as `Concat` is not `IMMUTABLE` so I don't think that's an issue.

Better this way I would say, there will be no indexes that the user needs
to re-create.

--
Ticket URL: <https://code.djangoproject.com/ticket/34955#comment:12>

Reply all
Reply to author
Forward
0 new messages