Generated Field

Paolo Melchiorre

Apr 13, 2022, 6:25:48 AM
to django-d...@googlegroups.com
Hi all,

I am at PyCon DE 2022 in Berlin with Markus, and I shared with him an
idea that I have been thinking about for a few months. Given his
interest, I am also sharing it with you.

I figured we could add a "GeneratedField" to Django which accepts a
"base_field" attribute, in a similar way to "ArrayField", and then an
expression to generate a value for the field.

For example this model:

class Album(models.Model):
    ...
    title = models.CharField(max_length=120)
    search = GeneratedField(
        SearchVectorField(),
        F('title')
    )

would generate SQL like this in PostgreSQL:

CREATE TABLE album (
    ...
    title varchar(120),
    search tsvector GENERATED ALWAYS AS (title) STORED
);
I found some documentation about generated columns (or similarly named
features) for the different database backends:

# SQLite
https://www.sqlite.org/gencol.html
https://www.sqlite.org/releaselog/3_31_0.html

# PostgreSQL
https://www.postgresql.org/docs/12/ddl-generated-columns.html
https://www.postgresql.org/docs/12/release-12.html

# Oracle
https://oracle-base.com/articles/11g/virtual-columns-11gr1

# MySQL
https://dev.mysql.com/doc/refman/5.7/en/mysql-nutshell.html
https://dev.mysql.com/doc/refman/5.7/en/create-table-generated-columns.html

# MariaDB
https://mariadb.com/kb/en/changes-improvements-in-mariadb-102/
https://mariadb.com/kb/en/generated-columns/

I'd love to hear your thoughts on this idea and if you think it might
be something worth continuing to investigate.

Greetings from PyCon DE 2022 in Berlin,
Paolo

--
https://www.paulox.net

Adam Johnson

Apr 13, 2022, 7:01:46 AM
to Django developers (Contributions to Django itself)
I'd be interested in seeing this. Generated columns are a useful SQL feature that is missing from Django.

Nice initial research on backend coverage - it looks like they're widely supported.

Some ideas...

Is it necessary to take a base field? Can we not determine the output field type for some kinds of expression? e.g. F("some_integer_field") + 1 can be assumed to have output field type IntegerField.

Would it be worth blocking assignment to the field? Allowing "model.generated_field = ..." would at least be misleading. (This could be done with a special descriptor.)
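
A rough sketch of what such a descriptor could look like, in plain Python. `GeneratedValue`, `load_generated`, and this `Album` class are hypothetical names for illustration only; they are not Django APIs and say nothing about how Django would actually wire this in.

```python
class GeneratedValue:
    """Hypothetical read-only descriptor for a database-generated attribute."""

    def __set_name__(self, owner, name):
        # Remember the attribute name this descriptor is bound to.
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # The loaded value lives in the instance dict under the same name.
        return instance.__dict__.get(self.name)

    def __set__(self, instance, value):
        # Defining __set__ makes this a data descriptor, so plain
        # assignment always ends up here and is refused.
        raise AttributeError(
            f"{self.name!r} is generated by the database and cannot be assigned"
        )


def load_generated(instance, name, value):
    """Hypothetical loader hook: bypass the descriptor when reading from the DB."""
    instance.__dict__[name] = value


class Album:
    title_upper = GeneratedValue()

    def __init__(self, title):
        self.title = title
```

With this, `album.title_upper = "x"` raises `AttributeError`, while code that loads rows can still populate the value through `load_generated`.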

Good luck!

--
You received this message because you are subscribed to the Google Groups "Django developers  (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAKFO%2Bx5GHUEVdzi2awYtH5C17tTPTPh%2ByoDP%3DKC18pF8%2Bi_7PA%40mail.gmail.com.

Mariusz Felisiak

Apr 13, 2022, 7:05:49 AM
to Django developers (Contributions to Django itself)
Related tickets:

- https://code.djangoproject.com/ticket/31300: Add function-based virtual fields on PostgreSQL and Oracle.
- https://code.djangoproject.com/ticket/31565: Support GENERATED ALWAYS columns for MySQL and PostgreSQL

Related DEP:

- https://github.com/django/deps/pull/39 - Refactor ORM with VirtualField and CompositeField

and an accepted ticket for non-database backed calculated field (see discussion https://groups.google.com/g/django-developers/c/ADSuUUuZp3Q/m/eZGYZv74AQAJ):

- https://code.djangoproject.com/ticket/28822 - Add DBCalculatedField to model to annotate models automatically


Kye Russell

Apr 13, 2022, 7:09:38 AM
to Django developers (Contributions to Django itself)
I’d love to see this!

Kye

charettes

Apr 13, 2022, 9:48:57 AM
to Django developers (Contributions to Django itself)
> Is it necessary to take a base field? Can we not determine the output field type for some kinds of expression? e.g. F("some_integer_field") + 1 can be assumed to have output field type IntegerField.

It should be possible by simply accessing the `output_field` property of the provided expression once it's resolved, yes.

e.g.
# Will crash as the output field of some_integer_field has not been resolved.
(F("some_integer_field") + 1).output_field

# Will work as F resolved to a Col pointing at SomeModel.some_integer_field (assuming it exists)
(F("some_integer_field") + 1).resolve_expression(Query(SomeModel), allow_joins=False).output_field

But as Paolo demonstrates in his example, I think we might want a distinct separation between the column type and the expression used to generate the value of this field, unless we want to force users to do gymnastics to rely on type coercion.

For example, in the reported case the user would have to write `GeneratedField(SearchVector('title'))`, which would actually result in `tsvector GENERATED ALWAYS AS to_tsvector(title) STORED`, which is not exactly the same. It's not an issue here, but I wouldn't be surprised if not being able to configure a generated field to produce a particular SQL output turned out to be problematic in some cases.

What about we make the expected signature `GeneratedField(expression, base_field=None)` where a missing `base_field` defaults to `expression.output_field`? That would allow the exact expected SQL to be generated with `GeneratedField('title', base_field=SearchVectorField())` if there's a requirement for it.
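
A small standalone sketch of that defaulting rule, with stand-in classes instead of real Django expressions and fields. `ExpressionStub`, `GeneratedFieldSketch`, and `column_sql` are hypothetical names, and column types are plain strings to keep the example self-contained.

```python
class ExpressionStub:
    """Stand-in for a resolved expression: its SQL plus an inferred column type."""

    def __init__(self, sql, output_field):
        self.sql = sql
        self.output_field = output_field


class GeneratedFieldSketch:
    """Hypothetical field: base_field falls back to expression.output_field."""

    def __init__(self, expression, base_field=None):
        self.expression = expression
        # An explicit base_field wins; otherwise use the expression's own type.
        self.base_field = (
            base_field if base_field is not None else expression.output_field
        )

    def column_sql(self, name):
        # Render the column definition in the GENERATED ALWAYS AS (...) STORED form.
        return (
            f"{name} {self.base_field} "
            f"GENERATED ALWAYS AS ({self.expression.sql}) STORED"
        )


# Type inferred from the expression.
inferred = GeneratedFieldSketch(ExpressionStub("to_tsvector(title)", "tsvector"))

# Column type stated explicitly, independent of the expression's own type.
explicit = GeneratedFieldSketch(
    ExpressionStub("title", "varchar(120)"), base_field="tsvector"
)
```

The `explicit` case corresponds to the situation where the user wants the column type and the generating expression to differ.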

Cheers,
Simon

Adam Johnson

Apr 13, 2022, 10:34:19 AM
to Django developers (Contributions to Django itself)
> What about we make the expected signature `GeneratedField(expression, base_field=None)` where a missing `base_field` defaults to `expression.output_field`? That would allow the exact expected SQL to be generated with `GeneratedField('title', base_field=SearchVectorField())` if there's a requirement for it.

Yes - sorry for being unclear, this is what I meant. We don't need to *require* the output field, but it will be necessary in some cases.

Bike shed: perhaps output_field is a better name than base_field, to match expressions, and since it's not the base/"source" field for the expression, but where the output/result of the expression is stored.

Dan Davis

Apr 13, 2022, 4:38:49 PM
to Django developers (Contributions to Django itself)