Best way to implement PostgreSQL materialized views?

1,478 views
Skip to first unread message

Ethan Ralph

unread,
Apr 28, 2020, 12:05:21 PM4/28/20
to sqlalchemy
Hello. I am currently working on an implementation of PostgreSQL materialized views in SQLAlchemy and Alembic. To give a basic explanation, materialized views are regular views but all the rows are computed statically, and you can refresh them with a REFRESH MATERIALIZED VIEW query.
I've attached the code I have so far. It uses sqlalchemy.sql.table and column proxies to simulate a table, and also implements Alembic operations with autogenerate. I want to make MaterializedView into a proper class like Table. What would be the best way to go about this? What should I inherit from? Which methods should I override? Any input is appreciated.
_materialized_view.py

Mike Bayer

unread,
Apr 29, 2020, 9:55:49 AM4/29/20
to noreply-spamdigest via sqlalchemy
Addition of views to SQLAlchemy is an extremely long term issue that has never been done, largely because we have a recipe that works well enough for most cases, which you have probably seen, at https://github.com/sqlalchemy/sqlalchemy/wiki/Views.    The issue itself is https://github.com/sqlalchemy/sqlalchemy/issues/812, keep in mind all those comments are 13 years old.

A real View object would feature the "materialized" part as just a boolean flag since AFAICT the "materialized" aspect of it only presents itself as a keyword to the user and possibly some other options, the rest of what goes on has to do with the internals of the database.       The class hierarchy would be the same as that of Table:  class View(DialectKWArgs, SchemaItem, TableClause):    and the object would need to act pretty much like Table in just about every way, including the metadata support, autoload, etc.  Basically it would have an extra field indicating the SELECT source of the data, which based on modern conventions would be presented either as a select() object against other Table objects, or as a text() construct.

on the DDL side, a command like metadata.create_all() / metadata.drop_all() would need to be smart about when it creates / drops these views.   views are first and foremost dependent on the Table objects that they select against.   However, they can also be dependent on each other, as you can make views from views.  so a full create/drop sequence would likely need to assemble this list of dependencies in the same way that Table objects are dependent on each other via foreign key relationship, and then ensure creates/drops are done in the right order.   There is a lot of infrastructure in lib/sqlalchemy/sql/ddl.py that does work like this which should be leveraged here.

The good news for views is that reflection is actually already implemented for views in that they come back as Table objects, and even the "view definition" can be retrieved however that's not integrated into the schema system.

So there's a lot to think about here and overall views haven't happened because there is a lot to consider in terms of a fully complete feature that requires no special user intervention, especially the DDL / dependency issue, whereas the recipe plus the reflection support has been most of what people need.    If you have proposals for how things would look then you can use issue 812 for notes.




On Tue, Apr 28, 2020, at 12:05 PM, Ethan Ralph wrote:
Hello. I am currently working on an implementation of PostgreSQL materialized views in SQLAlchemy and Alembic. To give a basic explanation, materialized views are regular views but all the rows are computed statically, and you can refresh them with a REFRESH MATERIALIZED VIEW query.
I've attached the code I have so far. It uses sqlalchemy.sql.table and column proxies to simulate a table, and also implements Alembic operations with autogenerate. I want to make MaterializedView into a proper class like Table. What would be the best way to go about this? What should I inherit from? Which methods should I override? Any input is appreciated.


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.


Attachments:
  • _materialized_view.py

Ethan Ralph

unread,
Apr 29, 2020, 11:07:46 AM4/29/20
to sqlalchemy
Replies inline.


On Wednesday, April 29, 2020 at 1:55:49 PM UTC, Mike Bayer wrote:
Addition of views to SQLAlchemy is an extremely long term issue that has never been done, largely because we have a recipe that works well enough for most cases, which you have probably seen, at https://github.com/sqlalchemy/sqlalchemy/wiki/Views.    The issue itself is https://github.com/sqlalchemy/sqlalchemy/issues/812, keep in mind all those comments are 13 years old.

Yes, I wasn't planning this as a generic ready-for-mainline implementation, I simply wanted to use it in my own project. I know that it has a lot of shortcomings, and just wanted to add whatever was necessary for me to use materialized views. But now that you mention it, you have gotten me interested on getting work underway to implement some sort of views. Perhaps with help from the SQLAlchemy maintainers/community I could implement such a feature in a fork.
A real View object would feature the "materialized" part as just a boolean flag since AFAICT the "materialized" aspect of it only presents itself as a keyword to the user and possibly some other options, the rest of what goes on has to do with the internals of the database.       The class hierarchy would be the same as that of Table:  class View(DialectKWArgs, SchemaItem, TableClause):    and the object would need to act pretty much like Table in just about every way, including the metadata support, autoload, etc.
I see, thanks for the info. I don't know too much about the SQLAlchemy internals so not sure about how to implement the things you mentioned.
  Basically it would have an extra field indicating the SELECT source of the data, which based on modern conventions would be presented either as a select() object against other Table objects, or as a text() construct.
text() might work better for Alembic migrations... But maybe a method could be provided which turns the select() into a text() (if that's possible).

on the DDL side, a command like metadata.create_all() / metadata.drop_all() would need to be smart about when it creates / drops these views.   views are first and foremost dependent on the Table objects that they select against.   However, they can also be dependent on each other, as you can make views from views.  so a full create/drop sequence would likely need to assemble this list of dependencies in the same way that Table objects are dependent on each other via foreign key relationship, and then ensure creates/drops are done in the right order.   There is a lot of infrastructure in lib/sqlalchemy/sql/ddl.py that does work like this which should be leveraged here.
I see. Wouldn't it be easy enough for the expression API to discover all the tables that are used in a select() construct (and in fact, it already has to do this anyway)?

The good news for views is that reflection is actually already implemented for views in that they come back as Table objects, and even the "view definition" can be retrieved however that's not integrated into the schema system.

Can you give a little bit more detail about how to implement the retrieval of the view definition? As far as I can tell, SQLAlchemy is already able to fetch the table definitions (because Alembic can magically generate CreateTable operations in downgrades despite the code for the table not existing in the codebase anymore). Would it work in a similar fashion?
So there's a lot to think about here and overall views haven't happened because there is a lot to consider in terms of a fully complete feature that requires no special user intervention, especially the DDL / dependency issue, whereas the recipe plus the reflection support has been most of what people need.    If you have proposals for how things would look then you can use issue 812 for notes.
It feels like not much effort has been put in because people are content with the recipe. I'm fine with it too, but I'm wondering if some work can be started towards a real mainlined View object in SQLAlchemy.



On Tue, Apr 28, 2020, at 12:05 PM, Ethan Ralph wrote:
Hello. I am currently working on an implementation of PostgreSQL materialized views in SQLAlchemy and Alembic. To give a basic explanation, materialized views are regular views but all the rows are computed statically, and you can refresh them with a REFRESH MATERIALIZED VIEW query.
I've attached the code I have so far. It uses sqlalchemy.sql.table and column proxies to simulate a table, and also implements Alembic operations with autogenerate. I want to make MaterializedView into a proper class like Table. What would be the best way to go about this? What should I inherit from? Which methods should I override? Any input is appreciated.


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlal...@googlegroups.com.

Jonathan Vanasco

unread,
Apr 29, 2020, 2:45:19 PM4/29/20
to sqlalchemy
There also exists a 3rd party library that has been somewhat maintained: https://github.com/jklukas/sqlalchemy-views

IIRC, it is largely based on the Wiki recipe that Mike linked to.


Reply all
Reply to author
Forward
0 new messages