user-defined simple fields in a multi-tenant flask app with ORM

425 views
Skip to first unread message

Xavier Bustamante Talavera

unread,
Mar 28, 2019, 12:27:19 PM3/28/19
to sqlalchemy
Hello,

Thanks in advance for the help.

I am using the ORM with SQLAlchemy in a flask app that is multi-tenant (multi-client): I have several running Flask apps, one per client. Each flask app connects to a different Postgres schema. I use one declarative base for all clients, as they share the same ORM classes, and then every time flask creates a new session it connects to the client schema. I create the tables using the ORM in python, in a way based in this issue: https://github.com/sqlalchemy/sqlalchemy/issues/3914

A new requirement is client-defined custom fields. So, our ORM classes will have their regularly defined fields in sqlalchemy ORM manner in python, plus some client-defined fields. The client-defined fields are regular types without difficult stuff: like strings, numbers... As every client is in its own schema, an example of approach could be creating those fields in the table in the db directly, and then reflect them on the ORM at the beginning of each session (as we need to know the schema). Simpler solutions would be just using hstore or JSON types, but I would be loosing the goodies of SQLAlchemy / Postgres schemas and consistency.

This could be done in different ways in SQLAlchemy, so I am asking you to orient me to a solution, guide, tutorial, or advice to avoid known pitfalls when doing this. For example some steps in how to build this reflection, if you think it is a good solution.

A future requirement will be creating client-defined tables that inherit from our regular ORM ones. If for example, we have "Computer" ORM class, they would want to define types of Computers, like "Desktop". These classes would only be collections of client-defined fields. How would this impact the above solution?

And finally, I will learn database migrations with Alembic, so any piece of advice or link to manage this with Alembic would be appreciated.

This is for open-source software, so if we manage to do it we are happy to write a blog post anywhere (we have a small blog site for example), if you find it can be useful for others.

Thank you for your time (and this amazing software)! :-)

Mike Bayer

unread,
Mar 28, 2019, 1:19:51 PM3/28/19
to sqlal...@googlegroups.com
On Thu, Mar 28, 2019 at 8:27 AM Xavier Bustamante Talavera
<bust...@gmail.com> wrote:
>
> Hello,
>
> Thanks in advance for the help.
>
> I am using the ORM with SQLAlchemy in a flask app that is multi-tenant (multi-client): I have several running Flask apps, one per client. Each flask app connects to a different Postgres schema. I use one declarative base for all clients, as they share the same ORM classes, and then every time flask creates a new session it connects to the client schema. I create the tables using the ORM in python, in a way based in this issue: https://github.com/sqlalchemy/sqlalchemy/issues/3914
>
> A new requirement is client-defined custom fields. So, our ORM classes will have their regularly defined fields in sqlalchemy ORM manner in python, plus some client-defined fields. The client-defined fields are regular types without difficult stuff: like strings, numbers... As every client is in its own schema, an example of approach could be creating those fields in the table in the db directly, and then reflect them on the ORM at the beginning of each session (as we need to know the schema).

I think this would work really poorly. Reflection is slow,
mapping configuration is slow, tearing down of ORM models is slow and
not optimized for any cases outside of test suites, all of this would
add mulit-second latency to every request, and you would need to
prevent all concurrency from each process as you are attempting to use
the same ORM model in the process.

options here include some kind of module-level trickery where each
client gets their own SQLAlchemy model into a private module namespace
in the process that is copied from the main one, which would be
extremely difficult to get right, or to just give each client their
own flask application process,which wont scale, depends on how many
tenants you are talking about.

> Simpler solutions would be just using hstore or JSON types, but I would be loosing the goodies of SQLAlchemy / Postgres schemas and consistency.

this is totally how I'd want to do it unless your clients are given
access to program in SQL and SQLAlchemy. What is a real-world
scenario where you have given a tenant three additional columns on one
of the database tables and the tenants need to use that data ? what
would that look like and what would you be doing that is any different
from pulling those values from an hstore ?



>
> This could be done in different ways in SQLAlchemy, so I am asking you to orient me to a solution, guide, tutorial, or advice to avoid known pitfalls when doing this. For example some steps in how to build this reflection, if you think it is a good solution.
>
> A future requirement will be creating client-defined tables that inherit from our regular ORM ones. If for example, we have "Computer" ORM class, they would want to define types of Computers, like "Desktop". These classes would only be collections of client-defined fields. How would this impact the above solution?

again this sounds like your application just needs to have a Python
package per client where each one has its own version of the ORM
model.


>
> And finally, I will learn database migrations with Alembic, so any piece of advice or link to manage this with Alembic would be appreciated.

all the same, separate alembic migration directories per client. sorry



>
> This is for open-source software, so if we manage to do it we are happy to write a blog post anywhere (we have a small blog site for example), if you find it can be useful for others.
>
> Thank you for your time (and this amazing software)! :-)
>
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
> ---
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
> To post to this group, send email to sqlal...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

Jonathan Vanasco

unread,
Mar 28, 2019, 8:14:31 PM3/28/19
to sqlalchemy


On Thursday, March 28, 2019 at 9:19:51 AM UTC-4, Mike Bayer wrote:

> Simpler solutions would be just using hstore or JSON types, but I would be loosing the goodies of SQLAlchemy / Postgres schemas and consistency.

this is totally how I'd want to do it unless your clients are given access to program in SQL and SQLAlchemy.    

wile I would handle this as JSON data too, there is also a database pattern for doing this in multi tenant applications where you use a table to allocate and store the allowable keys for each tenant , and another table to store the key values for the tenants objects. but i would do this in JSON.


Ibrahima Gaye

unread,
Mar 28, 2019, 11:32:50 PM3/28/19
to sqlal...@googlegroups.com
Hi Jonathan,
i would do it like this:
- add in your global model  tables named attributs, attributs_value and values,
- any table (let's call it XTable) that will eventually has need extra column per client will be linked to attributs_value via a table XTable_Attributs (For maximum flexibility).
Hope that helps,
Best regards


Ibrahima GAYE



Xavier Bustamante Talavera

unread,
Mar 29, 2019, 8:39:40 AM3/29/19
to sqlal...@googlegroups.com
Hello everyone and thanks for your answers,

Simpler solutions would be just using hstore or JSON types, but I would be loosing the goodies of SQLAlchemy / Postgres schemas and consistency.

this is totally how I'd want to do it unless your clients are given
access to program in SQL and SQLAlchemy.     What is a real-world
scenario where you have given a tenant three additional columns on one
of the database tables and the tenants need to use that data ?  what
would that look like and what would you be doing that is any different
from pulling those values from an hstore ?

@Mike: To answer you the user-defined data would only be used for CRUD operations, no specific coding logic involved, so the only benefit of going to using extra fields in a table rather than a hstore / json type would be data and access consistency —probably not enough for the problems you mention this approach would take.

As I understand then it is better to just define a Postgres JSON type as the custom field and provide our own schema validation.

For the client-defined inheritance, as those tables are going to lack custom logic more than the field definitions, we will value other approaches not touching table definitions.

On 29 Mar 2019, at 00:32, Ibrahima Gaye <ibrahi...@gmail.com> wrote:

Hi Jonathan,
i would do it like this:
- add in your global model  tables named attributs, attributs_value and values,
- any table (let's call it XTable) that will eventually has need extra column per client will be linked to attributs_value via a table XTable_Attributs (For maximum flexibility).
Hope that helps,
Best regards

@Ibrahima and @Jonathan, as I understand you are talking about something like the Entity–Attribute–Value model, adapted to the multi-tenant case. In my case the data comes form an API in JSON, so although there is not a special strong case towards this pattern, I think I will be using a JSON type.

Thank you again for your comments; it has been very enlightening!
Best regards,
Xavier Bustamante Talavera.
Linkedin | +34 634 541 887

Ibrahima Gaye

unread,
Mar 29, 2019, 9:32:25 AM3/29/19
to sqlal...@googlegroups.com
Thanks Xavier,
i did not know the pattern had a name :)  ( the Entity–Attribute–Value model,)
Ibrahima GAYE



Jonathan Vanasco

unread,
Mar 29, 2019, 3:17:22 PM3/29/19
to sqlalchemy


On Friday, March 29, 2019 at 4:39:40 AM UTC-4, Xavier Bustamante Talavera wrote:

@Ibrahima and @Jonathan, as I understand you are talking about something like the Entity–Attribute–Value model, adapted to the multi-tenant case. In my case the data comes form an API in JSON, so although there is not a special strong case towards this pattern, I think I will be using a JSON type.

Yes! That's the name, I could not remember it.  I would definitely use JSON instead of EAV due to it's ease of use and overall performance.

FWIW, There are two main variations of the EAV pattern I've seen: normalizing the values into their own table & using a table that just has attribute id + value id (Ibrahima's suggestion, i think), and just doing a an attribute id + value table (what I was alluding to).  in my experience, the scale of multi tenant applications tends to make the fully normalized implementation incredibly slow, so i just don't bother with it anymore.
Reply all
Reply to author
Forward
0 new messages