Decoupling business and ORM model using dataclass(es)

Nikola Radovanovic

unread,

Sep 20, 2021, 5:04:22 AM9/20/21

to sqlalchemy

Hi,

When decoupling business from ORM model, dataclass passed to SA's imperative mapper map_imperatively will receive SA's internals like for example _sa_instance_state and _sa_class_manager.

I am wondering, what would be the best way to have "pure" dataclass, without SA internals? First thing to come to my mind is to perform query on SA Table and use dict/namedtuple to populate dataclass. Are there better approaches? I am in particular interested how this approach will behave with partial and composite updates? Any recipe to recommend?

Thank you in advance.

Kindest regards

Mike Bayer

unread,

Sep 20, 2021, 10:04:05 AM9/20/21

to noreply-spamdigest via sqlalchemy

to accomplish that you use a registry, typically a weak-keyed one, where keys are the classes / instances in use and the values are the class managers and states. If you wanted to apply such a system to your classes you would make use of the instrumentation extension system:

https://docs.sqlalchemy.org/en/14/orm/extensions/instrumentation.html

there's also an example of it :

https://docs.sqlalchemy.org/en/14/_modules/examples/custom_attributes/custom_management.html

now as far as if you *should* do this. Absolutely not. The instrumentation extension system is a very old and almost never used corner of the API which I moved out of the main library some years ago due to its complexity. adding a weakref lookup to all class/state instrumentation lookups will add a palpable performance penalty to most ORM operations, and the instrumentation extension is likely not without bugs and regressions at this point. it was first created over ten years ago to suit the use case of a single user who was working with Zope security proxied objects. I've seen at least one other person using it, but overall i dont think it's worthwhile to spend lots of time going down tunnels like this; if your dataclass has a few attributes stuck on it, that shouldn't be a problem for any real-world situation.

Thank you in advance.

Kindest regards

--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper

http://www.sqlalchemy.org/

To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/324223bc-695d-4479-be32-b1a7c73e8a76n%40googlegroups.com.

Nikola Radovanovic

unread,

Sep 20, 2021, 11:40:04 PM9/20/21

to sqlalchemy

Thank you,

I am reading Architecture Patterns with Python and trying to find a way to refactor our code to become maintainable. However I noticed that even with mapped dataclass there are problems like the one mentioned here and proposed solution is to use InstrumentationManager you just mentioned, but even with this approach resulting dataclass will be "linked" to ORM object via internal attributes, defeating the purpose of Registry pattern proposed in the book.

Would it be OK to simply use dict objects as adapter between dataclass and ORM?

Kindest regards

Mike Bayer

unread,

Sep 21, 2021, 9:51:36 AM9/21/21

to noreply-spamdigest via sqlalchemy

On Mon, Sep 20, 2021, at 11:40 PM, Nikola Radovanovic wrote:

Thank you,
I am reading Architecture Patterns with Python and trying to find a way to refactor our code to become maintainable. However I noticed that even with mapped dataclass there are problems like the one mentioned here and proposed solution is to use InstrumentationManager you just mentioned, but even with this approach resulting dataclass will be "linked" to ORM object via internal attributes, defeating the purpose of Registry pattern proposed in the book.

the instrumentation manager lets you change how this "link" occurs and there doesnt need to be any attributes on the class.

that said, I am not familiar with that book and if there is some aspect to dataclasses that causes them to fail if there is some instrumentation on them, that's a pretty serious shortcoming in dataclasses.

Would it be OK to simply use dict objects as adapter between dataclass and ORM?

I proposed using a weakkeydictionary, so if that's what you mean, then yes. again i am not famliiar with that book and im not really sure how code is not "maintainable" if it has some private attributes on it. seems pretty off.

To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/3d3b6d37-d833-41a8-b414-2f41e7c601c3n%40googlegroups.com.

Nikola Radovanovic

unread,

Sep 21, 2021, 10:41:52 AM9/21/21

to sqlalchemy

Thank you.

Maybe I did not explain well (English is not my first language): when I declare a dataclass to serve as business model (so something passed around the code, between libraries, layers, frameworks, etc.) - I don't want any ORM stuff in it. No matter if it is hidden in private/protected attribute. I just want ORM to be able to fill my dataclass after I fetch something from the DB without leaving a "trace" in resulting object and to "understand" dataclass when I perform insert/update (think of it as C structure). Its not a problem if I have to write this "adapter" for the needs of my project, just want to know if someone already did it and if not, what would be the recommended way.

Book I am referring explains exact problems we had when initial project layout was set: we dragged ORM in our business model and made business model aware of the ORM instead the opposite. Seems there are quite other people/teams made the very same mistake (otherwise there will be no book about it). It is not related only to SqlAlchemy but rather how people (mis)use ORMs in general and made their code hard to maintain (as we did because we don't know better atm).

I think SA is great ORM, I use it, recommend it to others and will continue to do so - please don't take this as any sort of critique or nitpicking, I am just looking some clues how to resolve problems we encounter.

Kindest regards

Mike Bayer

unread,

Sep 21, 2021, 11:25:41 AM9/21/21

to noreply-spamdigest via sqlalchemy

On Tue, Sep 21, 2021, at 10:41 AM, Nikola Radovanovic wrote:

Thank you.

Maybe I did not explain well (English is not my first language): when I declare a dataclass to serve as business model (so something passed around the code, between libraries, layers, frameworks, etc.) - I don't want any ORM stuff in it. No matter if it is hidden in private/protected attribute. I just want ORM to be able to fill my dataclass after I fetch something from the DB without leaving a "trace" in resulting object and to "understand" dataclass when I perform insert/update (think of it as C structure). Its not a problem if I have to write this "adapter" for the needs of my project, just want to know if someone already did it and if not, what would be the recommended way.

yeah i just dont see how there is any practical concern to that. i know it "feels" ugly, but thats what ORMs do (all of them). the ORM is explcitly a tool to instrument business objects with persistence logic.

Book I am referring explains exact problems we had when initial project layout was set: we dragged ORM in our business model and made business model aware of the ORM instead the opposite. Seems there are quite other people/teams made the very same mistake (otherwise there will be no book about it). It is not related only to SqlAlchemy but rather how people (mis)use ORMs in general and made their code hard to maintain (as we did because we don't know better atm).

yes this is a common thing among architectural communities. The simple answer is that these model objects would have to have nothing to do with the SQLAlchemy ORM period if you want your business model to have absolutely zero linkage to the persistence details. The _sa_class_state and _sa_instance_state is the least of your problems, SQLAlchemy ORM applies instrumented descriptors to your classes as well. that's the whole way that things like MyModel.column has SQL behavior, how "my_model.column" has data in it, how "my_model.list_of_related" has related objects, etc. your business methods on those classes then build off of those attributes, which means those objects are fully dependent on the ORM.

the standard way to architect for this pattern is to treat SQLAlchemy ORM objects as specific to persistence, then have an entirely different copy of the whole model as actual business objects. Then you have a data persistence layer that runs queries in terms of ORM objects and can translate them to your data objects. This is the pattern I would recommend; by trying to use special instrumentation tricks, you're trying to temporarily apply ORM instrumentation to objects that you otherwise would prefer have nothing to do with the persistence, and that's just wrong; these objects will have *everything* to do with persistence while this process goes on.

the above pattern is the one that is used by most of the Openstack projects, for example, though they have steadily moved away from it due to it being cumbersome. The pydantic project offers a standard recipe for this kind of thing which you can see at https://pydantic-docs.helpmanual.io/usage/models/#orm-mode-aka-arbitrary-class-instances , where you have your dataclasses on one side and your persistence /ORM objects on the other. people are usually pretty annoyed by this pattern so the SQLModel project https://github.com/tiangolo/sqlmodel aims to simplify this by re-joining the two types of objects together, but in this case if you are going for "purity" I dont think you can get around that you will need to have two parallel object models and a data access layer in between.

I think SA is great ORM, I use it, recommend it to others and will continue to do so - please don't take this as any sort of critique or nitpicking, I am just looking some clues how to resolve problems we encounter.

not at all! I know exactly what you're going for, and that's great. you will need to have two separate layers though, trying to sneak by with temporary instrumentation is not the way to do that.

To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/69ec0ab7-dfe1-492e-b3fd-c99c29e17b7dn%40googlegroups.com.

Nikola Radovanovic

unread,

Sep 21, 2021, 1:57:37 PM9/21/21

to sqlalchemy

Many thanks - it is much clearer now!

Best regards

Reply all

Reply to author

Forward