SQLAlchemy ORM instances and direct __dict__ access


ya...@aknin.name

Dec 17, 2012, 12:35:52 PM
to sqlal...@googlegroups.com
Hi,

SQLAlchemy populates an instance's __dict__ with lazily loaded related instances when the attributes for these instances are first accessed (see here for an ancient and huge discussion on the topic; I thought it better to start afresh).

This breaks code that assumes __dict__ encapsulates object state (see here for a real-world example involving flask-restful).

While several workarounds may alleviate this, I think the best approach here would be to create proxy objects for all relations in any instance's __dict__. These proxies can be transparently replaced with the real thing upon first access (if the list agrees with this approach, I'm willing to try my hand at producing a patch).

What do you think?

 - Yaniv

Michael Bayer

Dec 17, 2012, 12:50:02 PM
to sqlal...@googlegroups.com
On Dec 17, 2012, at 12:35 PM, ya...@aknin.name wrote:

> This breaks code that assumes __dict__ encapsulates object state (see here for a real world example when using flask-restful).

If flask-restful thinks it can get an accurate picture of a Python object's state via __dict__ access alone, it is mistaken. Descriptors are an extremely common system of providing automation behind attribute access, and SQLAlchemy builds upon this standard system. Any serialization system should provide hooks to customize how the serialization occurs, and assuming flask-restful provides these hooks, that's the solution to take.
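To illustrate the point about descriptors, here is a minimal sketch in plain Python. It is not SQLAlchemy's implementation; `LazyAttribute`, `User` and `addresses` are hypothetical names, and the "load" is a stand-in for a database query. It only shows that an attribute served by a descriptor can be absent from __dict__ until it is first accessed:

```python
class LazyAttribute:
    """Descriptor that computes a value on first access and caches it
    in the instance's __dict__ (hypothetical, for illustration only)."""

    def __init__(self, loader):
        self.loader = loader
        self.name = loader.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.loader(instance)
        # Cache the loaded value; later lookups find it in __dict__ directly.
        instance.__dict__[self.name] = value
        return value


class User:
    def __init__(self, name):
        self.name = name

    @LazyAttribute
    def addresses(self):
        # Stand-in for a lazy load from the database.
        return [f"{self.name}@example.com"]


user = User("yaniv")
before = dict(user.__dict__)  # snapshot taken before the attribute is touched
_ = user.addresses            # first access triggers the "load"
after = dict(user.__dict__)   # snapshot now includes the loaded value
```

Here `before` contains only `name`, while `after` also contains `addresses`, so a serializer that reads __dict__ alone sees a different object depending on which attributes happen to have been accessed.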


> While several workarounds may alleviate this, I think the best approach here would be to create proxy objects to all relations in any instance's __dict__. These proxies can be transparently replaced with the real thing upon first access

The Python descriptor already provides a mechanism for proxying data that may or may not be inside of __dict__.     

It is essential in a system like SQLAlchemy that the structure of __dict__ remain completely simple. Adding another layer of proxying into it would be a massive performance hit and produce tons of bugs where proxy objects don't act like the real thing. flask-restful needs to provide hooks for alternate systems of serialization, plain and simple.
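As a rough illustration of how a proxy can fail to act like the real thing, consider a naive attribute-forwarding proxy in plain Python. `Proxy` is a hypothetical class written for this sketch, not something SQLAlchemy or Werkzeug provides:

```python
class Proxy:
    """Naive proxy that forwards ordinary attribute access to a target."""

    def __init__(self, target):
        self._target = target

    def __getattr__(self, name):
        # Only called when normal lookup fails, so it forwards everything
        # the proxy itself does not define.
        return getattr(self._target, name)


real = [1, 2, 3]
proxy = Proxy(real)

delegated = proxy.count(1)        # explicit method calls are forwarded
is_list = isinstance(proxy, list) # type checks see the proxy, not the list

try:
    len(proxy)                    # implicit special-method lookup bypasses __getattr__
    len_works = True
except TypeError:
    len_works = False
```

Explicit calls such as `proxy.count(1)` work, but `isinstance` checks and built-ins like `len()` do not, because Python looks up special methods on the type rather than through `__getattr__`. A more complete proxy has to define each special method by hand, which is where the brittleness comes from.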




ya...@aknin.name

Dec 18, 2012, 8:20:55 AM
to sqlal...@googlegroups.com
I thought about it more. You are correct: there's no way to make this work well. (I didn't think of proxying __dict__ itself; I thought of placing pre-bound object proxies inside __dict__ for relations, with interface mimicking à la Werkzeug's object proxies, but those would be brittle and not as least-astonishing as I'd like them to be.)

I will take this to the flask-restful community. I wish Python had a proper serialization customization API (__reduce__ is oldish, not quite a standard, and probably not flexible enough for many needs), but here is probably not quite the place for such wishes.

Thanks,
 - Yaniv

ya...@aknin.name

Dec 18, 2012, 8:25:33 AM
to sqlal...@googlegroups.com
Ah, wait, I forgot something important to ask: how should I implement the custom serialization hook?

I mean, can someone offer a mixin object with a method that will return a dictionary representation of an arbitrary SQLAlchemy instance, including (optional) relation recursion?

I'd be grateful for solutions that rely on the 0.8 introspection API, but will be even more interested in solutions that will work in 0.7.x.

Thanks again,
 - Yaniv



Michael Bayer

Dec 19, 2012, 11:37:17 AM
to sqlal...@googlegroups.com
I'd look at the Python stdlib for examples: pickle uses __getstate__ and __setstate__ (or sometimes __reduce__), copy I think uses __copy__, etc.

So per-class hooks are one way; the other is to pass a "serializer" into the target library, like:

mylibrary.serialize(object, serializer=MySerializer)
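A minimal sketch of both styles in plain Python follows. Everything here is an assumption made for illustration: `serialize` stands in for the hypothetical `mylibrary.serialize`, `__json__` is an invented per-class hook name, and `User`, `Address` and `address_serializer` are made-up examples rather than anything flask-restful or SQLAlchemy defines:

```python
import json


def serialize(obj, serializer=None):
    """Hypothetical library entry point supporting both hook styles."""
    if serializer is not None:
        # Style 2: the caller passes in a serializer callable.
        state = serializer(obj)
    elif hasattr(obj, "__json__"):
        # Style 1: the class provides its own hook (invented name).
        state = obj.__json__()
    else:
        raise TypeError(f"don't know how to serialize {type(obj).__name__}")
    return json.dumps(state, sort_keys=True)


class User:
    def __init__(self, name):
        self.name = name
        self._cache = {}  # internal state that should not be serialized

    def __json__(self):
        # The class decides what its public state is.
        return {"name": self.name}


class Address:
    def __init__(self, email):
        self.email = email


def address_serializer(address):
    # External serializer for a class that has no hook of its own.
    return {"email": address.email}


user_json = serialize(User("yaniv"))
address_json = serialize(Address("a@example.com"), serializer=address_serializer)
```

The per-class hook keeps the serialization logic next to the model, while the passed-in serializer lets the caller handle classes it does not control. In neither case does the library need to read __dict__ directly.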


