Dynamic relationship

106 views
Skip to first unread message

Ben Chess

unread,
Jun 2, 2011, 6:15:50 PM6/2/11
to sqlalchemy
Hi,

I want to establish a relationship with an object whose key is defined
inside a JSON BLOB column in the child. Naively, I know I can do this
via a regular python @property that uses object_session() to then do a
query() using the id from inside the blob. Is there a better way that
lets sqlalchemy manage the relationship and caching?

Also, I'd like to be able to sometimes eagerly load this
relationship. Obviously I can't do in with a JOIN of the original
query, but is there some way to define the relationship, perhaps using
a column_property, to be able to undefer()?

Thanks,
Ben Chess

Michael Bayer

unread,
Jun 2, 2011, 7:10:58 PM6/2/11
to sqlal...@googlegroups.com
Using a BLOB as a key is a really bad idea and wont work on all backends, but other than the database-level limitations inherent, SQLAlchemy will let you set up whatever column you'd like to use as the key just fine within a relationship(). Guessing what the problem might be. Foreign key ? If you rely upon your table metadata to emit CREATE TABLE statements, you can forego using ForeignKey with your table metadata and configure the foreign key data on the relationship itself using "foreign_keys=[table.c.my_referencing_blob_column], primary_join=table.c.my_key_blob_column==table.c.my_referencing_blob_column".

As far as your eager load, both joined eager loading and subquery eager loading rely on being able to JOIN to the target so if your backend is not letting the JOIN part happen, I'm not sure exactly what SQL you'd like to emit. If you'd like the "lazy" loader to just fire off immediately, you can use the "immediate" style of loader - lazy="immediate" or immediateload() as an option - but that won't save you on SELECT statements.

> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

Ben Chess

unread,
Jun 2, 2011, 8:11:06 PM6/2/11
to sqlal...@googlegroups.com
I'm not intending for the contents of the BLOB to be readable to
MySQL. It would only be cracked open and read from within Python.
Meaning Python only knows what the key actually is. So yeah, I
understand the caveats of this approach. I merely want to provide a
mechanism to, as a second & separate SELECT query, load the relationed
instance and be able to access it from the child python instance.

Hopefully that clears some things up.

Michael Bayer

unread,
Jun 2, 2011, 8:18:14 PM6/2/11
to sqlal...@googlegroups.com

On Jun 2, 2011, at 8:11 PM, Ben Chess wrote:

> I'm not intending for the contents of the BLOB to be readable to
> MySQL. It would only be cracked open and read from within Python.
> Meaning Python only knows what the key actually is. So yeah, I
> understand the caveats of this approach. I merely want to provide a
> mechanism to, as a second & separate SELECT query, load the relationed
> instance and be able to access it from the child python instance.
>
> Hopefully that clears some things up.

How do you want to identify what row contains the target key ? Is the comparison on the SQL side or the Python side ?

Ben Chess

unread,
Jun 2, 2011, 8:25:31 PM6/2/11
to sqlal...@googlegroups.com
On Thu, Jun 2, 2011 at 5:18 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
>
> On Jun 2, 2011, at 8:11 PM, Ben Chess wrote:
>
>> I'm not intending for the contents of the BLOB to be readable to
>> MySQL.  It would only be cracked open and read from within Python.
>> Meaning Python only knows what the key actually is.  So yeah, I
>> understand the caveats of this approach.  I merely want to provide a
>> mechanism to, as a second & separate SELECT query, load the relationed
>> instance and be able to access it from the child python instance.
>>
>> Hopefully that clears some things up.
>
> How do you want to identify what row contains the target key ?   Is the comparison on the SQL side or the Python side ?

In Python. Below is an example of what I have.

class Account:
data = Column(JSONType)

@property
def salesperson(self):
session = orm.session.object_session(self)
return session.query(Salesperson, id=self.data['salesperson_id']).one()

Michael Bayer

unread,
Jun 2, 2011, 9:28:50 PM6/2/11
to sqlal...@googlegroups.com

On Jun 2, 2011, at 8:25 PM, Ben Chess wrote:

> On Thu, Jun 2, 2011 at 5:18 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
>>
>> On Jun 2, 2011, at 8:11 PM, Ben Chess wrote:
>>
>>> I'm not intending for the contents of the BLOB to be readable to
>>> MySQL. It would only be cracked open and read from within Python.
>>> Meaning Python only knows what the key actually is. So yeah, I
>>> understand the caveats of this approach. I merely want to provide a
>>> mechanism to, as a second & separate SELECT query, load the relationed
>>> instance and be able to access it from the child python instance.
>>>
>>> Hopefully that clears some things up.
>>
>> How do you want to identify what row contains the target key ? Is the comparison on the SQL side or the Python side ?
>
> In Python. Below is an example of what I have.
>
> class Account:
> data = Column(JSONType)
>
> @property
> def salesperson(self):
> session = orm.session.object_session(self)
> return session.query(Salesperson, id=self.data['salesperson_id']).one()

OK that comparison, assuming you mean to have a "filter()" in there, is on the SQL side. The extraction of the comparison value from the JSON is python side.

relationship() really has no way to deal with load or persist operations that aren't about columns in the two tables being compared. Also, an eager load, at least as far as being able to load Parent and related Child objects (I'm using Parent/Child instead of Account/Salesperson here) with less than N+1 queries is not possible here, since that requires a set of rows representing parents be compared to a set of rows representing children on the SQL side and that's not an option. The relationship also can't really do too much with SQL criteria, at least not query.join(), so it would be limited at best with class level behavior.

So what you would get from a hypothetical relationship are:

1. the attribute is managed, such that it loads when non-populated, remains in __dict__ for subsequent returns, gets expired when the object is expired, after a commit(), etc.
2. if you assign to the attribute, "cascade" can establish that the child object or objects are brought into the session
3. merge() will merge it in
4. potentially the unit of work could fire off rules to establish necessary state on the parent or child based on the object association as well as its history

There is a rough outline of a new extension here, where the end user could plug in functions to establish "how am I loading the object", "I'm flushing, what needs to be populated", "here's my SQL comparison logic", but this would be a big job. It would be daunting for new users, it would have a complex usage contract. A rudimentary version could probably happen quickly, but when people start using it for real, the number of edge cases is huge. Could be a nice feature, but would be an enormous maintenance job for many months, or alternatively a little used and weakly supported feature. Attached is a sketch of part of it, does a little bit of the above. You're free to use it but it uses more of the internals than I think you'd want to (these are internals that aren't guaranteed to stay the same).

But beyond all the work it would require for testing and maintenance, how useful would it actually be, and how confusing would it be that you can kind of build an attribute like this on your own outside of SQLAlchemy, as you've done ?

If all you need is so that hitting salesperson doesn't do the load every time, @memoized_property is easy for that. Another example for that is attached, it also expires itself when the object is expired. Doesn't do as much. It's not clear how much of relationship() you'd really need.

Of course if you could find a way to copy this one aspect of your JSON structure to a plain foreign key attribute (and there's many ways to fire off an event like that), all of this goes away and you get SQL-level eager loading and joins and all that.


use_a_descriptor.py
hypothetical_attribute.py

Ben Chess

unread,
Jun 2, 2011, 9:36:31 PM6/2/11
to sqlal...@googlegroups.com
Sounds far more complicated than something I'd want to take on.
Thanks for your thoughts and the detailed response.

Ben

Reply all
Reply to author
Forward
0 new messages