Parameterized "Relationships" with Eager Loading Capability

413 views
Skip to first unread message

agrot...@gmail.com

unread,
Sep 23, 2020, 12:43:28 AM9/23/20
to sqlalchemy
Let's say I have a model with a one to many relationship as such:
class A(Base):
id = ...

class B(Base):
id = ...
some_field = ....
a_id = Column(ForeignKey(A.id)...
a = relationship(A, backref=backref('bs', lazy='dynamic'))

I can define a method on A:
class A(Base):
...
def get_b_with_some_field(self, some_field):
return self.bs.filter(B.some_field==some_field)

to get all b's that have a certain value of `some_field`.

Is there any way to accomplish this with eager fetching to avoid the n+1 select problem that will occur if I want to query for a lot of `A`s and then iterate over the collection and for each call `get_b_with_some_field` with the same value of `some_field`?

One option is to to change relationship to from `lazy='dynamic'` to `lazy='subquery'` or ``lazy='selectin'`` and then implement the filtering in `get_b_with_some_field` in Python. This will address the n+1 select problem, but will cause pulling extra data from the database (and extra work in Python).

I thought `contains_eager` might be relevant; however, I only see it being mentioned in the case of joined loads.

The reason I am looking for this functionality is I am defining a graphql API that looks like the following:
type query {
as: [A!!
}

type A {
    ...
bs(some_field: String): [B!]!
}

type B {
    ...
some_field: String!
}
where I would like to be able to specify a filter on the `bs` relationship from `A`. I would ideally like to 1. avoid the n+1 select issue and 2. perform the some_field filtering at the database level, and 3. leverage as much of the ORM as possible ;-)

Is it possible to do this within SQLA?

Mike Bayer

unread,
Sep 23, 2020, 8:10:44 AM9/23/20
to noreply-spamdigest via sqlalchemy


On Wed, Sep 23, 2020, at 5:43 AM, agrot...@gmail.com wrote:
Let's say I have a model with a one to many relationship as such:
class A(Base):
id = ...

class B(Base):
id = ...
some_field = ....
a_id = Column(ForeignKey(A.id)...
a = relationship(A, backref=backref('bs', lazy='dynamic'))

I can define a method on A:
class A(Base):
...
def get_b_with_some_field(self, some_field):
return self.bs.filter(B.some_field==some_field)

to get all b's that have a certain value of `some_field`.

Is there any way to accomplish this with eager fetching to avoid the n+1 select problem that will occur if I want to query for a lot of `A`s and then iterate over the collection and for each call `get_b_with_some_field` with the same value of `some_field`?'

if you want to have that attribute remain on "dynamic" then you'd need to define a second relationship where you can use normal eager fetching strategies, and then use that for those cases.

IMO "dynamic" is not really worth it, you can get the same queries more programmatically by using query(B).filter(with_parent(some_a, A.bs)).






One option is to to change relationship to from `lazy='dynamic'` to `lazy='subquery'` or ``lazy='selectin'`` and then implement the filtering in `get_b_with_some_field` in Python. This will address the n+1 select problem, but will cause pulling extra data from the database (and extra work in Python).

I thought `contains_eager` might be relevant; however, I only see it being mentioned in the case of joined loads.

The reason I am looking for this functionality is I am defining a graphql API that looks like the following:
type query {
as: [A!!
}

type A {
    ...
bs(some_field: String): [B!]!
}

type B {
    ...
some_field: String!
}
where I would like to be able to specify a filter on the `bs` relationship from `A`. I would ideally like to 1. avoid the n+1 select issue and 2. perform the some_field filtering at the database level, and 3. leverage as much of the ORM as possible ;-)

Is it possible to do this within SQLA?


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.

agrot...@gmail.com

unread,
Sep 23, 2020, 11:17:16 AM9/23/20
to sqlalchemy
I actually don't really care that much to have the attribute remain dynamic. In fact there is only one specific filtering that I want to apply to it, but that filtering will vary from (web) request to (web) request. This is what made me think of using contains_eager.

Right now this is the best solution I have come up with, which is to define a temporary class that extends A and add to that class a new relationship with the custom filter applied. I then specify to selectinload that property. Is there a better way to do this?

# Run this code for each web request, reading some_field_value from the value the client specifies in the request:
some_field_value = ...

class ATmp(A):
    bs_temp = relationship(
        lambda: models.B,
        primaryjoin=(
            (models.A.id == models.B.a_id)
            & (models.B.some_field == some_field_value)
        ),
    )

q = return db.session.query(ATmp).options(selectinload(cls.bs_temp))
# iterate over the q (which in iterable of "A"s) and for each A, iterate over the bs_temp, which is a filtered collection of Bs.

This also leads to a warning:
SAWarning: This declarative base already contains a class with the same class name and module name as my_app.graphql.queries.ATmp, and will be replaced in the string-lookup table

though it does seem to work (I am able to avoid n+1 and do the filtering in the DB).

Mike Bayer

unread,
Sep 23, 2020, 12:21:41 PM9/23/20
to noreply-spamdigest via sqlalchemy


On Wed, Sep 23, 2020, at 4:17 PM, agrot...@gmail.com wrote:
I actually don't really care that much to have the attribute remain dynamic. In fact there is only one specific filtering that I want to apply to it, but that filtering will vary from (web) request to (web) request. This is what made me think of using contains_eager.

Right now this is the best solution I have come up with, which is to define a temporary class that extends A and add to that class a new relationship with the custom filter applied. I then specify to selectinload that property. Is there a better way to do this?

I would still use a separate relationship on the same class, you can always make a @hybrid_property that switches between the two relationships depending on what you want to do.

In version 1.4, which will be in betas as soon as I can get a huge amount of new docs written, you will have a potentially better option for this which is the PropComparator.and_() operator.  you can play with this now from git master if you were interested:


that way you'd say:

session.query(A).options(selectinload(A.bs).and_(B.some_field == value))

that might be what you're waiting for here


agrot...@gmail.com

unread,
Sep 23, 2020, 1:03:43 PM9/23/20
to sqlalchemy
Cool, yes I think that is what I am looking for. Is there any way to alias the relationship (as read only) to: 1. allow for multiple different filters of the same property and 2. make sure when I read the value (in another place in the code), I have confidence the filter was applied?

something like (I made up the syntax): 
q = session.query(A).options(selectinload(A.bs).and_(B.some_field == value).as(f'bs_filtered_by_{value}')
...
for a in q:
   for b in a.bs_filtered_by_xxxx:
       ....

Mike Bayer

unread,
Sep 23, 2020, 3:21:23 PM9/23/20
to noreply-spamdigest via sqlalchemy
A.bs only goes to the "bs" collection on an A.  there's no eagerloading that puts the collection on some other arbitrary place.  

agrot...@gmail.com

unread,
Sep 23, 2020, 3:51:12 PM9/23/20
to sqlalchemy
Understood. I was thinking though some sort of alias would be an interesting solution to the problem outlined about `contains_eager` as well:
>Keep in mind that when we load only a subset of objects into a collection, that collection no longer represents what’s actually in the database. 
In that rather than re-using a property which does have specific meaning, a new property could be created that explicitly has different meaning.

Mike Bayer

unread,
Sep 23, 2020, 4:20:01 PM9/23/20
to noreply-spamdigest via sqlalchemy
yeah I don't have a solution to that problem right now, as mapped attributes are only a class-bound concept and there is no concept of an arbitrary attribute on an object that's not associated with a class-level mapped attribute.

I think this problem long term would be solved more through some kind of @property selector that works from a class and is not specific to mapping.     

agrot...@gmail.com

unread,
Sep 23, 2020, 4:30:51 PM9/23/20
to sqlalchemy
Gotcha.

Should I be worried about this warning or the approach of defining a transient class to solve my problem (at least until 1.4 comes out):
SAWarning: This declarative base already contains a class with the same class name and module name as my_app.graphql.queries.ATmp, and will be replaced in the string-lookup table

Mike Bayer

unread,
Sep 23, 2020, 5:07:20 PM9/23/20
to noreply-spamdigest via sqlalchemy
the warning is not a big deal but you probably dont want to be making these classes on the fly.  mapping a class is not a quick operation internally, it's messy and somewhat questionable in highly concurrent situations.

agrot...@gmail.com

unread,
Feb 4, 2021, 6:34:13 PM2/4/21
to sqlalchemy
In the case of using the PropComparator.and_() operator (https://docs.sqlalchemy.org/en/14/orm/loading_relationships.html#adding-criteria-to-loader-options), is there any way to then introspect the relationship to tell what if any and_ filtering has been applied to the load?

For example on a, an instance of A:
a.bs._loader_options

so that when I go to use the collection I can tell how it has been filtered / "reduced" from what it would have been without the additional filter?

Mike Bayer

unread,
Feb 4, 2021, 7:07:25 PM2/4/21
to noreply-spamdigest via sqlalchemy
it's not simple to work with a criteria object since you would have to walk through it to make some programmatic decision about it, and you'd also have to search for it.    I can't completely guarantee this wont change so it would be better if you solved your problem differently, but here's how to see it:

s = Session(e)

s.add(A(bs=[B(), B(), B(), B(data='x'), B(data='x')]))
s.commit()

a1 = s.scalar(select(A).options(selectinload(A.bs.and_(B.data == 'x'))))

from sqlalchemy.orm import LoaderCriteriaOption

some_b = a1.bs[0]
opt = [o for o in inspect(some_b).load_options if isinstance(o, LoaderCriteriaOption)][0]
crit = opt.where_criteria


Reply all
Reply to author
Forward
0 new messages