subqueryload with multiple joined table inheritance hierarchies

342 views
Skip to first unread message

Douglas Russell

unread,
Aug 10, 2016, 6:18:27 PM8/10/16
to sqlalchemy
Hi,

I have two (well, more than two with more than two levels each, but this is the simplest recreation of the problem) parallel joined-table inheritance hierarchies

A -> A2
B -> B2

B.a is a relationship to A, backref in A.bs.

The problem arises when I want to query this whole structure back with subqueries. The desired number of queries in this case would be twoish, depending on how the polymorphic queries are done. At any rate, the desire is to remove any lazy loading at all.

The problem lies in formulating a subqueryload which will do this. My first thought was:

session.query(A2).options(subqueryload(A2.bs)).all()

The problem with this is that what I wanted was all the A2s and all the B2s, but what I get is all the A2s, and all the Bs. When I access a property of B2, it lazy loads that single record.

I think the best solution is something along the lines of using a combination of subqueryload and joinedload (or maybe with_polymorphic?) to say: Give me all the A2s with a subquery for all the B2s linked to it. I can't figure out how to do this as the arguments to subqueryload and joinedload are relationships which I don't currently have.


Is there a clever way to handle this?

The only thing I've done which works is to add a relationship between B to B2:

b = relationship('B', backref='b2')

Then I can do:

r = session.query(A2).options(subqueryload(A2.bs).joinedload(B.b2)).all()

but it would be better if I didn't have to go and add these relationships to the existing models as there are several levels, repositories and (most hinderingly) jurisdictions in play in my real project. It seems likely that there is a clever way as SQLAlchemy does such a good job with the polymorphic queries usually.

Cheers,

Douglas

Douglas Russell

unread,
Aug 10, 2016, 6:26:47 PM8/10/16
to sqlalchemy
Scratch that last bit about the joinedload working. It does succeed in doing the joinedload to get all the data in one go, but it still lazy loads the B2 fields despite them already having been queried.

Douglas Russell

unread,
Aug 10, 2016, 6:29:48 PM8/10/16
to sqlalchemy
Finally, you can't see it on the Gist, but there is a branch on the repo if you clone it called 'more_relations' with the code for my joinedload test.

Sorry for 3 posts where 1 would have been better!

Thanks


On Wednesday, 10 August 2016 18:18:27 UTC-4, Douglas Russell wrote:

Mike Bayer

unread,
Aug 10, 2016, 6:55:30 PM8/10/16
to sqlal...@googlegroups.com


On 08/10/2016 06:18 PM, Douglas Russell wrote:
> Hi,
>
> I have two (well, more than two with more than two levels each, but this
> is the simplest recreation of the problem) parallel joined-table
> inheritance hierarchies
>
> A -> A2
> B -> B2
>
> B.a is a relationship to A, backref in A.bs.
>
> The problem arises when I want to query this whole structure back with
> subqueries. The desired number of queries in this case would be twoish,
> depending on how the polymorphic queries are done. At any rate, the
> desire is to remove any lazy loading at all.
>
> The problem lies in formulating a subqueryload which will do this. My
> first thought was:
>
> |
> session.query(A2).options(subqueryload(A2.bs)).all()
> |
>
> The problem with this is that what I wanted was all the A2s and all the
> B2s, but what I get is all the A2s, and all the Bs. When I access a
> property of B2, it lazy loads that single record.

the use case here is accommodated by a per-load with_polymorphic:

from sqlalchemy.orm import with_polymorphic

r = session.query(A2).options(
subqueryload(A2.bs.of_type(with_polymorphic(B, [B2], flat=True)))
).all()


that's also your solution for joinedload(). the "flat=True" will reduce
the use of SELECT in favor of right-nested joins.

probably a link in the "loading objects" section should cross-reference
this, it's discussed in the mapper inheritance config section:

http://docs.sqlalchemy.org/en/latest/orm/inheritance.html#eager-loading-of-specific-or-polymorphic-subtypes



>
> I think the best solution is something along the lines of using a
> combination of subqueryload and joinedload (or maybe with_polymorphic?)
> to say: Give me all the A2s with a subquery for all the B2s linked to
> it. I can't figure out how to do this as the arguments to subqueryload
> and joinedload are relationships which I don't currently have.
>
> Minimal Test
> Case: https://gist.github.com/dpwrussell/508e8b33fe92434eafe001ee806f355a
>
> Is there a clever way to handle this?
>
> The only thing I've done which works is to add a relationship between B
> to B2:
>
> |
> b = relationship('B', backref='b2')
> |
>
> Then I can do:
>
> |
> r = session.query(A2).options(subqueryload(A2.bs).joinedload(B.b2)).all()
> |
>
> but it would be better if I didn't have to go and add these
> relationships to the existing models as there are several levels,
> repositories and (most hinderingly) jurisdictions in play in my real
> project. It seems likely that there is a clever way as SQLAlchemy does
> such a good job with the polymorphic queries usually.
>
> Cheers,
>
> Douglas
>
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

Douglas Russell

unread,
Aug 11, 2016, 9:53:10 AM8/11/16
to sqlalchemy
of_type was what I was searching for. Thanks.

Unfortunately, it did not yield the desired results when I extended the hierarchy to 3 levels (C->C2 with a relation between B and C).

I've updated the Gist.

B+B2 is successfully queried, but C+C2 is not. This will then result in lazily generated queries when accessing C2.

In fact, it's actually a little more confusing even than that. With that query, it does (in the original query)
Query for all A+A2
Query for all B+B2 (from the subqueryload)

Then once I start to print the hierarchy out, lazy queries:
Query for all C
Query for each C2

Cheers,

Douglas

On Wednesday, 10 August 2016 18:18:27 UTC-4, Douglas Russell wrote:

Mike Bayer

unread,
Aug 11, 2016, 11:51:39 AM8/11/16
to sqlal...@googlegroups.com


On 08/11/2016 09:53 AM, Douglas Russell wrote:
> of_type was what I was searching for. Thanks.
>
> Unfortunately, it did not yield the desired results when I extended the
> hierarchy to 3 levels (C->C2 with a relation between B and C).
>
> I've updated the Gist.
>
> B+B2 is successfully queried, but C+C2 is not. This will then result in
> lazily generated queries when accessing C2.
>
> In fact, it's actually a little more confusing even than that. With that
> query, it does (in the original query)
> Query for all A+A2
> Query for all B+B2 (from the subqueryload)
>
> Then once I start to print the hierarchy out, lazy queries:
> Query for all C
> Query for each C2


I'm not sure how to document this (I guess just add another example) but
once you've done of_type(), that's now the type you're on and if you
chain from there, it has to be in terms of that new type. The
with_polymorphic() object you've made is that type (note the subclass
namespace that's on the object):

b_b2 = with_polymorphic(B, [B2], flat=True)

r = session.query(A2).options(
subqueryload(A2.bs.of_type(b_b2)).subqueryload(b_b2.B2.cs.of_type(
with_polymorphic(C, [C2], flat=True))
)
).all()

Douglas Russell

unread,
Aug 11, 2016, 3:35:29 PM8/11/16
to sqlalchemy
Of course! That makes sense.

Does SQLAlchemy just discard the subqueryload chain parts that are on the wrong type? It didn't seem to generate any SQL for those when they were wrong.

An example of this in the docs would be a great idea I think as I think it's pretty tough to explain this in the abstract as you say!

Thanks again,

Douglas

Mike Bayer

unread,
Aug 11, 2016, 5:14:32 PM8/11/16
to sqlal...@googlegroups.com


On 08/11/2016 03:35 PM, Douglas Russell wrote:
> Of course! That makes sense.
>
> Does SQLAlchemy just discard the subqueryload chain parts that are on
> the wrong type? It didn't seem to generate any SQL for those when they
> were wrong.

it validates the path as far as classes / attributes, but I think when
subclassing and aliasing is used it might be considering just the types,
e.g. a B2 and a with_polymorphic(B, [B2]) are of compatible types.
There's a lot of ways these paths get built up (like with string names
and such), as well as all kinds of "it's worked this way for years"
kinds of things that may not always be present in testing, so more error
checking in there would have to be added carefully.
> > an email to sqlalchemy+...@googlegroups.com <javascript:>
> > <mailto:sqlalchemy+...@googlegroups.com <javascript:>>.
> > To post to this group, send email to sqlal...@googlegroups.com
> <javascript:>
> > <mailto:sqlal...@googlegroups.com <javascript:>>.
> <https://groups.google.com/group/sqlalchemy>.
> > For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
>
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.

Douglas Russell

unread,
Aug 11, 2016, 7:08:57 PM8/11/16
to sqlal...@googlegroups.com
Thanks for your help, Mike!

Douglas

You received this message because you are subscribed to a topic in the Google Groups "sqlalchemy" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.

Douglas Russell

unread,
Aug 12, 2016, 5:07:40 PM8/12/16
to sqlalchemy
Sorry to return again so soon, but I think I have uncovered a bug in the subqueryload.

Continuing my example above. I had some with_polymorphic subqueries, followed by a non-polymorphic subquery.

r = session.query(
    B
).options(
    subqueryload(
        B.cs.of_type(c_c2)
    ).subqueryload(
        c_c2.C2.ds
    )
).all()

In the simplest case I have:
B
C->C2
D

What I expected to get back from the query was all the Bs, with all the C2s(including C) for those Bs, with all the Ds for those Cs.
Unfortunately what happens (for any significant query) is that the database explodes and downs the machine by using all available memory instead. More concisely, it does a query which results in: All the Bs, with all the C2s(including C) for those Bs (good so far!) with all the Ds for all Cs!

In SQL, here is what I would expect:

SELECT t_b.id AS t_b_id
FROM t_b

SELECT t_c_1.type AS t_c_1_type, t_c_1.id AS t_c_1_id, t_c_1.b_id AS t_c_1_b_id, t_c2_1.id AS t_c2_1_id, anon_1.t_b_id AS anon_1_t_b_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN (t_c AS t_c_1 LEFT OUTER JOIN t_c2 AS t_c2_1 ON t_c_1.id = t_c2_1.id) ON anon_1.t_b_id = t_c_1.b_id ORDER BY anon_1.t_b_id

SELECT t_d_1.id AS t_d_1_id, t_d_1.c_id AS t_d_1_c_id, t_c_1.id AS t_c_1_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN t_c AS t_c_1 ON anon_1.t_b_id = t_c_1.b_id JOIN t_d AS t_d_1 ON t_c_1.id = t_d_1.c_id ORDER BY t_c_1.id

The actual SQL generated:

SELECT t_b.id AS t_b_id
FROM t_b

SELECT t_c_1.type AS t_c_1_type, t_c_1.id AS t_c_1_id, t_c_1.b_id AS t_c_1_b_id, t_c2_1.id AS t_c2_1_id, anon_1.t_b_id AS anon_1_t_b_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN (t_c AS t_c_1 LEFT OUTER JOIN t_c2 AS t_c2_1 ON t_c_1.id = t_c2_1.id) ON anon_1.t_b_id = t_c_1.b_id ORDER BY anon_1.t_b_id

SELECT t_d.id AS t_d_id, t_d.c_id AS t_d_c_id, t_c_1.id AS t_c_1_id
FROM (SELECT t_b.id AS t_b_id
FROM t_b) AS anon_1 JOIN t_c AS t_c_1 ON anon_1.t_b_id = t_c_1.b_id, t_c AS t_c_2 LEFT OUTER JOIN t_c2 AS t_c2_1 ON t_c_2.id = t_c2_1.id JOIN t_d ON t_c_2.id = t_d.c_id ORDER BY t_c_1.id

I was able to work around this by doing:

r = session.query(
    B
).options(
    subqueryload(
        B.cs.of_type(c_c2)
    ).subqueryload(
        c_c2.C2.ds.of_type(D)
    )
).all()

but that didn't seem to make sense to me as there was no polymorphism at work in the type D so why would it need to have of_type?

Even if that is necessary, I'd have expected that the query would have failed or dropped that part of the chain (like I was seeing before with parts that could not successfully chain).


My knowledge of how subqueryload is extremely limited so it's possible I'm doing something wrong here, but having made this test case, what I was doing seems logical so I figured it was maybe a bug.

Cheers,

Douglas
>     > <mailto:sqlalchemy+unsub...@googlegroups.com <javascript:>>.

>     > To post to this group, send email to sqlal...@googlegroups.com
>     <javascript:>
>     > <mailto:sqlal...@googlegroups.com <javascript:>>.
>     > Visit this group at https://groups.google.com/group/sqlalchemy
>     <https://groups.google.com/group/sqlalchemy>.
>     > For more options, visit https://groups.google.com/d/optout
>     <https://groups.google.com/d/optout>.
>
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send

> To post to this group, send email to sqlal...@googlegroups.com

> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to a topic in the Google Groups "sqlalchemy" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe.
To unsubscribe from this group and all its topics, send an email to sqlalchemy+unsubscribe@googlegroups.com.

Mike Bayer

unread,
Aug 12, 2016, 11:57:31 PM8/12/16
to sqlal...@googlegroups.com
that's something pretty much called a "comma join" and is a very common
symptom of issues in SQLAlchemy eager loading and querying due to the
heavy emphasis on connecting things together into joins. when the
connection fails, you get "FROM a JOIN b, b JOIN c" and cartesian
products. Builds for the issue at
https://bitbucket.org/zzzeek/sqlalchemy/issues/3773 are going through
for 1.0 / 1.1 and will mark this as resolved once those are merged into
the repo.
> > > <mailto:sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy%2Bunsu...@googlegroups.com> <javascript:>>.
> > > To post to this group, send email to
> sqlal...@googlegroups.com
> > <javascript:>
> > > <mailto:sqlal...@googlegroups.com <javascript:>>.
> > > Visit this group at
> https://groups.google.com/group/sqlalchemy
> <https://groups.google.com/group/sqlalchemy>
> > <https://groups.google.com/group/sqlalchemy
> <https://groups.google.com/group/sqlalchemy>>.
> > > For more options, visit
> https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>
> > <https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>>.
> >
> > --
> > You received this message because you are subscribed to the Google
> > Groups "sqlalchemy" group.
> > To unsubscribe from this group and stop receiving emails from
> it, send
> > an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy%2Bunsu...@googlegroups.com>
> > <mailto:sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy%2Bunsu...@googlegroups.com>>.
> > To post to this group, send email to
> sqlal...@googlegroups.com <mailto:sqlal...@googlegroups.com>
> > <mailto:sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>>.
> You received this message because you are subscribed to a topic
> in the Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe
> <https://groups.google.com/d/topic/sqlalchemy/mu56muQI6tM/unsubscribe>.
> To unsubscribe from this group and all its topics, send an email
> to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy%2Bunsu...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
> Visit this group at https://groups.google.com/group/sqlalchemy
> <https://groups.google.com/group/sqlalchemy>.
> For more options, visit https://groups.google.com/d/optout
> <https://groups.google.com/d/optout>.
>
> --
> You received this message because you are subscribed to the Google
> Groups "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
Reply all
Reply to author
Forward
0 new messages