Disabling subqueries when using joinedload to load a polymorphic class

297 views
Skip to first unread message

Gerald Thibault

unread,
Dec 18, 2012, 3:50:20 PM12/18/12
to sqlal...@googlegroups.com
I've attached an example showing the issue. I have a class, Test, with a single polymorphic subclass, TestOne. I also have TestChild, which has a fk to Test, and a relation using that fk.

When I execute:

session.query(TestChild) \
        .options(joinedload(TestChild.parent)) \
        .get(1)

I see this query:

SELECT test_children.id AS test_children_id, test_children.test_id AS test_children_test_id, anon_1.test_id AS anon_1_test_id, anon_1.test_type AS anon_1_test_type, anon_1.test_name AS anon_1_test_name, anon_1.test1_id AS anon_1_test1_id, anon_1.test1_value1 AS anon_1_test1_value1 
FROM test_children LEFT OUTER JOIN (SELECT test.id AS test_id, test.type AS test_type, test.name AS test_name, test1.id AS test1_id, test1.value1 AS test1_value1 
FROM test LEFT OUTER JOIN test1 ON test.id = test1.id) AS anon_1 ON anon_1.test_id = test_children.test_id 

What I would like to see is this:

SELECT <whatever>
FROM test_children 
LEFT OUTER JOIN test on test_children.test_id = test.id
LEFT OUTER JOIN test1 on test1.id = test.id

How can I accomplish this? Also, the join type doesn't matter, they could be inner joins too, the important thing is getting rid of the subquery because it's completely locking up our database.
polymorphic_joinedload_test.py

Michael Bayer

unread,
Dec 18, 2012, 7:36:18 PM12/18/12
to sqlal...@googlegroups.com
This is how joining from a left side to a polymorphic or inner-joined subclass right side works - the target is always wrapped inside of a subquery, as SQLAlchemy was written without ever resorting to the syntax of "x JOIN (y JOIN z)", as this syntax was not supported by many backends when SQLA was created.  SQLite still does not support this syntax.   

Hence when using ORM-generated queries you will always get "x [LEFT OUTER] JOIN (select * FROM y [LEFT OUTER] JOIN z) AS anon_1".  

Additionally, the aspect above that is "y [LEFT OUTER] JOIN z", which is the "target" here, can be of many forms.  In the default "polymorphic" case, it's a LEFT OUTER JOIN.  If the relationship referred to "Test1" directly, then it would be an inner join, and parenthesization would be required given LEFT OUTER join from the parent.  If the Test/Test1 mapping were a concrete mapping, it wouldn't be a JOIN, it would be a SELECT .. UNION SELECT ... [UNION...], and a fully parenthesized subquery would be necessary there as well.    There are situations where the *left* side is itself a polymorphic outer join or similar, and parenthesization is necessary.

So there's all kinds of combinations that can occur between <parent> <some kind of join> <some kind of target selectable>, and SQLAlchemy's current status is to treat them all exactly the same - the right side is always wrapped in a subquery.  It took years to get this right, but the end result is that SQLAlchemy always returns the right result, in a truly enormous number of circumstances.  SQLA goes for "the right answer, always" first, performance second.

So the issue of "unwrapping" this series when possible is an advanced optimization issue.   There are tickets to attempt this in various scenarios, to "unwrap" when appropriate, to actually render the joins parenthesized directly when SQLite isn't used, but these use cases are considerably complicated.   It means the current approach of one way that always works needs to be broken down into an additional series of decisionmaking chains, where new mistakes and inconsistencies can be introduced.   It's really kind of the next level for Query, and we're probably getting close to where we can go there, but it's a long road.  Query gets major new architectural changes with each major release and we laid a ton of new groundwork in 0.8.

Right now, you can achieve an exact SQL statement using Table instances directly.  I wasn't sure if the polymorphic eager loading part would work here, but it seems to:


    test = Test.__table__
    test1 = TestOne.__table__

    query = session.query(TestChild).\
                outerjoin(test, test.c.id == TestChild.test_id).\
                outerjoin(test1, test1.c.id == test.c.id).\
                options(contains_eager(TestChild.parent))

    tc = query.filter(TestChild.id == 1).first()

YMMV with more complex scenarios, though.
Reply all
Reply to author
Forward
0 new messages