If path in joinedload() include self-referential relationship, should I alias() every table after that relation ?

Юрий Пайков

unread,

Jul 15, 2015, 2:42:07 AM7/15/15

to sqlal...@googlegroups.com

I have an example here https://gist.github.com/ojomio/aa5eca3bea03d21e00e8. This code issue exactly one query and load everything at one time

What I am asking about is line https://gist.github.com/ojomio/aa5eca3bea03d21e00e8#file-gistfile1-py-L65

If I don not use alias second_B and simply write

        ).join(
            AWithChildren.bs
        ).options(
            contains_eager(B.as_).
            contains_eager(A.children, alias=AWithChildren).
            contains_eager(AWithChildren.bs).
            joinedload(B.cs)
        )

Then SQLAlchemy issue another query on C table, apparently not matching expression AWithChildren.bs and B

So my question is - if there are many other tables after A.children - should use alias() for every one and mention them like

.joinedload(
    PreviousTable.relation,
    alias=SomeTableAlias
).

?

Mike Bayer

unread,

Jul 15, 2015, 1:32:42 PM7/15/15

to sqlal...@googlegroups.com

the use here of .joinedload() on the end of a series of contains_eager() calls is already very unusual, and I'm somewhat surprised it works correctly in the first case as this is not a use case that's really tested. As for the case of second_b not being present, this is not surprising as AWithChildren.bs refers to "B", which is already present in the query as the primary entity, for a separate collection of B to be present it needs to be aliased. the join(AWithChildren.bs) is the same as join(B, AWithChildren.bs), and you'd never want to say what is essentially session.query(B).join(Q).join(B) - you need an alias for each subsequent occurrence of B.

--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Юрий Пайков

unread,

Jul 15, 2015, 4:40:18 PM7/15/15

to sqlal...@googlegroups.com

Ok, that is clear now.

Eagerloading of tables occurring more than once in a query is a bit confusing for me as it is not well-documented,

for example that contains_eager() needs alias=<parameter> in order to work properly for a second occurrence of a table. If I might I would advise you to shed some light on the usage in a documentation.

OK, moving on

When using

.join(relation, aliased=True)

it is stated that next .fliter() call will refer to the second occurrence (unless .reset_joinpoint() is called).

Here I have an example of similar behavior. Consider my slightly modified example.

it seems contains_eager(), when given not the full path from the first occurrence(B-A-A-B-C) but rather only the portion of path from the last mention of table(B-C), populates the collection with the data from the latest mentioned instance of that table

Is it the desired outcome or should it be considered a bug? Could this pattern change in future?

And, as always, thank you for you patience and attention brought to my problem

среда, 15 июля 2015 г., 22:32:42 UTC+5 пользователь Michael Bayer написал:

Mike Bayer

unread,

Jul 15, 2015, 7:01:01 PM7/15/15

to sqlal...@googlegroups.com

On 7/15/15 4:40 PM, Юрий Пайков wrote:

Ok, that is clear now.

Eagerloading of tables occurring more than once in a query is a bit confusing for me as it is not well-documented,

for example that contains_eager() needs alias=<parameter> in order to work properly for a second occurrence of a table. If I might I would advise you to shed some light on the usage in a documentation.

it really comes down to an understanding of the basic concept. The SQL itself must return the correct results. You cannot refer to a table twice in the same SQL statement in two different FROM contexts; all but one of them must be aliased for them to be handled separately. contains_eager() mererly refers the ORM to the particular columns in the result set that represent the data for this collection.

If you turn on echo='debug' in your create_engine() statement, you can watch the rows as they come in. To load a series of B rows, which also contain a series of additional B objects represented via JOIN or LEFT OUTER JOIN, requires aliasing.

OK, moving on

When using
.join(relation, aliased=True)
it is stated that next .fliter() call will refer to the second occurrence (unless .reset_joinpoint() is called).

aliased=True is an old feature that I think is more confusing than it is worth for end-user use of query(). I'd not recommend using it unless you have some case where it is absolutely necessary.

Here I have an example of similar behavior. Consider my slightly modified example.

it seems contains_eager(), when given not the full path from the first occurrence(B-A-A-B-C) but rather only the portion of path from the last mention of table(B-C), populates the collection with the data from the latest mentioned instance of that table

That seems incorrect. There is no join from B->cs stated here, there is only a join from second_B->cs. Therefore contains_eager() should be given second_B.cs, not B.cs. But also, yes, the full path of contains_eager() must be stated here, because otherwise the query will never populate B.as_, or A.children, or A.bs, and therefore never get to B.cs; those columns are thrown away.

Is it the desired outcome or should it be considered a bug? Could this pattern change in future?

That the eager loading system requires full explicitness in order to know what to do is exactly why it is so flexible.