If path in joinedload() include self-referential relationship, should I alias() every table after that relation ?

420 views
Skip to first unread message

Юрий Пайков

unread,
Jul 15, 2015, 2:42:07 AM7/15/15
to sqlal...@googlegroups.com
I have an example here https://gist.github.com/ojomio/aa5eca3bea03d21e00e8. This code issue exactly one query and load everything at one time

If I don not use alias second_B and simply write 
        ).join(
           AWithChildren.bs
        ).options(
            contains_eager(B.as_).
            contains_eager(A.children, alias=AWithChildren).
            contains_eager(AWithChildren.bs).
            joinedload(B.cs)
        )

Then  SQLAlchemy issue another query on C table, apparently not matching expression AWithChildren.bs and B
So my question is - if there are many other tables after A.children - should use alias() for every one and mention them like
.joinedload(
   
PreviousTable.relation,
   
alias=
SomeTableAlias
).
?

Mike Bayer

unread,
Jul 15, 2015, 1:32:42 PM7/15/15
to sqlal...@googlegroups.com
the use here of .joinedload() on the end of a series of contains_eager() calls is already very unusual, and I'm somewhat surprised it works correctly in the first case as this is not a use case that's really tested.   As for the case of second_b not being present, this is not surprising as AWithChildren.bs refers to "B", which is already present in the query as the primary entity, for a separate collection of B to be present it needs to be aliased.   the join(AWithChildren.bs) is the same as join(B, AWithChildren.bs), and you'd never want to say what is essentially session.query(B).join(Q).join(B) - you need an alias for each subsequent occurrence of B.




--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Юрий Пайков

unread,
Jul 15, 2015, 4:40:18 PM7/15/15
to sqlal...@googlegroups.com
Ok, that is clear now. 

Eagerloading of tables occurring more than once in a query is a bit confusing for me as it is not well-documented, 
for example that contains_eager() needs alias=<parameter> in order to work properly for a second occurrence of a table. If I might I would advise you to shed some light on the usage in a documentation.

OK, moving on
When using
 .join(relation, aliased=True)
it is stated that next .fliter() call will refer to the second occurrence (unless  .reset_joinpoint() is called). 

Here I have an example of similar behavior. Consider my slightly modified example.
it seems contains_eager(), when given not the full path from the first occurrence(B-A-A-B-C) but rather only the portion of path from the last mention of table(B-C), populates the collection with the data from the latest mentioned instance of that table

Is it the desired outcome or should it be considered a bug? Could this pattern change in future?
And, as always, thank you for you patience and attention brought to my problem

среда, 15 июля 2015 г., 22:32:42 UTC+5 пользователь Michael Bayer написал:

Mike Bayer

unread,
Jul 15, 2015, 7:01:01 PM7/15/15
to sqlal...@googlegroups.com


On 7/15/15 4:40 PM, Юрий Пайков wrote:
Ok, that is clear now. 

Eagerloading of tables occurring more than once in a query is a bit confusing for me as it is not well-documented, 
for example that contains_eager() needs alias=<parameter> in order to work properly for a second occurrence of a table. If I might I would advise you to shed some light on the usage in a documentation.

it really comes down to an understanding of the basic concept.   The SQL itself must return the correct results.  You cannot refer to a table twice in the same SQL statement in two different FROM contexts; all but one of them must be aliased for them to be handled separately.      contains_eager() mererly refers the ORM to the particular columns in the result set that represent the data for this collection.

If you turn on echo='debug' in your create_engine() statement, you can watch the rows as they come in.     To load a series of B rows, which also contain a series of additional B objects represented via JOIN or LEFT OUTER JOIN, requires aliasing.  



OK, moving on
When using
 .join(relation, aliased=True)
it is stated that next .fliter() call will refer to the second occurrence (unless  .reset_joinpoint() is called).

aliased=True is an old feature that I think is more confusing than it is worth for end-user use of query().   I'd not recommend using it unless you have some case where it is absolutely necessary.



Here I have an example of similar behavior. Consider my slightly modified example.
it seems contains_eager(), when given not the full path from the first occurrence(B-A-A-B-C) but rather only the portion of path from the last mention of table(B-C), populates the collection with the data from the latest mentioned instance of that table
That seems incorrect.  There is no join from B->cs stated here, there is only a join from second_B->cs.   Therefore contains_eager() should be given second_B.cs, not B.cs.  But also, yes, the full path of contains_eager() must be stated here, because otherwise the query will never populate B.as_, or A.children, or A.bs, and therefore never get to B.cs; those columns are thrown away.



Is it the desired outcome or should it be considered a bug? Could this pattern change in future?

That the eager loading system requires full explicitness in order to know what to do is exactly why it is so flexible.


And, as always, thank you for you patience and attention brought to my problem
my pleasure!
Reply all
Reply to author
Forward
0 new messages