[missing feature] joined relationship + order by: table alias problem

711 views
Skip to first unread message

Staszek

unread,
Mar 26, 2014, 1:04:38 PM3/26/14
to sqlal...@googlegroups.com, mi...@zzzcomputing.com
Hi

I have a problem with joined relationships, which I will try to describe
using an example:

class User(Base):
__tablename__ = 'users'
__table_args__ = ...

id = Column(BigInteger, autoincrement=True, primary_key=True)
...
carId = Column(BigInteger, ForeignKey('cars.id'))
...

car = relationship("Car", lazy='joined', innerjoin=True)

Querying for User now seems to generate SQL expressions like this:

SELECT ... FROM users INNER JOIN cars AS cars_1 WHERE ...

The problem is, "cars_1" alias is auto-generated and the relationship
function:
http://docs.sqlalchemy.org/en/rel_0_9/orm/relationships.html#sqlalchemy.orm.relationship
does not seem to provide any means to specify another name!

This leads to problems, for example if you want to build such a joined
query and then apply order_by(Car.model) to it: order_by(Car.model)
turns into "ORDER BY cars.model" which is an error because cars table in
this query is used with an auto-generated alias "cars_1"! The proper
ORDER BY clause would be: "ORDER BY cars_1.model".

And so one either has to "guess" this "cars_1" alias (but it does not
seem to be documented) or resource to things like explicit join(),
options() and contains_eager():

dbSession.query(User).join(User.car).options(contains_eager(User.car))

This sucks, especially because you must rewrite this query everywhere:
if you want to select companies with users with cars and
order_by(Car.model), you have to use explicit join()s to cars and
contains_eager() stuff. This makes User.car relationship basically
unusable with order_by().

I can see 2 possible fixes to this problem:

1. specify an 'join-alias' argument in relationship(), to be used with
lazy='joined', so that you have control over this "cars_1" alias and
could specify another name
2. make order_by check what table alias to use (instead of just blindly
copying the table name).

What do you think?

Or maybe I am missing something?...

--
http://people.eisenbits.com/~stf/
http://www.eisenbits.com/

OpenPGP: 80FC 1824 2EA4 9223 A986 DB4E 934E FEA0 F492 A63B

Michael Bayer

unread,
Mar 26, 2014, 1:16:15 PM3/26/14
to sqlal...@googlegroups.com

On Mar 26, 2014, at 1:04 PM, Staszek <stf.lis...@eisenbits.com> wrote:

> Hi
>
> I have a problem with joined relationships, which I will try to describe
> using an example:
>
>
> car = relationship("Car", lazy='joined', innerjoin=True)
>
> Querying for User now seems to generate SQL expressions like this:
>
> SELECT ... FROM users INNER JOIN cars AS cars_1 WHERE ...
>
> The problem is, "cars_1" alias is auto-generated and the relationship
> function:
> http://docs.sqlalchemy.org/en/rel_0_9/orm/relationships.html#sqlalchemy.orm.relationship
> does not seem to provide any means to specify another name!
>
> This leads to problems, for example if you want to build such a joined
> query and then apply order_by(Car.model) to it: order_by(Car.model)

It’s not totally clear which of the common use cases you have here, pick one:

1. I want to emit a JOIN from User to Car and order by Car.model - don’t use eager loading, use query.join():

http://docs.sqlalchemy.org/en/rel_0_9/faq.html#i-m-using-joinedload-or-lazy-false-to-create-a-join-outer-join-and-sqlalchemy-is-not-constructing-the-correct-query-when-i-try-to-add-a-where-order-by-limit-etc-which-relies-upon-the-outer-join

2. I want to load User objects and eagerly load the User.cars collection, and the Car collection should be ordered by “model” in all cases: use relationship()->order_by:

http://docs.sqlalchemy.org/en/rel_0_9/orm/relationships.html?highlight=relationship#sqlalchemy.orm.relationship
http://docs.sqlalchemy.org/en/rel_0_9/orm/relationships.html?highlight=relationship#sqlalchemy.orm.relationship.params.order_by


3. I want to load User objects and eagerly load the Car objects, but only in this *one particular instance* do i want the ordering, otherwise I want (some other? no?) ordering. Well for that we have the join() + contains_eager().

3a: no, I actually have use case #3, I don’t want it ordered by “model” most of the time, only occasionally, and I don’t like typing out two functions, I only want to type one function. OK, well I can see how maybe someday the XYZload() functions could allow an “order_by” feature, that might be nice (also I will gladly implement this feature ASAP if you’d like to sponsor it). For now you need to just use regular Python:

def joinedload_plus_order_by(query, relationship, ordering):
return query.join(relationship).options(contains_eager(relationship)).order_by(ordering)

example:

q = sess.query(User)
q = joinedload_plus_order_by(q, User.cars, Car.model)

I’d have to disagree with your assertions that “relationship is useless” as we are totally using User.cars here to formulate the JOIN, as well as “order by blindly copies the table name”, that’s not at all how it works. For some insight on the aliasing applied by eager loading please read http://docs.sqlalchemy.org/en/rel_0_9/orm/loading.html#the-zen-of-eager-loading .



Staszek

unread,
Mar 26, 2014, 1:56:59 PM3/26/14
to sqlal...@googlegroups.com, Michael Bayer
On 2014-03-26 18:16, Michael Bayer wrote:
> It's not totally clear which of the common use cases you have here, pick one:
>
> 1. I want to emit a JOIN from User to Car and order by Car.model - don't use eager loading, use query.join():

This one.

> http://docs.sqlalchemy.org/en/rel_0_9/faq.html#i-m-using-joinedload-or-lazy-false-to-create-a-join-outer-join-and-sqlalchemy-is-not-constructing-the-correct-query-when-i-try-to-add-a-where-order-by-limit-etc-which-relies-upon-the-outer-join

Ok but this will result in duplicate joins, right? It does not look too
efficient, does it?

It was discussed here:
http://stackoverflow.com/questions/22490777/sqlalchemy-how-to-order-query-results-order-by-on-a-relationships-field-re
. I spotted duplicate joins, which I wanted to avoid, and was advised to
use options(contains_eager(...)).

Is there any better way to get rid of those duplicate joins? Or are we
supposed to live with it?

Michael Bayer

unread,
Mar 26, 2014, 2:12:55 PM3/26/14
to sqlal...@googlegroups.com

On Mar 26, 2014, at 1:56 PM, Staszek <stf.lis...@eisenbits.com> wrote:

> On 2014-03-26 18:16, Michael Bayer wrote:
>> It's not totally clear which of the common use cases you have here, pick one:
>>
>> 1. I want to emit a JOIN from User to Car and order by Car.model - don't use eager loading, use query.join():
>
> This one.
>
>> http://docs.sqlalchemy.org/en/rel_0_9/faq.html#i-m-using-joinedload-or-lazy-false-to-create-a-join-outer-join-and-sqlalchemy-is-not-constructing-the-correct-query-when-i-try-to-add-a-where-order-by-limit-etc-which-relies-upon-the-outer-join
>
> Ok but this will result in duplicate joins, right? It does not look too
> efficient, does it?
>
> It was discussed here:
> http://stackoverflow.com/questions/22490777/sqlalchemy-how-to-order-query-results-order-by-on-a-relationships-field-re
> . I spotted duplicate joins, which I wanted to avoid, and was advised to
> use options(contains_eager(...)).

If you have a lazy=False (or ‘joined’) on User.cars, that means, “eager load this relationship every time”. However contains_eager() will supersede that joined loader as you are specifying a specific form of load for the User.cars relationship.

given this mapping:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class A(Base):
__tablename__ = 'a'

id = Column(Integer, primary_key=True)
bs = relationship("B", backref=backref('a', lazy='joined'))

class B(Base):
__tablename__ = 'b'

id = Column(Integer, primary_key=True)
a_id = Column(Integer, ForeignKey('a.id’))


sess.query(B).all() produces:

SELECT b.id AS b_id, b.a_id AS b_a_id, a_1.id AS a_1_id
FROM b LEFT OUTER JOIN a AS a_1 ON a_1.id = b.a_id

sess.query(B).join(B.a).order_by(A.id).options(contains_eager(B.a)).all() produces:

SELECT b.id AS b_id, b.a_id AS b_a_id, a.id AS a_id
FROM b JOIN a ON a.id = b.a_id ORDER BY a.id

if you are seeing something different then we need to look more specifically at what you’re doing.

Reply all
Reply to author
Forward
0 new messages