contains_eager, outerjoin, subquery, alias, and loading nested relationships of the same type

234 views
Skip to first unread message

Nick Repole

unread,
Aug 4, 2018, 11:27:09 PM8/4/18
to sqlalchemy
Hi,

I'm attempting to load some filtered relationships, and am running into issues when trying to independently load the same relationship multiple times.

As an example, I'd like to load Album.tracks, and Album.tracks.album.tracks in different ways. In the first tracks relationship, I'm trying to include all tracks with track_id >= 4, and in the second relationship every trackId < 4.

Is this possible? And if so, what load options or technique should I be using? The query generated below contains all the information I'm trying to get, it's a matter of loading that information into my models that's tripping me up.

Complete/runnable example can be found attached or below:

from sqlalchemy import create_engine
from sqlalchemy.orm import aliased, sessionmaker, relationship, contains_eager
from sqlalchemy import Column, ForeignKey, Integer, Unicode
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()
metadata = Base.metadata


class Album(Base):

    __tablename__ = 'Album'

    album_id = Column("AlbumId", Integer, primary_key=True)
    title = Column("Title", Unicode(160), nullable=False)


class Track(Base):

    __tablename__ = 'Track'

    track_id = Column("TrackId", Integer, primary_key=True)
    name = Column("Name", Unicode(200), nullable=False)
    album_id = Column("AlbumId", ForeignKey('Album.AlbumId'), index=True)

    album = relationship('Album', backref="tracks")


engine = create_engine('sqlite://', echo=True)
Base.metadata.create_all(engine, checkfirst=True)
session_cls = sessionmaker(bind=engine)
session = session_cls()


# Set up the database with Chinook data
session.add(Album(album_id=1, title="For Those About To Rock We Salute You"))
session.add(Album(album_id=2, title="Balls to the Wall"))
session.add(Album(album_id=3, title="Restless and Wild"))
session.add(Album(album_id=4, title="Let There Be Rock"))
session.add(Track(track_id=1, name="For Those About To Rock", album_id=1))
session.add(Track(track_id=2, name="Balls to the Wall", album_id=2))
session.add(Track(track_id=3, name="Fast As a Shark", album_id=3))
session.add(Track(track_id=4, name="Restless and Wild", album_id=3))
session.add(Track(track_id=5, name="Princess of the Dawn", album_id=3))
session.add(Track(track_id=6, name="Put The Finger On You", album_id=1))
session.add(Track(track_id=7, name="Let's Get It Up", album_id=1))
session.add(Track(track_id=8, name="Inject The Venom", album_id=1))
session.add(Track(track_id=9, name="Snowballed", album_id=1))
session.add(Track(track_id=10, name="Evil Walks", album_id=1))
session.add(Track(track_id=11, name="C.O.D.", album_id=1))
session.add(Track(track_id=12, name="Breaking The Rules", album_id=1))
session.add(Track(track_id=13, name="Night Of The Long Knives", album_id=1))
session.add(Track(track_id=14, name="Spellbound", album_id=1))
session.add(Track(track_id=15, name="Go Down"    , album_id=4))
session.add(Track(track_id=16, name="Dog Eat Dog", album_id=4))
session.add(Track(track_id=17, name="Let There Be Rock", album_id=4))
session.add(Track(track_id=18, name="Bad Boy Boogie", album_id=4))
session.add(Track(track_id=19, name="Problem Child", album_id=4))
session.add(Track(track_id=20, name="Overdose", album_id=4))
session.add(Track(track_id=21, name="Hell Aint A Bad Place To Be", album_id=4))
session.add(Track(track_id=22, name="Whole Lotta Rosie", album_id=4))
session.commit()


# Query set up
query = session.query(Album).filter(Album.album_id == 1)
track_1_alias = aliased(Track, name="Track1")
album_2_alias = aliased(Album, name="Album2")
track_2_alias = aliased(Track, name="Track2")

# Subqueries we want to  load our relationships from
# This will match 9 tracks for AlbumId==1
track_1_subquery = session.query(track_1_alias).filter(
    track_1_alias.track_id >= 4).subquery("Track1")
album_2_subquery = session.query(album_2_alias).subquery("Album2")
# This should only match 1 track for AlbumId==1
track_2_subquery = session.query(track_2_alias).filter(
    track_2_alias.track_id < 4).subquery("Track2")

# Set up our joins
query = query.outerjoin(track_1_subquery, Album.tracks)
query = query.outerjoin(album_2_subquery, track_1_alias.album)
# Using album_2_alias.tracks for the second argument doesn't seem to work here.
query = query.outerjoin(track_2_subquery,
                        album_2_alias.album_id == track_2_alias.album_id)


# Options - These don't work all at once, but seem fine separately
option1 = contains_eager("tracks", alias=track_1_subquery)
option2 = contains_eager("tracks.album", alias=album_2_subquery)
option3 = contains_eager("tracks.album.tracks", alias=track_2_subquery)


# Attempt to load just tracks
option3_query = query.options(option3)
results = option3_query.all()
assert(len(results) == 1)
# len here should be 9, but it's erroneously filtered to 1
assert(len(results[0].tracks) == 1)
assert(len(results[0].tracks[0].album.tracks) == 1)
test.py

Mike Bayer

unread,
Aug 6, 2018, 10:58:33 AM8/6/18
to sqlal...@googlegroups.com
On Sat, Aug 4, 2018 at 11:27 PM, Nick Repole <nick....@gmail.com> wrote:
> Hi,
>
> I'm attempting to load some filtered relationships, and am running into
> issues when trying to independently load the same relationship multiple
> times.
>
> As an example, I'd like to load Album.tracks, and Album.tracks.album.tracks
> in different ways. In the first tracks relationship, I'm trying to include
> all tracks with track_id >= 4, and in the second relationship every trackId
> < 4.
>
> Is this possible? And if so, what load options or technique should I be
> using? The query generated below contains all the information I'm trying to
> get, it's a matter of loading that information into my models that's
> tripping me up.
>
> Complete/runnable example can be found attached or below:

great test case. What you're doing is not possible because you are
looking at the "album.tracks" collection in two different contexts,
and expecting them to have different contents, but this is the same
Album object and the same "tracks" collection:

(Pdb) results[0]
<__main__.Album object at 0x7f2e1eea3898>
(Pdb) results[0].tracks[0].album
<__main__.Album object at 0x7f2e1eea3898>
(Pdb) results[0].tracks is results[0].tracks[5].album.tracks
True

this is how the ORM identity map works, for any class / primary key,
there's only one of them.

now since this is SQLAlchemy, there are ways to do this, but it means
you'd have two different Album objects with the same primary key in
play. if you were to change both of them in a session and try to
flush, you will get conflicts and/or errors.

Here's the quickest way, note thre's now Album.tracks.parent_album for
the real relationship:

class Album(Base):

__tablename__ = 'Album'

album_id = Column("AlbumId", Integer, primary_key=True)
title = Column("Title", Unicode(160), nullable=False)
tracks = relationship("Track", backref="parent_album")


class SubAlbum(Album):
pass


class Track(Base):

__tablename__ = 'Track'

track_id = Column("TrackId", Integer, primary_key=True)
name = Column("Name", Unicode(200), nullable=False)
album_id = Column("AlbumId", ForeignKey('Album.AlbumId'), index=True)
album = relationship('SubAlbum')


query = session.query(Album).filter(Album.album_id == 1)
track_1_alias = aliased(Track, name="Track1")
album_2_alias = aliased(SubAlbum, name="Album2")
track_2_alias = aliased(Track, name="Track2")

then your assertion passes.


as you can see it is not possible for album<->track to be a
bidirectional relationship and refer to two different Album objects at
the same time. you have to choose one or the other.
> --
> SQLAlchemy -
> The Python SQL Toolkit and Object Relational Mapper
>
> http://www.sqlalchemy.org/
>
> To post example code, please provide an MCVE: Minimal, Complete, and
> Verifiable Example. See http://stackoverflow.com/help/mcve for a full
> description.
> ---
> You received this message because you are subscribed to the Google Groups
> "sqlalchemy" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to sqlalchemy+...@googlegroups.com.
> To post to this group, send email to sqlal...@googlegroups.com.
> Visit this group at https://groups.google.com/group/sqlalchemy.
> For more options, visit https://groups.google.com/d/optout.

Nick Repole

unread,
Aug 6, 2018, 8:01:27 PM8/6/18
to sqlalchemy
Appreciate the response, that all makes sense to me. In hindsight, I'm wondering how I ever expected there to be two Album objects with the same identifier and different relationship collections...

On a conceptual level, I wonder how feasible it'd be to have objects expunged immediately as they're loaded. That way the parent Album with album_id==1 could load, have it's tracks load, then when the child album is loaded it wouldn't be the same Album object despite being album_id==1.

My use case is purely read only, so I'm not particularly concerned with any changes being tracked/persisted,so such a scenario would accomplish what I'd need. But I realize my use case is pretty niche.

Mike Bayer

unread,
Aug 7, 2018, 1:47:52 PM8/7/18
to sqlal...@googlegroups.com
On Mon, Aug 6, 2018 at 8:01 PM, Nick Repole <nick....@gmail.com> wrote:
> Appreciate the response, that all makes sense to me. In hindsight, I'm
> wondering how I ever expected there to be two Album objects with the same
> identifier and different relationship collections...
>
> On a conceptual level, I wonder how feasible it'd be to have objects
> expunged immediately as they're loaded. That way the parent Album with
> album_id==1 could load, have it's tracks load, then when the child album is
> loaded it wouldn't be the same Album object despite being album_id==1.
>
> My use case is purely read only, so I'm not particularly concerned with any
> changes being tracked/persisted,so such a scenario would accomplish what I'd
> need. But I realize my use case is pretty niche.

well it would be that the objects aren't put into the session at all
as they are loaded, because you literally want the eager loading
within the same load to create a new Album object. That wouldn't
work because the loading mechanism is continuously seeing the same
primary key at the top for that Album and pulling it from the identity
map per-row in order to implement the eager loading in the first
place.

if you want to rely upon lazy loading, that is possible. The
session supports lazy-loading on non-session-bound objects:
http://docs.sqlalchemy.org/en/latest/orm/session_api.html#sqlalchemy.orm.session.Session.enable_relationship_loading
.

Nick Repole

unread,
Aug 8, 2018, 10:47:35 PM8/8/18
to sqlalchemy
Hm, I'll spend some time playing around with enable_relationship_loading and see if I can produce what I'm looking for. Thanks again for the help!
Reply all
Reply to author
Forward
0 new messages