Selectinload doesn't exploit FK

25 views
Skip to first unread message

yoch melka

unread,
Jul 18, 2019, 5:02:39 PM7/18/19
to sqlalchemy
Hi,

According to the documentation, from 1.3 "selectin loading can omit the JOIN for a simple one-to-many collection" in case if the PK is known by the primary request.

But in the opposite case that we know the FK, it's not exploited by Sqlalchemy to produce more effective queries.

Example:

from sqlalchemy import Table, Column, Integer, ForeignKey, create_engine
from sqlalchemy.orm import relationship, sessionmaker, selectinload
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()

class Parent(Base):
    __tablename__
= 'parent'
    id
= Column(Integer, primary_key=True)
    child_id
= Column(Integer, ForeignKey('child.id'))
    child
= relationship("Child")

class Child(Base):
    __tablename__
= 'child'
    id
= Column(Integer, primary_key=True)


engine
= create_engine('sqlite://')
engine
.echo = True

Base.metadata.create_all(engine)

Session = sessionmaker(bind=engine)
db
=Session()

# create some data
c1
= Child()
db
.add(c1)
for n in range(10):
    p
= Parent(child=c1)
    db
.add(p)
db
.commit()


l
= db.query(Parent).options(selectinload(Parent.child)).all()

The last instruction above will produce the following two SQL queries :

SELECT parent.id AS parent_id, parent.child_id AS parent_child_id 
FROM parent

SELECT parent_1
.id AS parent_1_id, child.id AS child_id
FROM parent AS parent_1 JOIN child ON child
.id = parent_1.child_id
WHERE parent_1
.id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
ORDER BY parent_1
.id

But in fact, the ORM have enough information to made the second query simply:

SELECT child.id AS child_id
FROM child
WHERE id IN
(1)

because we know that all child_id we have in parents are 1.

Is it can be improved ?

Mike Bayer

unread,
Jul 18, 2019, 6:24:03 PM7/18/19
to noreply-spamdigest via sqlalchemy
certainly, do you have resources to work on a pull request?      it would require a new method like _init_for_omit_join() [1] as well as an alternate strategy inside of _load_for_path() [2] which loads the related objects, and matches them to the parents, in a different way.

You can get almost the same effect right now, with a little bit less internal effiency,  to just load the Child objects directly in a list, then access the attribute upon each Parent, which loads them from the identity map:

l = db.query(Parent).all()
c = db.query(Child).filter(Child.id.in_([p.id for p in l])).all()
for p in l:
    p.child  # no query is emitted


a PR for this feature could be informed by a similar strategy, though the "p.child" part should likely use a faster route to populating the attribute directly the way the current strategy does.






--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

yoch melka

unread,
Jul 19, 2019, 4:40:50 AM7/19/19
to sqlalchemy
Thank you for your help.

I'd be happy to submit a PR (I've already tried to fix this by myself), but the code is a little hard to master.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlal...@googlegroups.com.

Mike Bayer

unread,
Jul 19, 2019, 1:22:40 PM7/19/19
to noreply-spamdigest via sqlalchemy


On Fri, Jul 19, 2019, at 4:40 AM, yoch melka wrote:
Thank you for your help.

I'd be happy to submit a PR (I've already tried to fix this by myself), but the code is a little hard to master.


I'm glad you tried.    The feature is in review at https://gerrit.sqlalchemy.org/#/c/1368/.   Of particular importance is that it has a new "degrade" step in the very unlikely case that the foreign key attributes on the parent object are unloaded; rather than emitting a separate query for those,  it goes back to the JOIN.

I would encourage you, if possible, to download the patch for the 1.3 series from https://gerrit.sqlalchemy.org/#/c/sqlalchemy/sqlalchemy/+/1369/ and ensure it works and performs as expected for your use case, though a SQLAlchemy release is now overdue and this may go out quickly.


To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.

yoch melka

unread,
Jul 20, 2019, 2:19:39 PM7/20/19
to sqlalchemy
I've tested the patch. Works perfectly with my usecase.

Thank you very much !
Reply all
Reply to author
Forward
0 new messages