Join multiple tables with association tables

81 views
Skip to first unread message

timbecks

unread,
Aug 9, 2021, 5:48:05 PM8/9/21
to sqlalchemy

I am trying to figure out the correct join query setup within SQLAlchemy, but I can't seem to get my head around it.

I have the following table setup (simplified, I left out the non-essential fields):

Unbenannt.png

Unbenannt2.png

The facts are associated to info, info is associated to text. Text and facts aren't directly associated.

I would like to join them all together but can't figure out to do so.

In this example I would like to get all instaces of "Info" that are associated to Fact.id = 1 and all "Text" instances that are associated to that "Info" instance. I came up with

     select(Info, Text)
          .join(facts_info)
          .join(Facts)
          .join(info_text)
          .join(Text)
          here(Facts.id ==1)

But it obviously gives me an error.

Simon King

unread,
Aug 10, 2021, 5:13:32 AM8/10/21
to sqlal...@googlegroups.com
It's difficult to tell from your code what your intention is. Is the relationship between Fact and Info meant to be many-to-many? And likewise the relationship between Text and Info?

Forgetting SQLAlchemy for a moment, what is the SQL that you want to produce?


Does the script below do what you want?


import sqlalchemy as sa
import sqlalchemy.orm as saorm
from sqlalchemy.ext.declarative import declarative_base


Base = declarative_base()


facts_info = sa.Table(
    "facts_info",
    Base.metadata,
    sa.Column(
        "fact_id", sa.Integer, sa.ForeignKey("fact.id"), primary_key=True
    ),
    sa.Column(
        "info_id", sa.Integer, sa.ForeignKey("info.id"), primary_key=True
    ),
)


info_text = sa.Table(
    "info_text",
    Base.metadata,
    sa.Column(
        "info_id", sa.Integer, sa.ForeignKey("info.id"), primary_key=True
    ),
    sa.Column(
        "text_id", sa.Integer, sa.ForeignKey("text.id"), primary_key=True
    ),
)


class Fact(Base):
    __tablename__ = "fact"

    id = sa.Column(sa.Integer, primary_key=True)
    fact = sa.Column(sa.String(500), nullable=False, unique=True)
    created_at = sa.Column(sa.DateTime)
    updated_at = sa.Column(sa.DateTime)

    info = saorm.relationship(
        "Info", secondary=facts_info, back_populates="facts"
    )


class Info(Base):
    __tablename__ = "info"

    id = sa.Column(sa.Integer, primary_key=True)
    filename = sa.Column(sa.String(50))
    format = sa.Column(sa.String(10))

    facts = saorm.relationship(
        "Fact", secondary=facts_info, back_populates="info"
    )
    text = saorm.relationship(
        "Text", secondary=info_text, back_populates="info"
    )


class Text(Base):
    __tablename__ = "text"

    id = sa.Column(sa.Integer, primary_key=True)
    text = sa.Column(sa.String(1000))

    # Relationships
    info = saorm.relationship(
        "Info", secondary=info_text, back_populates="text"
    )


if __name__ == "__main__":
    engine = sa.create_engine("sqlite://", echo=True)
    Base.metadata.create_all(engine)

    Session = saorm.sessionmaker(bind=engine)

    session = Session()

    # two facts
    facts = [Fact(fact="factone"), Fact(fact="facttwo")]
    # three infos, first two are associated with both facts, third is
    # only linked to second fact
    infos = [
        Info(filename="infoone", facts=facts),
        Info(filename="infotwo", facts=facts),
        Info(filename="infothree", facts=facts[1:]),
    ]
    # three texts, first two linked to first info instance, third
    # linked to third info instance
    texts = [
        Text(text="textone", info=[infos[0]]),
        Text(text="texttwo", info=[infos[0]]),
        Text(text="textthree", info=[infos[2]]),
    ]
    session.add_all(facts + infos + texts)
    session.flush()

    # Joining to both facts_info and info_text in the same query
    # doesn't really make sense, because it would end up producing a
    # cartesian product between those tables. Instead we'll use a
    # subquery against facts_info to select the info ids we are
    # interested in.
    info_ids = (
        session.query(facts_info.c.info_id)
        .filter(facts_info.c.fact_id == 1)
    )
    query = (
        session.query(Info, Text)
        .filter(Info.id.in_(info_ids))
        .join(Info.text)
    )

    # Note that this only outputs Info objects that have at least one
    # text object associated with them. If you want to include Info
    # objects without a related Text object, change the
    # ".join(Info.text)" to ".outerjoin(Info.text)"
    for (info, text) in query.all():
        print("Info(filename=%r) Text(text=%r)" % (info.filename, text.text))



Hope that helps,

Simon


--
SQLAlchemy -
The Python SQL Toolkit and Object Relational Mapper
 
http://www.sqlalchemy.org/
 
To post example code, please provide an MCVE: Minimal, Complete, and Verifiable Example. See http://stackoverflow.com/help/mcve for a full description.
---
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sqlalchemy/59a98354-4ecf-4b4b-b76b-a8ad67aae207n%40googlegroups.com.

timbecks

unread,
Aug 10, 2021, 10:10:24 AM8/10/21
to sqlalchemy
            Is the relationship between Fact and Info meant to be many-to-many? And likewise the relationship between Text and Info?

You are right about that.

Your code did exactly what I wanted. Thank you so much!

I figured it could have to do somethin with a subquery but I'm just starting with sql so it is quiet difficult for me to understand. Your explanation really helped. 

Thanks again,

Timo
Reply all
Reply to author
Forward
0 new messages