Beginner: subquery and any

19 views
Skip to first unread message

Maik Riechert

unread,
Apr 17, 2014, 8:51:51 AM4/17/14
to sqlal...@googlegroups.com
Hi everyone,

I am really lost on a query I try to formulate. In LINQ it would probably like that (haven't done LINQ in quite some time):

from j in Job
where j.status == 'queued' and j.dependencies.all(d => d.status == 'done')
select j

So, Job.dependencies is a self-referential many-to-many association which describes dependencies between processing jobs. And now I try to find those queued jobs whose dependencies are finished.

I'm brand new to sqlalchemy, I appreciate any help :)

Thanks
Maik

Michael Bayer

unread,
Apr 17, 2014, 9:33:48 AM4/17/14
to sqlal...@googlegroups.com
this is a bit of a brain teaser because I don’t know LINQ and we don’t have an “all()” operator.   Spent a few minutes trying to guess what the heck “all()” would query for; in SQL, it’s very easy to check if a collection contains some elements with some kind of criteria…but “all”, I thought, how can you ever know that there’s not another row somewhere that isn’t in the”all”, which gave me the “duh” moment.    for all() you need to check that there are “not any” of the *opposite* - e.g. no rows exist that don’t match that criteria.   We could add this operator really easily.

But for now

query(Job).filter(Job.status == ‘queued’).filter(~Job.dependencies.any(Dependency.status != ‘done’))



--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Maik Riechert

unread,
Apr 18, 2014, 5:14:35 AM4/18/14
to sqlal...@googlegroups.com
Hmm, I already tried that but as far as I know SQLAlchemy doesn't automatically join in queries. So Job.dependencies doesn't work in a query if I didn't do something wrong.

Maik Riechert

unread,
Apr 18, 2014, 5:20:44 AM4/18/14
to sqlal...@googlegroups.com

query(Job).filter(Job.status == ‘queued’).filter(~Job.dependencies.any(Dependency.status != ‘done’))


One more thing. Dependency doesn't exist as a class. Job.dependencies is a many-to-many association. That's why you probably have to use aliases to refer to the status of the dependency Job. I just couldn't get it to work though. The basic model:

# many-to-many association table
JobDependency = Table('job_dependency', Base.metadata,
    Column('jobId', Integer, ForeignKey('job.id'), primary_key=True),
    Column('dependsOnJobId', Integer, ForeignKey('job.id'), primary_key=True)
)

class Job(Base):
    __tablename__ = 'job'
   
    id = Column(Integer, primary_key=True)
    status = Column(String(20), default='queued', nullable=False)
   
    dependencies = relationship(lambda: Job,
                                secondary=JobDependency,
                                primaryjoin=id==JobDependency.c.jobId,
                                secondaryjoin=id==JobDependency.c.dependsOnJobId,
                                backref='dependencyOf')

Michael Bayer

unread,
Apr 18, 2014, 10:46:57 AM4/18/14
to sqlal...@googlegroups.com
the .any() and .has() operators use a correlated EXISTS query, so no JOIN is needed (I’d probably guess LINQ does the same, actually).

If you’re trying to say something like Job.dependencies, then “Job.dependencies” is a relationship() in that case, if it’s many-to-many that doesn’t really matter, the any() operator knows how to work with many-to-many that is correctly configured.

Here’s a complete example:

from sqlalchemy import *
from sqlalchemy.orm import *
from sqlalchemy.ext.declarative import declarative_base, declared_attr
from sqlalchemy import event

Base = declarative_base()

# many-to-many association table
JobDependency = Table('job_dependency', Base.metadata,
    Column('jobId', Integer, ForeignKey('job.id'), primary_key=True),
    Column('dependsOnJobId', Integer, ForeignKey('job.id'), primary_key=True)
)

class Job(Base):
    __tablename__ = 'job'

    id = Column(Integer, primary_key=True)
    status = Column(String(20), default='queued', nullable=False)

    dependencies = relationship(lambda: Job,
                                secondary=JobDependency,
                                primaryjoin=id==JobDependency.c.jobId,
                                secondaryjoin=id==JobDependency.c.dependsOnJobId,
                                backref='dependencyOf')

e = create_engine("sqlite://", echo='debug')
Base.metadata.create_all(e)
s = Session(e)

j1 = Job(status='queued', dependencies=[Job(status='done'), Job(status='done')])
j2 = Job(status='queued', dependencies=[Job(status='inprogress')])
s.add_all([j1, j2])

jobs = s.query(Job).\
        filter(Job.status == 'queued').\
        filter(~Job.dependencies.any(Job.status != 'done')).all()
assert jobs == [j1]



Maik Riechert

unread,
Apr 19, 2014, 4:09:00 AM4/19/14
to sqlal...@googlegroups.com
Am 18.04.2014 16:46, schrieb Michael Bayer:
> the .any() and .has() operators use a correlated EXISTS query, so no
> JOIN is needed (I’d probably guess LINQ does the same, actually).
>
> If you’re trying to say something like Job.dependencies, then
> “Job.dependencies” is a relationship() in that case, if it’s
> many-to-many that doesn’t really matter, the any() operator knows how to
> work with many-to-many that is correctly configured.

Thanks for the example, it works. I don't know what I did different in
my first attempt.

So just to be clear: anything inside any(..) always refers to the job
inside EXISTS and not to the outer job. And if for some reason I wanted
to access the outer job, I would have to use aliases, e.g.:

j = aliased(Job)
jobs = s.query(j).\
filter(j.status == 'queued').\
filter(~j.dependencies.any(Job.status != 'done', j.name=='foo'
)).all()


Cheers
Maik
>> <mailto:sqlalchemy+...@googlegroups.com>.
>> To post to this group, send email to sqlal...@googlegroups.com
>> <mailto:sqlal...@googlegroups.com>.
>> Visit this group at http://groups.google.com/group/sqlalchemy.
>> For more options, visit https://groups.google.com/d/optout.
>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "sqlalchemy" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/sqlalchemy/vLzKy5ywClk/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> sqlalchemy+...@googlegroups.com
> <mailto:sqlalchemy+...@googlegroups.com>.
> To post to this group, send email to sqlal...@googlegroups.com
> <mailto:sqlal...@googlegroups.com>.
Reply all
Reply to author
Forward
0 new messages