I am using SQLAlchemy 0.7.2 with MySQL 5.1, if that matters.
In my POS app there are "Customer"s which can be grouped in
"CustomerGroup"s. These are stored in 2 tables with a many-to-many
relationship.
I have a declarative Customer class defined as follows:
class Customer(Base):
__tablename__ = 'customers'
id = Column(Integer, primary_key=True)
name = Column(String(255), nullable=False)
...
groups = relationship("CustomerGroup",
secondary=customer_group_link, backref="customers")
and a CustomerGroup declarative class, and the relationship table
defined like that:
customer_group_link = Table('customer_group', metadata,
Column('customer_id', Integer, ForeignKey('customers.id')),
Column('group_id', Integer, ForeignKey('customergroups.id'))
)
I wanted to get all the Customers who are associated to a
CustomerGroup, I did:
session.query(Customer).filter(Customer.groups.contains(group))
I wanted to get all the Customers who do not belong to any
CustomerGroup. I tried that:
session.query(Customer).filter(len(Customer.groups) == 0)
which threw an exception "TypeError: object of type
'InstrumentedAttribute' has no len()".
I understand what that means, but it would be a good thing to have it
available, something similar to Customer.groups.contains().
I can't seem to do such a thing neither:
session.query(Customer).filter([some subquery].count() == 0)
Then what would [some subquery] be?
I did manage to do it with normal SQL (on MySQL 5.1):
SELECT c2.id FROM customers c2 WHERE (SELECT COUNT(cg.id) FROM
customers c, customergroups cg, customer_group ccg WHERE
ccg.customer_id=c.id AND ccg.group_id=cg.id AND c.id=c2.id)=0
It worked as I expected.
I used to do it like that:
all_customers = session.query(Customer).all()
customers = [c for c in all_customers if len(c.groups) == 0]
Which also works.
I know that there is a way to do it in a cleaner way, "a la
SQLAlchemy". I just don't know how.
Please note I'm new to SQLAlchemy and I don't know if the answer to
this is obvious, but I can't seem to find it in any other way.
Thanks in advance.
> Hi,
> My problem is with using a many-to-many relationship inside a query.
>
>
> I wanted to get all the Customers who are associated to a
> CustomerGroup, I did:
> session.query(Customer).filter(Customer.groups.contains(group))
>
> I wanted to get all the Customers who do not belong to any
> CustomerGroup. I tried that:
> session.query(Customer).filter(len(Customer.groups) == 0)
> which threw an exception "TypeError: object of type
> 'InstrumentedAttribute' has no len()".
> I understand what that means, but it would be a good thing to have it
> available, something similar to Customer.groups.contains().
>
> I can't seem to do such a thing neither:
> session.query(Customer).filter([some subquery].count() == 0)
> Then what would [some subquery] be?
>
> I did manage to do it with normal SQL (on MySQL 5.1):
> SELECT c2.id FROM customers c2 WHERE (SELECT COUNT(cg.id) FROM
> customers c, customergroups cg, customer_group ccg WHERE
> ccg.customer_id=c.id AND ccg.group_id=cg.id AND c.id=c2.id)=0
> It worked as I expected.
The usual form of this query is relational terms (the set of all customers) <intersection> (the set of all customer ids in customer_group with N number of group id). The second half of it uses GROUP BY to group by the customer id and HAVING to limit the rows, and you'd use join() to join the two together as in the form at http://www.sqlalchemy.org/docs/orm/tutorial.html#using-subqueries .
But here you're looking for the count being zero. So a subquery against customer_group won't have any rows you'd care about. To query for an exclusion, use WHERE NOT EXISTS. In the ORM this is the expression ~Customer.groups.any(), just like the third example at http://www.sqlalchemy.org/docs/orm/internals.html#sqlalchemy.orm.properties.RelationshipProperty.Comparator.any . The join across the association table is handled here within the subquery.
The correlated subquery approach you have is more amenable to being "packaged up" as a simple WHERE criterion on the Python side, but you'll note the format of the query is less well suited to the typical optimizer. MySQL's optimizer in particular is awful, so this is an example where len(SomeRelationship) might look tidy on the Python side but is actually not taking the reality of the relational database into account to an appropriate degree. You could achieve that exact form by using the style at http://www.sqlalchemy.org/docs/orm/tutorial.html#using-subqueries in conjunction with aliased(Customer) to produce "c2".
I don't know if any of these ways is really "cleaner", except for the ~any() that is only appropriate for the "no groups" comparison.
For the len thing, I thought it would be more intuitive to do it like
that, but the any() method is just as "clean" and intuitive.
Thanks a lot.
And I really like SQLAlchemy, it is just great. :)
On Dec 18, 5:43 am, Michael Bayer <mike...@zzzcomputing.com> wrote:
> On Dec 17, 2011, at 6:56 PM, Jad Kik wrote:
>
>
>
>
>
>
>
>
>
> > Hi,
> > My problem is with using a many-to-many relationship inside a query.
>
> > I wanted to get all the Customers who are associated to a
> > CustomerGroup, I did:
> > session.query(Customer).filter(Customer.groups.contains(group))
>
> > I wanted to get all the Customers who do not belong to any
> > CustomerGroup. I tried that:
> > session.query(Customer).filter(len(Customer.groups) == 0)
> > which threw an exception "TypeError: object of type
> > 'InstrumentedAttribute' has no len()".
> > I understand what that means, but it would be a good thing to have it
> > available, something similar to Customer.groups.contains().
>
> > I can't seem to do such a thing neither:
> > session.query(Customer).filter([some subquery].count() == 0)
> > Then what would [some subquery] be?
>
> > I did manage to do it with normal SQL (on MySQL 5.1):
> > SELECT c2.id FROM customers c2 WHERE (SELECT COUNT(cg.id) FROM
> > customers c, customergroups cg, customer_group ccg WHERE
> > ccg.customer_id=c.id AND ccg.group_id=cg.id AND c.id=c2.id)=0
> > It worked as I expected.
>
> The usual form of this query is relational terms (the set of all customers) <intersection> (the set of all customer ids in customer_group with N number of group id). The second half of it uses GROUP BY to group by the customer id and HAVING to limit the rows, and you'd use join() to join the two together as in the form athttp://www.sqlalchemy.org/docs/orm/tutorial.html#using-subqueries.
>
> But here you're looking for the count being zero. So a subquery against customer_group won't have any rows you'd care about. To query for an exclusion, use WHERE NOT EXISTS. In the ORM this is the expression ~Customer.groups.any(), just like the third example athttp://www.sqlalchemy.org/docs/orm/internals.html#sqlalchemy.orm.prop.... The join across the association table is handled here within the subquery.
>
> The correlated subquery approach you have is more amenable to being "packaged up" as a simple WHERE criterion on the Python side, but you'll note the format of the query is less well suited to the typical optimizer. MySQL's optimizer in particular is awful, so this is an example where len(SomeRelationship) might look tidy on the Python side but is actually not taking the reality of the relational database into account to an appropriate degree. You could achieve that exact form by using the style athttp://www.sqlalchemy.org/docs/orm/tutorial.html#using-subqueriesin conjunction with aliased(Customer) to produce "c2".