order_by(None) for joinedload subclause queries?

710 views
Skip to first unread message

Adam Feuer

unread,
Dec 30, 2014, 4:29:50 PM12/30/14
to sqlal...@googlegroups.com
Hi,

We are using SQLAlchemy 0.9.8 on Python 2.7.7 and Postgres 9.3.

We have a query that uses joinedloads to fully populate some Recipe objects using a single query. The query creates a large SQL statement that takes 20 seconds to execute - too long. Here's the rendered SQL statement on Pastebin. 

The rendered SQL has an ORDER BY clause that Postgres explain says is the source of 99% of the time spent on this query. This appears to come from the relationship in the ORM model, which has an order_by clause.

However, we don't care about the order the results are returned for this query - we only care about the order when looking at a single object. If I remove the ORDER BY clause at the end of the rendered SQL statement, the query executes in less than a second - perfect.

We tried using .order_by(None) on the query, but that seems to have no effect. The ORDER BY seems to be related to the joinedloads, because if change the joinedloads to lazyloads, they go away. But we need the joinedloads for speed.

How can I get SQLAlchemy to omit the ORDER BY clauses? 




FYI, here's the query:

missing_recipes = cls.query(session).filter(Recipe.id.in_(missing_recipe_ids)) if missing_recipe_ids else []

Here's an excerpt from the ORM class:

class Recipe(Base, TransactionalIdMixin, TableCacheMixin, TableCreatedModifiedMixin):
    __tablename__ = 'recipes'
      authors = relationship('RecipeAuthor', cascade=OrmCommonClass.OwnedChildCascadeOptions,
                           single_parent=True,
                           lazy='joined', order_by='RecipeAuthor.order', backref='recipe')
    scanned_photos = relationship(ScannedPhoto, backref='recipe', order_by="ScannedPhoto.position")
    utensils = relationship(CookingUtensil, secondary=lambda: recipe_cooking_utensils_table)
    utensil_labels = association_proxy('utensils', 'name')

Our query() method looks something like this (some more joinedloads omitted):

    @classmethod
    def query(cls, session):
        query = query.options(
            joinedload(cls.ingredients).joinedload(RecipeIngredient.ingredient),
            joinedload(cls.instructions),
            joinedload(cls.scanned_photos),
            joinedload(cls.tags),
            joinedload(cls.authors),
        )


cheers
adam
-- 

Michael Bayer

unread,
Dec 30, 2014, 7:11:23 PM12/30/14
to sqlal...@googlegroups.com


Adam Feuer <ad...@cookbrite.com> wrote:

Hi,

We are using SQLAlchemy 0.9.8 on Python 2.7.7 and Postgres 9.3.

We have a query that uses joinedloads to fully populate some Recipe objects using a single query. The query creates a large SQL statement that takes 20 seconds to execute - too long. Here's the rendered SQL statement on Pastebin. 

The rendered SQL has an ORDER BY clause that Postgres explain says is the source of 99% of the time spent on this query. This appears to come from the relationship in the ORM model, which has an order_by clause.

However, we don't care about the order the results are returned for this query - we only care about the order when looking at a single object. If I remove the ORDER BY clause at the end of the rendered SQL statement, the query executes in less than a second - perfect.

We tried using .order_by(None) on the query, but that seems to have no effect. The ORDER BY seems to be related to the joinedloads, because if change the joinedloads to lazyloads, they go away. But we need the joinedloads for speed.

How can I get SQLAlchemy to omit the ORDER BY clauses? 

you either need to take order_by off the relationship(), probably the best idea here if ordering isn’t important, or otherwise skip joinedload(), write out the joins yourself and use contains_eager() (http://docs.sqlalchemy.org/en/rel_0_9/orm/loading_relationships.html?highlight=contains_eager#contains-eager).




--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy.
For more options, visit https://groups.google.com/d/optout.

Adam Feuer

unread,
Dec 30, 2014, 9:30:00 PM12/30/14
to sqlal...@googlegroups.com
On Tue, Dec 30, 2014 at 4:09 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
you either need to take order_by off the relationship(), probably the best idea here if ordering isn’t important, or otherwise skip joinedload(), write out the joins yourself and use contains_eager() (http://docs.sqlalchemy.org/en/rel_0_9/orm/loading_relationships.html?highlight=contains_eager#contains-eager).

Michael,

Thanks for the ideas, we'll try them. Not sure I can remove the order_by relationship because we need the order other places... if I remove that, I will have to add sorting in each of the other places. But it might be worth it.

I think the manual joins and contains_eager may be better for us. Will that work because the manual joins and contains_eager doesn't automatically trigger adding the ORDER BY clauses?

cheers
adam
--

Michael Bayer

unread,
Dec 30, 2014, 11:26:36 PM12/30/14
to sqlal...@googlegroups.com
joinedload() is sort of a macro that creates the joins and other modifications to the query (such as ORDER BY the relationship), applies aliases to each of those parts so that there’s no chance of them conflicting with anything on the query, and then routes the columns from those extra FROM clauses into collections and related objects.    contains_eager() does just the very last part of that.    The first two parts, writing the joins and orderings and potentially aliasing them (or not), is up to you in that case, so you retain full control over how the query is rendered.

Adam Feuer

unread,
Dec 30, 2014, 11:28:53 PM12/30/14
to sqlal...@googlegroups.com
On Tue, Dec 30, 2014 at 8:24 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:
 The first two parts, writing the joins and orderings and potentially aliasing them (or not), is up to you in that case, so you retain full control over how the query is rendered.

Michael,

Cool, thanks for the explanation. That sounds like what we want. I'll try it tomorrow.
Reply all
Reply to author
Forward
0 new messages