Complicated filter clause causes recursion depth exceeded exception

727 views
Skip to first unread message

rob.c...@moat.com

unread,
Jan 18, 2013, 4:15:43 PM1/18/13
to sqlal...@googlegroups.com
I haven't boiled this down to a short test case yet, but when my WHERE clause gets especially long I start getting the "recursion depth exceeded" exception.  Is this a well-known limitation of sqlalchemy?  We're running this query in production currently without SQLAlchemy, and it performs fine, but perhaps I need to look for another approach...

If I keep the filter condition relatively short, my query looks like this and runs fine (with fake columns start_date, X, Y, and Z on table T):

SELECT X, sum(Z) AS Z
FROM T
WHERE T.start_date >= :start_date_1 
  AND T.start_date <= :start_date_2 
  AND NOT (T.X = :X_1 AND T.Y = :Y_1) 
  AND NOT (T.X = :X_2 AND T.Y = :Y_2)
  AND NOT (T.X = :X_3 AND T.Y = :Y_3)
GROUP BY T.X

However, if I make the filter() clause very long (over 150 AND NOT... clauses), I start getting exceptions with this stack trace:

  File "test.py", line 350, in do_test
    print q
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 3031, in __str__
    return str(self._compile_context().statement)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1790, in __str__
    return unicode(self.compile()).encode('ascii', 'backslashreplace')
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1778, in compile
    return self._compiler(dialect, bind=bind, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/expression.py", line 1784, in _compiler
    return dialect.statement_compiler(dialect, self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 277, in __init__
    engine.Compiled.__init__(self, dialect, statement, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 705, in __init__
    self.string = self.process(self.statement)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 724, in process
    return obj._compiler_dispatch(self, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 941, in visit_select
    t = select._whereclause._compiler_dispatch(self, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in visit_clauselist
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
    s for s in 
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in visit_clauselist
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
    s for s in 
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in visit_clauselist
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
    s for s in 
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
    for c in clauselist.clauses)
...
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
    s for s in 
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in visit_clauselist
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 475, in <genexpr>
    s for s in 
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 477, in <genexpr>
    for c in clauselist.clauses)
  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 72, in _compiler_dispatch
    return getter(visitor)(self, **kw)
RuntimeError: maximum recursion depth exceeded

Michael Bayer

unread,
Jan 18, 2013, 6:00:04 PM1/18/13
to sqlal...@googlegroups.com
On Jan 18, 2013, at 4:15 PM, rob.c...@moat.com wrote:

I haven't boiled this down to a short test case yet, but when my WHERE clause gets especially long I start getting the "recursion depth exceeded" exception.  Is this a well-known limitation of sqlalchemy?  We're running this query in production currently without SQLAlchemy, and it performs fine, but perhaps I need to look for another approach...

If I keep the filter condition relatively short, my query looks like this and runs fine (with fake columns start_date, X, Y, and Z on table T):

SELECT X, sum(Z) AS Z
FROM T
WHERE T.start_date >= :start_date_1 
  AND T.start_date <= :start_date_2 
  AND NOT (T.X = :X_1 AND T.Y = :Y_1) 
  AND NOT (T.X = :X_2 AND T.Y = :Y_2)
  AND NOT (T.X = :X_3 AND T.Y = :Y_3)
GROUP BY T.X

However, if I make the filter() clause very long (over 150 AND NOT... clauses), I start getting exceptions with this stack trace:

Always amazing how many wacky new problems come around.   Well, the compilation of these clauses is pretty straightforward, using a recursive traversal scheme.  So if you give Python a tree structure of more than 1000 nodes deep and do such a traversal, this is the error you'd get, and I suppose it's sort of "well known", depends on what perspective you're coming from.

So this indicates you're creating a structure that is nested this deeply.  Which is to say, really deep !   

This could happen if you're doing the AND's using a nesting pattern of one at a time like this:

from sqlalchemy.sql import column

root = column('x') == 5
current = root

for i in xrange(200):
    current = current & (column('x') == 5)

print current


because that's really and_(expr, and_(expr, and_(expr, and_( for 200 times... ))).

But if you flatten out the and_() you can get this:

from sqlalchemy.sql import column, and_

expr = [column('x') == 5]
for i in xrange(200):
    expr.append(column('x') == 5)

expr = and_(*expr)

print expr

then you have a flat structure, and you're fine.

So we could modify our and_()/or_ construct to open itself up this way, that is, as it's built, it flattens out the nesting, though maybe for now there's a way you can build up using one big and_() block.

In fact to flatten out the nesting is something you could enable across the board here, and you can see why I'm hesitant to build this in by default as it adds lots of isinstance() and other expensive checks, but you can add this to your app as a quick fix (just run this anywhere at import time to redefine how and_() and or_() are rendered):

from sqlalchemy.ext.compiler import compiles
from sqlalchemy.sql.expression import BooleanClauseList

@compiles(BooleanClauseList)
def flatten_boolean_clause_list(clauselist, compiler, **kw):
    op = clauselist.operator
    flattened = []
    rewrite = False
    stack = list(clauselist.clauses)
    while stack:
        elem = stack.pop(0)
        if isinstance(elem, BooleanClauseList) and elem.operator is op:
            stack[:0] = elem.clauses
            rewrite = True
        else:
            flattened.append(elem)
    if rewrite:
        clauselist = BooleanClauseList(operator=op, *flattened)
    return compiler.visit_clauselist(clauselist, **kw)

then the original test passes because we've rewritten the nested list as a flat list.   Basically the "recursion" is replaced by the stack based traversal we do here.

or even quicker, you could just increase your recursion depth.  It defaults to 1000, so here's 10000, do this before you try to run the SQL:

import sys
sys.setrecursionlimit(10000)










--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To view this discussion on the web visit https://groups.google.com/d/msg/sqlalchemy/-/FeiLu3iVqisJ.
To post to this group, send email to sqlal...@googlegroups.com.
To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.

rob.c...@moat.com

unread,
Jan 22, 2013, 12:47:23 PM1/22/13
to sqlal...@googlegroups.com
Thanks Michael,

Writing a big list of conditions and combining them with and_(*conditions) worked well.  I was indeed querying like this before:

for condition in conditions:
  q = q.filter(condition)
print q  
Reply all
Reply to author
Forward
0 new messages