how to prepare "get by primary key" query?

49 views
Skip to first unread message

Roman Yakovenko

unread,
Apr 3, 2013, 1:01:09 AM4/3/13
to sqlal...@googlegroups.com
Hello.

I am just starting with SQLAlchemy and have a small problem: my "get" query is compiled every time I use it.
According to cProfile: 
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/expression.py:1864(compile)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/expression.py:1908(_compiler)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/compiler.py:231(__init__)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/engine/interfaces.py:764(__init__)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/engine/interfaces.py:805(process)
90/10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/visitors.py:73(_compiler_dispatch)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/compiler.py:1082(visit_select)


In case of native API, I would use prepare "command/statement", but it looks I am missing something obvious and cannot find similar functionality in the package.

Below, you can find a complete source code, which reproduce the issue.

Thank you!


import sqlalchemy 
import sqlalchemy.ext.declarative

Base = sqlalchemy.ext.declarative.declarative_base()

class Employee(Base):
    __tablename__ = 'employee'
    id = sqlalchemy.Column(sqlalchemy.Integer, primary_key=True)
    name = sqlalchemy.Column(sqlalchemy.String(100), nullable=False)

def run_query(session):
    employee_by_id = session.query(Employee)
    for i in range(10):
        x = employee_by_id.get(i)
        print x.name
    session.rollback()  

if __name__ == '__main__':
    engine = sqlalchemy.create_engine('sqlite://')
    Base.metadata.create_all(engine)
    Session = sqlalchemy.orm.sessionmaker(bind=engine)
    
    session = Session()
    for i in range(10):
        session.add(Employee(id=i, name='name' + str(i)))
    session.commit()
       
    import cProfile
    cProfile.run('run_query(Session())', 'get.stats') 


Michael Bayer

unread,
Apr 3, 2013, 10:47:25 AM4/3/13
to sqlal...@googlegroups.com
On Apr 3, 2013, at 1:01 AM, Roman Yakovenko <roman.y...@gmail.com> wrote:

Hello.

I am just starting with SQLAlchemy and have a small problem: my "get" query is compiled every time I use it.
According to cProfile: 
ncalls  tottime  percall  cumtime  percall filename:lineno(function)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/expression.py:1864(compile)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/expression.py:1908(_compiler)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/compiler.py:231(__init__)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/engine/interfaces.py:764(__init__)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/engine/interfaces.py:805(process)
90/10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/visitors.py:73(_compiler_dispatch)
10    0.000    0.000    0.002    0.000 /usr/local/lib/python2.7/dist-packages/SQLAlchemy-0.8.1dev-py2.7-linux-x86_64.egg/sqlalchemy/sql/compiler.py:1082(visit_select)


In case of native API, I would use prepare "command/statement", but it looks I am missing something obvious and cannot find similar functionality in the package.

the Python DBAPI doesn't have an explicit concept of a "prepared statement" - the idea of preparing a statement involves that a statement handle is established on the database server, which can then be reused.   The closest the DBAPI has is the "executemany()" call, which gives the DBAPI itself the option of using a prepared statement behind the scenes; but this option is not appropriate for SELECT statements since executemany() doesn't support the return of results.

On the Python SQL construction side, SQLAlchemy deals with expression constructs that are converted to strings given a database dialect as well as compilation options.   The amount of time in Python spent for this compilation is relatively small and has been optimized for many years to be as fast as possible.  Nevertheless, in some cases we do try to squeeze more performance out by caching the compiled form of these queries; the ORM in particular will cache the compiled form of insert/update/delete statements that are used by the unit of work.   

Right now, the string form of SELECT queries generated by the ORM are not automatically cached.   It's only worthwhile to try to cache queries that are "fixed", such as the get() query we're referring to here as well as some of the queries used by lazyloading of relationships.  The overhead of this compilation however is fairly minimal; reddit.com uses SQLAlchemy expression constructs for all database queries as well, and they serve well over two billion page views a month without any caching of the expression string.

There's a recipe to make use of the compiled_cache in conjunction with the Query object right now, which is at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/BakedQuery .



--
You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sqlalchemy+...@googlegroups.com.
To post to this group, send email to sqlal...@googlegroups.com.
Visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Roman Yakovenko

unread,
Apr 4, 2013, 2:03:21 PM4/4/13
to sqlal...@googlegroups.com
On Wed, Apr 3, 2013 at 5:47 PM, Michael Bayer <mik...@zzzcomputing.com> wrote:

the Python DBAPI doesn't have an explicit concept of a "prepared statement" - the idea of preparing a statement involves that a statement handle is established on the database server, which can then be reused.   The closest the DBAPI has is the "executemany()" call, which gives the DBAPI itself the option of using a prepared statement behind the scenes; but this option is not appropriate for SELECT statements since executemany() doesn't support the return of results.

This is what I was looking for. It looks like pysqlite does it internally.
 
 
On the Python SQL construction side, SQLAlchemy deals with expression constructs that are converted to strings given a database dialect as well as compilation options.   The amount of time in Python spent for this compilation is relatively small and has been optimized for many years to be as fast as possible.  Nevertheless, in some cases we do try to squeeze more performance out by caching the compiled form of these queries; the ORM in particular will cache the compiled form of insert/update/delete statements that are used by the unit of work.   

Right now, the string form of SELECT queries generated by the ORM are not automatically cached.   It's only worthwhile to try to cache queries that are "fixed", such as the get() query we're referring to here as well as some of the queries used by lazyloading of relationships.  The overhead of this compilation however is fairly minimal; reddit.com uses SQLAlchemy expression constructs for all database queries as well, and they serve well over two billion page views a month without any caching of the expression string.

In my case the performance boost was 30%. I have very few queries, but pretty big loops, so compiling them in advance(outside of loop) gave me a significant gain.
 
There's a recipe to make use of the compiled_cache in conjunction with the Query object right now, which is at http://www.sqlalchemy.org/trac/wiki/UsageRecipes/BakedQuery .


Thank you! 

Michael Bayer

unread,
Apr 4, 2013, 2:34:04 PM4/4/13
to sqlal...@googlegroups.com

On Apr 4, 2013, at 2:03 PM, Roman Yakovenko <roman.y...@gmail.com> wrote:


In my case the performance boost was 30%. I have very few queries, but pretty big loops, so compiling them in advance(outside of loop) gave me a significant gain.
 

OK well we have the architecture in place to cache SQL expression constructs against their compiled form in the ORM on a per-mapper basis, using an LRU cache.   The SQL emitted by get() is in most cases constant so this is a feature that can be added without too much difficulty.


Reply all
Reply to author
Forward
0 new messages