Simple select is really slow when executed via ORM

2,711 views
Skip to first unread message

Massi

unread,
Aug 12, 2011, 8:51:51 AM8/12/11
to sqlalchemy
Hi everyone,

I'm doing some test to evaluate the performance of querying with
sqlalchemy via ORM. I wrote a simple script to measure the execution
time of a simple select query made on relatively small table (300 000
records, 6 columns) in sqlite. Here is the script:

from sqlalchemy import *
from sqlalchemy.orm import mapper, sessionmaker
from time import clock

engine = create_engine('sqlite:///test.db', echo=False)
metadata = MetaData()

table = Table('projects', metadata,
Column('id', Integer, primary_key=True),
Column('inp1', String(50)),
Column('inp2', String(50)),
Column('inp3', String(50)),
Column('inp4', String(50)),
Column('inp5', String(50)),
)

class Project(object) :
pass

mapper(Project, table)
metadata.create_all(engine)

t = []
for i in range(300000) :
t.append({"inp1":str(i), "inp2":str(i), "inp3":str(i),
"inp4":str(i), "inp5":str(i)})

c = clock()
engine.execute(table.insert(), t)
print "Insert: "+str(clock()-c)
session = sessionmaker(bind=engine)()
c = clock()
res = engine.execute(table.select()).fetchall()
print "Sql query: "+str(clock()-c)
c = clock()
res = session.query(Project).all()
print "Session query: "+str(clock()-c)

On my PC (windows 7, 64-bit, intel i7 2.93 Ghz) this is the output:
Insert: 3.41080167807
Sql query: 1.26728367673
Session query: 19.6452334842

The execution time of the ORM query is about 20 times the SQL one, and
this is definitely discouraging. So I guess if I'm doing something
wrong or if there are some tricks when using ORM that I'm not
considering. Any help is really appreciated.
Thanks in advance!

Michael Bayer

unread,
Aug 12, 2011, 10:42:59 AM8/12/11
to sqlal...@googlegroups.com
Here's a comparison that begins to be slightly fair regarding the work of fetching raw rows versus generating and identity-managing full object rows. On my mac the "SQL query" takes 7 seconds and the "Session query" 13.7, so twice as slow, rather than 20. The difference is we are actually fetching the *data* from the SQLite result row, and additionally generating some real Python objects, with a check in the session identity map to simulate a small amount of ORM bookkeeping. The ORM maintains an additional object known as InstanceState for each mapped object, and establishing this object adds a significant chunk of time as well. Python is very slow in creating objects, and in calling functions. Here, each tiny little additional thing the ORM does is multiplied by 300000, so it adds up. As you can see, just adding a small handful of steps to the "SQL" version gets it much closer very quickly.

For a 300K row grab you are better off fetching tuples, that is query(Project.id, Project.inp1, Project.inp2), etc., which eliminates all the ORM bookkeeping associated with mapped objects.

from sqlalchemy import *
from sqlalchemy.orm import mapper, sessionmaker
from time import clock

import os

if os.path.exists("test.db"):
os.remove("test.db")

engine = create_engine('sqlite:///test.db', echo=False)
metadata = MetaData()

table = Table('projects', metadata,
Column('id', Integer, primary_key=True),
Column('inp1', String(50)),
Column('inp2', String(50)),
Column('inp3', String(50)),
Column('inp4', String(50)),
Column('inp5', String(50)),
)

class Project(object) :
pass

mapper(Project, table)
metadata.create_all(engine)

t = []
for i in range(300000) :
t.append({"inp1":str(i), "inp2":str(i), "inp3":str(i),
"inp4":str(i), "inp5":str(i)})

c = clock()
engine.execute(table.insert(), t)
print "Insert: "+str(clock()-c)
session = sessionmaker(bind=engine)()

class UnmappedProject(object):
pass

c = clock()
res = []
for row in engine.execute(table.select()):
identity_key = row[table.c.id]
if identity_key in session.identity_map:
# here we'd use existing object
assert False
else:
obj = UnmappedProject()
for col in table.c:
setattr(obj, col.key, row[col])
res.append(obj)

print "Sql query: "+str(clock()-c)
c = clock()
res = session.query(Project).all()
print "Session query: "+str(clock()-c)

> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

Reply all
Reply to author
Forward
0 new messages