I've been using SQLAlchemy for some queries against an MS SQL Server
via pymssql at work, and I've found the performance of the fetchall call
to be slower than our old python-sybase-based solution.
A quick peek into the code shows that fetchall is calling fetchone
repeatedly. Changing it to use fetchall from the cursor yields a
massive speedup in my queries. The same goes for the object mapper,
though not as dramatic.
I've attached two diffs which use fetchall instead of looping over
fetchone. In one of my calls it slashes the query time down from about
4 seconds to half a second for about 14,000 rows.
Is there any particular reason why fetchall is implemented this way?
I'm guessing it's the most compatible way of doing this.
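To make the difference concrete, here's a minimal sketch of the two fetch
strategies against a plain DBAPI cursor. sqlite3 stands in for pymssql here
(the cursor interface is the same under PEP 249); the helper names are my own,
not SQLAlchemy's.

```python
import sqlite3

# sqlite3 as a stand-in DBAPI; any PEP 249 driver exposes the same cursor API.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1000)])

def rows_via_fetchone(cursor):
    # The old approach: one driver call per row.
    rows = []
    row = cursor.fetchone()
    while row is not None:
        rows.append(row)
        row = cursor.fetchone()
    return rows

def rows_via_fetchall(cursor):
    # The patched approach: let the driver hand everything back at once.
    return cursor.fetchall()

slow = rows_via_fetchone(conn.execute("SELECT n FROM t ORDER BY n"))
fast = rows_via_fetchall(conn.execute("SELECT n FROM t ORDER BY n"))
assert slow == fast  # identical rows, far fewer per-row driver calls
```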
cheers,
mick
My original concern with things like fetchall() regarded DBAPIs that
might have "active rows", meaning particular values in an individual
row might maintain some state with the database cursor, such as a
BLOB. I could only find one DBAPI that could potentially have this
phenomenon (I think it was an alternate Postgres API), and we don't
support it anyway. But then again, you don't really have to call
fetchall() if you don't want to.
Anyway, the patch is committed in revision 1974.
> > <base_fetchall.diff>
> <orm_fetchall.diff>
I suspect the performance improvements are database-connector-specific;
FreeTDS-based connection libraries (pymssql and sybase) seem to suffer
badly when using fetchone versus fetchall.
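For what it's worth, the shape of the claim can be sanity-checked with a
hypothetical micro-benchmark. sqlite3 stands in for pymssql here since a
FreeTDS setup isn't assumed available, so the absolute gap will be much
smaller than over a real network driver:

```python
import sqlite3
import time

# Build a table roughly the size mentioned in the thread (~14,000 rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(14000)])

def timed(fetch):
    cur = conn.execute("SELECT n FROM t")
    start = time.perf_counter()
    rows = fetch(cur)
    return len(rows), time.perf_counter() - start

def drain_fetchone(cur):
    # Row-at-a-time draining, as the pre-patch result loop did.
    rows = []
    row = cur.fetchone()
    while row is not None:
        rows.append(row)
        row = cur.fetchone()
    return rows

n_one, t_one = timed(drain_fetchone)
n_all, t_all = timed(lambda cur: cur.fetchall())
assert n_one == n_all == 14000
print(f"fetchone loop: {t_one:.4f}s  fetchall: {t_all:.4f}s")
```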
> My original concern with things like fetchall() regarded DBAPIs that
> might have "active rows", meaning particular values in an individual
> row might maintain some state with the database cursor, such as a
> BLOB. I could only find one DBAPI that could potentially have this
> phenomenon (I think it was an alternate Postgres API), and we don't
> support it anyway. But then again, you don't really have to call
> fetchall() if you don't want to.
>
> Anyway, the patch is committed in revision 1974.
Cool, thanks.
mick