How to avoid SQLAlchemy generating transactions for read-only queries

2,229 views
Skip to first unread message

phrr...@googlemail.com

unread,
Aug 6, 2009, 10:03:16 AM8/6/09
to sqlalchemy
We are rolling out some SQLAlchemy infrastructure at work and came
across an interesting issue. We are using the mssql dialect on top of
pyodbc to talk to both SQL Server and Sybase dataservers. We use bound
metadata and scoped session_maker with autocommit=True. First off, the
problem we encountered was with the transaction log on master filling
up when the SQLAlchemy transaction ran This was very curious as the
login was read-only. It turns out that the application was emitting
thousands of queries (SELECT only, coming from my_table.select() on
the SA Table metadata objects, not via the ORM) and a transaction was
generated for each one.

It looks (from tracing what is written down the socket to the
dataserver) like SA rolls back the transaction on connection checkin
and then *creates another transaction*?
write(4, 0x08CC0888, 59) = 59
0F01\0 ;\0\0\0\0 ! .\0\0\0\0 I F @ @ T R A N C O U N T > 0
R O L L B A C K B E G I N T R A N S A C T I O N

Where is this coming from? SA? pyodbc? TDS?

Ideally, our connection pool should provide connections without any
transactions active and in unchained mode (@@trancount = 0,
@@transtate = 1, @@tranchained = 0). The checkin code can rollback if
@@trancount > 0.

What is the suggested configuration to achieve this?

thanks,
pjjH

Michael Bayer

unread,
Aug 6, 2009, 10:44:09 AM8/6/09
to sqlal...@googlegroups.com
phrr...@googlemail.com wrote:
>
> We are rolling out some SQLAlchemy infrastructure at work and came
> across an interesting issue. We are using the mssql dialect on top of
> pyodbc to talk to both SQL Server and Sybase dataservers. We use bound
> metadata and scoped session_maker with autocommit=True. First off, the
> problem we encountered was with the transaction log on master filling
> up when the SQLAlchemy transaction ran This was very curious as the
> login was read-only. It turns out that the application was emitting
> thousands of queries (SELECT only, coming from my_table.select() on
> the SA Table metadata objects, not via the ORM) and a transaction was
> generated for each one.
>
> It looks (from tracing what is written down the socket to the
> dataserver) like SA rolls back the transaction on connection checkin
> and then *creates another transaction*?
> write(4, 0x08CC0888, 59) = 59
> 0F01\0 ;\0\0\0\0 ! .\0\0\0\0 I F @ @ T R A N C O U N T > 0
> R O L L B A C K B E G I N T R A N S A C T I O N

the exact SQL here is some crap the MSSQL dialect has in do_begin() which
is the only way we could find to get SAVEPOINT to work.

However, even if this were removed, a new connection is in fact started
the first time you use a freshly checked out connection. Connections
returned to the pool have rollback() called unconditionally. The reasons
are straightforward - a newly checked out connection should act just like
a brand new one - no locks, no transactional state. a checked-in
connection needs to act like a "closed" one in that it isnt hanging onto
any locks.

>
> Ideally, our connection pool should provide connections without any
> transactions active and in unchained mode (@@trancount = 0,
> @@transtate = 1, @@tranchained = 0). The checkin code can rollback if
> @@trancount > 0.
>
> What is the suggested configuration to achieve this?

SQLAlchemy assumes autocommit=False which is per DBAPI spec (and note that
autocommit=False requires that a transaction is present). There is always
a transaction, and in fact this has nothing to do with SQLA - just use
pyodbc directly and you will see this is the case.

Some DBAPI's provide autocommit modes, but since these are not
standardized or universally available, and because SQLA has its own
"autocommit" that works very nicely and consistently, SQLA has no support
for them. I dont know what Pyodbc provides.


Michael Bayer

unread,
Aug 6, 2009, 10:46:50 AM8/6/09
to sqlal...@googlegroups.com
Michael Bayer wrote:
>
> > Some DBAPI's provide autocommit modes, but since these are not
> standardized or universally available, and because SQLA has its own
> "autocommit" that works very nicely and consistently, SQLA has no support
> for them. I dont know what Pyodbc provides.
>

note however that this is not to say you can't add a connect hook to your
pool (using PoolListener) and set the pyodbc connections into "autocommit"
mode, if you identify such an option. as far as the conditional shoved
into do_begin() I still think we should just flatly deny SAVEPOINT support
on MSSQL for now which would obviate the need for that statement.

phrr...@googlemail.com

unread,
Aug 6, 2009, 8:39:38 PM8/6/09
to sqlalchemy
Thanks for the information Mike. I do have a listener in place already
but decided to poke in the configuration directly when creating the
engine:

'connect_args' : {'autocommit' : True,
}

I found some very interesting results by experimenting with vanilla
pyodbc and SA and seeing how they impacted the Sybase transaction log.
Two transaction log IOs *per query* (one for BEGINXACT and one for
ENDXACT) would have devastating performance impact. However, it seems
that these are taken care of by the Sybase User Log Cache (ULC) (or
some other as yet unknown mechanism) otherwise we would surely have
noticed their impact.

I will do some more testing on this tomorrow and followup with a
summary. Auto-wrapping SA models over several thousand tables across
hundreds of databases across dozens of dataservers has certainly
brought up some interesting stuff!

pjjH

Michael Bayer

unread,
Aug 6, 2009, 8:47:35 PM8/6/09
to sqlal...@googlegroups.com

On Aug 6, 2009, at 8:39 PM, phrr...@googlemail.com wrote:

>
> Thanks for the information Mike. I do have a listener in place already
> but decided to poke in the configuration directly when creating the
> engine:
>
> 'connect_args' : {'autocommit' : True,
> }
>
> I found some very interesting results by experimenting with vanilla
> pyodbc and SA and seeing how they impacted the Sybase transaction log.
> Two transaction log IOs *per query* (one for BEGINXACT and one for
> ENDXACT) would have devastating performance impact. However, it seems
> that these are taken care of by the Sybase User Log Cache (ULC) (or
> some other as yet unknown mechanism) otherwise we would surely have
> noticed their impact.
>
> I will do some more testing on this tomorrow and followup with a
> summary. Auto-wrapping SA models over several thousand tables across
> hundreds of databases across dozens of dataservers has certainly
> brought up some interesting stuff!

oh, sybase. Youre using MSSQL dialect to talk to both huh ? :)
interesting do you think sybase and MSSQL should inherit from some
common base ? not surprising you're having probs with that.

phrr...@googlemail.com

unread,
Aug 6, 2009, 10:11:06 PM8/6/09
to sqlalchemy
They differ mainly on 'new' stuff (where new is anything < 15 years
old!) but for the kind of SELECT queries generated by SA we haven't
seen any problems at all (yet!). We decided to go ahead and use the
mssql dialect for read-only apps and postpone the remaining work on
the Sybase dialect on top of sybasect until we run into the limitation
of the mssql dialect.

pjjH


On Aug 6, 8:47 pm, Michael Bayer <mike...@zzzcomputing.com> wrote:
Reply all
Reply to author
Forward
0 new messages