Very slow inserts when using sqlalchemy core


adam....@gmail.com

Oct 4, 2018, 2:45:31 PM
to sqlalchemy
Hi all, 

I'm having some major performance issues using sqlalchemy to insert data. I've been trying many different approaches, but none of them come even close to holding a candle to raw MySQLdb connections.

Versions:
sqlalchemy: 1.2.9
MySQLdb: 1.3.13 (bound with mysql+mysqldb connector)
MySQL: 14.14
Python: 3.6.5
Ubuntu 18.04 x86_64

Server:
8GB ram
2 x Intel(R) Xeon(R) CPU X5650  @ 2.67GHz threads running in VMWare ESXi

Connected to localhost

Base data info:
2018-10-04 13:19:56,519 - INFO - Executing for 10,550 inserts, 89 columns, approx 6,252,460 bytes of raw data...

Pre-formatted session execute ~2 seconds:
2018-10-04 13:19:56,520 - INFO - Building raw TSQL including all escaped value()s - insert into x (a,b, ...) values (escape_string(c), escape_string(d), ..) (escape_string(e), escape_string(f), ...)
2018-10-04 13:19:57,640 - INFO - Executing raw TSQL (9,218,872 bytes)  - session.execute(tsql)
2018-10-04 13:19:59,435 - INFO - Committing
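For reference, the pre-formatted version is roughly the following (a sketch, not my exact code; `rows` stands in for the list of row dicts, `x` and the column names are placeholders, and everything is treated as an escapable string):

    import MySQLdb

    def quote(value):
        # crude quoting for the sketch; real code would handle NULLs/numerics properly
        return "'%s'" % MySQLdb.escape_string(str(value).encode()).decode()

    cols = list(rows[0].keys())
    tsql = "INSERT INTO x (%s) VALUES %s" % (
        ",".join(cols),
        ",".join("(" + ",".join(quote(row[c]) for c in cols) + ")" for row in rows),
    )
    session.execute(tsql)  # one big pre-escaped statement, no bind parameters
    session.commit()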

Raw MySQLdb API ~1 second:
2018-10-04 13:19:59,436 - INFO - Executing same raw TSQL directly via MySQLdb API - engine.raw_connection(); cursor.execute(tsql)
2018-10-04 13:20:00,326 - INFO - Committing
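And the raw DBAPI comparison is essentially (sketch; `engine` and `tsql` as above):

    # same pre-built SQL string, executed straight on the DBAPI connection,
    # bypassing SQLAlchemy statement processing entirely
    conn = engine.raw_connection()
    try:
        cur = conn.cursor()
        cur.execute(tsql)
        conn.commit()
    finally:
        conn.close()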

Bindparam-formatted TSQL converted to text(), executed with bindparams as a dict: ~12 seconds to build text() and ~19 seconds to execute, ~31 seconds total:
2018-10-04 13:20:00,326 - INFO - Building raw TSQL - insert into x (a, b, ...) values (:a1, :b1, ..), (:a2, :b2, ...)
2018-10-04 13:20:01,017 - INFO - Rebuilding Input data for TSQL bindparams - {a1: d, b1: e, a2: f, b2: g}
2018-10-04 13:20:02,234 - INFO - Building text() object - sql.text(tsql)
2018-10-04 13:20:14,259 - INFO - Executing text() with bindparams - session.execute(text, bindparams)
2018-10-04 13:20:33,552 - INFO - Committing
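That variant looks roughly like this (sketch; same placeholder names as above - note that every value becomes its own named bind parameter, nearly a million of them here):

    from sqlalchemy import text

    cols = list(rows[0].keys())
    placeholders, params = [], {}
    for i, row in enumerate(rows):
        placeholders.append("(" + ",".join(":%s%d" % (c, i) for c in cols) + ")")
        for c in cols:
            params["%s%d" % (c, i)] = row[c]

    tsql = "INSERT INTO x (%s) VALUES %s" % (",".join(cols), ",".join(placeholders))
    stmt = text(tsql)              # parsing ~940,000 named parameters
    session.execute(stmt, params)  # ...and binding them all at execute time
    session.commit()

The next case passes the same string and dict straight to session.execute(tsql, params), without the explicit text() step.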

Bindparam-formatted TSQL NOT converted to text(), executed with bindparams as a dict: ~33 seconds
2018-10-04 13:20:33,552 - INFO - Executing raw TSQL with bindparams - session.execute(tsql, bindparams)
2018-10-04 13:21:06,307 - INFO - Committing

Using a sqlalchemy class table insert() with .values(), ~34 seconds:
2018-10-04 13:21:06,311 - INFO - Executing sqlalchemy insert w/ bindparams as values - session.execute(__table__.insert().values(bindparams))
2018-10-04 13:21:40,808 - INFO - Committing
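i.e. roughly this (sketch; `models.target`, `session`, and `rows` stand in for my actual objects) - passing the whole list to .values() builds one multi-VALUES statement with a bind parameter for every value:

    stmt = models.target.__table__.insert().values(rows)  # rows: list of dicts
    session.execute(stmt)
    session.commit()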

Using a sqlalchemy class table insert(), passing bindparams to session.execute(), ~8 seconds:
2018-10-04 13:21:40,809 - INFO - Executing sqlalchemy insert() with bindparams as execute param - session.execute(__table__.insert(), bindparams)
2018-10-04 13:21:48,084 - INFO - Committing
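versus this form (sketch, same placeholder names), where the statement only has a single row's worth of placeholders and the list of dicts is handed to execute() alongside it:

    stmt = models.target.__table__.insert()
    session.execute(stmt, rows)  # rows: list of dicts
    session.commit()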

Obviously the raw DBAPI will be the fastest, as there's little to no processing or overhead; I would expect session.execute(str) to come closer to matching it, though it's not too far off.

Having sqlalchemy.text(str) take ~12 seconds to build seems rather excessive, and I'm assuming that's also the hang-up with the execute(__table__.insert().values(bindparams)) statement, but why would execute(str, bindparams) behave the same way? Does it get coerced to text() as well when arg0 is a str?

Now for the oddest part of all: doing 10,550 individual inserts in the last example is roughly 26 seconds faster than a single transaction with 10,550 sets of values. That just does not make transactional sense.

Is there something I could be doing better while utilizing Core, or am I expecting too much of how sqlalchemy performs with datasets like this?


Hopefully this is enough detail, but I can provide more upon request.

Thank you!

Mike Bayer

Oct 4, 2018, 3:05:59 PM
to sqlal...@googlegroups.com
A Core insert should be only slightly slower than the raw DBAPI;
however, if you're inserting lots of rows, you need to be using
executemany style, that is, a single execute() call with a list of
parameters. There's no complete, runnable code example here, so I
don't know what you're actually doing; if you are running
session.execute() itself 10K times, that's the wrong way to do it.
Check out
https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
for usage examples of how to bulk-insert with different APIs, including
Core (test_core_insert specifically). Additionally, I don't know
whether you are using bound parameters with your raw MySQL code, and
if so, whether there is some unicode processing going on in the
SQLAlchemy version that might be slowing things down.

A complete code example like those you see in
https://docs.sqlalchemy.org/en/latest/_modules/examples/performance/bulk_inserts.html
will allow me to see what you are doing.
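Paraphrasing the Core version of that example, it boils down to something like this (a sketch, not the exact example code; the table, URL, and row contents are placeholders):

    from sqlalchemy import create_engine, Column, Integer, String, MetaData, Table

    engine = create_engine("mysql+mysqldb://scott:tiger@localhost/test")
    metadata = MetaData()
    customer = Table(
        "customer", metadata,
        Column("id", Integer, primary_key=True),
        Column("name", String(255)),
        Column("description", String(255)),
    )
    metadata.create_all(engine)

    # one execute() call with the full list of parameter dicts
    with engine.begin() as conn:
        conn.execute(
            customer.insert(),
            [{"name": "customer %d" % i, "description": "x"} for i in range(10000)],
        )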



adam....@gmail.com

Oct 4, 2018, 4:04:50 PM
to sqlalchemy
Mike,

No, I'm not calling execute 10k times, but I am doing what they do in your example with test_core_insert

My code for that is:
            log.info('Executing sqlalchemy insert() with bindparams as execute param - session.execute(__table__.insert(), bindparams)')
            sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), data)
            log.info('Committing')
            sql.session.commit()

where data = [OrderedDict(), OrderedDict(), ...] (I tried converting to a regular [dict(), dict(), ...] too, with no gains); this method is the fastest outside of direct or near-direct DBAPI calls, at about 8 seconds.

However, I read somewhere (can't find the source again to cite; I thought it was the sqlalchemy docs, but I may be misremembering) that this chains together individual INSERT statements instead of doing a multi-insert,
i.e. insert into target (a, b, ...) values (c, d, ...); insert into target (a, b, ...) values (e, f, ...); ... instead of an insert into target (a, b, ...) values (c, d, ...), (e, f, ...),
where the latter is the more optimal TSQL

If I'm wrong there and it does do a multi-insert, what is the difference between execute(__table__.insert().values(bindparams)) and execute(__table__.insert(), bindparams), and why would the former be ~26 seconds slower in this use case?
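To make the two forms concrete, here's roughly how they compile against the MySQL dialect (a sketch with a tiny made-up table):

    from sqlalchemy import Table, Column, Integer, MetaData
    from sqlalchemy.dialects import mysql

    metadata = MetaData()
    t = Table("t", metadata, Column("a", Integer), Column("b", Integer))
    rows = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]

    # .values(list) bakes every row into the statement itself:
    print(t.insert().values(rows).compile(dialect=mysql.dialect()))
    # INSERT INTO t (a, b) VALUES (%s, %s), (%s, %s)

    # plain insert() compiles one row's worth of placeholders; the list of
    # parameter dicts is supplied separately at execute() time:
    print(t.insert().compile(dialect=mysql.dialect()))
    # INSERT INTO t (a, b) VALUES (%s, %s)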

Mike Bayer

Oct 4, 2018, 5:47:31 PM
to sqlal...@googlegroups.com
On Thu, Oct 4, 2018 at 4:04 PM <adam....@gmail.com> wrote:
>
> Mike,
>
> No, I'm not calling execute 10k times, but I am doing what they do in your example with test_core_insert
>
> My code for that is:
> log.info('Executing sqlalchemy insert() with bindparams as execute param - session.execute(__table__.insert(), bindparams)')
> sql.session.execute(models.target.__table__.insert().prefix_with('IGNORE'), data)
> log.info('Committing')
> sql.session.commit()
>
> where data = [OrderedDict(), OrderedDict(), ...] (I tried converting to a regular [dict(), dict(), ...] too, with no gains); this method is the fastest outside of direct or near-direct DBAPI calls, at about 8 seconds.
>
> However, I read somewhere (can't find the source again to cite; I thought it was the sqlalchemy docs, but I may be misremembering) that this chains together individual INSERT statements instead of doing a multi-insert,
> i.e. insert into target (a, b, ...) values (c, d, ...); insert into target (a, b, ...) values (e, f, ...); ... instead of an insert into target (a, b, ...) values (c, d, ...), (e, f, ...),
> where the latter is the more optimal TSQL

I'm not sure what "TSQL" means in this context.

SQLAlchemy uses cursor.executemany() to invoke the statement just
once; the DBAPI decides how it wants to batch it. Regardless, it
should move very fast. On the PostgreSQL side, there's a bigger
speed difference between the way psycopg2 executes a statement
normally and some special modes it provides that allow it to chain
together multiple statements in one batch, but that's still behind
the executemany() call.
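(One way to confirm which call path a given statement takes is a cursor-level event hook; a sketch, assuming `engine` is the Engine in use:)

    from sqlalchemy import event

    @event.listens_for(engine, "before_cursor_execute")
    def show_batching(conn, cursor, statement, parameters, context, executemany):
        # executemany is True when the DBAPI receives a list of parameter sets
        count = len(parameters) if executemany else 1
        print("executemany=%s, %d parameter set(s)" % (executemany, count))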


>
> If I'm wrong there and it does do a multi-insert, what is the difference between execute(__table__.insert().values(bindparams)) and execute(__table__.insert(), bindparams), and why would the former be ~26 seconds slower in this use case?

What is "bindparams" here, that is a list of many tuples ? If so
then yes, the DBAPI is not batching efficiently in the latter case.
Try PyMySQL or mysqlclient.
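(Switching drivers is just a change to the connection URL; a sketch with placeholder credentials:)

    from sqlalchemy import create_engine

    # mysqlclient (the maintained MySQLdb fork), C-based
    engine = create_engine("mysql+mysqldb://scott:tiger@localhost/test")

    # PyMySQL, pure Python
    engine = create_engine("mysql+pymysql://scott:tiger@localhost/test")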

adam....@gmail.com

Oct 4, 2018, 7:01:44 PM
to sqlalchemy
Mike,

Er, yeah, not TSQL, my old MSSQL days are showing :D I meant a str() of SQL.

I'm saying bindparams to mean the params in whatever input format is required to bind them into the SQL in the optimal way; for example, for execute(__table__.insert(), bindparams) that would be a list of dicts.

I did a packet capture of execute(__table__.insert(), list(dict())) and it was doing the executemany-style calls (insert ... (a, b) values (c, d), (e, f)), so there was no hang-up there.
I also re-coded this to call MySQLdb and run executemany(str, [tuple()]); it takes about the same amount of time (~8s) as execute(__table__.insert(), list(dict())), so there must be something suboptimal going on in the DBAPI, as I can generate the equivalent SQL and insert it in ~3 seconds with a normal .execute(str).

I tried the test with PyMySQL for just the execute(str) case and it took about 2 seconds, so I didn't give it much more attention; I'll try it again with an executemany though.
2018-10-04 17:47:02,373 - INFO - Executing for 10,550 inserts, 89 columns, approx 6,252,460 bytes of raw data...
2018-10-04 17:47:02,373 - INFO - Generating datasets
2018-10-04 17:47:02,550 - INFO - Executing with raw_connection().executemany(str, [tuple(a, b, ...)]
2018-10-04 17:47:10,061 - INFO - Committing

            log.info('Generating datasets')
            sql_str = 'INSERT IGNORE INTO table ({}) VALUES({})'.format(','.join(data[0].keys()), ','.join(['%s' for i in range(len(data[0].keys()))]))
            sql_data = [tuple(value for value in list(part.values())) for part in data]
            log.info('Executing with raw_connection().executemany(str, [tuple(a, b, ...)]')
            conn = sql.engine.raw_connection()
            cur = conn.cursor()
            cur.executemany(sql_str, sql_data)
            log.info('Committing')
            conn.commit()


I still don't get the difference between sqlalchemy's execute(__table__.insert().values(dict())) and why it would be so much slower than execute(__table__.insert(), dict()), nor why execute(str, dict()) is that slow as well.

Simon King

Oct 5, 2018, 5:27:11 AM
to sqlal...@googlegroups.com
MySQLdb builds a new SQL string by escaping each of the parameters and
then substituting them into the original SQL:

https://github.com/farcepest/MySQLdb1/blob/master/MySQLdb/cursors.py#L182

For 89 columns * 10,550 rows, that's nearly 1 million parameters. Each
one involves a call to "db.literal(item)", which in turn calls
"self.escape(item, self.encoders)". It looks like self.escape is
implemented in C.

This means that you've got at least 2 million function calls (which
are relatively slow in Python) just to build the SQL string to send to
the server.
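(Schematically, the per-statement work is something like the following; a simplified sketch of the idea, not MySQLdb's actual code:)

    def emulate_mysqldb_executemany(sql_prefix, rows, escape):
        # escape/quote every value client-side and splice the rows into one
        # multi-VALUES statement, roughly what MySQLdb's executemany() does
        values = ",".join(
            "(" + ",".join(escape(v) for v in row) + ")" for row in rows
        )
        return sql_prefix + values  # e.g. "INSERT INTO x (a, b) VALUES " + ...

    # 89 escape() calls per row * 10,550 rows is ~940,000 calls before the
    # statement is even sent to the server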

Simon

adam....@gmail.com

Oct 5, 2018, 9:15:00 AM
to sqlalchemy
Simon,

The MySQLdb module performs this operation with very little difference from native (mysql < data.sql) speed, so I'm not concerned about that; the delta was <0.5 seconds

In my original example (I didn't know I could pass a dict to execute to have it fill in the parameters) I built the complete raw SQL by calling mysqldb's escape_string on each value, and formatting that SQL took slightly over 1 second, and I didn't even try to optimize that code.

My bigger concern/curiosity is the overhead that sqlalchemy core is adding to these inserts and just figuring out why... One might say I'm giving this too much attention because I have a working, acceptable and optimal solution, but that would be sidestepping the issue and really sidestepping sqlalchemy as a whole.

In the end, I'm pretty sure it's related to the size of my dataset, but the time deltas between doing things manually and letting sqlalchemy do its job are just too great to go unmentioned.

Mike Bayer

Oct 5, 2018, 10:17:23 AM
to sqlal...@googlegroups.com
On Fri, Oct 5, 2018 at 9:15 AM <adam....@gmail.com> wrote:
>
> Simon,
>
> The MySQLdb module performs this operation with very little difference from native (mysql < data.sql) speed, so I'm not concerned about that; the delta was <0.5 seconds
>
> In my original example (I didn't know I could pass a dict to execute to have it fill in the parameters) I built the complete raw SQL by calling mysqldb's escape_string on each value, and formatting that SQL took slightly over 1 second, and I didn't even try to optimize that code.
>
> My bigger concern/curiosity is the overhead that sqlalchemy core is adding to these inserts and just figuring out why...

SQLAlchemy Core adds a very low amount of overhead when used
correctly. The one exception to this is when there is some
processing of data occurring, which historically has involved
coercing Python Unicode objects to strings, but there are other
examples of this as well. These can be spotted by using Python
profiling.

We know that Core's overhead is low because I profile it like crazy,
and here is what typical results look like for inserts using
executemany:

https://docs.sqlalchemy.org/en/latest/faq/performance.html#i-m-inserting-400-000-rows-with-the-orm-and-it-s-really-slow

SQLAlchemy Core: Total time for 100000 records 0.21024107933 secs
sqlite3: Total time for 100000 records 0.137335062027 sec

Above, we can see that for a test which eliminates network and file overhead
by using a SQLite in-memory database, the overhead for 100000 rows
being INSERTed is less than 0.1 seconds greater than using
sqlite3 directly. When using the MySQL driver instead, the Core
doesn't have any additional work to do beyond type-specific issues.

The test suite I linked can be run for any database, so here's a run
using a local MariaDB database with mysqlclient, for 10000 rows like
you are doing (you can do the same thing here just running from the
SQLAlchemy source checkout):

$ python -m examples.performance --dburl mysql://scott:tiger@localhost/test bulk_inserts --num 10000
Tests to run: test_flush_no_pk, test_bulk_save_return_pks, test_flush_pk_given, test_bulk_save, test_bulk_insert_mappings, test_core_insert, test_dbapi_raw
test_flush_no_pk : Individual INSERT statements via the ORM, calling upon last row id (10000 iterations); total time 3.738233 sec
test_bulk_save_return_pks : Individual INSERT statements in "bulk", but calling upon last row id (10000 iterations); total time 3.571442 sec
test_flush_pk_given : Batched INSERT statements via the ORM, PKs already defined (10000 iterations); total time 0.966315 sec
test_bulk_save : Batched INSERT statements via the ORM in "bulk", discarding PKs. (10000 iterations); total time 0.496674 sec
test_bulk_insert_mappings : Batched INSERT statements via the ORM "bulk", using dictionaries. (10000 iterations); total time 0.283908 sec
test_core_insert : A single Core INSERT construct inserting mappings in bulk. (10000 iterations); total time 0.220689 sec
test_dbapi_raw : The DBAPI's API inserting rows in bulk. (10000 iterations); total time 0.189292 sec

Pretty much the same result: Core insert is .22 seconds for 10000
rows, .18 seconds with straight DBAPI. Here it is over a network
connection to a MySQL 8.0 database:

python -m examples.performance --dburl mysql://scott:tiger@mysql80/test bulk_inserts --num 10000
Tests to run: test_flush_no_pk, test_bulk_save_return_pks, test_flush_pk_given, test_bulk_save, test_bulk_insert_mappings, test_core_insert, test_dbapi_raw
test_flush_no_pk : Individual INSERT statements via the ORM, calling upon last row id (10000 iterations); total time 30.921060 sec
test_bulk_save_return_pks : Individual INSERT statements in "bulk", but calling upon last row id (10000 iterations); total time 30.468841 sec
test_flush_pk_given : Batched INSERT statements via the ORM, PKs already defined (10000 iterations); total time 1.618901 sec
test_bulk_save : Batched INSERT statements via the ORM in "bulk", discarding PKs. (10000 iterations); total time 1.136148 sec
test_bulk_insert_mappings : Batched INSERT statements via the ORM "bulk", using dictionaries. (10000 iterations); total time 0.878011 sec
test_core_insert : A single Core INSERT construct inserting mappings in bulk. (10000 iterations); total time 0.871134 sec
test_dbapi_raw : The DBAPI's API inserting rows in bulk. (10000 iterations); total time 0.870091 sec

So we can note that two of those runs took 30 seconds - those are both ORM
runs where the primary key of the row was not provided, which means
the ORM has to run each row as an individual execute() and fetch the
primary key back. The rest of the runs use executemany(), where
we can see the time for the ORM is in the 1.5 second area, and back at
the Core level there is virtually no difference in speed between
Core and straight DBAPI.


So if you want to figure out what the issue is on your end, here are
things you can do:

1. make sure you are using the C extensions (I don't recall if you
said you were on linux, or windows, or what)

2. run the above performance suite on your database

3. use the profiling recipe against your specific code (a bare-bones
sketch is included at the end of this message):
https://docs.sqlalchemy.org/en/latest/faq/performance.html#code-profiling

4. provide *runnable* examples here, e.g. an MCVE, which illustrate
*exactly* what is being called, and how, and with what data,
datatypes, schema, etc. Specifics matter. If the rows here have 5
columns vs. 100, that matters. If the rows have large binary objects,
unicodes, dates, precision decimals, that all matters. But really,
the profiling you get from #3 will show where the time is being spent.
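For item 3, a bare-bones version of that kind of profiling is just cProfile around the insert (a sketch; the session, table, and data names are placeholders for whatever your code actually does):

    import cProfile
    import pstats

    pr = cProfile.Profile()
    pr.enable()

    session.execute(target_table.insert(), data)  # the code under test
    session.commit()

    pr.disable()
    pstats.Stats(pr).sort_stats("cumulative").print_stats(30)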