Upserts in SqlAlchemy Core?

Anthony Foglia

unread,

Dec 13, 2011, 10:36:19 AM12/13/11

to sqlalchemy

I'm trying to convert our code to use the SqlAlchemy Expression
Language instead of manually building the sql statements strings (plus
we should be able to easily switch to sqlite for unit-testing
purposes). I'm worried about a few statement, in particular this
Oracle upsert command:

sql = """
merge into <table_items> d
using
(
select calc_key,internal_id from <table_items> where
calc_key = %d
minus
select calc_key,internal_id from <table_values> where
calc_key = %d
) tmp
on
(
tmp.calc_key = d.calc_key and
tmp.internal_id = d.internal_id
)
when matched then
update
set d.has_values = 'N'
""" % (self.ID, self.ID)

(That's the Python code, but it's simple string replacement.) Note
that I can rewrite it as a standard sql UPDATE, but at the cost of a
SELECTs for each (calc_key, internal_id). At the current table size
(~170 million rows) that will kill performance.

How would I do something similar with the SqlAlchemy Expression
Language? And can I do it in a way that's database-agnostic?

--
--Anthony

Michael Bayer

unread,

Dec 13, 2011, 11:20:22 AM12/13/11

to sqlal...@googlegroups.com

As far as how you'd make your own Oracle "upsert" thing for now, you'd use the @compiles system which is documented at: http://www.sqlalchemy.org/docs/core/compiler.html . We do sometimes put up recipes for these things either on the wiki or in main docs, if you come up with something other people would also want to use.

As far as agnostic, we've avoided upsert because it's another one of those areas where every backend has a completely different (or no) way of going about it. Oracle often makes these the hardest because their syntaxes are totally standalone. We actually we just passed over them when we recently added "UPDATE...FROM", as Oracle's approach to update from another table (SET (a, b) = (SELECT c, d FROM ...)) is just too far away from what other backends do for us to get into it. CONNECT BY is another crapfest which has been a key factor in preventing us from implementing common table expressions (CTE's).

I wasn't aware Oracle had a MERGE, so let's look at it (http://psoug.org/reference/merge.html ):

MERGE INTO <table> USING (<select>) ON (<criterion>) WHEN MATCHED THEN UPDATE ... DELETE ... WHEN NOT MATCHED THEN INSERT ...

MySQL has INSERT .. ON DUPLICATE KEY :

INSERT INTO <table> (<columns>) <select> ON DUPLICATE KEY UPDATE

MySQL also has REPLACE, not quite the same thing:

REPLACE INTO <table> (<columns>) SELECT ....

Firebird - they have a more restrictive syntax that doesn't actually have a SELECT:

UPDATE OR INSERT INTO <table> (<columns>) VALUES (<values>) MATCHING (<columns>)

Postgresql, SQL Server, SQLite, don't have any kind of upsert so at least we can ignore those.

Looking above it appears like Oracle's syntax is *way* more powerful than the others. You can actually fully tune each row to be a totally distinct UPDATE, DELETE, or INSERT, pretty amazing.

I'm not seeing an "agnostic" structure from this, as much as I'd love to. We could build a very fancy merge() object with Oracle that has matched_update(), matched_delete(), unmatched_insert() kinds of methods (not quite like that probably, but something like it) but that wouldn't translate to MySQL or Firebird very well.

If we built a much simpler "upsert()" construct that generated just a subset of Oracle's MERGE(), maybe, I'm a little uncomfortable with that. I don't know that I see "upsert" as a technique that should even be used in an agnostic way as it has very different implications on different backends (here's a great article on how very different MySQL and Oracle are here: http://blog.mclaughlinsoftware.com/2009/05/25/mysql-merge-gone-awry/ ) .

So as far as "generic", I'm uncomfortable and someone would need to really work out how that would work at least.

>
> --
> --Anthony
>
> --
> You received this message because you are subscribed to the Google Groups "sqlalchemy" group.
> To post to this group, send email to sqlal...@googlegroups.com.
> To unsubscribe from this group, send email to sqlalchemy+...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sqlalchemy?hl=en.
>

Anthony Foglia

unread,

Dec 13, 2011, 12:00:24 PM12/13/11

to sqlal...@googlegroups.com

On Tue, Dec 13, 2011 at 11:20 AM, Michael Bayer
<mik...@zzzcomputing.com> wrote:
> As far as how you'd make your own Oracle "upsert" thing for now, you'd use the @compiles system which is documented at: http://www.sqlalchemy.org/docs/core/compiler.html . We do sometimes put up recipes for these things either on the wiki or in main docs, if you come up with something other people would also want to use.

I was hoping one already existed. :-) This is my first big project
with databases, so I'm in a bit unfamiliar waters.

> As far as agnostic, we've avoided upsert because it's another one of those areas where every backend has a completely different (or no) way of going about it. Oracle often makes these the hardest because their syntaxes are totally standalone. We actually we just passed over them when we recently added "UPDATE...FROM", as Oracle's approach to update from another table (SET (a, b) = (SELECT c, d FROM ...)) is just too far away from what other backends do for us to get into it. CONNECT BY is another crapfest which has been a key factor in preventing us from implementing common table expressions (CTE's).
>
> I wasn't aware Oracle had a MERGE, so let's look at it (http://psoug.org/reference/merge.html ):
>
> MERGE INTO <table> USING (<select>) ON (<criterion>) WHEN MATCHED THEN UPDATE ... DELETE ... WHEN NOT MATCHED THEN INSERT ...
>
> MySQL has INSERT .. ON DUPLICATE KEY :
>
> INSERT INTO <table> (<columns>) <select> ON DUPLICATE KEY UPDATE
>
> MySQL also has REPLACE, not quite the same thing:
>
> REPLACE INTO <table> (<columns>) SELECT ....
>
> Firebird - they have a more restrictive syntax that doesn't actually have a SELECT:
>
> UPDATE OR INSERT INTO <table> (<columns>) VALUES (<values>) MATCHING (<columns>)
>
> Postgresql, SQL Server, SQLite, don't have any kind of upsert so at least we can ignore those.

SQLite has INSERT OR REPLACE (or simply REPLACE) which is similar to
MySQLs REPLACE (insert or delete-then-insert)

> Looking above it appears like Oracle's syntax is *way* more powerful than the others. You can actually fully tune each row to be a totally distinct UPDATE, DELETE, or INSERT, pretty amazing.
>
> I'm not seeing an "agnostic" structure from this, as much as I'd love to. We could build a very fancy merge() object with Oracle that has matched_update(), matched_delete(), unmatched_insert() kinds of methods (not quite like that probably, but something like it) but that wouldn't translate to MySQL or Firebird very well.
>
> If we built a much simpler "upsert()" construct that generated just a subset of Oracle's MERGE(), maybe, I'm a little uncomfortable with that. I don't know that I see "upsert" as a technique that should even be used in an agnostic way as it has very different implications on different backends (here's a great article on how very different MySQL and Oracle are here: http://blog.mclaughlinsoftware.com/2009/05/25/mysql-merge-gone-awry/ ) .

According to wikipedia MERGE is in the SQL 2008 standard
<http://en.wikipedia.org/wiki/Merge_(SQL)>, so it should start
appearing in other dbs, eventually. Not any time soon, but
eventually.

(And I think that blog post is wrong about Oracle's MERGE requires
using the primary key. AFAICT our tables have no primary keys and no
uniqueness contraints, yet we can still perform the MERGE.)

Either way, as someone whose sole SqlAlchemy experience comes from
working through the tutorial, this might be too much for me to figure
out anytime soon.

Thanks for the answer though.

--
--Anthony

Michael Bayer

unread,

Dec 13, 2011, 12:17:25 PM12/13/11

to sqlal...@googlegroups.com

On Dec 13, 2011, at 12:00 PM, Anthony Foglia wrote:

>
> According to wikipedia MERGE is in the SQL 2008 standard
> <http://en.wikipedia.org/wiki/Merge_(SQL)>, so it should start
> appearing in other dbs, eventually. Not any time soon, but
> eventually.

ah well that is good to know (and actually I did know this and just forgot). Also strange is my googling for SQL server initially didn't manage to find that they have a MERGE also. Here's PG discussing how they'd support it: http://wiki.postgresql.org/wiki/SQL_MERGE

So at least we know we'd target MERGE as expressed in the SQL standard, and then the non-standard MySQL / FB / SQLite approaches would be the exception cases.

Still, it's an enormous statement to support right now and it's not very commonly used.

We have a very old ticket for it which includes a @compiles recipe for REPLACE, though MERGE here is a lot more complex:

http://www.sqlalchemy.org/trac/ticket/960

Reply all

Reply to author

Forward