How to do an Upsert - an (atomic) update or insert?

1,295 views
Skip to first unread message

Gordan Valjak

unread,
Jun 10, 2011, 7:07:26 AM6/10/11
to Squeryl
Hello eveyone!

An upsert operation should perform an update if a record already
exists in the table,
or insert the same record into the table otherwise.

Up till now, I was using stored procedures for the job, one such as
this:
http://www.postgresql.org/docs/current/static/plpgsql-control-structures.html#PLPGSQL-UPSERT-EXAMPLE

From what I've gathered the best option would be to create a simple
retrying strategy, pseudo code ensues:

@tailrec
def upsert(row) =
try{
table.update(row)
if (updated) return

table.insert(row)
return
}
catch {
case DuplicatePrimaryKeyInsertFailure dpkif =>
// this should only happen if an insert occured right between
our update and insert statements
// keep retrying because the next one *should* pass as an update
upsert(row)
}


The main problem here is trying to identify the exception mentioned
above.

Should I use simple pattern matching to find out the error type or is
there a smarter way?
The underlying database is PostgreSQL 9, so I know the error messages
can vary depending on localization.

Of course, if there is alredy some smart atomic way of performing an
upsert that would be much better then reinventing the wheel.

Thanks,
Gordan Valjak

David Whittaker

unread,
Jun 10, 2011, 9:00:05 AM6/10/11
to squ...@googlegroups.com
Gordan,

Can you give a use case?  Looking at the example you linked to I assume that you are assigning your own primary keys (they aren't auto-generated) and the entity you are updating has not previously been retrieved from the DB.  Is that correct?

-Dave

Peter Brant

unread,
Jun 10, 2011, 9:51:01 AM6/10/11
to squ...@googlegroups.com
I think I'd use SQLException#getSQLState(). That will have a stable
value regardless of any localized message (and if you're lucky might
even be independent across RDBMS products).

Pete

Brian Maso

unread,
Jun 10, 2011, 11:18:35 AM6/10/11
to squ...@googlegroups.com
I'm a big fan of MySql's "INSERT...ON DUPLICATE KEY UPDATE..." statement.

1. Does Postgres have something like that, or do you have to use the sproc mechanism you describe? (For bulk upserts I'm trying to picture how you'd do that efficiently?)

2. It would be great in Squeryl if we could "extend" a schema with DB-specific syntax. I'm sure there's a way (mixin a "MySqlSpecificOps" trait maybe?) If anyone has a rough outline of how to do such a thing it would be greatly appreciated.

--
Best regards,
Brian Maso
(949) 395-8551
Follow me: @bmaso
br...@blumenfeld-maso.com

Maxime Lévesque

unread,
Jun 10, 2011, 11:47:19 PM6/10/11
to squ...@googlegroups.com

Have you tried the table.insertOrUpdate(x) method ?

 http://squeryl.org/api/index.html#org.squeryl.Table

A Schema extension trait that would have implicits to inject specific DB features
could be a way to do what you are describing.

Gordan Valjak

unread,
Jun 13, 2011, 7:54:12 AM6/13/11
to Squeryl
Here is the example:

I have table which stores data about different file types

trait BaseEntity extends KeyedEntity[Long] {
val id : Long = 0
@Column("created_at")
val createdAt = new Timestamp(System.currentTimeMillis)
@Column("updated_at")
var updatedAt = new Timestamp(System.currentTimeMillis)
}

case class FileType(
ext : String,
alt : Option[String],
`type` : String,
mime : String) extends BaseEntity

Keys are autogenerated, but there is unique constraint on ext field
which provides "real" uniqueness.
As i understand, insertOrUpdate function works only for KeyedEntity
but not in this case where some other unique constraint is violated,
and does not solve problem of concurrent transactions.

Problem is when 2 concurrent transactions try to perform upsert on the
same time. Both would fail to do update, but after that one of them
will insert new row,
but other will end with unique violation. Idea is to catch that
exception and then try everything again (first update, then insert).

Postgres doesn't have upsert function, and proposed stored procedure
is way to do it.

Proposed squeryl pseudo code should work for every DB, but problem is
that procedure have to be done again only in case of unique violation
exception,
not in case some other error or exception. So the question is: is
there any way of catching only that type of exception?

Thanks,
Gordan

Maxime Lévesque

unread,
Jun 13, 2011, 8:53:04 AM6/13/11
to squ...@googlegroups.com

Try to get this particular exception thrown, catch SQLException and
look at the values of : getSQLState   and  getErrorCode

try {

}
catch {
  case e:SQLException =>  {
  
    ... inspect th e.getSQLState  and/or e.getErrorCode ....
  }
}

According to this table :

 http://www.postgresql.org/docs/8.1/static/errcodes-appendix.html


  if(e.getErrorCode == 23505)

should do it.

ML
Reply all
Reply to author
Forward
0 new messages