InsertAll for records that do not already exist with Postgres and Slick

1,197 views
Skip to first unread message

mbu...@quantifind.com

unread,
Jun 25, 2014, 8:25:34 AM6/25/14
to scala...@googlegroups.com
I have many records I would like to insert, so it would be faster to use insertAll, but some of them already exist. I have been searching on StackOverflow to find a way to do this:

This question explains how to insert a single record only if it does already exist: 


Then these questions discuss insertAll:


I tried this approach: do a query first, then use the results to filter the insert, but it did not work and was quite convoluted:

     val commentIds = results.map(_.commentUrl).toSet
     val alreadyInserted = comments.where(_.commentUrl inSetBind commentIds).list.toSet
     val records = results.filterNot(r => alreadyInserted.contains(r.commentUrl)).map(commentToRecord(_))
     comments.insertAll(records: _*)

Is there a better approach? Thanks!

Mark Butler

unread,
Jun 25, 2014, 1:13:18 PM6/25/14
to scala...@googlegroups.com
Even if I use the insert if does not exist that Christopher Vogt describes here:

http://stackoverflow.com/questions/18864351/scalatra-slick-and-insert-if-not-exists

e.g.

    db withSession {
      implicit session: Session =>
        val records = result.videos.map(v => videoToRecord(queryTerm, v))
        records foreach {
          record =>
            if (!videos.filter(_.videoId === record.videoId).exists.run)
              videos += record
        }
        queryResults += YtQueryResultRecord(queryTerm, result.timePeriods, currentTimestamp)
    }

I still get exceptions thrown for duplicates - see below. I guess Slick is using two operations for the insert? I have multiple threads doing inserts - is there anyway around this without using locking locally in the client?

Exception in thread "main" org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "video_id_idx"
  Detail: Key (video_id)=(W3n4JTF_cNs) already exists.
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2161)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1890)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:560)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
at org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)
at com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:1423)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker$$anonfun$internalInsert$1.apply(JdbcInvokerComponent.scala:115)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker$$anonfun$internalInsert$1.apply(JdbcInvokerComponent.scala:112)
at scala.slick.jdbc.JdbcBackend$SessionDef$class.withPreparedStatement(JdbcBackend.scala:161)
at scala.slick.jdbc.JdbcBackend$BaseSession.withPreparedStatement(JdbcBackend.scala:297)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker.prepared(JdbcInvokerComponent.scala:101)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker.internalInsert(JdbcInvokerComponent.scala:112)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker.insert(JdbcInvokerComponent.scala:104)
at scala.slick.driver.JdbcInvokerComponent$BaseInsertInvoker.$plus$eq(JdbcInvokerComponent.scala:151)



--

---
You received this message because you are subscribed to a topic in the Google Groups "Slick / ScalaQuery" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/scalaquery/Z8Ktx3jVbms/unsubscribe.
To unsubscribe from this group and all its topics, send an email to scalaquery+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/scalaquery/2a5ae1f7-b865-43e2-8ffe-fe86f8b569b3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Vanger B

unread,
Jun 25, 2014, 5:13:34 PM6/25/14
to scala...@googlegroups.com
There is an "insertOrUpdate" in Slick version 2.1 (I'm using 2.1-M2), so you may do something like:
collection.foreach(table.insertOrUpdate)

But hat didn't worked for me, because I must have different projections for Insert and Update (either i insert whole entity or update three columns), so I did the following:
First - I update each record from collection, and save how much rows is affected by update (it should be 1 or 0). Then I filter records which weren't updated and insert them in batch:

val notUpdated = myCollection.map(elem => (elem, table.filter(_.id === elem.id).update(elem))).filter(_._2 == 0).map(_._1) //filter results where 'updated rows' is zero, then leave only CaseFile instances
//now insert the rest.
table.insertAll(notUpdated: _*)

I simplified more complex example, so u can probably throw away unnecessary filtration by id. And it would become something like:
val notUpdated = myCollection.map(elem => (elem, table.update(elem))).filter(_._2 == 0).map(_._1)
//now insert the rest.
table.insertAll(notUpdated: _*)
Reply all
Reply to author
Forward
0 new messages