Slick stalls when using transactionally flatMapped DBIO actions


Peter Mortier

Nov 23, 2015, 11:59:51 AM
to Slick / ScalaQuery
We are running into Slick issue https://github.com/slick/slick/issues/1274

I managed to reproduce it and have posted a mini-project to GitHub: https://github.com/kwark/slick-deadlock

TL;DR: using flatMapped DBIO actions together with transactions on a HikariCP pool leads to connection starvation when you throw a bit of load at it.

Anybody else seen this?

Seems like a serious issue.

Brett Wooldridge

Nov 24, 2015, 8:55:12 AM
to Slick / ScalaQuery
I don't know how Slick manages its thread pool, but someone needs to make sure that the pool is sized properly for both the number of threads and the number of Connections that may be obtained by a single thread.

Stefan Zeiger

Nov 24, 2015, 1:23:12 PM
to scala...@googlegroups.com
On 2015-11-24 14:55, Brett Wooldridge wrote:
I don't know how Slick manages its thread pool, but someone needs to make sure that the pool is sized properly for both the number of threads and the number of Connections that may be obtained by a single thread.


This looks like it might be a similar effect, although there is no low per-thread bound on the number of connections that could be used to size the pool. The upper bound for the total number of connections is queue size + thread pool size (because in order to require a connection, an action must either be running at the moment or have a follow-on action scheduled with a pinned session). The example from the ticket is:
  val tasks: IndexedSeq[Future[Int]] = 1 to 20 map { i =>
    val action = { table += i }.map { identity }
    database.run(action.transactionally)
  }
This can lead to a deadlock between the thread pool and the connection pool. The "map" call causes the action to be suspended (because the mapping function needs to be executed on a different thread) while holding a connection (due to "transactionally"). The continuation is scheduled with high priority to ensure that it can always be inserted into the task queue. If you start enough of these actions concurrently, you end up in a situation where the connection pool is exhausted, so getConnection blocks until an existing connection is freed up. But this blocking happens on the database thread pool, which prevents the continuations that would eventually commit the transactions and return the connections to the pool from running, and that causes the deadlock.

Unfortunately this is where my line of thought collapses because HikariCP should already guard against these situations. There is a "connectionTimeout" parameter which we set to 1s by default (it would be nice if you could set it to 0 -- we never want to wait for a connection if the pool is exhausted), so any such deadlock should be broken automatically after a second. In the example above and assuming a thread pool size of 2, it should not take more than 18 seconds for the deadlock to be broken and after that time at least the first two inserts should have succeeded.
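
For reference, a minimal sketch of how these bounds map onto configuration, assuming Slick 3.1-style config keys (numThreads, queueSize, maxConnections, connectionTimeout) and a hypothetical in-memory H2 database; the sizes are only illustrative:

  import com.typesafe.config.ConfigFactory
  import slick.jdbc.JdbcBackend.Database

  // Sized so the pool can satisfy the worst case described above:
  // maxConnections = queueSize + numThreads.
  val config = ConfigFactory.parseString("""
    deadlockTest {
      url = "jdbc:h2:mem:test"
      connectionPool = "HikariCP"
      numThreads = 2
      queueSize = 10
      maxConnections = 12       // queueSize + numThreads
      connectionTimeout = 1000  // ms, HikariCP's getConnection timeout
    }
  """)

  val db = Database.forConfig("deadlockTest", config)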

--
Stefan Zeiger
Slick Tech Lead
Typesafe - Build Reactive Apps!
Twitter: @StefanZeiger

Brett Wooldridge

Nov 24, 2015, 7:24:08 PM
to Slick / ScalaQuery
If HikariCP is not timing out each call to getConnection() after the configured connectionTimeout it would come as a surprise and would certainly be a bug.  A simple reproduction scenario (in Scala, Java, or whatever) would be greatly appreciated.

Regarding the connectionTimeout in general, while you think a connectionTimeout of 0 sounds attractive, I think in practice it certainly is not.  Consider a pool of 10 connections.  At some point a Connection reaches its maxLifetime and is retired from the pool.  A thread is scheduled to replace the Connection in the pool.  While this may not take long, Connection setup still requires a non-zero amount of time -- possibly a few milliseconds, possibly triple-digit milliseconds in the case of a busy server or SSL setup for an encrypted connection.  Either way, for some period of time (very likely less than 1000ms), the pool has 9 connections, not 10.  If nine of the ten connections are in use and we allowed a connectionTimeout of 0ms, a caller to getConnection() would immediately fail, even though the pool is below capacity.  This is certainly not what most users would expect or want (including Slick users, I suspect).
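
For concreteness, these are the HikariCP knobs being discussed, shown against a plain pool setup (the values and the H2 URL are only illustrative):

  import com.zaxxer.hikari.{HikariConfig, HikariDataSource}

  val hc = new HikariConfig()
  hc.setJdbcUrl("jdbc:h2:mem:test")   // illustrative target database
  hc.setMaximumPoolSize(10)
  hc.setMaxLifetime(30 * 60 * 1000L)  // connections are retired and replaced after 30 minutes
  hc.setConnectionTimeout(1000L)      // getConnection() gives up after 1s instead of failing immediately
  val ds = new HikariDataSource(hc)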

Again, if HikariCP is not timing out at the configured connectionTimeout, this is certainly a serious issue.  The number of questions we receive related to failure to obtain a connection due to reaching the connectionTimeout indicates that in general this functionality works as expected.  But I cannot discount the possibility of an edge case not covered by existing unit tests.  Typesafe's assistance in troubleshooting this issue would be greatly appreciated.

Even if the connectionTimeout works as expected, the inability to execute the above code successfully is probably a non-starter for Slick users.  It sounds like either the mechanics of how connections are obtained and released within Slick need to change, or some logic needs to be added to dynamically raise and lower the maxPoolSize.  The maxPoolSize is changeable through the MBean, but if you cannot go this route and you need more direct access, we can certainly work together on a solution.

Regards,
Brett Wooldridge

Stefan Zeiger

Nov 25, 2015, 9:39:26 AM
to scala...@googlegroups.com
On 2015-11-25 1:24, Brett Wooldridge wrote:
If HikariCP is not timing out each call to getConnection() after the configured connectionTimeout it would come as a surprise and would certainly be a bug.  A simple reproduction scenario (in Scala, Java, or whatever) would be greatly appreciated.

I did some more digging. It looks like there are several factors that contribute to the problem. First of all, the connectionTimeout in HikariCP seems to work as it should. After raising the timeout in the test case, it always aborted with a SQLTimeoutException indicating that HikariCP gave up just a bit after 1000ms, as configured.

I ran the following test with 1 thread and 1 connection:
  val tasks = 1 to 5 map { i =>
    val a = sql"select 1".as[Int].head.map(identity)
    dc.db.run(a.transactionally)
  }
  Await.result(Future.sequence(tasks), Duration(30, TimeUnit.SECONDS))
The expectation is that this will fail after a bit over 1 second but in fact it takes over 12 seconds for the SQLTimeoutException to arrive.

Problem #1 is in "transactionally":
  def transactionally: DBIOAction[R, S, E with Effect.Transactional] = SynchronousDatabaseAction.fuseUnsafe(
    StartTransaction.andThen(a).cleanUp(eo => if(eo.isEmpty) Commit else Rollback)(DBIO.sameThreadExecutionContext)
      .asInstanceOf[DBIOAction[R, S, E with Effect.Transactional]]
  )
This schedules a Rollback even when StartTransaction fails. Fixing this needs further investigation. I don't think it would be as simple as moving the "StartTransaction andThen ..." out of the "cleanUp" because StartTransaction has already pinned the session if it fails in the actual "setAutoCommit(false)" call, so we'll have to distinguish between failures during connection acquisition and later ones. The same problem exists in "withPinnedSession" but it will require a different kind of fix because DBIO.Pin doesn't acquire a connection, so the actual failure comes later. Maybe it *should* force the connection, then we could immediately unpin the session if this fails.

OK, moving on by applying the fix tentatively. Now the failure occurs after 4 seconds. (So why did it fail after 12 seconds before and not after 8? I have no idea. It could be yet another bug.) This time it's due to Future.sequence, which fails unnecessarily late because it gathers results sequentially. Even though Future #3 in the sequence has already failed (so it's clear that the resulting Future will also fail), Future.sequence waits for Futures #1 and #2 to complete first. That seems wrong to me, and fixing it wouldn't be hard, but it would change the semantics (by propagating the first failure, not the leftmost one).

After implementing a suitable combinator myself, I get a test failure after just over 1 second, as expected.
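
The combinator itself is not shown in the thread. A minimal fail-fast variant of Future.sequence along those lines might look like this (a hypothetical helper, not the code used above):

  import scala.concurrent.{ExecutionContext, Future, Promise}

  // Like Future.sequence, but completes with the first failure as soon as it
  // happens instead of waiting for the earlier futures in the sequence.
  def sequenceFailFast[A](fs: Seq[Future[A]])(implicit ec: ExecutionContext): Future[Seq[A]] = {
    val p = Promise[Seq[A]]()
    fs.foreach(_.failed.foreach(p.tryFailure))  // first failure wins
    Future.sequence(fs).foreach(p.trySuccess)   // otherwise deliver all results
    p.future
  }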


Regarding the connectionTimeout in general, while you think a connectionTimeout of 0 sounds attractive, I think in practice it certainly is not.  Consider a pool of 10 connections.  At some point a Connection reaches its maxLifetime and is retired from the pool.  A thread is scheduled to replace the Connection in the pool.  While this may not take long, Connection setup still requires a non-zero amount of time -- possibly a few milliseconds, possibly triple-digit milliseconds in the case of a busy server or SSL setup for an encrypted connection.  Either way, for some period of time (very likely less than 1000ms), the pool has 9 connections, not 10.  If nine of the ten connections are in use and we allowed a connectionTimeout of 0ms, a caller to getConnection() would immediately fail, even though the pool is below capacity.  This is certainly not what most users would expect or want (including Slick users, I suspect).

In a traditional, blocking application design, you spawn lots of threads and they get stuck in getConnection calls concurrently if you overload the pool. In this case the connectionTimeout serves as a timeout to fail individual calls. In Slick's semi-asynchronous design it has a very different effect. The only way to overload the pool is by keeping transactions open while waiting for asynchronous calls to return. This can lead to the observed deadlocks. The simplest way to avoid those would be to fail immediately when there are no free connections in the pool. You could also treat the case where you are waiting to replace an expired connection differently and only fail immediately when the pool is at its maximum capacity. OTOH, Slick could implement this feature directly and make it work with any pool implementation. The overhead would be one atomic int operation on each connection acquisition and release.
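
A bare-bones sketch of that counting idea (a hypothetical wrapper, not Slick or HikariCP code), just to illustrate the cost being described:

  import java.sql.{Connection, SQLTransientConnectionException}
  import java.util.concurrent.atomic.AtomicInteger
  import javax.sql.DataSource

  // Fail fast when we already know the pool is exhausted, instead of blocking
  // inside getConnection and tying up a database thread.
  class CountingSource(underlying: DataSource, maxConnections: Int) {
    private val inUse = new AtomicInteger(0)

    def acquire(): Connection = {
      if (inUse.incrementAndGet() > maxConnections) {
        inUse.decrementAndGet()
        throw new SQLTransientConnectionException("connection pool exhausted, failing fast")
      }
      try underlying.getConnection()
      catch { case e: Throwable => inUse.decrementAndGet(); throw e }
    }

    def release(c: Connection): Unit = {
      c.close()              // hand the connection back to the real pool
      inUse.decrementAndGet()
    }
  }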

Another option for Slick is different prioritization of continuations. It's more work because we have to implement a custom queue (well, even more custom than the one we already have). Currently we use low-priority tasks for new actions and high-priority tasks for continuations. The difference is that high-priority tasks can use the entire task queue size (which is actually double of what you configure) whereas low-priority tasks can only use the configured size. This ensures that continuations can always be enqueued but it doesn't affect scheduling, which is always FIFO. We could use an even higher priority for continuations which hold a connection, and enqueue those in front of all other tasks. This should prevent the deadlocks in the first place.
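
Slick's actual queue is not reproduced here; a simplified sketch of the proposed three-level prioritization, with hypothetical names, could look like this:

  import java.util.concurrent.LinkedBlockingDeque

  sealed trait Prio
  case object Fresh extends Prio           // new action, no connection yet
  case object Continuation extends Prio    // resumed action, no connection held
  case object WithConnection extends Prio  // resumed action that already holds a connection

  final case class Task(prio: Prio, run: () => Unit)

  // Fresh tasks may only use the configured size, continuations may use the
  // full (doubled) capacity, and connection-holding continuations jump to the
  // front. The size check is approximate; a real implementation would need it
  // to be atomic with the insert.
  class PrioritizedQueue(configuredSize: Int) {
    private val deque = new LinkedBlockingDeque[Task](configuredSize * 2)

    def offer(t: Task): Boolean = t.prio match {
      case WithConnection => deque.offerFirst(t)
      case Continuation   => deque.offerLast(t)
      case Fresh          => if (deque.size() >= configuredSize) false else deque.offerLast(t)
    }

    def take(): Task = deque.takeFirst()
  }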

Peter Mortier

Nov 25, 2015, 1:28:51 PM
to Slick / ScalaQuery
Stefan,

Thanks for the thorough investigation.

On Wednesday, November 25, 2015 at 15:39:26 UTC+1, Stefan Zeiger wrote:
In a traditional, blocking application design, you spawn lots of threads and they get stuck in getConnection calls concurrently if you overload the pool. In this case the connectionTimeout serves as a timeout to fail individual calls. In Slick's semi-asynchronous design it has a very different effect. The only way to overload the pool is by keeping transactions open while waiting for asynchronous calls to return. This can lead to the observed deadlocks. The simplest way to avoid those would be to fail immediately when there are no free connections in the pool. You could also treat the case where you are waiting to replace an expired connection differently and only fail immediately when the pool is at its maximum capacity. OTOH, Slick could implement this feature directly and make it work with any pool implementation. The overhead would be one atomic int operation on each connection acquisition and release.


I'm not sure I'm in favour of unrelated DBIO actions suddenly starting to fail because of other transactional operations waiting for their callbacks to finish. In my opinion, once a DBIO action has been scheduled on the Slick queue it should get at least a chance to execute. Failing it outright would be a violation of the principle of least astonishment.
 
Another option for Slick is different prioritization of continuations. It's more work because we have to implement a custom queue (well, even more custom than the one we already have). Currently we use low-priority tasks for new actions and high-priority tasks for continuations. The difference is that high-priority tasks can use the entire task queue size (which is actually double of what you configure) whereas low-priority tasks can only use the configured size. This ensures that continuations can always be enqueued but it doesn't affect scheduling, which is always FIFO. We could use an even higher priority for continuations which hold a connection, and enqueue those in front of all other tasks. This should prevent the deadlocks in the first place.

 
An alternative design might be to use a second thread pool that only executes the continuations that are holding connections.
Since there can't be more continuations than the number of threads in the main thread pool, you only need a small queue. And because a connection is already available, the continuation is always guaranteed to proceed.
Of course my lack of knowledge of the Slick internals might show here, and maybe this is not a viable solution.

Kind Regards,

Peter Mortier

Stefan Zeiger

Nov 25, 2015, 2:46:30 PM
to scala...@googlegroups.com
On 2015-11-25 19:28, Peter Mortier wrote:

On Wednesday, November 25, 2015 at 15:39:26 UTC+1, Stefan Zeiger wrote:
In a traditional, blocking application design, you spawn lots of threads and they get stuck in getConnection calls concurrently if you overload the pool. In this case the connectionTimeout serves as a timeout to fail individual calls. In Slick's semi-asynchronous design it has a very different effect. The only way to overload the pool is by keeping transactions open while waiting for asynchronous calls to return. This can lead to the observed deadlocks. The simplest way to avoid those would be to fail immediately when there are no free connections in the pool. You could also treat the case where you are waiting to replace an expired connection differently and only fail immediately when the pool is at its maximum capacity. OTOH, Slick could implement this feature directly and make it work with any pool implementation. The overhead would be one atomic int operation on each connection acquisition and release.


I'm not sure I'm in favour of unrelated DBIO actions suddenly starting to fail because of other transactional operations waiting for their callbacks to finish. In my opinion, once a DBIO action has been scheduled on the Slick queue it should get at least a chance to execute. Failing it outright would be a violation of the principle of least astonishment.

In case of a deadlock there is no chance of succeeding. Until a continuation gets to run and free up a connection, all other actions will fail. The best we can do is to make them fail as quickly as possible.


 An alternative design might be to use a second thread pool that only executes the continuations that are holding connections.
Since there can't be more continuations than the number of threads in the main thread pool, you only need a small queue. And because a connection is already available, the continuation is always guaranteed to proceed.

Multiple thread pools make correct sizing impossible. You shouldn't be able to run more database actions concurrently with transactions than without.

After thinking about this some more, the high-priority scheduling alone can almost avoid the deadlocks but not quite. Let's take the following example with a thread pool size of 1 and a connection pool size of 1:
  val tasks = 1 to 50 map { i =>
    val a = sql"select 1".as[Int].head.map { i => Thread.sleep(100); i }
    dc.db.run(a.transactionally)
  }
  Await.result(Future.sequence(tasks), Duration(30, TimeUnit.SECONDS))

1) All 50 actions are started and put into the task queue (because they begin with a SynchronousDatabaseAction which has been fused from StartTransaction plus the query).

2) The first action runs, acquiring the only Connection from the pool and hanging on to it (because of the transaction).

3) Without the "map" call, the next step would be another SynchronousDatabaseAction, so we'd put it at the front of the queue and the rest of the action, up to the final Commit, could run and release the connection. (Note that this doesn't occur in practice and the example already works fine without "map" because all the actions get fused.) But we do have a "map", so it gets passed to the global ExecutionContext for asynchronous execution.

4) While the "map" runs asynchronously, the database thread pool starts processing the next job. The asynchronous call finishes 100ms later and enqueues the continuation (which releases the connection) at the front, but it's too late. The next action has already been started and will block on getConnection, so we have a deadlock.

Failing without blocking would resolve the deadlock without delay but it would drain the entire task queue and make all tasks except for the first one fail. That's not really what we want.

On the other hand, waiting for one second is not desirable, either, because after task #2 has failed and task #1 has finally completed, we just go back to step 1 and run into the same deadlock again with tasks #3 and #4. This cuts the wait time in half (25 seconds minimum, 75 with the current implementation) and still makes half of the tasks fail.

How about we count the open connections and suspend the database thread pool if no new connections are available and the next task in the queue is not of the highest priority (with an existing connection)? I think this would solve the problem completely. Or am I overlooking yet another corner case?
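
A rough sketch of that rule, with a hypothetical job type standing in for Slick's internal tasks (single-worker, peek/take races ignored):

  import java.util.concurrent.LinkedBlockingDeque
  import java.util.concurrent.atomic.AtomicInteger

  // Hypothetical stand-in for a queued Slick job.
  final case class QueuedJob(holdsConnection: Boolean, run: () => Unit)

  class GatedWorker(queue: LinkedBlockingDeque[QueuedJob],
                    openConnections: AtomicInteger,
                    maxConnections: Int) {

    // Return the next job that is safe to run, or None to signal that the
    // worker should pause until a connection is released or a connection-holding
    // continuation arrives. Running anything else could block on getConnection.
    def nextRunnable(): Option[QueuedJob] = Option(queue.peekFirst()) match {
      case Some(j) if j.holdsConnection                      => Some(queue.takeFirst())
      case Some(_) if openConnections.get() < maxConnections => Some(queue.takeFirst())
      case _                                                 => None
    }
  }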

Brett Wooldridge

Nov 26, 2015, 9:19:44 AM
to Slick / ScalaQuery
Without knowing the Slick internals, I'm going to throw something out there anyway.  But first, a question.  When map is used to asynchronously run a bunch of DB tasks "transactionally", is it expected that each task is a discrete transaction, or does it make more sense for them to be considered one transaction?

The reason I ask is this.  In many JTA implementations, all actions on a single thread, between transaction start and commit/rollback are a single transaction.  In most JTA implementations the behavior is such that no matter how many times a thread calls DataSource.getConnection(), it will always be given the same Connection rather than consuming another one from the pool -- for as long as it is participating in a transaction.  This alleviates a tremendous amount of pressure from the pool, and avoids "pool-locking" in general.

In procedural languages it also eliminates the need to pass a Connection parametrically down through all of the methods involved.  For example, if method foo() starts a transaction, then calls method bar(), followed by baz(), and baz() calls zap(), each method can simply call getConnection() as freely as it likes and they will all be given the same Connection instance.  Eventually control returns to foo(), which commits/rolls back and ends the transaction.
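
An illustration of that behavior, heavily simplified and not taken from Bitronix or any real JTA implementation:

  import java.sql.Connection
  import javax.sql.DataSource

  // Within a "transaction" on the current thread, every getConnection() call
  // returns the same underlying Connection instead of drawing a new one.
  class TransactionAwareSource(underlying: DataSource) {
    private val current = new ThreadLocal[Connection]

    def begin(): Unit = {
      val c = underlying.getConnection()
      c.setAutoCommit(false)
      current.set(c)
    }

    def getConnection(): Connection =
      Option(current.get()).getOrElse(underlying.getConnection())

    def commit(): Unit = {
      val c = current.get()
      if (c != null) {
        try { c.commit(); c.close() } finally current.remove()
      }
    }
  }

In the foo/bar/baz example, foo() would call begin() and commit(), while bar(), baz() and zap() simply call getConnection() and all receive the same instance.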

I don't know if there is a similar concept in Slick, or one that could be introduced, but it might be one solution.  For example, if all tasks queued by the same map iteration, running "transactionally" were made to acquire the same Connection -- through some sort of ThreadLocal or session association.  This might require starting the transaction somehow prior to the map iteration, such that each task would implicitly be pulled into the transaction and share a connection.

Just a thought, based on my experience with JTA (as one of the core developers on Bitronix JTA).

-Brett


Brett Wooldridge

Nov 26, 2015, 9:24:50 AM
to Slick / ScalaQuery
One last note: such "shared" Connection transactionality also performs an order of magnitude better [at the DB layer] than dozens, hundreds, or thousands of individual transactions.
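
In Slick terms this is already expressible by combining the per-element actions into one action and wrapping that in a single transaction; a sketch, reusing the table and database values from the example at the top of the thread:

  import scala.concurrent.Future
  import slick.driver.H2Driver.api._   // or whichever profile is in use

  // One transaction (and one pooled connection) for all 20 inserts,
  // instead of 20 independent transactions competing for connections.
  val batched = DBIO.seq((1 to 20).map(i => table += i): _*)
  val done: Future[Unit] = database.run(batched.transactionally)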

-Brett

Peter Mortier

Nov 26, 2015, 10:36:48 AM
to Slick / ScalaQuery


On Wednesday, November 25, 2015 at 20:46:30 UTC+1, Stefan Zeiger wrote:
After thinking about this some more, the high-priority scheduling alone can almost avoid the deadlocks but not quite. Let's take the following example with a thread pool size of 1 and a connection pool size of 1:
  val tasks = 1 to 50 map { i =>
    val a = sql"select 1".as[Int].head.map { i => Thread.sleep(100); i }
    dc.db.run(a.transactionally)
  }
  Await.result(Future.sequence(tasks), Duration(30, TimeUnit.SECONDS))

1) All 50 actions are started and put into the task queue (because they begin with a SynchronousDatabaseAction which has been fused from StartTransaction plus the query).

2) The first action runs, acquiring the only Connection from the pool and hanging on to it (because of the transaction).

3) Without the "map" call, the next step would be another SynchronousDatabaseAction, so we'd put it at the front of the queue and the rest of the action, up to the final Commit, could run and release the connection. (Note that this doesn't occur in practice and the example already works fine without "map" because all the actions get fused.) But we do have a "map", so it gets passed to the global ExecutionContext for asynchronous execution.

4) While the "map" runs asynchronously, the database thread pool starts processing the next job. The asynchronous call finishes 100ms later and enqueues the continuation (which releases the connection) at the front, but it's too late. The next action has already been started and will block on getConnection, so we have a deadlock.

Failing without blocking would resolve the deadlock without delay but it would drain the entire task queue and make all tasks except for the first one fail. That's not really what we want.

On the other hand, waiting for one second is not desirable, either, because after task #2 has failed and task #1 has finally completed, we just go back to step 1 and run into the same deadlock again with tasks #3 and #4. This cuts the wait time in half (25 seconds minimum, 75 with the current implementation) and still makes half of the tasks fail.

How about we count the open connections and suspend the database thread pool if no new connections are available and the next task in the queue is not of the highest priority (with an existing connection)? I think this would solve the problem completely. Or am I overlooking yet another corner case?


I can't think of any other scenarios/corner cases. I guess connection counting together with additional queue prioritisation will solve the deadlock problem.

Peter Mortier
 

Stefan Zeiger

Nov 27, 2015, 12:56:02 PM
to scala...@googlegroups.com
On 2015-11-26 15:19, Brett Wooldridge wrote:
I don't know if there is a similar concept in Slick, or one that could be introduced, but it might be one solution.  For example, if all tasks queued by the same map iteration, running "transactionally" were made to acquire the same Connection -- through some sort of ThreadLocal or session association.  This might require starting the transaction somehow prior to the map iteration, such that each task would implicitly be pulled into the transaction and share a connection.

That's what we already do. Not a ThreadLocal, there's no fixed association with a single thread, but a single Connection in the Context on which the transaction runs. And that's what causes the deadlocks. In a blocking application all transactions that are ready to continue with a database action (and eventually end the transaction and return the connection to the pool) are running at the same time on different threads, most of the time blocked on a database call. It's up to the database server to schedule these actions appropriately.

Slick does some scheduling of its own by funneling everything through a thread pool. This can lead to a state where the connection pool is exhausted and the thread pool is waiting (in vain) for getConnection calls to return in order to run new actions (which do not yet have a connection) and at the same time the existing actions (which already have a connection) cannot continue to run (and thus eventually release their connections) because they are waiting for a thread to become available.