Threads blocking on database connection pool

2,448 views
Skip to first unread message

Chris Blackwell

unread,
Apr 24, 2014, 4:57:04 AM4/24/14
to railo
I'm spinning up multiple threads in an attempt to process a queue of work quickly, each thread picks an item from the head of the queue and processes it, which involves some db interaction. 
What i see in VisualVM is that only a few threads are ever active, most are blocking and a thread dump shows that they are waiting to release a db connection.

here's some simplified code.


qWorkQueue = getWorkToDo();
threadcount = 20;
for(i=1; i<=threadcount; i++) {
  thread action="run" name="runner-#i#" {
  thread.run = true;
  while(thread.run) {
  thread.row = getNext();
  if(!isNull(thread.row)) {
  doWork(thread.row);
  }
  else {
  thread.run = false;
  }
  }
  }
}
for(i=1; i<=threadcount; i++) {
  thread action="join" name="runner-#i#";
}

function getNext() {
  lock name="get-next-lock" timeout="5" {
  if(qWorkQueue.recordcount) {
  var row = qWorkQueue.rowData(1);
  qWorkQueue.deleterow(1);
  }
  }
  return row ?: nullValue();
}


Heres what i see in VisualVM




Those blocked threads in red typically show the following in a thread dump

"cfthread-75" prio=5 tid=0x00007f836d842800 nid=0xec03 waiting for monitor entry [0x00000001197c4000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:94)
- waiting to lock <0x0000000706368678> (a railo.runtime.db.DCStack)
at railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:86)
at railo.runtime.db.DatasourceManagerImpl.releaseConnection(DatasourceManagerImpl.java:113)
at railo.runtime.tag.Query.executeDatasoure(Query.java:697)
at railo.runtime.tag.Query.doEndTag(Query.java:533)
at model.worker_cfc$cf.udfCall(/model/Worker.cfc:181)

Is this something i'm doing wrong, or is there something going on in Railo thats causing this contention?

Cheers,
Chris

Nick S.

unread,
Apr 24, 2014, 9:32:41 AM4/24/14
to ra...@googlegroups.com
I think this ticket is related. See this comment in RAILO-2888:

Chris Blackwell

unread,
Apr 24, 2014, 11:18:16 AM4/24/14
to railo
According to the ticket this is resolved in 4.2.0.002 and i'm running 4.2.0.007
However the first 'at' line of the stack trace does point to a method call within a synchronized block

at railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:94)

points here, but i've no idea if this is good, bad or just meh?

Chris


--
Did you find this reply useful? Help the Railo community and add it to the Railo Server wiki at https://github.com/getrailo/railo/wiki
---
You received this message because you are subscribed to the Google Groups "Railo" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/railo/41c31418-a0ad-4641-912a-829649308c47%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Igal @ getRailo.org

unread,
Apr 24, 2014, 11:22:30 AM4/24/14
to ra...@googlegroups.com
java.lang.Thread.State: BLOCKED (on object monitor)
when a thread is blocked on object monitor then it means that it is trying to get into a synchronized block, but another thread has already entered that synchronized block and acquired the monitor of the object that is used for synchronization (I'm trying to explain, but it's possible that my explanation here is more cryptic than the original message...  too early...  not enough coffee...  the dog ate my homework...  etc...)

For more options, visit https://groups.google.com/d/optout.

-- 
Igal Sapir
Railo Core Developer
http://getRailo.org/

Chris Blackwell

unread,
Apr 24, 2014, 11:54:08 AM4/24/14
to railo
Hi Igal,

Yeah i get what synchronized means, and why it might be used.  
It may be that it really is required here, but it does seem to be a bottleneck on the connection pool.
I simply can't get more than 3 or 4 threads to be in a running state.

Chris




Igal @ getRailo.org

unread,
Apr 24, 2014, 11:56:23 AM4/24/14
to ra...@googlegroups.com
how many CPU cores do you have on that machine?

For more options, visit https://groups.google.com/d/optout.

Chris Blackwell

unread,
Apr 24, 2014, 12:16:38 PM4/24/14
to railo
hmm, its a new iMac, so its 4 physical cores.
scaling to beyond the number of physical cores seems to result in lock contention.
even going to 8, which is the virtual number of cores the os reports still results in contention.

lesson learned i guess

Cheers, Chris




Igal @ getRailo.org

unread,
Apr 24, 2014, 12:45:21 PM4/24/14
to ra...@googlegroups.com
actually if the CPU has 8 threads then I expect it to scale to 8, but there are many factors here.  the point is that if you have N CPU threads then you can at most have N real concurrent operations. 

the OS and/or JVM will splice the work between the threads for you and will show you that virtually you have many threads running at the same time, but you can't have 9 concurrent CPU operations if you have only 8 CPU threads, just like you can't have 9 different conversations with 9 different callers if you have 8 telephone operators.  in order to handle the 9th caller -- one of the previous conversations must either end, or be put on hold (in which case the operator [i.e. thread] must keep switching the context between the different callers that he [it] handles, which is what you usually get in a multi-threaded environment).

also, from glancing at your code -- check out http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/BlockingQueue.html

For more options, visit https://groups.google.com/d/optout.

Nick S.

unread,
Apr 24, 2014, 1:04:38 PM4/24/14
to ra...@googlegroups.com
It was only partially resolved. There were still other instances of the synchronized setting, particularly on releasing the datasource connection and although rare, we're still running into issues with this in production from time-to-time. 

Jochem van Dieten

unread,
Apr 24, 2014, 2:39:00 PM4/24/14
to ra...@googlegroups.com
On Thu, Apr 24, 2014 at 5:54 PM, Chris Blackwell \wrote:
Yeah i get what synchronized means, and why it might be used.  
It may be that it really is required here, but it does seem to be a bottleneck on the connection pool.
I simply can't get more than 3 or 4 threads to be in a running state.

What is the relative speed of getNext() and doWork()? If doWork() is only 4 times as slow, 4 would be the expected concurrency.

Jochem

--
Jochem van Dieten
http://jochem.vandieten.net/

Chris Blackwell

unread,
Apr 24, 2014, 3:29:17 PM4/24/14
to railo

getNext() is returning a row from an in memory query, do Work() is calling a couple of stored procedures on a remote mysql database, typically taking around 100ms.  So they would be orders of magnitude different in their execution times. 

--
Did you find this reply useful? Help the Railo community and add it to the Railo Server wiki at https://github.com/getrailo/railo/wiki
---
You received this message because you are subscribed to the Google Groups "Railo" group.

Brad Wood

unread,
Apr 24, 2014, 8:21:02 PM4/24/14
to ra...@googlegroups.com
I just saw this thread. Funny because I was just noticing this exact scenario on a client server last week.  The server will reach a point where threads start backup up like crazy and taking minutes to complete and a thread dump from Fusion Reactor will show tons of threads all waiting to synchronize the DCStack object in the connection pool-- usually to release a connection.  

This server is on Railo 4.1.3.005 Final (Java 1.7.0_45) but still gets backed up frequently- often time in the connection pooling code.  Here is a stack trace of one of the alerts I received from Fusion Reactor.  You can see that 63 threads are all backed up and waiting to lock the synchronized object, railo.runtime.db.DCStack.


The thread blocking everyone (ajp-bio-8009-exec-743) is trying to release a datasource connection. 
 railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:86)
That specific line of code is:
stack.add(dc);

Here is the top of the stack trace of the thread that is holding the lock:

   java.lang.Thread.State: RUNNABLE
        at railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:86)
        at railo.runtime.db.DatasourceManagerImpl.releaseConnection(DatasourceManagerImpl.java:113)
        at railo.runtime.tag.Query.executeDatasoure(Query.java:659)
        at railo.runtime.tag.Query.doEndTag(Query.java:502)
        at coldboxextensions.logging.appenders.dbappender_cfc$cf.udfCall(C:\webroot\coldboxextensions\logging\appenders\DBAppender.cfc:177)
        at railo.runtime.type.UDFImpl.implementation(UDFImpl.java:95)
        at railo.runtime.type.UDFImpl._call(UDFImpl.java:308)
        at railo.runtime.type.UDFImpl.call(UDFImpl.java:212)
        at railo.runtime.ComponentImpl._call(ComponentImpl.java:623)
        at railo.runtime.ComponentImpl._call(ComponentImpl.java:506)
        - locked railo.runtime.db.DCStack@7dc6b466
        at railo.runtime.ComponentImpl.call(ComponentImpl.java:1732)
        at railo.runtime.util.VariableUtilImpl.callFunctionWithoutNamedValues(VariableUtilImpl.java:713)
        at railo.runtime.PageContextImpl.getFunction(PageContextImpl.java:1476)
        at coldbox.system.logging.logger_cfc$cf.udfCall3(C:\webroot\coldbox\system\logging\Logger.cfc:327)

Note that it appears that the thread locked the DCStack PRIOR to running the query in the DBAppender file.  Is it possible that this thread is holding onto the DCStack object WHILE it waits for the query to complete?   

That doesn't seem to make sense because ComponentImpl has no references to DCStack, but I'm just going off of where the "- locked" bit shows up in the trace.  My understanding is everything above that in the trace is executed while the lock is help open.

Thanks!

~Brad

Chris Blackwell

unread,
Apr 25, 2014, 4:42:07 AM4/25/14
to railo
Hi Brad,

It looks like you're seeing exactly the same issue as me.

Thinking about it last night, I believe the cpu core issue that Igal mentioned is a red herring as each thread is handing off work to an external process and then waiting for a response, so during that idle period the os/cpu should be able to be processing other threads.  

What i'm seeing in VisualVM is that I never have more than a few threads in a running state, but what i would expect to see is some threads running, while others are in a waiting state, waiting for a response from the db.

Theres a full thread dump here taken while trying to run 20 threads
But typically I see all the cfthreads blocked and reporting this

"cfthread-17" prio=5 tid=0x00007fb64b21a800 nid=0xfb03 waiting for monitor entry [0x000000011c1bd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
at railo.runtime.db.DatasourceConnectionPool.getDatasourceConnection(DatasourceConnectionPool.java:44)
- waiting to lock <0x00000007b8224c18> (a railo.runtime.db.DCStack)
at railo.runtime.db.DatasourceManagerImpl.getConnection(DatasourceManagerImpl.java:49)
at railo.runtime.tag.Query.executeDatasoure(Query.java:687)
at railo.runtime.tag.Query.doEndTag(Query.java:533)

And a single thread with a lock on 0x00000007b8224c18 here

"cfthread-0" prio=5 tid=0x00007fb64d967000 nid=0x5807 runnable [0x0000000115d40000]
   java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:112)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:159)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:187)
- locked <0x00000007b71ce750> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3158)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3615)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3604)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4155)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2832)
- locked <0x00000007b71c5098> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:5357)
- locked <0x00000007b71c5098> (a com.mysql.jdbc.JDBC4Connection)
at railo.runtime.db.driver.ConnectionProxy.setAutoCommit(ConnectionProxy.java:238)
at railo.runtime.db.DatasourceConnectionPool.isValid(DatasourceConnectionPool.java:154)
at railo.runtime.db.DatasourceConnectionPool.getDatasourceConnection(DatasourceConnectionPool.java:58)
- locked <0x00000007b8224c18> (a railo.runtime.db.DCStack)
at railo.runtime.db.DatasourceManagerImpl.getConnection(DatasourceManagerImpl.java:49)
at railo.runtime.tag.Query.executeDatasoure(Query.java:687)
at railo.runtime.tag.Query.doEndTag(Query.java:533)



Nick S.

unread,
Apr 25, 2014, 9:42:11 AM4/25/14
to ra...@googlegroups.com
It is somewhat reassuring to see we're not the only ones having this problem... I've been working internally to move forward with a support contract so we can get this fixed but our legal process can be unbearably slow. We had another crash last night with lots of threads in this state:


"ajp-bio-8009-exec-1821" Id=14517 BLOCKED on railo.runtime.db.DCStack@4fcefe94 owned by "Thread-35" Id=57

   java.lang.Thread.State: BLOCKED

        at railo.runtime.db.DatasourceConnectionPool.getDatasourceConnection(DatasourceConnectionPool.java:41)

        - waiting to lock railo.runtime.db.DCStack@4fcefe94 owned by "Thread-35"

        at railo.runtime.db.DatasourceManagerImpl.getConnection(DatasourceManagerImpl.java:49)

        at railo.runtime.tag.Query.executeDatasoure(Query.java:649)

        at railo.runtime.tag.Query.doEndTag(Query.java:502)



Brad Wood

unread,
Apr 25, 2014, 2:19:57 PM4/25/14
to ra...@googlegroups.com
Chris, your stack trace seems to support the notion that the entire DCStack object is being held in a locked state *while* a query is being executed against the database-- as opposed to only being locked while the datasource connection list is being updated or accessed.  I would expect that the lock would be released once a connection was assigned to a thread to use.

Thanks!

~Brad

ColdBox Platform Evangelist
Ortus Solutions, Corp 

ColdBox Platform: http://www.coldbox.org 



You received this message because you are subscribed to a topic in the Google Groups "Railo" group.
To view this discussion on the web visit https://groups.google.com/d/msgid/railo/CAB%3DtfTpCuBhoyJ9V91FyNULo9AntghdfbLNs0hM1dseCLENGNA%40mail.gmail.com.

melinite

unread,
Apr 26, 2014, 2:01:22 AM4/26/14
to ra...@googlegroups.com
for mysql, in chris's case, have you tried updating the mysql jdbc driver? the latest is 5.1.30 https://dev.mysql.com/doc/relnotes/connector-j/en/news-5-1-30.html.

versions prior to 5.1.20 jdbc driver had deadlock issues. 

Also you do know depending on how you write the sp and the version of the DB you use, MySQL stored procs are not necessary the best solution for performance as they are not compiled in the sense Oracle and SQL Server stored procedures are compiled.

We have a worker queue system rolled into cf, very similar to your logic, runs 100s of parallel threads on our vm, on Jetty 8 (recently updated to Wildfly 8), making 1000's of db inserts, updates, selects, without experiencing your issue.
The app is still on the railo stable 4.1.2.005 with updated jdbc driver on Mariadb.

I'm not saying there is no bug in the current bleeding edge release, but I would suggest try the stable branch with updated jdbc driver and see if that helps. Also we have upped the max-queue/thread setting in railo server xml to 300 i think. Jetty threads are at max 500 min 50. If you still experience problems and see no error logged in MySQL maybe tweak the jdbc connection settings as we have done.

melinite

unread,
Apr 26, 2014, 2:05:34 AM4/26/14
to ra...@googlegroups.com
i meant to say versions prior to 5.1.21

Brad Wood

unread,
Apr 26, 2014, 12:20:33 PM4/26/14
to ra...@googlegroups.com
Melinite, The server I was experiencing this on was SQL Server, not MySQL, but nonetheless, we had already upgraded to the latest driver.  I understand that there can be deadlocks in certain versions of drivers and even performance considerations at the database level. However, the stack traces in this thread do not show contention at either the database or the driver level.  They show it in Railo's connection pooling. (And it's not a deadlock either, it's just blocked threads)

Thanks!

~Brad

ColdBox Platform Evangelist
Ortus Solutions, Corp 

ColdBox Platform: http://www.coldbox.org 



melinite

unread,
Apr 26, 2014, 2:53:31 PM4/26/14
to ra...@googlegroups.com
agreed. good observation brad. must be a bug somewhere. i hope railo 5 osgi gives some thought into possible having ability to externaliZe the db connection pool for more granular tuning and control or even allowing custom handlers like c3p0 and the like.

Brad Wood

unread,
May 19, 2014, 3:58:56 PM5/19/14
to ra...@googlegroups.com
Has anyone had a chance to test this issue again the final 4.2 release?

xxrxndxvxs

unread,
May 20, 2014, 2:08:11 PM5/20/14
to ra...@googlegroups.com
We are seeing what sounds like the same thing. Our setup:

Version Railo 4.2.0.008 rc
Version Name Togo
Release date May 2, 2014
ColdFusion® compatibilty version 10.0.0.0

Configuration File /opt/railo/lib/railo-server/context/railo-server.xml
OS Linux (2.6.32-26-pve) 32bit
Remote IP ***
Host Name ***
Servlet Container Apache Tomcat/7.0.29
Railo Server ID ***
Java 1.7.0_05 (Oracle Corporation) 32bit
Architecture 32bit

 
Heap Memory rarely over 30%
Non-Heap even less.

Database version and driver exhibiting this:

Database (Name/Version) Microsoft SQL Server 11.00.3128
Driver (Name/Version) jTDS Type 4 JDBC Driver for MS SQL Server and Sybase 1.2.8 (JDBC 3.0) 


and

Database (Name/Version) Microsoft SQL Server 10.00.2573
Driver (Name/Version) jTDS Type 4 JDBC Driver for MS SQL Server and Sybase 1.2.8 (JDBC 3.0) 

We only have a couple webs running on this server at the moment, but in no-time the number of ESTABLISHED database connections to our MS SQL servers climbs 5-6x the connection limit we set in the Railo Admin. Eventually the performance slows to a snails pace and we restart Railo.

This is not my area of expertise at all, but when I saw this thread I thought perhaps someone with a little more experience in this area could use these details.

Aaron

Pionic

unread,
May 26, 2014, 4:45:22 AM5/26/14
to ra...@googlegroups.com

Just wanted to poke this thread. We've also experienced the same issue on the stable version 4.1.2.005.

Has someone verified the issue with the latest version, and is there a ticket number for this?


Op donderdag 24 april 2014 10:57:04 UTC+2 schreef d1rtym0nk3y:

Phil Sweeney

unread,
May 26, 2014, 9:26:58 AM5/26/14
to ra...@googlegroups.com
I've also seen this twice now on 4.1.2.005 .. (relatively close together.. last on Apr 30).. but nothing for ages before that, and nothing since.

Example from thread dump on hung app instance:

"http-apr-8882-exec-2" daemon prio=10 tid=0x00007f0cb4497000 nid=0x7aa7 waiting for monitor entry [0x00007f0cb25a9000]

   java.lang.Thread.State: BLOCKED (on object monitor)
        at railo.runtime.db.DatasourceConnectionPool.releaseDatasourceConnection(DatasourceConnectionPool.java:85)
        - waiting to lock <0x00000000e6c6b8d8> (a railo.runtime.db.DCStack)
        at railo.runtime.db.DatasourceManagerImpl.releaseConnection(DatasourceManagerImpl.java:113)
        at railo.runtime.tag.Query.executeDatasoure(Query.java:659)

Michael Offner

unread,
May 26, 2014, 11:20:03 AM5/26/14
to ra...@googlegroups.com
FYI
We solved the problem for the next release ...

Micha


--
Did you find this reply useful? Help the Railo community and add it to the Railo Server wiki at https://github.com/getrailo/railo/wiki
---
You received this message because you are subscribed to the Google Groups "Railo" group.

For more options, visit https://groups.google.com/d/optout.



--
/micha

Michael Offner CTO Railo Technologies GmbH

Pionic

unread,
Jun 4, 2014, 1:50:23 PM6/4/14
to ra...@googlegroups.com
Thank Micha (like always ;) )

Which release is/ will it be part of? And, was it fixed with this ticket: https://issues.jboss.org/browse/RAILO-3086 ?

Thanks again!

Op maandag 26 mei 2014 17:20:03 UTC+2 schreef Michael Offner:

Michael Offner

unread,
Jun 4, 2014, 2:17:32 PM6/4/14
to ra...@googlegroups.com
You can see in ticket itself "fixed version"

Micha
--
/micha

Michael Offner CTO Railo Technologies GmbH

--
Did you find this reply useful? Help the Railo community and add it to the Railo Server wiki at https://github.com/getrailo/railo/wiki
---
You received this message because you are subscribed to the Google Groups "Railo" group.

For more options, visit https://groups.google.com/d/optout.
Reply all
Reply to author
Forward
0 new messages