BLOCKED on lucee.runtime.db.DCStack@1c1dfcc1


Jamie Jackson

Jun 12, 2015, 2:02:43 PM
to lu...@googlegroups.com
Happy Friday!

While load testing our app, we saw in FusionReactor that a bunch of requests were hung (and couldn't be terminated), and the load test server became unresponsive.

The hung requests were stuck here:

"http-bio-8888-exec-46" Id=10795 BLOCKED on lucee.runtime.db.DCStack@1c1dfcc1 owned by "http-bio-8888-exec-4" Id=72 
   java.lang.Thread.State: BLOCKED
        at lucee.runtime.db.DatasourceConnectionPool.getDatasourceConnection(Unknown Source)
        - waiting to lock lucee.runtime.db.DCStack@1c1dfcc1 owned by "http-bio-8888-exec-4"
        at lucee.runtime.db.DatasourceManagerImpl.getConnection(Unknown Source)
        at lucee.runtime.tag.Query.executeDatasoure(Unknown Source)
        at lucee.runtime.tag.Query.doEndTag(Unknown Source)

Here's the full thread dump:


FWIW, this is Lucee 4.5.1.000 final with the stock MySQL driver.
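
(For illustration only, not Lucee's actual source: a minimal sketch of the pattern the trace suggests. If a pool guards each datasource with a single monitor and holds that monitor while it opens or validates a connection against the database, one slow call leaves every other request for that datasource parked in BLOCKED at the synchronized entry, which is the state shown above. All names below are made up.)

// Hypothetical sketch, not Lucee code
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;

class NaivePerDatasourcePool {

    private final Deque<Connection> idle = new ArrayDeque<>();
    private final String url; // e.g. some jdbc:mysql://... URL (illustrative)

    NaivePerDatasourcePool(String url) {
        this.url = url;
    }

    Connection borrow() throws SQLException {
        synchronized (idle) { // every request for this datasource contends on this one monitor
            if (!idle.isEmpty()) {
                return idle.pop();
            }
            // Opening the connection while still holding the lock: if the database is
            // slow, out of connections, or the network stalls, all the other threads
            // pile up right here as BLOCKED, which is what the dump shows.
            return DriverManager.getConnection(url);
        }
    }

    void release(Connection c) {
        synchronized (idle) {
            idle.push(c); // a request that never releases shrinks the pool and worsens contention
        }
    }
}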

Happen to know what's going on here?

Thanks,
Jamie

Michael Offner

Jun 13, 2015, 2:09:21 AM
to lu...@googlegroups.com
Can you raise a ticket for it? It's hard to say without having the line numbers; newer versions will provide that info.

Micha

Jamie Jackson

Jun 15, 2015, 1:50:45 PM
to lu...@googlegroups.com
Hi Micha,

As I was creating a ticket, production failed in the same way. Here's the ticket: https://luceeserver.atlassian.net/browse/LDEV-396

Thanks,
Jamie

Jamie Jackson

Jun 30, 2015, 3:39:01 PM
to lu...@googlegroups.com
I brought it up in https://cfml.slack.com today, too, but I'm also trying to drum up some more interest in https://luceeserver.atlassian.net/browse/LDEV-396 here.

Thanks,
Jamie

Terry Whitney

Jul 1, 2015, 9:46:07 AM
to lu...@googlegroups.com

I would bet it's your MySQL server hitting max connections.

Post your my.cnf.

Jamie Jackson

Jul 1, 2015, 10:03:04 AM
to lu...@googlegroups.com
Hey Terry,

Thanks for the interest.

my.cnf.d/server.cnf:


# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
group=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=1
max_allowed_packet=16384000
group_concat_max_len=1024000

query_cache_size=64M
query_cache_type=1

max_connections = 1000
open_files_limit = 4096
table_cache = 4096

FWIW, during the last crash, the MySQL side showed 35 processes from Lucee and 4 connections from other clients.
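
(Side note, in case it helps to confirm or rule out the max_connections theory during the next hang: these are standard MySQL status queries, nothing specific to this setup.)

SHOW VARIABLES LIKE 'max_connections';   -- the configured ceiling
SHOW STATUS LIKE 'Threads_connected';    -- connections open right now
SHOW STATUS LIKE 'Max_used_connections'; -- high-water mark since the server started
SHOW FULL PROCESSLIST;                   -- what each connection is doing (Sleep vs. a running query)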

Thanks,
Jamie


Harry Klein

Jul 1, 2015, 10:16:20 AM
to lu...@googlegroups.com

Today we had similar problems with PostgreSQL; the error message there was:

“FATAL: sorry, too many clients already”

max_connections in PostgreSQL was 100 (I think this is the default) – and the connection limit setting for the Lucee datasource was “inf”.

Maybe there is a general problem with the Lucee connection pool and connections are not (always) closed?

-Harry

Jamie Jackson

Jul 1, 2015, 10:28:09 AM
to lu...@googlegroups.com
Hi Harry,

Did you get a stack trace, and did you happen to see how many connections there were to PostgreSQL at the time?

I thought it was advisable to set your (Lucee) datasource connection limit to fewer than your DB's max connections. (If yours was "inf," I think it's understandable that you could have exceeded the DB's limit.)
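
(To put rough, purely illustrative numbers on that: with PostgreSQL's default max_connections of 100, a single Lucee instance, and maybe 10 connections reserved for other clients such as psql, backups, or monitoring, a datasource connection limit somewhere around 80-90 leaves headroom; with two Lucee instances sharing the same database you'd split that budget, say 40-45 each.)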

Thanks,
Jamie

Harry Klein

Jul 1, 2015, 10:32:46 AM
to lu...@googlegroups.com

>> Did you get a stack trace, and did you happen to see how many connections there were to PostgreSQL at the time?

Unfortunately not; we just changed the setting in Lucee to 100.

>> I thought it was advisable to set your (Lucee) datasource connection limit to fewer than your DB's max connections. (If yours was "inf," I think it's understandable that you could have exceeded the DB's limit.)

I am not sure about the ideal setting, but the default for new datasources is "inf".

-Harry

Julian Halliwell

Jul 1, 2015, 10:36:42 AM
to lu...@googlegroups.com
Are you using ORM by any chance? If so, see these two issues:

https://luceeserver.atlassian.net/browse/LDEV-119
https://luceeserver.atlassian.net/browse/LDEV-405

Terry Whitney

Jul 1, 2015, 11:00:20 AM
to lu...@googlegroups.com
What OS are we dealing with? How much memory and how many processes does this system have?

plus, could you run this command?

show variables like 'max_connections';

Jamie Jackson

Jul 1, 2015, 11:32:43 AM
to lu...@googlegroups.com
Answers inline...


On Wed, Jul 1, 2015 at 11:00 AM, Terry Whitney <twhitn...@gmail.com> wrote:
> What OS are we dealing with?

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)

> How much memory and how many processes does this system have?

$ vmstat -s
      1921908  total memory
      1799316  used memory
       723252  active memory
       890520  inactive memory
       122592  free memory
       119348  buffer memory
      1055820  swap cache
      6160380  total swap
        36612  used swap
      6123768  free swap
     20766847 non-nice user cpu ticks
        15147 nice user cpu ticks
      4640865 system cpu ticks
   2017402669 idle cpu ticks
      1422096 IO-wait cpu ticks
           62 IRQ cpu ticks
        44987 softirq cpu ticks
            0 stolen cpu ticks
     90245632 pages paged in
    177659734 pages paged out
       105382 pages swapped in
        70234 pages swapped out
    790988910 interrupts
    688331515 CPU context switches
   1430651201 boot time
     10478773 forks

$ cat /proc/sys/kernel/pid_max
32768

> plus, could you run this command?

> show variables like 'max_connections';

'max_connections', '1000'
 

Terry Whitney

Jul 2, 2015, 2:47:05 PM
to lu...@googlegroups.com
In RHEL 6, bind MySQL to localhost. Even if you have your firewall set up, binding it seemingly adds a tiny bit of performance.


Second, you can tweak the timeout settings:

[mysqld]
interactive_timeout=40
wait_timeout=45

Those should be set to a second or so off from your application timeout settings.


Additionally, set SELinux to disabled temporarily and then stress test your application.
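
(For the SELinux part, assuming RHEL 6 defaults: setenforce only changes the mode until reboot, which is usually what "disable temporarily" means in practice; a permanent change needs /etc/selinux/config and a reboot.)

$ getenforce        # shows Enforcing, Permissive, or Disabled
$ sudo setenforce 0 # switch to Permissive (non-enforcing) until reboot
$ sudo setenforce 1 # switch back to Enforcing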




Jamie Jackson

Jul 6, 2015, 12:34:08 PM
to lu...@googlegroups.com
Hi Terry,

I think I can mentally relate the wait_timeout variable to my issue: I suppose that wait_timeout corresponds to Lucee's datasource "connection timeout" field. I'm supposing that each of those does the same thing, but one is client side and one is server side.
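
(For concreteness, a sketch of the two knobs side by side; the values are placeholders, not recommendations, and the exact admin labels may differ slightly by Lucee version.)

# MySQL side (my.cnf): the server closes a connection that has been idle
# for longer than this many seconds.
[mysqld]
wait_timeout=45

# Lucee side: the datasource's "Connection timeout" field in the admin is the
# pool's idle timeout, in minutes, after which Lucee closes the connection itself.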

If you're implying that connection timeouts are suspect (and I would agree), are you suggesting that the Lucee timeouts should be longer than the MySQL timeouts, or vice-versa (and why)?

As for the other suggestions (localhost binding, interactive_timeout, and SELinux), I'm having trouble connecting them to my issue. Could you elaborate on those recommendations?

Thanks,
Jamie


Nick S.

May 14, 2016, 4:40:26 PM
to Lucee
Hi Jamie, 

Were you able to make any progress on resolving this datasource locking issue? We're running into the same problem on Oracle and seeing our servers lock up every few hours. It's happening almost constantly under load. Our sites were down most of the day yesterday as a result. 

There are no problems on the database side. We've exhausted just about everything at this point and have started looking at migrating back to Railo in case we can't find a solution soon.

Thanks,
-Nick

Jamie Jackson

May 20, 2016, 6:48:29 AM
to Lucee
I added a comment here: https://luceeserver.atlassian.net/browse/LDEV-396, in case you hadn't seen it.

Nick S.

May 20, 2016, 11:04:10 AM
to Lucee
Thanks Jamie, that's helpful. I'm leaning towards this being an Oracle driver issue. We were using the latest version and I've realized this is not the same version we have on our Railo instances. We're going to do a little more testing using the same drivers across both and see if that helps. 

Geoff Parkhurst

May 20, 2016, 11:43:00 AM
to lu...@googlegroups.com
On 20 May 2016 at 16:04, Nick S. <ma...@nicksollecito.com> wrote:
> Thanks Jamie, that's helpful. I'm leaning towards this being an Oracle
> driver issue. We were using the latest version and I've realized this is not
> the same version we have on our Railo instances. We're going to do a little
> more testing using the same drivers across both and see if that helps.

We thought the same in Microsoft-land (newer driver problems) - but this wasn't the case for us. See the other thread you're in on the issue:

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/lucee/etIjfig_pV8/6WfBALIYCAAJ