BLOCKED on lucee.runtime.db.DCStack@1c1dfcc1


Jamie Jackson

Jun 12, 2015, 2:02:43 PM
to lu...@googlegroups.com
Happy Friday!

While load testing our app, we saw in FusionReactor that a bunch of requests were hung (and couldn't be terminated), and the load test server became unresponsive.

The hung requests were stuck here:

"http-bio-8888-exec-46" Id=10795 BLOCKED on lucee.runtime.db.DCStack@1c1dfcc1 owned by "http-bio-8888-exec-4" Id=72 
   java.lang.Thread.State: BLOCKED
        at lucee.runtime.db.DatasourceConnectionPool.getDatasourceConnection(Unknown Source)
        - waiting to lock lucee.runtime.db.DCStack@1c1dfcc1 owned by "http-bio-8888-exec-4"
        at lucee.runtime.db.DatasourceManagerImpl.getConnection(Unknown Source)
        at lucee.runtime.tag.Query.executeDatasoure(Unknown Source)
        at lucee.runtime.tag.Query.doEndTag(Unknown Source)

Here's the full thread dump:


FWIW, this is Lucee 4.5.1.000 final with the stock MySQL driver.
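
(For illustration only, not Lucee's actual source: a minimal sketch of the pattern the trace suggests. If a pool guards each datasource with a single monitor and holds that monitor while it opens or validates a connection against the database, one slow call leaves every other request for that datasource parked in BLOCKED at the synchronized entry, which is the state shown above. All names below are made up.)

// Hypothetical sketch, not Lucee code
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.ArrayDeque;
import java.util.Deque;

class NaivePerDatasourcePool {

    private final Deque<Connection> idle = new ArrayDeque<>();
    private final String url; // e.g. some jdbc:mysql://... URL (illustrative)

    NaivePerDatasourcePool(String url) {
        this.url = url;
    }

    Connection borrow() throws SQLException {
        synchronized (idle) { // every request for this datasource contends on this one monitor
            if (!idle.isEmpty()) {
                return idle.pop();
            }
            // Opening the connection while still holding the lock: if the database is
            // slow, out of connections, or the network stalls, all the other threads
            // pile up right here as BLOCKED, which is what the dump shows.
            return DriverManager.getConnection(url);
        }
    }

    void release(Connection c) {
        synchronized (idle) {
            idle.push(c); // a request that never releases shrinks the pool and worsens contention
        }
    }
}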

Happen to know what's going on here?

Thanks,
Jamie

Michael Offner

Jun 13, 2015, 2:09:21 AM
to lu...@googlegroups.com
Can you raise a ticket for it? It's hard to say without having the line numbers; newer versions will provide that info.

Micha

Jamie Jackson

Jun 15, 2015, 1:50:45 PM
to lu...@googlegroups.com
Hi Micha,

As I was creating a ticket, production failed in the same way. Here's the ticket: https://luceeserver.atlassian.net/browse/LDEV-396

Thanks,
Jamie

Jamie Jackson

Jun 30, 2015, 3:39:01 PM
to lu...@googlegroups.com
I brought it up in https://cfml.slack.com today, too, but I'm also trying to drum up some more interest in https://luceeserver.atlassian.net/browse/LDEV-396 here.

Thanks,
Jamie

Terry Whitney

Jul 1, 2015, 9:46:07 AM
to lu...@googlegroups.com

I would bet it's your MySQL server hitting max connections.

Post your my.cnf.

Jamie Jackson

Jul 1, 2015, 10:03:04 AM
to lu...@googlegroups.com
Hey Terry,

Thanks for the interest.

my.cnf.d/server.cnf:


# this is read by the standalone daemon and embedded servers
[server]

# this is only for the mysqld standalone daemon
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
group=mysql
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=1
max_allowed_packet=16384000
group_concat_max_len=1024000

query_cache_size=64M
query_cache_type=1

max_connections = 1000
open_files_limit = 4096
table_cache = 4096

FWIW, during the last crash, the MySQL side showed 35 processes from Lucee and 4 connections from other clients.
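
(Side note, in case it helps to confirm or rule out the max_connections theory during the next hang: these are standard MySQL status queries, nothing specific to this setup.)

SHOW VARIABLES LIKE 'max_connections';   -- the configured ceiling
SHOW STATUS LIKE 'Threads_connected';    -- connections open right now
SHOW STATUS LIKE 'Max_used_connections'; -- high-water mark since the server started
SHOW FULL PROCESSLIST;                   -- what each connection is doing (Sleep vs. a running query)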

Thanks,
Jamie


Harry Klein

Jul 1, 2015, 10:16:20 AM
to lu...@googlegroups.com

Today we had similar problems with PostgreSQL; the error message there was:

“FATAL: sorry, too many clients already”

max_connections in PostgreSQL was 100 (I think this is the default) – and the connection limit setting for the Lucee datasource was “inf”.

Maybe there is a general problem with the Lucee connection pool and connections are not (always) closed?

-Harry

Jamie Jackson

Jul 1, 2015, 10:28:09 AM
to lu...@googlegroups.com
Hi Harry,

Did you get a stack trace, and did you happen to see how many connections there were to PostgreSQL at the time?

I thought it was advisable to set your (Lucee) datasource connection limit to fewer than your DB's max connections. (If yours was "inf," I think it's understandable that you could have exceeded the DB's limit.)
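
(To put rough, purely illustrative numbers on that: with PostgreSQL's default max_connections of 100, a single Lucee instance, and maybe 10 connections reserved for other clients such as psql, backups, or monitoring, a datasource connection limit somewhere around 80-90 leaves headroom; with two Lucee instances sharing the same database you'd split that budget, say 40-45 each.)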

Thanks,
Jamie

Harry Klein

Jul 1, 2015, 10:32:46 AM
to lu...@googlegroups.com

>> Did you get a stack trace, and did you happen to see how many connections there were to PostgreSQL at the time?

Unfortunately not; we just changed the setting in Lucee to 100.

>> I thought it was advisable to set your (Lucee) datasource connection limit to fewer than your DB's max connections. (If yours was "inf," I think it's understandable that you could have exceeded the DB's limit.)

I am not sure about the ideal setting, but the default for new datasources is "inf".

-Harry

Julian Halliwell

Jul 1, 2015, 10:36:42 AM
to lu...@googlegroups.com
Are you using ORM by any chance? If so, see these two issues:

https://luceeserver.atlassian.net/browse/LDEV-119
https://luceeserver.atlassian.net/browse/LDEV-405

Terry Whitney

Jul 1, 2015, 11:00:20 AM
to lu...@googlegroups.com
What OS are we dealing with? How much memory and how many processes does this system have?

plus, could you run this command?

show variables like 'max_connections';

Jamie Jackson

Jul 1, 2015, 11:32:43 AM
to lu...@googlegroups.com
Answers inline...


On Wed, Jul 1, 2015 at 11:00 AM, Terry Whitney <twhitn...@gmail.com> wrote:
> What OS are we dealing with?

$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.6 (Santiago)

> How much memory and how many processes does this system have?

$ vmstat -s
      1921908  total memory
      1799316  used memory
       723252  active memory
       890520  inactive memory
       122592  free memory
       119348  buffer memory
      1055820  swap cache
      6160380  total swap
        36612  used swap
      6123768  free swap
     20766847 non-nice user cpu ticks
        15147 nice user cpu ticks
      4640865 system cpu ticks
   2017402669 idle cpu ticks
      1422096 IO-wait cpu ticks
           62 IRQ cpu ticks
        44987 softirq cpu ticks
            0 stolen cpu ticks
     90245632 pages paged in
    177659734 pages paged out
       105382 pages swapped in
        70234 pages swapped out
    790988910 interrupts
    688331515 CPU context switches
   1430651201 boot time
     10478773 forks

$ cat /proc/sys/kernel/pid_max
32768

> plus, could you run this command?

> show variables like 'max_connections';

'max_connections', '1000'
 

Terry Whitney

Jul 2, 2015, 2:47:05 PM
to lu...@googlegroups.com
In RHEL 6, bind MySQL to localhost. Even if you have your firewall set up, binding it seemingly adds a tiny bit of performance.


Second, you can tweak the timeout settings:

[mysqld]
interactive_timeout=40
wait_timeout=45

Those should be set to a second or so off from your application timeout settings.


Additionally, set SELinux to disabled temporarily and then stress test your application.
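
(For the SELinux part, assuming RHEL 6 defaults: setenforce only changes the mode until reboot, which is usually what "disable temporarily" means in practice; a permanent change needs /etc/selinux/config and a reboot.)

$ getenforce        # shows Enforcing, Permissive, or Disabled
$ sudo setenforce 0 # switch to Permissive (non-enforcing) until reboot
$ sudo setenforce 1 # switch back to Enforcing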




Jamie Jackson

Jul 6, 2015, 12:34:08 PM
to lu...@googlegroups.com
Hi Terry,

I think I can mentally relate the wait_timeout variable to my issue: I suppose that wait_timeout corresponds to Lucee's datasource "connection timeout" field. I'm supposing that each of those does the same thing, but one is client side and one is server side.
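
(For concreteness, a sketch of the two knobs side by side; the values are placeholders, not recommendations, and the exact admin labels may differ slightly by Lucee version.)

# MySQL side (my.cnf): the server closes a connection that has been idle
# for longer than this many seconds.
[mysqld]
wait_timeout=45

# Lucee side: the datasource's "Connection timeout" field in the admin is the
# pool's idle timeout, in minutes, after which Lucee closes the connection itself.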

If you're implying that connection timeouts are suspect (and I would agree), are you suggesting that the Lucee timeouts should be longer than the MySQL timeouts, or vice-versa (and why)?

As for the other suggestions (localhost binding, interactive_timeout, and SELinux), I'm having trouble connecting them to my issue. Could you elaborate on those recommendations?

Thanks,
Jamie


Nick S.

May 14, 2016, 4:40:26 PM
to Lucee
Hi Jamie, 

Were you able to make any progress on resolving this datasource locking issue? We're running into the same problem on Oracle and seeing our servers lock up every few hours. It's happening almost constantly under load. Our sites were down most of the day yesterday as a result. 

There are no problems on the database side. We've exhausted just about everything at this point and have started looking at migrating back to Railo in case we can't find a solution soon.

Thanks,
-Nick

Jamie Jackson

May 20, 2016, 6:48:29 AM
to Lucee
I added a comment here: https://luceeserver.atlassian.net/browse/LDEV-396, in case you hadn't seen it.

Nick S.

May 20, 2016, 11:04:10 AM
to Lucee
Thanks Jamie, that's helpful. I'm leaning towards this being an Oracle driver issue. We were using the latest version and I've realized this is not the same version we have on our Railo instances. We're going to do a little more testing using the same drivers across both and see if that helps. 

Geoff Parkhurst

May 20, 2016, 11:43:00 AM
to lu...@googlegroups.com
On 20 May 2016 at 16:04, Nick S. <ma...@nicksollecito.com> wrote:
> Thanks Jamie, that's helpful. I'm leaning towards this being an Oracle
> driver issue. We were using the latest version and I've realized this is not
> the same version we have on our Railo instances. We're going to do a little
> more testing using the same drivers across both and see if that helps.

We thought the same in Microsoft-land (newer driver problems) - but this wasn't the case for us. See the other thread you're in on the issue:

https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/lucee/etIjfig_pV8/6WfBALIYCAAJ