"Too many open files" error...related to FR?

Mike Kaplan

Oct 15, 2009, 5:20:55 PM
to FusionReactor
Greetings,

We are currently running CF6.1 Standard on Windows Server 2003 and
I've been evaluating monitoring products. I installed the trial
version of FR on one of our boxes, and let it run for the ten days.
The trial expired a couple weeks ago, but I did not uninstall FR.

Yesterday, we updated the server with all Windows 2003 security
updates and hotfixes since August 11 (list is here:
http://support.microsoft.com/kb/894199). Around 20 minutes after
bringing the websites on the server back up, I started getting JRun
servlet 500 errors on every request. In my JRun logs, the errors have
the following characteristic: "Too many open files". I've included a
sample entry from my default-err.log at the end of this message.
Restarting CF or the box would bring the sites back up for 15-20
minutes before the errors kicked in again (this box is not in
production at the moment--its only traffic is four http requests per
minute from a network monitoring tool).

Today I uninstalled the expired trial version of FR, and the problem
disappeared. The sites have been up for hours now without a problem.

I have no idea why Windows updates would affect JRun or why an expired
trial edition of FR would trigger 500 errors, but given the confluence
of events, it seems unlikely to just be a coincidence. Anybody have
any insights?

Thanks,
Mike

[0]java.io.IOException: Too many open files
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:106)
at coldfusion.compiler.NeoTranslationContext.getPageReader(NeoTranslationContext.java:576)
at coldfusion.compiler.NeoTranslator.parsePage(NeoTranslator.java:301)
at coldfusion.compiler.NeoTranslator.translateJava(NeoTranslator.java:240)
at coldfusion.compiler.NeoTranslator.translateJava(NeoTranslator.java:97)
at coldfusion.runtime.TemplateClassLoader$1.fetch(TemplateClassLoader.java:287)
at coldfusion.util.LruCache.get(LruCache.java:188)
at coldfusion.runtime.TemplateClassLoader$TemplateCache.fetchSerial(TemplateClassLoader.java:214)
at coldfusion.util.AbstractCache.fetch(AbstractCache.java:58)
at coldfusion.util.SoftCache.get(SoftCache.java:81)
at coldfusion.runtime.TemplateClassLoader.findClass(TemplateClassLoader.java:356)
at coldfusion.filter.PathFilter.invoke(PathFilter.java:73)
at coldfusion.filter.ExceptionFilter.invoke(ExceptionFilter.java:47)
at coldfusion.filter.ClientScopePersistenceFilter.invoke(ClientScopePersistenceFilter.java:28)
at coldfusion.filter.BrowserFilter.invoke(BrowserFilter.java:35)
at coldfusion.filter.GlobalsFilter.invoke(GlobalsFilter.java:43)
at coldfusion.filter.DatasourceFilter.invoke(DatasourceFilter.java:22)
at coldfusion.CfmServlet.service(CfmServlet.java:105)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:86)
at com.intergral.fusionreactor.filter.FusionReactorFilter.i(FusionReactorFilter.java:568)
at com.intergral.fusionreactor.filter.FusionReactorFilter.c(FusionReactorFilter.java:268)
at com.intergral.fusionreactor.filter.FusionReactorFilter.doFilter(FusionReactorFilter.java:174)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
at com.cj.gzipflt.GzipFilter.doFilter(Unknown Source)
at jrun.servlet.FilterChain.doFilter(FilterChain.java:94)
at jrun.servlet.FilterChain.service(FilterChain.java:101)
at jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:91)
at jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42)
at jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:249)
at jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:527)
at jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:192)
at jrunx.scheduler.ThreadPool$DownstreamMetrics.invokeRunnable(ThreadPool.java:318)
at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:426)
at jrunx.scheduler.ThreadPool$UpstreamMetrics.invokeRunnable(ThreadPool.java:264)
at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)

getLicenseKey failed: Too many open files
getLicenseKey failed: Too many open files
getLicenseKey failed: Too many open files

Bernd Donath [FusionReactor Team]

Oct 16, 2009, 8:51:31 AM
to FusionReactor
Hi Mike,

We have not heard of such an issue before. I tried to reproduce this
on a Windows 2003 5.2 Server (latest updates installed) running CFMX
6.1.0.63958 on VMWare WS 6.5.3 build-185404 but could not detect any
problems.

I looked at the number of file handles with a trial-licensed FR3.5 and
no traffic taking place: during 30 minutes the number of file
handles stayed around 10800 on this environment.

Next I did the same with an expired FR3.5 and again the number of file
handles stayed around the same value.

Then I used JMeter to run a load test (using the built in web server
of CF) with 50 threads running for 15 minutes with the expired FR3.5
installed and the number of file handles varied between 10500 and
10900 but did not increase - when I stopped the job the number was
about 10800.
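
The kind of check described above (sample the handle count at intervals and see whether it keeps climbing) can be sketched as follows. This is only a minimal Python illustration, not anything FusionReactor ships: the function name `looks_like_leak`, the `tolerance` parameter, and both sample series are hypothetical (the stable series merely stays inside the 10500 to 10900 range reported above).

```python
def looks_like_leak(samples, tolerance=500):
    """Crude check: the count ends well above where it started
    and never drops between consecutive samples."""
    if len(samples) < 2:
        return False
    grew = samples[-1] - samples[0] > tolerance
    never_fell = all(b >= a for a, b in zip(samples, samples[1:]))
    return grew and never_fell


# Hypothetical samples taken a few minutes apart.
stable = [10800, 10500, 10900, 10650, 10800]   # fluctuates, no growth
leaking = [10800, 11900, 13100, 14400, 15800]  # climbs on every sample

print(looks_like_leak(stable))   # False
print(looks_like_leak(leaking))  # True
```

A real leak can of course be noisier than this; the sketch only separates "fluctuates around a level" from "grows on every sample".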

The default-err.log file of the server did not contain any error
messages afterwards.
Which version of FusionReactor did you have installed on this machine?
Can you reproduce this problem after the machine has been rebooted?

Regards,
Bernd

Mike Kaplan

Oct 16, 2009, 9:24:12 AM
to FusionReactor
Hi Bernd,

We were running 3.0.1. Rebooting did not help--we still got the same
behavior: sites were up for 15-20 minutes, then the errors returned.
The server is back to normal now--no errors at all since removing FR
yesterday morning.

I don't know if this is relevant, but here is another detail: we use
the GzipFilter servlet filter for our compression (you'll see it in
the stack trace). Initially, when this problem manifested itself, the
error we saw in Firefox was "The page you are trying to view cannot be
shown because it uses an invalid or unsupported form of compression."
So, I removed the filter from our installation and restarted CF. It
didn't solve the problem, but the next time it happened, the error
returned to the browser was "500 JRun Servlet Error."

Mike

Bernd Donath [FusionReactor Team]

Oct 19, 2009, 2:46:14 AM
to FusionReactor
Hi Mike,

I will try this with 3.0.1 and the GZip filter enabled on my test
environment again. Could you send us any log files (FusionReactor and
ColdFusion) related to this issue together with a summary of your CF
server settings to sup...@fusion-reactor.com?

Thanks in advance,
Bernd

Mike Kaplan

Oct 19, 2009, 9:42:58 AM
to FusionReactor, Bernd Donath [FusionReactor Team]
Bernd,

I'll send along the JRun and CF log files that I have. I have no
FusionReactor logs because 1) it had expired, and thus wasn't actually
running and 2) I've now uninstalled it, so any log files that I would
have had are gone. I'll include the GZip jar file in my email, because
the version that runs with CF6.1 is no longer available online.

Mike

charlie arehart

Oct 20, 2009, 5:33:38 PM
to fusion...@googlegroups.com
About this, I wanted to ask since you first mentioned compression in your
note Friday, Mike. I was just unable to respond at the time.

Are you in fact referring to the compression in CF, or do you mean your own
compression? And Bernd, were you presuming one or the other? Mike, if you
meant your own, did you realize that FR has its own compression, as one of
the options on the left? And do you recall if that was enabled? I realize
you've uninstalled so may not recall.

But then you also said you can't send the logs because you uninstalled. I
think it keeps the logs around even on an uninstall--so they should be there
and could be helpful, since the reactor-*.log does track what settings are
enabled at the start of FR. Could be helpful to confirm you don't have them.
Since this is Windows, it would have (by default) been in \fusionreactor (or
could have been put by someone in \program files\fusionreactor\).

Hope that's helpful.

/charlie

Mike Kaplan

Oct 21, 2009, 1:01:03 PM
to FusionReactor
Charlie,

The compression I use is from ColdBeans Software (referenced here:
http://www.simonwhatley.co.uk/poor-mans-http-compression-with-coldfusion).
Having it enabled or disabled made no difference. FR's compression was
off at all times--I never enabled it.

Bernd and I are continuing to explore this offline...I've recreated
the problem with an expired trial version of 3.5 now, with the errors
disappearing when the product is licensed.

Mike

charlie arehart

Oct 21, 2009, 11:58:56 PM
to fusion...@googlegroups.com
OK, thanks. I was just asking, to get clarification. (By "your own
compression", I meant any other than that in FR itself.) Glad to see the
clarifications. Hope things work out with you and Bernd. Those guys are
good. I'm sure they'll solve it. :-)

/charlie


Darren Pywell

Oct 22, 2009, 4:55:18 PM
to FusionReactor
Mike,

Thanks for helping us so much to find this issue.

The problem turned out to be exactly the issue that you had
identified; FusionReactor created file handles that didn't get cleaned
up after the trial had ended if a license didn't get uploaded or
FusionReactor wasn't uninstalled. We now clean these up after the
trial has expired.
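
For readers unfamiliar with this class of bug, here is a minimal sketch of how unclosed file handles pile up and how scoped cleanup avoids it. It is written in Python purely for illustration and is not FusionReactor's code; the function names and the temporary file are hypothetical stand-ins.

```python
import os
import tempfile


def read_without_cleanup(path, leaked):
    # Anti-pattern: the handle is opened but never closed, so every call
    # leaves one more open file behind (kept in `leaked` for the demo).
    handle = open(path)
    leaked.append(handle)
    return handle.read()


def read_with_cleanup(path):
    # The context manager closes the handle even if read() raises.
    with open(path) as handle:
        return handle.read()


# Create a small temporary file to read from.
fd, path = tempfile.mkstemp()
os.write(fd, b"demo")
os.close(fd)

leaked = []
for _ in range(3):
    read_without_cleanup(path, leaked)
open_count = sum(1 for h in leaked if not h.closed)
print(open_count)  # 3

# Clean up the demo handles, then read the same file the safe way.
for h in leaked:
    h.close()
content = read_with_cleanup(path)
os.remove(path)
```

Repeated often enough, the first pattern eventually exhausts the handles available to the process, which is when errors such as "Too many open files" start to appear.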

You'll be pleased to know that we've posted a new update to
FusionReactor (version 3.5.1) to the www.fusion-reactor.com site that
contains a fix for this issue. Simply download the 3.5.1
installer/updater and run it to update your existing instances to get
the fix for the problem:

http://www.fusion-reactor.com/fr/downloads.cfm

Thanks again,
Darren

DeMarco, Alex

Jan 22, 2010, 7:43:40 PM
to fusion...@googlegroups.com
Hello All,

I am trying to setup Crash Protection and having some issues.

I want to protect anything in /myapp/*: anything that runs over 60 seconds
should trigger CP. Right now I have it set to notify me. However, this is
not working. I even set it to 1 second to test it, and nothing.

What am I missing?

Thanks to all...

- Alex

charlie arehart

Jan 23, 2010, 11:51:22 AM
to fusion...@googlegroups.com
Alex, there could be a few problems affecting your situation. Since you've
left things a little vague in terms of what you configured, I'll ask a few
questions and share some tips. Even if I didn't guess right about what you
did, the info may help you or others who are setting up crash protection
alerts and (possibly) restrictions.

First, you said simply that you had it set to notify you and "it's not
working". By that do you mean that you were not getting any email alerts?
It's critical then to ensure you have configured (and tested) the email
settings in FR. You won't get any notifications of any kind from FR if those
aren't enabled and configured correctly.

Also, you would want to look in the CP log to see if the CP is even being
caught. It could be that it is but you're not getting the email for some
reason. It's nice to be able to use this rather than just "hope" the email
would come. :-)

Either of those may be the solution you needed.

But moving into another whole discussion, you mention that you "want to
protect anything in /myapp/*" that runs over 60 seconds. It's not clear if
you're saying you did anything specific to help you protect those. Do you
realize that you don't need to specify any restrictions for a particular
directory? That by default, if you enable CPs, they apply to all templates
in all directories? Are you saying, perhaps, that you DID set up a "CP
restriction" for that directory, thinking that you needed to (when you
don't, really, per the last point)?

And if you did set up a restriction, then are you saying you've set it up
literally as /myapp/* (replacing whatever myapp is with your app name)? And
did you set that to be an exact match or a regular expression? The thing is,
that string you offer won't work either way. The exact match doesn't accept
wildcards (so is truly an "exact" match), so the * doesn't do what many
think. Similarly, we can't just leave it off (as in /myapp/). Instead, if
one DOES want to set up a restriction, one needs to use a regular expression,
and it should be /myapp/(.*). That sort of example is offered in the online
help. Note as well that the regex is case-sensitive.
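
To make the difference between the two match types concrete, here is a small Python sketch. It rests on an assumption: that a regex restriction has to match the whole request URL (modelled here with `re.fullmatch`). FR's actual matching code is not shown in this thread, and the helper names and sample URL are hypothetical.

```python
import re


def matches_exact(pattern, url):
    # "Exact match": plain string equality, so "*" is just a literal character.
    return pattern == url


def matches_regex(pattern, url):
    # Assumption: the pattern must match the whole URL, and case matters.
    return re.fullmatch(pattern, url) is not None


url = "/myapp/reports/slow.cfm"

print(matches_exact("/myapp/*", url))     # False: no wildcards in exact mode
print(matches_regex("/myapp/*", url))     # False: "/*" means "zero or more slashes"
print(matches_regex("/myapp/", url))      # False: nothing covers the rest of the URL
print(matches_regex("/myapp/(.*)", url))  # True
print(matches_regex("/myapp/(.*)", "/MyApp/reports/slow.cfm"))  # False: case-sensitive
```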

Besides ensuring you have the restriction configured correctly, you also
want to check the corresponding settings page (in your case, CP settings) to
check if the restrictions are set to IGNORE or PROTECT the listed patterns.
In the case of CP restrictions, the settings page defaults to IGNORING those
requests that you list in the restrictions. Sounds like you would want
instead to set that to PROTECT.

But then that's only needed if you wanted to say that you ONLY wanted to
enable protection against requests in that directory. Again, recall the
point above: if you just want everything to be protected, you don't need any
restrictions. They're only for if you want to either limit some dirs to be
IGNORED (allowed to run and not trigger CPs), or for some reason you don't
want to protect everything and instead want to protect ONLY some directory
(which would seem pretty unusual.)

Here are a couple more tips for those who try to get CP restrictions to work
and have problems.

When creating CP restrictions, we don't need to build them by hand. In all
the pages that show requests, there's a red circle with a slash: if you
click that, FR opens the CP restrictions page and pre-fills it with the
correct info to build a CP restriction for that request. But that sets up
exact matches by default, so you could still make a mistake in changing it
to a regex.

Finally, when setting up a CP alert restriction, you may want to test it
first as an FR restriction (in the last Fusion Reactor section on the left).
These control whether a request is monitored by FR at all, and there is a
separate restrictions page (and setting in the settings page) just for them.
It's easier to test if you get that right. If you set the FR settings to
have its restrictions be "ignored", then you'll know if your restriction is
correct if FR stops showing the page in the request history, when you visit
a page that you're trying to "restrict". (Or if you set the settings page
restrictions to "monitor", then you'll know you have it right if nothing but
pages in the restrictions show up.)

Once you get the restriction right as an FR restriction, you can remove it
from there and place it instead into the CP restrictions.

Hope those tips help someone.

/charlie


DeMarco, Alex

Jan 23, 2010, 5:14:27 PM
to fusion...@googlegroups.com
Thanks Charlie this was very helpful.

I have it working now.

I had email working correctly, my problem was the regex.

The other info you offered below was also very helpful.

- Alex

DeMarco, Alex

Jan 23, 2010, 5:22:31 PM
to fusion...@googlegroups.com
While we are on the subject here is a related question:

I have an issue where our servers get backed up:
01/15 09:40:47 metrics Web threads (busy/total): 1/11 Sessions: 4519 Total Memory=516096 Free=191235
01/15 09:41:47 metrics Web threads (busy/total): 4/11 Sessions: 4528 Total Memory=518656 Free=164827
01/15 09:42:47 metrics Web threads (busy/total): 7/11 Sessions: 4527 Total Memory=516352 Free=155603
01/15 09:43:47 metrics Web threads (busy/total): 25/72 Sessions: 4512 Total Memory=516928 Free=168748

When the busy thread count gets to about 25 (the max set in JRun), the server
bogs down. Is there a way that CP will help avert this problem? We have
an issue with one of our Oracle databases that causes it to become
unresponsive, thus holding up the DB requests; this in turn backs up all
other requests on the CFMX server. I know the URL that is tied to the
database. Am I rambling? :) Once the database recovers, JRun takes off
again and everything is fine. It would be nice to try and weather this
issue a little better.
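
As an aside for anyone wanting to watch for this condition in the JRun metrics log, here is a minimal Python sketch that pulls the busy/total thread counts out of entries like the ones above. It assumes each entry sits on a single line in exactly this layout; the function names and the default threshold of 25 are illustrative choices, not part of JRun or FR.

```python
import re

# Matches the metrics layout shown above (an assumed format).
METRICS = re.compile(
    r"Web threads \(busy/total\): (\d+)/(\d+)\s+Sessions: (\d+)\s+"
    r"Total Memory=(\d+)\s+Free=(\d+)"
)


def parse_metrics(line):
    """Return the numeric fields of one metrics entry, or None if it doesn't match."""
    m = METRICS.search(line)
    if m is None:
        return None
    busy, total, sessions, mem_total, mem_free = map(int, m.groups())
    return {
        "busy": busy,
        "total": total,
        "sessions": sessions,
        "memory_total": mem_total,
        "memory_free": mem_free,
    }


def saturated(line, max_busy=25):
    """True when the busy thread count has reached the configured maximum."""
    parsed = parse_metrics(line)
    return parsed is not None and parsed["busy"] >= max_busy


lines = [
    "01/15 09:42:47 metrics Web threads (busy/total): 7/11 Sessions: 4527 "
    "Total Memory=516352 Free=155603",
    "01/15 09:43:47 metrics Web threads (busy/total): 25/72 Sessions: 4512 "
    "Total Memory=516928 Free=168748",
]
flags = [saturated(line) for line in lines]
print(flags)  # [False, True]
```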

- Alex

Peter Boughton

Jan 23, 2010, 5:50:39 PM
to fusion...@googlegroups.com
> Instead, if
> one DOES want to setup a restriction, one needs to use a regular expression,
> and it should be /myapp/(.*). That sort of example is offered in the online
> help. Note as well that the regex is case-sensitive.

Hmmm, that's the second time I've seen (.*) used/recommended on this
mailing list.

Is Fusion Reactor doing anything with the captured group, or are
people using parentheses because they don't know they don't need to?

In general, doing "xyz(.*)" is not necessary; using "xyz.*" will match
any text beginning with xyz. Unless there's a specific reason to
capture the group, the simpler one should be used.
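
A quick way to see that the two forms accept the same URLs is to run both over a few samples. The sketch below uses Python's `re` module rather than FR's engine, and the sample URLs are made up for illustration.

```python
import re

urls = ["/myapp/", "/myapp/index.cfm", "/other/index.cfm", "/MYAPP/index.cfm"]

with_group = re.compile(r"/myapp/(.*)")
without_group = re.compile(r"/myapp/.*")

# Both patterns accept and reject exactly the same URLs.
same = all(
    bool(with_group.fullmatch(u)) == bool(without_group.fullmatch(u))
    for u in urls
)
print(same)  # True

# The only difference is that the parenthesised form records a capture group.
captured = with_group.fullmatch("/myapp/index.cfm").group(1)
print(captured)              # index.cfm
print(without_group.groups)  # 0
```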

charlie arehart

Jan 23, 2010, 8:03:20 PM
to fusion...@googlegroups.com
Glad to hear, Alex. Thanks for the update and the kind regards.

/charlie


charlie arehart

Jan 23, 2010, 8:27:24 PM
to fusion...@googlegroups.com
Alex, this is a classic problem. Yes, once the max simultaneous requests in
CF is hit then other requests will start to queue. And if all or some of
those running requests hang a long time, it can seem that the server is
dead. It's not: it's just that those available threads are busy and no new
requests can get in.

You're showing the JRun metrics, but you'd see the same info in the FR
interface, with all or most of the requests showing running in the running
requests. It would also be reflected in the FR "resource" log.

That said, once it is happening, the challenge is to find and resolve the
root cause. In your case, you seem to know it's the database. In that case,
you really want to find out what's making it hang and fix it. Most people,
though, who have this problem don't know what the root cause is, so I want
to offer some info for them, too.

You ask about using Crash Protection, and if you mean to use that to
terminate the long-running requests, I would argue it would not help. If as
in your case the requests are hanging because the DB is not responding, then
having FR's CP try to "terminate" these long running requests will do no
good. They can't be terminated. They will run however long they need to run,
or until what's holding them (the DB) is stopped, or the CF server is
restarted (which of course does then kill the threads).

So again you need to find out why they are hanging. In your case you seem to
know it's a DB call, but I will point out for others that in this situation
the next step would be to use FR's "stack trace" feature. (And you
may want to confirm this yourself, Alex.)

While the long-running requests are executing and you see them in the
running requests (or slow requests) page, you can click on the magnifying
glass icon to the left of the running request to get a "stack trace", which
shows what a request is doing at that moment, in terms of the underlying Java
methods called on behalf of the running CFML. Often, within a few lines of the
top of that stack trace will be a reference to a CFML file and a number after
the filename, which indicates the line number. If you take a couple of stack
traces of a hanging request, a few seconds apart, and the line number
doesn't change, that's likely your culprit.
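
The comparison described above can also be done mechanically. Here is a rough Python sketch that finds the first CFML file and line number in each of two traces and checks whether they agree. The traces are invented for illustration (the `example.*` frames, the `cfreport2ecfm` class name, and the file path are all hypothetical), and the pattern assumes frames end in `(path/to/file.cfm:NN)`.

```python
import re

# Assumed frame layout: "... (some\path\page.cfm:42)" or ".cfc".
CFML_FRAME = re.compile(r"\(([^()]+\.cf[cm]):(\d+)\)")


def first_cfml_location(stack_trace):
    """Return (file, line) of the topmost CFML frame, or None if there is none."""
    for line in stack_trace.splitlines():
        m = CFML_FRAME.search(line)
        if m:
            return m.group(1), int(m.group(2))
    return None


def stuck_on_same_line(trace_a, trace_b):
    """True when both traces point at the same CFML file and line."""
    loc_a = first_cfml_location(trace_a)
    return loc_a is not None and loc_a == first_cfml_location(trace_b)


# Hypothetical traces taken a few seconds apart.
trace_t0 = r"""at example.jdbc.Connection.read(Connection.java:88)
at example.jdbc.Statement.execute(Statement.java:41)
at cfreport2ecfm.runPage(C:\sites\myapp\report.cfm:42)
at example.servlet.Dispatcher.service(Dispatcher.java:17)"""

trace_t1 = trace_t0  # nothing has moved between the two samples
trace_moved = trace_t0.replace("report.cfm:42", "report.cfm:57")

print(stuck_on_same_line(trace_t0, trace_t1))     # True
print(stuck_on_same_line(trace_t0, trace_moved))  # False
```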

Look at the file named on that line and at that line number to see what it's
hanging on. It's often a CFQUERY, or a CFSTOREDPROC. Or it could be a
CFHTTP, or a CFINVOKE of a web service. Any time CF is involved in talking
to something outside of itself, such a tag cannot be interrupted: not by FR,
not by CF's "request timeout" feature, not by any tool's "kill thread"
feature.

So my point is: you want to get to the root cause of the problem, rather
than rely on FR's CP to try to "terminate" the long-running requests. That's
really only a band-aid (if it works), in that it only terminates the
long-running request when the hanging tag finally ends, in which case it may
not have taken much longer then to complete anyway.

But I do recommend using the CP feature to NOTIFY you of the situation. In
fact, FR will include in the email you get a full thread dump, which is a
stack trace of all the running threads. You can then use that just like if
you had clicked the stack trace button on all the running requests. And as
you likely would get a couple of these CP emails in a row (separated by a
minute, by default, as set in the FR admin), you can do the same comparison
of the stack traces for a given request to find what's hanging, and again go
resolve the problem.

All that said, I will note that some do use the "queue and notify" feature
of the "request protection" CP option. In your case, you may use it to have
FR queue up requests once the 25 are reached and then after a specified time
have it show the users a message or URL explaining the problem. Some may
know that CF does that queuing for you also, as you saw in the JRun metrics.
And since CF 7, in Enterprise at least, you've been able to also set a queue
timeout and error message or page to show. But for those on CF Standard,
it's a nice additional option in FR to consider.
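
For those curious what "queue, then give up with a message" looks like in general terms, here is a small Python sketch of the idea. It is not how FR or CF implement it; the class name `RequestGate`, the timeout value, and the message text are all made up for illustration.

```python
import threading


class RequestGate:
    """Let at most `max_running` requests run; others wait up to `queue_timeout` seconds."""

    def __init__(self, max_running, queue_timeout):
        self._slots = threading.BoundedSemaphore(max_running)
        self._queue_timeout = queue_timeout

    def handle(self, work):
        # Wait in the "queue" for a free slot, but only for a limited time.
        if not self._slots.acquire(timeout=self._queue_timeout):
            return "Server busy, please try again shortly."
        try:
            return work()
        finally:
            self._slots.release()


gate = RequestGate(max_running=1, queue_timeout=0.05)


def request_arriving_while_busy():
    # Runs while the outer request still holds the only slot,
    # so this nested request times out in the queue.
    return gate.handle(lambda: "never runs")


first = gate.handle(lambda: "ok")
second = gate.handle(request_arriving_while_busy)
print(first)   # ok
print(second)  # Server busy, please try again shortly.
```

The nested call stands in for a second request showing up while the single slot is taken; with a real server the waiting requests would be on separate threads.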

Hope that helps. And others here may have a different perspective. I'm just
one user.

Let us know how it goes getting to (and resolving) the root cause of your
problem.

/charlie


charlie arehart

Jan 23, 2010, 8:46:51 PM
to fusion...@googlegroups.com
Thanks for the follow-up, Peter. I'm pretty sure both times it was me who
suggested that pattern. :-)

And to be honest I just used what was offered in the FR help (for regex's).
So yes, in my case it would be that I offered the parens "because I didn't
know I didn't need to". :-) When it comes to regular expressions, I won't
deny that I tend to just use what I'm shown. If it works, that's generally
good enough for me. :-) It's rare that there are performance implications of
a wrong decision (how I use them, at least), so I've just chosen not to
study RegEx's.

Great to have someone here with more experience to offer more wisdom. That
said, perhaps the FR guys will chime in with thoughts on why they
recommended the (.*) approach in the docs.

So yes, for instance, if one wanted to add an FR restriction for the CF
Admin requests (as Russ did), I can confirm that they could offer this as
the pattern:

/CFIDE/administrator/.*

But we should be clear for others: we ARE still talking about using the
"regular expression" option when adding that restriction. People should not
see this as looking like a typical wildcard (note the needed . before the
*), and again it does NOT work for an "exact match" option when adding a
restriction.

Thanks again. Love to see others chime in so we can all learn from each
other.

/charlie

DeMarco, Alex

Jan 25, 2010, 10:00:27 AM
to fusion...@googlegroups.com
Charlie, this is good stuff.

While we are pretty certain the db issue is causing all the requests to
hang, we also have suspicions that the real cause is from a .cfm page.
One of the many pages in the app gets into a state where it misbehaves.
However, we have never been able to pin it down. Do you think Fusion
Analytics would help here? (I think so, from what I have seen of it).

Thanks again!
- Alex

John Hawksley (Fusion Team)

Jan 25, 2010, 10:50:51 AM
to FusionReactor
Hi guys :-)

To address two points:

1) Regex Backrefs

The parentheses in (.*) are used _only_ for clarity. I always used
them to visually separate out the regex from the fixed string and make
it clearer to myself. The string ".*" is equivalent. The regex
filter in FR operates on the stream as a forward-only cursor (since it
can't store state or maintain much of a buffer), so capturing groups
using parens are indeed ignored. You cannot refer back to them with
backrefs \1 \2 etc.

In the case of FR, the capturing group is ignored so there is no
penalty. Peter's right though; if you were using a regex engine which
did support groups, you would (might/could) incur a memory penalty
using groups where you didn't need them. Having said that I would
guess that modern regex engines would optimize away the capturing
group where there was no corresponding backref.

2) DB blocks

Charlie's superb answer notwithstanding, we would also recommend (with
the usual 'try it out in the test environment first' caveats) trying
your DB vendor's JDBC drivers instead of the Macromedia ones, and
seeing if the problem goes away. If it doesn't, you can use Charlie's
advice above to obtain a stack trace and see which line of CF code is
bogging down.

Another option might be to use your database's management tools to
inspect what SQL statement is being run by each connection originating
from your CF servers. This might give you some pointers to where
table/row/page locks are being obtained and causing bogging.

I do agree with Charlie though, Crash Prot in this case really is
curing the symptoms, not the problem.

Good luck!
-John

charlie arehart

Jan 25, 2010, 4:20:55 PM
to fusion...@googlegroups.com
Alex, to be clear: the appearance of CF hanging is indeed (from what you've
said) an issue that is a "real cause from a .cfm page". Whatever that page
is doing that gets hung up is what makes CF then hold that thread and not
allow any new requests to use it, and once all the (max) simultaneous
threads are locked up, CF will seem to be hung. It's not: if/when those
requests do eventually stop hanging, then new requests can get in. But often
people don't wait to get to the root cause. They just restart CF because
someone's breathing down their neck to "get CF back up". The tragedy is that
often, CF's the victim, not the cause.

So that's why I said you need to get to the root cause, and I recommended
using the stack trace feature in FR to see just what those CF pages are
waiting on, literally what line of code.

For that, no, FusionAnalytics won't help. Its job (and it's not yet a
released product) is to analyze logs and help pinpoint problems. It might
possibly help identify a situation like this, but it would be looking at
things after the fact. But even then I would recommend that the Crash
Protection notification emails (with their included thread dump, a
collection of stack traces for all threads) would be more valuable.

But I'm not knocking FA at all. I can't wait to see it released. It will
help with many other problems, for sure. In your case, though, you can and
should use the tool you now have. :-)

/charlie

