We ultimately determined that it was fixed by an update of his JVM (he'd
been running the default Java 1.6.0_04 that came with CF 8) to the newer
1.6.0_12. Some may know that earlier releases of 1.6 had class loading
issues, fixed as of the 0_10 release.
Anyway, he outlines his observations and the steps to solve the problem in
his blog entry:
http://russ.michaels.me.uk/index.cfm/2009/3/19/ColdFusion-8-performance-Issues-when-using-Java-6
For those who don't know, Intergral (makers of FusionReactor) offer such
troubleshooting consulting services. More at
http://www.fusion-reactor.com/support/services/ColdFusionConsulting.cfm.
Though I'm independent (not an Intergral employee), you'll see there that
I'm one of the folks who help provide those consulting services.
/charlie
It's rather detailed, but perhaps some of the info will help others who find
this in the future, perhaps even while researching other problems.
(I also notice that while your very first message in the thread back in
March had to do with memory, this one instead has to do with the blocking on
edu.emory.mathcs.backport.java.util.concurrent.Semaphore, so I have changed
the subject accordingly.)
In a previous message of the thread Darren suggested it could be related to
CF's thread/request management. As further evidence, I see that the Java
docs for that class do indicate that it's indeed often used for thread
management:
http://backport-jsr166.sourceforge.net/doc/api/edu/emory/mathcs/backport/java/util/concurrent/Semaphore.html
And further searching shows that others (on other Java EE servers) have had
similar problems, and they have also been seemingly related to thread pool
management (see http://www.icefaces.org/JForum/posts/list/6554.page). Others
have even reported the problems with CF
(http://www.jonhartmann.com/index.cfm/2009/2/5/ColdFusion-Instance-Issue and
http://forums.adobe.com/message/43707?tstart=0), with no resolution--other
than making sure that the three Start buttons in the CF Server Monitoring
tool are disabled. If you are on CF8/9 Enterprise, that's worth checking. To
be clear (for all readers), it does NOT matter whether you have the CF8/9
Server Monitor interface "open". If you hit those "start" buttons, CF then
starts collecting its info whether you have the interface open or not.
I'll assume, Russ, that's not your problem. As far as the thread pool issues
are concerned, though, I think it would help to explore the thread pool
management. Since you deal with lots of servers (if I recall right), can you
clarify/repeat the details of this server in question: what version of CF
you're on (6, 7, 8, 9), what edition (Standard or Enterprise), what
deployment (if Enterprise: Server or Multiserver), and what J2EE server (if
not JRun). The reason I ask all these things is that there can be issues
related to the setting of threads in the CF Admin relative to what they end
up being set to under the covers in the J2EE server.
In that regard, can you report as well what your settings are for the CF
Admin Request Tuning page (or Settings, on CF7/6) for simultaneous (request)
threads, max jrun running/queued threads, and if on Enterprise, can you
confirm that the max jrun running thread count is indeed greater than the
sum of the max number thread settings above it?
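To make that last check concrete, here is a minimal sketch in Java of the comparison I have in mind. The class name, method name, and all the numbers are made up for illustration; plug in whatever your own Request Tuning page shows.

```java
import java.util.Arrays;
import java.util.List;

final class ThreadSettingsCheck {

    // True when the running-thread limit is strictly greater than the
    // sum of the simultaneous-request limits listed above it.
    static boolean runningLimitExceedsRequestSum(int maxRunningThreads,
                                                 List<Integer> requestLimits) {
        long sum = 0;
        for (int limit : requestLimits) {
            sum += limit;
        }
        return maxRunningThreads > sum;
    }

    public static void main(String[] args) {
        // Hypothetical values, not taken from any real server (sum is 25).
        List<Integer> requestLimits = Arrays.asList(10, 5, 5, 5);
        System.out.println(runningLimitExceedsRequestSum(50, requestLimits));
        System.out.println(runningLimitExceedsRequestSum(25, requestLimits));
    }
}
```

If the check comes back false for your real values, that mismatch is worth reporting along with the settings themselves.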
Then can you also report your settings for the thread-related XML entries
for the JRunProxyService, as found in the jrun.xml file (either in
\[coldfusion]\runtime\servers\coldfusion\SERVER-INF, or in
\JRun4\servers\[instance]\SERVER-INF in a Multiserver deployment). The
parent XML entry is <service class="jrun.servlet.jrpp.JRunProxyService"
name="ProxyService">, and you'll see the various sub-entries related to
threads, like activeHandlerThreads. It just may help to eyeball those to see
if anything is unusual. I realize you may have already done it on your own.
Just offering more eyes to look it over here.
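If eyeballing the file is tedious, something like the following rough Java sketch could pull out just the thread-related entries. I am assuming here that the sub-entries take the form <attribute name="...">value</attribute>; check that against your own jrun.xml. The class name, the sample XML, and its values are invented for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ProxyServiceThreads {
    static final String PROXY_CLASS = "jrun.servlet.jrpp.JRunProxyService";

    // Returns name -> value for every sub-entry of the proxy service whose
    // name mentions "thread". Assumes <attribute name="...">value</attribute>.
    static Map<String, String> threadSettings(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Do not fetch any external DTD the file may declare.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            Map<String, String> result = new LinkedHashMap<>();
            NodeList services = doc.getElementsByTagName("service");
            for (int i = 0; i < services.getLength(); i++) {
                Element service = (Element) services.item(i);
                if (!PROXY_CLASS.equals(service.getAttribute("class"))) {
                    continue;
                }
                NodeList attrs = service.getElementsByTagName("attribute");
                for (int j = 0; j < attrs.getLength(); j++) {
                    Element attr = (Element) attrs.item(j);
                    String name = attr.getAttribute("name");
                    if (name.toLowerCase().contains("thread")) {
                        result.put(name, attr.getTextContent().trim());
                    }
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample; real files contain many more services and entries.
        String sample =
              "<jrun-server>"
            + "<service class=\"jrun.servlet.jrpp.JRunProxyService\" name=\"ProxyService\">"
            + "<attribute name=\"activeHandlerThreads\">25</attribute>"
            + "<attribute name=\"port\">51011</attribute>"
            + "</service>"
            + "</jrun-server>";
        System.out.println(threadSettings(sample));
    }
}
```

To run it against a real file you would read the file contents into a string first; I have left that out to keep the sketch short.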
Of course, this is going beyond what FR is about, so again it's just to try
to help. But here's a way that you can actually use FR to help with this
issue of yours.
I'd propose that when your stack traces (and those of others reporting this)
identify threads that are "waiting" on a semaphore lock, it would be helpful
to find which other requests are "holding" that lock. The requests you're
seeing may be merely "victims". As is often the case, you want to find out
who the perpetrator is. You said, "all stack traces are the same", but it's
entirely possible that nearly all were the same and one was different
without standing out.
Indeed, you may have "walked right past" the perpetrator because of what you
were looking for. (Again, no offense intended; I'm offering this as much for
others as for you.) It would be easy to look only at threads that showed
that semaphore class as the first in the stack trace, viewing them as "the
problem". If my guess is correct, the perpetrator request (or requests)
would not likely show this semaphore class as the first in its stack trace,
but rather it would be further down the trace, where it would have obtained
the lock and then proceeded to do some other operations (which would be
reflected as running at the top of the stack trace).
And I'll note as well that the thread "holding" the lock may also not be a
CF "request" thread but could be some other thread, so you'll want to do a
thread dump (Resources>List All Threads>Stack Trace All).
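The triage described above could be sketched in Java roughly as follows. This is only an illustration of the idea, not anything FR or CF provides: the class name, the frame format (one "class.method" string per frame, top of stack first), and the non-semaphore frames in the example are all hypothetical. It also takes "first in the stack trace" literally; if your dumps show a JVM wait frame above the semaphore, you would need to adjust for that.

```java
import java.util.Arrays;
import java.util.List;

final class SemaphoreTraceTriage {
    static final String LOCK_CLASS =
            "edu.emory.mathcs.backport.java.util.concurrent.Semaphore";

    enum Role { WAITER, POSSIBLE_HOLDER, UNRELATED }

    // frames: one "class.method" entry per stack frame, top of stack first.
    static Role classify(List<String> frames) {
        for (int i = 0; i < frames.size(); i++) {
            if (frames.get(i).startsWith(LOCK_CLASS)) {
                // Top frame: the thread is in the semaphore (a likely victim).
                // Deeper frame: it passed through the semaphore and moved on.
                return i == 0 ? Role.WAITER : Role.POSSIBLE_HOLDER;
            }
        }
        return Role.UNRELATED;
    }

    public static void main(String[] args) {
        // Hypothetical traces for illustration only.
        List<String> victim = Arrays.asList(
                LOCK_CLASS + ".acquire",
                "com.example.RequestHandler.run");
        List<String> suspect = Arrays.asList(
                "com.example.SlowOperation.execute",
                LOCK_CLASS + ".acquire",
                "com.example.RequestHandler.run");
        System.out.println(classify(victim));
        System.out.println(classify(suspect));
    }
}
```

The point is simply to sort the dump so that the handful of threads that are not plain waiters get a second look.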
Sadly, it's also possible that the stack traces may NOT show the "locking"
request having any reference to the semaphore at all. The stack trace shows
only what classes were called as a result of the current operation (whether
that's a CF tag/function in a CFML template, or some other operation under
the covers). I'm just theorizing here, but it could be that a given thread
had obtained the semaphore lock in a previous operation of the request, and
is still holding it across the request, so that at the time you did the
stack trace it shows the request doing other operations but has no reference
to the semaphore in its list of current methods.
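Here is a small Java sketch of that theory. It uses the standard java.util.concurrent.Semaphore as a stand-in for the backport class named in your traces, and the class and method names are my own invention. A thread takes the only permit in one step, moves on to other work, and its stack trace is then inspected while it still holds the permit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Semaphore;

final class HeldPermitDemo {

    // Returns true if the holder thread's stack trace names the Semaphore
    // class while that thread is still holding the only permit.
    static boolean holderTraceNamesSemaphore() {
        final Semaphore semaphore = new Semaphore(1);
        final CountDownLatch permitTaken = new CountDownLatch(1);
        final CountDownLatch finish = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            semaphore.acquireUninterruptibly(); // "earlier operation": returns at once
            permitTaken.countDown();
            try {
                finish.await();                 // "later operation", permit still held
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                semaphore.release();
            }
        }, "holder");
        holder.start();

        try {
            permitTaken.await();                // the acquire call has returned by now
            boolean named = false;
            for (StackTraceElement frame : holder.getStackTrace()) {
                if (frame.getClassName().startsWith("java.util.concurrent.Semaphore")) {
                    named = true;
                }
            }
            return named;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            finish.countDown();                 // let the holder release and exit
            try {
                holder.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("holder trace names Semaphore: "
                + holderTraceNamesSemaphore());
    }
}
```

On a standard JVM this should report false: the permit is held, yet nothing in the holder's trace says so. Whether CF's use of the backport semaphore follows this pattern is, again, only my guess.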
If that's the case, then I would argue it's probably very difficult to spot
who's "holding" the lock that the rest are waiting on. I've long wished that
CF had some lock contention reporting, whether for CFLOCK, or for
transactions, or for things like this, and so on. Since such lock-oriented
operations obtain and release locks at their core, they could report on
their doing this, to help diagnose such issues. Even if there were issues
that made it only something that should be turned on for rare occasions, I
could see it helping solve lots of knotty problems.
This may be one of those, but it's possible that some of the simple things
proposed above will help. Let us know when you've digested and can respond
to this, Russ. And kudos to all who had the patience to get to the bottom of
this long message! :-)
/charlie
> -----Original Message-----
> From: fusion...@googlegroups.com
> [mailto:fusion...@googlegroups.com] On Behalf Of Snake
> Sent: Thursday, October 15, 2009 9:42 AM
> To: fusion...@googlegroups.com