I saw a slow thread on one of my servers this evening and when I viewed the stack trace it showed it was blocked and waiting for FusionReactor to make a monitor entry. I have included the stack below and was curious what the groups thoughts are on this issue.
com.intergral.fusionreactor.core.u.m(FusionReactor.java:399) - waiting on <0x2e126d8b> (a java.util.LinkedHashMap held by thread 80, FusionReactor Web Server (Server Thread Pool Member Thread-24)) com.intergral.fusionreactor.filter.FusionReactorFilter.b(FusionReactorFilte r.java:448) com.intergral.fusionreactor.filter.FusionReactorFilter.c(FusionReactorFilte r.java:254) com.intergral.fusionreactor.filter.FusionReactorFilter.doFilter(FusionReact orFilter.java:164) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServlet Filter.java:42) coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) jrun.servlet.FilterChain.service(FilterChain.java:101) jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106) jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42) jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:286) jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543) jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203 ) jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:42 8) jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
At first glance, this is FR waiting to 'finish' a request - i.e. the
request has finished and has actually been flushed to the client, and
FR is waiting to do its housekeeping. The other named thread, thread
80, holds the monitor for an internal tracking structure. Without
getting a trace for this other thread, it's difficult to say what's
going on. It could be anything that's operating on the list of
requests: Request Details (briefly), Running Requests, Request History
etc.
Waiting for monitors is exceedingly common in Java as they are used
extensively - not only by FusionReactor - to regulate entry into
critical sections of code. We take great pains to ensure the time
spent inside these sections is a short as possible, but contention
does occur occasionally.
Do you have a complete trace for this point in time? If so, whatever
thread 80's doing ("Server Thread Pool Member Thread-24", part of the
FR internal web server) is holding the lock. Like I said, this might
be updating display pages. Was the server very loaded? Did many
people have Request History or Running Requests open?
Best regards,
-John
On Feb 27, 2:17 am, Aaron Greenlee <aarongreen...@gmail.com> wrote:
> I saw a slow thread on one of my servers this evening and when I viewed the
> stack trace it showed it was blocked and waiting for FusionReactor to make a
> monitor entry. I have included the stack below and was curious what the
> groups thoughts are on this issue.
I was the only person monitoring Fusion Reactor and this server was only serving 3-4 requests a second when I caught this long running request.
I've been using Fusion Reactor for about a year in my development environment but we only recently purchased it for our production servers. So, I tend to spend a lot of time just poking around trying to learn more about FR and the way our site is actually behaving. I have not observed a repeat of this issue and will assume it was isolated. 99.9% of our requests are under 300ms so it should be easy to see if this happens again.
Do you have any tips on capturing the stack trace for other threads when an issue like this occurs? I have my CP warnings set to save the stack trace once it sees a request hit 10 seconds but I'm not sure how I would provide you with the trace for thread 80.
Aaron, assuming you don't catch it in one of your CP alerts (which would include the thread dump in the body), you can also obtain one by going to Resources>List All Threads, then click the button "stack trace all".
/charlie
From: fusionreactor@googlegroups.com [mailto:fusionreactor@googlegroups.com] On Behalf Of Aaron Greenlee Sent: Monday, February 28, 2011 3:43 PM To: fusionreactor@googlegroups.com Subject: [fusionreactor] Re: FusionReactor - Waiting for Monitor Entry
Thanks for your response, John.
I was the only person monitoring Fusion Reactor and this server was only serving 3-4 requests a second when I caught this long running request.
I've been using Fusion Reactor for about a year in my development environment but we only recently purchased it for our production servers. So, I tend to spend a lot of time just poking around trying to learn more about FR and the way our site is actually behaving. I have not observed a repeat of this issue and will assume it was isolated. 99.9% of our requests are under 300ms so it should be easy to see if this happens again.
Do you have any tips on capturing the stack trace for other threads when an issue like this occurs? I have my CP warnings set to save the stack trace once it sees a request hit 10 seconds but I'm not sure how I would provide you with the trace for thread 80.
Respectfully,
Aaron Greenlee
-- You received this message because you are subscribed to the Google Groups "FusionReactor" group. To post to this group, send email to fusionreactor@googlegroups.com. To unsubscribe from this group, send email to fusionreactor+unsubscribe@googlegroups.com. For more options, visit this group at http://groups.google.com/group/fusionreactor?hl=en.
Yes that's what I'd suggest, thanks Charlie. It's not impossible to
see contention like this; after all this is what the monitor/lock
functionality is for. We would only really start to worry when a
2-party (or worse) deadlock occurs (where one thread holds lock A,
and another holds lock B, and both are waiting for the other lock).
If you can get us a trace it would be useful, but I would say that
(fingers crossed) this is just an isolated incident :-)
Best regards,
-John
On 28/02/2011 23:16, charlie arehart wrote:
Aaron, assuming you don’t catch it in one of your
CP alerts (which would include the thread dump in the body),
you can also obtain one by going to Resources>List All
Threads, then click the button “stack trace all”.
I was the only person monitoring Fusion
Reactor and this server was only serving 3-4 requests a
second when I caught this long running request.
I've been using Fusion Reactor for
about a year in my development environment but we only
recently purchased it for our production servers. So, I
tend to spend a lot of time just poking around trying to
learn more about FR and the way our site is actually
behaving. I have not observed a repeat of this issue and
will assume it was isolated. 99.9% of our requests are
under 300ms so it should be easy to see if this happens
again.
Do you have any tips on capturing the
stack trace for other threads when an issue like this
occurs? I have my CP warnings set to save the stack trace
once it sees a request hit 10 seconds but I'm not sure how
I would provide you with the trace for thread 80.
Here is a recent thread dump as a follow-up to this issue. This server was just updated to FR 4.0.5. Is there anything I can adjust in Fusion Reactor to help avoid this blocking? Thanks!
com.intergral.fusionreactor.core.v.m(FusionReactor.java:486) - waiting on <0x3fa91c56> (a java.util.LinkedHashMap held by thread 92, FusionReactor Web Server (Server Thread Pool Member Thread-31)) com.intergral.fusionreactor.filter.FusionReactorFilter.c(FusionReactorFilte r.java:500) com.intergral.fusionreactor.filter.FusionReactorFilter.d(FusionReactorFilte r.java:262) com.intergral.fusionreactor.filter.FusionReactorFilter.doFilter(FusionReact orFilter.java:171) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServlet Filter.java:42) coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) jrun.servlet.FilterChain.service(FilterChain.java:101) jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106) jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42) jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:286) jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543) jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203 ) jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:42 8) jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
com.intergral.fusionreactor.core.v.m(FusionReactor.java:486) - waiting on <0x3fa91c56> (a java.util.LinkedHashMap held by thread 92, FusionReactor Web Server (Server Thread Pool Member Thread-31)) com.intergral.fusionreactor.filter.FusionReactorFilter.c(FusionReactorFilte r.java:500) com.intergral.fusionreactor.filter.FusionReactorFilter.d(FusionReactorFilte r.java:262) com.intergral.fusionreactor.filter.FusionReactorFilter.doFilter(FusionReact orFilter.java:171) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) coldfusion.monitor.event.MonitoringServletFilter.doFilter(MonitoringServlet Filter.java:42) coldfusion.bootstrap.BootstrapFilter.doFilter(BootstrapFilter.java:46) jrun.servlet.FilterChain.doFilter(FilterChain.java:94) jrun.servlet.FilterChain.service(FilterChain.java:101) jrun.servlet.ServletInvoker.invoke(ServletInvoker.java:106) jrun.servlet.JRunInvokerChain.invokeNext(JRunInvokerChain.java:42) jrun.servlet.JRunRequestDispatcher.invoke(JRunRequestDispatcher.java:286) jrun.servlet.ServletEngineService.dispatch(ServletEngineService.java:543) jrun.servlet.jrpp.JRunProxyService.invokeRunnable(JRunProxyService.java:203 ) jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:42 8) jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)
I've just had a look through the trace and the key to this is figuring out what the FR internal web server is doing with 0x3fa91c56, which is the map of currently-running requests. We synchronize this map because it's used a lot concurrently, and naturally we try to make the synchronized blocks as small as possible. Unfortunately the trace you posted is for running requests, rather than the complete system, so the thread which locked 0x3fa91c56 isn't shown.
If/when this recurs, could you do a full trace by doing FusionReactor -> Resources -> List All Threads -> Stack Trace All, then post the result or send us a mail at supp...@fusion-reactor.com, and we'll get right on it!
> Here is a recent thread dump as a follow-up to this issue. This server > was just updated to FR 4.0.5. Is there anything I can adjust in Fusion > Reactor to help avoid this blocking? Thanks!