
Hung Thread messages in the log


velmurugan...@gmail.com
Feb 28, 2005, 5:43:28 PM
We're running IBM WebSphere 5.1 on AIX. We started seeing these
messages in the log, and application performance is suffering. Any
ideas on why this might be happening?

Thanks,


[2/28/05 17:05:40:385 EST] 6a38e9ba ThreadMonitor W WSVR0606W: Thread
"Servlet.Engine.Transports : 5" (6a38e9ba) was previously reported to
be hung but has completed. It was active for approximately 11,842,015
milliseconds. There are 4 threads in total in the server that still
may be hung.
[2/28/05 17:05:56:688 EST] 4ee969bb ThreadMonitor W WSVR0605W: Thread
"Servlet.Engine.Transports : 9" (2883a9ba) has been active for 712,612
milliseconds and may be hung. There are 5 threads in total in the
server that may be hung.
[2/28/05 17:05:56:702 EST] 4ee969bb ThreadMonitor W WSVR0605W: Thread
"Servlet.Engine.Transports : 7" (353a9ba) has been active for 737,438
milliseconds and may be hung. There are 6 threads in total in the
server that may be hung.
[2/28/05 17:11:56:732 EST] 7e5fe9bb ThreadMonitor W WSVR0605W: Thread
"Servlet.Engine.Transports : 25" (348e69ba) has been active for 626,150
milliseconds and may be hung. There are 7 threads in total in the
server that may be hung.

ggosseyn
Mar 1, 2005, 4:50:44 AM
Hello,

You may have deadlocks in your application, or an infinite loop. Another
common cause is slow database access, for example SQL updates that wait a
long time, or a missing/bad index that makes reads take a long time.

More info:
http://publib.boulder.ibm.com/infocenter/ws51help/index.jsp?topic=/com.ibm.websphere.nd.doc/info/ae/ae/ctrb_hangdetection.html
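
By the way, the hang detection policy that produces those WSVR0605W/WSVR0606W
messages can also be tuned. If I read that infocenter page correctly, it is
controlled through JVM custom properties on the application server, roughly
these (names from memory, so double-check them against the doc):

  com.ibm.websphere.threadmonitor.interval
      how often the thread pools are polled (seconds)
  com.ibm.websphere.threadmonitor.threshold
      how long a thread may be active before it is flagged as possibly hung (seconds)
  com.ibm.websphere.threadmonitor.false.alarm.threshold
      how many "hung but completed" reports are tolerated before the threshold
      is adjusted automatically

Raising the threshold only hides the symptom, though; the long-running
requests themselves still need to be found.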

Bye.
gg


busj...@de.ibm.com
Jun 8, 2005, 3:09:03 AM
Hi, we are encountering a similar problem in WPS 5.1 on Linux:

Our portal crashed during a reinitialization of a virtual portal instance. After killing and restarting it, the anonymous area worked fine, but every login resulted in the portal hanging forever. The message in SystemOut.log was:

[6/3/05 14:16:24:214 CEST] 3c9bfb36 WebGroup E SRVE0026E: [Servlet Error]-[]: java.lang.NullPointerException
at com.ibm.wps.engine.Servlet.doGet(Servlet.java:526)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
at com.ibm.ws.webcontainer.servlet.StrictServletInstance.doService(StrictServletInstance.java:110)
at com.ibm.ws.webcontainer.servlet.StrictLifecycleServlet._service(StrictLifecycleServlet.java:174)
at com.ibm.ws.webcontainer.servlet.ServicingServletState.service(StrictLifecycleServlet.java:333)
at com.ibm.ws.webcontainer.servlet.StrictLifecycleServlet.service(StrictLifecycleServlet.java:116)
at com.ibm.ws.webcontainer.servlet.ServletInstance.service(ServletInstance.java:283)
at com.ibm.ws.webcontainer.servlet.ValidServletReferenceState.dispatch(ValidServletReferenceState.java:42)
at com.ibm.ws.webcontainer.servlet.ServletInstanceReference.dispatch(ServletInstanceReference.java:40)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:76)
at com.ibm.wps.state.filter.StateCleanup.doFilter(StateCleanup.java:71)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:132)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:71)
at com.ibm.wps.mappingurl.impl.URLAnalyzer.doFilter(URLAnalyzer.java:174)
at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:132)
at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:71)
at com.ibm.ws.webcontainer.webapp.WebAppRequestDispatcher.handleWebAppDispatch(WebAppRequestDispatcher.java:1086)
at com.ibm.ws.webcontainer.webapp.WebAppRequestDispatcher.dispatch(WebAppRequestDispatcher.java:627)
at com.ibm.ws.webcontainer.webapp.WebAppRequestDispatcher.forward(WebAppRequestDispatcher.java:201)
at com.ibm.ws.webcontainer.srt.WebAppInvoker.doForward(WebAppInvoker.java:125)
at com.ibm.ws.webcontainer.srt.WebAppInvoker.handleInvocationHook(WebAppInvoker.java:286)
at com.ibm.ws.webcontainer.cache.invocation.CachedInvocation.handleInvocation(CachedInvocation.java:71)
at com.ibm.ws.webcontainer.srp.ServletRequestProcessor.dispatchByURI(ServletRequestProcessor.java:182)
at com.ibm.ws.webcontainer.oselistener.OSEListenerDispatcher.service(OSEListener.java:334)
at com.ibm.ws.webcontainer.http.HttpConnection.handleRequest(HttpConnection.java:56)
at com.ibm.ws.http.HttpConnection.readAndHandleRequest(HttpConnection.java:652)
at com.ibm.ws.http.HttpConnection.run(HttpConnection.java:448)
at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:912)

followed by several messages:

[6/3/05 14:16:24:827 CEST] 3d3ffb36 ThreadMonitor W WSVR0606W: Thread "Servlet.Engine.Transports : 14" (3d3ffb36) was previously reported to be hung but has completed. It was active for approximately 734,488 milliseconds. There are 13 threads in total in the server that still may be hung.
[6/3/05 14:16:25:317 CEST] 3d783b36 ThreadMonitor W WSVR0606W: Thread "Servlet.Engine.Transports : 11" (3d783b36) was previously reported to be hung but has completed. It was active for approximately 741,007 milliseconds. There are 12 threads in total in the server that still may be hung.

And so on. After the count had gone down to 0 threads, the portal was available again, but still only for anonymous users.

Thanks for any help!

rene.z...@capgemini.com
Jun 9, 2005, 4:18:37 PM
You have a serious performance problem.
You should use Tivoli Performance Viewer and the Performance Advisor to see what is causing it. Probably too many threads are allocated for too long.

abba...@schneider.com
Jul 20, 2005, 11:58:13 AM
We are facing similar problems in our AIX environment.
We have about 60-odd apps running. Node synchronization fails on a regular basis and the box runs hot (~100% CPU).
Did you guys find any solution or change any of your settings? Help will be greatly appreciated.

Ken Hygh
Jul 20, 2005, 12:24:44 PM
abba...@schneider.com wrote:

>We are facing similar problems on our AIX environment.
>We have about 60 odd apps running. Node synchronization fails on a regular basis
>

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
symptom

>and the box runs hot (~100% CPU).
>

^^^^^^^^^^
cause

>Did you guys find any solution or chnaged any of your settings. Help will be greatly appreciated.
>

Sounds like you need a bigger box.

Ken

Ben_
Jul 20, 2005, 5:00:57 PM
For hung threads:
Collect some javacores (kill -3 <pid>) and review (and compare) them with
the ThreadAnalyzer tool from alphaWorks to see which threads are blocked and
what they are doing. Obviously you need to collect the thread dumps while the
threads are actually locked, that is, while the message "There are x threads
in total in the server that still may be hung" has an x that is not zero.
(If it happens only from time to time, you can automate the collection by
using the sample from the article "Detecting hung threads in J2EE
applications" mentioned earlier.)
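
For the automated collection, a rough sketch of the idea in plain Python,
just to illustrate; the log path and pid below are placeholders, not values
from this thread, and the sample in that article is more complete:

#!/usr/bin/env python
# Hypothetical sketch: watch SystemOut.log for WSVR0605W hang warnings and
# trigger a javacore by sending SIGQUIT (the same as "kill -3 <pid>").
import os, signal, time

LOG = "/opt/WebSphere/AppServer/logs/server1/SystemOut.log"  # placeholder path
PID = 12345        # pid of the application server JVM (placeholder)
MAX_DUMPS = 3      # don't flood the box with javacores
PAUSE = 120        # seconds to wait between dumps

def follow(path):
    """Yield new lines appended to the file, like 'tail -f'."""
    with open(path) as f:
        f.seek(0, 2)              # start at the current end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(1)
                continue
            yield line

taken = 0
for line in follow(LOG):
    if "WSVR0605W" in line and taken < MAX_DUMPS:
        os.kill(PID, signal.SIGQUIT)   # IBM JVM writes a javacore on SIGQUIT
        taken += 1
        time.sleep(PAUSE)

Keep in mind that kill -3 pauses the JVM briefly while the javacore is
written, so don't fire it too often on a production box.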

For CPU usage:
You didn't tell us much about the context.
As Ken stated, you might simply have exceeded the capacity of the machine by
running those 60 apps.

If not, you need to identify whether one or several processes are eating the
CPU, and then which threads inside them are consuming it. Refer to
http://www-1.ibm.com/support/docview.wss?rs=180&context=SSCMP9J&uid=swg21116458&loc=en_US&cs=utf-8&lang=en
for details.
This BEA article is also very helpful for correlating the OS thread with the
Java thread:
http://support.bea.com/application_content/product_portlets/support_patterns/wls/High_CPU_Usage_Pattern.html#AIX
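
On AIX the correlation boils down to converting the kernel thread id that
"ps -mp <pid> -o THREAD" or tprof reports in decimal into the hex native ID
shown in the javacore. A small sketch of that lookup (the "native ID:0x..."
field label is an assumption about the javacore format on your JDK level;
adjust it if yours looks different):

# Hedged sketch: given a decimal thread id from ps/tprof on AIX, find the
# matching thread entry in an IBM javacore file.
import re, sys

def find_thread(javacore_path, decimal_tid):
    tid = int(decimal_tid)
    with open(javacore_path, errors="replace") as f:
        for line in f:
            m = re.search(r"native ID:0x([0-9A-Fa-f]+)", line)
            if m and int(m.group(1), 16) == tid:
                print(line.rstrip())

if __name__ == "__main__":
    find_thread(sys.argv[1], sys.argv[2])   # usage: script.py javacore.txt 12345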

It might also be that garbage collection is taking all the CPU (typically
because the application produces a lot of garbage in a short period of time).
In that case, you can activate verbose garbage collection logging from the
Admin Console and review the native_stderr log to see how much time is spent
in GC.
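
If you want a quick number out of the verbosegc output, something like the
following can total up the reported pause times. The "<GC(n): ... in N ms>"
pattern is an assumption about the 1.4.x IBM verbosegc layout; check it
against a few lines of your own native_stderr before trusting the result:

# Rough sketch: sum the GC pause times reported in an IBM verbosegc log.
import re, sys

pattern = re.compile(r"<GC\(\d+\):.*\bin (\d+) ms\b")
gc_count = 0
pause_ms = 0

with open(sys.argv[1], errors="replace") as f:
    for line in f:
        m = pattern.search(line)
        if m:
            gc_count += 1
            pause_ms += int(m.group(1))

print("GC lines matched: %d, total pause time: %d ms" % (gc_count, pause_ms))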

HTH.

