"http-8081-2" daemon prio=10 tid=0x0000000057fa4400 nid=0x6150 in Object.wait() [0x0000000043fba000..0x0000000043fbbd90]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab37e2960> (a java.lang.Object)
at java.lang.Object.wait(Object.java:485)
at com.novell.ldap.Connection.acquireWriteSemaphore(Unknown Source)
- locked <0x00002aaab37e2960> (a java.lang.Object)
at com.novell.ldap.Connection.connect(Unknown Source)
at com.novell.ldap.Connection.connect(Unknown Source)
at com.novell.ldap.LDAPConnection.bind(Unknown Source)
at com.novell.ldap.connectionpool.Connection.bind(Unknown Source)
at com.novell.ldap.LDAPConnection.bind(Unknown Source)
at com.novell.ldap.LDAPConnection.bind(Unknown Source)
at com.novell.ldap.connectionpool.Connection.poolBind(Unknown Source)
at com.novell.ldap.connectionpool.PoolManager.getBoundConnection(Unknown Source)
...
199 of the 200 threads are waiting on object
0x00002aaab37e2960. The servers lock up and Tomcat must be
restarted to clear the condition. There was a similar
discussion back in Feb 2006, but that thread died off
without resolution.
Is this still a known issue? It appears to be very similar to (if not
exactly the same as) the earlier issue.
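A quick way to confirm that kind of count from a saved thread dump, as a rough sketch: the class name MonitorWaitCounter and the sample text in main are made up for illustration (shortened lines modelled on the dump above), and it only looks at "- waiting on <...>" lines, so other wait forms are not counted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of JLDAP or Tomcat.
final class MonitorWaitCounter {
    // Matches lines such as "- waiting on <0x00002aaab37e2960> (a java.lang.Object)"
    // and the shorter "<2626f340>" form used by older JVMs.
    private static final Pattern WAITING_ON =
            Pattern.compile("-\\s+waiting on <([0-9a-fA-Fx]+)>");

    /** Returns monitor address -> number of "waiting on" lines that mention it. */
    static Map<String, Integer> countWaiters(String threadDump) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        Matcher m = WAITING_ON.matcher(threadDump);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Shortened sample text, for illustration only.
        String dump = String.join("\n",
                "\"http-8081-1\" daemon prio=10 in Object.wait()",
                "at java.lang.Object.wait(Native Method)",
                "- waiting on <0x00002aaab37e2960> (a java.lang.Object)",
                "\"http-8081-2\" daemon prio=10 in Object.wait()",
                "at java.lang.Object.wait(Native Method)",
                "- waiting on <0x00002aaab37e2960> (a java.lang.Object)",
                "\"Thread-509\" daemon prio=10 runnable",
                "at java.net.SocketInputStream.socketRead0(Native Method)");
        countWaiters(dump).forEach((monitor, n) ->
                System.out.println(monitor + " <- " + n + " waiting thread(s)"));
    }
}
```

If one address accounts for nearly every waiting thread, all requests are queued behind the same monitor.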
Do you see other threads waiting in acquireWriteSemaphore, or any
other threads that are running Connection$ReaderThread.run?
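If taking a full dump is awkward, the same question can be asked from inside the JVM. The sketch below is only an illustration: StackFrameFinder and threadsIn are hypothetical names, and it assumes a Java 8 or later runtime (Thread.getAllStackTraces needs Java 5+, so it is not an option on 1.4.2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper, not part of JLDAP.
final class StackFrameFinder {
    /** Names of live threads whose current stack contains className.methodName. */
    static List<String> threadsIn(String className, String methodName) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : e.getValue()) {
                if (frame.getClassName().equals(className)
                        && frame.getMethodName().equals(methodName)) {
                    names.add(e.getKey().getName());
                    break; // one match per thread is enough
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Class and method names are taken from the stack traces in this thread.
        System.out.println("waiting writers: "
                + threadsIn("com.novell.ldap.Connection", "acquireWriteSemaphore"));
        System.out.println("reader threads:  "
                + threadsIn("com.novell.ldap.Connection$ReaderThread", "run"));
    }
}
```

Many waiting writers with no reader thread at all would point in a different direction than many writers with one reader blocked in a socket read.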
Cheers,
Krishna
"ExecuteThread: '12' for queue: 'weblogic.kernel.Default'" daemon prio=10 tid=0008b460 nid=25 lwp_id=7651810 in Object.wait() [0x11180000..0x11180dc0]
at java.lang.Object.wait(Native Method)
- waiting on <2626f340> (a java.lang.Object)
at java.lang.Object.wait(Object.java:429)
at com.novell.ldap.Connection.acquireWriteSemaphore(Unknown Source)
- locked <2626f340> (a java.lang.Object)
at com.novell.ldap.Connection.isConnectionAlive(Unknown Source)
at com.novell.ldap.LDAPConnection.isConnectionAlive(Unknown Source)
### Customer code trace snipped ####
at weblogic.rmi.internal.BasicServerRef.invoke(BasicServerRef.java:492)
at weblogic.rmi.cluster.ReplicaAwareServerRef.invoke(ReplicaAwareServerRef.java:108)
at weblogic.rmi.internal.BasicServerRef$1.run(BasicServerRef.java:435)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:147)
at weblogic.rmi.internal.BasicServerRef.handleRequest(BasicServerRef.java:430)
at weblogic.rmi.internal.BasicExecuteRequest.execute(BasicExecuteRequest.java:35)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:224)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:183)
"Thread-509" daemon prio=10 tid=00091810 nid=566 lwp_id=7655722 runnable [0x0e9c0000..0x0e9c0dc0]
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:134)
at java.net.SocketInputStream.read(SocketInputStream.java:187)
at com.novell.ldap.asn1.ASN1Identifier.<init>(Unknown Source)
at com.novell.ldap.Connection$ReaderThread.run(Unknown Source)
at java.lang.Thread.run(Thread.java:534)
We are using the March 2006 NDK release on JRE 1.4.2.
For those following this thread, see also Krishna's interesting analysis,
"Hang/Deadlock issue with ReaderThreads in jldap", at
http://groups.google.com/group/novell.devsup.ldap_j/browse_thread/thread/d508354627a139d9
Regards,
Xabi.
All of the Tomcat http-*-* threads exhibit this, which is why the
servers run out of threads.
Are all the threads waiting on the same object?
> - waiting on <2626f340> (a java.lang.Object)
That would mean all HTTP requests are using the same LDAPConnection
(or rather clones that share a single Connection). From what I
understand of the design and code of OpenLDAP, the same Connection
object (the real connection) can be shared by many LDAPConnection
objects. There is definitely a bug in this area: I was able to
reproduce a deadlock with breakpoints, and I suspect this bug can
show up in several different ways.
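To illustrate why one stuck owner can take every request thread down with it, here is a deliberately simplified model. It is not the JLDAP source: SharedConnection, acquire, release and stuckWaiters are hypothetical names, and the owner-id scheme is only my assumption about how a write semaphore like this might behave. It needs Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only; all names here are hypothetical.
final class SharedSemaphoreHang {
    /** Stand-in for the one real connection shared by all clones. */
    static final class SharedConnection {
        private final Object writeSemaphore = new Object();
        private int owner = 0; // 0 means the semaphore is free

        void acquire(int id) throws InterruptedException {
            synchronized (writeSemaphore) {
                while (owner != 0 && owner != id) {
                    writeSemaphore.wait(); // no timeout: relies on release() to notify
                }
                owner = id;
            }
        }

        void release(int id) {
            synchronized (writeSemaphore) {
                if (owner == id) {
                    owner = 0;
                    writeSemaphore.notifyAll();
                }
            }
        }
    }

    /** Starts request threads against a semaphore that is never released
     *  and returns how many of them end up in WAITING. */
    static int stuckWaiters(int waiters) {
        SharedConnection conn = new SharedConnection();
        List<Thread> threads = new ArrayList<>();
        int stuck = 0;
        try {
            conn.acquire(1); // owner 1 takes the semaphore and never releases it
            for (int i = 0; i < waiters; i++) {
                final int id = i + 2;
                Thread t = new Thread(() -> {
                    try {
                        conn.acquire(id);
                    } catch (InterruptedException e) {
                        // interrupted by the cleanup below; just exit
                    }
                }, "request-" + id);
                t.setDaemon(true);
                t.start();
                threads.add(t);
            }
            // Poll for up to 5 seconds until every request thread is waiting.
            long deadline = System.currentTimeMillis() + 5000;
            while (System.currentTimeMillis() < deadline) {
                stuck = 0;
                for (Thread t : threads) {
                    if (t.getState() == Thread.State.WAITING) stuck++;
                }
                if (stuck == waiters) break;
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            for (Thread t : threads) t.interrupt(); // let the demo threads exit
        }
        return stuck;
    }

    public static void main(String[] args) {
        System.out.println(stuckWaiters(5) + " of 5 request threads stuck in wait()");
    }
}
```

In this model every waiter sits in Object.wait() on the same monitor, which is the shape of the dumps posted here; whatever prevents the real owner from releasing is a separate question.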
> Are all the threads waiting on the same object?
>
> > - waiting on <2626f340> (a java.lang.Object)
>
> That would mean all HTTP requests are using the same LDAPConnection
> (or rather clones that share a single Connection). From what I
> understand of the design and code of OpenLDAP, the same Connection
> object (the real connection) can be shared by many LDAPConnection
> objects. There is definitely a bug in this area: I was able to
> reproduce a deadlock with breakpoints, and I suspect this bug can
> show up in several different ways.
You got it. They all get stuck on the same Object. I had just
started to look at the code when I found your analysis already.
What's interesting is that we're using the same code base in two
different areas; one exhibits the behavior and one does not. From
what I can deduce so far, the trigger isn't code-related but
environmental. I haven't been able to reproduce the issues we're
encountering in any other environment.
I'm going to try your fix as well and see if that rectifies the
issue.