Distributed Destination Failover not working

Bob S

Nov 11, 2002, 11:16:37 AM

I'm using WebLogic 7 SP1 on Windows 2000. I've configured a
distributed queue that has two members. The two members are
running in two WebLogic instances in a cluster configuration
(call them Server1 and Server2). My client posts messages to the
distributed queue, and the messages seem to be distributed
between Server1 and Server2 (as expected). However, when I kill
Server1, the client complains that it can't connect to the queue
on Server1 and never recovers. I would have expected to see (at
most) one exception and then the next request to use Server2's
queue. The client gets the following exception:

weblogic.jms.dispatcher.DispatcherException: Dispatcher not found in jndi: Server1,
javax.naming.NameNotFoundException: Unable to resolve 'weblogic.jms.S:Server1'
Resolved: 'weblogic.jms' Unresolved:'S:Server1' ; remaining name 'S:Server1'
at weblogic.jms.dispatcher.DispatcherManager.dispatcherCreate(DispatcherManager.java:323)
at weblogic.jms.dispatcher.DispatcherManager.findOrCreate(DispatcherManager.java:413)
at weblogic.jms.frontend.FEProducer.<init>(FEProducer.java:87)
at weblogic.jms.frontend.FESession$2.run(FESession.java:607)
at weblogic.security.service.SecurityServiceManager.runAs(SecurityServiceManager.java:785)
at weblogic.jms.frontend.FESession.producerCreate(FESession.java:604)
at weblogic.jms.frontend.FESession.invoke(FESession.java:2246)
at weblogic.jms.dispatcher.Request.wrappedFiniteStateMachine(Request.java:552)
at weblogic.jms.dispatcher.DispatcherImpl.dispatchSync(DispatcherImpl.java:275)
at weblogic.jms.client.JMSSession.createProducer(JMSSession.java:1461)
at weblogic.jms.client.JMSSession.createSender(JMSSession.java:1312)

Should I have some kind of recovery logic on my client to make
this stuff work?

Bob.

Anamitra

Nov 11, 2002, 1:36:48 PM

Hi Bob,
I saw the exact same behaviour. On top of that, I also saw that if I shut down
Server2 instead, failover worked pretty well. I could never find out what was so
special about Server1 that the client won't recover.

Anamitra

Quantos Quattro

Nov 11, 2002, 4:25:41 PM

Same behaviour here too. The documentation says that a distributed
destination is supposed to help in cases like this by protecting against
server failure (i.e. still distributing to the remaining destinations). I
haven't read all the docs yet (busy now), but I still haven't found the answer.

Perhaps it's how we are referring to the cluster? How do you refer to the
cluster? Currently I am doing something like
InitialContext ic1 = getInitialContext("t3://x1:7001,x2:7002");

While in a loop writing to a producer (a distributed queue), I take one of
the servers down, then I get
<10-Nov-2002 21:21:02 GMT> <Error> <socket> <000403> <IOException on socket:
Socket[addr=x2.x/10.0.10.10,port=7005,localport=4828]

java.net.SocketException: Connection reset

blah blah

weblogic.jms.common.IllegalStateException: Producer is closed

I was hoping for some kind of invisible failover, i.e. the producer would
continue as usual, just producing to x1. Each time you produce, a
decision is made as to which physical queue to produce on, and as one is
down you'd hope the other would be the only option.

What should we do?

Regards,

Q
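
The behaviour Q was hoping for (a member is picked on every send, and
members that are down are skipped) can be modeled with a small round-robin
picker. This is only a toy sketch of the idea, not WebLogic's actual load
balancing code; the class name RoundRobinMemberPicker, the pick method and
the "unavailable" set are hypothetical, and the member names x1 and x2 are
borrowed from the URL above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model (hypothetical): round-robin over members, skipping ones known to be down. */
final class RoundRobinMemberPicker {
    private final List<String> members;
    private int next = 0; // index of the member to try first on the next pick

    RoundRobinMemberPicker(List<String> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("at least one member is required");
        }
        this.members = new ArrayList<>(members);
    }

    /** Returns the next member that is not in 'unavailable', or null if all are down. */
    String pick(Set<String> unavailable) {
        for (int i = 0; i < members.size(); i++) {
            String candidate = members.get(next);
            next = (next + 1) % members.size();
            if (!unavailable.contains(candidate)) {
                return candidate;
            }
        }
        return null; // every member is marked unavailable
    }
}
```

With members x1 and x2 and x2 marked as down, every call to pick returns
x1, which is the "only option" outcome described above.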


Tom Barnes

Nov 11, 2002, 6:17:24 PM

to Bob S

Hi Bob,

The particular exception you are seeing seems like it could use
some enhancement: it should be wrapped in a "friendlier" exception
such as "remote server XXX unavailable". I recommend filing a case
with customer support.

That said, a producer sending to a distributed destination needs
to be able to handle send failures. WebLogic will automatically
retry sends in cases where there is no ambiguity, but when it can't
determine the nature of the failure (e.g. it can't determine whether
or not the message made it to a JMS server) it throws the exception
back to the client to let the client decide what it wants to do,
e.g. commit/don't commit, reconnect and resend, or reconnect and
don't resend.

Tom
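
The "reconnect and resend" option Tom mentions could be sketched on the
client roughly as follows. This is a minimal sketch under stated
assumptions, not WebLogic's or anyone's actual recovery code: MessageSender
and SenderFactory are hypothetical stand-ins for the JMS objects (in a real
client, connect() would do the JNDI lookup and create the connection,
session and sender, and the failures would be javax.jms.JMSException rather
than RuntimeException). Note that resending after an ambiguous failure can
deliver a duplicate, which is why the decision is left to the client.

```java
/** Hypothetical stand-in for a connection + session + sender bound to one queue. */
interface MessageSender {
    void send(String body); // assumed to throw RuntimeException on failure
    void close();
}

/** Hypothetical factory; a real client would look up and create the JMS objects here. */
interface SenderFactory {
    MessageSender connect(); // assumed to throw RuntimeException on failure
}

/** Sends a message; on failure, drops the broken sender, reconnects and resends. */
final class ResendingProducer {
    private final SenderFactory factory;
    private final int maxAttempts;
    private MessageSender sender; // null until first use and after any failure

    ResendingProducer(SenderFactory factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.factory = factory;
        this.maxAttempts = maxAttempts;
    }

    /** Returns the number of attempts used; rethrows the last failure if all attempts fail. */
    int send(String body) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (sender == null) {
                    sender = factory.connect(); // fresh connection after a failure
                }
                sender.send(body);
                return attempt;
            } catch (RuntimeException e) {
                last = e;
                if (sender != null) {
                    sender.close(); // the old sender is unusable ("Producer is closed")
                    sender = null;
                }
            }
        }
        throw last;
    }
}
```

A client that prefers "reconnect and don't resend" would keep the same
reconnect step but report the failure to the caller instead of looping.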

Bob S

Nov 13, 2002, 8:43:18 AM

Tom,

I don't really have a problem with getting an exception for the
request that was in progress when the server failed. I would
expect, though, the next request to succeed.
The problem is that even when I restart my client process, it
still tries to go to the same destination (weird). It seems
that the distributed destination's exception handling logic only
removes the failed entry when it receives a certain type of
exception. I suspect this because (just 5 minutes ago) I got the
distributed destination to recover from the failure.
The exception that I got this time was the following:

weblogic.jms.common.JMSException: Failed to send message because destination MyQueue_JMSServer1
is not avaiable (shutdown, suspended or deleted).

Start server side stack trace:
weblogic.jms.common.JMSException: Failed to send message because destination MyQueue_JMSServer1
is not avaiable (shutdown, suspended or deleted).
at weblogic.jms.backend.BEDestination.checkShutdownOrSuspendedNeedLock(BEDestination.java:1102)
at weblogic.jms.backend.BEDestination.send(BEDestination.java:2782)
at weblogic.jms.backend.BEDestination.invoke(BEDestination.java:3810)
at weblogic.jms.dispatcher.Request.wrappedFiniteStateMachine(Request.java:552)
at weblogic.jms.dispatcher.DispatcherImpl.dispatchAsync(DispatcherImpl.java:152)
at weblogic.jms.dispatcher.DispatcherImpl.dispatchAsyncTranFuture(DispatcherImpl.java:425)
at weblogic.jms.dispatcher.DispatcherImpl_WLSkel.invoke(Unknown Source)
at weblogic.rmi.internal.BasicServerRef.invoke(BasicServerRef.java:362)
at weblogic.rmi.internal.BasicServerRef$1.run(BasicServerRef.java:313)
at weblogic.security.service.SecurityServiceManager.runAs(SecurityServiceManager.java:785)
at weblogic.rmi.internal.BasicServerRef.handleRequest(BasicServerRef.java:308)
at weblogic.rmi.internal.BasicExecuteRequest.execute(BasicExecuteRequest.java:30)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:153)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:134)
End server side stack trace

In this (rare) case, my system recovered beautifully!

Bob.

Tom Barnes

Nov 13, 2002, 3:27:20 PM

to Bob S

Hi Bob,

If you haven't already, check that the connection factory you use has
"ServerAffinityEnabled" set to false (the default is true) and
"LoadBalancingEnabled" set to true. That said, I think you may be
seeing a known bug, so I suggest contacting customer support.

Tom

Flyin...@hotmail.com

Feb 12, 2003, 2:53:46 PM

Hi,

I hit the same problem. Were you able to fix it? If so, how?