Help with clustering (2.1.0.GA+SipRA)

359 views
Skip to first unread message

Randy Thomas

unread,
Jul 9, 2010, 11:34:55 AM7/9/10
to mobicents-public
I have been trying to use clustering, to keep it simple, I started
with 2 nodes but running into some problems. I've been working with
2.1.0.GA mobicents version with SipRA installed.

Here are the steps I've been following.
1) Modified SipRA configuration to include load balancer
(192.168.230.206)
2) Start node1, run.bat -c all -b 192.168.231.105 -
Djboss.messaging.ServerPeedID=1
3) Start node2, run.bat -c all -b 192.168.231.106 -
Djboss.messaging.ServerPeedID=2

Seem to start fine except some WARN like below in both the logs

2010-07-09 10:26:01,521 WARN [org.jgroups.protocols.pbcast.NAKACK]
(OOB-1,192.168.231.105:61643) 192.168.231.105:61643] discarded message
from non-member 192.168.231.106:50118, my view is
[192.168.231.105:61643|0] [192.168.231.105:61643]
2010-07-09 10:26:01,568 WARN [org.jgroups.protocols.pbcast.NAKACK]
(OOB-1,192.168.231.105:61643) 192.168.231.105:61643] discarded message
from non-member 192.168.231.106:50118, my view is
[192.168.231.105:61643|0] [192.168.231.105:61643]
.....
......
2010-07-09 10:23:33,036 WARN [org.jgroups.protocols.TCP] (main)
failed to join /224.0.0.75:7500 on net3: java.net.SocketException:
Unrecognized Windows Sockets error: 0: no Inet4Address associated with
interface
2010-07-09 10:23:33,037 WARN [org.jgroups.protocols.TCP] (main)
failed to join /224.0.0.75:7500 on net4: java.net.SocketException:
Unrecognized Windows Sockets error: 0: no Inet4Address associated with
interface
......................
..........
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(OOB-36,192.168.231.105:7900) Got merge response at
192.168.231.105:7900 from 192.168.231.106:7900,
merge_id=[192.168.231.106:7900|0] [192.168.231.106:7900], merge data
is sender=192.168.231.106:7900, view=[192.168.231.106:7900|0]
[192.168.231.106:7900], digest=192.168.231.106:7900: [51 : 54 (54)]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 expects 2 responses, so far got 2 responses
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 collected 2 merge response(s) in 16 ms
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 is consolidating merge data
sender=192.168.231.105:7900, view=[192.168.231.105:7900|0]
[192.168.231.105:7900], digest=192.168.231.105:7900: [35 : 38 (38)]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 is consolidating merge data
sender=192.168.231.106:7900, view=[192.168.231.106:7900|0]
[192.168.231.106:7900], digest=192.168.231.106:7900: [51 : 54 (54)]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 computed new merged view that will be MergeView::
[192.168.231.105:7900|1] [192.168.231.105:7900, 192.168.231.106:7900],
subgroups=[[192.168.231.105:7900|0] [192.168.231.105:7900],
[192.168.231.106:7900|0] [192.168.231.106:7900]]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900consolidated digest=192.168.231.106:7900: [51 : 54
(54)], 192.168.231.105:7900: [35 : 38 (38)]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900)
192.168.231.105:7900 is sending merge view [192.168.231.105:7900|1] to
coordinators [192.168.231.106:7900, 192.168.231.105:7900]
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(MergeTask,MessagingPostOffice-DATA,192.168.231.105:7900) Merge leader
192.168.231.105:7900 completed merge task
2010-07-09 10:41:02,327 DEBUG [org.jgroups.protocols.pbcast.GMS]
(Incoming-2,192.168.231.105:7900) view=MergeView::
[192.168.231.105:7900|1] [192.168.231.105:7900, 192.168.231.106:7900],
subgroups=[[192.168.231.105:7900|0] [192.168.231.105:7900],
[192.168.231.106:7900|0] [192.168.231.106:7900]],
digest=192.168.231.106:7900: [51 : 54 (54)], 192.168.231.105:7900:
[35 : 38 (38)]

And some errors like below

2010-07-09 10:41:02,327 ERROR [org.jgroups.protocols.pbcast.NAKACK]
(Incoming-2,192.168.231.105:7900) sender 192.168.231.106:7900 not
found in xmit_table
2010-07-09 10:41:02,327 ERROR [org.jgroups.protocols.pbcast.NAKACK]
(Incoming-2,192.168.231.105:7900) range is null


I wanted to make sure that the SipRA is getting replicated, so, I
installed my registrar application and registered one phone (which
points to LB). Phone gets registered on Node1 but when I go and check
to see Cache contents through JMX console on Node1 and Node2 they
don't seem to match (SipRA AC etc) . On Node1, MobicentsSleeCache has
the activity context details but not on Node2. After a little while I
see the following exceptions on Node1 for the REGISTERs coming in and
REGISTER getting dropped (don't make it to the application breaking
the whole functionality).



REGISTER sip:sip.test.com SIP/2.0
Via: SIP/2.0/UDP 192.168.230.206:5060;branch=z9hG4bK-
d8754z-9611fb0e044ee519-1---d8754z-OGE5O,SIP/2.0/UDP
192.168.231.105:28796;branch=z9hG4bK-d8754z-9611fb0e044ee519-1---
d8754z-;rport=28796;received=192.168.231.105
Max-Forwards: 69
Contact: <sip:
9789912009@192.168.231.105:28796;rinstance=582bcd17359e9023>
To: "Bill C" <sip:97899...@sip.test.com>
From: "Bill C" <sip:97899...@sip.test.com>;tag=14436d12
Call-ID: OGE5OTE4YTk0M2Y4ZmQ1OWY3ODYyYjQxZTQwYWQ5ODI.
CSeq: 4 REGISTER
Expires: 3600
Allow:
INVITE,ACK,CANCEL,OPTIONS,BYE,REFER,NOTIFY,MESSAGE,SUBSCRIBE,INFO
User-Agent: X-Lite release 1104o stamp 56125
Authorization: Digest
username="9789912009",realm="DeargeertCbr",nonce="cbr36d3a1d2b64f8917875887e48c1f",uri="sip:sip.broadvoice.com",response="e04cc0065f9d4122442f7a24e9b8be2a",cnonce="85b253f95577778c0a156ed4f2efba31",nc=00000003,qop=auth,algorithm=MD5
Route: <sip:192.168.231.105:5060;transport=udp;lr>
Content-Length: 0


2010-07-09 10:56:19,990 DEBUG [org.jgroups.protocols.FD]
(Timer-5,192.168.231.105:52358) sending are-you-alive msg to
192.168.231.106:60791 (own address=192.168.231.105:52358)
2010-07-09 10:56:22,086 DEBUG [org.jgroups.protocols.FD]
(Timer-4,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)
2010-07-09 10:56:25,987 DEBUG [org.jgroups.protocols.FD]
(Timer-12,192.168.231.105:52358) sending are-you-alive msg to
192.168.231.106:60791 (own address=192.168.231.105:52358)
2010-07-09 10:56:28,083 DEBUG [org.jgroups.protocols.FD]
(Timer-1,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)
2010-07-09 10:56:28,404 DEBUG [org.jgroups.protocols.FD]
(Timer-4,192.168.231.105:61643) sending are-you-alive msg to
192.168.231.106:50118 (own address=192.168.231.105:61643)
2010-07-09 10:56:31,984 DEBUG [org.jgroups.protocols.FD]
(Timer-10,192.168.231.105:52358) sending are-you-alive msg to
192.168.231.106:60791 (own address=192.168.231.105:52358)
2010-07-09 10:56:34,079 DEBUG [org.jgroups.protocols.FD]
(Timer-1,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)
2010-07-09 10:56:37,981 DEBUG [org.jgroups.protocols.FD]
(Timer-5,192.168.231.105:52358) sending are-you-alive msg to
192.168.231.106:60791 (own address=192.168.231.105:52358)
2010-07-09 10:56:38,398 DEBUG [org.jgroups.protocols.FD]
(Timer-4,192.168.231.105:61643) sending are-you-alive msg to
192.168.231.106:50118 (own address=192.168.231.105:61643)
2010-07-09 10:56:40,075 DEBUG [org.jgroups.protocols.FD]
(Timer-1,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)
2010-07-09 10:56:41,689 ERROR
[javax.slee.RAEntityNotification[entity=SipRA].SipResourceAdaptor]
(SipRA-UDPMessageChannelThread-3) Replication timeout for
192.168.231.106:50118
2010-07-09 10:56:41,689 ERROR
[javax.slee.RAEntityNotification[entity=SipRA].SipResourceAdaptor]
(SipRA-UDPMessageChannelThread-3) Replication timeout for
192.168.231.106:50118
org.jboss.cache.lock.TimeoutException: Replication timeout for
192.168.231.106:50118
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
768)
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
716)
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
721)
at
org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:
161)
at
org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:
135)
at
org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterceptor.java:
107)
at
org.jboss.cache.interceptors.ReplicationInterceptor.handleCrudMethod(ReplicationInterceptor.java:
160)
at
org.jboss.cache.interceptors.ReplicationInterceptor.visitPutDataMapCommand(ReplicationInterceptor.java:
113)
at
org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:
104)
at
org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:
116)
at
org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandInterceptor.java:
131)
at
org.jboss.cache.commands.AbstractVisitor.visitPutDataMapCommand(AbstractVisitor.java:
60)
at
org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:
104)
at
org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:
116)
at
org.jboss.cache.interceptors.TxInterceptor.attachGtxAndPassUpChain(TxInterceptor.java:
301)
at
org.jboss.cache.interceptors.TxInterceptor.handleDefault(TxInterceptor.java:
283)
at
org.jboss.cache.commands.AbstractVisitor.visitPutDataMapCommand(AbstractVisitor.java:
60)
at
org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:
104)
at
org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:
116)
at
org.jboss.cache.interceptors.CacheMgmtInterceptor.visitPutDataMapCommand(CacheMgmtInterceptor.java:
97)
at
org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:
104)
at
org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(CommandInterceptor.java:
116)
at
org.jboss.cache.interceptors.InvocationContextInterceptor.handleAll(InvocationContextInterceptor.java:
178)
at
org.jboss.cache.interceptors.InvocationContextInterceptor.visitPutDataMapCommand(InvocationContextInterceptor.java:
64)
at
org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCommand.java:
104)
at
org.jboss.cache.interceptors.InterceptorChain.invoke(InterceptorChain.java:
287)
at
org.jboss.cache.invocation.CacheInvocationDelegate.invokePut(CacheInvocationDelegate.java:
705)
at
org.jboss.cache.invocation.CacheInvocationDelegate.put(CacheInvocationDelegate.java:
519)
at
org.jboss.cache.invocation.NodeInvocationDelegate.addChild(NodeInvocationDelegate.java:
337)
at org.mobicents.cache.CacheData.create(CacheData.java:70)
at
org.mobicents.slee.runtime.activity.ActivityContextImpl.<init>(ActivityContextImpl.java:
78)
at
org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.createActivityContext(ActivityContextFactoryImpl.java:
110)
at
org.mobicents.slee.runtime.activity.ActivityContextFactoryImpl.createActivityContext(ActivityContextFactoryImpl.java:
32)
at
org.mobicents.slee.resource.SleeEndpointImpl._startActivity(SleeEndpointImpl.java:
206)
at
org.mobicents.slee.resource.SleeEndpointStartActivityNotTransactedExecutor.execute(SleeEndpointStartActivityNotTransactedExecutor.java:
38)
at
org.mobicents.slee.resource.SleeEndpointImpl.startActivity(SleeEndpointImpl.java:
105)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.addActivity(SipResourceAdaptor.java:
834)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.processNotCancelRequest(SipResourceAdaptor.java:
378)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.processRequest(SipResourceAdaptor.java:
215)
at gov.nist.javax.sip.EventScanner.deliverEvent(EventScanner.java:
226)
at
gov.nist.javax.sip.SipProviderImpl.handleEvent(SipProviderImpl.java:
193)
at gov.nist.javax.sip.DialogFilter.processRequest(DialogFilter.java:
1274)
at
gov.nist.javax.sip.stack.SIPServerTransaction.processRequest(SIPServerTransaction.java:
848)
at
gov.nist.javax.sip.stack.UDPMessageChannel.processMessage(UDPMessageChannel.java:
508)
at
gov.nist.javax.sip.stack.UDPMessageChannel.processIncomingDataPacket(UDPMessageChannel.java:
468)
at
gov.nist.javax.sip.stack.UDPMessageChannel.run(UDPMessageChannel.java:
304)
at java.lang.Thread.run(Thread.java:619)
2010-07-09 10:56:43,978 DEBUG [org.jgroups.protocols.FD]
(Timer-5,192.168.231.105:52358) sending are-you-alive msg to
192.168.231.106:60791 (own address=192.168.231.105:52358)

Thanks for looking and any help is appreciated.

Eduardo Martins

unread,
Jul 9, 2010, 12:03:58 PM7/9/10
to mobicent...@googlegroups.com
Start one node after the other one complets startup, see if it is same problem.

-- Eduardo

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
http://emmartins.blogspot.com
http://www.redhat.com/solutions/telco

Randy Thomas

unread,
Jul 9, 2010, 4:45:22 PM7/9/10
to mobicents-public
Eduardo,
Thanks, Starting one node after the other as you said, minimized some
of the errors, however, I do not see any replication of the dialog
data by SipRA.

My call applicaiton uses cache but not configured to handle clustering
yet. When I make a call (established), I see the dialog data (JMX)
under node
/jain-sip-ha
/Dialogs
/
zjizymm4ztvmmji2otbkytjimmm2mjnlzjiwodhmnja.:cda56054:d6484425
{ltag=cda56054, is=true, lr=SIP/2.0 200 OK
..........

on Node1 but I donot see such information on Node2. I have clustering
logs enabled and I didn't see any errors or log messages trying to
replicate any data.




On Jul 9, 12:03 pm, Eduardo Martins <emmart...@gmail.com> wrote:
> Start one node after the other one complets startup, see if it is same problem.
>
> -- Eduardo
>
> -:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-http://emmartins.blogspot.comhttp://www.redhat.com/solutions/telco
> > 9789912...@192.168.231.105:28796;rinstance=582bcd17359e9023>
> > To: "Bill C" <sip:9789912...@sip.test.com>
> > From: "Bill C" <sip:9789912...@sip.test.com>;tag=14436d12
> > Call-ID: OGE5OTE4YTk0M2Y4ZmQ1OWY3ODYyYjQxZTQwYWQ5ODI.
> > CSeq: 4 REGISTER
> > Expires: 3600
> > Allow:
> > INVITE,ACK,CANCEL,OPTIONS,BYE,REFER,NOTIFY,MESSAGE,SUBSCRIBE,INFO
> > User-Agent: X-Lite release 1104o stamp 56125
> > Authorization: Digest
> > username="9789912009",realm="DeargeertCbr",nonce="cbr36d3a1d2b64f8917875887­e48c1f",uri="sip:sip.broadvoice.com",response="e04cc0065f9d4122442f7a24e9b8­be2a",cnonce="85b253f95577778c0a156ed4f2efba31",nc=00000003,qop=auth,algori­thm=MD5
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 161)
> >        at
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 135)
> >        at
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 107)
> >        at
> > org.jboss.cache.interceptors.ReplicationInterceptor.handleCrudMethod(Replic­ationInterceptor.java:
> > 160)
> >        at
> > org.jboss.cache.interceptors.ReplicationInterceptor.visitPutDataMapCommand(­ReplicationInterceptor.java:
> > 113)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at- Hide quoted text -
>
> - Show quoted text -...
>
> read more »

Randy Thomas

unread,
Jul 9, 2010, 5:08:28 PM7/9/10
to mobicents-public
My bad, the following errors got logged in server.log once the call
was established


2010-07-09 16:56:45,112 DEBUG [org.jgroups.protocols.FD]
(Timer-4,192.168.231.105:62336) sending are-you-alive msg to
192.168.231.106:61468 (own address=192.168.231.105:62336)
2010-07-09 16:56:48,054 DEBUG [org.jgroups.protocols.FD]
(Timer-1,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)
2010-07-09 16:56:48,821 ERROR
[javax.slee.RAEntityNotification[entity=SipRA].SipResourceAdaptor]
(Timer-6) Replication timeout for 192.168.231.106:61468
org.jboss.cache.lock.TimeoutException: Replication timeout for
192.168.231.106:61468
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
768)
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
716)
at
org.jboss.cache.RPCManagerImpl.callRemoteMethods(RPCManagerImpl.java:
721)
at
org.mobicents.slee.runtime.activity.ActivityContextCacheData.setEnding(ActivityContextCacheData.java:
166)
at
org.mobicents.slee.runtime.activity.ActivityContextImpl.endActivity(ActivityContextImpl.java:
494)
at
org.mobicents.slee.resource.SleeEndpointImpl._endActivity(SleeEndpointImpl.java:
278)
at
org.mobicents.slee.resource.SleeEndpointEndActivityNotTransactedExecutor.execute(SleeEndpointEndActivityNotTransactedExecutor.java:
41)
at
org.mobicents.slee.resource.SleeEndpointImpl.endActivity(SleeEndpointImpl.java:
240)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.sendActivityEndEvent(SipResourceAdaptor.java:
807)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.processTransactionTerminated(SipResourceAdaptor.java:
750)
at
org.mobicents.slee.resource.sip11.SipResourceAdaptor.processTransactionTerminated(SipResourceAdaptor.java:
737)
at gov.nist.javax.sip.EventScanner.deliverEvent(EventScanner.java:
419)
at
gov.nist.javax.sip.SipProviderImpl.handleEvent(SipProviderImpl.java:
193)
at
gov.nist.javax.sip.stack.SIPTransactionStack.removeTransaction(SIPTransactionStack.java:
1655)
at
gov.nist.javax.sip.stack.SIPServerTransaction.cleanUp(SIPServerTransaction.java:
1873)
at gov.nist.javax.sip.stack.SIPServerTransaction
$1.runTask(SIPServerTransaction.java:1586)
at gov.nist.javax.sip.stack.timers.DefaultSipTimer
$DefaultTimerTask.run(DefaultSipTimer.java:61)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
> > > username="9789912009",realm="DeargeertCbr",nonce="cbr36d3a1d2b64f8917875887­­e48c1f",uri="sip:sip.broadvoice.com",response="e04cc0065f9d4122442f7a24e9b­8­be2a",cnonce="85b253f95577778c0a156ed4f2efba31",nc=00000003,qop=auth,algo­ri­thm=MD5
> > > [javax.slee.RAEntityNotification[entity=SipRA].SipResourceAdaptor]- Hide quoted text -

Eduardo Martins

unread,
Jul 9, 2010, 7:32:10 PM7/9/10
to mobicent...@googlegroups.com
Dunno if this can be the issue, start first node just with "-c all" (
same as "-c all -Djboss.messaging.ServerPeerID=1
-Djboss.service.binding.set=ports-default") and second with "-c all
-Djboss.messaging.ServerPeerID=2 -Djboss.service.binding.set=ports-01"

By the way, do you see the RA deployed and activated in both nodes?

-- Eduardo

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
http://emmartins.blogspot.com
http://www.redhat.com/solutions/telco

Randy Thomas

unread,
Jul 12, 2010, 10:35:06 AM7/12/10
to mobicents-public
Eduardo,
I'm not running these two nodes on same machine.

On Jul 9, 7:32 pm, Eduardo Martins <emmart...@gmail.com> wrote:
> Dunno if this can be the issue, start first node just with "-c all" (
> same as "-c all -Djboss.messaging.ServerPeerID=1
> -Djboss.service.binding.set=ports-default") and second with "-c all
> -Djboss.messaging.ServerPeerID=2 -Djboss.service.binding.set=ports-01"
>
> By the way, do you see the RA deployed and activated in both nodes?
>
> -- Eduardo
>
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 161)
> >        at
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 135)
> >        at
> > org.jboss.cache.interceptors.BaseRpcInterceptor.replicateCall(BaseRpcInterc­eptor.java:
> > 107)
> >        at
> > org.jboss.cache.interceptors.ReplicationInterceptor.handleCrudMethod(Replic­ationInterceptor.java:
> > 160)
> >        at
> > org.jboss.cache.interceptors.ReplicationInterceptor.visitPutDataMapCommand(­ReplicationInterceptor.java:
> > 113)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at
> > org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(­CommandInterceptor.java:
> > 116)
> >        at
> > org.jboss.cache.interceptors.base.CommandInterceptor.handleDefault(CommandI­nterceptor.java:
> > 131)
> >        at
> > org.jboss.cache.commands.AbstractVisitor.visitPutDataMapCommand(AbstractVis­itor.java:
> > 60)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at
> > org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(­CommandInterceptor.java:
> > 116)
> >        at
> > org.jboss.cache.interceptors.TxInterceptor.attachGtxAndPassUpChain(TxInterc­eptor.java:
> > 301)
> >        at
> > org.jboss.cache.interceptors.TxInterceptor.handleDefault(TxInterceptor.java­:
> > 283)
> >        at
> > org.jboss.cache.commands.AbstractVisitor.visitPutDataMapCommand(AbstractVis­itor.java:
> > 60)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at
> > org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(­CommandInterceptor.java:
> > 116)
> >        at
> > org.jboss.cache.interceptors.CacheMgmtInterceptor.visitPutDataMapCommand(Ca­cheMgmtInterceptor.java:
> > 97)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at
> > org.jboss.cache.interceptors.base.CommandInterceptor.invokeNextInterceptor(­CommandInterceptor.java:
> > 116)
> >        at
> > org.jboss.cache.interceptors.InvocationContextInterceptor.handleAll(Invocat­ionContextInterceptor.java:
> > 178)
> >        at
> > org.jboss.cache.interceptors.InvocationContextInterceptor.visitPutDataMapCo­mmand(InvocationContextInterceptor.java:
> > 64)
> >        at
> > org.jboss.cache.commands.write.PutDataMapCommand.acceptVisitor(PutDataMapCo­mmand.java:
> > 104)
> >        at
> > org.jboss.cache.interceptors.InterceptorChain.invoke(InterceptorChain.java:
> > 287)
> >        at
> > org.jboss.cache.invocation.CacheInvocationDelegate.invokePut(CacheInvocatio­nDelegate.java:
> > 705)
> >        at
> > org.jboss.cache.invocation.CacheInvocationDelegate.put(CacheInvocationDeleg­ate.java:
> > 519)
> >        at
> > org.jboss.cache.invocation.NodeInvocationDelegate.addChild(NodeInvocationDe­legate.java:
> > 337)
> >        at
> > org.mobicents.slee.runtime.activity.ActivityContextCacheData.setEnding(Acti­vityContextCacheData.java:
> > 166)
> >        at
> > org.mobicents.slee.runtime.activity.ActivityContextImpl.endActivity(Activit­yContextImpl.java:
> > 494)
> >        at
> > org.mobicents.slee.resource.SleeEndpointImpl._endActivity(SleeEndpointImpl.­java:
> > 278)
> >        at
> > org.mobicents.slee.resource.SleeEndpointEndActivityNotTransactedExecutor.ex­ecute(SleeEndpointEndActivityNotTransactedExecutor.java:
> > 41)
> >        at
> > org.mobicents.slee.resource.SleeEndpointImpl.endActivity(SleeEndpointImpl.j­ava:
> > 240)
> >        at
> > org.mobicents.slee.resource.sip11.SipResourceAdaptor.sendActivityEndEvent(S­ipResourceAdaptor.java:
> > 807)
> >        at
> > org.mobicents.slee.resource.sip11.SipResourceAdaptor.processTransactionTerm­inated(SipResourceAdaptor.java:
> > 750)
> >        at
> > org.mobicents.slee.resource.sip11.SipResourceAdaptor.processTransactionTerm­inated(SipResourceAdaptor.java:
> > 737)
> >        at gov.nist.javax.sip.EventScanner.deliverEvent(EventScanner.java:
> > 419)
> >        at
> > gov.nist.javax.sip.SipProviderImpl.handleEvent(SipProviderImpl.java:
> > 193)
> >        at
> > gov.nist.javax.sip.stack.SIPTransactionStack.removeTransaction(SIPTransacti­onStack.java:
> > 1655)
> >        at
> > gov.nist.javax.sip.stack.SIPServerTransaction.cleanUp(SIPServerTransaction.­java:
> >> > > 192.168.231.105:7900- Hide quoted text -

Eduardo Martins

unread,
Jul 12, 2010, 10:39:45 AM7/12/10
to mobicent...@googlegroups.com
In that case start both with -c all -b ip only.

-- Eduardo

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
http://emmartins.blogspot.com
http://www.redhat.com/solutions/telco

Randy Thomas

unread,
Jul 12, 2010, 12:20:28 PM7/12/10
to mobicents-public
Eduardo,
When I started with those options, all nodes seem to discover each
other (took a while though).
I'm not understranding what's dictating to which node becomes merge
leader. For ex, I started 105 (192.168.231.105) and started 106 and 91
later but 91 became merge leader.
Following is the cut&paste from cluster.log (105).

mcast_sock: bound to 192.168.231.105:45588, send buffer size=640000,
receive buffer size=25000000
2010-07-12 11:32:18,802 WARN [org.jgroups.protocols.UDP] failed to
join /224.0.75.75:7500 on net3: java.net.SocketException: Unrecognized
Windows Sockets error: 0: no Inet4Address associated with interface
2010-07-12 11:32:18,803 WARN [org.jgroups.protocols.UDP] failed to
join /224.0.75.75:7500 on net4: java.net.SocketException: Unrecognized
Windows Sockets error: 0: no Inet4Address associated with interface
2010-07-12 11:32:18,808 DEBUG [org.jgroups.protocols.UDP] created
unicast receiver thread Thread[UDP ucast,
192.168.231.105:61031,5,JGroups]
2010-07-12 11:32:18,808 DEBUG [org.jgroups.protocols.UDP] created
multicast receiver thread Thread[UDP mcast,
192.168.231.105:61031,10,JGroups]
2010-07-12 11:32:20,816 DEBUG [org.jgroups.protocols.pbcast.GMS]
initial_mbrs are []
2010-07-12 11:32:20,816 DEBUG [org.jgroups.protocols.pbcast.GMS] no
initial members discovered: creating group as first member
2010-07-12 11:32:20,816 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=192.168.231.105:61031] view is [192.168.231.105:61031|0]
[192.168.231.105:61031]
2010-07-12 11:32:20,816 DEBUG [org.jgroups.protocols.FD_SOCK]
VIEW_CHANGE received: [192.168.231.105:61031]
2010-07-12 11:32:20,822 DEBUG [org.jgroups.protocols.pbcast.FLUSH]
Installing view at 192.168.231.105:61031 view is
[192.168.231.105:61031|0] [192.168.231.105:61031]
2010-07-12 11:32:20,822 DEBUG [org.jgroups.protocols.pbcast.FLUSH] At
192.168.231.105:61031 received STOP_FLUSH, unblocking FLUSH.down() and
sending UNBLOCK up
2010-07-12 11:32:20,822 DEBUG [org.jgroups.protocols.pbcast.GMS]
192.168.231.105:61031 changed role to
org.jgroups.protocols.pbcast.CoordGmsImpl
2010-07-12 11:32:20,822 DEBUG [org.jgroups.protocols.pbcast.GMS]
created group (first member). My view is [192.168.231.105:61031|0],
impl is org.jgroups.protocols.pbcast.CoordGmsImpl
2010-07-12 11:42:54,817 DEBUG [org.jgroups.protocols.pbcast.STABLE]
suspending message garbage collection
2010-07-12 11:42:54,818 DEBUG [org.jgroups.protocols.pbcast.STABLE]
resume task started, max_suspend_time=200000
2010-07-12 11:42:54,818 DEBUG [org.jgroups.protocols.pbcast.GMS] Merge
participant 192.168.231.105:7900 got merge request from
192.168.231.91:7900
2010-07-12 11:42:54,818 DEBUG [org.jgroups.protocols.pbcast.GMS]
Suspending view handler at 192.168.231.105:7900
2010-07-12 11:42:54,819 DEBUG [org.jgroups.protocols.pbcast.GMS]
suspended ViewHandler at 192.168.231.105:7900
2010-07-12 11:42:54,820 DEBUG [org.jgroups.protocols.pbcast.GMS]
192.168.231.105:7900 got merge request from 192.168.231.91:7900,
merge_id=[192.168.231.91:7900|1278949374756]
2010-07-12 11:42:54,821 DEBUG [org.jgroups.protocols.pbcast.GMS]
response=GmsHeader[MERGE_RSP]: view=[192.168.231.105:7900|0]
[192.168.231.105:7900], digest=192.168.231.105:7900: [27 : 30 (30)],
merge_rejected=false, merge_id=[192.168.231.91:7900|1278949374756]
2010-07-12 11:42:54,824 DEBUG [org.jgroups.protocols.pbcast.GMS]
192.168.231.105:7900 responded to 192.168.231.91:7900,
merge_id=[192.168.231.91:7900|1278949374756]
2010-07-12 11:42:55,058 DEBUG [org.jgroups.protocols.pbcast.GMS]
view=MergeView::[192.168.231.91:7900|1] [192.168.231.91:7900,
subgroups=[[192.168.231.106:7900|0] [192.168.231.106:7900],
[192.168.231.91:7900|0] [192.168.231.91:7900], [192.168.231.105:7900|
0] [192.168.231.105:7900]], digest=192.168.231.106:7900: [15 : 18
(18)], 192.168.231.91:7900: [15 : 18 (18)], 192.168.231.105:7900:
[27 : 30 (30)]
2010-07-12 11:42:55,058 ERROR [org.jgroups.protocols.pbcast.NAKACK]
sender 192.168.231.91:7900 not found in xmit_table
2010-07-12 11:42:55,058 ERROR [org.jgroups.protocols.pbcast.NAKACK]
range is null
2010-07-12 11:42:55,058 ERROR [org.jgroups.protocols.pbcast.NAKACK]
sender 192.168.231.106:7900 not found in xmit_table
2010-07-12 11:42:55,058 ERROR [org.jgroups.protocols.pbcast.NAKACK]
range is null
2010-07-12 11:42:55,058 DEBUG [org.jgroups.protocols.pbcast.NAKACK]
existing digest: 192.168.231.105:7900: [27 : 31 (31)]
new digest: 192.168.231.106:7900: [15 : 18 (18)],
192.168.231.91:7900: [15 : 18 (18)], 192.168.231.105:7900: [27 : 30
(30)]
resulting digest: 192.168.231.106:7900: [15 : 18 (18)],
192.168.231.91:7900: [15 : 18 (18)], 192.168.231.105:7900: [27 : 31
(31)]
2010-07-12 11:42:55,058 DEBUG [org.jgroups.protocols.pbcast.GMS]
[local_addr=192.168.231.105:7900] view is MergeView::
[192.168.231.91:7900|1] [192.168.231.91:7900, 192.168.231.105:7900,
192.168.231.106:7900], subgroups=[[192.168.231.106:7900|0]
[192.168.231.106:7900], [192.168.231.91:7900|0] [192.168.231.91:7900],
[192.168.231.105:7900|0] [192.168.231.105:7900]]
2010-07-12 11:42:55,062 DEBUG [org.jgroups.protocols.FD_SOCK]
VIEW_CHANGE received: [192.168.231.91:7900, 192.168.231.105:7900,
192.168.231.106:7900]
2010-07-12 11:42:55,063 DEBUG [org.jgroups.protocols.pbcast.GMS]
192.168.231.105:7900 changed role to
org.jgroups.protocols.pbcast.ParticipantGmsImpl
2010-07-12 11:42:55,063 DEBUG [org.jgroups.protocols.pbcast.STABLE]
resuming message garbage collection
2010-07-12 11:42:55,097 DEBUG [org.jgroups.protocols.FD_SOCK]
determinePingDest()=192.168.231.106:7900
2010-07-12 11:42:55,099 DEBUG [org.jgroups.protocols.FD_SOCK]
ping_dest=192.168.231.106:7900, ping_sock=Socket[addr=/
192.168.231.106,port=56365,localport=54561],
cache={192.168.231.106:7900=192.168.231.106:56365,
192.168.231.91:7900=192.168.231.91:54161,
192.168.231.105:7900=192.168.231.105:54465}
2010-07-12 11:42:55,746 DEBUG [org.jgroups.protocols.MERGE2]
192.168.231.105:64384 found multiple coordinators:
[192.168.231.91:41882, 192.168.231.105:64384]; sending up MERGE event
2010-07-12 11:42:55,805 DEBUG [org.jgroups.protocols.pbcast.GMS]
Determining merge leader from coordinators: [192.168.231.91:41882,
192.168.231.105:64384]
2010-07-12 11:42:55,805 DEBUG [org.jgroups.protocols.pbcast.GMS] I
(192.168.231.105:64384) am not the merge leader, waiting for merge
leader (192.168.231.91:41882) to initiate merge
2010-07-12 11:42:59,820 DEBUG [org.jgroups.protocols.pbcast.GMS] At
192.168.231.105:7900 cancelling merge due to timer timeout (5000 ms)
2010-07-12 11:42:59,821 DEBUG [org.jgroups.protocols.pbcast.GMS]
cancelling merge (merge_id=[192.168.231.91:7900|1278949374756])

As you can see in the above logs , merge seems to fail because of
timeout, but I don't see re-tries of the same.


I went onto make a call to test replication, I see dialog on 105 (JMX)
but not on 106 or 91. I don't even see any log messages trying to
replicate anything.
In server.log I see "keep alive" messages periodically though.

2010-07-12 12:01:37,319 DEBUG [org.jgroups.protocols.FD]
(Timer-1,192.168.231.105:7900) sending are-you-alive msg to
192.168.231.106:7900 (own address=192.168.231.105:7900)


On Jul 12, 10:39 am, Eduardo Martins <emmart...@gmail.com> wrote:
> In that case start both with -c all -b ip only.
>
> -- Eduardo
>
> -:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-http://emmartins.blogspot.comhttp://www.redhat.com/solutions/telco
> >> >> > > 2010-07-09 10:23:33,037 WARN- Hide quoted text -

Eduardo Martins

unread,
Jul 13, 2010, 5:39:50 AM7/13/10
to mobicent...@googlegroups.com
What happens with 2 nodes in same host? I don't really know how to
help you with this, the timeout defined is much more than ever needed,
it can be some network issue or configuration.

-- Eduardo

-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-
http://emmartins.blogspot.com
http://www.redhat.com/solutions/telco

Reply all
Reply to author
Forward
0 new messages