MSS SIP over TCP problem

574 views
Skip to first unread message

Krzysztof Mioduszewski

unread,
Feb 8, 2013, 7:30:30 AM2/8/13
to mobicent...@googlegroups.com
Hi, 

I have encountered a really strange problem with the latest mss (mss-2.0.0.FINAL-jboss-as-7.1.2.Final-1349104459). I implemented a regular B2BUA and everything works fine in case of SIP over UDP communication for both legs. 
I've switched A leg transport to TCP. After server restart for about 4-5 minutes everything works fine - calls are established and disconnected properly. After these 4 minutes if B side is terminating an ongoing call (sends BYE over UDP). There's no BYE sent to A from mss. There's no error in mss logs. I can only see the following warnings:

13:46:08,783 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) [2] Old socket different than new socket on channel tcp:10.203.9.66:5060java.nio.channels.SocketChannel[closed] java.nio.channels.SocketChannel[connected local=/<mss_ip>:47172 remote=<scscf_url>/<scscf_ip>:5060]
13:46:08,784 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) Old socket local ip address null
13:46:08,785 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) Old socket remote ip address null
13:46:08,785 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) New socket local ip address /<mss_ip>:47172
13:46:08,785 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) New socket remote ip address <scscf_url>/<scscf_ip>:506
13:46:08,785 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-10) There was no exception for the retry mechanism so we keep going <scscf_ip>:5060

In the trace I can see that one TCP sessions is started before the initial INVITE is sent. This session is used for establishing the call (src ip: scscf_ip, src port: 58342, dest ip: mss_ip, dest_port: 5060). When the BYE over UDP is received another TCP session is established (src ip: mss_ip, src port: 47172, dest ip: scscf_ip, dest port: 5060). I can see this socket using netstat. But nothing is sent over it. (In the first 4 minutes after restart BYE is sent over this second socket).

I've also changed B leg transport to TCP and the situation is similar. First 4 minutes after restart everythings fine. After that while establishing a call the second INVITE (to B) is also not sent over open socket with similar log output.

Any advice how to solve this?

BR,
Krzysiek

Jose

unread,
Feb 8, 2013, 4:59:48 PM2/8/13
to mobicent...@googlegroups.com
Hello, Krzysztof,

The first TCP connection is between S-CSCF:58342 - MSS:5060 (this means INVITE is sent by the S-CSCF to MSS). The new TCP connection is between MSS:47172 - S-CSCF:5060. So, for some reason mss thinks that the outgoing BYE for the B leg must be forcefully sent to the port 5060 in the S-CSCF, instead of using the existing TCP connection. Perhaps the initial incoming INVITE misses some info.

Any change if the leg sending the BYE is in the TCP connection side?

If you switch to TCP in both legs, not even call establishment works after 4 minutes, no error message, timeout, etc?

I would try to set logging to DEBUG and see what's going on. A tcpdump would also help.

JM

Krzysztof Mioduszewski

unread,
Feb 12, 2013, 6:48:29 AM2/12/13
to mobicent...@googlegroups.com
Hi,

I'm sorry but I cannot post any tcpdumps (secutity issues). 
Ok, just to clarify, now I'm using only TCP on both legs, so the scenario is just a basic B2BUA over TCP. I've done some more testing, and it looks like there is some kind of timeout after a period of no calls which breakes the functionality of sending requests over new tcp connections.
After restarting the server I was setting up test calls, one after another with several seconds of pause between them. Everything was working fine. After about 8 minutes I stopped making calls. After next 2 minutes I tried to setup another call and it failed. MSS didn't send out the INVITE to B even though the new tcp session was up. Log output looked exactly the same (except for the ports of course). 

I've tried changing logging to DEBUG in logging.properties, but i don't see anything new. Adding BasicConfiguration.configure() at servlet init resulted in printing stdouts(and doubling some logs). You can find the current log output below. It is interesting that the INVITE which isn't send out is printed as a regular message. Some more ideas?

11:46:22,802 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326611 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - [2] Old socket different than new socket on channel tcp:<scscf_ip>:5060java.nio.channels.SocketChannel[closed] java.nio.channels.SocketChannel[connected local=/<mss_ip>:43950 remote=<scscf_url>/<scscf_ip>:5060]
11:46:22,802 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) [2] Old socket different than new socket on channel tcp:<scscf_ip>:5060java.nio.channels.SocketChannel[closed] java.nio.channels.SocketChannel[connected local=/<mss_ip>:43950 remote=<scscf_url>/<scscf_ip>:5060]
11:46:22,805 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326614 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - Old socket local ip address null
11:46:22,805 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) Old socket local ip address null
11:46:22,805 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326614 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - Old socket remote ip address null
11:46:22,805 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) Old socket remote ip address null
11:46:22,806 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326615 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - New socket local ip address /<mss_ip>:43950
11:46:22,806 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) New socket local ip address /<mss_ip>:43950
11:46:22,806 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326615 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - New socket remote ip address <scscf_url>/<scscf_ip>:5060
11:46:22,806 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) New socket remote ip address <scscf_url>/<scscf_ip>:5060
11:46:22,807 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326616 [SIP-TCP-Core-PipelineThreadpool-12] WARN gov.nist.javax.sip.stack.NioTcpMessageChannel  - There was no exception for the retry mechanism so we keep going tcp:<scscf_ip>:5060
11:46:22,807 WARN  [gov.nist.javax.sip.stack.NioTcpMessageChannel] (SIP-TCP-Core-PipelineThreadpool-12) There was no exception for the retry mechanism so we keep going tcp:<scscf_ip>:5060
11:46:22,811 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) 326620 [SIP-TCP-Core-PipelineThreadpool-12] INFO gov.nist.javax.sip.stack.SIPTransactionStack  - <message
11:46:22,811 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) from="<mss_ip>:5060"
11:46:22,811 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) to="<scscf_ip>:5060"
11:46:22,811 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) time="1360665982799"
11:46:22,811 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) isSender="true"
11:46:22,812 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) transactionId="z9hg4bk1d40cd22-4712-43e3-b9e4-9344389e1859_56e6eb3b_2423630489896848"
11:46:22,812 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) callId="5af01d475e894614ca5f1a49eed517dd@<mss_ip>"
11:46:22,812 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) firstLine="INVITE sip:<user>@<scscf_url>;user=phone;transport=tcp SIP/2.0"
11:46:22,812 INFO  [stdout] (SIP-TCP-Core-PipelineThreadpool-12) >
      
BR,
Krzysiek

Jose

unread,
Feb 12, 2013, 10:29:24 AM2/12/13
to mobicent...@googlegroups.com
Hello, Krzysztof,

The logs indicate that, from the point of view of the stack, the message was sent.
Have you checked with tcpdump that the message is not actually sent? Perhaps you need to keep open a pinhole in a firewall or router and that pinhole is closed after 8 minutes of inactivity.

Hope it helps,


JM

On Friday, 8 February 2013 13:30:30 UTC+1, Krzysztof Mioduszewski wrote:

Krzysztof Mioduszewski

unread,
Feb 13, 2013, 10:53:34 AM2/13/13
to mobicent...@googlegroups.com
Hi,

I was looking into tcpdump. As I wrote before the tcp session is established but nothing is sent over it. 
I can see that the session is established properly (handshake with syn, syn ack, ack). It looks exactly the same in case of the calls just after the restart which are ok.
I'll try to doublecheck all network elements on the path. But still it doesn't make much sense that after server restart everything works fine and breaks down after the short pause. If there would be some firewall involved I would assume that the tcp session wouldn't be established at all.

BR,
Krzysiek

Jose

unread,
Feb 15, 2013, 9:45:14 AM2/15/13
to mobicent...@googlegroups.com

Does the message appear in the tcpdump?

Please check the network path. Do you have NAT between the MSS server and the CSCF?. Every time the server is restarted, it would select a different source port so a new connection would be used and a new pinhole is created. As long as there is traffic, the pinhole is kept open. If the pinhole times out the NAT can simply drop the packets while keeping the connection open (that's the weird part). So yo would see messages in the MSS capture but you would not see the messages in the CSCF capture.

Without full logging or tcpdump captures is difficult to go beyond this.

Regards,

JM

Krzysztof Mioduszewski

unread,
Feb 19, 2013, 8:41:05 AM2/19/13
to mobicent...@googlegroups.com
No, the message does not appear in the tcpdump on the MSS side (CSCF also not). I've verified the network path. There isn't any NAT or FW between MSS and CSCF. Running out of options here, maybe I'll try to look into the MSS code which prints the above warnings. Do you have any other suggestions?

BR,
Krzysiek 

Jose

unread,
Feb 20, 2013, 10:44:41 AM2/20/13
to mobicent...@googlegroups.com

Running out of options indeed.

Try to setup gov.nist logging level to DEBUG and look for anything happening after the 8 minutes of activity ...

JM

Krzysztof Mioduszewski

unread,
Feb 25, 2013, 4:51:51 AM2/25/13
to mobicent...@googlegroups.com
Hi,

I've managed to find a workaround for this problem. The quick fix is to change the sip stack implementation to the version which isn't using NIO (jain-sip-ri-1.2.170.jar instead of jain-sip-ri-1.2.170-NIO.jar). After loading the module without NIO tcp works fine. So it looks like a problem in the NIO sip stack implementation. I've decided to do this after seeing DEBUG logs from the sip stack and browsing quickly through the sourcecode. 
Are there any performance issues which I should consider when switching to non-NIO version? Are there any downsides of using it with the current MSS?

Below you can find part of the log which motivated me to switch to non-NIO version. It looks like there's a problem with nioTcpMessageChannel.
16:56:07,020 DEBUG [gov.nist] (NioSelector-TCP-<mss_ip>/5060) gov.nist.javax.sip.stack.NioTcpMessageProcessor$ProcessorTask.write(NioTcpMessageProcessor.java:135) [Need to write something on nioTcpMessageChannel null socket java.nio.channels.SocketChannel[connected local=/<mss_ip>:44216 remote=<scscf_url>/<scscf_ip>:5060]]
16:56:07,020 DEBUG [gov.nist] (NioSelector-TCP-<mss_ip>/5060) gov.nist.javax.sip.stack.NioTcpMessageProcessor$ProcessorTask.write(NioTcpMessageProcessor.java:138) [Dead socketChanneljava.nio.channels.SocketChannel[connected local=/<mss_ip>:44216 remote=<scscf_url>/<scscf_ip>:5060] socket <scscf_url>/<scscf_ip>:5060]

BR,
Krzysiek

Jose

unread,
Mar 5, 2013, 5:18:26 AM3/5/13
to mobicent...@googlegroups.com
Hi, Krzysztof,

NIO is more performant in general.

Did something suspicious appear in the logs after 8 minutes of inactivity?

If you can provide more complete logs, consider opening an issue. If not, perhaps you can go for commercial TeleStax support (they can sign NDAs, etc.).

JM

Krzysztof Mioduszewski

unread,
Mar 11, 2013, 6:54:26 AM3/11/13
to mobicent...@googlegroups.com
Hi Jose,


BR,
Krzysztof
Reply all
Reply to author
Forward
0 new messages