[mobicents-public] MSS: ReentrantLock$NonfairSync memory leak

477 views
Skip to first unread message

Miha Polak

unread,
May 13, 2010, 2:54:04 AM5/13/10
to mobicents-public
Hello!
Recently we are dealing with some load testing of our application. It wasnt a big load but eventualy java runs out of memory. We did some profiling and found that some objects are not collected by GC and are constantly increasing (please see attached picture).
Since our application does not use reentrantlocks I would like to ask if anyone experienced such problem.
We use MSS stable release 1.0.

br, Miha
memory leak.JPG

Vladimir Ralev

unread,
May 13, 2010, 2:59:25 AM5/13/10
to mobicent...@googlegroups.com
Tell us generally what SS features your app uses. Also try to show the GC root and tree for those. And I think you should try this version, it may fix the problemhttp://hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807/

Miha

unread,
May 13, 2010, 4:08:40 AM5/13/10
to mobicents-public
Hello!
Features we use are nothing special. Making originating sip calls,
storing some data to sipsessions, handling doXXX methods. It is a B2UA
application, that uses B2UAHelper. Besides that we also have
sipsession listener and timer listener/timer service implemented.

I think that's mostly it.

I will soon post also some pictures of gc as required.

On 13 maj, 08:59, Vladimir Ralev <vladimir.ra...@gmail.com> wrote:
> Tell us generally what SS features your app uses. Also try to show the GC
> root and tree for those. And I think you should try this version, it may fix
> the problemhttp://hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807/
>

Miha Polak

unread,
May 13, 2010, 4:16:33 AM5/13/10
to mobicents-public
Here are also pics, hope they help.
GCrootReentrantLockNonfairSync.JPG
GCrootConcurrentHashMapHashEntry.JPG
GCrootConcurrentHashMapSegment.JPG
GCrootReentrantLockNonfairSync2.JPG

Vladimir Ralev

unread,
May 13, 2010, 4:18:41 AM5/13/10
to mobicent...@googlegroups.com
OK thanks. It is still highly advised to try the newer version. There have been A LOT of changes related to that.

Miha Polak

unread,
May 13, 2010, 4:22:55 AM5/13/10
to mobicent...@googlegroups.com
We will try to :)

2010/5/13 Vladimir Ralev <vladimi...@gmail.com>

Jernej Krmelj

unread,
May 13, 2010, 10:23:39 AM5/13/10
to mobicents-public
We tried to deploy newest version on Tomcat 6.0.20, but we had some
problems with this version of Tomcat. Is it possible to use higher
version of Tomcat with the newest version of MSS, for example 6.0.26?

On May 13, 10:18 am, Vladimir Ralev <vladimir.ra...@gmail.com> wrote:
> OK thanks. It is still highly advised to try the newer version. There have
> been A LOT of changes related to that.
>

Jean Deruelle

unread,
May 13, 2010, 10:26:04 AM5/13/10
to mobicent...@googlegroups.com
Too late we are in the process of releasing MSS 1.3.
We will upgrade to latest tomcat version for MSS 1.4, a snapshot with this version should be avaiable next week. we will ping you back then

Jernej Krmelj

unread,
May 14, 2010, 5:11:32 AM5/14/10
to mobicents-public
Hi,

anyway we run the same load test on a MSS snapshot you advised (http://
hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807) deployed
on Tomcat 6.0.26. The results are the same, the same memory leak is
encountered.

It seems that there are known issues when using
java.util.concurrent.locks.ReentrantLock (for example
http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=3fe16e2735cd530986278fe92ea5?bug_id=6539568),
so maybe this could be the cause of our problems.

We are using java 1.6.0_16.

I will try to post some additonal GC roots of the objects, if they
would help analyze the problem.

On May 13, 4:26 pm, Jean Deruelle <jean.derue...@gmail.com> wrote:
> Too late we are in the process of releasing MSS 1.3.
> We will upgrade to latest tomcat version for MSS 1.4, a snapshot with this
> version should be avaiable next week. we will ping you back then
>

Jernej Krmelj

unread,
May 14, 2010, 5:29:28 AM5/14/10
to mobicents-public
Hi,

anyway we run the same load test on a MSS snapshot you advised (http://
hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807) deployed
on Tomcat 6.0.26. The results are the same, the same memory leak is
encountered.

It seems that there are known issues when using
java.util.concurrent.locks.ReentrantLock (for example
http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=3fe16e2735cd530986278fe92ea5?bug_id=6539568),
so maybe this could be the cause of our problems.

We are using java 1.6.0_16.

I will try to post some additonal GC roots of the objects, if they
would help analyze the problem.

On May 13, 4:26 pm, Jean Deruelle <jean.derue...@gmail.com> wrote:
> Too late we are in the process of releasing MSS 1.3.
> We will upgrade to latest tomcat version for MSS 1.4, a snapshot with this
> version should be avaiable next week. we will ping you back then
>

Jernej Krmelj

unread,
May 14, 2010, 5:32:20 AM5/14/10
to mobicents-public
additional pics....

2010/5/14 Jernej Krmelj <krmel...@gmail.com>



--
lp, .korny
GCrootReentrantLockNonfairSync3.JPG
GCrootReentrantLockNonfairSync4.JPG

Vladimir Ralev

unread,
May 14, 2010, 5:39:18 AM5/14/10
to mobicent...@googlegroups.com
OK, we expect certain handles of this type to stay in memory for 30-40 seconds and move into the older generation. Can you stop the load and wait for the calls to finish, then wait another 40 seconds then perform full GC from the profiler, and see if anything is collected. It may take a couple of full GCs to collect completely.

Jernej Krmelj

unread,
May 14, 2010, 6:33:13 AM5/14/10
to mobicent...@googlegroups.com
After stop of traffic and wait for about 30 min, there is no change on number of those object. Even if I invoke full GC, the number of objects stays the same afterwards.

2010/5/14 Vladimir Ralev <vladimi...@gmail.com>



--
lp, .korny

Vladimir Ralev

unread,
May 14, 2010, 12:32:01 PM5/14/10
to mobicent...@googlegroups.com
OK I see. If you make a single call does it leak, or it happens only under load?

Jernej Krmelj

unread,
May 17, 2010, 3:14:58 AM5/17/10
to mobicent...@googlegroups.com
Yes, even if I make a single call (invite, 180, 183, 200 OK, BYE, 200 OK), objects retain in memory. With every call the same number of objects is added to the number of retained objects.

2010/5/14 Vladimir Ralev <vladimi...@gmail.com>



--
lp, .korny

Vladimir Ralev

unread,
May 17, 2010, 9:49:44 AM5/17/10
to mobicent...@googlegroups.com
Do you ACK the INVITE OK? That might account for a leaking transaction. We make load tests for all types of apps and don't see such problems.

Vladimir Ralev

unread,
May 17, 2010, 9:50:32 AM5/17/10
to mobicent...@googlegroups.com
If you have a simple app that reproduces the problem we'd be glad to look at it.

Jernej Krmelj

unread,
May 17, 2010, 10:06:56 AM5/17/10
to mobicent...@googlegroups.com
In the scenario we test, MSS is the initiator of SIP session. MSS sends invite and receives responses. The same happens with BYE. Test scenario is part of a complex application, so we will try to prepare a simple application for analysis.

Meanwhile: I found similar problem on http://forums.java.net/jive/message.jspa?messageID=280897 , but no solution is described.

2010/5/17 Vladimir Ralev <vladimi...@gmail.com>

Jean Deruelle

unread,
May 17, 2010, 2:30:37 PM5/17/10
to mobicent...@googlegroups.com
Can you check if after 3-4 min (3 minutes is the expiration time of the sip application session) the objects used during the call are Garbage collected ?

Miha Polak

unread,
May 18, 2010, 3:13:41 AM5/18/10
to mobicent...@googlegroups.com
We do ack invite. In message before Jernej forgot to write down the message.

2010/5/17 Vladimir Ralev <vladimi...@gmail.com>

Jernej Krmelj

unread,
May 18, 2010, 7:36:52 AM5/18/10
to mobicent...@googlegroups.com
Now we test a simple one call scenario: MSS sends invite, but there is no response. A few invite send are repeated before "408 Request timeout" occurs after 30s. SIP session is not destroyed at that moment. After for about 3 minutes an exceptions happens on MSS:
Exception in thread "Timer-0" java.lang.NullPointerException
        at org.mobicents.servlet.sip.startup.SipStandardContext.enterSipApp(SipStandardContext.java:1133)
        at org.mobicents.servlet.sip.core.timers.StandardSasTimerTask.tryToExpire(StandardSasTimerTask.java:75)
        at org.mobicents.servlet.sip.core.timers.StandardSasTimerTask.run(StandardSasTimerTask.java:69)
        at java.util.TimerThread.mainLoop(Timer.java:512)
        at java.util.TimerThread.run(Timer.java:462)

No SIP application session expiration callback is invoked.

I guess this exception means that something went wrong when SIP Application session expiration is invoked. Of course objects still retain in memory.

Be aware that we still use your MSS snapshot you advised (http://hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807) on top of Tomcat 6.0.26, so this maybe is the cause of the problem.

We test the same scenario on MSS stable release 1.0 and there was no problem with SIP application session expiration and destruction. But still some objects retain in memory although their number is much more lower.

2010/5/17 Jean Deruelle <jean.d...@gmail.com>

Jernej Krmelj

unread,
May 18, 2010, 8:46:20 AM5/18/10
to mobicents-public
After moving from SipSession to SipApplicationSession concurrency
mode, after 3 minutes there is no exception anymore and
SipApplicationSession expires and destroys normally. But with
repeating earlier mentioned simple one call scenario, the number of in
memory retained object is slowly increasing although full GC is
invoked repeatedly.

On May 18, 1:36 pm, Jernej Krmelj <krmeljer...@gmail.com> wrote:
> Now we test a simple one call scenario: MSS sends invite, but there is no
> response. A few invite send are repeated before "408 Request timeout" occurs
> after 30s. SIP session is not destroyed at that moment. After for about 3
> minutes an exceptions happens on MSS:
> Exception in thread "Timer-0" java.lang.NullPointerException
>         at
> org.mobicents.servlet.sip.startup.SipStandardContext.enterSipApp(SipStandardContext.java:1133)
>         at
> org.mobicents.servlet.sip.core.timers.StandardSasTimerTask.tryToExpire(StandardSasTimerTask.java:75)
>         at
> org.mobicents.servlet.sip.core.timers.StandardSasTimerTask.run(StandardSasTimerTask.java:69)
>         at java.util.TimerThread.mainLoop(Timer.java:512)
>         at java.util.TimerThread.run(Timer.java:462)
>
> No SIP application session expiration callback is invoked.
>
> I guess this exception means that something went wrong when SIP Application
> session expiration is invoked. Of course objects still retain in memory.
>
> Be aware that we still use your MSS snapshot you advised (http://
> hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807) on top of
> Tomcat 6.0.26, so this maybe is the cause of the problem.
>
> We test the same scenario on MSS stable release 1.0 and there was no problem
> with SIP application session expiration and destruction. But still some
> objects retain in memory although their number is much more lower.
>
> 2010/5/17 Jean Deruelle <jean.derue...@gmail.com>
>
> > Can you check if after 3-4 min (3 minutes is the expiration time of the sip
> > application session) the objects used during the call are Garbage collected
> > ?
>
> > On Mon, May 17, 2010 at 9:14 AM, Jernej Krmelj <krmeljer...@gmail.com>wrote:
>
> >> Yes, even if I make a single call (invite, 180, 183, 200 OK, BYE, 200 OK),
> >> objects retain in memory. With every call the same number of objects is
> >> added to the number of retained objects.
>
> >> 2010/5/14 Vladimir Ralev <vladimir.ra...@gmail.com>
>
> >>> OK I see. If you make a single call does it leak, or it happens only
> >>> under load?
>
> >>> On Fri, May 14, 2010 at 1:33 PM, Jernej Krmelj <krmeljer...@gmail.com>wrote:
>
> >>>> After stop of traffic and wait for about 30 min, there is no change on
> >>>> number of those object. Even if I invoke full GC, the number of objects
> >>>> stays the same afterwards.
>
> >>>> 2010/5/14 Vladimir Ralev <vladimir.ra...@gmail.com>
>
> >>>> OK, we expect certain handles of this type to stay in memory for 30-40
> >>>>> seconds and move into the older generation. Can you stop the load and wait
> >>>>> for the calls to finish, then wait another 40 seconds then perform full GC
> >>>>> from the profiler, and see if anything is collected. It may take a couple of
> >>>>> full GCs to collect completely.
>
> >>>>> On Fri, May 14, 2010 at 12:32 PM, Jernej Krmelj <krmeljer...@gmail.com
> >>>>> > wrote:
>
> >>>>>> additional pics....
>
> >>>>>> 2010/5/14 Jernej Krmelj <krmeljer...@gmail.com>
>
> >>>>>>  Hi,
>
> >>>>>>> anyway we run the same load test on a MSS snapshot you advised
> >>>>>>> (http://
> >>>>>>> hudson.jboss.org/hudson/job/MobicentsSipServletsRelease/807)
> >>>>>>> deployed
> >>>>>>> on Tomcat 6.0.26. The results are the same, the same memory leak is
> >>>>>>> encountered.
>
> >>>>>>> It seems that there are known issues when using
> >>>>>>> java.util.concurrent.locks.ReentrantLock (for example
>
> >>>>>>>http://bugs.sun.com/bugdatabase/view_bug.do;jsessionid=3fe16e2735cd53...
> >>>>>>> ),

Jean Deruelle

unread,
May 18, 2010, 8:59:56 AM5/18/10
to mobicent...@googlegroups.com

Jean Deruelle

unread,
May 18, 2010, 9:03:51 AM5/18/10
to mobicent...@googlegroups.com
Can you check through a profiler which objects remains in memory after the sip app session expires and GC is forced ?

Jernej Krmelj

unread,
May 18, 2010, 9:40:33 AM5/18/10
to mobicent...@googlegroups.com
Attached picture shows the number of retained objects after 17th repetition of a simple one call scenario and forced full GC invocation. Pay attention to the number of retained objects as they are a multiple of 17.

2010/5/18 Jean Deruelle <jean.d...@gmail.com>
RetainedObjects.JPG

Jean Deruelle

unread,
May 18, 2010, 9:47:23 AM5/18/10
to mobicent...@googlegroups.com
do you see any gov.nist or org.mobicents related objects being retained ?

Jean Deruelle

unread,
May 18, 2010, 9:49:44 AM5/18/10
to mobicent...@googlegroups.com
Sorry it went out too fast


do you see any gov.nist or org.mobicents related objects being retained ?
Can you check the path to Root GC for the Semaphore, ConcurrentLinkedQueue and CopyOnWrite... objects ?

Thanks 
jean

Jernej Krmelj

unread,
May 18, 2010, 10:23:32 AM5/18/10
to mobicent...@googlegroups.com
Oh, sorry, previous list of retained objects was filtered. Now I attach more detailed list and GC roots of objects you wanted.

2010/5/18 Jean Deruelle <jean.d...@gmail.com>



--
lp, .korny
RetainedObjects1.JPG
RetainedObjects2.JPG
RetainedObjects3.JPG
GCrootSemaphore1.JPG
GCrootSemaphore2.JPG
GCrootSemaphore3.JPG
GCrootConcLinkedQueue1.JPG
GCrootConcLinkedQueue2.JPG
GCrootCopyOnWriteArrayList1.JPG

Jean Deruelle

unread,
May 18, 2010, 10:43:09 AM5/18/10
to mobicent...@googlegroups.com
RetainedObjects3.JPG clearly shows that the SipApplicationSessionImpl and SipSessionImpl are retained.
I also see that there is still 34 SIP Client transactions alive which is weird the SIP Client transaction should either go to terminated state or timeout and thus be cleaned.

You seem to be able to reproduce it easily so could you try this ?

Run the same test for one call only with logs in DEBUG mode (both for jain sip and MSS) and wait for 4 minutes *AFTER THE END* of the call, then force GC in the profiler and see if objects stay in memory in the profiler.
Please collect the logs and send them here.

PS : If you have an application reproducing the issue that we could try that would be great

Thanks
Jean

Jernej Krmelj

unread,
May 19, 2010, 5:02:59 AM5/19/10
to mobicent...@googlegroups.com
Logs are attached.

For navigation:
1. 19-05-2010 10:34:23: Sip application session and sip session are created, send of first INVITE happens
2. 19-05-2010 10:34:55: 408 Request timeout is received from mobicents
3. 19-05-2010 10:37:23: Sip application session expires and destroys, sip session is destroyed
4. full GC is invoked: still one SipApplicationSessionImpl and SipSessionImpl objects retain in memory, two gov.nist.javax.sip.stack.SIPClientTransaction objects also


2010/5/18 Jean Deruelle <jean.d...@gmail.com>
logs.tgz

Jean Deruelle

unread,
May 19, 2010, 9:41:52 AM5/19/10
to mobicent...@googlegroups.com
Can you check if you don't hold a ref to a request or response, sip session or sip app session in your application because I just tried with the attached application against the current trunk and I don't see any leak with jprofiler 6.

Please check the code of the attached application to see if it does the same stuff as your leaking application

Thanks
Jean
timeout-leak-servlet.zip

Miha Polak

unread,
May 19, 2010, 2:47:56 PM5/19/10
to mobicent...@googlegroups.com

There is a bit difference in our app and your example.
We also call req.pushRoute() with some address and we attach to the created SipSession  an attribute:
req.getSession().setAttribute("Cancel", req.createCancel());
Tomorrow we plan to investigate the differences. Especially the req.createCancel() could mean for MSS a new transaction that does not end since sometimes on cancel we sometimes never call cancel.send() and of course therefore we receive no response for it. Would you consider the idea of creating a cancel and not sending it the best position to start the investigation or do you recommend to try with setAttribute() or pushRoute?

Br, Miha



2010/5/19 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
May 19, 2010, 3:00:52 PM5/19/10
to mobicent...@googlegroups.com
Right this is the issue, the createCancel() does create a new Client Transaction indeed.
I will check if we can delay the client transaction creation until the send method.

In the meantime can you comment out this part and retry to profile to check if everything is reclaimed correctly ?
req.getSession().setAttribute("Cancel", req.createCancel());

Thanks
Jean

Jean Deruelle

unread,
May 19, 2010, 3:24:50 PM5/19/10
to mobicent...@googlegroups.com
I created an issue to track it down

Jernej Krmelj

unread,
May 19, 2010, 3:58:13 PM5/19/10
to mobicent...@googlegroups.com
Is it possible to prepare a MSS snapshot release for Tomcat with this issue fix included?

And of course, thanks for your effort.

2010/5/19 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
May 19, 2010, 4:16:56 PM5/19/10
to mobicent...@googlegroups.com
Our hudson is currently down but one should be available before the end of week if there is no regression to the MSS testsuite and JSR 289 TCK passes with this fix.

Miha Polak

unread,
May 20, 2010, 6:26:26 AM5/20/10
to mobicent...@googlegroups.com
Is it possible to predict similar problems in case any other requests created and not sent?
Could  eg. SipServletResponse.createAck() cause similar problems?

And thanks for the effort also from my side.
Please let us know when snapshot for tomcat will be prepared.

Miha Polak

unread,
May 20, 2010, 9:24:40 AM5/20/10
to mobicent...@googlegroups.com
Update:
We were able to compile MSS ourselves and run it on tomcat.
But we experience problem, when we receive error terminal response in invite transaction.
It goes like that:

20-05-2010 14:04:27.429 [UDPMessageChannelThread] INFO - [SipConferenceServlet:sendInvite] - Sending invite for call: 5b9516fb9e4b3379...@192.168.77.77

20-05-2010 14:04:28.184 [UDPMessageChannelThread] INFO - [SipConferenceServlet:doProvisionalResponse] - Received provisional response for call:
5b9516fb9e4b3379...@192.168.77.77

20-05-2010 14:04:28.255 [UDPMessageChannelThread] INFO - [SipConferenceServlet:doErrorResponse] - Received error response for call:
5b9516fb9e4b3379...@192.168.77.77


In doErrorResponse we use super.doErrorResponse() to ACK the original invite. But until application session does not expire SipSession is not destroyed.

20-05-2010 14:22:12.875 [Timer-0] TRACE - [SipConferenceServlet:sessionExpired] - Application Session expired.

20-05-2010 14:22:12.935 [Timer-0] DEBUG - [SessionListener:sessionDestroyed] - Session is destroyed. 5b9516fb9e4b3379...@192.168.77.77


If sip call is terminated by sending BYE session is destroyed in approximately 5 seconds..
Any idea?

2010/5/20 Miha Polak <miha....@gmail.com>

Vladimir Ralev

unread,
May 20, 2010, 9:42:14 AM5/20/10
to mobicent...@googlegroups.com
Error responses should be ACK-ed automatically by MSS. There is no need to ACK them from your app. Can you check if the ACK is already generated?

Miha Polak

unread,
May 20, 2010, 9:57:58 AM5/20/10
to mobicent...@googlegroups.com
No, ack is not generated at that moment. Ack in our app generates only if doSuccessResponse method is called.

Should we remove super.doErrorResponse() from doErrorResponse() method and MSS will still send ack?

2010/5/20 Vladimir Ralev <vladimi...@gmail.com>

Vladimir Ralev

unread,
May 20, 2010, 10:15:47 AM5/20/10
to mobicent...@googlegroups.com
Strange. It doesn't matter if you have super.doErrorResponse(). This would be happening way before that.

What kind of error response is it? A full log would help better.

I suspect may be the condidtion with error response after provisional response is not handled well, but not sure yet.

Vladimir Ralev

unread,
May 20, 2010, 10:26:12 AM5/20/10
to mobicent...@googlegroups.com
Just tested a couple of cases here and ACK was sent automatically with or with provisional responses for any error response. So we would need a full debug log, scenario or sample app.

Miha Polak

unread,
May 20, 2010, 10:52:20 AM5/20/10
to mobicent...@googlegroups.com
I attached tomcat's log. Hopefully it will help.
Scenario is following:
We do kind of click2dial. As soon as first sip session receives SDP (in 200 OK) we create a new invite trough invocation of
b2buaHelper.createRequest(origReq, true, headerMap); In that case we have second sip session.

Than 183 is returned for second sip session (with SDP). At that moment we send ACK for the first sip session.
And finally error is returned for second sip session - we tested either 503 or 404. When it happens we send bye for the first session and it is destroyed after a few seconds. For second session we only call super.doErrorResponse and some other non MSS related processing. However method doErrorResponse quickly returns.

We observe creation and termination of SipSession and SipApplicationSession with listeners.
Hope this helps.

We temporally (until we together find out better solution) added manual invalidation of SipSession in doErrorResponse. Hoping that MSS doesn't need it for anything.


2010/5/20 Vladimir Ralev <vladimi...@gmail.com>
Server.log

Vladimir Ralev

unread,
May 20, 2010, 11:15:17 AM5/20/10
to mobicent...@googlegroups.com
Everything looks fine in the log. The ACK is created and sent as expected. Can you verify that it is sent physically to sip:192.168.77.20:5060 using wireshark?

Also it is a bit weird that it has those two Route headers? Are you using a load balancer of some other proxy?

This is the response:

SIP/2.0 404 Not Found
CSeq: 1 INVITE
From: <sip:38...@192.168.77.20>;tag=35070940_8ade2a41_3413101898917
To: <sip:3000...@192.168.77.20:5060>;tag=0d0ywlwpun
Via: SIP/2.0/UDP 192.168.77.78:5080;branch=z9hG4bK494cc4dcbaaf20f5ce326f7dd1f991a3;appname=8ade2a41;app_id=3413101898917
Reason: Q.850;cause=1;text="Unallocated (unassigned) number"
Content-Length: 0

This is the ACK:
Max-Forwards: 70
From: <sip:38...@192.168.77.20>;tag=35070940_8ade2a41_3413101898917
To: <sip:3000...@192.168.77.20:5060>;tag=0d0ywlwpun
Via: SIP/2.0/UDP 192.168.77.78:5080;branch=z9hG4bK494cc4dcbaaf20f5ce326f7dd1f991a3;appname=8ade2a41;app_id=3413101898917
CSeq: 1 ACK
Route: <sip:192.168.77.20:5060;lr>
Route: <sip:192.168.77.20:5060;lr>
Content-Length: 0

Jean Deruelle

unread,
May 20, 2010, 2:06:55 PM5/20/10
to mobicent...@googlegroups.com
Sorry I'm on business trip for the day
Yes with session.createRequest for non dialog creating requests. I'm checking a fix right now.

Jean

Jean Deruelle

unread,
May 20, 2010, 2:08:40 PM5/20/10
to mobicent...@googlegroups.com
This is mandated by the SIP Servlets 1.1 spec, Section 6.2.1 Relationship to SIP Dialogs :
"3.If the servlet acts as a UAC and sends a dialog creating request, then the SipSession state tracks directly the SIP dialog state except that non-2XX final responses received in the EARLY or INITIAL states cause the SipSession state to return to the INITIAL state rather than going to TERMINATED."

so you may want to invalidate it yourself directly

Jean

Miha Polak

unread,
May 20, 2010, 3:01:42 PM5/20/10
to mobicent...@googlegroups.com
Thanks. I think that explains it all.
So just to wrap things also with Vladimir:

With Jean's explanation, I understand everything works as it should.
ACK is physically sent over network, but SipSession should not be terminated....

So we are ok.

And two route headers it is our problem we are currently aware of. We create a new request trough newRequest = B2UAHelper.createRequest(...), that copies route header from origRequest. And after that we pushRoute(address) (to be able to route it to different address as origRequest) on the new request. We will change the newRequest creation by using SipFactory.createRequest and explicitly link both SipSessions.

Br, Miha

2010/5/20 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
May 20, 2010, 4:41:10 PM5/20/10
to mobicent...@googlegroups.com
Fix Committed in r11898 http://code.google.com/p/mobicents/source/detail?r=11898
If you can rebuild from trunk, you should not have memory leak any longer, can you confirm ?

Thanks
Jean

Miha Polak

unread,
May 21, 2010, 7:01:19 AM5/21/10
to mobicent...@googlegroups.com
No leak found. We ran load test for almost 4 hours, no objects were left behind...

I have a question though:

Why is SipSession invalidated approximately 5 seconds after 200OK on sent BYE is received? I don't have problem with that it is just curiosity.

Br, Miha

2010/5/20 Jean Deruelle <jean.d...@gmail.com>
Fix Committed in r11898 http://code.google.com/p/mobicents/source/detail?r=11898

Vladimir Ralev

unread,
May 21, 2010, 7:13:14 AM5/21/10
to mobicent...@googlegroups.com
The session is invalidated when the transaction is terminated and the transaction is terminated a bit later to make sure retransmissions are not occuring. This is to handle cases such as proxy where after you receive the OK from phone A, it might get lost while delivering to the other end (phone B) and phone B will send a retransmission for the BYE, which will cause another retransmission of the OK from phone A again, and we must deliver this retransmission to the app. We could optimize very slighty when we know for sure the application is not a proxy, but it is not a significant gain to bother.

Miha Polak

unread,
May 24, 2010, 4:34:06 AM5/24/10
to mobicent...@googlegroups.com
Hello!
On Friday afternoon we started a load test over weekend. Today we stopped the load and waited for more than 5 minutes to be sure sip application session, sip sessions and transactions would time-out.


In the profiler we noticed approximately 500 transaction objects weren't cleared from memory even if we manually invoke gc.
But however, memory didn't ran out.

On the other hand some other case occurred. Before we stopped the load no messages from the network weren't passed to the servlet. Outgoing INVITE was sent to the network successfully, but returning 200OK (seen on Wireshark) was not passed to the servlet.

In sip-server.log there were lots of messages:

DEBUG UDPMessageProcessorThread Mobicents-SIP-Servlets - gov.nist.javax.sip.stack.UDPMessageProcessor.run(UDPMessageProcessor.java:215) [Dropping message -- queue length exceeded]

In the attachment there is a log of a single invite sent on network and no returning responses.

Since the tested application is a complex one with many possible scenarios and looking like two problems, the next step would be to find out what still causes transactions not to terminate and what causes responses and requests not to deliver to the servlet.

Any advice or comment would be appreciated.

Br, Miha



2010/5/21 Vladimir Ralev <vladimi...@gmail.com>
singlecall.txt

Miha Polak

unread,
May 24, 2010, 10:58:49 AM5/24/10
to mobicent...@googlegroups.com
While dealing with previously described problems we noticed a difference between MSS stable release 1.0 and current MSS we compiled from trunk. Previously doXXX() methods were invoked in the context of some thread pool thread. Now with current trunk we have all doXXX methods in servlet invoked in UDPMessageChannelThread. Is that normal or did we miss something when upgrading MSS?


2010/5/24 Miha Polak <miha....@gmail.com>

Vladimir Ralev

unread,
May 24, 2010, 11:37:27 AM5/24/10
to mobicent...@googlegroups.com
The message you are seeing implies that you have reached the JAIN SIP congestion control threshold (5000 transactions at the same time). To remove this you should edit jain sip properties and include this:

gov.nist.javax.sip.CONGESTION_CONTROL_ENABLED=false

This condition is very likely to occur under load, especially with longer call length, because for each call we must keep at least one transaction in the memory. Note that messages are dropped and thus some cleanup logic is not invoked which leads to further delays in releasing objects.

500 hanging transactions do not seem normal. The timers must kill them within a few minutes, but depending on your configuration this might take more than 5 minutes. Have you changes anything. Also the GC roots would help. Were there application sessions alive after the full GC?

Vladimir Ralev

unread,
May 24, 2010, 11:38:03 AM5/24/10
to mobicent...@googlegroups.com
It is normal. We changed the threading model a bit.

Jernej Krmelj

unread,
May 25, 2010, 6:12:34 AM5/25/10
to mobicent...@googlegroups.com
Further analysis showed today that some session, transactions and request are never collected from memory. We found a few scenarios when this happens:

1. callId: a461d57243e781f9d634ba85b09406a8
   sipSessionId: 27860644_8ade2a41_340019083127318:a461d57243e781f9...@192.168.77.31:340019083127318:si.iskratel.as.conference.SipConferenceService
  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) it seems 200 OK is also received (seen in attached log), but is never dispatched to servlet. Because no ACK is sent, retransmition of 200 OK is done several times
   d) servlet wants to terminate the call by creating CANCEL request. But when we try to invoke getB2buaHelper() on newly created request, we get following exception:
      java.lang.NullPointerException
        at org.mobicents.servlet.sip.message.SipFactoryImpl$NamesComparator.compare(SipFactoryImpl.java:97)
        at org.mobicents.servlet.sip.message.SipFactoryImpl$NamesComparator.compare(SipFactoryImpl.java:93)
        at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351)
        at java.util.TreeMap.getEntry(TreeMap.java:322)
        at java.util.TreeMap.containsKey(TreeMap.java:209)
        at java.util.TreeSet.contains(TreeSet.java:217)
        at org.mobicents.servlet.sip.message.SipServletRequestImpl.getB2buaHelper(SipServletRequestImpl.java:372)
                ...
        e) After 5 minutes sipSession destroyed callback is invoked, but session is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.

2. callId: bab438d148142144a00aca42f1547756
   sipSessionId: 63926527_8ade2a41_340019083127318:bab438d148142144...@192.168.77.31:340019083127318:si.iskratel.as.conference.SipConferenceService
  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 183 Session Progress is received and dispatched to servlet
   c) Another 183 Session Progress is received and dispatched to servlet
   d) Third 183 Session Progress is received and dispatched to servlet
   e) it seems 200 OK is also received (seen in attached log), but is never dispatched to servlet.
   f) Servlet tryes to send CANCEL request, but IllegalStateException happens, so BYE is sent instead.
   g) 200 OK for BYE is received and dispatched to servlet.
   h) After 5 minutes sipSession destroyed callback is invoked, but session is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
3. callId: 91d3df466ce1239f14bf008437a13c0f
     sipSessionId: 70446410_8ade2a41_340019083127318:91d3df466ce1239f...@192.168.77.31:340019083127318:si.iskratel.as.conference.SipConferenceService
     
     This one very strange as everything seems to work as expected, but after all, session is not cleaned from memory.
     
     Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
   d) ACK is sent after 2 seconds, meanwhile 200 OK retransmittions (seen in attached log) happen.
   e) servlet sends BYE
   f) 200 OK for BYE is received and dispatched to servlet.
   g) After 5 seconds session is invalidated and destroyed, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
4. callId: 5422c1f82797ce649c867d7a1225729c
   sipSessionID: 51692209_8ade2a41_340019083127318:5422c1f82797ce64...@192.168.77.31:340019083127318:si.iskratel.as.conference.SipConferenceService
  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) CANCEL request is sent from sip servlet (UAC) to sip server
   c) 408 Request timeout is dispatched to servlet after 32 seconds after CANCEL request
   d) Servlet invalidates sipSession, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
5. callId: dd63deb4f86b166ba724920088e6798f
   sipSessionId: 34992473_8ade2a41_340019083127318:dd63deb4f86b166b...@192.168.77.31:340019083127318:si.iskratel.as.conference.SipConferenceService
  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
   d) ACK is sent
   e) servlet sends BYE
   f) 408 Request timeout is dispatched to servlet after 32 seconds after BYE request
   g) Servlet invalidates sipSession, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  

If you need any additional info, profiler snapshot is saved, so information is available.

2010/5/24 Vladimir Ralev <vladimi...@gmail.com>
tomcat.tgz

Jernej Krmelj

unread,
May 25, 2010, 9:56:39 AM5/25/10
to mobicent...@googlegroups.com
We noticed there are a few threads running within MSS, all with "UDPMessageChannelThread" name. Wouldn't it be better if they are named distinctly, for example "UDPMessageChannelThread-1", "UDPMessageChannelThread-2"...? This would ease the debugging process very much.

2010/5/24 Vladimir Ralev <vladimi...@gmail.com>

Jean Deruelle

unread,
May 25, 2010, 11:42:02 AM5/25/10
to mobicent...@googlegroups.com
This comes from the underlying stack, I will consult with the nist sip community if they agree on that.

Jean Deruelle

unread,
May 25, 2010, 11:42:24 AM5/25/10
to mobicent...@googlegroups.com
Thanks for the detailed report, I will look into it

Jean Deruelle

unread,
May 25, 2010, 12:01:57 PM5/25/10
to mobicent...@googlegroups.com
I fixed the issue you see in 1.d) in the trunk (See http://code.google.com/p/mobicents/issues/detail?id=1480)

Jean Deruelle

unread,
May 25, 2010, 12:30:55 PM5/25/10
to mobicent...@googlegroups.com
Unfortunately the logs you sent have INFO Level for Mobicents SIP Servlets so I cannot see what happens in the container only what happens in the NIST SIP Stack.
Could change that to DEBUG for org.mobicents package and resend the logs corresponding to what you described below with updated callid and sipsessionId ?

Thanks in advance
Best regards
Jean 

Jernej Krmelj

unread,
May 26, 2010, 8:21:31 AM5/26/10
to mobicent...@googlegroups.com
It seems somehow that the cause of problems are transactions that are not ACKed regurarly, for eg. they become canceled or timeout occure. This is just our assumption.

We repeated traffic test and found some scenarios where session, transactions and request are never collected from memory. DEBUG level is set now for org.mobicents package in attached logs.

1. callId: cb65320a8fcba9745a2b24bd9183d582
   sessionID: 18420458_8ade2a41_439343716939004:cb65320a8fcba974...@192.168.77.31:439343716939004:si.iskratel.as.conference.SipConferenceService

  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) it seems 200 OK is also received (seen in attached log), but is never dispatched to servlet. Because no ACK is sent, retransmition of 200 OK is done several times.
   d) Servlet tryes to send CANCEL request, but IllegalStateException happens, so BYE is sent instead.
   e) 200 OK for BYE is received and dispatched to servlet.
   f) After 5 minutes sipSession destroyed callback is invoked, but session is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
   A question: should sessionReadyToInvalidate() callback also be invoked? Can this be the cause of the problem?
  
2. callId: 0f07a0b1d7a2453cc3f91dc010d44d87
   sessionID: 02217793_8ade2a41_439343716939004:0f07a0b1d7a2453c...@192.168.77.31:439343716939004:si.iskratel.as.conference.SipConferenceService  
  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) it seems 200 OK is also received (seen in attached log), but is never dispatched to servlet. Because no ACK is sent, retransmition of 200 OK is done several times.
   d) Servlet tryes to send CANCEL request, but IllegalStateException happens, so BYE is sent instead.
   e) 408 Request timeout is dispatched to servlet after 32 seconds after BYE request
   f) Servlet invalidates sipSession which is destroyed, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
   A question: should sessionReadyToInvalidate() callback also be invoked? Can this be the cause of the problem?
  
3. callId: 27f69fb712cbbb5f6b7405086b4f393a
   sessionID: 33471403_8ade2a41_439343716939004:27f69fb712cbbb5f...@192.168.77.31:439343716939004:si.iskratel.as.conference.SipConferenceService

  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
   d) ACK is NOT sent

   e) servlet sends BYE
   f) 408 Request timeout is dispatched to servlet after 32 seconds after BYE request
   g) Servlet invalidates sipSession which is destroyed, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
4. callId: 93517ad033533d79fa84f3e98c9ea18f
   sessionID: 75225833_8ade2a41_439343716939004:93517ad033533d79...@192.168.77.31:439343716939004:si.iskratel.as.conference.SipConferenceService


   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) CANCEL request is sent from sip servlet (UAC) to sip server
   c) 408 Request timeout is dispatched to servlet after 32 seconds after CANCEL request
   d) Servlet invalidates sipSession which is destroyed, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  

5. callId: d275194e5ed770a78bc03fde0a36c670
   sessionId: 68270731_8ade2a41_439343716939004:d275194e5ed770a7...@192.168.77.31:439343716939004:si.iskratel.as.conference.SipConferenceService

  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
   d) servlet sends ACK
   e) servlet sends REINVITE
   f) 503 Service Unavailable is received and dispatched to serlvet
   g) Servlet invalidates sipSession which is destroyed, but it is never cleaned from memory. Profiler shows that session's isValid atribute is set to false.
  
-----------------------------------
- We also noticed the retention of sipsession object of session which were ACKed regurarly and destroyed, but are linked to another sipsession which was not ACKed regurarly.
GC root of this kind of session is attached. A question here: Why a linked session does not become unlinked when it is destroyed?

- We also noticed that sometimes we catch IllegalStateException when we try to send ACK on reinvite. Mostly sending ACK on reinvite is successful but in this test we received the mentioned exception for example for call with callID af74574ac3105103d2e2df404db3f5f6.
What is the difference from the call with id dd84d1c1802c84b2a1b433dd39b52142 which is ACKed properly?
tomcat.rar
GCrootLinkedSession.JPG

Jean Deruelle

unread,
May 26, 2010, 12:56:59 PM5/26/10
to mobicent...@googlegroups.com
I started analyzing the logs for 1. and it seems there is something really weird happening 
180 and 200 are received in the correct order but are handled by 2 different thread
200 starts to be processed and acquire the semaphore on the tx then the 180 thread is scheduled and start processing and acquire the semaphore on the tx as well, 180 is delivered to MSS then to the app, then 200 OK processing resume and it seems it is a bit messed up and is dropped by the stack. 
Later on 200 OK retrans come into the stack and eventually one gets delivered to MSS but the stack is marked as retrans (tx is null) so MSS doesn't deliver it to the app
I will try to investigate more on this and talk with the nist sip stack guys to get more input.

In the meanwhile could you rerun the same test w/o the 180 to see if that helps fix the mem leak for 1. (and potentially 2.)
I will try to reproduce each case there and see why things are retained.

Jean

Jean Deruelle

unread,
May 26, 2010, 3:14:33 PM5/26/10
to mobicent...@googlegroups.com
Are you able to reproduce easily case number 1 ? Would it be possible to make a standalone app and provide a mean to reproduce ?
I added the modification to the NIST SIP Stack that name UDPMessageChannelThread distinctly by number.
You should modify your property gov.nist.javax.sip.TRACE_LEVEL in your mss-sip-stack.properties for this:  gov.nist.javax.sip.TRACE_LEVEL=LOG4J
and in your log4j.xml use the following conversion pattern <param name="ConversionPattern" value="%d %-5r %-5p [%c] (%t:%x) %m%n"/>

With this new configuration and JAIN SIP Stack, can you rerun 1. so that I can try to get more clues on what's going on because it seems the 180 Response was able to acquire the semaphore on the Client transaction even though the 200 OK acquired it and the fact that the 200 OK is dropped by the stack makes no sense, normally the listening point can never be null on dialogfilter.
Can you provide the JVM and OS used for the test as well + JVM options + Sip Stack props ?

Jean Deruelle

unread,
May 26, 2010, 4:10:43 PM5/26/10
to mobicent...@googlegroups.com
- We also noticed the retention of sipsession object of session which were ACKed regurarly and destroyed, but are linked to another sipsession which was not ACKed regurarly.
GC root of this kind of session is attached. A question here: Why a linked session does not become unlinked when it is destroyed?

The retention is due to linkedRequest from the snapshot you sent, but it seems the problem is that the client transaction was not removed from the stack for some reason which I will investigate
 
 
- We also noticed that sometimes we catch IllegalStateException when we try to send ACK on reinvite. Mostly sending ACK on reinvite is successful but in this test we received the mentioned exception for example for call with callID af74574ac3105103d2e2df404db3f5f6.
What is the difference from the call with id dd84d1c1802c84b2a1b433dd39b52142 which is ACKed properly?

I don't see any  IllegalStateException being thrown in the log

Jernej Krmelj

unread,
May 27, 2010, 6:50:53 AM5/27/10
to mobicent...@googlegroups.com
We run traffic for about a minute. Meanwhile 146 sessions are created and destroyed finally. 8 of them have the scenario from 1. and are never GCed. These sessions and linked callIDs are:

a77767a7fe0dbe83be96c26c027a02e6
53372654_8ade2a41_2937701788831:a77767a7fe0dbe83...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

0d348ee175884fb4b1aa966a7b078efc
90651008_8ade2a41_2937701788831:0d348ee175884fb4...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

4105d97fe061f3a74371975bbd100851
93648305_8ade2a41_2937701788831:4105d97fe061f3a7...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

45c0015c8d182af26242c10b619fbb0e
03963283_8ade2a41_2937701788831:45c0015c8d182af2...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

e284a815763bca0afae4eddbdb6de983
02175931_8ade2a41_2937701788831:e284a815763bca0a...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

574cec3e85c521d97f85d8490c2ad676
63772045_8ade2a41_2937701788831:574cec3e85c521d9...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

909e9118e4bf513d8b69234580346164
36387989_8ade2a41_2937701788831:909e9118e4bf513d...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService

5a21a8291838af40667a45f27782a5b8
58098256_8ade2a41_2937701788831:5a21a8291838af40...@192.168.77.31:2937701788831:si.iskratel.as.conference.SipConferenceService


Logs are attached, also JVM options, sip stack options and OS release.
tomcat.rar

Jean Deruelle

unread,
May 27, 2010, 10:14:02 AM5/27/10
to mobicent...@googlegroups.com
Just a note to myself, I found what lead to this problem

As said before there is a race condition between 180 and 200 responses handled by different threads.
They both call currentTransaction.setResponseInterface(sri); in SipTransactionStack.newSIPServerResponse().
But the 200 OK call it first then 180. when 180 has been processed the DialogFilter.listeningPoint is set to null in its finally clause in DialogFilter.processResponse
so when the 200 OK processing resume the DialogFilter of the Clienttransaction is the one of the 180 and its listeningPoint is null hence the dropping of the 200 OK.
The only problem left is :
The semaphore on the clienttransaction should guard this. I still can't see any reason why both responses can acquire it (number of permit is 1 and fairness is true on it)... I wonder if this could be a JVM bug
Could you try on the latest JDK, I believe this is JDK 1.6.20 ? In the meanwhile I'll keep digging

Thanks

Jean Deruelle

unread,
May 27, 2010, 10:54:29 AM5/27/10
to mobicent...@googlegroups.com
Attached to this mail is a new nist sip stack jar which has the semaphore (which is really a mutex) replaced by a reentrant lock.
Can you try with that and see if you observe the same behavior ?

Jean
jain-sip-ri-1.2.X-SNAPSHOT.jar

Jernej Krmelj

unread,
May 27, 2010, 11:43:26 AM5/27/10
to mobicent...@googlegroups.com
Good news: with newest nist sip stack the situation is much better. After 20 minutes of traffic, there is no session left belonging to the scenario we were investigating.

But still there are two sessions left belonging to the following scenario: INVITE->, 180<-, 200<-, ACK->, RE-INVITE->, 503 Service Unavailable<-, session destroy. We will prepare logs and send them.

I guess we are going the right way...

Jernej Krmelj

unread,
May 27, 2010, 11:47:40 AM5/27/10
to mobicent...@googlegroups.com
Oh, and previously java 1.6.20 did not do anything better.

Miha Polak

unread,
May 27, 2010, 1:14:47 PM5/27/10
to mobicent...@googlegroups.com
Good news spread fast.

Jernej, can you confirm, that you haven't tested the reboot of remote party. That produced scenarios 2,3 and 4.


2010/5/27 Jernej Krmelj <krmel...@gmail.com>

Miha Polak

unread,
May 27, 2010, 1:20:20 PM5/27/10
to mobicent...@googlegroups.com
 
- We also noticed that sometimes we catch IllegalStateException when we try to send ACK on reinvite. Mostly sending ACK on reinvite is successful but in this test we received the mentioned exception for example for call with callID af74574ac3105103d2e2df404db3f5f6.
What is the difference from the call with id dd84d1c1802c84b2a1b433dd39b52142 which is ACKed properly?

I don't see any  IllegalStateException being thrown in the log


Today we couldn't reproduce the exception.
We are not 100% sure we provided the right call id since all threads had at the time same name. We welcome the distinctive naming of different threads cause debugging is much easier and of course accurate. In case we manage reproduce the exception we will provide the logs.

Jean Deruelle

unread,
May 27, 2010, 4:02:21 PM5/27/10
to mobicent...@googlegroups.com
Can you retest with this jar https://repository.jboss.org/nexus/content/repositories/snapshots//javax/sip/jain-sip-ri/1.2.X-SNAPSHOT/jain-sip-ri-1.2.X-20100527.195640-188.jar ?
I had to make some modifications (lock wasn't working in non reentrant listener mode so I had to create a semaphore wrapper depending of the stackmode) to pass jain sip TCK + testsuite.
this jar passes the jain sip TCK + testsuite

Sorry about that and thanks for your time and detailed reports. This is highly appreciated.

Jean

Jernej Krmelj

unread,
May 27, 2010, 4:47:44 PM5/27/10
to mobicent...@googlegroups.com
The remote party was up and alive here all the time. No 408 Request timeout scenarios were involved.

2010/5/27 Miha Polak <miha....@gmail.com>



--
lp, .korny

Jernej Krmelj

unread,
May 28, 2010, 5:32:10 AM5/28/10
to mobicent...@googlegroups.com
As already mentioned before situation after traffic test is now much better, but after 1 hour of traffic test we still encountered two retained sessions related to 408 Request timeout scenarios. Test were made with latest NIST SIP stack (jain-sip-ri-1.2.X-20100527.195640-188.jar), logs are attached. Retained sessions are:

1. callID: 6000b3dd81b915160d5b60230c001ea7
   sessionID: 97638055_8ade2a41_84310511982932:6000b3dd81b91516...@192.168.77.31:84310511982932:si.iskratel.as.conference.SipConferenceService

   
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
     d) servlet sends ACK

   e) servlet sends BYE
   f) 408 Request timeout is dispatched to servlet after 32 seconds after BYE request
   g) Servlet invalidates sipSession which is destroyed, but is never cleaned from memory.
  
  
2. callID: a2141377607be81e847b63cfc429b46f
   sessionID: 30498453_8ade2a41_84310511982932:a2141377607be81e...@192.168.77.31:84310511982932:si.iskratel.as.conference.SipConferenceService

  
   Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 183 Session Progress is received and dispatched to servlet
   c) CANCEL request is sent from sip servlet (UAC) to sip server
   d) 408 Request timeout is dispatched to servlet after 32 seconds after CANCEL request
   e) Servlet invalidates sipSession, but is never cleaned from memory.
  
------------------------------------------------
Pay attention to many of the following exceptions that can found in attached log. Maybe they are the cause of session retention:
2010-05-28 10:08:59,103 2967133 ERROR [Mobicents-SIP-Servlets] (UDPMessageChannelThread-3:) Unexpected exception releasing sem
java.lang.IllegalMonitorStateException
        at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:127)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1175)
        at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:431)
        at gov.nist.javax.sip.stack.SIPTransaction$TransactionSemaphore.release(SIPTransaction.java:329)
        at gov.nist.javax.sip.stack.SIPTransaction.semRelease(SIPTransaction.java:1233)
        at gov.nist.javax.sip.stack.SIPTransaction.releaseSem(SIPTransaction.java:1218)
        at gov.nist.javax.sip.stack.UDPMessageChannel.processMessage(UDPMessageChannel.java:554)
        at gov.nist.javax.sip.stack.UDPMessageChannel.processIncomingDataPacket(UDPMessageChannel.java:467)
        at gov.nist.javax.sip.stack.UDPMessageChannel.run(UDPMessageChannel.java:303)
        at java.lang.Thread.run(Thread.java:619)
tomcat.rar

Jean Deruelle

unread,
May 28, 2010, 11:06:40 AM5/28/10
to mobicent...@googlegroups.com
Can you SVN update. I committed a fix so that when the session is forcefully invalidated by the application the underlying dialog and session transaction are terminated thus it should not retain any memory.

Can you retry with that and report back ?

Jernej Krmelj

unread,
May 31, 2010, 7:58:35 AM5/31/10
to mobicent...@googlegroups.com
With newest sip stack there was no IllegalMonitorStateException anymore, but with latest 12022 SVN revision we still got a few sip sessions retained in memory.

The scenario with all retained sessions is the same. One of the sessions is the following:
    callid: 9b7199fcf02449ca79f3a70dfbd8918e
    sessionid: 02907584_8ade2a41_348209889658436:9b7199fcf02449ca...@192.168.77.31:348209889658436:si.iskratel.as.conference.SipConferenceService


     Scenario:
   a) invite is sent from sip servlet (UAC) to sip server
   b) 180 ringing is received and dispatched to servlet
   c) 200 OK is received and dispatched to servlet
   d) servlet sends ACK
   e) servlet sends REINVITE
   f) 503 Service Unavailable is received and dispatched to servlet
   g) Servlet invalidates sipSession which is destroyed.
   h) BYE is also received (seen in attached log), but it is not dispatched to servlet. Probably because session is already invalidated.
  
GC root shows that session is referenced by latest BYE request that is never released from stack and GCed. From attached log it seems somehow that this could be a synchronization issue again. It seems that following is happening with BYE request:
   1) 503 Service Unavailable is received and dispatched to servlet in the context of Mobicents-SIP-Servlets-UDPMessageChannelThread-1.
   2) Context switch: BYE is received in the context of Mobicents-SIP-Servlets-UDPMessageChannelThread-2. The sip session, which BYE belongs to, is found.
     
      2010-05-31 11:28:07,264 2830345 DEBUG [org.mobicents.servlet.sip.core.dispatchers.SubsequentRequestDispatcher] (Mobicents-SIP-Servlets-UDPMessageChannelThread-2:) Inverted try worked. sip session found : (02907584_8ade2a41_348209889658436:9b7199fcf02449ca...@192.168.77.31:348209889658436:si.iskratel.as.conference.SipConferenceService)
  
   3) Context switch: Sip session is invalidated in the context of Mobicents-SIP-Servlets-UDPMessageChannelThread-1.
  
      2010-05-31 11:28:07,279 2830360 INFO  [org.mobicents.servlet.sip.core.session.SipSessionImpl] (Mobicents-SIP-Servlets-UDPMessageChannelThread-1:) Invalidating the sip session (02907584_8ade2a41_348209889658436:9b7199fcf02449ca...@192.168.77.31:348209889658436:si.iskratel.as.conference.SipConferenceService)
     
   4) Context switch: Error with procesing BYE request happens in the context of Mobicents-SIP-Servlets-UDPMessageChannelThread-2 happens.
  
      2010-05-31 11:28:07,287 2830368 ERROR [org.mobicents.servlet.sip.core.dispatchers.MessageDispatcher] (Mobicents-SIP-Servlets-UDPMessageChannelThread-2:) no handler found for sip session (02907584_8ade2a41_348209889658436:9b7199fcf02449ca...@192.168.77.31:348209889658436:si.iskratel.as.conference.SipConferenceService) and request BYE sip:192.168.77.31:5080 SIP/2.0
      Call-ID: 9b7199fcf02449ca...@192.168.77.31
      CSeq: 948 BYE
      From: <sip:30...@192.168.77.20>;tag=a6ds7sirw6
      To: <sip:0820233@lnxmakAS>;tag=02907584_8ade2a41_348209889658436
      Via: SIP/2.0/UDP 192.168.77.20:5060;branch=z9hG4bK-qq8ga-skvu2
      Reason: Q.850;cause=34;text="No circuit/channel available"
      Max-Forwards: 70
      Content-Length: 0




PS.: No 408 Request timeout scenarios were involved here, so I cannot say if problem is still persistent with latest SVN revision.
tomcat.rar

Jean Deruelle

unread,
Jun 2, 2010, 6:03:33 AM6/2/10
to mobicent...@googlegroups.com
I'm currently at Jazoon but will check it out next week upon my return

Jean

Jernej Krmelj

unread,
Jun 9, 2010, 2:57:55 PM6/9/10
to mobicent...@googlegroups.com
hi,

any clue yet on what is going on?

2010/6/2 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
Jun 9, 2010, 3:16:10 PM6/9/10
to mobicent...@googlegroups.com
Didn't had the time to look at it, we are busy with customer issues and on and off on business trips. Should be able to work on it end of week or beginning of next week.

Sorry about that
Jean

Jernej Krmelj

unread,
Jun 9, 2010, 3:26:35 PM6/9/10
to mobicent...@googlegroups.com
sure, no problem,

just curious if the solution already came by

Jernej

2010/6/9 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
Jun 9, 2010, 3:37:39 PM6/9/10
to mobicent...@googlegroups.com
Also if you have a screenshot of the GC root in this case that would help but I have an idea about it but didn't have time to check it out. 
Basically either don't nullify the handlerServlet of SipSessionImpl on invalidate and/or in MessageDispatcher after 
logger.error("no handler found for sip session " + session.getKey() + " and request " + request);
do something like 

Transaction transaction = request.getTransaction();
if(transaction != null) {
TransactionApplicationData tad = (TransactionApplicationData) transaction.getApplicationData();
if(tad != null) {
tad.cleanUp();
}
transaction.setApplicationData(null);
}
request.setSipSession(null);

and make sure the setSipSession in SipServletRequestImpl does not throw NPE if the param is null.

Jernej Krmelj

unread,
Jun 10, 2010, 8:53:52 AM6/10/10
to mobicent...@googlegroups.com
We run new traffic test according to your source suggestions change and here are the results and scenarios that happened.

1. I commented "handlerServlet = null;" (//handlerServlet = null;) statement in SipSessionImpl.invalidate() method and run the test. There was no "no handler found for sip session" output in log file found, but after stop of traffic test, three session retained in memory, all belonging to the scenario we are investigating (reinvite with 503 and bye response). GC root of retained session is attached (GCRootSipSessionNoHandlerServlet.JPG).

2. I rollbacked previous change in SipSessionImpl.invalidate() method and added the following changes into MessageDispatcher.callServlet() method:
} else {

  logger.error("no handler found for sip session " + session.getKey() + " and request " + request);
  Transaction transaction = request.getTransaction();
  if(transaction != null) {
    TransactionApplicationData tad = (TransactionApplicationData) transaction.getApplicationData();
    if(tad != null) {
      tad.cleanUp();
    }
    transaction.setApplicationData(null);
  }

  request.setSipSession(null);
}

I also changed SipServletMessageImpl.setSipSession() to the following:
  this.sipSession = session;
  if (session != null)
    this.sessionKey = session.getKey();
  else
    this.sessionKey = null;

We run traffic test again and now there were two "no handler found for sip session" outputs written into log file. After traffic stop also two sessions retained in the memory again. GC root of a session is attached (GCRootSipSessionMessageDispatcher.JPG).
Pay attention to this GC root as session is retained in the context of a EventScannerThread. I also analysed SipServletRequestImpl object which references the session retained and profiler shows that its' request method is an INVITE (not BYE) and its'
CSeq value is 2 as this means that this is a second INVITE or RE-INVITE of a sip session.

I hope this helps a little bit resolving this problem.
GCRootSipSessionNoHandlerServlet.JPG
GCRootSipSessionMessageDispatcher.JPG

Jernej Krmelj

unread,
Jun 10, 2010, 9:28:51 AM6/10/10
to mobicent...@googlegroups.com
In attachement there is a GC root (GCRootSipSessionNoChange.JPG) of a retained session (02907584_8ade2a41_348209889658436:9b7199fcf02449ca...@192.168.77.31) from last week. Of course, MSS sources last week did not contain the changes you suggested yesterday. I analyzed SipServletRequestImpl object which references the session retained and profiler shows that its' request method is BYE. This means that today's MessageDispatcher.callServlet() change did its' job, but it seems some additional cleanup has to be done. Maybe after receiving 503 response?


2010/5/31 Jernej Krmelj <krmel...@gmail.com>
GCRootSipSessionNoChange.JPG

Jean Deruelle

unread,
Jun 10, 2010, 9:31:49 AM6/10/10
to mobicent...@googlegroups.com
Interesting this should be done automatically by jain sip in this case. The transaction should go to TERMINATED after receiving a 503 and linger for a while then stack calls listener ie SipApplicationDispatcherImpl.processTransactionTerminated and it should do the cleanup...
Something weird here

Miha Polak

unread,
Jun 14, 2010, 4:54:02 AM6/14/10
to mobicent...@googlegroups.com
Maybe you missed the fact that the problem Jernej is describing is about re-invite. Negative response on reinvite should not terminate sip session. RFC 3261 says in section 14.1
"If a UA receives a non-2xx final response to a re-INVITE, the session parameters MUST remain unchanged, as if no re-INVITE had been issued."
Maybe after that there is nothing weird after all?

Br

2010/6/10 Jean Deruelle <jean.d...@gmail.com>

Jernej Krmelj

unread,
Jun 14, 2010, 6:52:45 AM6/14/10
to mobicent...@googlegroups.com
Nevertheless I tried to find a session with the same scenario (INVITE,180,200,ACK,RE-INVITE,503,BYE) that is cleaned from the memory after invalidation.

With this session there is a difference after sending RE-INVITE:
1. UDPMessageChannelThread-0: 503 Service Unavailable is received, ACK is sent, processing of response is started (PROCESSING INCOMING RESPONSESIP/2.0 503 Service Unavailable)
2. Context switch: UDPMessageChannelThread-3: BYE is received, it's session is found, response is dispatched to the servlet, servlet generates 200 OK and sends it
3. Context switch: UDPMessageChannelThread-0: 503's session is found, response is dispatched to the servlet, session is invalidated by MSS.

After this everything is cleaned from memory. Would this mean that improper thread synchonization is the problem sometimes with 503 scenario?

2010/6/14 Miha Polak <miha....@gmail.com>

Jean Deruelle

unread,
Jun 14, 2010, 7:49:03 AM6/14/10
to mobicent...@googlegroups.com
It seems the GCRootSipSessionMessageDispatcher problem is that the dialog attached to the tx is still keeping hold of the sipservletrequest which has a ref to the sip session.
Can you retry and keep the earlier changes + uncommenting line 788 in SipApplicationDispatcher or/and line 835 in SipSessionImpl to see if that helps to remove that ref ?

Thanks 
Jean

Jernej Krmelj

unread,
Jun 15, 2010, 6:58:04 AM6/15/10
to mobicent...@googlegroups.com
I made suggested changes and made tests, but reference to the session is not removed. Still, there is a slight change. With first change only (uncommenting line 788 in SipApplicationDispatcher), GC root differs a little bit as "lastTransaction" object is involved (see attached file).

The second change (uncommenting line 835 in SipSessionImpl) didn't have any effect, with or without first change involved at the same time.

2010/6/14 Jean Deruelle <jean.d...@gmail.com>
GCRootSipSessionSipApplicationDispatcherImpl.JPG

Jean Deruelle

unread,
Jun 15, 2010, 7:15:42 AM6/15/10
to mobicent...@googlegroups.com
The root problem now might be that this server transaction is not removed from the stack at all or not cleaned up if the non invite transaction is not answered with a final response

Can you try to add that code 
                        try {
request.getTransaction().terminate();
} catch (ObjectInUseException e) {
logger.error("transaction " + request.getTransaction().getBranchId() + " for request " + request + " couldn't be terminated");
}

after line 259 of MessageDispatcher (and after the source code you already added there previously that cleans up the application data) and retry ?
logger.error("no handler found for sip session " + session.getKey() + " and request " + request);

Thanks 
Jean

Jernej Krmelj

unread,
Jun 15, 2010, 9:35:53 AM6/15/10
to mobicent...@googlegroups.com
With latest correction, the problem seems to be solved. No sessions retained in the memory anymore.

Great!

I attach the files that were modified...

Thanks for your cooperation.

2010/6/15 Jean Deruelle <jean.d...@gmail.com>
MessageDispatcher.java
SipServletMessageImpl.java

Jean Deruelle

unread,
Jun 15, 2010, 9:39:17 AM6/15/10
to mobicent...@googlegroups.com
Excellent !

Could you do me a favor and send me a SVN (or Eclipse) patch or diff file if you don't mind ?

Thanks in advance
BR
Jean

Jernej Krmelj

unread,
Jun 15, 2010, 9:47:04 AM6/15/10
to mobicent...@googlegroups.com
I hope this is what you want...

2010/6/15 Jean Deruelle <jean.d...@gmail.com>
sip_servlets_no_handler_found_for_sip_session.patch

Jean Deruelle

unread,
Jun 15, 2010, 9:49:26 AM6/15/10
to mobicent...@googlegroups.com
Perfect. Thanks

Jean Deruelle

unread,
Jun 15, 2010, 9:58:06 AM6/15/10
to mobicent...@googlegroups.com
I also wanted to thank you for the heavy profiling, testing, patience, insights and applying changes back and forth, this is highly appreciated

And If you want to be acknowledged here for your help

Please let me know your names and company names

Jean

Jernej Krmelj

unread,
Jun 15, 2010, 10:12:41 AM6/15/10
to mobicent...@googlegroups.com
Hehe, sure...

Miha Polak and Jernej Krmelj both from Iskratel Slovenia (www.iskratel.com)


Well, now we resolved all problems with current testing scenarios. Other test are pending. Maybe we still write each other :)

Jernej

2010/6/15 Jean Deruelle <jean.d...@gmail.com>

Jean Deruelle

unread,
Jun 15, 2010, 10:32:13 AM6/15/10
to mobicent...@googlegroups.com
As we say : the more tests, the better.

Updated with your contributions http://www.mobicents.org/acknowledgements.html
Reply all
Reply to author
Forward
0 new messages