Hi mobicents community,
I’m new to Mobicents JAIN SLEE and am currently evaluating how it performs as an HA AS for SIP-based applications.
Therefore we are running high-load tests to find the platform’s maximum performance limits on carrier-grade hardware:
HP DL380 G7
24 Intel Xeon CPU @ 3.33 GHz
48 GB RAM
I’m using Mobicents JAIN SLEE 2.6.0 Final (no cluster, default mode) with the sip-11-ra provided in this package, and the sip-b2bua example from Mobicents as the application.
The JVM is tuned according to our experience with other JAIN SLEE platforms.
Now here comes the issue:
Up to 600 CPS the SIPp traffic is stable.
At 600 CPS the INVITE retransmission counter starts to increase steadily. At a fairly constant rate, INVITEs are simply not answered (not even with a 100 Trying) and SIPp has to retransmit them – this happens continuously, not only during full GC.
Overall, at 600 CPS this affects ~1% of calls, whereas 99% of calls are answered within 100 ms – so there are still enough resources, and CPU load is rather low at this rate.
A tshark interface trace shows that those INVITEs are received on the JAIN SLEE server’s IP interface (so it is not a connectivity or SIPp issue).
When JAIN SIP message logging is enabled (gov.nist.javax.sip.LOG_MESSAGE_CONTENT), those INVITEs do NOT appear in the log – only the subsequent retransmissions show up, and those are processed quickly.
So I assume that the SIP stack somehow swallows or ignores those INVITEs, even before logging them.
Also interesting: when the traffic is started directly at 600 CPS, it takes a while (~20,000 messages) until the retransmissions begin.
This may indicate that some queue or buffer is filling up.
I already tried tuning some of the preconfigured sipra.properties values (sketched below):
- THREAD_POOL_SIZE from 8 to 64
- RECEIVE_UDP_BUFFER_SIZE and SEND_UDP_BUFFER_SIZE to double and triple the preconfigured value
- MAX_SERVER_TRANSACTIONS and MAX_CLIENT_TRANSACTIONS to double the preconfigured value
None of the changes prevented or delayed the retransmissions.
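For reference, the kind of entries involved in the RA’s sipra.properties look roughly like the sketch below; the values shown are only illustrative placeholders, not the shipped defaults or recommended settings:

# JAIN SIP stack worker threads (raised from 8 to 64 in our tests)
gov.nist.javax.sip.THREAD_POOL_SIZE=64
# UDP socket buffers in bytes (we tried 2x and 3x the preconfigured values;
# the sizes here are placeholders)
gov.nist.javax.sip.RECEIVE_UDP_BUFFER_SIZE=131072
gov.nist.javax.sip.SEND_UDP_BUFFER_SIZE=131072
# Message logging used for the diagnosis above
gov.nist.javax.sip.LOG_MESSAGE_CONTENT=true
# MAX_SERVER_TRANSACTIONS and MAX_CLIENT_TRANSACTIONS were doubled as well,
# using the keys exactly as they appear in the shipped sipra.properties.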
So I’m asking: has anyone experienced the same or a similar issue and knows how to deal with it? Are there further SIP stack properties worth tuning? Which areas are worth investigating (does the SIP stack perhaps provide some way to detect buffer overflows)? Is it possible that some Mobicents extension of the SIP stack interferes here?
Thanks in advance for any feedback.
Best regards,
Stephan Klein – Kapsch CarrierCom
---
P.S: A special hello to our colleagues from Lithuania ;)
---
Once you reach a stable limit, run a test longer than 1 h to make sure GC handles such load fine, and let us know the heap memory usage at the end; that should give us an idea of how memory-hungry the SIP stack really is, and whether it is truly a problem.
Then, as I said before, try increasing the buffers; this will use more memory, but it may pay off.
If CPU load is low, messing with threads probably won't help.
Finally, about that open question on the 100 Trying – can you check it, if possible?
Btw, the b2bua will give you a more complex application picture; it uses some high-level code that keeps more app state than needed. Mixing the sip-uas example with the JDBC RA and/or Diameter should be more interesting :-)
---
Hi,
after further tests with the sip-uas example, here are the new findings:
- Increased the SIP RA buffer sizes to double, fourfold and tenfold the original size
-> No improvement
- Upgraded to Java 7u4 (was Java 6 before)
-> No improvement
- Tuned Eduardo’s JVM settings: changed -Xmn256m to -Xmn128m (illustrative flags after this list)
-> Improvement from a stable 1600 CPS to 1700 CPS
-> Decreasing the JVM’s new-generation size results in shorter (but more frequent) GC pauses, which has a positive effect on retransmissions
- Tried the new Java 7u4 Garbage-First collector (G1) with various options – see http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html
-> Much worse (the limit was 1000 CPS)
-> G1 does not seem to achieve very low GC pauses at high load (the target is <5 ms, as we get with CMS, to avoid retransmissions)
- Updated the sip-uas SBB with logic to always send a 100 Trying (sketch after this list)
-> Improvement from a stable 1700 CPS to 1900 CPS
-> Ran a 1 h test at 1900 CPS and it was stable (CPU load ~30%)
-> It seems the 100 Trying can be generated very quickly by the stack, and this also has a positive effect on retransmissions
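For context, the GC-related JVM flags in play are of the following flavour (this is not Eduardo’s full configuration – heap sizes are omitted on purpose, and only -Xmn128m plus CMS reflect what we actually settled on):

-Xmn128m                   (smaller new generation: shorter but more frequent minor GCs)
-XX:+UseConcMarkSweepGC    (CMS old-generation collector, which meets our <5 ms pause target)
-XX:+UseParNewGC           (parallel young-generation collector normally paired with CMS)
-verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps   (the GC logging mentioned below)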
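For anyone wanting to reproduce the 100 Trying change, the SBB logic is roughly the following minimal sketch using the standard JAIN SIP API; the handler name and the messageFactory and tracer fields (assumed to be initialised in setSbbContext()) follow the conventions of the example SBBs and may differ in your code:

import javax.sip.RequestEvent;
import javax.sip.ServerTransaction;
import javax.sip.message.Response;
import javax.slee.ActivityContextInterface;

// Inside the UAS SBB's INVITE event handler:
public void onInvite(RequestEvent event, ActivityContextInterface aci) {
    try {
        // Answer immediately with 100 Trying on the server transaction,
        // before any further processing of the INVITE.
        ServerTransaction st = event.getServerTransaction();
        Response trying = messageFactory.createResponse(Response.TRYING, event.getRequest());
        st.sendResponse(trying);
    } catch (Exception e) {
        tracer.severe("Failed to send 100 Trying", e);
    }
    // ... continue with normal INVITE handling and the final response ...
}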
Regarding the memory-greediness of the SIP stack, a screenshot of the memory profile for the 1 h test is attached.
From the GC logging I enabled (which had no noticeable performance impact) I also calculated that the GC collected a total of 3,145,738.7 MB, which equals 873 MB per second, or about 460 KB per call (quick check below).
-> not sure if this can be interpreted as 'greedy' or not ;)
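As a quick sanity check of those numbers (1 h test at 1900 CPS):
3,145,738.7 MB / 3600 s ≈ 873 MB/s collected
873 MB/s / 1900 calls/s ≈ 0.46 MB ≈ 460 KB per call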
We are rather satisfied with this performance, so I will not continue tuning for now – this is a process you can spend weeks on, as the JVM tuning options are countless.
However, I will let you know if we achieve any further improvements in the future.
Attachments:
- 1900cps_1h_sipp_result.txt: details about latency, number of calls, etc. for the 1 h 1900 CPS test
- 1900cps_1h_memory&cpu.png: jvisualvm screenshot showing the CPU and memory profile
-- Stephan
---
Thank you for all your work. I will incorporate the sending of the 100s, and will look into the best way to share your findings w.r.t. JVM tuning.
-- Eduardo