Error while registering - Service Unavailable

bbszabi

May 5, 2013, 8:00:14 AM
to csipsim...@googlegroups.com
Hi Regis,

I'm having a great deal of trouble with the app showing "Error while registering - Service Unavailable" (or "Request Timeout") and staying in that state indefinitely when my phone loses signal or some other connection problem occurs. I always have to manually deactivate and reactivate the account; then it registers without any problem. I saw your answer to this problem in another post, but waiting for the SIP stack to reconnect (if it reconnects at all) is just not an option for me. My mobile Skype has no such problems, and CSipSimple should work the same way :)

So I thought that the manual deactivation/reactivation of active accounts that are in an error state could be automated by a thread that checks the state of all active accounts every 30-60 seconds: if it finds any of them in a registration error state, the account should be reset, just like when you tap the account widget twice to deactivate and reactivate it.
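
A minimal sketch of the watchdog idea described above, in plain Java. `WatchedAccount` and `RegistrationWatchdog` are hypothetical names used only for illustration; they are not CSipSimple classes, and how the real app exposes account state is exactly the open question in this post.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical view of a SIP account; NOT a CSipSimple class. */
interface WatchedAccount {
    boolean isActive();
    boolean isInRegistrationError();
    void setActive(boolean active);
}

/** Periodically resets active accounts that are stuck in a registration error. */
final class RegistrationWatchdog {
    private final List<? extends WatchedAccount> accounts;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    RegistrationWatchdog(List<? extends WatchedAccount> accounts) {
        this.accounts = accounts;
    }

    /** Runs one pass over all accounts and returns how many were reset. */
    int checkOnce() {
        int resets = 0;
        for (WatchedAccount account : accounts) {
            if (account.isActive() && account.isInRegistrationError()) {
                // Same effect as toggling the account off and on by hand.
                account.setActive(false);
                account.setActive(true);
                resets++;
            }
        }
        return resets;
    }

    /** Starts the periodic check, e.g. every 30 to 60 seconds. */
    void start(long periodSeconds) {
        scheduler.scheduleWithFixedDelay(
                this::checkOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Stops the background thread. */
    void stop() {
        scheduler.shutdownNow();
    }
}
```

With something like this, `new RegistrationWatchdog(accounts).start(30)` would run the check every 30 seconds. Accounts that the user deactivated on purpose are skipped, so the watchdog never re-enables them.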

I would like to implement this feature, because this behavior is a big no-no in production. You know this app inside out, so to speed up the implementation, please point me in the right direction by giving me your input:
1. Which class should the checking thread be defined in and run from (SipService or some other class)?
2. How can I get a list of all the active accounts that should be connected to a SIP server?
3. While looping through the list, how do I find out whether an account is in an error state or is registered correctly with the server?

Of course, I will contribute the code back to CSipSimple. This great app deserves any effort put in it!
Regards,
Szabi

Régis Montoya

May 5, 2013, 9:17:49 AM
to csipsim...@googlegroups.com
Hi,


Normally such a re-registration should be done by pjsip.
At least, pjsip is expected to retry at the interval
specified by:
http://www.pjsip.org/pjsip/docs/html/group__PJSUA__LIB__ACC.htm#gabecb3fffd4a8bea84a9bcb5394dcde61
which should work as follows: the first retry happens after 0 seconds, and
later retries wait for the time specified by the setting above, randomized
by +/- 10 seconds.

Is that what you observe?
If not, there may be a bug here. But the way it is expected to work
doesn't require any change in csipsimple.
Or maybe we can reduce this value to something lower than 5 minutes (but
we have to take *BIG* care, because retrying too frequently will kill the
battery if the network is *ACTUALLY* not one that can be used for SIP,
for example a network with a firewall or a network with just an HTTP proxy).
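
To make the expected schedule concrete, here is a small Java sketch of the rule as described above: no delay before the first retry, then the configured interval with a random offset of up to 10 seconds either way. It only models the description in this message and is not taken from the pjsip source; `RetryDelay` and its members are hypothetical names.

```java
import java.util.Random;

/** Models the retry timing described above; NOT pjsip's actual code. */
final class RetryDelay {
    /** Maximum random offset, in seconds, applied to later retries. */
    static final int JITTER_SECONDS = 10;

    /**
     * @param attempt 1-based retry number
     * @param retryIntervalSeconds configured retry interval (300 by default)
     * @return seconds to wait before this retry
     */
    static int delaySeconds(int attempt, int retryIntervalSeconds, Random random) {
        if (attempt <= 1) {
            return 0; // the first retry happens immediately
        }
        // Uniform offset in the range -10..+10 seconds.
        int jitter = random.nextInt(2 * JITTER_SECONDS + 1) - JITTER_SECONDS;
        return Math.max(0, retryIntervalSeconds + jitter);
    }
}
```

With the default interval of 300 seconds, every retry after the first one would therefore be scheduled between 290 and 310 seconds after the previous failure.
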
> --
> You received this message because you are subscribed to the Google
> Groups "CSipSimple Development" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to csipsimple-de...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

Szabolcs Borbély

May 5, 2013, 11:10:32 AM
to csipsim...@googlegroups.com
Thanks for the reply, that was quick for a Sunday afternoon! :)

For me, the easiest way to reproduce this is to shut down FreeSWITCH or cut the LAN connection to the server, wait for the Service Unavailable error to occur, and then start FreeSWITCH or restore the LAN connection. Sometimes it does reconnect, but most of the time it just stays stuck on the error message, regardless of the keep-alive (KA) and registration expiration settings (KA = 100, registration expiration = 180). This also happens when the 3G signal goes away, e.g. last night, when I went to a club where the signal is very poor. After the signal came back, the Service Unavailable error would not go away, so I reconnected manually and it worked. As you mentioned, this is not normal behavior.

I have two phones I use for testing: a Motorola Defy+ with CM9 and an Allview P5 Mini (Android 4). They both show the same random behavior: sometimes they reconnect, but mostly they don't.

Anyway, those 5 minutes seem like a huge amount of time to me. I wonder how Skype deals with this so quickly without draining the battery. I just tested Skype: it takes 60-100 seconds to sense that the connection is lost and under 10 seconds to reconnect after the network is re-established. To rule out the phone reacting to network changes, I connected through WiFi and caused the disconnect by powering down the switch that the WiFi device is plugged into (so the WiFi has signal all the time).

Another solution would be to probe the server's SIP socket (host:port) first (or send some sort of ping), before retrying the connection, even with a SYN packet only, if that would save battery. If the connection (ping) does not time out in 10-20 seconds, the registration should be attempted, otherwise not.
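
The probing idea could look roughly like the Java sketch below. Note the assumptions: it performs a full TCP connect rather than sending a bare SYN (plain Java sockets cannot send SYN-only packets), so it is only meaningful when the account uses a TCP or TLS transport, and `SipPortProbe` is a hypothetical helper, not part of CSipSimple or pjsip.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper that checks whether a TCP port accepts connections. */
final class SipPortProbe {
    /**
     * Tries to open a TCP connection to host:port.
     *
     * @param timeoutMillis how long to wait for the connection, e.g. 10000 to 20000
     * @return true if the connection was established within the timeout
     */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true; // server answered; a registration attempt makes sense
        } catch (IOException e) {
            // Refused, timed out, unresolvable host, or no route: skip the retry.
            return false;
        }
    }
}
```

A retry loop would call something like `SipPortProbe.isReachable(serverHost, serverPort, 10_000)` and only attempt the registration when it returns true. Whether this actually saves battery compared with simply retrying the REGISTER would have to be measured.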

PS: THIS MIGHT BE IMPORTANT! While I was composing this message, I re-tested switching the connection off and on, both on WiFi and 3G, and I noticed that if the error is "Request Timeout", it eventually reconnects. But if "Service Unavailable" shows up, it won't reconnect at all. I only just noticed this and have tried it just a few times, but it seems to be the pattern.



Régis Montoya

May 5, 2013, 11:41:57 AM
to csipsim...@googlegroups.com
During weekends I have more time for csipsimple than during the week, when my full-time job takes all my time ;)

About the problem: I've just changed something in the android_timer implementation that might solve some problems with keep-alive, reconnection, and anything else related to timers. So maybe there was a bug.

Second, about how Skype can reconnect so rapidly, there are some potential explanations:
TCP is more reliable, and it's easier for the app to detect that the connection was dropped (most of the time SIP runs over UDP, which is less reliable). Also, maybe Skype drains the battery when the network connection is poor or when it is blocked... but they are lucky: they are blocked less often than SIP by mobile carrier equipment, which is not able to detect their protocol (while SIP is easy to detect and block). So maybe they take a more aggressive approach.
That said, 5 minutes can indeed appear too long from a user experience point of view. Maybe we could reduce it to 120 seconds by default (and add an option in expert mode). What do you think?

About the P.S.: it's a very interesting point. Maybe it is related to the bug I mentioned previously, or it could be something in pjsip; we should investigate whether there are cases where it doesn't retry. From my understanding of the pjsip code (here: http://trac.pjsip.org/repos/browser/pjproject/trunk/pjsip/src/pjsua-lib/pjsua_acc.c#L3139 ), that's not the case, but maybe I'm missing something?

Szabolcs Borbély

May 5, 2013, 12:44:27 PM
to csipsim...@googlegroups.com
Hi,
Can you show me what you just changed and where, so I can test it? If it's C code, please tell me how to recompile or where to get the recompiled libs, I'm terrible at that :)

I'm not a C programmer, but as far as I can understand the 'schedule_reregistration()' function, it makes no distinction between kinds of errors; it just re-schedules the registration after the given (slightly randomized) 'delay' interval (I guess this is the 300 seconds thing). Given this, is it possible that 'schedule_reregistration()' is not being called at all when the Service Unavailable error occurs? I'm trying to track things down in the PJ logs, but I can't see any of the log messages from 'schedule_reregistration()' in the log file (logging to file, level 5).

By the way, I'm using a TLS/TCP connection on a non-standard port (like 4321), so there is no problem with the mobile providers; they can't figure out the protocol. Maybe they could guess that it's a SIP service from the server's IP address, but that's highly unlikely.

Even if you don't cut the default 300 seconds down to 120, it would be nice to have an expert option to modify it and to be able to set that value in the account wizard (globally or just for the account).

Régis Montoya

May 5, 2013, 5:38:42 PM
to csipsim...@googlegroups.com
On 05/05/2013 18:44, Szabolcs Borbély wrote:
> Hi,
> Can you show me what you just changed and where, so I can test it? If
> it's C code, please tell me how to recompile or where to get the
> recompiled libs, I'm terrible at that :)

You can follow and see the diff of each fix by browsing this page:
http://code.google.com/p/csipsimple/source/list
(the changes I mentioned are in r2210)
However, as with every version, if you don't want to recompile the native
library part, you can just download a nightly build
(http://nightlies.csipsimple.com/trunk/), extract the apk (apk files are just
zip files renamed to .apk), and get the ".so" files from the libs folders.
Just make sure the Java part of your code is up to date.
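
Since an apk is an ordinary zip archive, the extraction step can also be scripted. The Java sketch below copies every entry ending in ".so" out of an apk while keeping the folder layout; `ApkNativeLibs` is a hypothetical helper written for this illustration and the file names in the usage note are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical helper: pulls the native ".so" libraries out of an apk (zip) file. */
final class ApkNativeLibs {
    /** Extracts all ".so" entries below outputDir and returns the written paths. */
    static List<Path> extract(Path apk, Path outputDir) throws IOException {
        List<Path> extracted = new ArrayList<>();
        Path root = outputDir.toAbsolutePath().normalize();
        try (InputStream in = Files.newInputStream(apk);
             ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().endsWith(".so")) {
                    continue; // skip everything that is not a native library
                }
                Path target = root.resolve(entry.getName()).normalize();
                if (!target.startsWith(root)) {
                    continue; // ignore entries that would escape outputDir
                }
                Files.createDirectories(target.getParent());
                // Reads only up to the end of the current zip entry.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                extracted.add(target);
            }
        }
        return extracted;
    }
}
```

For example, `ApkNativeLibs.extract(Paths.get("nightly.apk"), Paths.get("extracted"))` would leave the libraries under `extracted/`, in the same per-architecture folders they had inside the apk.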

>
> I'm not a C programmer, but as far as I can understand the
> 'schedule_reregistration()' method, there is no distinction between
> nature of the errors, it just re-schedules for registration for the
> given (slightly randomized) 'delay' interval (I guess this is the 300
> seconds thing). Given this, it's possible that this
> 'schedule_reregistration()' method is not being called at all when
> Service Unavailable error occurs? I'm trying to track things down in
> the PJ logs, but I can't see any of the log messages from
> 'schedule_reregistration()' in the log file (logging to file, level 5)
>
OK, so it will be interesting to try with the fix from r2210.
Also, just to be sure: on the Java side, did you modify anything in
the UAStateReceiver? If you are not fully aware of how things are done
in the pjsip part, some modifications there are very risky and could
break things in unexpected ways.


Szabolcs Borbély

May 5, 2013, 6:00:39 PM
to csipsim...@googlegroups.com
Ok, I'll try to test the latest '.so' files tomorrow and I'll let you know how it behaved. At the moment I'm using build 2194, so I guess it should work.

I did not modify anything that has to do with the 'internals' of the app (for the reasons you just mentioned :), only some interface stuff. The most 'invasive' thing I did was to disable bluetooth and recording. The rest of the modifications are '.png' image files, colors, CustomDistribution class and I added my own account wizard and removed all other wizards.

Thanks and I'll get back tomorrow. Have a nice week!



Khoa Pham

May 6, 2013, 8:10:56 AM
to csipsim...@googlegroups.com
@Szabi

1. Deactivate/activate is just a workaround. You have to find the root of the problem.
2. Do you use TCP? What is your proxy server? Does the problem happen 99% of the time?
Have you tried switching to another server to see if the problem still exists? Have you inspected the IP packets to see what is really going on? Is the server busy (has it reached the maximum number of allowed connections)?

Khoa Pham

May 6, 2013, 8:18:00 AM
to csipsim...@googlegroups.com
@Szabi, to add more info

1. About the 300s: it is the default registration timeout.
2. What you want is reg_retry_interval, which decides when the next re-registration attempt happens.

bbszabi

May 6, 2013, 10:13:26 AM
to csipsim...@googlegroups.com


On Monday, May 6, 2013 3:10:56 PM UTC+3, Khoa Pham wrote:
> @Szabi
>
> 1. Deactivate/activate is just a workaround. You have to find the root of the problem.

I'm aware of that, but since I don't have the skills to meddle with pjsip and the other C code, I do what I can: either a workaround or ask for assistance.

> 2. Do you use TCP? What is your proxy server? Does the problem happen 99% of the time?

The server is an OSTN-compliant FreeSWITCH and I use TLS (TCP), with certificate validation on all ends. So the registration and proxy server are the same. The server sits right here in the next room and has only two clients online. And yes, it happens every time the Service Unavailable error occurs.

> Have you tried switching to another server to see if the problem still exists? Have you inspected the IP packets to see what is really going on? Is the server busy (has it reached the maximum number of allowed connections)?

Since deactivate/reactivate works every time, there are no issues whatsoever with the server or the connection. It must be the app!

bbszabi

May 6, 2013, 10:15:12 AM
to csipsim...@googlegroups.com


On Monday, May 6, 2013 3:18:00 PM UTC+3, Khoa Pham wrote:
> @Szabi, to add more info
>
> 1. About the 300s: it is the default registration timeout.

Yes, I know that.

> 2. What you want is reg_retry_interval, which decides when the next re-registration attempt happens.

The big problem is not the interval, but the fact that it never reconnects at all!

bbszabi

May 6, 2013, 10:47:00 AM
to csipsim...@googlegroups.com
GREAT news Regis: it works!

Probably the timer_android was the culprit and you nailed it! Big thanks, man!
I tested it with r2215 and right now I'm compiling the new pjsip to get it all up to r2218. I'll let you know if that works well, too.

The only thing that remains to be done is to make that 300-second timeout customizable. Or I wonder whether a better approach would be retry intervals that increase in steps, e.g. try 10 times at 60s, then 10 times at 120s, and so on up to 300s. Does this sound feasible to you?
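
To make the stepped schedule concrete, here is a small Java sketch of the rule as proposed: 60 seconds for the first 10 retries, 120 seconds for the next 10, and so on, capped at 300 seconds. `ProgressiveBackoff` is a hypothetical name; nothing like it exists in CSipSimple or pjsip as far as this thread shows.

```java
/** Hypothetical stepped backoff: 10 retries per step, 60s steps, capped at 300s. */
final class ProgressiveBackoff {
    static final int STEP_SECONDS = 60;
    static final int MAX_SECONDS = 300;
    static final int RETRIES_PER_STEP = 10;

    /**
     * @param attempt 1-based retry number
     * @return seconds to wait before this retry
     */
    static int delaySeconds(int attempt) {
        if (attempt < 1) {
            throw new IllegalArgumentException("attempt must be >= 1");
        }
        // Step 1 covers attempts 1..10, step 2 covers 11..20, and so on.
        long step = (attempt - 1) / RETRIES_PER_STEP + 1;
        return (int) Math.min(step * STEP_SECONDS, MAX_SECONDS);
    }
}
```

Under this rule the 1st and 10th retries wait 60 seconds, the 11th waits 120 seconds, and from the 41st retry onward the delay stays at 300 seconds. Reaching the cap takes 10x60 + 10x120 + 10x180 + 10x240 = 6000 seconds, i.e. 100 minutes of failed retries.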

bbszabi

May 6, 2013, 11:49:03 AM
to csipsim...@googlegroups.com
Hi Regis,
I just updated everything to r2218 and I can confirm that it works like a charm! If you could figure out how to customize or improve the reconnect delay (progressively increasing delays from 60s to 300s, in 60s steps at every 10th retry, let's say), that would be awesome.
Have a great week!
Szabi