It seems that there is a bug on some OSes


施威

Feb 15, 2016, 6:31:16 AM
to TURN Server (Open-Source project)

Hello everyone,

    It seems that turnserver does not work well on some OSes that put an incoming network packet into the newest fd's buffer rather than the oldest one's when more than one fd is bound to the same ip:port.

    I say this because of the following code:

    if (addr_bind(udp_fd,&(s->local_addr),1,1,UDP_SOCKET) < 0) {
        TURN_LOG_FUNC(TURN_LOG_LEVEL_ERROR,
            "Cannot bind new detached udp server socket to local addr\n");
        IOA_CLOSE_SOCKET(ret);
        return -1;
    }

    ret->bound = 1;

    {
        int connect_err = 0;
        if (addr_connect(udp_fd, &(server->sm.m.sm.nd.src_addr), &connect_err) < 0) {
            char sl[129];
            char sr[129];
            addr_to_string(&(ret->local_addr),(u08bits*)sl);
            addr_to_string(&(server->sm.m.sm.nd.src_addr),(u08bits*)sr);
            TURN_LOG_FUNC(TURN_LOG_LEVEL_ERROR,
                "Cannot connect new detached udp client socket from local addr %s to remote addr %s\n",sl,sr);
            IOA_CLOSE_SOCKET(ret);
            return -1;
        }
    }

    Consider this situation: client A (1.2.3.4:10000) sends an allocation to the server (4.3.2.1:3478). The fdo (created when the server initializes) receives the packet and does { fdn = socket(), bind(fdn, 4.3.2.1:3478), connect(fdn, 1.2.3.4:10000) }. From then on the connection A -> Server is handled by fdn, and the server works well.

    But in the moment when bind() has just completed and connect() has not yet started, there are two fds (fdn, fdo) bound to the same ip:port. If an allocation from client B (1.2.3.5:10000) arrives at the server at that very moment, the OS has to choose which fd gets the UDP data. Unfortunately, macOS (and some other OSes such as Fedora 22) choose the newest one, while the packet should be handled by fdo (which would then do socket(), bind(), connect() for B).

    This is where the bug appears: client B's allocation is delivered to fdn rather than fdo, so the handler of fdn thinks it got a duplicate allocation. Client A then gets a 437 error, and client B gets nothing.

    The following Wireshark captures show this situation:

Client A: [Wireshark capture]

Client B: [Wireshark capture]

The tid (transaction ID) of the allocation from B confirms my guess.
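
The window between bind() and connect() can be reproduced in isolation. The following is only a minimal sketch, not turnserver code: open_reusable_udp and race_window_receiver are hypothetical names, loopback stands in for the real addresses, and it assumes both sockets are bound with SO_REUSEADDR (and SO_REUSEPORT where the platform defines it). It binds fdo and fdn to the same ip:port, sends one datagram from a "client B" socket before fdn is connected, and reports which fd received it. The result depends on the OS.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a UDP socket with address/port reuse enabled and
 * bind it to 127.0.0.1:port_be (network byte order; 0 lets the OS pick).
 * On success the bound address is written to *bound when bound != NULL. */
static int open_reusable_udp(uint16_t port_be, struct sockaddr_in *bound) {
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0) return -1;
    int on = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
#ifdef SO_REUSEPORT
    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on));
#endif
    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    a.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    a.sin_port = port_be;
    socklen_t len = sizeof(a);
    if (bind(fd, (struct sockaddr *)&a, sizeof(a)) < 0 ||
        getsockname(fd, (struct sockaddr *)&a, &len) < 0) {
        close(fd);
        return -1;
    }
    if (bound) *bound = a;
    return fd;
}

/* Returns 0 if client B's datagram landed on fdo (the old listener),
 * 1 if it landed on fdn (the new, not yet connected socket),
 * -1 on error or timeout. */
static int race_window_receiver(void) {
    struct sockaddr_in srv;
    int fdo = open_reusable_udp(0, &srv);            /* listener created at init */
    if (fdo < 0) return -1;
    int fdn = open_reusable_udp(srv.sin_port, NULL); /* bind() done, connect() not yet */
    int b = socket(AF_INET, SOCK_DGRAM, 0);          /* stands in for client B */
    int result = -1;
    if (fdn >= 0 && b >= 0 &&
        sendto(b, "B", 1, 0, (struct sockaddr *)&srv, sizeof(srv)) == 1) {
        struct pollfd p[2] = {{fdo, POLLIN, 0}, {fdn, POLLIN, 0}};
        if (poll(p, 2, 1000) > 0) {
            if (p[1].revents & POLLIN) result = 1;
            else if (p[0].revents & POLLIN) result = 0;
        }
    }
    if (b >= 0) close(b);
    if (fdn >= 0) close(fdn);
    close(fdo);
    return result;
}
```

Calling race_window_receiver() from a small main() and printing the result on each platform of interest shows whether that OS hands the datagram to fdn; a return value of 1 corresponds to the behaviour described above.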

Thanks.


Oleg Moskalenko

Feb 15, 2016, 12:54:37 PM
to 施威, TURN Server (Open-Source project)
The problem you describe is due to a naturally occurring race condition: we cannot make bind() and connect() a single transaction.

But that scenario happens only with 'network engine 1' in the TURN server, and currently only on BSD-type OSes. Only those OSes use 'network engine 1' by default (it creates a separate client-handling UDP socket for each session). Network engines 2 and 3 use connection multiplexing. Fedora 22 uses network engine 3, which creates a single client-facing UDP socket per CPU, unless you force it to use network engine 1 with the --ne option.
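
As a rough illustration of connection multiplexing, as opposed to one socket per session: with a single client-facing socket, the source address of each datagram (as returned by recvfrom()) selects the session, so no second socket is ever bound to the listening ip:port and the bind()/connect() window does not arise. The sketch below is not the TURN server's implementation; session_table, session_for, peer and MAX_SESSIONS are hypothetical names, and a linear scan stands in for whatever lookup structure a real server would use.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

#define MAX_SESSIONS 64 /* arbitrary limit for the illustration */

/* Hypothetical session table keyed by the client's source ip:port. */
struct session_table {
    struct sockaddr_in peers[MAX_SESSIONS];
    int count;
};

static int same_peer(const struct sockaddr_in *a, const struct sockaddr_in *b) {
    return a->sin_addr.s_addr == b->sin_addr.s_addr && a->sin_port == b->sin_port;
}

/* Returns the session index for src, creating a new session if this source
 * address has not been seen before; returns -1 if the table is full. */
static int session_for(struct session_table *t, const struct sockaddr_in *src) {
    for (int i = 0; i < t->count; i++)
        if (same_peer(&t->peers[i], src)) return i;
    if (t->count == MAX_SESSIONS) return -1;
    t->peers[t->count] = *src;
    return t->count++;
}

/* Convenience constructor for an IPv4 address, used only in the example. */
static struct sockaddr_in peer(const char *ip, uint16_t port) {
    struct sockaddr_in a;
    memset(&a, 0, sizeof(a));
    a.sin_family = AF_INET;
    inet_pton(AF_INET, ip, &a.sin_addr);
    a.sin_port = htons(port);
    return a;
}
```

In a receive loop, the address filled in by recvfrom() on the shared socket would be passed to session_for(), so datagrams from 1.2.3.4:10000 and 1.2.3.5:10000 reach different sessions regardless of arrival order.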

Please file an issue on the git project page. Meanwhile, you can use a workaround: force your system to use 'network engine 1' with the --ne=1 option.

Oleg


--
You received this message because you are subscribed to the Google Groups "TURN Server (Open-Source project)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to turn-server-project-rfc57...@googlegroups.com.
To post to this group, send email to turn-server-project...@googlegroups.com.
Visit this group at https://groups.google.com/group/turn-server-project-rfc5766-turn-server.
For more options, visit https://groups.google.com/d/optout.

施威

Feb 16, 2016, 6:24:28 AM
to TURN Server (Open-Source project), sw....@gmail.com
Thanks for sharing that, it's very helpful.


Oleg Moskalenko

Feb 16, 2016, 11:34:22 AM
to 施威, TURN Server (Open-Source project)


Sorry, my previous message contained a typo in the workaround: I meant 'force your TURN server to use network engine 2 with the --ne=2 option'.

