stun reflexive testing / confusion

dmcle...@gmail.com

May 3, 2017, 3:39:34 PM
to discuss-webrtc
Hi,

I'm having a problem in our production system, and I'm trying to isolate & debug some things in a lab setup.  It seems like many of our calls are using TURN to relay.  Sometimes, when there are two calls in a row between two peers, one call uses relay & the next will not, even though the network environment is the same.  I'm digging into why we're using relay more often than we used to.

I'm making calls between an android application, and a browser.  Both devices are on the same network.

In the Android app, I modified my code to drop all local (host) ICE candidates, so I won't just go peer-to-peer on the LAN.  Only reflexive (STUN) and relay (TURN) candidates are used.  The app won't send local candidates to the browser, and it will drop local candidates from the browser without adding them to the peer connection.  (The Android app won't allow anything other than relay or reflexive.)
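For reference, a minimal sketch of that kind of filter, written as plain TypeScript over candidate strings rather than as the app's actual code. The helper names (`candidateType`, `isAllowed`) and the sample lines are hypothetical; the only thing taken from the ICE candidate syntax is the `typ <type>` token.

```typescript
// Candidate types that can appear after the "typ" token of a candidate line.
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Hypothetical helper: extract the candidate type, or null if the line has none.
function candidateType(candidateLine: string): CandidateType | null {
  const match = /\btyp\s+(host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : null;
}

// Hypothetical helper: true when the candidate's type is in the allowed list.
function isAllowed(candidateLine: string, allowed: CandidateType[]): boolean {
  const type = candidateType(candidateLine);
  return type !== null && allowed.indexOf(type) !== -1;
}

// Only reflexive and relay candidates pass; host candidates are dropped.
const allowedTypes: CandidateType[] = ["srflx", "relay"];

// Illustrative candidate lines using documentation addresses, not real captures.
const sample: string[] = [
  "candidate:1 1 udp 2130706431 192.168.1.10 50000 typ host",
  "candidate:2 1 udp 1694498815 203.0.113.5 50000 typ srflx raddr 192.168.1.10 rport 50000",
  "candidate:3 1 udp 16777215 198.51.100.7 60000 typ relay raddr 203.0.113.5 rport 50000",
];

const kept = sample.filter((line) => isAllowed(line, allowedTypes));
console.log(kept.map(candidateType));
```

The same predicate would be applied in both directions: before signaling a local candidate to the other side, and before adding a remote candidate to the peer connection.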

When I do this, I can make a call.  One side of the call is always using relay/turn, and one side is always using reflexive/stun.  So, if my network can work with reflexive candidates, why can't both legs of the call be reflexive?

I added some more debugging, and caused the app to drop all relay candidates as well.  It only passes reflexive candidates.  When I do this, I can't make a call.  At the top of chrome://webrtc-internals/, I can see:
..
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (srflx)
icecandidate (srflx)
icecandidate (srflx)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (host)
icecandidate (relay)
icecandidate (relay)
icecandidate (relay)
...

addIceCandidate (srflx)
...

but down in the "Stats Tables" section of the webrtc internals,  I can see 3 candidate sections like this:
Cand-qNNkk9NR (localcandidate)
Cand-wT61B++e (localcandidate)
Cand-2vKQb3Ms (remotecandidate)
The first is "candidateType host", the second is "candidateType relayed", and the third (the app's candidate) is "candidateType serverreflexive".

So, I think that the browser has 3 STUN candidates, and it got 1 STUN candidate from the Android app.  Shouldn't I be able to make a call like this?  It'll just have to use STUN?  In the call before I made these changes (to drop all but reflexive), one leg used STUN, so my router/NAT must be "STUN capable", if that makes sense.

Should there be a "candidateType serverreflexive" for the browser in the stats tables?  I have got the candidates above.  Are they not in use?

I expected to see that, since that's the one that I send from the android app.  I just don't know why I don't have a serverreflexive candidate for the browser.  I grabbed a chrome debug log ... I don't see anything that I understand in there.  Can anyone tell me where to look next?  Or am I off the mark here, and there's something that I don't understand about the stun & turn that's causing my issues?

Thanks,

Doug

dmcle...@gmail.com

May 5, 2017, 1:59:14 PM
to discuss-webrtc
Hi,

The debug set up in the last post was kind of messed up - mostly my own confusion.

If both peers are on the same network, it looks like I cannot force communication over STUN.  It's probably not something that's likely to pop up in the real world, so that was a bad test setup.

I still have a problem though.
If I have 2 peers on 2 networks, and both networks are capable of using stun, sometimes they still use turn.
I dug into the logs.  When I’m using turn, if I look at the statistics, there is no googCandidatePair that uses stun for both local & remote.  So, maybe something in the candidate pruning algorithm is pruning the "best" path (where best is the shortest & best priority available path).  Is there any way for me to dig into this more?

I made 2 calls from A to B.  they are on different networks.

call 1 from A to B has these candidate pairs (local/remote):
stun/stun
local/local
stun/relay
relay/relay
relay/stun * 
relay/stun *

In this case, the two pairs marked with an asterisk have the same local candidate & same remote candidate, so these two pairs are not unique.  Is that OK?  That sounds like a bug.
The first one is the active connection, stun/stun.
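One way to confirm a duplicate like this from a saved stats dump is to group the candidate pairs by their local and remote candidate IDs. The sketch below assumes each pair record exposes `localCandidateId` and `remoteCandidateId` fields (as the standard candidate-pair stats do); the record IDs and the `findDuplicatePairs` helper are hypothetical.

```typescript
// Minimal shape of a candidate-pair record; only the fields needed here.
interface PairRecord {
  id: string;
  localCandidateId: string;
  remoteCandidateId: string;
}

// Hypothetical helper: return groups of pair IDs that share the same
// local and remote candidate.
function findDuplicatePairs(pairs: PairRecord[]): string[][] {
  const groups: { [key: string]: string[] } = {};
  for (const pair of pairs) {
    const key = pair.localCandidateId + "|" + pair.remoteCandidateId;
    if (!groups[key]) {
      groups[key] = [];
    }
    groups[key].push(pair.id);
  }
  return Object.keys(groups)
    .map((key) => groups[key])
    .filter((ids) => ids.length > 1);
}

// Illustrative data shaped like call 1: six pairs, two of them relay/stun
// with the same local and remote candidate.
const call1: PairRecord[] = [
  { id: "pair-0", localCandidateId: "L-stun", remoteCandidateId: "R-stun" },
  { id: "pair-1", localCandidateId: "L-local", remoteCandidateId: "R-local" },
  { id: "pair-2", localCandidateId: "L-stun", remoteCandidateId: "R-relay" },
  { id: "pair-3", localCandidateId: "L-relay", remoteCandidateId: "R-relay" },
  { id: "pair-4", localCandidateId: "L-relay", remoteCandidateId: "R-stun" },
  { id: "pair-5", localCandidateId: "L-relay", remoteCandidateId: "R-stun" },
];

const duplicates = findDuplicatePairs(call1);
console.log(duplicates);
```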

call 2 from A to B has these candidate pairs (local/remote):
stun/relay
local/local
local/stun
relay/local
relay/stun
relay/relay

So, since there's no pair that uses stun/stun, then I cannot connect with both sides over stun.
The path that is used is the stun/relay path.  That path isn't as good, so my call isn't as good.

Is there a way to change that?  Is there a way to configure which pairs are used or capable of being used?  In the second call, there is a relay/local pair.  I'm not a routing expert, but it seems that if I were limiting options for candidate pairs, that's the one that should get cut. (I can't imagine that there are many times when A can get to B on a local network, but B needs to use TURN.)
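As background for the question about which pairs are preferred, here is a sketch of the candidate and pair priority formulas from RFC 8445 (sections 5.1.2 and 6.1.2.3), using the RFC's recommended type preferences. This is an assumption about ordering only: a real implementation may use different preference values, and a pair still has to pass its connectivity checks before it can be selected. The helper names are hypothetical, and the labels reuse the "stun"/"relay" naming from the lists above.

```typescript
type CandidateType = "host" | "prflx" | "srflx" | "relay";

// Type preferences recommended by RFC 8445; an implementation may choose others.
const TYPE_PREFERENCE: { [K in CandidateType]: number } = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * type preference + 2^8 * local preference + (256 - component ID)
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1
): number {
  return (
    Math.pow(2, 24) * TYPE_PREFERENCE[type] +
    Math.pow(2, 8) * localPreference +
    (256 - componentId)
  );
}

interface Pair {
  label: string;
  controlling: number; // G: priority of the controlling agent's candidate
  controlled: number; // D: priority of the controlled agent's candidate
}

// Pair priority is 2^32 * min(G, D) + 2 * max(G, D) + (G > D ? 1 : 0).
// Comparing the tuple (min, max, tie-break) gives the same order without
// needing integers wider than 53 bits.
function pairKey(pair: Pair): number[] {
  const g = pair.controlling;
  const d = pair.controlled;
  return [Math.min(g, d), Math.max(g, d), g > d ? 1 : 0];
}

function byDescendingPairPriority(a: Pair, b: Pair): number {
  const ka = pairKey(a);
  const kb = pairKey(b);
  for (let i = 0; i < ka.length; i++) {
    if (ka[i] !== kb[i]) {
      return kb[i] - ka[i];
    }
  }
  return 0;
}

// Assumes the local side is the controlling agent.
function makePair(label: string, local: CandidateType, remote: CandidateType): Pair {
  return {
    label: label,
    controlling: candidatePriority(local),
    controlled: candidatePriority(remote),
  };
}

const ordered = [
  makePair("relay/relay", "relay", "relay"),
  makePair("relay/stun", "relay", "srflx"),
  makePair("stun/relay", "srflx", "relay"),
  makePair("stun/stun", "srflx", "srflx"),
].sort(byDescendingPairPriority);

console.log(ordered.map((pair) => pair.label));
```

Under these assumptions a stun/stun pair sorts ahead of any pair involving a relay candidate, so when it is missing from the list the cause is more likely a failed check than the ordering.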

Is WebRTC limiting me to 6 candidate pairs?  Is that the right thing to do?  Is there any way that I can control the candidate selection or pruning process?

Any help would be greatly appreciated.

Thanks,

Doug

Taylor Brandstetter

May 5, 2017, 2:41:10 PM
to discuss-webrtc
> If both peers are on the same network, it looks like I cannot force communication over STUN.

The reason is likely that your NAT doesn't support hairpinning.

For the rest of your question, can you clarify something: were call 1 and call 2 made under identical circumstances? If so, I don't know why it would settle on "stun/stun" one time and "stun/relay" another time. The fact that there is a "local/stun" pair instead of a "stun/stun" pair means that no binding responses were received for that pair. If a binding response was received, it would turn into a "stun/stun" pair as a result of the server reflexive address in the binding response. I'd suggest collecting a packet capture on both sides to see where the binding request or response gets dropped.

Responding to some other miscellaneous points:

> in the case, the two with the asterisk have the same local candidate & same remote candidate - so these two candidates are not unique.  Is that OK?  that sounds like a bug.

It does sound like a bug. I'd expect to see a "relay/local" pair instead of a second "relay/stun". Can you file a bug, with a native log

> Is WebRTC limiting me to 6 candidate pairs?

There's no such limit. My guess is that you just have 3 candidates on each side (after BUNDLE/rtcp-mux is negotiated), and the local "stun/host" candidates have the same base, so you end up with (3-1)*3=6 pairs. 
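The arithmetic can be sketched as follows, assuming the pruning rule from RFC 8445 section 6.1.2.4: a local server reflexive candidate is replaced by its base before redundant pairs are removed. The addresses are illustrative documentation addresses and the `countPairs` helper is hypothetical; address-family matching and other pairing rules are ignored here.

```typescript
interface LocalCandidate {
  type: "host" | "srflx" | "relay";
  address: string; // the candidate's own transport address
  base: string; // the address packets are actually sent from
}

// Hypothetical helper: count the pairs left after replacing local srflx
// candidates with their base and dropping redundant pairs.
function countPairs(local: LocalCandidate[], remoteAddresses: string[]): number {
  const seen: { [key: string]: boolean } = {};
  let count = 0;
  for (const candidate of local) {
    const sendFrom = candidate.type === "srflx" ? candidate.base : candidate.address;
    for (const remote of remoteAddresses) {
      const key = sendFrom + "->" + remote;
      if (!seen[key]) {
        seen[key] = true;
        count++;
      }
    }
  }
  return count;
}

// Three local candidates; the srflx candidate shares its base with the host one.
const localCandidates: LocalCandidate[] = [
  { type: "host", address: "192.168.1.10:50000", base: "192.168.1.10:50000" },
  { type: "srflx", address: "203.0.113.5:50000", base: "192.168.1.10:50000" },
  { type: "relay", address: "198.51.100.7:60000", base: "198.51.100.7:60000" },
];

// Three remote candidates (host, srflx, relay).
const remoteCandidates: string[] = [
  "10.0.0.20:40000",
  "203.0.113.99:40000",
  "198.51.100.8:61000",
];

const pairCount = countPairs(localCandidates, remoteCandidates);
console.log(pairCount); // (3 - 1) * 3 = 6
```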

--

---
You received this message because you are subscribed to the Google Groups "discuss-webrtc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to discuss-webrtc+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/discuss-webrtc/e22f894b-39a6-4135-abe0-2fd4b783f3c1%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

dmcle...@gmail.com

May 8, 2017, 8:20:36 AM
to discuss-webrtc
I was able to reproduce the problems I'm having with a laptop connected to my iPhone, calling my desktop on our work network.  I just used apprtc.  I made 2 calls in a row.  The candidate pairs don't look the same, and I think that one had the duplicate candidate pair.  I had Chrome debug logging on for both of them, and I saved the "peer connection updated & stats" in both of them.
I think that it's a bug, so I'll create a bug for this.  

Thanks for your help.

Doug