DTLS handshake questions


Jeremy Noring

Feb 12, 2015, 2:27:55 PM
to discuss...@googlegroups.com
I'm working with licode, which has its own code for handling a DTLS handshake built on top of libsrtp, libnice and openssl.  From what I can see by reading code, logs and wireshark, here's what the process looks like:
  1. Licode receives an event that it has a valid pair from libnice (i.e. ICE negotiation was successful).  It bubbles this up and initiates the DTLS handshake.
  2. Licode starts by sending "Client Hello" to the webrtc client
  3. Webrtc client responds with "Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done"
  4. Licode responds with "Certificate, Client Key Exchange, Certificate Verify, Change Cipher Spec, Encrypted Handshake Message"
  5. Webrtc client responds with "New Session Ticket, Change Cipher Spec, Encrypted Handshake Message"
  6. At this point, they freely send/receive media over the line.
Does this approach generally seem correct?

Questions:
  1. Licode resends the current frame, but only once, one second later.  It seems like the spec calls for an exponential backoff here?
  2. I've noticed that if I drop packets during this process, sometimes the webrtc client will re-send stuff to licode... and sometimes it won't.  Does it generally follow the state machine in section 4.2.4 of RFC 4347?  It seems terribly easy to break the DTLS handshake when randomly dropping sent/received data, and I'm not sure if something is simply wrong in licode.
  3. Why does licode initiate with a Client Hello?  When two WebRTC clients connect, is there a convention for who sends the "Client Hello"?
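For question 1, here is a minimal sketch of the retransmit timer that RFC 6347 section 4.2.4.1 recommends: start at 1 second, double on every retransmission, and cap at 60 seconds. The class and method names (`RetransmitTimer`, `currentSeconds`, `onTimeout`, `reset`) are hypothetical and are not licode or OpenSSL APIs.

```cpp
#include <algorithm>

// Hypothetical helper, not part of licode: tracks the DTLS retransmit timeout.
class RetransmitTimer {
 public:
  // Seconds to wait before the next retransmission of the current flight.
  int currentSeconds() const { return timeout_s_; }

  // Call when the timer fires without a response: double the wait, capped at 60 s.
  void onTimeout() { timeout_s_ = std::min(timeout_s_ * 2, 60); }

  // Call when a new flight is sent: start again from the 1 s initial value.
  void reset() { timeout_s_ = 1; }

 private:
  int timeout_s_ = 1;
};
```

Calling `onTimeout()` after each unanswered retransmission gives waits of 1, 2, 4, 8, 16, 32, 60, 60, ... seconds, rather than a single retry after one second.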
Most of the relevant code in licode can be found in https://github.com/ging/licode/blob/master/erizo/src/erizo/DtlsTransport.cpp and https://github.com/ging/licode/tree/master/erizo/src/erizo/dtls - any advice is much appreciated as always.

-Jeremy

Jeremy Noring

Feb 12, 2015, 3:01:18 PM
to discuss...@googlegroups.com
One other thing I'm noticing: I've seen webrtc send a single packet containing "Server Hello, Certificate, Server Key Exchange, Certificate Request, Server Hello Done", and also send each of those in its own packet (this only seems to happen in the event of send/receive failure).  Any idea what's going on here?

Justin Uberti

Feb 18, 2015, 7:57:59 PM
to discuss-webrtc
On Thursday, February 12, 2015 at 12:27:55 PM UTC-7, Jeremy Noring wrote:
  3. Why does licode initiate with a client hello?  When two WebRTC clients connect, is there a convention for who sends "client hello?"

The a=setup attribute in SDP governs this.
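As a rough illustration of that rule: under RFC 5763 (which reuses the `a=setup` values from RFC 4145), the endpoint whose negotiated setup value is `active` sends the ClientHello, the `passive` endpoint waits for it, and `actpass` in an offer leaves the choice to the answerer. The enum and function names below are hypothetical, not WebRTC or licode APIs.

```cpp
#include <string>

// Hypothetical types for illustration only.
enum class DtlsRole { Client, Server, Undecided };

// Maps the a=setup value this endpoint ended up with after offer/answer
// to the DTLS role it should take.
DtlsRole roleFromLocalSetup(const std::string& setup) {
  if (setup == "active") return DtlsRole::Client;   // we send the ClientHello
  if (setup == "passive") return DtlsRole::Server;  // we wait for the ClientHello
  // "actpass" (valid in an offer only) or anything else does not fix a role yet.
  return DtlsRole::Undecided;
}
```

Under this reading, an endpoint that opens with a ClientHello is the one that ended up with `a=setup:active` in the negotiated SDP.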
 

Jeremy Noring

Feb 19, 2015, 10:41:51 AM
to discuss...@googlegroups.com
On Wednesday, February 18, 2015 at 5:57:59 PM UTC-7, Justin Uberti wrote:
  1. Why does licode initiate with a client hello?  When two WebRTC clients connect, is there a convention for who sends "client hello?"

The a=setup attribute in SDP governs this.

Thanks for the clarification, Justin.  I'll read up.

Jeremy Noring

Mar 5, 2015, 6:34:42 PM
to discuss...@googlegroups.com
So after doing a bunch of debugging and messing around, I think there may be a bug in WebRTC's DTLS implementation that causes a loss of the last flight to break the DTLS handshake.

In licode, I added code that selectively drops in-bound packets, except for a packet of length 746, which on my system always corresponds with WebRTC's final flight sent to licode:

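// Test-only counter of inbound DTLS packets, used to simulate packet loss.
int recvDataCount = 0;

void DtlsTransport::onNiceData(unsigned int component_id, char* data, int len, NiceConnection* nice) {
  int length = len;
  SrtpChannel *srtp = srtp_.get();
  if (DtlsTransport::isDtlsPacket(data, len)) {
    ELOG_DEBUG("%s - Received DTLS message from %u, len %d", transport_name.c_str(), component_id, len);

    // Drop every second DTLS packet (~50% loss), but always let the
    // 746-byte packet (WebRTC's final flight on my system) through.
    recvDataCount++;
    if (recvDataCount % 2 == 0 && len != 746) {
      return;
    }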

...this is 100% reliable for me with licode despite ~50% packet loss during DTLS transmission.   WebRTC retransmits flights that are lost (sometimes breaking them apart per 4.1.1.1 of RFC 6347, "...if repeated retransmissions do not result in a response, and the PMTU is unknown, subsequent retransmissions SHOULD back off to a smaller record size, fragmenting the handshake message as appropriate.").  
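To illustrate what fragmenting a handshake message involves: each DTLS handshake fragment repeats the length of the whole message and carries its own fragment offset and fragment length, so the receiver can reassemble it. Below is a minimal sketch of how a sender could compute those ranges for a given maximum fragment size. `Fragment` and `splitHandshakeMessage` are hypothetical names; real stacks such as OpenSSL handle this internally.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical description of one handshake fragment (header fields only).
struct Fragment {
  std::size_t message_length;   // length of the whole handshake message
  std::size_t fragment_offset;  // where this fragment starts in the message
  std::size_t fragment_length;  // number of body bytes in this fragment
};

// Splits a handshake message body of message_length bytes into fragments
// of at most max_fragment bytes each.
std::vector<Fragment> splitHandshakeMessage(std::size_t message_length,
                                            std::size_t max_fragment) {
  std::vector<Fragment> fragments;
  if (max_fragment == 0) return fragments;  // no valid fragment size
  std::size_t offset = 0;
  do {
    std::size_t len = std::min(max_fragment, message_length - offset);
    fragments.push_back({message_length, offset, len});
    offset += len;
  } while (offset < message_length);
  return fragments;
}
```

For example, a 700-byte message with a 300-byte limit yields three fragments at offsets 0, 300 and 600, with lengths 300, 300 and 100.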

However, if I drop that final flight (the 746-byte flight), which contains "New Session Ticket, Change Cipher Spec, Encrypted Handshake Message", things never recover:

2015-03-05 15:53:04,090  - INFO: DtlsTransport - video - DTLSRTP Start
2015-03-05 15:53:04,090  - ERROR: dtls.DtlsSocketContext - start
2015-03-05 15:53:04,090  - INFO: dtls.DtlsSocket - mOutBio Pending: 214 and read: 214
2015-03-05 15:53:04,090  - DEBUG: DtlsTransport - video - Sending DTLS message to 1, len: 214
2015-03-05 15:53:04,091  - DEBUG: NiceConnection - video - NICE Component State Changed 1 - 3
2015-03-05 15:53:04,092  - DEBUG: DtlsTransport - video - Received DTLS message from 1, len 812
2015-03-05 15:53:04,092  - INFO: dtls.DtlsSocket - mInBio bytes to write: 812 written: 812
2015-03-05 15:53:04,092  - INFO: dtls.DtlsSocket - doing handshake!
2015-03-05 15:53:04,095  - INFO: dtls.DtlsSocket - mOutBio Pending: 831 and read: 831
2015-03-05 15:53:04,095  - INFO: Resender - Resender destructor
2015-03-05 15:53:04,095  - DEBUG: DtlsTransport - video - Sending DTLS message to 1, len: 831
2015-03-05 15:53:04,096  - DEBUG: DtlsTransport - video - Received DTLS message from 1, len 746 // THIS GETS DROPPED ON THE FLOOR
2015-03-05 15:53:05,095  - WARN: Resender - video - Resending DTLS message to 1, len: 831
2015-03-05 15:53:07,096  - WARN: Resender - video - Resending DTLS message to 1, len: 831
2015-03-05 15:53:11,096  - WARN: Resender - video - Resending DTLS message to 1, len: 831
2015-03-05 15:53:15,097  - WARN: Resender - video - Resending DTLS message to 1, len: 831
2015-03-05 15:53:19,097  - WARN: Resender - video - Resending DTLS message to 1, len: 831

...licode is attempting to resend its final flight ("Certificate, Client Key Exchange, Certificate Verify, Change Cipher Spec, Encrypted Handshake Message"), and client-side I see no retransmissions.  Per section 4.2.4 of RFC 6347, shouldn't WebRTC be retransmitting its final flight when the other side retransmits?  It's entirely possible licode is borked, so if I'm in crazy land and you guys have clear tests that show this works, I'd accept that as a suitable response (it seems unlikely to me that WebRTC is broken in this way).
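For reference, the behavior being asked about can be sketched as follows. RFC 6347 section 4.2.4 says that the side that sent the last flight must, while in the FINISHED state, answer a retransmission of the peer's last flight by retransmitting its own last flight. The types below are hypothetical and only model that reaction to a peer retransmit; they say nothing about how WebRTC or OpenSSL actually implement it.

```cpp
// Hypothetical model of the RFC 6347 section 4.2.4 handshake states.
enum class HandshakeState { Preparing, Sending, Waiting, Finished };
enum class Action { None, RetransmitLastFlight };

struct Endpoint {
  HandshakeState state;
  bool sent_last_flight;  // true for the side whose flight ends the handshake
};

// Decides what to do when a retransmitted copy of the peer's flight arrives.
Action onPeerRetransmit(const Endpoint& ep) {
  // While waiting for the next flight, a peer retransmit means our own
  // flight was probably lost, so send it again.
  if (ep.state == HandshakeState::Waiting) return Action::RetransmitLastFlight;
  // After finishing, the sender of the final flight must still answer a
  // peer retransmit, because the peer evidently never received that flight.
  if (ep.state == HandshakeState::Finished && ep.sent_last_flight) {
    return Action::RetransmitLastFlight;
  }
  return Action::None;
}
```

In the scenario above, the browser would be the endpoint in `Finished` with `sent_last_flight` set, so this model expects it to resend the 746-byte flight each time licode's 831-byte flight arrives again.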

I checked, and Chrome 41 and Canary both have the same behavior in this regard.

Any advice is very appreciated.

pablo platt

Mar 5, 2015, 6:51:17 PM
to discuss...@googlegroups.com

--

Jeremy Noring

Mar 6, 2015, 10:24:47 AM
to discuss...@googlegroups.com
No, I don't think that's the same issue.  None of the packets being exchanged in my DTLS handshake are close to the MTU size (the largest is about 850 bytes).  My issue crops up when the final flight is dropped: licode resends, and webrtc is unresponsive to the resend.

Jeremy Noring

Mar 9, 2015, 10:56:50 AM
to discuss...@googlegroups.com
Filed here: https://code.google.com/p/webrtc/issues/detail?id=4403 - thanks for everyone's help.

Alvaro Gil

Oct 27, 2015, 1:16:20 AM
to discuss-webrtc
Hi Jeremy,

I am experiencing an issue with Licode as well that sounds familiar.
I am getting "DTLS timeout expired" and the handshake fails.

Did you solve your issue?

Jeremy Noring

Oct 27, 2015, 11:30:03 AM
to discuss-webrtc
I did, however it took a pretty significant re-architecting of licode's DTLS handling.  I didn't submit those changes back to the project because our repo of licode has diverged far enough from ging/licode that it'd be hard to do so.

You're a maintainer on licode, aren't you? 

Alvaro Gil

Oct 27, 2015, 11:38:05 AM
to discuss-webrtc
Jeremy,

No, I am not a maintainer of Licode, I am just a developer using it. I've written an iOS client[0] for it, and now I am having a hard time finding a bug that only appears when the build includes arm64 and is built against webrtc/release, and that only happens on some devices; the emulator, for example, works fine.

It throws a "DTLS timeout expired" error, the handshake fails, and there is no connection. I thought it could be related.


Thanks!

Jeremy Noring

Oct 27, 2015, 1:27:14 PM
to discuss-webrtc
It's your version of WebRTC:

target "ECIExampleLicode" do
platform :ios, :deployment_target => "8.4"
xcodeproj "ECIExampleLicode/ECIExampleLicode"
pod "libjingle_peerconnection", "9814.2.0"
end

...there's something wrong with 9814 where it consistently fails a DTLS handshake.  I don't know why this is, or why the pristine io project chose this version as a "good" version to supply for a cocoapod.  I'd swap it over to use WebRTC 45 and see if that works any better.

Alvaro Gil

Oct 27, 2015, 2:42:39 PM
to discuss-webrtc
I wish it were 9814. I've tested compiling with webrtc-build-scripts at revision 10410 and the same thing happens. It might be a bug in my code, but what gets my attention is that it only happens in the Release build. In the past I've fixed Release issues, with threading and so on, just by running the app in Debug and reading the logs, but not this time.

Jeremy, are you building with the official build or with webrtc-build-scripts?


--
Alvaro

Alvaro Gil

Oct 30, 2015, 10:16:38 AM
to discuss-webrtc
So I was able to make it work with the latest revision of WebRTC. BTW, how can I tell which revision it is?
--
Alvaro