Debugging Audio / Video Sync


qu...@mixer.com

Jan 24, 2018, 3:37:34 AM
to discuss-webrtc

Hello All!


Background:


I work on the mixer.com video team. Mixer is a website that provides low-latency game streaming, similar to YouTube and Twitch. We use WebRTC on all our clients for video playback to keep the latency as low as possible. Recently, we have been having trouble debugging audio/video synchronization on our channels and have some questions.


Context:

Our setup is fairly basic: one RTP video stream and one RTP audio stream. From our current understanding of WebRTC, the RTP timestamps may be seeded with a random value (which we don’t do today), so they can’t and shouldn’t be used to sync the streams. Instead, RTCP sender reports must be sent for each audio and video stream to give an absolute time for the streams to be synced to. We therefore send, alongside the RTP video and audio streams, an RTCP sender report for each stream every 5 seconds.
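
For concreteness, the relationship between the two timestamp fields in each sender report looks roughly like this (a minimal sketch with made-up names, not our actual ingest code):

// The SR pairs the sender's wallclock with the RTP timestamp the
// media clock would show at the same instant (RFC 3550 section 6.4.1).
// clockRate is 90000 for video; 48000 is typical for Opus audio.
function makeSenderReportTimestamps(
  wallclockMs: number,  // sender wallclock "now", ms since epoch
  rtpEpochMs: number,   // wallclock at which rtpOffset was sampled
  rtpOffset: number,    // the stream's (possibly random) RTP offset
  clockRate: number
): { ntpMs: number; rtpTimestamp: number } {
  const elapsedTicks = ((wallclockMs - rtpEpochMs) / 1000) * clockRate;
  return {
    ntpMs: wallclockMs,  // carried as an NTP timestamp on the wire
    // The same instant expressed on the RTP media clock, modulo 2^32.
    rtpTimestamp: (rtpOffset + Math.round(elapsedTicks)) >>> 0,
  };
}

As long as both streams derive their SR pairs from the same wallclock, the receiver can line them up regardless of each stream's RTP offset.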


Problem:

We have been seeing a problem with a/v sync on our live transcodes. Our transcodes produce two RTP streams and two RTCP streams almost identical to the source streams. The only differences are that we change the SSRC on the transcode streams and scale the video. In fact, the audio and RTCP streams are copies of the source streams with only the SSRC updated; the timestamps in both the RTP and RTCP packets are exactly the same as in the source. Our video transcode keeps the input fps, so one frame in results in one frame out.
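
The SSRC rewrite on the copied RTCP packets amounts to something like this (an illustrative sketch, not our actual code, assuming a single sender report at the start of the buffer):

// RFC 3550 section 6.4.1: byte 0 is V/P/RC, byte 1 the packet type
// (200 = SR), bytes 2-3 the length. Bytes 4-7 hold the sender SSRC,
// the only field we change; the NTP and RTP timestamps that follow
// stay byte-for-byte identical to the source.
function rewriteSenderReportSsrc(packet: Uint8Array, newSsrc: number): void {
  const view = new DataView(packet.buffer, packet.byteOffset, packet.byteLength);
  view.setUint32(4, newSsrc >>> 0);
}

(Compound RTCP packets would need per-packet iteration, omitted here.)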


Test Channel: https://mixer.com/quinnbit

(This test channel is running on new test ingest bits that produce the streams described above. Other streams on the website won’t show the same behavior.)


On mixer.com, if you open the test channel above you will see the source stream by default; it should be in sync. If you then select the gear icon in the bottom right of the video, you will see quality options for transcodes. If you switch to a transcode on the test channel, you will see the audio and video fall out of sync. One interesting observation: most of the time when I reproduce the bad a/v sync, it has the same incorrect offset. That is, the audio and video usually fall out of sync by the same amount, with the audio ahead by about 150ms. My guess from that observation is that WebRTC is trying to sync the streams, but is syncing them to an incorrect offset.


Questions:


The first question I have is about our assumption that synchronization in WebRTC is based on RTCP sender report timestamps. Is that correct? That’s what I concluded from looking at the source code, but I can’t find any documentation discussing a/v sync.


My second question is about debugging the incoming RTP streams in WebRTC in general. Since WebRTC traffic is encrypted, it’s hard to verify that the incoming RTP streams are formatted and timestamped correctly. Are there any tricks to make it easier to monitor the incoming RTP streams? Are there events or logs that fire when packets are received or decoded?


Last, if you can identify any issues with the transcode streams that would cause them to be out of sync, please let me know. Beyond that, I would really like to know how we can better debug these a/v sync issues ourselves. I have looked at webrtc-internals, but I can’t find anything that indicates what the WebRTC engine is doing to each stream to try to sync them. I have also looked through the source code, but beyond running a local debug build of Chrome with a debugger attached, I can’t find many good ways to debug what’s going on. I’m wondering if there is any logging I can watch, or data points that are emitted, that indicate how WebRTC is interpreting the incoming packet timestamps and how it’s skewing each stream to synchronize them.


Sorry for the long post - I appreciate any thoughts or help anyone has. 😊

 

Ashik Salim

Jan 24, 2018, 3:58:21 AM
to discuss-webrtc
I didn't exactly understand whether your scenario involves more than one audio/video track, but in case it does: WebRTC currently supports syncing only one audio and one video stream. This was marked as a TODO in the native code the last time I checked (which admittedly was a couple of months ago).

Philipp Hancke

Jan 24, 2018, 4:03:49 AM
to WebRTC-discuss
You might try running with encryption disabled. https://webrtchacks.com/video_replay/ has quite a number of helpful hints regarding that and shows how to get a perfect reproduction dump.


Lorenzo Miniero

Jan 24, 2018, 6:48:37 AM
to discuss-webrtc
Assuming mixer.com still uses Janus (or a forked version of it), you can get an unencrypted dump of the traffic on the server side using a recently merged text2pcap integration:

 
Lorenzo

Philipp Hancke

Jan 24, 2018, 9:31:55 AM
to WebRTC-discuss
https://tokbox.com/blog/lip-sync-issues-when-a-chrome-update-fixes-your-application/ also shows some stats related to this.
The attached graph (internals.png) from webrtc-internals matches the 150ms delay you mention.

I've also seen qpSum dropping to 0 and framesDecoded being reset at very short intervals (see qpsum.png), which suggests the decoder *really* doesn't like what you are doing.
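
You can watch those counters from JavaScript too; here is a rough polling sketch using the standard getStats() API (field names follow the W3C stats spec; older Chrome versions expose mediaType instead of kind):

// Polls inbound video stats once a second and flags decoder resets.
function watchDecoderStats(pc: RTCPeerConnection): void {
  let lastFramesDecoded = 0;
  setInterval(async () => {
    const report = await pc.getStats();
    report.forEach((stats: any) => {
      const kind = stats.kind || stats.mediaType;
      if (stats.type === 'inbound-rtp' && kind === 'video') {
        // framesDecoded is cumulative; a decrease means it was reset.
        if (stats.framesDecoded < lastFramesDecoded) {
          console.warn('framesDecoded reset:', stats.framesDecoded);
        }
        lastFramesDecoded = stats.framesDecoded;
        console.log('framesDecoded', stats.framesDecoded,
                    'qpSum', stats.qpSum);
      }
    });
  }, 1000);
}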



qu...@mixer.com

Jan 31, 2018, 7:31:28 PM
to discuss-webrtc

Thanks for all of the replies. I have gone over all of the info above, but still can’t figure out what my issue is.


I have a more fundamental question about WebRTC: how does it handle stream synchronization? Assuming one RTP video stream, one RTP audio stream, and RTCP SR packets, what does WebRTC use to synchronize the playback of the streams?


I assume it uses the RTCP packets sent alongside the streams to do the sync, but is that true? What would happen if I took two RTP streams with correct RTP timestamps and delayed the video by 200ms? Would WebRTC delay playback of the audio stream to stay in sync with the video stream?


Philipp Hancke

Feb 1, 2018, 2:56:16 AM
to WebRTC-discuss
2018-02-01 1:31 GMT+01:00 <qu...@mixer.com>:

> Thanks for all of the replies. I have gone over all of the info above, but still can’t figure out what my issue is.


> I have a more fundamental question about WebRTC: how does it handle stream synchronization? Assuming one RTP video stream, one RTP audio stream, and RTCP SR packets, what does WebRTC use to synchronize the playback of the streams?


There is a book for that, and it has aged very well over the last 15 years: Colin Perkins, "RTP: Audio and Video for the Internet". Page 155, "Timestamps and the RTP timing model".

There is a simple experiment you can try. Currently your SDP contains something like this:
a=ssrc:207675272 cname:janusaudio
a=ssrc:207675272 msid:janus janusa0
a=ssrc:207675272 mslabel:janus
a=ssrc:207675272 label:janusa0
and
a=ssrc:327120901 cname:janusvideo
a=ssrc:327120901 msid:janus janusv0
a=ssrc:327120901 mslabel:janus
a=ssrc:327120901 label:janusv0
This tells Chrome to put the audio from SSRC 207675272 and the video from SSRC 327120901 into the same media stream with id "janus" and to trigger onaddstream with that (see chrome://webrtc-internals).
If you change the "msid:janus janusa0" (and, sadly, the mslabel:janus in the audio section too) to janusaudio, you will get two onaddstream calls: one with an audio-only stream and a second with a video-only one.
Attach them to separate audio/video elements and check if the problem disappears.
This is not a good thing to do in production, but it lets you figure out whether it is the internal attempt to synchronize that causes the problem.
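
A rough sketch of that wiring (the element ids are made up, and the legacy onaddstream callback is typed loosely because current TypeScript DOM definitions omit it):

const pc = new RTCPeerConnection();

(pc as any).onaddstream = (event: { stream: MediaStream }) => {
  const stream = event.stream;
  // With distinct msids, one stream arrives audio-only and the other
  // video-only, so each can go to its own element, which bypasses
  // Chrome's audio/video synchronization between them.
  const el = stream.getVideoTracks().length > 0
    ? (document.getElementById('video-el') as HTMLVideoElement)
    : (document.getElementById('audio-el') as HTMLAudioElement);
  el.srcObject = stream;
  el.play().catch(console.error);
};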
 

> I assume it uses the RTCP packets sent alongside the streams to do the sync, but is that true? What would happen if I took two RTP streams with correct RTP timestamps and delayed the video by 200ms? Would WebRTC delay playback of the audio stream to stay in sync with the video stream?


I think so. This kind of bug is nasty.

hope that helps

Boris Grozev

Feb 1, 2018, 10:52:46 AM
to discuss...@googlegroups.com, qu...@mixer.com
Hi,

On 31/01/2018 18:31, qu...@mixer.com wrote:
> Thanks for all of the replies. I have gone over all of the info
> above, but still can’t figure out what my issue is.
>
>
> I have a more fundamental question about WebRTC: how does it handle
> stream synchronization? Assuming one RTP video stream, one RTP audio
> stream, and RTCP SR packets, what does WebRTC use to synchronize the
> playback of the streams?
>
>
> I assume it uses the RTCP packets sent alongside the streams to do
> the sync, but is that true?

I think the short answer is "yes". RTCP Sender Reports contain two
timestamp fields[0], which correlate the RTP clock with the sender's
wallclock:

   RTP timestamp: 32 bits
      Corresponds to the same time as the NTP timestamp (above), but in
      the same units and with the same random offset as the RTP
      timestamps in data packets. This correspondence may be used for
      intra- and inter-media synchronization for sources whose NTP
      timestamps are synchronized.
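
As an illustration (the names here are mine, not from the WebRTC source), the mapping a receiver can derive from one SR per stream looks like this:

interface SenderReport {
  ntpMs: number;        // SR NTP timestamp, converted to milliseconds
  rtpTimestamp: number; // SR RTP timestamp for the same instant
}

// clockRate: 90000 for video, typically 48000 for Opus audio.
function rtpToSenderWallclockMs(
  rtp: number, sr: SenderReport, clockRate: number
): number {
  // Ticks elapsed since the SR, converted to ms. A real implementation
  // must also handle 32-bit RTP timestamp wraparound.
  const elapsedMs = ((rtp - sr.rtpTimestamp) / clockRate) * 1000;
  return sr.ntpMs + elapsedMs;
}

Packets from the two streams that map to the same wallclock value should be rendered together, regardless of each stream's random RTP offset. A constant error in either stream's SR shows up as a constant a/v offset, much like the ~150ms you are observing.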


Regards,
Boris

[0] https://tools.ietf.org/html/rfc3550#section-6.4.1