PSA: VP9/AV1 simulcast support in M113

Henrik Boström

Mar 30, 2023, 5:31:26 AM

to discuss-webrtc

TL;DR

You can opt in to VP9/AV1 simulcast by specifying scalabilityMode and scaleResolutionDownBy.
In upcoming milestones we'll experiment with new bitrate allocation strategies for VP9/AV1 simulcast.

PSA

The spec and the implementation support configuring multiple encodings with addTransceiver() or multiple SSRCs in SDP. After negotiation, RTCRtpSender.getParameters() + setParameters() can be used to set which encodings are active or inactive without the need to renegotiate.
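
As a rough illustration of the getParameters() + setParameters() round trip described above, here is a minimal sketch. The helper name setEncodingActive and the local interfaces are made up for this example (they only mirror the fields used here so the snippet also runs outside a browser); the rids are arbitrary.

```typescript
// Local stand-ins for the WebRTC dictionaries; only the fields used below.
interface EncodingLike {
  rid?: string;
  active?: boolean;
}
interface SendParametersLike {
  encodings: EncodingLike[];
}

// Hypothetical helper: marks the encoding with the given rid as active or
// inactive, in place. Returns false if no encoding has that rid.
function setEncodingActive(
  params: SendParametersLike,
  rid: string,
  active: boolean,
): boolean {
  const encoding = params.encodings.find((e) => e.rid === rid);
  if (encoding === undefined) return false;
  encoding.active = active;
  return true;
}

// Browser usage (not executed here), with `sender` being an RTCRtpSender:
//   const params = sender.getParameters();
//   if (setEncodingActive(params, "h", false)) {
//     await sender.setParameters(params); // no renegotiation involved
//   }
```

The object returned by getParameters() is modified in place and handed straight back to setParameters(), which is the usual pattern for this API.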

Multiple encodings have allowed VP8 and H264 to support simulcast for years.

According to the standard, this should work for any codec, including VP9 and AV1. But for historical reasons, and because the API previously had no way to control the number of spatial layers, VP9 and AV1 have been treated differently. Instead of interpreting multiple encodings as multiple encodings, the VP9/AV1 path has interpreted them to mean "multiple spatial layers". For this reason, VP9/AV1 simulcast was not possible.

This is changing in M113, where VP9/AV1 simulcast is now possible: see this demo working on latest Canary.

For backwards-compat reasons, you only get the standard behavior if you specify both a scalabilityMode and a scaleResolutionDownBy value. This is considered the way to "opt-in" to the new standard API path.
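
A minimal sketch of what the opt-in could look like when building sendEncodings. The function names, the rids and the 4/2/1 scale factors are illustrative assumptions, not taken from the demo. The sketch conservatively sets both fields on every encoding, since the text above does not say how a partially specified list is treated.

```typescript
// Local stand-in for the fields of RTCRtpEncodingParameters used here.
interface SimulcastEncoding {
  rid: string;
  scalabilityMode?: string;
  scaleResolutionDownBy?: number;
}

// Hypothetical builder: three simulcast encodings (quarter, half, full
// resolution), all sharing one scalabilityMode.
function buildSimulcastEncodings(scalabilityMode: string): SimulcastEncoding[] {
  return [
    { rid: "q", scalabilityMode, scaleResolutionDownBy: 4 },
    { rid: "h", scalabilityMode, scaleResolutionDownBy: 2 },
    { rid: "f", scalabilityMode, scaleResolutionDownBy: 1 },
  ];
}

// Hypothetical check: does every encoding carry both opt-in fields?
function specifiesBothFields(encodings: SimulcastEncoding[]): boolean {
  return encodings.every(
    (e) => e.scalabilityMode !== undefined && e.scaleResolutionDownBy !== undefined,
  );
}

// Browser usage (not executed here):
//   pc.addTransceiver(track, {
//     direction: "sendonly",
//     sendEncodings: buildSimulcastEncodings("L1T2"),
//   });
```

Selecting VP9 or AV1 as the codec is a separate step and is left out of the sketch.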

If you don't specify these parameters, you still get "legacy SVC mode" when you have multiple encodings (e.g. see this example). The legacy mode can be confirmed by going to chrome://webrtc-internals/, where you see a single "outbound-rtp" with scalabilityMode: L3T3_KEY.

The standard way to achieve L3T3_KEY is to specify it with scalabilityMode and to make any other encoding inactive (3 encoding example, 1 encoding example).
In legacy mode, inactivating an encoding will inactivate a spatial layer. In standard mode, the active parameter refers to the entire encoding. If you want to disable a spatial layer in standard mode, you need to change to a different scalabilityMode/scaleResolutionDownBy for your active encoding.
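
To make the standard-mode recipe concrete, here is a small sketch that puts L3T3_KEY on the first encoding and inactivates the rest. The helper name and the local interface are assumptions for illustration, and this is not the code behind the linked examples. The result could be passed as sendEncodings or applied later through setParameters().

```typescript
// Local stand-in for the fields of RTCRtpEncodingParameters used here.
interface EncodingLike {
  rid?: string;
  active?: boolean;
  scalabilityMode?: string;
  scaleResolutionDownBy?: number;
}

// Hypothetical helper: the first encoding becomes the only active one and
// carries the SVC mode; all other encodings are switched off as a whole.
function useSingleSvcEncoding(
  encodings: EncodingLike[],
  scalabilityMode: string = "L3T3_KEY",
): void {
  encodings.forEach((encoding, index) => {
    if (index === 0) {
      encoding.active = true;
      encoding.scalabilityMode = scalabilityMode;
      // Set alongside scalabilityMode so the standard path is opted in to.
      encoding.scaleResolutionDownBy = 1;
    } else {
      encoding.active = false;
    }
  });
}

// Browser usage (not executed here), with `sender` being an RTCRtpSender:
//   const params = sender.getParameters();
//   useSingleSvcEncoding(params.encodings);
//   await sender.setParameters(params);
```

Dropping a spatial layer in this mode would then mean calling the helper again with a different mode, e.g. one with fewer spatial layers, rather than toggling active.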

Future milestones

Be aware that in upcoming milestones we'll continue to tweak the new VP9/AV1 simulcast paths. Most notably, there is a large difference between the bitrate allocation of simulcast and SVC. We will experiment with making VP9 simulcast bitrates more closely aligned with VP9 SVC bitrates.

Today's implementation supports many different scalabilityMode values when the first encoding is the only active encoding. For multiple active encodings (simulcast), only L1T1, L1T2 and L1T3 are supported and all encodings have to use the same value. Unsupported configurations will trigger fallback to a supported configuration such as L1T2. See getParameters() and getStats() for status.
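
The simulcast restriction above can be checked on the application side before calling setParameters(). The following is only a sketch of that rule as stated in this post; the function name is made up, and it does not try to validate the wider set of modes accepted for a single active encoding.

```typescript
// Local stand-in for the fields of RTCRtpEncodingParameters used here.
interface EncodingLike {
  active?: boolean;
  scalabilityMode?: string;
}

type CheckResult = { ok: true } | { ok: false; reason: string };

// Modes the post lists as supported when several encodings are active.
const SIMULCAST_MODES = new Set(["L1T1", "L1T2", "L1T3"]);

// Hypothetical pre-flight check for the M113 simulcast restriction.
function checkSimulcastConfig(encodings: EncodingLike[]): CheckResult {
  // `active` defaults to true when omitted.
  const active = encodings.filter((e) => e.active !== false);
  if (active.length <= 1) {
    return { ok: true }; // single active encoding: not covered by this check
  }
  const first = active[0]?.scalabilityMode;
  if (first === undefined || !SIMULCAST_MODES.has(first)) {
    return { ok: false, reason: "simulcast needs L1T1, L1T2 or L1T3" };
  }
  if (active.some((e) => e.scalabilityMode !== first)) {
    return { ok: false, reason: "all active encodings must use the same mode" };
  }
  return { ok: true };
}
```

A failed check here corresponds to the configurations that would otherwise trigger the fallback mentioned above.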

- Stay tuned for future PSAs about supporting more combinations.

Lorenzo Miniero

May 10, 2023, 12:38:49 PM

to discuss-webrtc

Hi Henrik,

apologies for the late feedback, I only recently became aware of this effort. I just published a simple patch that makes VP9/AV1 simulcast work with Janus as well, which was quite simple to come up with, so that's good news! Linking the PR here for those who want to give it a try or follow the development:

https://github.com/meetecho/janus-gateway/pull/3218

Your description doesn't mention how SFUs can intercept temporal layers, though. For substreams, rids are what you use to demultiplex traffic, but for temporal layers we're currently manually inspecting the VP8 payload header to figure out which one a packet refers to. Would the same codec-aware manual inspection be needed for VP9 and AV1 too? Or is Chrome currently encoding this information in some RTP extension?

Thanks!
Lorenzo

Philipp Hancke

May 10, 2023, 12:40:14 PM

to discuss...@googlegroups.com

you can use either the AV1 DD or the generic-frame-descriptor, no?

Lorenzo Miniero

May 10, 2023, 12:44:49 PM

to discuss-webrtc

My question was mainly what is being used by Chrome to set that info, at the moment, if anything. I've messed a bit with AV1 DD in the past and pretty much hated it, so I only have partial support :-) I don't have any support for the generic-frame-descriptor at all at the moment, though. Knowing what's being set in Chrome, I'll know what to focus on next.

L.

Philipp Hancke

May 10, 2023, 12:47:02 PM

to discuss...@googlegroups.com

Chrome doesn't offer either of the two by default but servers can still negotiate it (which works at least in Chrome).

GFD is simpler (and if you know DD you will appreciate that) but nonstandard.
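
For anyone wanting to try this, a rough sketch of adding the DD extension to the video section of an SDP blob. The helper name is made up, it only handles the first m=video section, and where the munging happens (server side or client side) is left open. The URI is the one from the AV1 RTP spec; the ID range assumes one-byte header extensions.

```typescript
// Dependency Descriptor header extension URI from the AV1 RTP spec.
const DD_URI =
  "https://aomediacodec.github.io/av1-rtp-spec/#dependency-descriptor-rtp-header-extension";

// Hypothetical helper: appends an a=extmap line for `uri` to the first
// m=video section, using the lowest extension ID not used anywhere in the SDP.
function addVideoExtmap(sdp: string, uri: string): string {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const start = lines.findIndex((line) => line.startsWith("m=video"));
  if (start === -1) return sdp; // no video section, nothing to do

  let end = lines.findIndex((line, i) => i > start && line.startsWith("m="));
  if (end === -1) end = lines.length;

  // Leave the SDP untouched if the extension is already present.
  const section = lines.slice(start, end);
  const present = section.some(
    (line) => line.startsWith("a=extmap:") && line.includes(" " + uri),
  );
  if (present) return sdp;

  // Collect every ID in use so the new one is unique across m-sections.
  const used = new Set<number>();
  for (const line of lines) {
    const match = /^a=extmap:(\d+)/.exec(line);
    if (match) used.add(Number(match[1]));
  }
  let id = 1;
  while (used.has(id)) id += 1;
  if (id > 14) throw new Error("no free one-byte header extension ID");

  lines.splice(end, 0, `a=extmap:${id} ${uri}`);
  return lines.join("\r\n") + "\r\n";
}
```

Dependency descriptors can get larger than a one-byte header extension allows, so a real deployment may also need two-byte extensions negotiated; that is not handled here.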

Lorenzo Miniero

May 10, 2023, 12:51:08 PM

to discuss-webrtc

But is info on temporal layers indeed encoded in those extensions, then, when negotiated? That's what isn't clear to me from the text above.

L.

Philipp Hancke

May 10, 2023, 1:31:11 PM

to discuss...@googlegroups.com

the extensions are based on the internal data structures (which are extracted from the codec) so should contain the right data.

Looking only at VP9, it might be more pragmatic to just extract the bits you need from the payload descriptor and make the payload descriptor parsing codec-specific.
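
For reference, a minimal sketch of pulling the layer indices out of a VP9 RTP payload descriptor, following the bit layout of the VP9 RTP payload format (first octet I|P|L|F|B|E|V|Z, optional picture ID, then the TID|U|SID|D octet). The function and type names are made up, and this is not how Janus or Chrome does it.

```typescript
interface Vp9LayerInfo {
  temporalId: number; // TID, 3 bits
  switchingUpPoint: boolean; // U bit
  spatialId: number; // SID, 3 bits
  interLayerDependency: boolean; // D bit
}

// Hypothetical parser: takes the RTP payload (RTP header already stripped)
// and returns the layer indices, or null if they are absent or truncated.
function parseVp9LayerInfo(payload: Uint8Array): Vp9LayerInfo | null {
  const first = payload[0];
  if (first === undefined) return null;

  const hasPictureId = (first & 0x80) !== 0; // I bit
  const hasLayerIndices = (first & 0x20) !== 0; // L bit
  if (!hasLayerIndices) return null;

  let offset = 1;
  if (hasPictureId) {
    const pid = payload[offset];
    if (pid === undefined) return null;
    // M bit set: picture ID is 15 bits and spans two octets.
    offset += (pid & 0x80) !== 0 ? 2 : 1;
  }

  const layer = payload[offset];
  if (layer === undefined) return null;
  return {
    temporalId: layer >> 5,
    switchingUpPoint: (layer & 0x10) !== 0,
    spatialId: (layer >> 1) & 0x07,
    interLayerDependency: (layer & 0x01) !== 0,
  };
}
```

With L1Tx simulcast the spatial ID should always be 0, so an SFU would only need temporalId from this to decide whether to forward or drop a packet.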

Lorenzo Miniero

May 10, 2023, 1:38:47 PM

to discuss-webrtc

Thanks, I'll tinker with that tomorrow!

L.

Lorenzo Miniero

May 11, 2023, 2:02:28 PM

to discuss-webrtc

Just as an update, forcing a negotiation of the DD extension did do the trick for me: to keep things simple I did the payload parsing for VP9 (since I already had code for that for VP9-SVC), and used the DD just for AV1 simulcast, and that seemed to work nicely. Thanks for the tip!

L.
