Intent to Prototype: MediaStreamTrack Insertable Streams (a.k.a. Breakout Box)

Guido Urdaneta

Nov 8, 2020, 3:03:59 AM

to blink-dev

API spec

Yes

Summary

This feature defines an API surface for manipulating raw media carried by MediaStreamTracks, such as the output of a camera, microphone, screen capture, or the decoder part of a codec, and the input to the encoder part of a codec. It uses WebCodecs interfaces to represent raw media frames and exposes them using streams, similarly to the way the WebRTC Insertable Streams spec exposes encoded data from RTCPeerConnections.
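To make the "frames exposed as streams" idea concrete, here is a minimal sketch of the processing pattern, written against the standard Streams API only. Everything named here is an assumption for illustration: `RawFrame` is a plain-object stand-in for a WebCodecs frame, `fakeTrackFrames` is a hypothetical source standing in for the readable side of a track, and `invertColors` is an arbitrary example effect. It is not the API proposed in this intent.

```typescript
// Stand-in for a raw video frame. The proposal uses WebCodecs frame
// objects; this plain shape is an assumption so the sketch runs anywhere
// the Streams API is available.
interface RawFrame {
  timestamp: number; // presentation time in microseconds
  width: number;
  height: number;
  data: Uint8ClampedArray; // RGBA pixels, 4 bytes per pixel
}

// Example effect: invert the color channels and keep alpha unchanged.
function invertColors(frame: RawFrame): RawFrame {
  const out = new Uint8ClampedArray(frame.data.length);
  for (let i = 0; i < frame.data.length; i += 4) {
    out[i] = 255 - frame.data[i];
    out[i + 1] = 255 - frame.data[i + 1];
    out[i + 2] = 255 - frame.data[i + 2];
    out[i + 3] = frame.data[i + 3];
  }
  return { ...frame, data: out };
}

// Hypothetical source: emits `count` one-pixel frames about 33 ms apart.
// In a browser, the readable side of a track would play this role.
function fakeTrackFrames(count: number): ReadableStream<RawFrame> {
  let n = 0;
  return new ReadableStream<RawFrame>({
    pull(controller) {
      if (n >= count) {
        controller.close();
        return;
      }
      controller.enqueue({
        timestamp: n * 33333,
        width: 1,
        height: 1,
        data: new Uint8ClampedArray([10, 20, 30, 255]),
      });
      n += 1;
    },
  });
}

// Pipe frames through the effect and collect what arrives at the sink.
// In a browser, the sink would be the writable side of an output track.
async function processFrames(
  source: ReadableStream<RawFrame>,
): Promise<RawFrame[]> {
  const effect = new TransformStream<RawFrame, RawFrame>({
    transform(frame, controller) {
      controller.enqueue(invertColors(frame));
    },
  });
  const received: RawFrame[] = [];
  const sink = new WritableStream<RawFrame>({
    write(frame) {
      received.push(frame);
    },
  });
  await source.pipeThrough(effect).pipeTo(sink);
  return received;
}
```

The per-frame work lives in a `TransformStream`, so backpressure and teardown are handled by the stream machinery rather than by hand-written callbacks. This is the same shape the WebRTC Insertable Streams spec uses for encoded data.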

Blink component

Blink>MediaStream

Motivation

The motivation for this feature is to support some of the use cases described in WebRTC Next Version Use Cases in a more ergonomic or efficient way than existing approaches. More specifically:
* Funny Hats: Refers to manipulation of media to provide effects such as background removal, funny hats, echo detection, voice effects.
* Machine Learning: Refers to applications such as real-time object identification/annotation.

Initial public proposal

https://youtu.be/gZsZIwfvw28?t=9

TAG review

None

TAG review status

Pending

Risks

Interoperability and Compatibility

As with all new features, the main interoperability risk is that other browsers do not implement it. There is no compatibility risk since the existing behavior is unaffected if the feature is not used.

Gecko: No signal

WebKit: No signal

Web developers: Positive

Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes

Is this feature fully tested by web-platform-tests?

Not yet, but WPT tests will be added during development.

Tracking bug

https://crbug.com/1142955

Launch bug

https://crbug.com/1146805

Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5499415634640896

This intent message was generated by Chrome Platform Status.

Arthur Sonzogni

Nov 10, 2020, 11:33:51 AM

to blink-dev, Guido Urdaneta, Martin Šrámek, Sean Harrison, Aaron Tagliaboschi

Hello Guido,

We've discussed your intent within security + privacy teams.

We would be happy to see the explainer and the specification completed.

Moreover, adding a security and privacy considerations section would be helpful. For instance:

How does this deal with same-origin policy exactly? Can this be used to transform a cross-origin video stream?
Is there a chance of data exfiltration, GPU profiling, learning which codecs are supported, etc.?
Can this be used to potentially exploit existing encoder/decoder vulnerabilities more easily? Is it a problem?
Is there a use case for supporting this with screen capture?

Thanks,
Arthur

Harald Alvestrand

Nov 25, 2020, 5:44:27 AM

to Arthur Sonzogni, blink-dev, Guido Urdaneta, Martin Šrámek, Sean Harrison, Aaron Tagliaboschi

Thanks for the questions, and apologies for the late response!

On Tue, Nov 10, 2020 at 5:34 PM Arthur Sonzogni <arthurs...@chromium.org> wrote:

Hello Guido,

We've discussed your intent within security + privacy teams.
We would be happy to see the explainer and the specification completed.
Moreover, adding a security and privacy considerations section would be helpful. For instance:
How does this deal with same-origin policy exactly? Can this be used to transform a cross-origin video stream?

At the moment, MediaStreamTracks aren't transferable between origins, so cross-origin MediaStreamTracks don't exist.

I believe (but need to check) that VideoFrames are origin-tagged, just like Canvas, so that if one transfers them to a different origin, they become displayable but not accessible.

Is there a chance of data exfiltration, GPU profiling, learning which codecs are supported, etc.?

Yes, GPU profiling is possible, since some of the functions on VideoFrame (especially format conversion) will probably be implemented on the GPU. But it should be a strict subset of what's available under WebGL/WebGPU, so shouldn't give new information.
The VideoFrame issues are not new with MST insertable streams; they are also present in WebCodecs (currently in origin trial).

Can this be used to potentially exploit existing encoder/decoder vulnerabilities more easily? Is it a problem?

No, this interface doesn't touch codecs directly. The possible vulnerabilities (encoding "peculiar" pictures), if they exist, are already exploitable through capturing from a Canvas.

Is there a use case for supporting this with screen capture?

A MediaStreamTrack produced from screen capture will be able to be processed using this API, but I don't believe there will be any new vulnerabilities compared to painting the same track on a canvas through a video element.

Hope this helps! I'll try to get the same answers into a security and privacy considerations section Real Soon Now.


Arthur Sonzogni

Nov 25, 2020, 10:44:11 AM

to Harald Alvestrand, Arthur Sonzogni, blink-dev, Guido Urdaneta, Martin Šrámek, Sean Harrison, Aaron Tagliaboschi

Hope this helps! I'll try to get the same answers into a security and privacy considerations section Real Soon Now. ~[Harald Alvestrand]

It helps, thank you!

It is very useful to explicitly write down all the security and privacy considerations that have been taken into account.

How does this deal with same-origin policy exactly? Can this be used to transform a cross-origin video stream?
At the moment, MediaStreamTracks aren't transferable between origins, so cross-origin MediaStreamTracks don't exist.

I believe (but need to check) that VideoFrames are origin-tagged, just like Canvas, so that if one transfers them to a different origin, they become displayable but not accessible. ~[Harald Alvestrand]

In a non-COEP context, you can have cross-origin <img> or <video> elements that haven't explicitly opted into being embedded (via CORS/CORP). They are displayable, but should remain inaccessible from the document. You should just make sure this feature won't open a new security hole. If video.captureStream() refuses to return the MediaStream, then BreakoutBox can't be used. So this sounds good a priori.

Is there a use case for supporting this with screen capture?

A MediaStreamTrack produced from screen capture will be able to be processed using this API, but I don't believe there will be any new vulnerabilities compared to painting the same track on a canvas through a video element. ~[Harald Alvestrand]

The first section in the document "How To Do Chrome Security Reviews" is "Does anybody need this feature?", so I had to ask the question.

The interaction between ScreenCapture and BreakoutBox looks like just a side effect; I don't think there is any use case. Since you can do (ScreenCapture -> Canvas and then Canvas -> XXX), there would be little point in my requesting that BreakoutBox not be supported for (ScreenCapture -> Canvas) while we are going to support (Canvas -> XXX). Thanks!

----

About the privacy-related response, thanks! This will be useful for the privacy reviewers.
