Intent to Extend Experiment: MediaStreamTrack Insertable Streams (a.k.a. Breakout Box)


Tony Price

Jun 16, 2021, 10:29:47 AM
to blin...@chromium.org

Contact emails

gui...@chromium.org, h...@chromium.org, top...@chromium.org


Explainer

https://github.com/w3c/mediacapture-transform/blob/main/explainer.md


Specification

https://w3c.github.io/mediacapture-transform/


API spec

Yes


Design docs


https://w3c.github.io/mediacapture-insertable-streams/

https://github.com/w3c/mediacapture-insertable-streams/blob/main/explainer.md


Summary

This feature defines an API surface for manipulating raw media carried by MediaStreamTracks, such as the output of a camera, microphone, screen capture, or the decoder part of a codec, and the input to the encoder part of a codec. It uses WebCodecs interfaces to represent raw media frames and exposes them using streams, similar to the way the WebRTC Insertable Streams spec exposes encoded data from RTCPeerConnections.
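
To make the summary concrete, here is a minimal TypeScript sketch of the processing pattern, assuming frames flow from a readable stream, through a per-frame function, into a writable stream. The names FrameLike, frameTransformer, and connect are hypothetical helpers for illustration and are not part of the API. The constructor shapes in the trailing comment follow the specification's dictionary form and may differ between origin trial milestones.

```typescript
// Minimal stand-in for the WebCodecs frame members used below (assumption:
// real code would use VideoFrame or the audio equivalent).
interface FrameLike {
  readonly timestamp: number;
  close(): void;
}

// Wraps a per-frame function in a TransformStream. The input frame is
// closed after processing unless the function returned it unchanged.
function frameTransformer<F extends FrameLike>(
  process: (frame: F) => F,
): TransformStream<F, F> {
  return new TransformStream<F, F>({
    transform(frame, controller) {
      const out = process(frame);
      if (out !== frame) frame.close(); // release the original frame
      controller.enqueue(out);
    },
  });
}

// Connects a source of frames to a sink through the transformer.
function connect<F extends FrameLike>(
  readable: ReadableStream<F>,
  writable: WritableStream<F>,
  process: (frame: F) => F,
): Promise<void> {
  return readable.pipeThrough(frameTransformer(process)).pipeTo(writable);
}

// In a browser with the feature enabled, the streams would come from the API
// (myEffect and videoElement are placeholders):
//   const processor = new MediaStreamTrackProcessor({ track });
//   const generator = new MediaStreamTrackGenerator({ kind: 'video' });
//   connect(processor.readable, generator.writable, myEffect);
//   videoElement.srcObject = new MediaStream([generator]);
```

Closing each consumed frame matters because raw frames can hold scarce capture or decoder buffers; the helper closes the input only when the function returns a different frame.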



Blink component

Blink>MediaStream


TAG review

https://github.com/w3ctag/design-reviews/issues/603


TAG review status

Positive initial response. Pending approval.


Risks



Interoperability and Compatibility

As with all new features, the main interoperability risk is that other browsers do not implement it. There is no compatibility risk since the existing behavior is unaffected if the feature is not used.



Gecko: Negative. Their main objections are audio support and allowing processing on the Window scope. See:

https://github.com/w3c/mediacapture-transform/issues/38


WebKit: Negative. They object to using the streams API, supporting audio and allowing processing on the Window scope. See:

https://github.com/w3c/mediacapture-transform/issues/31

https://github.com/w3c/mediacapture-transform/issues/29

https://github.com/w3c/mediacapture-transform/issues/23

https://github.com/w3c/mediacapture-transform/issues/4


Web developers: Positive so far, but more feedback being collected.


Ergonomics

* Are there any other platform APIs this feature will frequently be used in tandem with?

This API will be used together with other MediaStream and WebRTC related APIs, such as getUserMedia, getDisplayMedia, and RTCPeerConnection.



* Could the default usage of this API make it hard for Chrome to maintain good performance (i.e. synchronous return, must run on a certain thread, guaranteed return timing)? 

No.



Activation

* Will it be challenging for developers to take advantage of this feature immediately, as-is? 

No.


* Would this feature benefit from having polyfills, significant documentation and outreach, and/or libraries built on top of it to make it easier to use?

No.



Security

This feature introduces a new MediaStreamTrack sink and source.

The sink (MediaStreamTrackProcessor) exposes data that is already exposed in Chrome by other sinks (e.g., media elements, Web Audio, peer connections). In this sense, it does not introduce new attack surfaces other than the API implementation itself. Security relies on existing security mechanisms for MediaStreamTrack sources. For example, all camera or screen-capture tracks require user authorization, and element capture does not allow access to cross-origin data.

The source (MediaStreamTrackGenerator) allows users to create new tracks using data already available to the Web page. In this case, security relies on the same-origin policy, as the document can only create media frames using data that is already visible.

All the code for this feature runs in the sandboxed renderer process and therefore respects the Rule of Two.



Goals for experimentation

We are looking for feedback on the API shape and its performance, especially for the video processing use case. We also want to learn whether the WebCodecs interfaces for raw data need adjustment to better support the use cases enabled by this API.


Since this API can be seen as an extension to WebCodecs, we are reusing the WebCodecs Origin Trial token in order to make it easier for existing participants in the WebCodecs trial to use this API.
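
Because the API is only exposed when the origin trial token (or the flag) is in effect, pages would typically check for it before use. A minimal sketch, assuming the two interface names are exposed as globals on the current scope; supportsBreakoutBox is a hypothetical helper name:

```typescript
// Returns true when both interfaces from this feature are present on the
// given scope (defaults to the current global scope).
function supportsBreakoutBox(scope: object = globalThis): boolean {
  return (
    'MediaStreamTrackProcessor' in scope &&
    'MediaStreamTrackGenerator' in scope
  );
}

// Example: fall back to an existing video element + canvas path when absent.
const pipelineKind = supportsBreakoutBox() ? 'insertable-streams' : 'canvas-fallback';
```

Origin trial tokens are normally supplied through an `Origin-Trial` HTTP response header or a `<meta http-equiv="origin-trial">` tag, so the check above reflects whether a valid token was accepted for the page.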



Reason this experiment is being extended

This feature depends on WebCodecs, which itself has had experimentation extended (see the WebCodecs Intent to Extend Origin Trial). The trial extension will also provide more time to improve our implementation and seek cross-UA consensus. This feature is already controlled by the WebCodecs origin trial.


Experimental timeline

This experiment is controlled by the WebCodecs origin trial. It started in M90 and was initially expected to conclude with the release of M92 stable. We are now requesting to continue using the WebCodecs trial for the two additional milestones (M93 and M94) for which that trial will run, concluding with the release of M94 stable.


Ongoing technical constraints

None.



Will this feature be supported on all six Blink platforms (Windows, Mac, Linux, Chrome OS, Android, and Android WebView)?

Yes


Is this feature fully tested by web-platform-tests?

There are WPT tests covering the core of this feature and additional tests are planned. Existing tests can be found at:

https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/web_tests/external/wpt/mediacapture-insertable-streams/

https://source.chromium.org/chromium/chromium/src/+/main:third_party/blink/web_tests/wpt_internal/mediastream/


Flag name

MediaStreamInsertableStreams


Tracking bug

https://crbug.com/1142955


Launch bug

https://crbug.com/1146805


Link to entry on the Chrome Platform Status

https://chromestatus.com/feature/5499415634640896


Links to previous Intent discussions

Intent to Prototype: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/gkBkLvGKuzc

Intent to Experiment: https://groups.google.com/a/chromium.org/g/blink-dev/c/fyJfqEwP1FY



This intent message was generated by Chrome Platform Status.

Thomas Steiner

Jun 16, 2021, 11:35:58 AM
to Tony Price, blink-dev


--
Thomas Steiner, PhD—Developer Advocate (https://blog.tomayac.com, https://twitter.com/tomayac)


Yoav Weiss

Jun 17, 2021, 12:53:20 AM
to Thomas Steiner, Tony Price, blink-dev
This is concerning. It seems like both Mozilla and Apple are engaged yet object to some aspects of the current design. How likely are we to reach consensus here?
Do we have concrete users and concrete examples for the contention points? (Window scope, audio processing, etc)
 

Web developers: Positive so far, but more feedback being collected.



Any signals from current OT participants?
 

Guido Urdaneta

Jun 17, 2021, 7:39:00 AM
to Yoav Weiss, Thomas Steiner, Tony Price, blink-dev
I believe consensus is unlikely.

 
Do we have concrete users and concrete examples for the contention points? (Window scope, audio processing, etc)
 
Yes, we do.
At least one OT participant has expressed that supporting the Window scope is important for them. This use case is not a large-scale videoconferencing (VC) service, but a production application for evaluating video effects together with WebCodecs. According to the participant, simplicity is very important in this use case, and workers bring no benefit whatsoever but instead add complexity. They note that there is some friction in using workers together with TypeScript and Webpack.
Another important use case is migration of existing large codebases using video element+canvas capture. In this case, introducing a worker into the architecture can be quite disruptive, so a migration in multiple stages where processing is migrated first on the Window scope and then moved to a worker is a sensible approach.
For audio, we are aware of a participant who benefits from using the same API for audio and video to do similar processing to both in the context of a gaming application. Also, any processing that involves video together with audio (e.g., audio-visual speech separation, see https://arxiv.org/abs/1804.03619) benefits from audio support in this API.
We're willing to make Window-scope usage opt-in and disabled by default (e.g., require an explicit parameter in the constructor to allow it), but we are not willing to forbid it entirely or to produce a more complex API just to make it exceedingly difficult.

 

Web developers: Positive so far, but more feedback being collected.



Any signals from current OT participants?
I mentioned some preliminary feedback we have received. We are in the process of requesting more specific feedback from OT participants on the points Apple and Mozilla object to (use of streams, audio support, and Window scope).

 

Guido Urdaneta

Jul 22, 2021, 1:00:52 PM
to Guido Urdaneta, Yoav Weiss, Thomas Steiner, Tony Price, blink-dev
Updating this thread to request extending the experiment.
There is an Intent to Ship this feature that addressed the questions originally raised here.
Final approval to ship awaits a W3C Media WG decision about Window exposure of WebCodecs interfaces. This decision will not be made before M94, so we are requesting to extend the experiment for two more releases (M93 and M94).

Chris Harrelson

Jul 22, 2021, 3:05:27 PM
to Guido Urdaneta, Yoav Weiss, Thomas Steiner, Tony Price, blink-dev