Intent to Prototype: Media Session API: Video conferencing actions

Tommy Steimel

Mar 16, 2021, 5:57:04 PM
to blin...@chromium.org

Contact emails

ste...@chromium.org


Explainer


https://github.com/w3c/mediasession/issues/264


Specification

Core specification: https://w3c.github.io/mediasession/


The changes have not been merged yet.


Design docs


https://docs.google.com/document/d/1KDpWqg9LcnuQ5TQDBK45BrbkyfuziolZ0m7Wmt6xPnU/edit?usp=sharing&resourcekey=0-I0o3GayIn9Q-2LN7EB1l8w


Summary

Adds "togglemicrophone", "togglecamera", and "hangup" actions to the existing Media Session API.


This will enable developers of video conferencing websites to handle these actions from browser UI. For example, if the user puts their video call into a picture-in-picture window, the browser could display buttons for muting/unmuting the microphone, turning the camera on/off, and hanging up. When the user clicks these buttons, the website handles them through the Media Session API.



Blink component

Internals>Media>Session


Motivation

This will enable developers of video conferencing websites to handle these actions from browser UI. For example, if the user puts their video call into a picture-in-picture window, the browser could display buttons for muting/unmuting the microphone, turning the camera on/off, and hanging up. When the user clicks these buttons, the website handles them through the Media Session API.



Initial public proposal

None


TAG review

https://github.com/w3ctag/design-reviews/issues/608


TAG review status

Issues addressed


Risks



Interoperability and Compatibility

Interop risk is low because it's a small addition to an existing API.



Gecko: No signal


WebKit: Positive (https://lists.w3.org/Archives/Public/public-media-wg/2021Jan/0001.html). Interested in WebRTC and hangup use cases.


Web developers: Same as Apple (see above). We received feedback from web developers that they would be interested in handling WebRTC sessions via Media Session.



Is this feature fully tested by web-platform-tests?

No


Tracking bug

https://crbug.com/1178939


Launch bug

https://crbug.com/1174645


Link to entry on the Chrome Platform Status

https://www.chromestatus.com/feature/5744304695803904


This intent message was generated by Chrome Platform Status.

Tommy Steimel

Mar 17, 2021, 11:33:18 AM
to Sergio Garcia Murillo, blink-dev
The current plan is to add these to the PiP window created by video.requestPictureInPicture() and improve the UX for that window. This also leaves open the possibility of integrating "togglemicrophone", "togglecamera", and "hangup" with other things such as USB headset buttons.

On Wed, Mar 17, 2021 at 4:30 AM Sergio Garcia Murillo <sergio.gar...@gmail.com> wrote:
"For example, if the user puts their video call into a picture-in-picture window"

Are you planning to add more PiP functionalities for conference mode usage? 

Currently (afaik) the only viable ways are getCurrentBrowsingContextMedia() (which now requires a user confirmation) or video.requestPictureInPicture() (which has a poor UX). You could also do video mixing in a canvas and call requestPiP on the capture stream, but I would really prefer not to write a rendering/mixing engine when I could use the browser's native capabilities instead.

Any chance of having a window.requestPictureInPicture() that doesn't require user confirmation?

Also, it would be great to be able to integrate those "togglemicrophone" and "hangup" actions with USB headset buttons.

Best regards
Sergio

Sergio Garcia Murillo

Mar 17, 2021, 11:59:50 AM
to blink-dev, Tommy Steimel
"For example, if the user puts their video call into a picture-in-picture window"

Are you planning to add more PiP functionalities for conference mode usage? 

Currently (afaik) the only viable ways are getCurrentBrowsingContextMedia() (which now requires a user confirmation) or video.requestPictureInPicture() (which has a poor UX). You could also do video mixing in a canvas and call requestPiP on the capture stream, but I would really prefer not to write a rendering/mixing engine when I could use the browser's native capabilities instead.

Any chance of having a window.requestPictureInPicture() that doesn't require user confirmation?

Also, it would be great to be able to integrate those "togglemicrophone" and "hangup" actions with USB headset buttons.

Best regards
Sergio


Sergio Garcia Murillo

Mar 17, 2021, 1:35:44 PM
to Tommy Steimel, blink-dev
How would you differentiate a PiP window for a conference from one for a normal video? Will there be an API for showing the UI buttons?

I can't find that info in the public links.

Best regards
Sergio 

Tommy Steimel

Mar 17, 2021, 1:39:39 PM
to Sergio Garcia Murillo, blink-dev
We will show the UI buttons if the website has registered actions for them using setActionHandler.
See here for more info on how the Media Session API works: https://developer.mozilla.org/en-US/docs/Web/API/MediaSession

"togglemicrophone", "togglecamera", and "hangup" will just be new actions alongside "play", "pause", "nexttrack", etc.
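
For illustration, registering handlers for the three proposed actions might look roughly like the sketch below. It is not taken from the spec or the design doc: `MediaSessionLike`, `CallControls`, and `registerConferencingActions` are hypothetical names, and the minimal interface exists only so the snippet compiles without DOM typings that know these actions. The try/catch assumes that a browser which does not recognize an action name throws from setActionHandler, as existing implementations do for unknown actions.

```typescript
// The three action names proposed in this intent.
type ConferencingAction = "togglemicrophone" | "togglecamera" | "hangup";

// Minimal structural stand-in for navigator.mediaSession (assumption:
// only setActionHandler is needed for this sketch).
interface MediaSessionLike {
  setActionHandler(action: string, handler: (() => void) | null): void;
}

// Hypothetical app-side object that knows how to control the call.
interface CallControls {
  toggleMicrophone(): void;
  toggleCamera(): void;
  hangUp(): void;
}

// Registers a handler per action and returns the actions the browser accepted.
function registerConferencingActions(
  session: MediaSessionLike,
  call: CallControls
): ConferencingAction[] {
  const handlers: Array<[ConferencingAction, () => void]> = [
    ["togglemicrophone", () => call.toggleMicrophone()],
    ["togglecamera", () => call.toggleCamera()],
    ["hangup", () => call.hangUp()],
  ];
  const registered: ConferencingAction[] = [];
  for (const [action, handler] of handlers) {
    try {
      session.setActionHandler(action, handler);
      registered.push(action);
    } catch {
      // The browser does not know this action; skip it and carry on.
    }
  }
  return registered;
}
```

In a page, this would be called with navigator.mediaSession after checking that "mediaSession" is present in navigator. The returned list lets the site decide whether to keep its own in-page controls for any action the browser rejected.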

Sergio Garcia Murillo

Mar 17, 2021, 1:42:07 PM
to Tommy Steimel, blink-dev
That makes sense,  thank you! 
Sergio