Future of media::VideoDecodeAccelerator

Sunny Sachanandani

Apr 15, 2022, 6:13:34 PM
to Graphics-dev, Dale Curtis, libe...@chromium.org, tmath...@chromium.org, vas...@chromium.org
Hi,

First, just to summarize my understanding of the current state of media::VDA:

1) media::VideoDecodeAccelerator is the legacy accelerated video decode API that uses an antiquated resource model - clients generate GL texture ids, associate them with picture buffers, etc.  It uses mojo now but used legacy IPC until last year when rockot@ migrated GPU legacy IPC to mojo.

2) media::VideoDecoder is the new API which is based on mailboxes and was exposed via mojo from the beginning.  There are some new implementations like D3D11VideoDecoder, but there's also an implementation that runs on top of legacy VDA (VdaVideoDecoder) for wider compatibility.

3) PepperVideoDecoderHost uses GpuVideoDecodeAcceleratorHost when accelerated, but it has an adapter VideoDecoderShim that runs on top of VideoDecoder that seems to be used only for software fallback?  PPB_VideoDecoder_Impl also uses GpuVideoDecodeAcceleratorHost, but it's unclear if this is still used (docs suggest this is old-style renderer-process implementation).

We would like to deprecate GLImages and completely migrate to shared images, and for that we would like to get rid of the antiquated GL texture code in the VDA implementations.

I'd like to explore the possibility of switching pepper to VideoDecoderShim for all video decode and completely getting rid of the legacy VDA mojo interface (only the mojo interface - the implementations will remain for use with VdaVideoDecoder in the GPU process). Once the legacy VDA mojo interface is gone, we can refactor the VDA implementations in the GPU process to migrate fully to mailboxes as needed, e.g. add optional mailbox params to AssignPictureBuffers that are used in lieu of texture ids when present, migrate one VDA at a time, etc.
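
For illustration, here's a rough sketch of the kind of AssignPictureBuffers change I have in mind. This is hypothetical - the struct below is invented for illustration, not the current media::PictureBuffer:

    // Hypothetical sketch only - NOT the current media::PictureBuffer.
    // The idea: carry an optional mailbox alongside the legacy
    // client-generated texture ids, so implementations can migrate one
    // at a time.
    #include <cstdint>
    #include <vector>
    #include "gpu/command_buffer/common/mailbox.h"
    #include "third_party/abseil-cpp/absl/types/optional.h"
    #include "ui/gfx/geometry/size.h"

    struct PictureBufferSketch {
      int32_t id = 0;
      gfx::Size size;
      // Legacy path: GL texture ids generated by the client.
      std::vector<uint32_t> client_texture_ids;
      // Proposed addition: used in lieu of the texture ids if present.
      absl::optional<gpu::Mailbox> mailbox;
    };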

Does this sound feasible? Any other thoughts?

Thanks,
Sunny

Dan Sanders

Apr 15, 2022, 6:31:50 PM
to Sunny Sachanandani, Graphics-dev, Dale Curtis, libe...@chromium.org, tmath...@chromium.org, vas...@chromium.org
> PepperVideoDecoderHost uses GpuVideoDecodeAcceleratorHost when accelerated, but it has an adapter VideoDecoderShim that runs on top of VideoDecoder that seems to be used only for software fallback?  PPB_VideoDecoder_Impl also uses GpuVideoDecodeAcceleratorHost, but it's unclear if this is still used (docs suggest this is old-style renderer-process implementation).

My understanding is that VideoDecoderDev PPAPI was used only for Flash and therefore is no longer necessary. The VideoDecoder PPAPI is used by NaCl and will be maintained for some time to come.

This is convenient because the VideoDecoderDev protocol was not compatible with mailboxes but the VideoDecoder protocol can be more straightforwardly adapted.

> I'd like to explore the possibility of switching pepper to VideoDecoderShim for all video decode and completely getting rid of the legacy VDA mojo interface (only the mojo interface - the implementations will remain for use with VdaVideoDecoder in the GPU process).

> Does this sound feasible? Any other thoughts?

I'm losing track of the various people who have asked about a migration like this, and I don't have a strong understanding of the Pepper API guarantees that are being maintained. In general, though, there is no technical blocker I am aware of to migrating usage of the NaCl-accessible VideoDecoder API to MojoVideoDecoder, and it likely makes sense to do so via VideoDecoderShim.

- Dan

Dan Sanders

Apr 15, 2022, 6:45:12 PM
to Sunny Sachanandani, Graphics-dev, Dale Curtis, libe...@chromium.org, tmath...@chromium.org, vas...@chromium.org
I should add that GpuArcVideoDecodeAccelerator is also using the old interface, but it is using buffer import and doesn't really interact with the rest of the GPU stack.

Sushanth Rajasankar

Apr 15, 2022, 7:50:54 PM
to Frank Liberato, Sunny Sachanandani, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org

Edge relies completely on DXVAVideoDecodeAccelerator for media playback. There are a couple of reasons for this; one concrete reason, IIRC, is that due to codec licensing concerns we are required to use the OS MFT and its built-in software fallback, rather than bundling the software codec and talking to D3D11 directly like Chrome does. I think we are open to doing the work to bring DXVAVideoDecodeAccelerator up to speed and use SharedImages, if that results in net cleaner code.

Thanks,

Sushanth

Sunny Sachanandani

Apr 17, 2022, 12:14:13 AM
to Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org
Sushanth - just to clarify, we're not proposing getting rid of any of the VDA implementations - we're discussing just removing the legacy VDA IPC/mojo interface, which isn't really used anymore except by ppapi and GpuArcVideoDecodeAccelerator, as sandersd@ pointed out. The way the DXVA VDA and other VDAs are mostly used is via VdaVideoDecoder, which uses the newer VideoDecoder mojo APIs to communicate with the renderer process.

sandersd@ - thanks for the pointers - I'll reach out to mcasas@ and andrescj@ to ask about the current status of their plan.

Andres Calderon Jaramillo

Apr 18, 2022, 10:15:55 AM
to Sunny Sachanandani, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org, Jeff Chen
> I should add that GpuArcVideoDecodeAccelerator is also using the old interface, but it is using buffer import and doesn't really interact with the rest of the GPU stack.

This is correct. GpuArcVideoDecodeAccelerator (Chrome OS-specific) implements an interface similar to the VDA, using either the platform-specific VDA implementations or VideoDecoderPipeline, which is a media::VideoDecoder. It doesn't really matter, though, because as sandersd@ says, we don't use GL or any graphics API in this path regardless of whether we're using VDA or media::VideoDecoder.

> sandersd@ - thanks for the pointers - I'll reach out to mcasas@ and andrescj@ to ask about current status of their plan

It's actively being worked on by jeffcchen@. I believe he's currently working his way through unexpected surprises.

> Sushanth - just to clarify, we're not proposing getting rid of any of the VDA implementations - we're discussing just removing the legacy VDA IPC/mojo interface which isn't really used anymore except by ppapi and GpuArcVideoDecodeAccelerator as sandersd@ pointed out.

This sounds good. At least on Chrome OS, we're working towards getting rid of all the VDA implementations too. We also have plans to get rid of GpuArcVideoDecodeAccelerator in favor of GpuArcVideoDecoder, which is more similar to VideoDecoder. This is lower-priority work (more info at go/state-of-arc-video-decoding). However, this shouldn't interfere with the broader goal of removing GLImage, for the reasons explained by sandersd@.

Sunny Sachanandani

Jul 12, 2022, 5:53:46 PM
to Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org, Jeff Chen
Hi Andres,

Do you know the state of the refactoring to stop using the VDA IPC interfaces from the renderer process for pepper? Once the renderer stops using the VDA IPC interfaces, we can refactor the VDA internals in the GPU service, which is needed for us to complete our goal of removing GLImage usage from a bunch of places. Is there anything we can do to help this effort if it's stalled?

Also +Colin Blundell FYI

Thanks,
Sunny

Colin Blundell

Jul 13, 2022, 9:37:00 AM
to Sunny Sachanandani, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org, Jeff Chen
Sunny, thanks for looping me in!

+1 to Sunny's questions - we're actively looking to eliminate the GLImage public interface, and this is one of the last remaining usages. I recently filed a bug to track this - please feel free to add info there as well and/or point me to any pre-existing bugs.

Thanks,

Colin

Andres Calderon Jaramillo

Jul 13, 2022, 2:14:08 PM
to Colin Blundell, Sunny Sachanandani, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org, Pilar Molina Lopez
Hi all!

Pilar has taken over this work since Jeff Chen left. She's actively working on this. CCing her to comment more.

Andres

Pilar Molina Lopez

Jul 15, 2022, 7:53:56 PM
to Andres Calderon Jaramillo, Colin Blundell, Sunny Sachanandani, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org
Hi all,

As Andres mentioned, I’m currently working on making Pepper use the VideoDecoder (VD) interface over Mojo instead of the legacy VDA interface.

We need some feedback, but first let me give you some background about what happens today and what we're planning to do.

PepperVideoDecoderHost seems to be an intermediary between the Pepper plugin and the video decoder in the GPU process. Today, for hardware-accelerated video decoding, it uses media::GpuVideoDecodeAcceleratorHost as the media::VideoDecodeAccelerator implementation [1]. To make decoding requests, PepperVideoDecoderHost calls media::GpuVideoDecodeAcceleratorHost::Decode() [2]. Those requests are received in the GPU process by media::GpuVideoDecodeAccelerator, which owns a media::VideoDecodeAccelerator implemented by a platform-specific class. For instance, for VA-API on ChromeOS, we have media::VaapiVideoDecodeAccelerator.

When media::VaapiVideoDecodeAccelerator receives a decoding request, it asks the client for buffers to decode into by calling the client function ProvidePictureBuffers() [3]. The client (the Pepper code in this case) creates texture/mailbox pairs and sends the texture IDs to the video decoder. At least on ChromeOS, the ideal path in the GPU process is to then bind those textures to the native buffers used by the hardware decoder. That way, the hardware decoder decodes into buffers used by Pepper without copies.
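
As a rough sketch of this handshake (the entry point mirrors media::VideoDecodeAccelerator::Client::ProvidePictureBuffers(), but the signature is trimmed and the body is illustrative, not the literal Pepper code):

    // Simplified sketch; the real client entry point is
    // media::VideoDecodeAccelerator::Client::ProvidePictureBuffers().
    void ProvidePictureBuffersSketch(media::VideoDecodeAccelerator* vda,
                                     gpu::gles2::GLES2Interface* gl,
                                     uint32_t buffer_count,
                                     const gfx::Size& dimensions) {
      std::vector<media::PictureBuffer> buffers;
      for (uint32_t i = 0; i < buffer_count; ++i) {
        // The client creates a texture/mailbox pair; on ChromeOS the
        // decoder can later bind the texture to the native buffer it
        // decodes into, avoiding copies.
        GLuint texture_id = 0;
        gl->GenTextures(1, &texture_id);
        gpu::Mailbox mailbox = gpu::Mailbox::Generate();
        gl->ProduceTextureDirectCHROMIUM(texture_id, mailbox.name);
        buffers.emplace_back(static_cast<int32_t>(i), dimensions,
                             media::PictureBuffer::TextureIds{texture_id});
      }
      vda->AssignPictureBuffers(buffers);
    }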

PepperVideoDecoderHost uses media::VideoDecoderShim [4] for falling back to software video decoding. media::VideoDecoderShim serves as an adapter between the VDA interface (old) and the VD interface (new), so our plan is to use it for hardware video decoding too.

In the new interface, the video decoder does not ask the client for buffers using ProvidePictureBuffers() as I mentioned above. Instead, the video decoder in the GPU process manages its own buffers. The VideoDecoderShim asks the Pepper code to create textures so that, when it gets a decoded YUV frame, it can convert from YUV to RGB into those textures [5]. This doesn't seem too bad for software decoding.
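
For illustration, the software path boils down to something like this sketch (not the literal VideoDecoderShim code; it assumes an I420 frame, a libyuv conversion, and a texture already allocated at the right size):

    // Illustrative software-fallback copy: convert a decoded I420 frame
    // to RGBA on the CPU, then upload into a Pepper-created texture.
    void UploadFrameSketch(const media::VideoFrame& frame,
                           gpu::gles2::GLES2Interface* gl,
                           GLuint texture_id,
                           std::vector<uint8_t>* rgba_scratch) {
      const gfx::Size size = frame.visible_rect().size();
      rgba_scratch->resize(4 * size.width() * size.height());
      libyuv::I420ToABGR(frame.visible_data(media::VideoFrame::kYPlane),
                         frame.stride(media::VideoFrame::kYPlane),
                         frame.visible_data(media::VideoFrame::kUPlane),
                         frame.stride(media::VideoFrame::kUPlane),
                         frame.visible_data(media::VideoFrame::kVPlane),
                         frame.stride(media::VideoFrame::kVPlane),
                         rgba_scratch->data(), 4 * size.width(),
                         size.width(), size.height());
      gl->BindTexture(GL_TEXTURE_2D, texture_id);
      gl->TexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, size.width(),
                        size.height(), GL_RGBA, GL_UNSIGNED_BYTE,
                        rgba_scratch->data());
    }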

However, for hardware video decoding, this buffer management framework complicates things a bit. For context (at least on ChromeOS), the hardware decoder sends us Mailbox-backed NV12 frames. These NV12 frames are actually backed by a single Mailbox because on ChromeOS, we can sample from NV12 buffers as if they were RGBA textures. We have a couple of alternatives:

1. We can continue asking the Pepper code to create RGBA textures. Then, we do a copy from the NV12 frame mailboxes to the Pepper RGBA textures using RasterInterface::CopySubTexture() [6].

2. Don’t request textures from the Pepper code. Instead, when we receive an NV12 Mailbox frame from the decoder, we can get a texture from it and give it to the Pepper code. For that we need to use RasterInterface::CreateAndConsumeForGpuRaster() [7], which is only implemented by RasterImplementationGLES [8].
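
To make the two alternatives concrete, here is a sketch of both paths (the RasterInterface signatures [6][7] are from memory and may not match the header exactly):

    // Option #1 (sketch): copy-convert the decoder's NV12 mailbox into
    // a Pepper-created RGBA texture mailbox; the NV12->RGBA conversion
    // happens as part of the copy.
    void CopyToPepperTextureSketch(gpu::raster::RasterInterface* ri,
                                   const gpu::Mailbox& nv12_src,
                                   const gpu::Mailbox& rgba_dst,
                                   const gfx::Size& size) {
      ri->CopySubTexture(nv12_src, rgba_dst, GL_TEXTURE_2D,
                         /*xoffset=*/0, /*yoffset=*/0, /*x=*/0, /*y=*/0,
                         size.width(), size.height(),
                         /*unpack_flip_y=*/GL_FALSE,
                         /*unpack_premultiply_alpha=*/GL_FALSE);
    }

    // Option #2 (sketch): no copy - consume the decoder's mailbox
    // directly and hand the resulting texture id to the Pepper code.
    GLuint ConsumeDecoderMailboxSketch(gpu::raster::RasterInterface* ri,
                                       const gpu::Mailbox& nv12_src) {
      return ri->CreateAndConsumeForGpuRaster(nv12_src);
    }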

Option #1 is the easiest (in theory) but it represents a regression from what we do today: instead of sampling from the NV12 buffers directly (and doing the RGB conversion implicitly at that point), we are doing a copy-conversion from NV12 to RGBA which may be a problem in terms of bandwidth and memory usage.

We believe option #2 is the best option, but it’s a bit more complicated to implement. Before proceeding, we would like to make sure you don’t see any fundamental issues with this option. For instance, we would need to make sure that RasterImplementationGLES::CreateAndConsumeForGpuRaster [8] is not part of the GL code you are trying to remove.

In principle, since Pepper is on its way out, we could first try option #1, and if we don't see a significant regression with respect to tip-of-tree, we could just stick with it. We don't know if that would work for all platforms though.

Please, let us know what you think. Thanks!

[1] https://source.chromium.org/chromium/chromium/src/+/main:content/renderer/pepper/pepper_video_decoder_host.cc;l=164;drc=0c0b242b9475066635cf3e5e8a9ea074199d261d

[2] https://source.chromium.org/chromium/chromium/src/+/main:content/renderer/pepper/pepper_video_decoder_host.cc;l=262;drc=0c0b242b9475066635cf3e5e8a9ea074199d261d

[3] https://source.chromium.org/chromium/chromium/src/+/main:media/gpu/vaapi/vaapi_video_decode_accelerator.cc;l=650;drc=0b1b357a2fe097c9d23e453a0a5c5680c712b287

[4] https://source.chromium.org/chromium/chromium/src/+/main:content/renderer/pepper/video_decoder_shim.h;l=34-38;drc=3bfd3ba3af617152dfee61d03e93a773ddd3443b

[5] https://source.chromium.org/chromium/chromium/src/+/main:content/renderer/pepper/video_decoder_shim.cc;l=519-520;drc=a8afec221e6eac25a83c8a0db5909e88402b3d1c

[6] https://source.chromium.org/chromium/chromium/src/+/main:gpu/command_buffer/client/raster_interface.h;l=51;drc=75425ebb5303c7bb416b2261e9b27114ec903d3f

[7] https://source.chromium.org/chromium/chromium/src/+/main:gpu/command_buffer/client/raster_interface.h;l=171;drc=75425ebb5303c7bb416b2261e9b27114ec903d3f

[8] https://source.chromium.org/chromium/chromium/src/+/main:gpu/command_buffer/client/raster_implementation_gles.cc;l=396;drc=75425ebb5303c7bb416b2261e9b27114ec903d3f

Pilar

Sunny Sachanandani

Jul 18, 2022, 7:39:23 PM
to Pilar Molina Lopez, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org, vas...@chromium.org
Hi Pilar,

Thanks for the detailed reply.

> Option #1 is the easiest (in theory) but it represents a regression from what we do today: instead of sampling from the NV12 buffers directly (and doing the RGB conversion implicitly at that point), we are doing a copy-conversion from NV12 to RGBA which may be a problem in terms of bandwidth and memory usage.

That seems fine; you can probably use one of the many methods on PaintCanvasVideoRenderer to accomplish this YUV-to-RGB copy. See also WebGLRenderingContextBase::TexImageHelperMediaVideoFrame or CreateImageFromVideoFrame for some examples where we copy from YUV to RGB.
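
For example, something along these lines (the CopyVideoFrameTexturesToGLTexture parameter list below is approximate - check the current header):

    // Hedged sketch of a PaintCanvasVideoRenderer-based YUV->RGB copy
    // into a Pepper-owned texture; the parameter list is approximate.
    media::PaintCanvasVideoRenderer renderer;
    renderer.CopyVideoFrameTexturesToGLTexture(
        raster_context_provider, destination_gl, video_frame,
        GL_TEXTURE_2D, pepper_texture_id, GL_RGBA, GL_RGBA,
        GL_UNSIGNED_BYTE, /*level=*/0,
        /*premultiply_alpha=*/false, /*flip_y=*/false);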

> We believe option #2 is the best option, but it’s a bit more complicated to implement. Before proceeding, we would like to make sure you don’t see any fundamental issues with this option. For instance, we would need to make sure that RasterImplementationGLES::CreateAndConsumeForGpuRaster [8] is not part of the GL code you are trying to remove.

vasilyt@ can probably say for sure, but once out-of-process raster for canvas 2D (aka OOP-C or Canvas OOP-R) launches everywhere, removing RasterImplementationGLES would be a natural next step. That being said, there's also GLES2Interface::CreateAndConsumeTextureCHROMIUM, which will likely always be around since it's needed for WebGL to consume arbitrary shared images, and that's something you could use for pepper too.
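
For reference, consuming a shared image over the GLES2 interface looks roughly like this (a sketch; the Begin/EndSharedImageAccessDirectCHROMIUM pairing is how WebGL does it, but the details here are approximate):

    // Sketch: turn a shared image mailbox into a GL texture id via the
    // GLES2 interface. Details approximate.
    gpu::gles2::GLES2Interface* gl = context_provider->ContextGL();
    GLuint texture_id = gl->CreateAndConsumeTextureCHROMIUM(mailbox.name);
    gl->BeginSharedImageAccessDirectCHROMIUM(
        texture_id, GL_SHARED_IMAGE_ACCESS_MODE_READ_CHROMIUM);
    // ... sample from texture_id (e.g. hand it to the plugin) ...
    gl->EndSharedImageAccessDirectCHROMIUM(texture_id);
    gl->DeleteTextures(1, &texture_id);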

- Sunny

Vasiliy Telezhnikov

Jul 20, 2022, 12:37:19 PM
to Sunny Sachanandani, Pilar Molina Lopez, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Yes, we indeed want to remove RasterImplementationGLES after the OOP-C launch. Using GLES2Interface seems like the right thing to do to get texture ids (assuming they will be used by the same GLES2Interface), but we might need a new context provider for that, because the shared main thread context provider won't have a GLES2 interface on it after the OOP-C launch.

- Vasiliy

Sunny Sachanandani

Jul 25, 2022, 8:30:45 PM
to Pilar Molina Lopez, Vasiliy Telezhnikov, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Generally the best way would be to copy one of the existing context providers, e.g. RenderThreadImpl::GetVideoFrameCompositorContextProvider. One important question is whether the context provider will be used on multiple threads - that requires specifying "supports_locking" on creation and using a ScopedContextLock to access the GL context. That's probably not necessary for your use case - it's OK if it's only used on a single thread, even if that's not the renderer main thread.
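
As a sketch (the viz::ContextProviderCommandBuffer parameter list below is from memory and should be checked against the current header; the URL is just an invented debugging label):

    // Sketch: a dedicated context provider with locking enabled so it
    // can be used off the creating thread. Parameters approximate.
    gpu::ContextCreationAttribs attributes;
    attributes.enable_gles2_interface = true;
    attributes.enable_raster_interface = false;
    attributes.bind_generates_resource = false;

    auto provider =
        base::MakeRefCounted<viz::ContextProviderCommandBuffer>(
            std::move(gpu_channel_host), gpu_memory_buffer_manager,
            /*stream_id=*/0, gpu::SchedulingPriority::kNormal,
            gpu::kNullSurfaceHandle,
            GURL("chrome://gpu/PepperVideoDecode"),
            /*automatic_flushes=*/false,
            /*support_locking=*/true,  // only if used on multiple threads
            /*support_grcontext=*/false, gpu::SharedMemoryLimits(),
            attributes, viz::command_buffer_metrics::ContextType::MEDIA);
    provider->BindToCurrentThread();

    // On any other thread, hold the lock mentioned above while
    // touching the context:
    {
      viz::ContextProvider::ScopedContextLock lock(provider.get());
      gpu::gles2::GLES2Interface* gl = provider->ContextGL();
      // ... GL calls ...
    }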

Pilar Molina Lopez

Jul 25, 2022, 11:40:22 PM
to Vasiliy Telezhnikov, Sunny Sachanandani, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Thanks for the info, Sunny and Vasily. That makes a lot of sense.

If the shared main thread context provider won't have gles2 interface on it after oop-c launch, we will definitely need a new context provider anyway. What would be the best way of adding a new context provider in RenderThreadImpl? Any suggestion would be appreciated :)

Pilar

Pilar Molina Lopez

Jul 26, 2022, 2:08:54 PM
to Colin Blundell, Sunny Sachanandani, Vasiliy Telezhnikov, Andres Calderon Jaramillo, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Hi Colin,

Thanks for the question. I actually didn't know if we had one. I just found this one: b/230007619 (we didn't re-assign it after Jeff left).

Pilar

Colin Blundell

Jul 26, 2022, 2:45:08 PM
to Sunny Sachanandani, Pilar Molina Lopez, Vasiliy Telezhnikov, Andres Calderon Jaramillo, Colin Blundell, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Hi Pilar,

Is there a bug tracking this work? Thanks!

Best,

Colin

Colin Blundell

Jul 27, 2022, 9:45:34 AM
to Pilar Molina Lopez, Colin Blundell, Sunny Sachanandani, Vasiliy Telezhnikov, Andres Calderon Jaramillo, Sushanth Rajasankar, Frank Liberato, Dan Sanders, Graphics-dev, Dale Curtis, tmath...@chromium.org
Thanks!