Trying to make V4L2 work on Rockchip rk3399 on mainline linux 6.2.


Alexey Guskov

Apr 11, 2023, 6:43:25 PM
to Chromium-dev
Hey guys.

I have an rk3399 device with rk-vdec and hantro hardware decoders and I'm trying to make hardware acceleration work on them.
- I figured out a few missing pieces needed to make chromium talk to the v4l2 driver (the VaapiVideoDecodeLinuxGL feature, which does not seem to be documented anywhere)
- I was surprised to find that chromium expects v4l2 devices to be named `/dev/video-decX` and `/dev/media-decX`, not the `/dev/videoX` and `/dev/mediaX` devices created by the 6.2 linux kernel.
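
One way to bridge the naming gap without patching chromium is to create symlinks with the names it looks for. Below is a minimal Python sketch of that idea. The function names `plan_decoder_symlinks` and `apply_plan` are made up for illustration, and the node indices are placeholders: which `/dev/videoN` belongs to which decoder is board specific and has to be picked by hand.

```python
import os

def plan_decoder_symlinks(video_nodes, media_nodes, dev_dir="/dev"):
    """Return (link_path, target_path) pairs mapping kernel node names
    to the /dev/video-decN and /dev/media-decN names chromium looks for.

    video_nodes / media_nodes are lists of node indices, e.g. [1, 2].
    Which indices belong to the decoders is an assumption the caller
    has to supply (for example after checking `v4l2-ctl --list-devices`).
    """
    plan = []
    for n, idx in enumerate(video_nodes):
        plan.append((os.path.join(dev_dir, f"video-dec{n}"),
                     os.path.join(dev_dir, f"video{idx}")))
    for n, idx in enumerate(media_nodes):
        plan.append((os.path.join(dev_dir, f"media-dec{n}"),
                     os.path.join(dev_dir, f"media{idx}")))
    return plan

def apply_plan(plan):
    """Create the symlinks, replacing stale ones (needs root for /dev)."""
    for link, target in plan:
        if os.path.islink(link):
            os.remove(link)
        os.symlink(target, link)
```

Calling `plan_decoder_symlinks([1], [0])` only computes the pairs; nothing touches the filesystem until `apply_plan` is called, so the plan can be inspected first.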

So, in the end, I added a few logs here and there, and now I can see the decoder is initialised.

> "info":"Selected V4L2VideoDecoder for video decoding, config: codec: h264, profile: h264 baseline, level: not available, alpha_mode: is_opaque, coded size: [1920,1080], visible rect: [0,0,1920,1080], natural size: [1920,1080], has extra data: false, encryption scheme: Unencrypted, rotation: 0°, flipped: 0, color space: {primaries:BT709, transfer:BT709, matrix:BT709, range:LIMITED}"}

However, after the initial setup the whole pipeline gets reset when VideoDecoderPipeline tries to allocate an ImageProcessor (/dev/image-procX), which does not seem to exist on mainline linux. The ImageProcessor seems to be used to resize the output frame and convert it from NV12 to AR24 format.

It's unclear why an ImageProcessor is required, because VideoPipeline also initialises MailboxVideoFrameConverter, which is supposed to be able to do the same. The target video format of AR24 is hardcoded in GetPreferredRenderableFourccs in the mojo client. I tried adding NV12, but got `MediaEvent: {"error":"VideoDecoderPipeline Frame converter returns null frame."}`.

Now, questions:
- Is v4l2 decoding supposed to be working with mainline linux?
- Why are the v4l2 device names using the /dev/video-decX and /dev/media-decX notation, not /dev/videoX and /dev/mediaX?
- Is an ImageProcessor v4l2 device required for video decoding to work?

Alexey Guskov

Apr 12, 2023, 1:38:57 PM
to Chromium-dev, Alexey Guskov
Update:

V4L2 ImageProcessor is required because the other ImageProcessor implementations do not support this input and output format combination (NV12 -> AR24). After modifying libYUVImageProcessor I was able to make V4L2VideoDecoder work and CPU load went down to 25%, but there seems to be a bottleneck somewhere, as the video is buffering every few seconds. Now I'm trying to make it work with GLImageProcessor.

On Wednesday, April 12, 2023 at 00:43:25 UTC+2, Alexey Guskov wrote:

Alexey Guskov

Apr 12, 2023, 10:04:12 PM
to Chromium-dev, Alexey Guskov
Another update:
In previous episodes:
- V4L2VideoDecoder does work, if the proper path to libv4l2.so is supplied and the /dev/videoX and /dev/mediaX device paths are correct
- But the Linux version of the Mojo client uses hardcoded AR24 as a renderable format: https://source.chromium.org/chromium/chromium/src/+/main:media/mojo/services/gpu_mojo_media_client_linux.cc;drc=9329f990b89052f1e7f82e0a1a4b298b72359263;l=50
- V4L2 produces NV12 frames, so the mojo client does not accept those without conversion.
- For that, a special class called ImageProcessor is used, with 3 available implementations.
- V4L2ImageProcessor uses a kernel device that is not present on my platform.
- libYUVImageProcessor uses a software conversion and did not support NV12 -> AR24 conversion, but after modifications I managed to make it work. Still have issues with video buffering.
- GLImageProcessor uses a gpu shader and does not support NV12 -> AR24, so I rewrote it, but the buffering still remains. Also I got a lot of errors about ARGB being an invalid gpu buffer format (yet the video is rendered)
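
For readers unfamiliar with the two formats, here is a minimal pure-Python sketch of what an NV12 -> AR24 conversion has to do per pixel. It is not the libyuv or chromium code, just an illustration. It assumes BT.709 limited-range video (as reported in the decoder log earlier in the thread), even dimensions, and no row padding. AR24 is the DRM fourcc for ARGB8888, which is stored in memory as B, G, R, A bytes on little-endian systems. The function name `nv12_to_bgra` is made up.

```python
def nv12_to_bgra(nv12, width, height):
    """Convert one NV12 frame (bytes) to packed B,G,R,A bytes.

    NV12 layout: a full-resolution Y plane followed by a half-resolution
    plane of interleaved U,V pairs (one pair per 2x2 block of pixels).
    Assumes stride == width and BT.709 limited-range coefficients.
    """
    assert width % 2 == 0 and height % 2 == 0
    assert len(nv12) == width * height * 3 // 2
    uv_base = width * height
    out = bytearray(width * height * 4)

    def clamp(x):
        return max(0, min(255, int(round(x))))

    for row in range(height):
        for col in range(width):
            y = nv12[row * width + col]
            # Each UV row holds width // 2 pairs, i.e. `width` bytes.
            uv = uv_base + (row // 2) * width + (col // 2) * 2
            u = nv12[uv] - 128
            v = nv12[uv + 1] - 128
            luma = 1.164383 * (y - 16)
            o = (row * width + col) * 4
            out[o] = clamp(luma + 2.112402 * u)                     # B
            out[o + 1] = clamp(luma - 0.213249 * u - 0.532909 * v)  # G
            out[o + 2] = clamp(luma + 1.792741 * v)                 # R
            out[o + 3] = 255                                        # A (opaque)
    return bytes(out)
```

A per-pixel Python loop like this is far too slow for real playback; it only shows the memory layout and arithmetic that a libyuv routine or a GPU shader performs.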

In fact it looks like ImageProcessors are usually used to convert from other formats _to_ NV12, not from it, meaning that format should be acceptable further down the pipeline. But with NV12 added to the target format list, I receive an error from Mailbox (the video frame being in DMABUF instead of GPU memory), and (if I comment out the check) it crashes in media::PlatformVideoFramePool::UnwrapFrame()

On Wednesday, April 12, 2023 at 19:38:57 UTC+2, Alexey Guskov wrote:

Alexey Guskov

Apr 17, 2023, 1:13:47 PM
to Chromium-dev, Alexey Guskov
Another short update:

Looks like gl ImageProcessor works fine, here's my patched version: https://gist.github.com/kvasdopil/2d9942bcf168d75c76bfa29ca26c72aa
- I'm quite certain it is not actually needed, the Mailbox converter should be able to perform the conversion. Besides, a GL renderer should be capable of rendering YUV frames without any conversion.
- If you have a look at the shader, it does not perform the actual color space conversion, it just unwraps from NV12 to BGRA. Yet the video colors are fine, no idea why.

Anyways, the pipeline seems to work fine. Video frames are getting decoded and CPU load is ~10%, less than 100% for a single core, which is exactly what I'm looking for.

Now the last problem: the video output is not smooth, looks like a lot of frames are getting dropped, perceived fps is ~15.
I added a bunch of logs here and there, for example in mojo_video_decoder_service.cc a `Decode pts=XXXXX` is called 30 times per second with correct timestamps, and `VideoDecoderPipeline::OnFrameConverted` is called 30 times per second, also with correct timestamps. Considering low CPU load I assume there's no performance bottleneck in decoder. 
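
A count of 30 calls per second does not by itself show that frames arrive evenly spaced: frames delivered in bursts give the same average. A small Python sketch for checking this from logged wall-clock times follows. The function name `pacing_report` and the 1.5x threshold are arbitrary choices for illustration, and it assumes each log line carries a timestamp in seconds.

```python
def pacing_report(times_s, nominal_fps=30.0, late_factor=1.5):
    """Summarise frame pacing from a sorted list of wall-clock times
    (in seconds) at which a callback fired.

    Returns (average_fps, late_gaps), where late_gaps counts the
    intervals longer than late_factor nominal frame periods.
    """
    if len(times_s) < 2:
        return 0.0, 0
    gaps = [b - a for a, b in zip(times_s, times_s[1:])]
    duration = times_s[-1] - times_s[0]
    average_fps = len(gaps) / duration if duration > 0 else 0.0
    period = 1.0 / nominal_fps
    late_gaps = sum(1 for g in gaps if g > late_factor * period)
    return average_fps, late_gaps
```

If the average is close to 30 but `late_gaps` is large, frames are being delivered in clumps, which would look like a lower perceived fps even though nothing is dropped in the decoder.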

So either frames aren't rendered in time, or IPC isn't delivering them in time (not even sure IPC is used for hw video frames), or the decoder is overwriting older frames with new ones before they are rendered. All options sound unlikely to me, but I've got no better ideas. Not quite sure how to debug that: I could not find the client end of mojo_video_decoder_service.cc, and media/mojo/clients/mojo_video_decoder.cc does not seem to be used (perhaps because I'm using --in-process-gpu, without that I got software compositing).

Any advice would be greatly appreciated.

On Thursday, April 13, 2023 at 04:04:12 UTC+2, Alexey Guskov wrote:

Alexey Guskov

May 2, 2023, 8:05:57 PM
to Chromium-dev, Alexey Guskov
For those who are interested: 
- V4L2VideoDecoder seems to be unstable, at least on rockchip. It works with a few minor fixes and a customised GLImageProcessor, but the framerate is somewhat unreliable: despite correct timestamps in the render function, perceived fps is around 15. Increasing the number of buffers in the render queue helps, but not in 100% of cases. CPU load is ~15-20% on rk3399. Also this decoder is having issues with out-of-order frames in h264; it looks like the frames aren't getting reordered properly, so the image kinda jumps back and forward in time for those.
- VDAVideoDecoder produces better results, fps is good, but sometimes decoding just stops in the middle of a video. CPU load is ~20-25%. Out-of-order frames produce visual artifacts, but there's no time jumping.
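
To illustrate what "reordered properly" means here: with B-frames, h264 frames leave the decoder in decode order and have to be put back into presentation order before display. The Python sketch below shows the general idea with a small min-heap keyed on the presentation timestamp. It is not how chromium implements it, and the `depth` parameter is an assumption of this example (a real decoder derives the reorder window from the bitstream).

```python
import heapq

def reorder_frames(decoded, depth):
    """Yield (pts, frame) in presentation order from a decode-order stream.

    decoded: iterable of (pts, frame) in the order the decoder emits them.
    depth:   how many frames to hold back; it must be at least the
             maximum reordering distance of the stream.
    """
    heap = []
    for pts, frame in decoded:
        heapq.heappush(heap, (pts, frame))
        if len(heap) > depth:
            # The smallest pts held so far can no longer be overtaken.
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)
```

If `depth` is too small, frames come out with timestamps that go backwards, which matches the "jumping back and forward in time" symptom described above.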
 
Also tested on rk3568: hantro decoding works, but has the same issues as on rk3399. Vertical 1080x1920 videos are decoded in software (the decoder profile check is done against a 1920x1080 size, not taking video orientation into account; not sure if that is expected or a bug).
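
The difference between the two kinds of size check can be shown in a few lines of Python. This is only an illustration of the behaviour described above, not chromium's actual profile check, and the function names are made up. Whether the hardware can really decode a 1080x1920 stream is a separate question that this sketch does not answer.

```python
def fits_strict(width, height, max_w=1920, max_h=1080):
    """Compare width and height against the maximum exactly as given."""
    return width <= max_w and height <= max_h

def fits_any_orientation(width, height, max_w=1920, max_h=1080):
    """Also accept the frame if it fits with the limits swapped."""
    return (fits_strict(width, height, max_w, max_h)
            or fits_strict(width, height, max_h, max_w))
```

With the strict check a 1080x1920 video is rejected and falls back to software, while the orientation-agnostic check accepts it.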

For reference I have an rk3399 device with kernel 4.x.x and chromium v.98 (probably custom version). Videos are rendered with MojoVideoDecoder, fps is good, no visual artifacts, no freezes, CPU load is ~10-12%. No idea if that is achievable on latest chromium and latest mainline linux.
On Thursday, April 13, 2023 at 04:04:12 UTC+2, Alexey Guskov wrote:
> Another update:

Tommy Xe200

May 8, 2023, 1:27:18 PM
to Chromium-dev, Alexey Guskov
Hi Alexey,

It looks like you have been doing the same job I am doing right now with imx8mm. It also uses Hantro codecs, and I am working with the mainline 6.2 kernel. I reached the point where I have decoded frames in NV12 format which are rejected, then found your post.

"VDAVideoDecoder produces better results, fps is good" -> Did you swap V4L2 processor (non existing /dev/image-proc) to GL image processor (you mentioned in the link) in v4l2_slice_video_decode_accelerator?

PS. I am using chromium version 111, so there are differences in the code, judging by all the commits made in the meantime to the /media/gpu subdirectories; however, I think it should not be a blocker...

Thanks!
Tommy

Alexey Guskov

Jun 1, 2023, 10:56:13 AM
to Chromium-dev, Tommy Xe200, Alexey Guskov
Hi Tommy,

Sorry for the delayed answer, I completely missed your message.

So, if I get this right, there are 2 implementations of the video decoding pipeline that use V4L2: V4L2VideoDecoder and VDAVideoDecoder. I don't have the sources nearby, but IIRC there was a video decoder selection routine somewhere that returns 'V4L2VideoDecoder'. If you make it return VDAVideoDecoder then it works. That also triggers an error for an invalid GPU buffer format (smth like RGBA buffer cannot be created), but it works fine if you just comment out the check.

I'll try to publish my sources on Monday so you can recreate my results.

Also I have an Electron build with my patches, you can check it out if you're interested: https://lexa.blob.core.windows.net/electron/110-release.tar.zst

Lmk if you're still interested in this, the lack of hw decoding on rockchip has been bothering me for a while, i really want to finish this project.
On Monday, May 8, 2023 at 19:27:18 UTC+2, Tommy Xe200 wrote:

Jianfeng Liu

Oct 30, 2023, 6:25:33 PM
to Chromium-dev, Alexey Guskov, Tommy Xe200
Hi Alexey,
  I'm very lucky to have found your work! I've been investigating v4l2 decoding in chromium lately. Here is what I have found:
  1, VDAVideoDecoder can use libv4l2 to make a v4l2m2m decoder work if you build chromium with the flag `use_v4lplugin=true`. I've confirmed that it works on qcom snapdragon 865. Here are the patches I use for chromium v114: https://github.com/amazingfate/chromium-libv4l2-patches/tree/main/v114.0.5735.35
  2, libv4l2 is removed since chromium v117: https://github.com/chromium/chromium/commit/72d438bcd230395b4a8e1d3f93f740c58c9d4bde, so the method above won't work for chromium versions after v117.
  3, I also tried V4L2VideoDecoder on chromium v118, and hit the same issue as you: "video_decoder_pipeline.cc(1146)] PickDecoderOutputFormat(): Unable to find ImageProcessor to convert format". But I have vulkan support on the snapdragon 865 platform, so I tried again after enabling vulkan. It seems that vulkan can deal with NV12 directly, so it won't try to find an ImageProcessor, but I get a new error from this line: https://github.com/chromium/chromium/blob/main/media/gpu/chromeos/mailbox_video_frame_converter.cc#L325. `frame->storage_type()` is `STORAGE_DMABUFS` instead of `STORAGE_GPU_MEMORY_BUFFER`.

Justin Green

Nov 1, 2023, 5:02:12 PM
to Chromium-dev, Jianfeng Liu, Alexey Guskov, Tommy Xe200, Justin Green
Hi!
I'm a ChromeOS video dev that does a lot of work on the ImageProcessor.

For context, ChromeOS generally uses GL drivers that can directly import NV12 textures. We don't have code for converting NV12 to ARGB on RK3399 because we simply never experienced this issue. We do experience similar problems on other platforms though, which is why those ImageProcessor classes exist. You folks are on the right track with trying to recycle them :-)

Some thoughts on each approach:
1. V4L2ImageProcessor: this is a wrapper for a piece of hardware you don't have, so we can safely ignore it.

2. LibYUVImageProcessor: this is an easy solution since most of the code is already in place, but it will have performance issues as you have already noticed. A big reason for the performance problems is actually because the Hantro driver allocates DMA buffers in a way that disables caching on ARM devices. If you're interested, I have a 1 line patch for the kernel attached to this email that may fix the issue.

3. GLImageProcessor: this might be the most performant approach in the long term, but we currently don't have infrastructure in place for properly synchronizing everything. So instead, we just have a heavy-handed `glFinish()` call at the end of frame processing, which is likely to result in poor performance. I'm not sure if it will be better or worse than LibYUV in its current state.

Cheers,
Justin
0001-Enable-non-coherent-dst-bufs-for-Hantro-V4L2-driver.patch

Jianfeng Liu

Nov 2, 2023, 11:14:56 AM
to Chromium-dev, Justin Green, Jianfeng Liu, Alexey Guskov, Tommy Xe200
Hello Justin,
  Many thanks for your reply. I finally got V4L2VideoDecoder to work with the snapdragon venus v4l2 stateful api. Here are the patches I use: https://github.com/amazingfate/chromium-libv4l2-patches/tree/main/v118.0.5993.70.
  Your kernel patch can also be applied to the qcom venus driver, and it improves the performance a lot when playing a 2160p@30 h264 video.

Jianfeng Liu

Nov 2, 2023, 12:37:41 PM
to Justin Green, Chromium-dev, Alexey Guskov, Tommy Xe200
The first two patches are just my attempts to enable the legacy VDA decoder. Since I'm using the V4L2 decoder now, I think they are no longer necessary. But I still need to modify `media/mojo/services/gpu_mojo_media_client.cc` to enable the V4L2 decoder for linux.
The last two patches are necessary on linux for these two reasons:
1, linux mesa gbm needs write permission to do `gbm_create_device`.
2, webgpu is not supported on linux so `SHARED_IMAGE_USAGE_SCANOUT` should not get passed to `shared_image_usage`.

On Thu, Nov 2, 2023 at 23:21, Justin Green <green...@chromium.org> wrote:
> Great to hear that you got something working!
>
> Out of curiosity, have you tried just using that NV12->ARGB patch?
> https://github.com/amazingfate/chromium-libv4l2-patches/blob/main/v118.0.5993.70/0003-media-image-processor-libyuv-add-NV12-to-ARGB-conver.patch
> I suspect that this patch combined with the kernel patch will be
> sufficient for both qcom and rk3399.

Justin Green

Nov 2, 2023, 12:37:42 PM
to Jianfeng Liu, Chromium-dev, Alexey Guskov, Tommy Xe200
Great to hear that you got something working!

Out of curiosity, have you tried just using that NV12->ARGB patch?
https://github.com/amazingfate/chromium-libv4l2-patches/blob/main/v118.0.5993.70/0003-media-image-processor-libyuv-add-NV12-to-ARGB-conver.patch
I suspect that this patch combined with the kernel patch will be
sufficient for both qcom and rk3399.

On Thu, Nov 2, 2023 at 5:23 AM Jianfeng Liu <liujian...@gmail.com> wrote:
>

Jianfeng Liu

Nov 2, 2023, 12:38:01 PM
to Justin Green, Chromium-dev, Alexey Guskov, Tommy Xe200
typo: `SHARED_IMAGE_USAGE_SCANOUT` should be `SHARED_IMAGE_USAGE_WEBGPU`

On Fri, Nov 3, 2023 at 00:20, Jianfeng Liu <liujian...@gmail.com> wrote:

Justin Green

Nov 10, 2023, 8:04:51 AM
to Jianfeng Liu, Chromium-dev, Alexey Guskov, Tommy Xe200
I see, that makes sense. With the VDA decoder part eliminated, can
this cleanly rebase onto the tip of the tree? If so, I can help you
shepherd these changes upstream.

Jianfeng Liu

Nov 10, 2023, 8:04:53 AM
to Justin Green, Chromium-dev, Alexey Guskov, Tommy Xe200
Hi Justin, I just sent my first commit to code review: https://chromium-review.googlesource.com/c/chromium/src/+/5014723. I'm going to send other patches after this one is merged.

On Thu, Nov 9, 2023 at 02:55, Justin Green <green...@chromium.org> wrote:

Jianfeng Liu

Mar 26, 2024, 11:21:59 AM
to Chromium-dev, Jianfeng Liu, Chromium-dev, Alexey Guskov, Tommy Xe200, Justin Green
> V4L2ImageProcessor: this is a wrapper for a piece of hardware you don't have, so we safely ignore it.
Hi Justin, I'm interested in the V4L2ImageProcessor. On rockchip SoCs there is an rga image processor which provides a v4l2 m2m api in the mainline kernel to do color space conversion. For example, we can use it with a gstreamer command:
`DISPLAY=:0 gst-launch-1.0 videotestsrc ! video/x-raw,format=BGRx ! v4l2convert ! xvimagesink`
Is this rga node compatible with V4L2ImageProcessor?

Justin Green

Mar 27, 2024, 7:37:16 AM
to Jianfeng Liu, Chromium-dev, Alexey Guskov, Tommy Xe200
The V4L2ImageProcessor is currently only designed to work with one
specific Mediatek SoC, and would require some modifications to work
with other image processing IPs.

That being said, perhaps it's worth taking a step back and asking what
your intentions are with this image processing IP? We have
historically found this type of hardware has poor performance
characteristics, so we generally use alternatives whenever possible.
Color conversions, for example, are usually handled by shaders.

Jianfeng Liu

Mar 27, 2024, 7:37:25 AM
to Justin Green, Chromium-dev, Alexey Guskov, Tommy Xe200
Thanks for the explanation. I intended to try rockchip's rga in chromium if chromium were using the standard v4l2 api. Now I will keep using libyuv to do that.
By the way, rockchip has a patched version of chromium which uses egl to directly render NV12, so there is no color conversion. They use a libv4l2 plugin: https://github.com/JeffyCN/libv4l-rkmpp, and here is the egl related patch: https://github.com/JeffyCN/meta-rockchip/blob/master/dynamic-layers/recipes-browser/chromium/chromium_119.0.6045/0001-HACK-media-Support-V4L2-video-decoder.patch#L661-L664.
I guess the libv4l2 method, which has been dropped by chromium, is different from the direct kernel uAPI method chromium is using now. Maybe color conversion is inevitable with mainline chromium.

On Wed, Mar 27, 2024 at 01:34, Justin Green <green...@chromium.org> wrote:

Justin Green

Mar 28, 2024, 5:47:47 PM
to Jianfeng Liu, Chromium-dev, Alexey Guskov, Tommy Xe200
I'm not sure I see the EGL patch in that tree, but if such a thing
exists, then image processing isn't needed at all, and your GL drivers
should be able to just directly consume NV12 frames. That's generally
how it works on most platforms.

As an aside, in the few months since we first talked about this, I've
actually been working on integrating a Vulkan image processor directly
into the rendering stack, complete with synchronization and hardware
overlay support. I did not yet write a NV12->ARGB shader, and I do not
know if your graphics drivers on Rockchip support Vulkan, but this
would likely be a performant path forward if the GL drivers don't
support NV12 natively:
https://source.chromium.org/chromium/chromium/src/+/main:media/gpu/chromeos/vulkan_image_processor.h

Jianfeng Liu

Apr 17, 2024, 7:01:23 PM
to Chromium-dev, Justin Green, Chromium-dev, Alexey Guskov, Tommy Xe200, Jianfeng Liu
Here is a patch to make chromium directly render NV12 with gbm: https://gitlab.freedesktop.org/mesa/mesa/-/issues/7745. I've tested it on rk3588 with the panthor gpu driver and a v4l2 stateless decoder. No image processor is necessary.
We also need a patch to chromium to allow NV12: https://paste.armbian.com/raw/yekopumixi