Hello fellow WebRTC enjoyers,
For a while now I have been struggling to integrate vendor hardware AV1 encoders that don't support tight cropping into my WebRTC clients. It had seemed to me that AV1 lacks header-based edge cropping and instead relies on complex transform block elision logic at the borders.
In particular, the AV1 spec doesn't seem to mention "crop" at all. Likewise, the only mention of padding seems to be OBU byte-stream padding (not picture size padding).
Instead, the AV1 spec has the following to say about the render dimensions:
BEGIN QUOTATION
6.8.5. Render size semantics
The render size is provided as a hint to the application about the desired display size. It has no effect on the decoding process.
render_and_frame_size_different equal to 0 means that the render width and height are inferred from the frame width and height. render_and_frame_size_different equal to 1 means that the render width and height are explicitly coded.
Note: It is allowed for the bitstream to explicitly code the render dimensions in the bitstream even if they are an exact match for the frame dimensions.
render_width_minus_1 plus one is the render width of the frame in luma samples.
render_height_minus_1 plus one is the render height of the frame in luma samples.
END QUOTATION
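To make my reading of the quoted semantics concrete, here is a minimal Python sketch of how the render size would be derived from the header fields. The `FrameSizes` container and the `render_size` helper are hypothetical names I made up for illustration, not anything from the spec or from libwebrtc, and the sketch ignores details such as superres.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FrameSizes:
    # Hypothetical container for the fields named in 6.8.5; not a real API.
    frame_width: int
    frame_height: int
    render_and_frame_size_different: int
    render_width_minus_1: Optional[int] = None
    render_height_minus_1: Optional[int] = None


def render_size(h: FrameSizes) -> Tuple[int, int]:
    """Return (render_width, render_height) in luma samples."""
    if h.render_and_frame_size_different == 1:
        # Explicitly coded: each field stores the dimension minus one.
        return (h.render_width_minus_1 + 1, h.render_height_minus_1 + 1)
    # Not coded: inferred from the frame dimensions.
    return (h.frame_width, h.frame_height)


# Illustrative values only: a 1920x1088 frame that signals 1920x1080.
print(render_size(FrameSizes(1920, 1088, 1, 1919, 1079)))  # (1920, 1080)
```

Nothing in this derivation says what an application should do with the resulting pair, which is the part I find unclear.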
Likewise, the AV1 RTP spec talks a bit about how this size trickles through the RTP headers but doesn't say anything meaningful about the interpretation of this value.
All of this leaves the interpretation of the render size pretty nebulous.
So this morning I was a bit surprised to see this AV1 change land in WebRTC, treating render size as a normative top-left-anchored crop: https://webrtc-review.googlesource.com/c/src/+/381340 (landed as commit 723e219e80b67c30c2828728cdbc2d372a8f4897)
> Deliver render resolution
>
> Encoder may pad frames on right and/or bottom side and indicate true render resolution in frame header. Decoder must
> remove padding pixels.
The commit then links to a private bug.
The interpretation of the borders here doesn't seem to align with my reading of the spec. The commit seems to interpret this render size as an unambiguous mandatory crop.
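For clarity, this is roughly what I understand a "top-left anchored crop" to mean. It is only a sketch of my interpretation of the commit message, not the actual libwebrtc code: `crop_top_left` is a hypothetical helper, the frame is modeled as a plain list of rows, and raising when the render size exceeds the decoded size is purely my own choice.

```python
from typing import List


def crop_top_left(frame: List[List[int]], render_w: int, render_h: int) -> List[List[int]]:
    """Keep the top-left render_w x render_h region and drop the rest.

    `frame` is a list of rows of samples. Under the crop reading, anything
    to the right of render_w or below render_h is treated as padding.
    """
    coded_h = len(frame)
    coded_w = len(frame[0]) if frame else 0
    if render_w > coded_w or render_h > coded_h:
        # Assumption: a crop cannot enlarge the picture, so this sketch rejects it.
        raise ValueError("render size exceeds decoded frame size")
    return [row[:render_w] for row in frame[:render_h]]


# Tiny 4x3 "frame" whose last column and last row play the role of padding.
decoded = [[10 * y + x for x in range(4)] for y in range(3)]
print(crop_top_left(decoded, 3, 2))  # [[0, 1, 2], [10, 11, 12]]
```

Under the "hint" reading quoted from the spec, an application would presumably be equally free to scale the full decoded frame to the render size instead of discarding samples.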
Am I missing some subtlety here? Can any additional context on this change be provided?
Best Regards,
Alex Converse