Issue 611 in webp: webp spec feedback

132 views

Skip to first unread message

tha… via monorail

unread,

May 17, 2023, 10:10:49 AM5/17/23

to webp-d...@webmproject.org

Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 611 by tha...@chromium.org: webp spec feedback
https://bugs.chromium.org/p/webp/issues/detail?id=611

Turning an email into a bug as requested :)

Here's some feedback on the webp lossless spec.

High-level comments first.

a) The good news is that I was able to write a decoder. The bad news is that I had to cheat and look at libwebp for a few things that I couldn't figure out just from reading the spec (not many, but some).

The main points are:
1. it wasn't clear in which order bits appear in the file
2. information for one thing is often spread over several sections
3. the sections aren't granular enough and not structured super well

For the first point, things are clear enough up to section 4. I then found a file that doesn't have any transforms and tried to figure out what bits to read after the `0` transforms bit. (It's the color cache size.) Section 5 "image data" talks about encoding of data a lot. In section 5.2.3 it mentions `int color_cache_code_bits = ReadBits(4);` but it's not at all clear that this is the first part of the image data.

Section 7 helped a lot with this problem, but it contained mistakes (see below) and it also had confusing parts (see also below).

For the second point, consider for example trying to apply the inverse transforms. None of the transform descriptions in section 4 mention how RGBA is mapped to the transform data – this is instead in section 5.1. Why isn't it with the transform description?

I found the "color transform" and "subtract green transform" descriptions easy to understand. (Except that I at first didn't find how to map ARBG to ColorTransformElement.)

For "predictor transform", I wish the section would mention how to combine the stored pixel with the predicted pixel to inverse the transform. (It's just per-channel addition, but that's easy to describe, might as well.)

For "color indexing transform", it wasn't clear to me at all that the pixel bundling changes the size of _all images_ (transform images, main image, meta prefix image) after the color indexing transform. This could imho be made much more explicit.

For the third point, there are many sections in the spec I'd like to link to in comments but can't, since the sections aren't granular enough. (examples: "Decoding the Code Lengths", "Simple Code Length Code", "Normal Code Length Code", "Entropy image", "Interpretation of Meta Prefix Codes")

b)

prefix-codes          =  prefix-code-group *prefix-codes
prefix-code-group     =
    5prefix-code ; See "Interpretation of Meta Prefix Codes" to
                 ; understand what each of these five prefix
                 ; codes are for.

"Interpretation of Meta Prefix Code" is in "6.2.2 Decoding of Meta Prefix Codes" which starts with "Meta prefix codes are indexes identifying which prefix codes to use in different parts of the image.". But even entropy-coded-images have prefix-codes, even though they can't have a meta-prefix. I found this pretty confusing. It'd be good if there was a separate section describing prefix code groups before 6.2.2. since that's something needed for all images – and then have 6.2.2 only talk about what meta prefix codes are.

I also would've found it helpful if section 7 somehow mentioned that only spatially-coded-image with a meta-prefix can have more than one prefix-code-group.

(This is in part because the order of things in the file wasn't clear from the main text, so I relied on section 7 heavily.)


On to the low-level comments.

c)
More serious mistakes:

7.1 says:
format       = RIFF-header image-size image-stream
RIFF-header  = "RIFF" 4OCTET "WEBP" "VP8L" 4OCTET %x2F
image-size   = 14BIT 14BIT ; width - 1, height - 1
image-stream = optional-transform spatially-coded-image

This is missing the 1 bit for alpha and 3 bits for version number after image-size (this is correctly described in section 3).

=> fixed in https://chromium-review.googlesource.com/c/webm/libwebp/+/4416725

d)
This one is more very misleading than a mistake: Towards the end of 6.2.1 Decoding and Building the Prefix Codes, there's:

int num_code_lengths = 4 + ReadBits(4);

and then after:

"Next, if ReadBits(1) == 0, the maximum number of different read symbols is num_code_lengths."

But num_code_lengths doesn't mean the num_code_lengths mentioned a few lines further up in the spec (and in scope here), but alphabet_size!
(See libwebp vp8l_dec.c, ReadHuffmanCode(), which passes alphabet_size to ReadHuffmanCodeLengths() as num_symbols, and ReadHuffmanCodeLengths() then sets max_symbol to that.)


e)
On that note, I found this pretty confusing:

"A prefix table is then built from code_length_code_lengths and used to read up to max_symbol code lengths."

What it's supposed to mean is that we read input symbols until we have produced alphabet_size many output codes, but we read at most max_symbol input symbols. (One input symbol can produce several output codes.) I had to look at the libwebp code to figure this out, and I still got it wrong the first time round.

libwebp also checks that max_symbols is <= alphabet_size, but the spec doesn't mention this. (It's obvious once you understand what's going on, but it'll help understanding what's supposed to go on if it was mentioned in the spec.)


d)
As far as I can tell, nothing mentions that the transforms should be applied in reverse order they're stored in the file.


Benign mistakes:

e)
#define DIV_ROUND_UP(num, den) ((num) + (den) - 1) / (den))
^ This is missing a `(`, the parens in the expansion don't match up

and:
int position =
    (y >> prefix_bits) * prefix_xsize + (x >> prefix_bits);
int meta_prefix_code = (entropy_image[pos] >> 8) & 0xffff;
^ declares a variable `position` and then uses a variable `pos`

=> Both fixed in https://chromium-review.googlesource.com/c/webm/libwebp/+/4412922

g)
7.3 says:
prefix-code           =  simple-prefix-code / normal-prefix-code
simple-prefix-code    =  ; see "Simple Code Length Code" for details
normal-prefix-code    =  code-length-code encoded-code-lengths
code-length-code      =  ; see section "Normal Code Length Code"

But "encoded-code-lengths" isn't defined anywhere. Just
prefix-code           =  simple-prefix-code / normal-prefix-code
simple-prefix-code    =  ; see "Simple Code Length Code" for details
normal-prefix-code    =  ; see section "Normal Code Length Code"

is probably enough and what is meant?

=> fixed in https://chromium-review.googlesource.com/c/webm/libwebp/+/4412923


Nits:

h)
The spec uses the word "dynamics" in three places to mean "bit width". I've never heard that word used like that before and found it a bit weird. s/dynamics/bit width/g would imho be clearer.

=> fixed in https://chromium-review.googlesource.com/c/webm/libwebp/+/4528644

i)
Section 7 links to ABNF. But I think it'd be helpful if it very quickly summarized the parts used. ("`*a` means a can be repeated 0 or more times. `5b` means b is repeated exactly 5 times)

-- 
You received this message because:
  1. The project was configured to send all issue notifications to this address

You may adjust your notification preferences at:
https://bugs.chromium.org/hosting/settings

tha… via monorail

unread,

May 17, 2023, 10:16:31 AM5/17/23

to webp-d...@webmproject.org

Comment #1 on issue 611 by tha...@chromium.org: webp spec feedback
https://bugs.chromium.org/p/webp/issues/detail?id=611#c1

For point a2, I think replacing https://developers.google.com/speed/webp/docs/webp_lossless_bitstream_specification#51_roles_of_image_data with "A webp image always encodes ARGB data. For webp images used to store data for Transformations or Entropy Images, see those sections on how to interpret the ARGB data. Only the main image stores a meta-prefix image." and then in the transformation sections and the entropy image sections saying "This reads an ARGB image as described in section 5. The green component of that image stores which of the 14 predictors is used for each tile. Other channels should be ignored" in the predictor transform description (and likewise for the other transforms, and for the meta prefix image) would be a huge help.

In other words, only describe ARGB in the "how to decode an image" section, and describe how to interpret the ARGB data for each transform in the text for each transform.

jz… via monorail

unread,

May 17, 2023, 5:02:49 PM5/17/23

to webp-d...@webmproject.org

Updates:
	Cc: jy...@google.com vra...@google.com

Comment #2 on issue 611 by jz...@google.com: webp spec feedback
https://bugs.chromium.org/p/webp/issues/detail?id=611#c2

(No comment was entered for this change.)

Reply all

Reply to author

Forward

0 new messages