Issue 551 in webp: VP8L specification is incredible wrong

17 views
Skip to first unread message

j… via monorail

unread,
Jan 17, 2022, 2:15:04 AMJan 17
to webp-d...@webmproject.org
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 551 by j...@minteronline.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551

I am working on my own WebP decoder and currently focusing on lossless (VP8L) decoding and have found myself looking at the reference source code (libwebp) more and more frequently due to how incorrect the specification (https://developers.google.com/speed/webp/docs/webp_lossless_bitstream_specification) is.

So far I have found the following issues, although there may be more. Most of these issues have to do with missing details for the entropy codes:
- the simple entropy codes is just straight up wrong. It suggests you read the lengths for values 0 and 1 when instead you read the two values and set their lengths to 1.
- the normal entropy codes is missing half its details. It only explains how to read the codes lengths used in decoding the code lengths, however only briefly mentions how actually decode the real code lengths. Notably, doesn't mention the optional max symbols that can be prefixed at the start.
- the section on entropy codes suggest and the structure of the data stream ACTUALLY STATES that the meta Huffman image comes before the normal image data (notably, before the color cache) when turns out this is very much incorrect. The meta Huffman image seems to come after the color cache and before the Huffman groups.
- nowhere does it state that if a Huffman table only has one value it should not read any bits. This makes sense to improve compressibility but still should be explicitly stated somewhere.

Kind regards, a very annoyed and confused developer

--
You received this message because:
1. The project was configured to send all issue notifications to this address

You may adjust your notification preferences at:
https://bugs.chromium.org/hosting/settings

pasca… via monorail

unread,
Jan 17, 2022, 7:36:40 AMJan 17
to webp-d...@webmproject.org

Comment #1 on issue 551 by pasca...@gmail.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551#c1

Hi,

could you post the exact paragraph (or lines) where you found the issue(s)? Or at least quote the exact sentence with the problem?
This would help making sure we're on the same page...

thanks!

j… via monorail

unread,
Jan 17, 2022, 7:40:57 PMJan 17
to webp-d...@webmproject.org

Comment #2 on issue 551 by j...@minteronline.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551#c2

re: simple entropy codes, under 5.2.2(i) it has the pseudo code:
[code]
int is_first_8bits = ReadBits(1);
code_lengths[0] = ReadBits(1 + 7 * is_first_8bits);
if (num_code_lengths == 2) {
code_lengths[1] = ReadBits(8);
}
[/code]
while in the libwebp reference decoder inside vp8l_dec.c in the function ReadHuffmanCode:
[code]
code_lengths[symbol] = 1;
// The second code (if present), is always 8 bit long.
if (num_symbols == 2) {
symbol = VP8LReadBits(br, 8);
code_lengths[symbol] = 1;
}
[/code]
This makes more sense since if there are only two values in the huffman table they should both have a length of 1, but is different to the spec

re: normal entropy codes, under 5.2.2(ii) it briefly states how to decode it but is missing a lot of details. These extra details can be found in the later half of ReadHuffmanCode and ReadHuffmanCodeLengths (in vp8l_dec.c). Notably the spec never fully explains how to use the code lengths as described in 5.2.2(ii) to decode the real code lengths, and does not mention reading the max_symbol parameter as shown in the reference decoder:
[code]
if (VP8LReadBits(br, 1)) { // use length
const int length_nbits = 2 + 2 * VP8LReadBits(br, 3);
max_symbol = 2 + VP8LReadBits(br, length_nbits);
if (max_symbol > num_symbols) {
goto End;
}
} else {
max_symbol = num_symbols;
}
[/code]

re: order of huffman code reading, in section 4.1.1 it suggests that the ARGB image data is encoded the exact same way as other image data (except for the extra huffman groups). Also in section 5.2 of the spec the order suggests that the meta huffman codes is before the main entropy-coded image.
Along with this, in section 6, it actually states that:
<image stream> ::= <optional-transform><spatially-coded image>
[...]
<spatially-coded image> ::= <meta huffman><entropy-coded image>
<entropy-coded image> ::= <color cache info><huffman codes><lz77-coded image>
<meta huffman> ::= 1-bit value 0 |
(1-bit value 1; <entropy image>)
Again, showing that the meta huffman is before the rest of the entropy coded image. However in DecodeImage in the reference decoder this seems to be missing, and instead can be found in ReadHuffmanCodes right before reading the huffman groups. Thus the order should be:
<spatially-coded image> ::= <color cache info><meta huffman><huffman codes><lz77-coded image>
<entropy-coded image> ::= <color cache info><huffman codes><lz77-coded image>
And sections 4.1.1 and 5.2 should better explain how the main ARGB image is different to other entropy-coded images in that it embeds the meta huffman in the middle of it, after the color cache.

re: one value huffmans, in the reference decoder in the source file huffman_utils.c in the function BuildHuffmanTable there is this code snippet:
[code]
// Special case code with only one value.
if (offset[MAX_ALLOWED_CODE_LENGTH] == 1) {
if (sorted != NULL) {
HuffmanCode code;
code.bits = 0;
code.value = (uint16_t)sorted[0];
ReplicateValue(table, 1, total_size, code);
}
return total_size;
}
[/code]
Yet again, this is something that logically makes sense, but is completely absent from the specification and would be very helpful to mention somewhere. Perhaps under 5.2.2(i) as a note?

pasca… via monorail

unread,
Jan 18, 2022, 8:47:24 AMJan 18
to webp-d...@webmproject.org
Updates:
Status: Accepted

Comment #3 on issue 551 by pasca...@gmail.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551#c3

Thanks for the notes! I'll have a look at each individually. Please tell me if the rewording seems fine to you:

Issue 5.2.2(i)

Indeed, the spec is wrong. It should rather be:

=============

This variant is used in the special case when only 1 or 2 Huffman symbols in range [0,255] with code length '1', associated with code '0' and '1' respectively.

The first bit indicates the number of symbols:

int num_symbols= ReadBits(1) + 1;

Following are the symbol values. The first symbol can be in range [0,1] or [0,255] according to the first bit value 'is_first_8bits'.
This first symbol is coded using 1 or 8 bits depending on the value of 'is_first_8bits'.
The second symbol, if present, is always assumed in range [0,255] and coded using 8 bits.

int is_first_8bits = ReadBits(1);
symbol0 = ReadBits(1 + 7 * is_first_8bits);
code_lengths[symbol0] = 1;
if (num_symbols == 2) {
symbol1 = ReadBits(8);
code_lengths[symbol1] = 1;
}

=============

Is this wording clearer to you?

j… via monorail

unread,
Jan 18, 2022, 6:34:43 PMJan 18
to webp-d...@webmproject.org

Comment #4 on issue 551 by j...@minteronline.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551#c4

re: "associated with code '0' and '1' respectively" I would maybe change it to "huffman codes" to be a bit more clear and probably remove the word "respectively" since that indicates that the first symbol gets code 0 and the second symbol gets code 1 when that's not necessarily the case (from my understanding the lower value gets value 0, although that is more likely to be code 0). It is probably better to clarify how code lengths relate to actual huffman cods although that is already done via linking to the wikipedia article that does a great job at explaining it.

Other than that the rest of it sounds great :)

j… via monorail

unread,
Jan 18, 2022, 6:35:32 PMJan 18
to webp-d...@webmproject.org

Comment #5 on issue 551 by j...@minteronline.com: VP8L specification is incredible wrong
https://bugs.chromium.org/p/webp/issues/detail?id=551#c5

correction: the lower values gets code 0, although that is more likely to be the first value
Reply all
Reply to author
Forward
0 new messages