Looking for the best way to debug a corrupted AV1 bitstream


Eyal Frishman IL

May 19, 2022, 2:24:54 AM
to AV1 Discussion
Hi,

I have generated a bitstream which is clearly corrupted, but so far I have not been able to locate the first point where an error occurs. The error I get is "Warning: Failed to decode frame 2: Corrupt frame detected".

I was wondering whether AOM has decoder flags/options which could help pinpoint the corruption (I was not sure if these are decoder definitions, but I tried "CONFIG_BITSTREAM_DEBUG" and "CONFIG_MISMATCH_DEBUG", which caused assertions even on "good" bitstreams -- it looks like they are not meant for decoding).

Anyway, anything someone can share to help pinpoint the problem (debugs/logs/other) will be most appreciated.

Thanks in advance,
Eyal


PS: Attached my bad bitstream in case it helps somehow.
bad.av1

James Zern

May 19, 2022, 5:29:53 PM
to AV1 Discussion
On Wed, May 18, 2022 at 11:24 PM 'Eyal Frishman IL' via AV1 Discussion <av1-d...@aomedia.org> wrote:
Hi,

I have generated a bitstream which is clearly corrupted, but so far I have not been able to locate the first point where an error occurs. The error I get is "Warning: Failed to decode frame 2: Corrupt frame detected".

ViCueSoft's analyzer will give a little more detail: "trailing_one_bit shall be equal to 1". libgav1 will report something, maybe a little earlier:
ERROR 7f6063601740 obu_parser.cc:2785] Byte alignment has non zero bits.
ERROR 7f6063601740 decoder_impl.cc:1045] Failed to parse OBU.
Unable to dequeue frame: The bitstream is not encoded correctly or violates a bitstream conformance requirement.
 

I was wondering whether AOM has decoder flags/options which could help pinpoint the corruption (I was not sure if these are decoder definitions, but I tried "CONFIG_BITSTREAM_DEBUG" and "CONFIG_MISMATCH_DEBUG", which caused assertions even on "good" bitstreams -- it looks like they are not meant for decoding).

That sounds like a bug. Would you mind filing an issue?
 


Wan-Teh Chang

May 19, 2022, 7:01:51 PM
to av1-d...@aomedia.org
Hi Eyal,

If you are comfortable with reading libaom source code and using a
debugger, you can create a debug build of libaom (pass
-DCMAKE_BUILD_TYPE=Debug to the cmake command) and run aomdec in gdb.

Set a breakpoint at aom_internal_error(). Many (but not all) decoding
errors are reported by calling aom_internal_error().

In addition, set a breakpoint at the following line inside the
aom_decode_frame_from_obus() function in av1/decoder/obu.c:

  switch (obu_header.type) {

That switch statement is responsible for decoding the AV1 OBUs. You
will see temporal delimiter, sequence header, frame, temporal
delimiter, frame, frame, and then the decoding error.

Then run the command "run bad.av1 -o /dev/null" in gdb.
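To cross-check the OBU sequence outside the debugger, the OBU headers can also be walked directly. The following is only a minimal sketch and not libaom code: list_obus() and read_leb128() are hypothetical helper names, and the sketch assumes the data is a raw low-overhead OBU stream in which every OBU has obu_has_size_field set to 1 (it does not unwrap IVF, WebM, or Annex B containers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Reads a leb128 value as described in the AV1 specification. Returns the
 * number of bytes consumed, or 0 if the buffer ends early or the value does
 * not terminate within 8 bytes. */
static size_t read_leb128(const uint8_t *buf, size_t len, uint64_t *value) {
  assert(buf != NULL && value != NULL);
  *value = 0;
  for (size_t i = 0; i < 8 && i < len; ++i) {
    *value |= (uint64_t)(buf[i] & 0x7f) << (i * 7);
    if (!(buf[i] & 0x80)) return i + 1;
  }
  return 0;
}

/* Hypothetical helper: prints one line per OBU and returns the number of
 * OBUs walked, or -1 at the first header that cannot be parsed. */
static int list_obus(const uint8_t *buf, size_t len) {
  size_t pos = 0;
  int count = 0;
  while (pos < len) {
    const uint8_t header = buf[pos];
    const int forbidden = header >> 7;           /* obu_forbidden_bit */
    const int type = (header >> 3) & 0xf;        /* obu_type */
    const int has_extension = (header >> 2) & 1; /* obu_extension_flag */
    const int has_size = (header >> 1) & 1;      /* obu_has_size_field */
    size_t header_len = 1 + (size_t)has_extension;
    /* Assumption of this sketch: OBUs without a size field are rejected. */
    if (forbidden || !has_size || pos + header_len > len) return -1;

    uint64_t payload_size;
    const size_t n = read_leb128(buf + pos + header_len,
                                 len - pos - header_len, &payload_size);
    if (n == 0) return -1;
    header_len += n;
    if (payload_size > len - pos - header_len) return -1;

    printf("offset %zu: obu_type %d, payload %llu bytes\n", pos, type,
           (unsigned long long)payload_size);
    pos += header_len + (size_t)payload_size;
    ++count;
  }
  return count;
}
```

In the AV1 specification, obu_type 1 is a sequence header, 2 a temporal delimiter, and 6 a frame, so the printed list can be compared against the order seen at the switch statement breakpoint.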

By doing this, I found that the first coding error comes from the
byte_alignment() call. I attach a patch (patch.txt) that will print a
more informative error message when that byte_alignment() call fails.
The patch also suppresses the byte_alignment() check (which you can
undo by changing #if 0 to #if 1).
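For reference, the condition that byte_alignment() enforces in the AV1 specification is that every bit between the current read position and the next byte boundary is zero. A minimal sketch of that check follows, assuming the whole buffer is in memory and bits are numbered MSB first; alignment_bits_are_zero() is a hypothetical name used for illustration, not a libaom function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if every bit from bit_pos up to the next byte boundary is zero,
 * 0 if a non-zero bit is found, and -1 if bit_pos lies outside the buffer.
 * Bits are numbered MSB first, which is the order AV1 reads them in. */
static int alignment_bits_are_zero(const uint8_t *buf, size_t len,
                                   size_t bit_pos) {
  assert(buf != NULL);
  if ((bit_pos & 7) == 0) return 1; /* already aligned, nothing to check */
  const size_t byte = bit_pos >> 3;
  if (byte >= len) return -1;
  const unsigned remaining = 8 - (unsigned)(bit_pos & 7); /* 1..7 bits left */
  const uint8_t mask = (uint8_t)((1u << remaining) - 1u); /* low bits only */
  return (buf[byte] & mask) == 0;
}
```

A result of 0 from this kind of check is the situation libgav1 describes as "Byte alignment has non zero bits".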

After I suppressed the byte_alignment() failure, the decoding still
failed, this time in the decode_tiles() function in
av1/decoder/decodeframe.c.

  // decode tile
  decode_tile(pbi, td, row, col);
  aom_merge_corrupted_flag(&pbi->dcb.corrupted, td->dcb.corrupted);
  if (pbi->dcb.corrupted)
    aom_internal_error(&pbi->error, AOM_CODEC_CORRUPT_FRAME,
                       "Failed to decode tile data");

This decoding error is reported by aom_internal_error(), so I caught
it with the first breakpoint.

The next thing to try is to track down where libaom sets the
td->dcb.corrupted flag to 1. I searched for "corrupted" in
av1/decoder/decodeframe.c and set a breakpoint at every place where we
set the 'corrupted' struct member to 1. I hit a breakpoint at the end
of decode_tile():

  int corrupted =
      (check_trailing_bits_after_symbol_coder(td->bit_reader)) ? 1 : 0;
  aom_merge_corrupted_flag(&dcb->corrupted, corrupted);

Now I applied the second attached patch (patch2.txt) to suppress the
errors in check_trailing_bits_after_symbol_coder(), and that finally
allowed me to decode bad.av1 successfully. So both decoding errors
have to do with trailing bits. Please consult the AV1 specification
for the requirements on the byte_alignment() and trailing_bits()
functions.
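As a rough illustration of the trailing_bits() syntax in the specification: the first trailing bit (trailing_one_bit) must be 1, and every bit after it up to the end of the OBU payload must be 0. The sketch below checks that pattern on a payload held in memory. It is an assumption-laden illustration, not the logic of libaom's check_trailing_bits_after_symbol_coder(); get_bit() and trailing_bits_are_valid() are hypothetical names, and bit_pos is assumed to be the position where the syntax elements end.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the bit at bit_pos, counting from the MSB of buf[0]. */
static int get_bit(const uint8_t *buf, size_t bit_pos) {
  return (buf[bit_pos >> 3] >> (7 - (bit_pos & 7))) & 1;
}

/* Returns 1 if the bits from bit_pos to the end of the payload form a valid
 * trailing_bits() pattern (a single 1 followed only by 0s), else 0. */
static int trailing_bits_are_valid(const uint8_t *buf, size_t len,
                                   size_t bit_pos) {
  assert(buf != NULL);
  const size_t total_bits = len * 8;
  if (bit_pos >= total_bits) return 0;  /* no room for trailing_one_bit */
  if (!get_bit(buf, bit_pos)) return 0; /* trailing_one_bit must be 1 */
  for (size_t p = bit_pos + 1; p < total_bits; ++p) {
    if (get_bit(buf, p)) return 0;      /* trailing_zero_bit must be 0 */
  }
  return 1;
}
```

A first trailing bit of 0 corresponds to the analyzer message "trailing_one_bit shall be equal to 1" quoted earlier in the thread.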

Note: You can also try libgav1 as James Zern suggested. libgav1
generally has better error reporting.

Wan-Teh
patch.txt
patch2.txt

James Zern

May 31, 2022, 12:58:35 PM
to Eyal Frishman, AV1 Discussion
Hi Eyal,

On Sun, May 29, 2022 at 7:06 AM Eyal Frishman <eyal...@gmail.com> wrote:
Hi James,

Thanks for the input. I actually use VQ-Analyzer and was not able to see this level of error. I will try to see what I'm doing wrong.

Make sure you have 'View->Status' enabled and then advance to the third frame. In the 'Errors' tab you should see the message I listed below. I was viewing the file with version 6.5.0.
 
Also, I am not able to access the link you shared for reporting a bug.

That's strange. Do you see a 404 or another error? You can try the base url https://bugs.chromium.org/p/aomedia/issues/list and use the new issue button.
 

Thanks again,
Eyal


PS: It took me a while to reply as I just found these emails in my Junk-Mail (hopefully it won't happen again!).