[C++] Unbounded expansion factor in verifier?


Antoine Pitrou

Feb 8, 2021, 1:18:52 PM
to flatb...@googlegroups.com

Hello,

The Apache Arrow IPC format uses Flatbuffers for metadata serialization.

We're currently covering our IPC read routines with OSS-Fuzz to hunt for
potential security risks when ingesting untrusted data. We rely on
flatbuffers::Verifier to ensure that incoming messages are valid
Flatbuffers data, before doing our own higher-level validation.

However, it seems that the runtime of Flatbuffers verification can grow
without bound relative to the input size. OSS-Fuzz recently produced a
1.5 kB buffer that expands to 268 million Flatbuffers tables, causing a
timeout. This is a bit surprising to me, since it means that a table can
effectively occupy 0 bytes in a buffer (not even its size seems to be
encoded). Is this expected?

I know we can limit the maximum number of tables when verifying. Some of
our users have use cases for a very large number of tables, which is why
we relaxed that maximum. On the other hand, the potential expansion
factor seems so great that a denial of service may still be feasible
using many tiny files. Is that right?

Thanks in advance for your insight.

Regards

Antoine.

Владимир Г.

Feb 9, 2021, 9:25:52 AM
to FlatBuffers
Hello, Antoine

You are right, this is unexpected behavior.
The number of tables can't exceed the size of the buffer in bytes.
Probably this buffer has a cycle inside (table offset might be signed).

Can you share this OSS-Fuzz input and fbs-schema under test?

Regards
Vladimir.

On Tuesday, February 9, 2021 at 01:18:52 UTC+7, Antoine Pitrou wrote:

Antoine Pitrou

Feb 9, 2021, 10:06:20 AM
to FlatBuffers

Hello Vladimir,

Thanks for your answer. I've distilled the issue into a standalone
reproducer (including the binary file) here:
https://github.com/pitrou/flatbuf_issue

Regards

Antoine.
> --
> You received this message because you are subscribed to the Google
> Groups "FlatBuffers" group.
> To unsubscribe from this group and stop receiving emails from it, send
> an email to flatbuffers...@googlegroups.com
> <mailto:flatbuffers...@googlegroups.com>.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/flatbuffers/2367f39d-d304-4377-bf82-b9861d1335d5n%40googlegroups.com
> <https://groups.google.com/d/msgid/flatbuffers/2367f39d-d304-4377-bf82-b9861d1335d5n%40googlegroups.com?utm_medium=email&utm_source=footer>.

Владимир Г.

Feb 9, 2021, 2:30:23 PM
to FlatBuffers
This is not a bug; it is table deduplication plus OSS-Fuzz magic.

There are only 21 unique Field tables in that OSS-Fuzz buffer.
Every Field[k] has N[k] children that all point to the same Field[k+1].
Field[0] has 4 children pointing to Field[1],
Field[1] has 2 children pointing to Field[2], and so on.
In total: 4*2*2*4*2*2*4*3*2*2*1*4*2*2*4*3*2*2*1*1 = 9,437,184
The `Schema` has 4 fields, so we have 37,748,736 Field tables.
The verifier checked 268,285,953 tables in total. This is of the same order as the 37 million Field tables (every Field table also has a `metadata` vector).
This issue can be easily solved with an additional traversal tree in the verifier.
But in the worst case, this tree requires the same amount of memory as the buffer under test.
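
To make the arithmetic above concrete, here is a minimal sketch of the expansion. It assumes a simplified model that is not taken from the reproducer itself: a linear chain of unique tables in which the table at level k references the single table at level k+1 exactly N[k] times. The function names are hypothetical. `CountPaths` reproduces the product quoted above, and `CountNaiveVisits` counts how many table visits a traversal performs if it does not remember which tables it has already seen.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical model: fanout[k] is how many times the unique table at
// level k references the single unique table at level k + 1.

// Number of distinct reference paths from one level-0 table to the
// deepest table: the product of all fan-out factors.
std::uint64_t CountPaths(const std::vector<std::uint64_t>& fanout) {
  std::uint64_t paths = 1;
  for (std::uint64_t n : fanout) {
    paths *= n;
  }
  return paths;
}

// Number of table visits made by a traversal that re-verifies a shared
// table every time it is referenced, starting from `roots` references
// to the level-0 table.
std::uint64_t CountNaiveVisits(std::uint64_t roots,
                               const std::vector<std::uint64_t>& fanout) {
  const std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t visits_at_level = roots;  // visits of the level-0 table
  std::uint64_t total = visits_at_level;
  for (std::uint64_t n : fanout) {
    assert(n == 0 || visits_at_level <= kMax / n);  // no overflow
    visits_at_level *= n;  // every visit descends into n children
    assert(total <= kMax - visits_at_level);        // no overflow
    total += visits_at_level;
  }
  return total;
}
```

With the 20 fan-out factors listed above, `CountPaths` returns 9,437,184, while the number of unique tables in this model is only `fanout.size() + 1`, that is 21. The visit count grows with the product of the fan-outs, whereas the buffer size grows only with their sum.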


On Tuesday, February 9, 2021 at 22:06:20 UTC+7, Antoine Pitrou wrote:

Antoine Pitrou

Feb 9, 2021, 2:49:38 PM
to FlatBuffers

Thanks a lot for the diagnosis. I was unaware that Flatbuffers could
deduplicate tables. It would be worth discussing this in the verifier
docstrings, IMHO. Otherwise it is difficult to understand why the
max_tables parameter may be important.

Best regards

Antoine.

Владимир Г.

Feb 10, 2021, 6:17:20 AM
to FlatBuffers
I think it is possible to extend flatbuffers::Verifier with a std::set<voffset_t> that stores already-verified vtables.
This lookup table could be enabled by a macro, similar to FLATBUFFERS_TRACK_VERIFIER_BUFFER_SIZE.
Alternatively, its size could be bounded at run time.
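
A rough sketch of what such a run-time bounded lookup table might look like is given below. Everything here is an assumption made for illustration: the class `VerifiedOffsetCache` and its methods are hypothetical and are not part of the FlatBuffers API, and the sketch stores 32-bit offsets of tables that were already seen. A real implementation would probably also need to record the table type, since the same offset could be referenced as two different types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Hypothetical helper, not part of the FlatBuffers API.
// Remembers up to `max_entries` offsets that were already verified.
class VerifiedOffsetCache {
 public:
  explicit VerifiedOffsetCache(std::size_t max_entries)
      : max_entries_(max_entries) {}

  // Returns true when the caller still has to verify the table at
  // `offset`, false when that offset was already recorded.
  // Recording on first encounter is enough under the assumption that a
  // failed verification aborts the whole run.
  bool NeedsVerification(std::uint32_t offset) {
    if (seen_.count(offset) != 0) return false;
    if (seen_.size() < max_entries_) seen_.insert(offset);
    // When the cache is full the offset is not remembered, so later
    // references are verified again: slower, but memory stays bounded.
    assert(seen_.size() <= max_entries_);
    return true;
  }

  std::size_t size() const { return seen_.size(); }

 private:
  std::size_t max_entries_;
  std::set<std::uint32_t> seen_;
};
```

Once the bound is reached, the cache degrades to the current behavior for new offsets, so a table limit such as `max_tables` would still be needed as a backstop.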

On Wednesday, February 10, 2021 at 02:49:38 UTC+7, Antoine Pitrou wrote:

Antoine Pitrou

Feb 10, 2021, 6:19:30 AM
to FlatBuffers

For now, the strategy we adopted is to make `max_tables` a function of
the size of the buffer being verified. On real-world data, there
shouldn't be many identical tables (but I suppose that depends on the
use case).
https://github.com/apache/arrow/pull/9447/files
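
As an illustration of that idea only (the linked pull request is the authoritative version), a size-dependent limit could be computed along these lines. The helper name `MaxTablesForBuffer` and the `tables_per_byte` factor are assumptions made for this sketch, not values taken from Arrow or FlatBuffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: derives a table budget from the buffer size.
// `tables_per_byte` is an assumed tuning knob, not a FlatBuffers constant.
// The result saturates at the largest 32-bit value instead of wrapping.
std::uint32_t MaxTablesForBuffer(std::size_t buffer_size,
                                 std::uint32_t tables_per_byte) {
  const std::uint64_t limit = std::numeric_limits<std::uint32_t>::max();
  if (tables_per_byte != 0 && buffer_size > limit / tables_per_byte) {
    return static_cast<std::uint32_t>(limit);
  }
  const std::uint64_t budget =
      static_cast<std::uint64_t>(buffer_size) * tables_per_byte;
  assert(budget <= limit);
  return static_cast<std::uint32_t>(budget);
}

// Usage sketch, assuming the flatbuffers::Verifier constructor that
// accepts max_depth and max_tables arguments:
//   flatbuffers::Verifier verifier(data, size, /*max_depth=*/64,
//                                  MaxTablesForBuffer(size, 8));
```

With a limit of this shape, a 1.5 kB input can trigger at most a few thousand table verifications, so the cost of rejecting a malicious buffer stays proportional to its size.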

Regards

Antoine.