Flatbuffers and message packing

John Zorko

Apr 21, 2022, 1:58:08 PM
to FlatBuffers
Hello, all ...

Does FlatBuffers support message packing like Cap'n Proto does? If so, does it rely on unaligned memory accesses like Cap'n Proto does? (This is running on a Cypress PSoC6, whose M0 core does not allow unaligned memory access.) I ask because FlatBuffers seems to introduce a significant (24-byte) overhead with a schema like this:

namespace MyNamespace;

struct S1 {
  x:short;
  y:short;
  z:short;
}

struct S2 {
  s2:[ubyte: 16];
}

struct S3 {
  s3:[ubyte: 6];
}

table MyTable {
  index:ubyte;
  s2:S2;
  value1:short;
  value2:ubyte;
  s1:S1;
  s3:S3;
}

... in that the raw data is only 32 bytes, but the flatbuffer version (using flatcc) is 56 bytes, which in our case is significant. Is there a way to cut that down somewhat?
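For reference, the 32-byte raw figure can be reproduced by packing the same fields back to back with no padding. This is only a sketch using Python's `struct` module; the field order and the little-endian format string are assumptions made for illustration and are not something flatcc emits.

```python
import struct

# Assumed packed little-endian layout mirroring MyTable's fields:
# index:ubyte, s2:[ubyte:16], value1:short, value2:ubyte,
# s1: three shorts (x, y, z), s3:[ubyte:6]
RAW_FORMAT = "<B16shB3h6s"

raw_size = struct.calcsize(RAW_FORMAT)  # "<" disables native padding
flatbuffer_size = 56                    # size reported for the flatcc buffer

print(raw_size)                    # 32
print(flatbuffer_size - raw_size)  # 24
```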

Regards,

John

Derek Bailey

Apr 21, 2022, 2:42:11 PM
to John Zorko, FlatBuffers
Hi John,

I think this is an excellent case to use the new `--annotate` feature I recently added to flatc. It will annotate the buffer and help show you why the size is 56 bytes, and possibly can help you optimize the schema a bit to avoid padding and other overheads. 

Now, you mentioned you used flatcc, so this might not match flatc one-for-one, but I copied your schema, made a simple test binary, and this is the annotated output I get:

// Annotated Flatbuffer Binary
//
// Schema file: test.fbs
// Binary file: test.bin

header:
  +0x00 | 14 00 00 00             | UOffset32 | 0x00000014 (20) Loc: +0x14 | offset to root table `MyTable`

vtable (MyTable):
  +0x04 | 10 00                   | uint16_t  | 0x0010 (16)                | size of this vtable
  +0x06 | 24 00                   | uint16_t  | 0x0024 (36)                | size of referring table
  +0x08 | 04 00                   | VOffset16 | 0x0004 (4)                 | offset to field `index` (id: 0)
  +0x0A | 08 00                   | VOffset16 | 0x0008 (8)                 | offset to field `s2` (id: 1)
  +0x0C | 06 00                   | VOffset16 | 0x0006 (6)                 | offset to field `value1` (id: 2)
  +0x0E | 05 00                   | VOffset16 | 0x0005 (5)                 | offset to field `value2` (id: 3)
  +0x10 | 18 00                   | VOffset16 | 0x0018 (24)                | offset to field `s1` (id: 4)
  +0x12 | 1E 00                   | VOffset16 | 0x001E (30)                | offset to field `s3` (id: 5)

root_table (MyTable):
  +0x14 | 10 00 00 00             | SOffset32 | 0x00000010 (16) Loc: +0x04 | offset to vtable
  +0x18 | 01                      | uint8_t   | 0x01 (1)                   | table field `index` (UByte)
  +0x19 | 03                      | uint8_t   | 0x03 (3)                   | table field `value2` (UByte)
  +0x1A | 02 00                   | int16_t   | 0x0002 (2)                 | table field `value1` (Short)
  +0x1C | 00                      | uint8_t   | 0x00 (0)                   | array field `S2.s2`[0] (UByte)
  +0x1D | 01                      | uint8_t   | 0x01 (1)                   | array field `S2.s2`[1] (UByte)
  +0x1E | 02                      | uint8_t   | 0x02 (2)                   | array field `S2.s2`[2] (UByte)
  +0x1F | 03                      | uint8_t   | 0x03 (3)                   | array field `S2.s2`[3] (UByte)
  +0x20 | 04                      | uint8_t   | 0x04 (4)                   | array field `S2.s2`[4] (UByte)
  +0x21 | 05                      | uint8_t   | 0x05 (5)                   | array field `S2.s2`[5] (UByte)
  +0x22 | 06                      | uint8_t   | 0x06 (6)                   | array field `S2.s2`[6] (UByte)
  +0x23 | 07                      | uint8_t   | 0x07 (7)                   | array field `S2.s2`[7] (UByte)
  +0x24 | 08                      | uint8_t   | 0x08 (8)                   | array field `S2.s2`[8] (UByte)
  +0x25 | 09                      | uint8_t   | 0x09 (9)                   | array field `S2.s2`[9] (UByte)
  +0x26 | 0A                      | uint8_t   | 0x0A (10)                  | array field `S2.s2`[10] (UByte)
  +0x27 | 0B                      | uint8_t   | 0x0B (11)                  | array field `S2.s2`[11] (UByte)
  +0x28 | 0C                      | uint8_t   | 0x0C (12)                  | array field `S2.s2`[12] (UByte)
  +0x29 | 0D                      | uint8_t   | 0x0D (13)                  | array field `S2.s2`[13] (UByte)
  +0x2A | 0E                      | uint8_t   | 0x0E (14)                  | array field `S2.s2`[14] (UByte)
  +0x2B | 0F                      | uint8_t   | 0x0F (15)                  | array field `S2.s2`[15] (UByte)
  +0x2C | 04 00                   | int16_t   | 0x0004 (4)                 | struct field `S1.x` (Short)
  +0x2E | 05 00                   | int16_t   | 0x0005 (5)                 | struct field `S1.y` (Short)
  +0x30 | 06 00                   | int16_t   | 0x0006 (6)                 | struct field `S1.z` (Short)
  +0x32 | 00                      | uint8_t   | 0x00 (0)                   | array field `S3.s3`[0] (UByte)
  +0x33 | 01                      | uint8_t   | 0x01 (1)                   | array field `S3.s3`[1] (UByte)
  +0x34 | 02                      | uint8_t   | 0x02 (2)                   | array field `S3.s3`[2] (UByte)
  +0x35 | 03                      | uint8_t   | 0x03 (3)                   | array field `S3.s3`[3] (UByte)
  +0x36 | 04                      | uint8_t   | 0x04 (4)                   | array field `S3.s3`[4] (UByte)
  +0x37 | 05                      | uint8_t   | 0x05 (5)                   | array field `S3.s3`[5] (UByte)



First, you can see in the bottom portion of that file that everything is packed with no padding. Part of that is because most of your types are 1 byte wide, so no padding is needed between them, and your 2-byte types happen to land on 2-byte boundaries.

Secondly, the overhead comes from the vtable. That alone adds 16 bytes, plus 4 bytes for the header and 4 bytes in the table to point to the vtable used. That is the 24-byte "overhead" you get using FlatBuffers.
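To make the breakdown concrete, here is a rough sketch that rebuilds the 56 bytes from the annotated dump by hand and follows the same path a reader takes: header, then table, then vtable, then field. It uses only Python's `struct` module, not the FlatBuffers library; the `read_field` helper is a made-up name for illustration.

```python
import struct

# The 56 bytes from the annotated dump, rebuilt by hand.
buf = bytes.fromhex(
    "14000000"                                 # header: offset to root table
    "1000" "2400"                              # vtable size, table size
    "0400" "0800" "0600" "0500" "1800" "1e00"  # per-field offsets (ids 0..5)
    "10000000"                                 # table: offset back to vtable
    "01" "03" "0200"                           # index, value2, value1
) + bytes(range(16)) + struct.pack("<3h", 4, 5, 6) + bytes(range(6))

root = struct.unpack_from("<I", buf, 0)[0]               # table position
vtable = root - struct.unpack_from("<i", buf, root)[0]   # vtable position
vtable_size, table_size = struct.unpack_from("<2H", buf, vtable)
n_fields = (vtable_size - 4) // 2                        # 2 bytes per field
field_offsets = struct.unpack_from(f"<{n_fields}H", buf, vtable + 4)

def read_field(field_id, fmt):
    """Hypothetical helper: unpack a field relative to the table start."""
    return struct.unpack_from(fmt, buf, root + field_offsets[field_id])

index = read_field(0, "<B")[0]
value1 = read_field(2, "<h")[0]
s1 = read_field(4, "<3h")

overhead = 4 + vtable_size + 4   # header + vtable + vtable reference
payload = table_size - 4         # table minus its vtable reference
print(len(buf), overhead, payload)  # 56 24 32
```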

There are some things you can do to minimize it (reduce the number of fields, since each field adds 2 bytes of overhead), but much of the overhead is 'fixed' and cannot be changed. 
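As a back-of-the-envelope sketch of that trade-off, the function below estimates the size of a buffer holding a single table. It assumes every field is present, no vtable sharing, and no alignment padding, so treat the result as a lower bound; `table_buffer_size` is a name invented here, not part of flatc or flatcc.

```python
def table_buffer_size(num_fields: int, payload_bytes: int) -> int:
    """Estimate the size of a buffer holding one table (padding ignored)."""
    header = 4                   # offset to the root table
    vtable = 4 + 2 * num_fields  # two size fields plus 2 bytes per field
    vtable_ref = 4               # table's offset back to its vtable
    return header + vtable + vtable_ref + payload_bytes

print(table_buffer_size(6, 32))  # 56, matches the dump
print(table_buffer_size(4, 32))  # 52
print(table_buffer_size(2, 32))  # 48
```

Grouping fields into fewer structs only removes the 2-byte vtable entries; the 12 fixed bytes remain for any table-rooted buffer.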

Derek



Derek Bailey

Apr 21, 2022, 2:45:02 PM
to John Zorko, FlatBuffers
Here are the commands and JSON file I used:

flatc -b test.fbs test.json
flatc --annotate test.fbs -- test.bin

This is the test.json:

{
  "index": 1,
  "s2": {
    "s2": [
      0,
      1,
      2,
      3,
      4,
      5,
      6,
      7,
      8,
      9,
      10,
      11,
      12,
      13,
      14,
      15
    ]
  },
  "value1": 2,
  "value2": 3,
  "s1": {
    "x": 4,
    "y": 5,
    "z": 6
  },
  "s3": {
    "s3": [
      0,
      1,
      2,
      3,
      4,
      5
    ]
  }
}

mikkelfj

Apr 21, 2022, 3:46:00 PM
to FlatBuffers
There is an overhead with each object in FlatBuffers. The more you put into the same table or vector (aka array), and the more fields of the same type you group together to avoid padding overhead, the more efficient the buffer will be. This is similar to the overhead in C memory structures, although FlatBuffers has a bit more overhead to deal with versioning etc.

It has been a long time since I looked at Cap'n Proto, but as I recall, it works largely the same as FlatBuffers in memory layout, except it doesn't use vtables to decide which members of a table are present, so it stores all members, even if they are zero / null values. Instead, Cap'n Proto offers to compress runs of zero values so that the end result takes up less space, but I think it needs a fast decompression step before reading, and I think this compression is optional.

FlatBuffers sort of has compression via vtables, but it is mostly visible when you only use a few among many fields present.
FlatBuffers does require the buffer to be aligned, so if you read it from a sufficiently aligned memory position, you should be safe. If you do not care (as on Intel platforms), you can get away with unaligned memory, but if you choose to verify the buffer, you probably will get an alignment error.
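A small sketch of what "sufficiently aligned" means for this particular buffer: each multi-byte scalar must end up at an address that is a multiple of its size. The offsets below are taken from Derek's dump (vtable entries are left out for brevity); the `misaligned` helper and the example base addresses are invented for illustration.

```python
# (offset within the buffer, scalar size in bytes), from the annotated dump
MULTI_BYTE_SCALARS = {
    "root offset": (0x00, 4),
    "vtable offset in table": (0x14, 4),
    "value1": (0x1A, 2),
    "S1.x": (0x2C, 2),
    "S1.y": (0x2E, 2),
    "S1.z": (0x30, 2),
}

def misaligned(base_address, fields=MULTI_BYTE_SCALARS):
    """Return the fields that would need an unaligned access."""
    return [
        name
        for name, (offset, size) in fields.items()
        if (base_address + offset) % size != 0
    ]

print(misaligned(0x20000000))  # [] when the buffer starts 4-byte aligned
print(misaligned(0x20000002))  # only the 4-byte offsets are misaligned
print(misaligned(0x20000001))  # every listed field is misaligned
```

So on a core without unaligned access, placing the buffer at a 4-byte-aligned address is enough for this schema.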

Now, you can also just compress the buffer with gzip, zstd, or lz4 compression, as examples, but that is outside of FlatBuffers per se.
It has been shown earlier that JSON compresses better than FlatBuffers for large datasets, and FlatBuffers better than JSON for small datasets. This has to do with the use of offsets (relative pointers), and reuse of names in JSON.
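As a minimal sketch of compressing outside of FlatBuffers, the snippet below wraps a finished buffer with Python's standard `zlib` module (the DEFLATE algorithm gzip uses). The payload is a stand-in for a real buffer, and the helper names are made up. The receiver has to decompress into suitably aligned memory before reading.

```python
import zlib

def compress_buffer(buf: bytes, level: int = 9) -> bytes:
    """Compress a finished FlatBuffers buffer for transport or storage."""
    return zlib.compress(buf, level)

def decompress_buffer(blob: bytes) -> bytes:
    """Restore the original buffer bytes before handing them to a reader."""
    return zlib.decompress(blob)

payload = bytes(range(56))  # stand-in for a 56-byte buffer
blob = compress_buffer(payload)
restored = decompress_buffer(blob)

# Tiny buffers may not shrink at all once the container overhead is counted.
print(len(payload), len(blob), restored == payload)
```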

Finally, if your data needs to be small, simple, and fast, and you only use C, you can use flatcc's struct buffer feature:
In this case, the root of the buffer is a struct, very similar to a C struct, and you cannot have strings, tables, unions, or anything but inline fixed size arrays (including char arrays in C).
The benefit of struct buffers over native C structs is mainly that you get to have a schema, and you are guaranteed portability across platforms with different C compilers and endianness. The benefit over FlatBuffers with tables as root is speed and usually size for simple messages at the cost of portability since other implementations do not support struct buffers.

BTW: to better understand Derek's dump (excellent feature) you may want to have a look at:


It contains an example that explains a sample dump (which inspired the current dump format, but isn't exactly the same).

Mikkel