How to safely use a nested flatbuffers?


John Wiseman

Nov 24, 2016, 12:10:03 AM
to FlatBuffers

hi,
I saw the previous discussion about nested FlatBuffers, and Wouter mentioned ForceVectorAlignment. So, what is the portable way to embed a FlatBuffer in a vector?

mikkelfj

Nov 27, 2016, 12:33:58 PM
to FlatBuffers



I just want to point out that flatcc (the C interface) does not support forced vector alignment, but it is good to see that it has been added, so flatcc will also support this at some point.

More generally, you can either add a nested flatbuffer using a dedicated API method for this, if the language interface supports it. If so, the nested schema must be visible to the containing schema and the nested buffer will be aligned correctly.

In practice, many just store a nested buffer as a ubyte vector into which they copy an independently generated buffer. This avoids issues with lack of language support and also avoids requiring one schema to know about the other. Unfortunately, this means that any alignment above 4 bytes is not guaranteed. With the forced vector alignment you can (I presume) enforce that the ubyte vector has an alignment stricter than 4 bytes, which solves the alignment problem but puts the burden of choosing the proper alignment on the user.

Handling nested buffers was by far the most difficult part of the flatcc builder interface and runtime engine and I'd really like to strip it out and forget about it. Instead the ubyte vector ought to support the align attribute which is otherwise only supported by structs.

So what is the best solution? Conceptually, the cleanest solution is the nested attribute in the schema, but in practice it will be a raw ubyte vector that users find simpler and more practical. My guess is that neither solution will be widely supported across all languages. This can be handled by either ignoring the wrong alignment (which works on amd64, but not on emscripten or some ARM systems), or by copying the nested buffer out to an aligned location before reading it.
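The last fallback, copying the nested buffer to an aligned location before reading it, could look roughly like the sketch below. It is library independent: `copy_to_aligned` is a hypothetical helper name, not part of the FlatBuffers or flatcc APIs, and the required alignment is assumed to be a power of two that the caller knows from the schema.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical helper: copy `size` bytes from `src` (which may be misaligned)
// into `storage` and return a pointer inside `storage` whose address is a
// multiple of `alignment` (assumed to be a power of two).
inline const std::uint8_t *copy_to_aligned(const std::uint8_t *src,
                                           std::size_t size,
                                           std::size_t alignment,
                                           std::vector<std::uint8_t> &storage) {
    // Over-allocate so there is room to shift the start forward.
    storage.assign(size + alignment, 0);
    void *p = storage.data();
    std::size_t space = storage.size();
    // std::align moves p up to the next suitably aligned address.
    void *aligned = std::align(alignment, size, p, space);
    assert(aligned != nullptr);
    if (size != 0) std::memcpy(aligned, src, size);
    return static_cast<const std::uint8_t *>(aligned);
}
```

The returned pointer stays valid only while `storage` is neither resized nor destroyed. The cost is one extra copy per nested buffer, which is the price of not controlling alignment at build time.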

mikkelfj

Nov 27, 2016, 12:40:46 PM
to FlatBuffers


On Sunday, November 27, 2016 at 6:33:58 PM UTC+1, mikkelfj wrote:



I just want to point out that flatcc (the C interface) does not support forced vector alignment, but it is good to see that it has been added, so flatcc will also support this at some point.

That is not correct, actually. The flatcc builder has an advanced option for manually controlling alignment, but you need to call the raw builder interface instead of the higher-level generated calls when adding the nested buffer.

Wouter van Oortmerssen

Nov 28, 2016, 3:33:22 PM
to mikkelfj, FlatBuffers
Like Mikkel said, just store the nested FlatBuffer as a ubyte vector in its parent. This will get you 4-byte alignment. If the nested buffer may contain elements that need higher alignment (such as double / int64_t) you should call ForceVectorAlignment as indicated here https://github.com/google/flatbuffers/commit/07da3fc216c62b18eb13a8bcb9afa95d7c325418, though it should also work without if you don't run on older arm chips :)

--
You received this message because you are subscribed to the Google Groups "FlatBuffers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to flatbuffers+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

karan bhatia

May 24, 2017, 9:30:06 PM
to FlatBuffers, mik...@dvide.com
Bringing this up once again. Can someone explain in some detail the alignment issues with respect to using nested flatbuffers?

Also, mikkelfj@ mentioned 
"More generally, you can either add a nested flatbuffer using a dedicated API method for this, if the language interface supports it. If so, the nested schema must be visible to the containing schema and the nested buffer will be aligned correctly."
Does the C++ API have any special support for nested flatbuffers so that clients don't need to worry about things like alignment?


mikkelfj

May 27, 2017, 11:45:46 AM
to FlatBuffers, mik...@dvide.com


On Thursday, May 25, 2017 at 3:30:06 AM UTC+2, karan bhatia wrote:
Bringing this up once again. Can someone explain in some detail the alignment issues with respect to using nested flatbuffers?

If you have an 8-byte aligned vector, such as [ulong], and store it in a ubyte vector intended for hosting the nested flatbuffer, then the start of the ubyte vector might only be 4-byte aligned. It will not be merely 1-byte aligned, because all vectors have a 4-byte length field that ensures at least this much alignment, but no more. The fact that the ubyte vector holds data that needs 8-byte alignment will not easily be available. Thus you risk accessing a vector at the wrong alignment, e.g. at address 0x80004 instead of 0x80008 or 0x80000. On the x86 and amd64 platforms this is not an issue, but it can be on other platforms such as some ARM CPUs, and in other cases it might work, but slower than necessary.
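To make the address arithmetic concrete, here is a small sketch using the example addresses from the paragraph above. `is_aligned` and `vector_data_address` are hypothetical helper names for illustration, not FlatBuffers functions. The sketch only assumes the vector layout described here: a 4-byte length field followed by the elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if `address` is a multiple of `alignment` (a power of two).
constexpr bool is_aligned(std::uint64_t address, std::uint64_t alignment) {
    return (address & (alignment - 1)) == 0;
}

// The element data of a vector starts right after its 4-byte length field.
constexpr std::uint64_t vector_data_address(std::uint64_t vector_address) {
    return vector_address + sizeof(std::uint32_t);
}

// 0x80004 is fine for ubyte data or a 32-bit length field ...
static_assert(is_aligned(0x80004, 4), "4-byte aligned");
// ... but too weak for a ulong or double inside a nested buffer.
static_assert(!is_aligned(0x80004, 8), "not 8-byte aligned");
// These are the addresses an 8-byte value would need instead.
static_assert(is_aligned(0x80000, 8) && is_aligned(0x80008, 8),
              "8-byte aligned");
```

Note that a vector whose length field happens to sit on an 8-byte boundary (0x80000) has its data at 0x80004, which is exactly the misaligned case described above.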


Also, mikkelfj@ mentioned 
"More generally, you can either add a nested flatbuffer using a dedicated API method for this, if the language interface supports it. If so, the nested schema must be visible to the containing schema and the nested buffer will be aligned correctly."
Does the C++ API have any special support for nested flatbuffers so that clients don't need to worry about things like alignment?

I don't know if this has been added or not. But as Wouter mentioned, the ForceVectorAlignment call can be used to store a nested flatbuffer with the appropriate alignment. You just need to know what the required alignment is. I think C++ at some point made a change to ensure nested flatbuffers are at least aligned to 8 bytes, but it would be a conservative guess. It would fail if you have a struct with an alignment larger than 8, e.g. for use with some DMA processors or graphics operations. Here the ForceVectorAlignment call could be used.

Incidentally, the C API does not have a force vector operation, but it does support building nested buffers and ensures the alignment is correct for that buffer.

karan bhatia

May 30, 2017, 6:17:02 PM
to FlatBuffers, mik...@dvide.com
Thanks mikkelfj. I have some more questions.

The fact that the ubyte vector holds data that needs 8-byte alignment will not easily be available.
When we write the schema for a struct/table, I am assuming the flatbuffers code is able to work out the required alignment for it. Shouldn't it be possible for it to do the same for nested flatbuffers? If not would it be possible to expose the min required alignment for a struct/table?

 I think C++ at some point made a change to ensure nested flatbuffers are at least aligned to 8 bytes
 Can someone confirm/deny this?

 It would fail if you have a struct with an alignment larger than 8
 Under what cases can the required alignment be larger than 8? If I am using the basic types provided by flatbuffers for my struct/table, and force the vector alignment of the nested flatbuffer to be 8, can I be assured it is safe in a cross-platform manner?

mikkelfj

May 30, 2017, 6:35:23 PM
to FlatBuffers, mik...@dvide.com


On Wednesday, May 31, 2017 at 12:17:02 AM UTC+2, karan bhatia wrote:
Thanks mikkelfj. I have some more questions.

See below ...
 
The fact that the ubyte vector holds data that needs 8-byte alignment will not easily be available.
When we write the schema for a struct/table, I am assuming the flatbuffers code is able to work out the required alignment for it. Shouldn't it be possible for it to do the same for nested flatbuffers? If not would it be possible to expose the min required alignment for a struct/table?

Well, sort of, but I guess most implementations have not prioritized making this information externally available. You usually know the maximum alignment from the schema anyway. I am only very familiar with the C implementation though. The following function returns the buffer alignment requirements, and you could imagine other languages providing a similar function, but that is not a given. Internally, all compliant implementations certainly will be aware of the alignment requirements.

uint16_t flatcc_builder_get_buffer_alignment(flatcc_builder_t *B);

Note also that when using size-prefixed buffers, the alignment logic needs to be aware of the prefix. Both C and C++ handle this, but it is not relevant to nested buffers since they are always size prefixed, unlike root buffers, which only have this as an option.

Note that C does NOT currently tail pad a buffer up to alignment, in case you want to stack multiple buffers with a size prefix, but you can easily do it with the alignment information given by the get_buffer_alignment call.
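The tail padding mentioned here is simple arithmetic once the alignment is known. Below is a minimal sketch: `tail_padding` and `append_padded` are hypothetical helper names, and the `alignment` argument stands in for whatever value the builder reports (for flatcc, the result of the get_buffer_alignment call above). It is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of zero bytes needed to round `size` up to a multiple of
// `alignment` (a power of two).
constexpr std::size_t tail_padding(std::size_t size, std::size_t alignment) {
    return (alignment - (size & (alignment - 1))) & (alignment - 1);
}

// Append one finished buffer to a stream of stacked buffers, then pad the
// tail so the next buffer starts at an offset that is a multiple of
// `alignment`, counted from the start of the stream.
inline void append_padded(std::vector<std::uint8_t> &stream,
                          const std::uint8_t *buf, std::size_t size,
                          std::size_t alignment) {
    stream.insert(stream.end(), buf, buf + size);
    stream.insert(stream.end(), tail_padding(stream.size(), alignment),
                  std::uint8_t{0});
}
```

The offsets are only aligned relative to the start of the stream, so the stream itself still has to be placed in suitably aligned memory before the buffers are read in place.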

 It would fail if you have a struct with an alignment larger than 8
 Under what cases can the required alignment be larger than 8? If I am using the basic types provided by flatbuffers for my struct/table, and force the vector alignment of the nested flatbuffer to be 8, can I be assured it is safe in a cross-platform manner?

For nearly all use cases you should be safe with an 8-byte alignment, but you can use attributes to obtain higher alignments on structs and vectors, which may be needed or at least preferred for GPU operations, or for 64-byte cache line isolation for improved multi-core performance.

You should also be aware that typical malloc implementations might not align to more than 8 bytes. aligned_alloc can be used to get more control over alignment. The C API provides a call to allocate a buffer with full alignment, which requires a subsequent call to aligned_free. It is non-trivial to support aligned allocation portably, but flatcc does this in the portable library, though it had some bug fixes in this area in the latest releases.
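As an illustration of the allocation side, the sketch below requests memory with a stricter alignment than malloc may give. It assumes a C++17 standard library that provides `std::aligned_alloc`, which is not available on every toolchain. `allocate_aligned`, `AlignedBytes` and `FreeDeleter` are hypothetical names and are unrelated to the flatcc allocation calls mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>

// Memory from std::aligned_alloc must be released with std::free.
struct FreeDeleter {
    void operator()(void *p) const noexcept { std::free(p); }
};
using AlignedBytes = std::unique_ptr<std::uint8_t, FreeDeleter>;

// Allocate at least `size` bytes at an address that is a multiple of
// `alignment` (a power of two). Returns an empty pointer on failure.
inline AlignedBytes allocate_aligned(std::size_t size, std::size_t alignment) {
    // std::aligned_alloc expects the size to be a multiple of the alignment.
    std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    if (rounded == 0) rounded = alignment;
    return AlignedBytes(
        static_cast<std::uint8_t *>(std::aligned_alloc(alignment, rounded)));
}
```

A buffer read from a file or socket could be placed in such a block, for example with `allocate_aligned(size, 64)`, when the schema uses alignments above 8.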

mikkelfj

May 30, 2017, 6:47:47 PM
to FlatBuffers, mik...@dvide.com

Replying again with more details

When we write the schema for a struct/table, I am assuming the flatbuffers code is able to work out the required alignment for it. Shouldn't it be possible for it to do the same for nested flatbuffers? If not would it be possible to expose the min required alignment for a struct/table?

Nested buffers have historically been implemented as an add-on in C++, where the storage was just a ubyte vector plus an attribute listing the type, such that code could be generated to automatically cast the ubyte vector to the given nested root table type. Unfortunately, this failed to handle proper alignment unless you copied the buffer to aligned memory. Later (I think) C++ added a forced 8-byte alignment to be safe in most cases, but it couldn't trivially know the nested buffer's alignment because it was just a binary blob being added. Later (I think) the ForceVectorAlignment call was added such that the user could provide the missing knowledge of the nested buffer's alignment (and for other use cases).

The C API's flatcc compiler and code generator support building nested buffers in-place, such that the builder does in fact know the inner buffer's alignment and thus ensures the proper alignment without user intervention. However, this is extremely complicated and I sort of regret adding support for it because it significantly complicates the builder logic and code generation for limited benefits. Also, it is easy to create an object within the wrong buffer by mistake when you can have multiple buffers open at the same time.
 

Wouter van Oortmerssen

May 31, 2017, 12:04:49 PM
to FlatBuffers
karan bhatia:

To be clear, this is how you would store a nested FlatBuffer in C++ with correct alignment:

FlatBufferBuilder child;
// (Code to construct and finish the nested FlatBuffer in child goes here).
auto minalign = child.GetBufferMinAlignment();
// If for whatever reason you have no access to the child FlatBufferBuilder,
// this also works if there is no use of force_align in the schema:
// auto minalign = sizeof(flatbuffers::largest_scalar_t);
// Now construct the parent.
FlatBufferBuilder parent;
// Must be called right before CreateVector so the padding applies to it.
parent.ForceVectorAlignment(child.GetSize(), sizeof(uint8_t), minalign);
auto vec = parent.CreateVector(child.GetBufferPointer(), child.GetSize());
// (Code to put vec in a table, and finish the buffer goes here).


agallego

Sep 26, 2017, 2:33:14 PM
to FlatBuffers
I'm using the object API in C++ and I'm running into a peculiar issue about this. 

I looked inside the base.h to ensure I was effectively copying the PreAlign() func call which is what the .ForceVectorAlignment uses. 

Effectively - please correct me if I'm wrong - we just pad w/ 0's up to a multiple of the sizeof largest_scalar_t, which base.h defines as:

typedef uintmax_t largest_scalar_t;

I'm using 2 object API from 2 different flatbuffers (.fbs) files. 

One file uses std::vector<uint8_t> called payload.body in the following snippet. 

So the flow is this

1) Pass in some RootType::NativeTableType  (object api) and pack it
2) put it in the payload object api - with padding
3) at some point in the future, i effectively do the same on the parent object api. 

In flatbuffers terms you have

table foo {
  x: ulong;
}

table payload {
  body: [ubyte];
}


    builder.Finish(RootType::Pack(builder, &t, nullptr));
    // auto padding = flatbuffers::PaddingBytes(
    //   builder.GetSize(),
    //   flatbuffers::AlignOf<flatbuffers::largest_scalar_t>());
    auto padding = flatbuffers::PaddingBytes(builder.GetSize(),
                                             builder.GetBufferMinAlignment());
    // memset
    let->payload->body.resize(builder.GetSize() + padding, 0);
    const char *p = reinterpret_cast<const char *>(builder.GetBufferPointer());
    std::memcpy(&let->payload->body[0], p, builder.GetSize());


the idea is that I want to use the object API to lazily build the payload ... i.e.: use a chain of headers, etc. 

I'm getting this: 


../src/third_party/include/flatbuffers/base.h:162:22: runtime error: load of misaligned address 0x62500000798c for type 'const long unsigned int', which requires 8 byte alignment
0x62500000798c: note: pointer points here
 14 00 00 00 a3 f0 78 0b  00 00 00 00 08 00 0c 00  04 00 08 00 08 00 00 00  44 00 00 00 04 00 00 00
             ^

Thank you for any tips !!! 

- Alex

Wouter van Oortmerssen

Sep 28, 2017, 11:55:31 AM
to agallego, FlatBuffers
ForceVectorAlignment (or PreAlign) only work if they're called right before CreateVector, and since the Object API calls CreateVector on your behalf as part of a larger set of serialization commands, you have no way to align vectors in the Object API unless you modified the generated code and added a ForceVectorAlignment manually. This is something you could try, just to see if it works.

The better solution would be to add a force_align attribute for fields, which would cause a ForceVectorAlignment to be emitted automatically in the Object API. PRs welcome.
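Why the alignment call has to come immediately before CreateVector can be seen with a toy model of a back-to-front builder. This is only a bookkeeping sketch under simplifying assumptions, not the FlatBuffers implementation: `ToyBuilder` and its methods are hypothetical, it tracks sizes instead of bytes, and it assumes the end of the finished buffer is itself aligned.

```cpp
#include <cassert>
#include <cstddef>

// Tracks how many bytes have been written so far, counted from the END of
// the finished buffer. An object whose first byte sits `size` bytes from the
// end is aligned to `a` when size % a == 0.
struct ToyBuilder {
    std::size_t size = 0;

    // Pad now so that AFTER `len` more bytes are written, the size is a
    // multiple of `alignment`. This mirrors the idea behind PreAlign.
    void pre_align(std::size_t len, std::size_t alignment) {
        size += (alignment - (size + len) % alignment) % alignment;
    }

    // Stand-in for any other field written by the serializer.
    void write_bytes(std::size_t len) { size += len; }

    // Write `len` bytes of vector data, then the 4-byte length field in
    // front of it. Returns the distance of the data start from the end.
    std::size_t write_byte_vector(std::size_t len) {
        pre_align(len, 4);  // keeps the length field 4-byte aligned
        size += len;        // element data
        const std::size_t data_start = size;
        size += 4;          // length field
        return data_start;
    }
};
```

In this model, starting from `size = 1`, calling `pre_align(10, 8)` directly followed by `write_byte_vector(10)` puts the data 16 bytes from the end, which is 8-byte aligned. Writing two unrelated bytes between the two calls moves the data to 20 bytes from the end, and the earlier padding no longer helps. That is the situation with generated Object API code, where other serialization steps sit between any user call and the vector.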


Alexander Gallego

Sep 28, 2017, 1:47:59 PM
to Wouter van Oortmerssen, FlatBuffers
AH!! that's a good idea, not sure why i didn't modify the compiler. 

Yeah, i'll submit a patch. This is especially useful for nested types - effectively nested header + bytearrays which are likely flatbuffers on the objectapi - ie.: 

table KV {
  hdr:  some_header_struct;
  payload: [ubyte];
}

good thinking. 

I'll hack a patch 


