Hi guys,

first of all, I want to say that FlatBuffers is a great concept, and I am really thankful that you made it happen as an open source project.

I guess you are now expecting a BUT, and here it comes :)

It seems to me that there was some degree of historical evolution which resulted in some strange implementation details. I would like to list them:

- JSON support. flatc lets us translate binary into JSON and JSON into binary. However, even though binaries are backwards and forwards compatible, JSON is not. As far as I can see, this is only because flatc parses JSON and schemas through the same code. This is an implementation detail which cripples JSON support.
- Memory footprint vs. efficient access. The beauty of FlatBuffers is that it can support both. We can lay out data in the most thrifty way by reusing definitions. This is already done with the reuse of vTables; I was a bit puzzled why it is not done for strings. However, this implementation detail also implies that you can't have truly lazy data access, where you stream the data without needing to fill up the whole buffer first. It could happen that I reuse a vTable which is somewhere at the end of the buffer.
- The base classes used by the generated code are very C++ centric. This kind of correlates with the second point: they favour efficient access over ease of use. One questionable feature is the minimum alignment and padding, which I guess has a much bigger impact in C++ than in other languages. However, I have to be honest, I am not sure how big an impact it makes. I would really appreciate it if you could elaborate on this implementation detail. I picked on it because it makes serialisation into binary much more complex and also affects the memory footprint of the binary.

I am sorry for being so critical, but I truly believe that FlatBuffers is a great project and that it would benefit a lot from the following adjustments, which address the problems I listed above:

- Decouple the JSON and schema parsers from each other.
- Provide a way to choose between the smallest memory footprint and efficient access (maybe even streaming access).
- Make it easy to write your own code generators for different languages. I guess this was on the table at some point anyway (looking at attribute_decl = attribute string_constant ;).

I started with https://github.com/mzaks/FlatBuffersSchemaEditor to address the third point. I will have a bit more time after the 20th of December, and I hope to make some progress there.
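On the alignment and padding point in the list above, a rough way to see the footprint cost is to compare a packed layout with one where every scalar is aligned to its own size. The sketch below is a simplified model only: it is not how the FlatBuffers builder actually lays out a buffer, and the helper names (`padding_for`, `aligned_size`) and the field sizes are made up for illustration.

```python
def padding_for(offset, alignment):
    """Bytes needed to move `offset` up to the next multiple of `alignment`."""
    return (-offset) % alignment


def aligned_size(sizes):
    """Total bytes when each scalar is aligned to its own size (simplified model)."""
    offset = 0
    for size in sizes:
        offset += padding_for(offset, size)  # insert padding before the field
        offset += size                       # then the field itself
    return offset


# Hypothetical struct: byte, double, short, int (sizes in bytes).
fields = [1, 8, 2, 4]

packed = sum(fields)                                        # no padding at all
naive = aligned_size(fields)                                # declaration order
sorted_desc = aligned_size(sorted(fields, reverse=True))    # largest field first

print(packed, naive, sorted_desc)
```

In this toy model the declaration order costs 9 bytes of padding on 15 bytes of data, while ordering fields from largest to smallest removes the padding entirely, which is one reason the overhead depends heavily on how the data is arranged.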
It is better to have an unfolded DAG, which then results in a tree, where you can read only certain branches and never load the other branches into memory. I guess mikkelfj implemented it in his C port. (I haven't looked at the implementation, sorry.)
In my opinion both strategies are totally valid, and it would be nice to be able to select which one matters to you.
Let's take again my use case, the city builder strategy game I am working on.
However, the config I want to read lazily, maybe even stream, rather than download completely. For this I would need the config to be represented as an unfolded tree and not as a DAG. With this I wanted to show that both use cases can occur in a single application, and it would be nice if the FlatBuffers implementation embraced them in a user-friendly manner.
Hi Wouter, thanks for the reply and sorry for my slow response :)

On Monday, 14 December 2015 21:46:14 UTC+1, Wouter van Oortmerssen wrote:
Maxim,
On Sun, Dec 13, 2015 at 4:34 AM, Maxim Zaks <maxim...@googlemail.com> wrote:
Hi guys,
first of all, I want to say that FlatBuffers is a great concept, and I am really thankful that you made it happen as an open source project.
I guess you are now expecting a BUT, and here it comes :)
It seems to me that there was some degree of historical evolution which resulted in some strange implementation details. I would like to list them:
- JSON support. flatc lets us translate binary into JSON and JSON into binary. However, even though binaries are backwards and forwards compatible, JSON is not. As far as I can see, this is only because flatc parses JSON and schemas through the same code. This is an implementation detail which cripples JSON support.
I'm not sure what about JSON is not forwards/backwards compatible. One issue is that, by default, the FlatBuffers parser does not accept unknown fields. This is for good reason: the JSON input was originally conceived as a friendly way to get data into FlatBuffers, not necessarily to absorb arbitrary JSON.
I think it is better to have a use case here, showing how the JSON might be used.
I am currently working on a strategy game where we use FlatBuffers for all client-backend communication. The client is a game running on a mobile device, implemented with Unity3D.
The communication works as follows: the backend sends the client a config file and the player's game state. The client sends the backend commands which reflect changes in the game state.
The config is initially authored in Google Spreadsheets (the preferred tool of game designers), then it is transformed into JSON. The backend then translates the JSON into a FlatBuffers binary and sends it to the client. If JSON were handled the same way as the binary, we could make additive changes to the config without needing to redeploy the backend with the new schema.
Now here is the tricky part, and it is best described by the game state example. Game states are also saved as JSON in Postgres, mainly so that we can query them.
Here it is also desirable that a game state written by a new client can be read by an (old) backend which serves old clients. But there is another detail which makes JSON behave differently from the binary representation. When you rename a property in a schema, no migration is needed, because property names are not stored in the binary. This is, however, not the case in the current JSON representation.
A schema like:

table Monster {
  mana:short = 150;
  hp:short = 100;
  name:string;
}
As such, it is important that you get error feedback on misspelled fields, rather than difficult-to-diagnose errors at run-time due to missing data.
That said, people have asked for an option to ignore unknown fields before, and I think it should be added.
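The trade-off between rejecting and ignoring unknown fields can be sketched in a few lines. This is only an illustration in plain Python, not flatc's parser; the function `check_fields` and its `ignore_unknown` flag are hypothetical names invented for this example.

```python
def check_fields(obj, known_fields, ignore_unknown=False):
    """Return only the known fields of `obj`.

    In strict mode (the default) any unknown key raises, so a misspelled
    field is reported immediately. With ignore_unknown=True the extra keys
    are silently dropped, which tolerates data written for a newer schema.
    """
    unknown = sorted(set(obj) - set(known_fields))
    if unknown and not ignore_unknown:
        raise ValueError("unknown field(s): " + ", ".join(unknown))
    return {key: value for key, value in obj.items() if key in known_fields}


KNOWN = {"mana", "hp", "name"}
data = {"mana": 100, "hp": 50, "nmae": "Max"}  # note the misspelled "name"

try:
    check_fields(data, KNOWN)
    strict_error = None
except ValueError as err:
    strict_error = str(err)

lenient = check_fields(data, KNOWN, ignore_unknown=True)
print(strict_error)
print(lenient)
```

Strict mode surfaces the typo at parse time, while the lenient mode accepts the input but quietly loses the name, which is exactly the hard-to-diagnose missing data that the default is meant to prevent.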
- Memory footprint vs. efficient access. The beauty of FlatBuffers is that it can support both. We can lay out data in the most thrifty way by reusing definitions. This is already done with the reuse of vTables; I was a bit puzzled why it is not done for strings. However, this implementation detail also implies that you can't have truly lazy data access, where you stream the data without needing to fill up the whole buffer first. It could happen that I reuse a vTable which is somewhere at the end of the buffer.
Reuse of strings (and vectors, and tables) is at the discretion of the user, and can easily be implemented by the user (FlatBuffers are allowed to be a DAG, so if you want to reuse the result of CreateString twice, no one is stopping you).
There are consequences to reusing objects, related to identity and mutation, so this should always be under user control.
That said, specifically for strings, which are a common use case, we could easily add a CreateSharedString() function that does the heavy lifting for you automatically.
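The pooling idea behind a shared-string helper can be sketched as follows. This is a toy append-only buffer in plain Python, not the FlatBuffers wire format or its builder API; the class `PoolingBuilder` and its method names are assumptions made for illustration.

```python
class PoolingBuilder:
    """Toy append-only buffer that can deduplicate strings by offset."""

    def __init__(self):
        self.buf = bytearray()
        self._pool = {}  # string -> offset of its single stored copy

    def create_string(self, text):
        """Always write a new length-prefixed copy and return its offset."""
        data = text.encode("utf-8")
        offset = len(self.buf)
        self.buf += len(data).to_bytes(4, "little") + data
        return offset

    def create_shared_string(self, text):
        """Write the string once; later calls return the cached offset."""
        if text not in self._pool:
            self._pool[text] = self.create_string(text)
        return self._pool[text]


builder = PoolingBuilder()
a = builder.create_shared_string("Max")  # stored at offset 0
b = builder.create_shared_string("Max")  # reuses the same offset
c = builder.create_string("Max")         # explicit second copy
print(a, b, c, len(builder.buf))
```

Keeping both methods leaves the choice with the caller: two references to the same offset share identity, so an in-place mutation of one would be visible through the other.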
- The base classes used by the generated code are very C++ centric. This kind of correlates with the second point: they favour efficient access over ease of use. One questionable feature is the minimum alignment and padding, which I guess has a much bigger impact in C++ than in other languages. However, I have to be honest, I am not sure how big an impact it makes. I would really appreciate it if you could elaborate on this implementation detail. I picked on it because it makes serialisation into binary much more complex and also affects the memory footprint of the binary.
C++ centric? Which language are you using?
FlatBuffers was entirely designed to give maximum performance through in-place memory layouts, something which is indeed very foreign to most languages other than C++. We can't change these languages, but I believe that optimising how a CPU accesses memory brings benefits to any language, though admittedly some of that is lost when not using C++.
Whenever there is a mutually exclusive choice in FlatBuffers between fast and convenient, we go for the former. We just hope that in most cases it is not mutually exclusive.
Games, codecs, scientific computing: lots of fields use data that needs specific alignment to be fast (e.g. for SIMD). Having to copy this data out of FlatBuffers to be maximally efficient would defeat the point of FlatBuffers.
I am sorry for being so critical, but I truly believe that FlatBuffers is a great project and that it would benefit a lot from the following adjustments, which address the problems I listed above:
- Decouple the JSON and schema parsers from each other.
What problem does this solve? I believe any compatibility issues can be solved with the existing parser.
It is also a very compact and very fast parser that is useful for those who want to read JSON at run-time, but at low cost. It parses straight into a FlatBuffer with very little intermediate memory usage. I've looked around at existing JSON parsers, and they're all orders of magnitude less efficient (particularly in memory usage).
If people wish to create a JSON -> FlatBuffers converter using a different parser of their choosing, this would not be hard to do.
- Provide a way to choose between the smallest memory footprint and efficient access (maybe even streaming access).
Beyond pooling strings, what are you suggesting here?
- Make it easy to write your own code generators for different languages. I guess this was on the table at some point anyway (looking at attribute_decl = attribute string_constant ;).
Generally, a language implementation is a complicated thing: because FlatBuffers is such a thin layer over raw memory access, there's relatively a lot to implement, and I am not sure to what extent that can be simplified from the current generators, since most of it is so language-specific.
I started with https://github.com/mzaks/FlatBuffersSchemaEditor to address the third point. I will have a bit more time after the 20th of December, and I hope to make some progress there.
Definitely sounds cool :)
Wouter
--
You received this message because you are subscribed to the Google Groups "FlatBuffers" group.
Would be represented as:

{
  mana : 100,
  hp : 50,
  name : "Max"
}

But to be as resilient against property name changes as the binary is, it should be something like:

[
  100, 50, "Max"
]

This is a direct equivalent of the binary representation, even though it is less human-readable and makes the JSON much harder to query.
Honestly, I am uncertain myself whether it is a good idea to have such cryptic JSON, but it would be the most resilient one for migrations.
What do you think?
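To make the positional idea concrete, here is a minimal sketch of converting between the keyed and the positional form, assuming the field order of the Monster schema from this thread. The helper names and the renamed schema (`health`, `level`) are hypothetical and only serve the illustration; this is not something flatc provides.

```python
import json

# Field order as declared in the Monster schema discussed in this thread.
MONSTER_FIELDS = ["mana", "hp", "name"]


def to_positional(obj, fields):
    """Encode a keyed object as an array ordered by schema field position."""
    return [obj.get(name) for name in fields]


def from_positional(values, fields):
    """Decode a positional array with the reader's own field names.

    Fields the writer did not know about come back as None; extra trailing
    values from a newer writer are dropped.
    """
    padded = list(values) + [None] * (len(fields) - len(values))
    return dict(zip(fields, padded))


keyed = {"mana": 100, "hp": 50, "name": "Max"}
positional = to_positional(keyed, MONSTER_FIELDS)

# A reader whose schema renamed "hp" to "health" and appended "level".
RENAMED_FIELDS = ["mana", "health", "name", "level"]
decoded = from_positional(positional, RENAMED_FIELDS)

print(json.dumps(positional))
print(decoded)
```

The stored array never mentions a field name, so the rename needs no migration, but a Postgres query can no longer select by key and has to rely on array positions instead.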