I'd like to propose an optimization for large arrays of JSON objects typically returned from web services...
Because each array element is a well-known JSON object/type, I include the type definition first and then only the values in the array. A typical array as returned today looks like this:
[
{ id: 1, name: 'Test', age: 23, dateSubscribed: '2014/07/28' },
{ id: 3, name: 'Some', age: 33, dateSubscribed: '2014/07/27'},
... lots more like these
]
As you know, each row repeats the same set of property names, which bloats the data.
Instead of that, I include something like this:
{type:123, fields:{id:'int',name:'string',age:'uint8',dateSubscribed:'date'}}
[$123,#322
1,Test,23,2014/07/28,
3,Some,33,2014/07/27,
...
]
Of course the type definitions would use proper BJSON type characters, not strings like 'int', 'string', 'uint8', 'date' as above, but you get the point.
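To make the idea concrete, here is a minimal Python sketch of the same transformation done on plain JSON. The helper names pack_rows and unpack_rows are made up for illustration, and the type strings are the same placeholders as above, not real BJSON markers.

```python
import json

def pack_rows(objects, type_id, fields):
    """Split a homogeneous list of dicts into one type definition plus value-only rows."""
    type_def = {"type": type_id, "fields": fields}
    names = list(fields)  # field order in the definition fixes the column order
    rows = [[obj[name] for name in names] for obj in objects]
    return type_def, rows

def unpack_rows(type_def, rows):
    """Rebuild the original dicts by pairing each row with the declared field names."""
    names = list(type_def["fields"])
    return [dict(zip(names, row)) for row in rows]

people = [
    {"id": 1, "name": "Test", "age": 23, "dateSubscribed": "2014/07/28"},
    {"id": 3, "name": "Some", "age": 33, "dateSubscribed": "2014/07/27"},
]
# Placeholder type names, as in the post; a real encoder would use BJSON type characters.
fields = {"id": "int", "name": "string", "age": "uint8", "dateSubscribed": "date"}

type_def, rows = pack_rows(people, 123, fields)
assert unpack_rows(type_def, rows) == people  # round trip is lossless

# Compare encoded sizes; the saving only shows up once there are enough rows.
plain_size = len(json.dumps(people))
packed_size = len(json.dumps([type_def, rows]))
```

With only two rows the type definition can outweigh the saving; the benefit grows with the number of rows and the length of the property names.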
The only thing I'm not sure about is how to create new types (an arbitrary number of them) using standard BJSON, and how to refer to them within the $ type declaration (this would need more than a single character).
I also had to extend this to allow any nested sub-objects when returning an entire object hierarchy as array elements.
For that I've added a similar type declaration mapped to a field name/path such as 'order.details.product', which in regular JSON would hold name, price, ... For BJSON this could be done even better:
These nested types could be declared as a separate set of types with their own $ids declared prior to being referenced, something like this:
{type:120, name:'Product', fields:{id:'int',name:'string',price:'double'}},
{type:121, name:'Details', fields:{count:'int',product:'$120'}},
{type:122, name:'Order', fields:{id:'int',from:'string',date:'date',details:'$121'}}
Then the data array can include only values, as a flat table:
[$122,#3
11,'John','2014/07/28', 1, 73,'Bike',122.11,
12,'Marry','2014/07/28', 3, 82,'Pen',3.21,
...
When using large data sets, this can reduce the data size quite significantly, especially when property names are long.
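The nested case can be sketched the same way. The following Python is only an illustration of how a reader and writer might walk the three declarations above; the TYPES dictionary and the flatten/unflatten helpers are hypothetical names, and '$120'-style strings stand in for whatever type reference BJSON would actually use.

```python
# Hypothetical registry mirroring the Product / Details / Order declarations.
TYPES = {
    120: {"name": "Product", "fields": {"id": "int", "name": "string", "price": "double"}},
    121: {"name": "Details", "fields": {"count": "int", "product": "$120"}},
    122: {"name": "Order", "fields": {"id": "int", "from": "string", "date": "date", "details": "$121"}},
}

def flatten(obj, type_id, types=TYPES):
    """Emit the values of obj depth-first, in field declaration order."""
    row = []
    for field, ftype in types[type_id]["fields"].items():
        if ftype.startswith("$"):  # reference to a previously declared type
            row.extend(flatten(obj[field], int(ftype[1:]), types))
        else:
            row.append(obj[field])
    return row

def unflatten(values, type_id, types=TYPES):
    """Rebuild a nested object by consuming values from an iterator in the same order."""
    obj = {}
    for field, ftype in types[type_id]["fields"].items():
        if ftype.startswith("$"):
            obj[field] = unflatten(values, int(ftype[1:]), types)
        else:
            obj[field] = next(values)
    return obj

order = {
    "id": 11, "from": "John", "date": "2014/07/28",
    "details": {"count": 1, "product": {"id": 73, "name": "Bike", "price": 122.11}},
}
row = flatten(order, 122)
restored = unflatten(iter(row), 122)  # must be an iterator, not a list
```

Here row comes out as the first data line of the flat table, and unflatten recovers the original hierarchy because both sides follow the declared field order.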
The same technique could be used for any JSON object: include the type first, then use {$### followed by the data. A count would not even be needed in this case.
The serializer could keep track of each JSON type already generated, assign a new $ id to each new one, and emit its definition before first use. Type names would be optional, perhaps provided by the serializer if known?
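A rough sketch of that bookkeeping in Python, assuming a type is identified simply by its ordered list of property names (so the same keys in a different order would count as a different type). TypeRegistry and serialize are hypothetical names, field types are left out for brevity, and the starting id of 120 is arbitrary.

```python
class TypeRegistry:
    """Assigns a numeric id to each distinct object shape the first time it is seen."""

    def __init__(self, first_id=120):
        self._ids = {}
        self._next_id = first_id

    def type_for(self, obj):
        """Return (type_id, is_new) for obj, keyed by its ordered property names."""
        key = tuple(obj)
        if key in self._ids:
            return self._ids[key], False
        type_id = self._next_id
        self._next_id += 1
        self._ids[key] = type_id
        return type_id, True

def serialize(objects, registry):
    """Emit each type definition once, right before its first use, then value-only records."""
    out = []
    for obj in objects:
        type_id, is_new = registry.type_for(obj)
        if is_new:
            out.append({"type": type_id, "fields": list(obj)})
        out.append(["$%d" % type_id, *obj.values()])
    return out

registry = TypeRegistry()
stream = serialize(
    [{"id": 1, "name": "Test"}, {"id": 3, "name": "Some"}, {"x": 1.5}],
    registry,
)
```

The first two objects share one definition and the third triggers a new one, so each definition appears in the stream exactly once, ahead of the first record that needs it.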
So far I've used this technique only with standard JSON, but I'd like to optimize my data even further by having BJSON implement it, or some form of it.
-Piotr