I'm thinking about applications containing numerical data, e.g.:
- a list of countries or regions, each with a property holding a large
array of coordinates that defines its boundaries.
- 3D models stored as structured documents, which include large
arrays of vertices, normals, face indices, etc., with thousands of elements.
- statistical data of any kind
where the data needs to be interpretable by Mongo queries, updates, etc.
(so storing it as binary does not help).
For example, these are the first two points in the boundary of Algeria
(the complete contour would be much longer):
[771230.894373, 4422896.962001, 804804.852796, 4451159.130080]
In current BSON this would be stored with an "index value" (a key) for
each element, so conceptually as if it were this:
{"0": 771230.894373, "1": 4422896.962001, "2": 804804.852796, "3":
4451159.130080}
(for a complete, detailed contour the indices would grow much longer: ...
"3542": 131633.628071, "3543": 4371332.547494 ...)
That redundancy is what kind of bothers me: storing indices for
non-sparse arrays, and storing them as text.
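To put a rough number on that overhead, here is a minimal Python sketch that hand-encodes the array the way I understand the current BSON layout (int32 length, then for each element a type byte, the index as a null-terminated string, and the 8-byte double, then a closing \x00). The function name bson_array_body is just mine for illustration, and it only handles doubles.

```python
import struct

def bson_array_body(values):
    """Encode a list of floats as the embedded document BSON uses for arrays."""
    elements = b""
    for i, v in enumerate(values):
        key = str(i).encode("ascii") + b"\x00"            # index stored as a cstring
        elements += b"\x01" + key + struct.pack("<d", v)  # \x01 = double type ID
    # int32 total length counts itself and the trailing \x00
    return struct.pack("<i", 4 + len(elements) + 1) + elements + b"\x00"

points = [771230.894373, 4422896.962001, 804804.852796, 4451159.130080]
encoded = bson_array_body(points)
payload = 8 * len(points)  # bytes of actual numeric data
print(len(encoded), payload)
```

For these four values the encoding takes 49 bytes for 32 bytes of numbers, and the per-element cost grows as the index strings get more digits.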
The new array type I propose would omit the redundant 'e_name' of each
element (but would keep an int32 size at the beginning).
Following the example, each array element also identifies its type. So
we have something comparable to:
{"0": double 771230.894373, "1": double 4422896.962001, "2": double
804804.852796, "3": double 4451159.130080}
In this case, all those types could be moved to the beginning, since
they are all "double", which would save significant space. This is the
special "homogeneous array" type I further propose.
BTW, true and false both have type ID \x08 (boolean), or am I missing
something? A hypothetical homogeneous "array of booleans" would have
element type \x08 and its content would be a sequence of \x00 and \x01
bytes (for false or true elements).
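As a sketch of what I mean by a homogeneous array, here is some Python. Everything about the layout is my assumption, not anything in the BSON spec: an int32 byte length, then a single element type byte, then the packed values with no keys and no terminator. The names encode_homogeneous and decode_homogeneous are hypothetical, and only doubles and booleans are handled.

```python
import struct

# BSON type IDs: \x01 is double, \x08 is boolean.
FORMATS = {0x01: "d", 0x08: "?"}

def encode_homogeneous(type_id, values):
    """Hypothetical layout: int32 byte length, one type byte, packed values."""
    body = struct.pack("<%d%s" % (len(values), FORMATS[type_id]), *values)
    return struct.pack("<iB", 4 + 1 + len(body), type_id) + body

def decode_homogeneous(data):
    """Inverse of encode_homogeneous for the same assumed layout."""
    size, type_id = struct.unpack_from("<iB", data, 0)
    if size != len(data) or type_id not in FORMATS:
        raise ValueError("not a homogeneous array in the assumed layout")
    fmt = FORMATS[type_id]
    count = (size - 5) // struct.calcsize("<" + fmt)
    return list(struct.unpack_from("<%d%s" % (count, fmt), data, 5))

points = [771230.894373, 4422896.962001, 804804.852796, 4451159.130080]
doubles = encode_homogeneous(0x01, points)
flags = encode_homogeneous(0x08, [True, False, True])
```

With this layout the four doubles take 37 bytes (5 bytes of header plus 32 of data), and the boolean array body is just the bytes \x01 \x00 \x01.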
About random access through pointer arithmetic: yes, obviously I'm
talking about elements of the same size like those you mention. But this
advantage is small, I know.
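For what it's worth, this is the kind of random access I had in mind, again in Python and again assuming a made-up layout (int32 length, one type byte, then packed doubles). The helper element_at and the HEADER constant are hypothetical names.

```python
import struct

HEADER = 5  # assumed: int32 length + one type byte
values = [771230.894373, 4422896.962001, 804804.852796, 4451159.130080]
buf = (struct.pack("<iB", HEADER + 8 * len(values), 0x01)
       + struct.pack("<4d", *values))

def element_at(buf, i):
    """Read element i directly: offset = header + i * element size."""
    return struct.unpack_from("<d", buf, HEADER + 8 * i)[0]

third = element_at(buf, 2)
```

In the current format, as far as I can tell, reaching element i in general means walking over the preceding elements, because the keys (and possibly the values) have variable length.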
And well, I agree, the simplicity of the format specification would
probably suffer with these changes. I just wanted to discuss these issues.
Best regards
On 18/03/2012 20:14, Bernd Späth wrote: