support for multi-dimensional arrays


Qianqian Fang

Aug 21, 2013, 12:00:48 AM
to universal-...@googlegroups.com
Hi Riyad,

Thanks for the effort of developing the UBJSON specification. I am working on a MATLAB implementation of your specification, as part of my JSONLAB library (http://iso2mesh.sf.net/cgi-bin/index.cgi?jsonlab). I am wondering if you have any plans to support multi-dimensional arrays in the optimized (compact) form?

I understand that JSON itself already supports multi-dimensional arrays by nesting [] constructs. However, for binary data, this means I have to insert the '[]' notations between binary streams. This takes additional space, and slows down encoding/decoding. I see you have added the "$type" and "#count" syntax for arrays. I am wondering if the count syntax can be extended to accommodate more than a 1D horizontal vector? 

If this is on your roadmap, I have a proposal. How does the following format look to you?

[[] [$] [type] [#] [[] [$] [an integer type] [nx ny nz ...] []] [nx*ny*nz*sizeof(type) ] []]

Basically, I'd like to put a 1D dimension array (of integer type [iIlL]) after the # marker. This way, the dimension array can be used to pre-allocate the buffer. This would help tremendously, especially when handling N-D arrays in MATLAB.
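A minimal sketch of how an encoder for this proposed layout might look, assuming big-endian payloads as elsewhere in UBJSON. The function name `encode_nd_int_array` and its parameters are hypothetical, and the layout is the one proposed in this post, not part of Draft 9.

```python
import struct

# Signed-integer markers named in the post ([iIlL]) and their struct codes.
INT_TYPES = {"i": "b", "I": "h", "l": "i", "L": "q"}


def encode_nd_int_array(values, dims, elem_type="i", dim_type="i"):
    """Hypothetical encoder for the proposed N-D layout (not in UBJSON Draft 9).

    Emits: [ $ <elem_type> # [ $ <dim_type> <dims...> ] <raw values> ]
    """
    count = 1
    for d in dims:
        count *= d
    if count != len(values):
        raise ValueError("dims do not match the number of values")

    header = b"[$" + elem_type.encode() + b"#"
    # Dimension array: typed, terminated by ']' as written in the proposal.
    dim_array = (
        b"[$" + dim_type.encode()
        + struct.pack(f">{len(dims)}{INT_TYPES[dim_type]}", *dims)
        + b"]"
    )
    # Payload: the values back to back, with no per-element markers.
    payload = struct.pack(f">{len(values)}{INT_TYPES[elem_type]}", *values)
    return header + dim_array + payload + b"]"


encoded = encode_nd_int_array(list(range(1, 10)), (3, 3))
```

With the dimensions available before the payload, a reader knows the total element count (3*3 here) as soon as the header has been parsed.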

please let me know if you see any problem with this proposal. thanks

Qianqian 

Riyad Kalla

Aug 21, 2013, 7:51:20 AM
to universal-...@googlegroups.com
Qianqian,
Great question. While there isn't specifically a multi-dim container in UBJSON, an optimized array-of-arrays construct, as you pointed out, is exactly what I would recommend.

Using the new optimized container format markers you go from:

[[]
  [[][i][1][i][2][i][3][]]
  [[][i][4][i][5][i][6][]]
  [[][i][7][i][8][i][9][]]
[]]

to

[[][[][i][3]
  [[][i][i][3][1][2][3]
  [[][i][i][3][4][5][6]
  [[][i][i][3][7][8][9]

Obviously, in this contrived example you don't save much space (but as the payload in each inner array grows, the savings approach 50%). In the outer array you use '[' as the array marker and provide a size to make parsing more efficient.
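To make the 50% figure concrete, the byte counts can be tallied as in the sketch below. It assumes int8 elements, counts written with the smallest of the signed integer types [iIlL], a counted outer header of the form [[][#][count], and a row header of the form [[][$][i][#][count]. The row header is an assumption, since the example above abbreviates it, and the function names are made up for illustration.

```python
def count_bytes(n):
    """Bytes for a count: one type marker plus the smallest signed integer payload."""
    for payload, limit in ((1, 2**7), (2, 2**15), (4, 2**31), (8, 2**63)):
        if n < limit:
            return 1 + payload
    raise ValueError("count too large")


def plain_size(rows, cols):
    """Nested arrays with a marker on every int8 element and explicit ']' closers."""
    per_row = 1 + cols * 2 + 1           # '[' + (marker + byte) per element + ']'
    return 1 + rows * per_row + 1        # outer '[' and ']'


def optimized_size(rows, cols):
    """Counted outer array holding typed, counted int8 rows (no closers needed)."""
    outer = 2 + count_bytes(rows)        # '[' '#' count
    per_row = 4 + count_bytes(cols) + cols   # '[' '$' 'i' '#' count + raw bytes
    return outer + rows * per_row


small = (plain_size(3, 3), optimized_size(3, 3))
large = (plain_size(3, 1000), optimized_size(3, 1000))
```

Under these assumptions the 3x3 case is slightly larger in the optimized form, because the row headers outweigh the saved markers, while the 3x1000 case comes out at roughly half the plain size.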

Hope that helps!

Best wishes,
Riyad


--
You received this message because you are subscribed to the Google Groups "Universal Binary JSON (UBJSON)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to universal-binary...@googlegroups.com.
To post to this group, send email to universal-...@googlegroups.com.
Visit this group at http://groups.google.com/group/universal-binary-json.
For more options, visit https://groups.google.com/groups/opt_out.

Qianqian Fang

Aug 22, 2013, 2:29:44 PM
to universal-...@googlegroups.com
On Wednesday, August 21, 2013 7:51:20 AM UTC-4, Riyad Kalla wrote:
Qianqian,
Great question, while there isn't specifically a multi-dim container in UBJSON, as you pointed out an optimized array-of-arrays construct is exactly what I would recommend.

Using the new optimized container format markers you go from:

[[]
  [[][i][1][i][2][i][3][]]
  [[][i][4][i][5][i][6][]]
  [[][i][7][i][8][i][9][]]
[]]

to

[[][[][i][3]
  [[][i][i][3][1][2][3]
  [[][i][i][3][4][5][6]
  [[][i][i][3][7][8][9]


hi Riyad

I assumed that for the second [[] you meant to use [#], correct?

I agree, this is the approach I would use based on Draft 9 of the
specification. However, as I understand it, the spec is still under development.
I am curious whether you see a chance to add the new syntax I suggested in a
future revision. It should be backward compatible, as it won't conflict
with any existing constructs.

If using the syntax I suggested, the above array could be saved more
compactly as

[[][$][i][#][[][$][i][3][3][]]
  [1][2][3][4][5][6][7][8][9]

(here I use the row-major order, but column-major should be equally fine)

With this syntax, the binary data stream stored in the array can be a
literal copy of the original array in memory. No additional bytes
need to be inserted into the stream, or removed when decoding.
This can be a significant simplification for processing data arrays,
especially for the line of work (scientific research) that I am doing.
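As a rough illustration of the "literal copy" point, here is a sketch of a decoder for the 3x3 example above. It assumes int8 ('i') for both the dimensions and the elements, and the name `decode_int8_nd` is hypothetical.

```python
from array import array


def decode_int8_nd(buf):
    """Decode the proposed N-D layout for int8 data; return (dims, flat values).

    Assumes the header '[' '$' 'i' '#' '[' '$' 'i' <dims...> ']' from the example.
    Caveat of this sketch: a dimension equal to 93 (the byte value of ']')
    would be mistaken for the terminator.
    """
    prefix = b"[$i#[$i"
    if not buf.startswith(prefix):
        raise ValueError("not the layout sketched here")
    end = buf.index(b"]", len(prefix))       # end of the dimension array
    dims = list(buf[len(prefix):end])

    count = 1
    for d in dims:
        count *= d

    data = array("b")
    # One bulk copy of the payload; there are no per-element markers to strip.
    data.frombytes(buf[end + 1:end + 1 + count])
    if len(data) != count:
        raise ValueError("truncated payload")
    return dims, data


stream = b"[$i#[$i\x03\x03]" + bytes(range(1, 10))
dims, data = decode_int8_nd(stream)
# Row-major indexing: element (row 1, column 2) of the 3x3 array.
element = data[1 * dims[1] + 2]
```

For element types wider than one byte, a decoder on a little-endian host would also need a byte swap, since UBJSON payloads are big-endian.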

what do you think?

Qianqian


PS: my MATLAB UBJSON encoder and decoder are nearly complete and will be
announced in a few days (this extended array syntax is enabled by
default to ease the mapping to and from MATLAB data structures).





Qianqian Fang

Aug 22, 2013, 2:46:18 PM
to universal-...@googlegroups.com
On Wednesday, August 21, 2013 7:51:20 AM UTC-4, Riyad Kalla wrote:
 
Obviously, in this contrived example you don't save much space (but as the payload in each inner array grows, the savings approach 50%). In the outer array you use '[' as the array marker and provide a size to make parsing more efficient.


There is one more point I'd like to make: it is not entirely about
space savings. The straightforward, intuitive memory mapping
and the ease of pre-allocation can save significant effort
when dealing with data made up of many regular arrays.

For example, to process the data array you suggested, a
programmer can only allocate 3 pointers when reading the header.
He/she has no way of knowing that the following sub-arrays all have the same
length. Otherwise, they could have allocated one buffer to hold all
the following data in contiguous memory, which is more efficient
for speed-sensitive applications.
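The difference can be sketched as follows, assuming int8 data, a counted outer array, and a row header of the form [[][$][i][#][i][count] (an assumed layout, as above). Both function names are hypothetical, and `bytearray` stands in for a raw buffer, so values are held as unsigned bytes.

```python
def read_nested_int8(buf):
    """Read '[' '#' 'i' n followed by n rows of '[' '$' 'i' '#' 'i' m + m bytes."""
    if buf[:3] != b"[#i":
        raise ValueError("unexpected outer header")
    n_rows = buf[3]
    rows = [None] * n_rows               # only the row count is known up front
    pos = 4
    for r in range(n_rows):
        if buf[pos:pos + 5] != b"[$i#i":
            raise ValueError("unexpected row header")
        m = buf[pos + 5]                 # row length is discovered row by row
        rows[r] = bytearray(buf[pos + 6:pos + 6 + m])   # one allocation per row
        pos += 6 + m
    return rows


def read_flat_int8(dims, payload):
    """With all dimensions known in advance, allocate once and copy once."""
    size = 1
    for d in dims:
        size *= d
    if len(payload) < size:
        raise ValueError("truncated payload")
    out = bytearray(size)                # single contiguous buffer
    out[:] = payload[:size]
    return out


nested = (b"[#i\x03"
          b"[$i#i\x03\x01\x02\x03"
          b"[$i#i\x03\x04\x05\x06"
          b"[$i#i\x03\x07\x08\x09")
rows = read_nested_int8(nested)
flat = read_flat_int8([3, 3], bytes(range(1, 10)))
```

In the nested reader, nothing tells the parser that every row has the same length until all rows have been read, so a single up-front allocation is not possible.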

Qianqian

 


Riyad Kalla

Aug 23, 2013, 11:52:41 AM
to universal-...@googlegroups.com
Qianqian,
Very sorry for the delay in responding, we had a baby yesterday. I will respond to your posts shortly.

Riyad


Qianqian Fang

Aug 23, 2013, 12:36:38 PM
to universal-...@googlegroups.com
On Friday, August 23, 2013 11:52:41 AM UTC-4, riyad.kalla wrote:
Qianqian,
Very sorry for the delay in responding, we had a baby yesterday. I will respond to your posts shortly.

hi Riyad

Congratulations! 

there is no rush in responding, especially at this moment.

Qianqian

Riyad Kalla

Sep 16, 2013, 1:19:37 PM
to universal-...@googlegroups.com
Qianqian,
So sorry for the delay, finally getting back to this...

The suggestion is very cool and wonderfully optimized for multi-dim/binary structures. That said, it introduces an overly specialized data structure to the spec.

Right now everything in UBJSON has a nice, compatible 1:1 mapping with something in JSON, so you can go from UBJSON to JSON and back to UBJSON and get the same constructs. With this structure, going to JSON would give you an array of arrays, and going back to UBJSON would give you a different construct.

Also, the complexity of the construct makes my spidey senses tingle.

That said, if UBJSON was a custom binary format and needed a hyper optimized data struct like this, I'd be all over this change ;)

Riyad

Qianqian Fang

Sep 19, 2013, 2:08:32 PM
to universal-...@googlegroups.com
On Monday, September 16, 2013 1:19:37 PM UTC-4, Riyad Kalla wrote:
Qianqian,
So sorry for the delay, finally getting back to this...


hi Riyad

thanks for getting back to this thread.
 

The suggestion is very cool and wonderfully optimized for multi-dim/binary structures. That said, it introduces an overly specialized data structure to the spec.

Right now everything in UBJSON has a nice, compatible 1:1 mapping with something in JSON, so you can go from UBJSON to JSON and back to UBJSON and get the same constructs. With this structure, going to JSON would give you an array of arrays, and going back to UBJSON would give you a different construct.


But isn't it true that the 1-to-1 mapping is already lost when converting text to binary? For example, the JSON value 1 can be mapped to [U][1], [I][1], [i][1], etc. As another example, the array [1,257,1000] can be mapped to [[][i][1][I][257][I][1000][]], or to [[][#][I][1][257][1000][]], or to any wider types (even floating point). To me, a 1-to-1 mapping is not that important; after all, binary files are typed, while JSON is untyped.
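A short sketch of this point: the same JSON number can be written with several UBJSON numeric markers, and every one of them decodes to an equal value. The helper names and the `ENCODINGS` table are made up for illustration; the markers and big-endian payloads follow the UBJSON numeric types.

```python
import struct

# (marker, struct format) pairs for UBJSON numeric types; payloads are big-endian.
ENCODINGS = [
    ("i", ">b"),   # int8
    ("U", ">B"),   # uint8
    ("I", ">h"),   # int16
    ("l", ">i"),   # int32
    ("L", ">q"),   # int64
    ("d", ">f"),   # float32
    ("D", ">d"),   # float64
]


def encode_number(value, marker, fmt):
    """Write one number as a type marker followed by its payload."""
    return marker.encode() + struct.pack(fmt, value)


def decode_number(buf):
    """Read one number back, choosing the payload format from the marker."""
    fmt = dict(ENCODINGS)[chr(buf[0])]
    return struct.unpack(fmt, buf[1:])[0]


encodings = [encode_number(1, marker, fmt) for marker, fmt in ENCODINGS]
decoded = [decode_number(e) for e in encodings]
```

All seven byte strings differ, yet each decodes to a value equal to 1, so the text-to-binary direction is already one-to-many.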

In comparison, JSON->UBJSON->JSON round-trip invariance is more meaningful. In this respect, I don't see the suggested array syntax causing any trouble. It is merely a minor extension to the already added array extensions [#]/[$], for the same purpose (optimizing performance).
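The round-trip argument can be sketched without any binary encoding at all: a rectangular nested JSON array is reduced to a dimension list plus a flat value list (the two pieces the suggested syntax would store), and the nested form is then rebuilt from them. The functions `flatten` and `unflatten` are hypothetical helpers, not part of any UBJSON library.

```python
import json


def flatten(nested):
    """Return (dims, flat) for a rectangular nested list; raise if it is ragged."""
    dims = []
    level = nested
    while isinstance(level, list):
        dims.append(len(level))
        level = level[0] if level else None

    flat = []

    def walk(node, depth):
        if depth == len(dims):
            if isinstance(node, list):
                raise ValueError("array is deeper than its first element suggests")
            flat.append(node)
            return
        if not isinstance(node, list) or len(node) != dims[depth]:
            raise ValueError("ragged array")
        for child in node:
            walk(child, depth + 1)

    walk(nested, 0)
    return dims, flat


def unflatten(dims, flat):
    """Rebuild the nested list from dims and row-major flat values."""
    if len(dims) == 1:
        return list(flat)
    step = len(flat) // dims[0]
    return [unflatten(dims[1:], flat[i * step:(i + 1) * step])
            for i in range(dims[0])]


text = "[[1,2,3],[4,5,6],[7,8,9]]"
dims, flat = flatten(json.loads(text))
round_trip = json.dumps(unflatten(dims, flat), separators=(",", ":"))
```

A ragged array raises `ValueError` in this sketch, which is the case where an encoder would presumably fall back to ordinary nested arrays.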


Also, the complexity of the construct makes my spidey senses tingle.

That said, if UBJSON was a custom binary format and needed a hyper optimized data struct like this, I'd be all over this change ;)

anyways, I already released my JSONlab with UBJSON support (and with my suggested extended syntax).

feel free to add the link to your library page


Qianqian