support for multi-dimensional arrays

Qianqian Fang

Aug 21, 2013, 12:00:48 AM

to universal-...@googlegroups.com

Hi Riyad,

Thanks for your effort in developing the UBJSON specification. I am working on a MATLAB implementation of it as part of my JSONLAB library (http://iso2mesh.sf.net/cgi-bin/index.cgi?jsonlab). Do you have any plans to support multi-dimensional arrays in the optimized (compact) form?

I understand that JSON itself already supports multi-dimensional arrays by nesting [] constructs. For binary data, however, this means I have to insert '[' and ']' markers between the binary streams, which takes additional space and slows down encoding and decoding. I see you have added the "$type" and "#count" syntax for arrays. Could the count syntax be extended to accommodate more than a 1D horizontal vector?

If this is on your roadmap, I do have a proposal. How does the following format look to you?

[[] [$] [type] [#] [[] [$] [an integer type] [nx ny nz ...] []] [nx*ny*nz*sizeof(type) ] []]

Basically, I'd like to put a 1D dimension array (of integer type [iIlL]) after the # marker. This way, the dimension array can be used to pre-allocate the buffer, which will help tremendously when handling N-D arrays in MATLAB.
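As a rough illustration of the template above, here is a minimal Python sketch of an encoder. The helper names (`smallest_int_marker`, `encode_nd`) are hypothetical and not part of UBJSON or JSONLAB; the sketch assumes big-endian integers, the markers i/I/l/L for int8/int16/int32/int64, and that the closing `]` of the template is written after the payload.

```python
import struct

# UBJSON signed integer markers and their big-endian struct codes.
INT_FORMATS = {"i": ">b", "I": ">h", "l": ">i", "L": ">q"}


def smallest_int_marker(values):
    """Pick the narrowest marker among i, I, l, L that holds every value."""
    for marker, fmt in INT_FORMATS.items():
        bits = 8 * struct.calcsize(fmt)
        if all(-(1 << (bits - 1)) <= v < (1 << (bits - 1)) for v in values):
            return marker
    raise OverflowError("dimension does not fit in int64")


def encode_nd(type_marker, dims, payload):
    """Wrap raw element bytes in the proposed N-D header (hypothetical helper).

    Layout assumed from the template: [ $ type # [ $ int-type dims... ] payload ]
    """
    dim_marker = smallest_int_marker(dims)
    dim_bytes = b"".join(struct.pack(INT_FORMATS[dim_marker], d) for d in dims)
    header = (b"[$" + type_marker.encode("ascii") + b"#"
              + b"[$" + dim_marker.encode("ascii") + dim_bytes + b"]")
    return header + payload + b"]"


# A 3x3 int8 matrix holding 1..9: the payload is passed through untouched.
encoded = encode_nd("i", (3, 3), bytes(range(1, 10)))
```

The payload is never re-packed here, which is the point of the proposal: only a short header is placed in front of the raw data.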

Please let me know if you see any problem with this proposal. Thanks.

Qianqian

Riyad Kalla

Aug 21, 2013, 7:51:20 AM

to universal-...@googlegroups.com

Qianqian,

Great question. While there isn't specifically a multi-dim container in UBJSON, an optimized array-of-arrays construct, as you pointed out, is exactly what I would recommend.

Using the new optimized container format markers you go from:

[[]
[[][i][1][i][2][i][3][]]
[[][i][4][i][5][i][6][]]
[[][i][7][i][8][i][9][]]
[]]

to

[[][[][i][3]
[[][i][i][3][1][2][3]
[[][i][i][3][4][5][6]
[[][i][i][3][7][8][9]

Obviously, in this contrived example you don't save much space (but as the payload in each inner array grows, your savings approach 50%). In the outer array you use '[' to indicate an array marker and provide a size to help make parsing more efficient.
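The 50% figure can be checked with a little arithmetic. The Python sketch below counts bytes for both layouts exactly as they are written in this message (one byte per marker, int8 elements); the function names are made up for illustration, and the byte layout is an assumption read off the notation above rather than taken from the specification.

```python
def int_payload_size(n):
    """Bytes needed by the smallest signed integer type (i, I, l, L) that holds n."""
    for size in (1, 2, 4, 8):
        if -(1 << (8 * size - 1)) <= n < (1 << (8 * size - 1)):
            return size
    raise OverflowError("value does not fit in int64")


def plain_size(rows, cols):
    """Nested form: each row is '[' + cols * ('i' + 1 byte) + ']', inside '[' ... ']'."""
    return 2 + rows * (2 + 2 * cols)


def optimized_size(rows, cols):
    """Optimized form as written above: 3 markers plus a count, then raw int8 bytes."""
    row = 3 + int_payload_size(cols) + cols
    outer = 3 + int_payload_size(rows)
    return outer + rows * row


print(plain_size(3, 3), optimized_size(3, 3))  # 26 25
print(optimized_size(1000, 1000) / plain_size(1000, 1000))
```

For the 3x3 example the two forms differ by a single byte, while for a 1000x1000 array the ratio is close to one half, since every element loses its one-byte type marker.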

Hope that helps!

Best wishes,

Riyad

--
You received this message because you are subscribed to the Google Groups "Universal Binary JSON (UBJSON)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to universal-binary...@googlegroups.com.
To post to this group, send email to universal-...@googlegroups.com.
Visit this group at http://groups.google.com/group/universal-binary-json.
For more options, visit https://groups.google.com/groups/opt_out.

Qianqian Fang

Aug 22, 2013, 2:29:44 PM

to universal-...@googlegroups.com

Hi Riyad,

I assume that for the second [[] you meant to use [#], correct?

I agree, this is the approach I would use based on Draft 9 of the specification. However, as I understand it, the spec is still under development. I am curious whether you see a chance of adding the syntax I suggested in a future revision. It should be backward compatible, as it won't conflict with any existing constructs.

With the syntax I suggested, your example array could be saved more compactly as

[[][$][i][#][[][$][i][3][3][]]

[1][2][3][4][5][6][7][8][9]

(here I use the row-major order, but column-major should be equally fine)
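To make the two orderings concrete, the following Python sketch flattens the 3x3 example both ways. It is only an illustration: the function names are hypothetical, and which order an encoder should use is not settled in this thread. MATLAB stores arrays in column-major order, so a literal memory copy from MATLAB would produce the second layout.

```python
def flatten_row_major(matrix):
    """Walk the matrix row by row (C order)."""
    return [value for row in matrix for value in row]


def flatten_column_major(matrix):
    """Walk the matrix column by column (Fortran/MATLAB order)."""
    return [value for column in zip(*matrix) for value in column]


matrix = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

row_major = bytes(flatten_row_major(matrix))        # 1 2 3 4 5 6 7 8 9
column_major = bytes(flatten_column_major(matrix))  # 1 4 7 2 5 8 3 6 9
```

Both payloads have the same length and the same header, so a decoder cannot tell them apart; the order has to be agreed on separately.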

With this syntax, the binary data stream stored in the array can be a literal copy of the original array in memory. No additional bytes need to be inserted into the stream when encoding or removed when decoding. This can be a significant simplification for processing data arrays, especially in the line of work (scientific research) that I am doing.

What do you think?

Qianqian

PS: my MATLAB UBJSON encoder and decoder are nearly complete and will be announced in a few days (I have enabled this extended array syntax by default to ease the mapping to and from MATLAB data structures):

http://sourceforge.net/p/iso2mesh/code/commit_browser


Qianqian Fang

Aug 22, 2013, 2:46:18 PM

to universal-...@googlegroups.com

On Wednesday, August 21, 2013 7:51:20 AM UTC-4, Riyad Kalla wrote:

Obviously, in this contrived example you don't save much space (but as the payload in each inner array grows, your savings approach 50%). In the outer array you use '[' to indicate an array marker and provide a size to help make parsing more efficient.

There is one more point I'd like to make: it is not entirely about space savings. The straightforward and intuitive memory mapping and the ease of pre-allocation can save significant effort when dealing with data made of many regular arrays.

For example, to process the data array you suggested, a programmer can only allocate 3 pointers when reading the header. He or she has no way of knowing that the following sub-arrays all have the same length; otherwise, a single buffer could have been allocated to hold all of the following data in contiguous memory, which is more efficient for speed-sensitive applications.
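The pre-allocation argument can be sketched in Python as follows. This is an illustrative reader for the proposed syntax only, not code from JSONLAB: `read_nd` is a hypothetical name, the input is assumed to follow the compact 3x3 example from my previous message (no closing marker after the payload), and integers are assumed to be big-endian.

```python
import io
import struct

# Assumed markers: signed integer types for the dimensions, element sizes in bytes.
INT_FORMATS = {b"i": ">b", b"I": ">h", b"l": ">i", b"L": ">q"}
ELEMENT_SIZES = {b"i": 1, b"U": 1, b"I": 2, b"l": 4, b"L": 8, b"d": 4, b"D": 8}


def read_nd(stream):
    """Read '[ $ type # [ $ int-type dims... ]' and then the payload in one call."""
    if stream.read(2) != b"[$":
        raise ValueError("not a typed array")
    type_marker = stream.read(1)
    if stream.read(3) != b"#[$":
        raise ValueError("expected a dimension array after '#'")
    fmt = INT_FORMATS[stream.read(1)]
    width = struct.calcsize(fmt)

    dims = []
    while True:
        first = stream.read(1)
        if first == b"]":  # caveat: a dimension whose first byte is 0x5D is misread
            break
        if not first:
            raise ValueError("truncated dimension array")
        dims.append(struct.unpack(fmt, first + stream.read(width - 1))[0])

    total = ELEMENT_SIZES[type_marker]
    for d in dims:
        total *= d

    buffer = bytearray(total)  # one contiguous allocation, sized from the header
    if stream.readinto(buffer) != total:
        raise ValueError("truncated payload")
    return type_marker.decode("ascii"), tuple(dims), buffer


data = b"[$i#[$i\x03\x03]" + bytes(range(1, 10))
marker, dims, payload = read_nd(io.BytesIO(data))
```

The whole payload arrives through a single `readinto` call on a buffer whose size was known before any element was read. With the array-of-arrays form, the reader only learns each row's length when it reaches that row.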

Qianqian


Riyad Kalla

Aug 23, 2013, 11:52:41 AM

to universal-...@googlegroups.com

Qianqian,

Very sorry for the delay in responding, we had a baby yesterday. I will respond to your posts shortly.

Riyad


Qianqian Fang

Aug 23, 2013, 12:36:38 PM

to universal-...@googlegroups.com

Hi Riyad,

Congratulations!

There is no rush in responding, especially at this moment.

Qianqian

Riyad Kalla

Sep 16, 2013, 1:19:37 PM

to universal-...@googlegroups.com

Qianqian,

So sorry for the delay, finally getting back to this...

The suggestion is very cool and wonderfully optimized for multi-dim/binary structures. That said, it introduces a data structure that is too specialized for the spec.

Right now everything in UBJSON has a nice, compatible 1:1 mapping with something in JSON, so that you can go from UBJSON to JSON and back to UBJSON and get the same constructs. With this structure, converting to JSON would give you an array of arrays, and converting back to UBJSON would give you a different construct.
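The round-trip concern can be illustrated with a short Python sketch. The helper names are hypothetical and a row-major payload is assumed. Once the flat data is written as JSON text, the dimension header is gone and only the nesting remains, so a converter going back to UBJSON would have to detect regular nesting on its own if it wanted to rebuild the compact form.

```python
import json


def to_nested(flat, dims):
    """Reshape a flat row-major list into nested lists, as JSON would show it."""
    if len(dims) == 1:
        return list(flat[:dims[0]])
    step = len(flat) // dims[0]
    return [to_nested(flat[i * step:(i + 1) * step], dims[1:])
            for i in range(dims[0])]


def infer_dims(value):
    """Recover the dimensions of regularly nested lists, or None if ragged."""
    if not isinstance(value, list):
        return ()
    child_dims = {infer_dims(item) for item in value}
    if len(child_dims) != 1 or None in child_dims:
        return None
    return (len(value),) + child_dims.pop()


nested = to_nested(list(range(1, 10)), (3, 3))
text = json.dumps(nested)             # '[[1, 2, 3], [4, 5, 6], [7, 8, 9]]'
recovered = infer_dims(json.loads(text))
```

Here the shape can be recovered because the nesting is regular. A ragged input such as `[[1, 2], [3]]` yields `None`, and the plain array-of-arrays form would be the only option.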

Also, the complexity of the construct makes my spidey senses tingle.

That said, if UBJSON were a custom binary format and needed a hyper-optimized data structure like this, I'd be all over this change ;)

Riyad

Qianqian Fang

Sep 19, 2013, 2:08:32 PM

to universal-...@googlegroups.com

Hi Riyad,

Thanks for getting back to this thread.

But isn't it true that the 1-to-1 mapping is already gone when converting text to binary? For example, the JSON value "1" can be mapped to [U][1], [I][1], [i][1], etc. As another example, the array [1,257,1000] can be mapped to [[][i][1][I][257][I][1000][]], or [[][#][I][1][257][1000][]], or to any wider types (even floating point). To me, 1-to-1 mapping is not that important; after all, binary files are a typed format, while JSON is untyped.
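The one-to-many mapping is easy to enumerate. The Python sketch below lists every integer encoding that can hold a given JSON number, assuming the markers i/U/I/l/L for int8/uint8/int16/int32/int64 and big-endian byte order; `candidate_encodings` is a hypothetical helper written for this example.

```python
import struct

# Assumed UBJSON integer markers with their big-endian struct codes.
INT_TYPES = [("i", ">b"), ("U", ">B"), ("I", ">h"), ("l", ">i"), ("L", ">q")]


def candidate_encodings(value):
    """Return every marker+payload byte string that can represent `value`."""
    encodings = []
    for marker, fmt in INT_TYPES:
        try:
            encodings.append(marker.encode("ascii") + struct.pack(fmt, value))
        except struct.error:
            pass  # value is out of range for this type
    return encodings


for number in (1, 257, 1000):
    print(number, candidate_encodings(number))
```

The value 1 fits all five integer types, while 257 and 1000 each fit three, so an encoder has to make a choice that the JSON text does not record.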

In comparison, JSON->UBJSON->JSON round-trip invariance is more meaningful. In this respect, I don't see the suggested array syntax causing any trouble; it is merely a minor extension to the already added array extensions [#]/[d], which serve the same purpose (optimizing performance).

Anyway, I have already released JSONlab with UBJSON support (including my suggested extended syntax). Feel free to add the link to your library page:

http://iso2mesh.sourceforge.net/cgi-bin/index.cgi?jsonlab

Qianqian
