Re: [jackson-user] SMILE and vocabulary

Tatu Saloranta

Jan 20, 2012, 5:44:06 PM
to us...@jackson.codehaus.org, smile-forma...@googlegroups.com
On Tue, Jan 17, 2012 at 9:04 AM, Benson Margulies <bimar...@gmail.com> wrote:
> I've got some json that has many recurring instances of a relatively
> small number of long strings. Unfortunately, it's in the form of many
> distinct documents.
>
> What I wish is that, along the lines of the XML FIS, there was a way
> to 'preload' Smile with these guys so that it would write them out
> using smaller strings.

This might be best handled at a higher level, by replacing the
recurring strings with shorter tokens before encoding -- the Smile
format is designed not to use dictionaries. It is not out of the
question that a new revision of the format could add support, though:
there are unused bit/byte combinations, reserved for future use, that
could be used for this.
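
As a rough sketch of what such higher-level replacement could look
like (this is not part of Jackson or the Smile format; the class name
`Vocabulary`, its methods, and the marker prefix are all hypothetical
and chosen only for illustration), both writer and reader could share
a preloaded list of the long strings and swap them for short tokens:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not a Jackson or Smile API: maps known long strings to short tokens. */
public class Vocabulary {
    // Assumption: this control character never occurs at the start of real values.
    private static final String PREFIX = "\u0001";

    private final Map<String, String> toToken = new HashMap<>();
    private final Map<String, String> fromToken = new HashMap<>();

    /** Writer and reader must be constructed with the same list, in the same order. */
    public Vocabulary(List<String> longStrings) {
        for (int i = 0; i < longStrings.size(); i++) {
            // Base-36 index keeps tokens short even for larger vocabularies.
            String token = PREFIX + Integer.toString(i, 36);
            toToken.put(longStrings.get(i), token);
            fromToken.put(token, longStrings.get(i));
        }
    }

    /** Apply to each String value before passing it to the generator. */
    public String encode(String value) {
        return toToken.getOrDefault(value, value);
    }

    /** Apply to each String value read back from the parser. */
    public String decode(String value) {
        return fromToken.getOrDefault(value, value);
    }
}
```

Strings outside the vocabulary pass through unchanged, so the
encoded documents remain ordinary JSON/Smile content; the cost is that
every consumer needs the same vocabulary to restore the originals.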

Btw, I will cc my response to the Smile format mailing list, which is
an even more appropriate place for it. :)

-+ Tatu +-

Tatu Saloranta

Jan 20, 2012, 8:43:30 PM
to us...@jackson.codehaus.org, smile-forma...@googlegroups.com
On Fri, Jan 20, 2012 at 3:04 PM, Gregory Gerard <gge...@gmail.com> wrote:
> Could you build up the dictionary on the fly as you see new keys?

Indeed, this is how Smile back references work, both for keys (of any
length) and for short String values (64 bytes or less): at most the
1024 most recently seen entries are kept in a (sort of) sliding-window
dictionary, and each can be encoded using just a reference id. This is
partly the reason for not supporting explicit/external dictionaries.
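
To make the idea concrete, here is a toy sketch of such a table for
String values. It is not Jackson's actual SmileGenerator code: the
class and method names are made up, and resetting the table once it is
full is an assumption made here for simplicity (the real behaviour is
defined by the Smile specification). Only the two limits come from the
description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a toy back-reference table, not the real Smile encoder. */
public class BackRefTable {
    static final int MAX_ENTRIES = 1024;   // entry limit mentioned above
    static final int MAX_VALUE_BYTES = 64; // "short String value" limit mentioned above

    private final Map<String, Integer> seen = new HashMap<>();

    /**
     * Returns the reference id if the value has been seen before, or -1
     * if the caller has to write the string out in full.
     */
    public int lookupOrAdd(String value) {
        if (value.getBytes(StandardCharsets.UTF_8).length > MAX_VALUE_BYTES) {
            return -1; // too long: never entered into the table
        }
        Integer id = seen.get(value);
        if (id != null) {
            return id; // encoder would emit just this id
        }
        if (seen.size() == MAX_ENTRIES) {
            seen.clear(); // assumption: start over when the table is full
        }
        seen.put(value, seen.size()); // next free id
        return -1; // first occurrence is still written in full
    }
}
```

The first occurrence of a value is always written in full; only later
occurrences shrink to an id, and values over the byte limit never do.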

Benson's use case does not benefit from this, however, since he
mentioned long Strings, which would presumably exceed the 64-byte
limit for shared values.

-+ Tatu +-
