Gzip JsonFactory/ObjectMapper?

oxy...@gmail.com

Nov 23, 2018, 11:17:03 PM
to jackson-user
Hi,

I have been scratching my head for the last hour or two looking for a Gzip JsonFactory, or any other kind of factory, that can compress huge JSON objects.
I know I'm going to sacrifice some performance, but for this particular scenario I need a JsonFactory (or ObjectMapper) that is able to gzip.
Is there any open-source project that has done this already? Or what would be the simplest way to write one?
I could gzip the output of the ObjectMapper myself, but that is not an option because I'm using a 3rd party framework that requires me to pass it an ObjectMapper.

Thanks a lot,
Guido

Tatu Saloranta

Nov 23, 2018, 11:37:54 PM
to jackso...@googlegroups.com
This sort of falls outside the realm of ObjectMapper, in my opinion (and
why just gzip? Snappy, lz4, bz2: there's a plethora of compression
codecs to support).

But there is one extension point in JsonFactory (part of `jackson-core`):

JsonFactory.setOutputDecorator(OutputDecorator decoratorImpl);

which you can use to install a decorator that will add compression
codec into `OutputStream` when `decorate(OutputStream)` method is
called.
So you can construct and configure `JsonFactory`, construct
`ObjectMapper` with it, and that should allow you to add compression.
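
For illustration, here is a minimal sketch of the wrapping such a decorator would perform, written against JDK classes only so that it runs without Jackson on the classpath. `GzipWrapping` and its method names are hypothetical; in a real decorator, the body of `decorate(IOContext, OutputStream)` would return the wrapped stream much as `wrapForWrite` does here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of Jackson: it only shows the stream
// wrapping that a decorate(...) implementation would hand back.
final class GzipWrapping {

    // Output side: everything written to the result is gzip-compressed into target.
    static OutputStream wrapForWrite(OutputStream target) throws IOException {
        return new GZIPOutputStream(target);
    }

    // Input side: everything read from the result is decompressed from source.
    static InputStream wrapForRead(InputStream source) throws IOException {
        return new GZIPInputStream(source);
    }

    static byte[] compress(byte[] plain) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = wrapForWrite(sink)) {
            out.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed by now, so the gzip trailer has been written.
        return sink.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream in = wrapForRead(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) {
        byte[] json = "{\"greeting\":\"hello\"}".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(json));
        System.out.println(new String(restored, StandardCharsets.UTF_8)); // prints {"greeting":"hello"}
    }
}
```

One thing to keep in mind with this kind of wrapping: the gzip trailer is only written when the wrapping stream is closed (or finished), so the output is incomplete until then.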

I hope this helps,

-+ Tatu +-

oxy...@gmail.com

Nov 24, 2018, 5:56:21 AM
to jackson-user
Hi Tatu,

This looks exactly like what I need. I know this is not a responsibility of the ObjectMapper, but the 3rd party API isn't giving me much freedom, so I cannot compress the output myself.
I'm wondering whether I can decorate the SmileFactory using this approach, as the SmileFactory without the extra string compression is the fastest of the bunch.
Also, are there examples of how to use these Input/Output decorators? Thanks a lot for answering, as this is a huge step towards the solution.

Guido.

oxy...@gmail.com

Nov 25, 2018, 5:26:03 AM
to jackson-user
Now this is even better than I expected. Using the LZ4 implementation https://github.com/lz4/lz4-java with a double pass, I get the following results:

Json size: 2727kb
Json time: 26.22ms

Smile size: 1043kb
Smile time: 19.78ms

Compressed Smile size: 914kb
Compressed Smile time: 19.88ms

LZ4 size: 84kb
LZ4 time: 29.839ms

I believe this adds a couple of milliseconds over the standard ObjectMapper. My last goal now is to replace the JsonFactory with a SmileFactory, which currently throws exceptions:

  public static final Charset UTF8 = Charset.forName("UTF-8");
  public static final int BLOCK_SIZE = 16 * 1024;
  public static final ObjectMapper LZ4_OBJECT_MAPPER;

  static {
    final JsonFactory jsonFactory = new JsonFactory();
    jsonFactory.setInputDecorator(new InputDecorator() {
      @Override
      public InputStream decorate(IOContext context, InputStream inputStream) {
        return new LZ4BlockInputStream(new LZ4BlockInputStream(inputStream));
      }

      @Override
      public InputStream decorate(IOContext context, byte[] bytes, int offset, int length) {
        return new LZ4BlockInputStream(new LZ4BlockInputStream(new ByteArrayInputStream(bytes, offset, length)));
      }

      @Override
      public Reader decorate(IOContext context, Reader reader) {
        return new InputStreamReader(new LZ4BlockInputStream(new LZ4BlockInputStream(new ReaderInputStream(reader))), UTF8);
      }
    });

    jsonFactory.setOutputDecorator(new OutputDecorator() {
      @Override
      public OutputStream decorate(IOContext context, OutputStream outputStream) {
        return new LZ4BlockOutputStream(
            new LZ4BlockOutputStream(outputStream, BLOCK_SIZE, LZ4Factory.fastestInstance().fastCompressor()),
            BLOCK_SIZE, LZ4Factory.fastestInstance().fastCompressor());
      }

      @Override
      public Writer decorate(IOContext context, Writer writer) {
        return new OutputStreamWriter(
            new LZ4BlockOutputStream(
                new LZ4BlockOutputStream(new WriterOutputStream(writer, UTF8), BLOCK_SIZE, LZ4Factory.fastestInstance().fastCompressor()),
                BLOCK_SIZE, LZ4Factory.fastestInstance().fastCompressor()));
      }
    });

    LZ4_OBJECT_MAPPER = new ObjectMapper(jsonFactory)
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
        .disable(SerializationFeature.FAIL_ON_EMPTY_BEANS)
        .setSerializationInclusion(JsonInclude.Include.NON_NULL)
        .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
        .registerModule(INT_RANGE_MODULE)
        .registerModule(JODA_MODULE)
        .registerModule(TAGS_MODULE);
  }
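
The double-pass layering used above can be tried without the lz4-java dependency. The sketch below swaps in the JDK's `DeflaterOutputStream` and `InflaterInputStream` purely as stand-ins for `LZ4BlockOutputStream` and `LZ4BlockInputStream`; `LayeredCompression` is a hypothetical name, and the sizes it produces say nothing about LZ4 itself. What it does show is that every compression layer on the write side needs a matching decompression layer on the read side.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper: Deflate is used here only as a JDK stand-in for LZ4.
final class LayeredCompression {

    // Wraps the sink in `passes` compression layers and writes the data through all of them.
    static byte[] compress(byte[] plain, int passes) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try {
            OutputStream out = sink;
            for (int i = 0; i < passes; i++) {
                out = new DeflaterOutputStream(out);
            }
            out.write(plain);
            // Closing the outermost stream finishes it and then closes each inner layer in turn.
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Wraps the source in the same number of decompression layers and reads it back.
    static byte[] decompress(byte[] compressed, int passes) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try {
            InputStream in = new ByteArrayInputStream(compressed);
            for (int i = 0; i < passes; i++) {
                in = new InflaterInputStream(in);
            }
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                plain.write(buffer, 0, n);
            }
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) {
        byte[] json = "{\"values\":[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]}".getBytes(StandardCharsets.UTF_8);
        for (int passes = 0; passes <= 2; passes++) {
            System.out.println(passes + " pass(es): " + compress(json, passes).length + " bytes");
        }
    }
}
```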

Tatu Saloranta

Nov 27, 2018, 4:43:46 PM
to jackso...@googlegroups.com
On Sat, Nov 24, 2018 at 2:17 PM <oxy...@gmail.com> wrote:
Now I'm curious about Smile combined with this LZ4 implementation https://github.com/lz4/lz4-java
The time is pretty close and the compression is decent. What do you think I need to do to be able to decorate the SmileFactory?

In theory, it should work just like JsonFactory. Although, since a problem has been reported:


maybe it does not :-(

-+ Tatu +- 
 

Json size: 2727kb
Json time: 27.43ms

Smile size: 1043kb
Smile time: 20.776ms

Compressed Smile size: 914kb
Compressed Smile time: 20.761ms

LZ4 size: 586kb
LZ4 time: 30.093ms


Tatu Saloranta

Nov 27, 2018, 4:47:18 PM
to jackso...@googlegroups.com
On Sat, Nov 24, 2018 at 12:10 PM <oxy...@gmail.com> wrote:
Quite strangely, the JDK's GZip is winning vs these fancy compression algorithms:

Json size: 2727kb
Json time: 26.426ms

Smile size: 1043kb
Smile time: 20.366ms

Compressed Smile size: 914kb
Compressed Smile time: 20.678ms

GZip size: 216kb
GZip time: 53.393ms

And for a reasonable price to pay, I'm happy with this now ;-)

I think that is to be expected: gzip does have better compression than basic Lempel-Ziv (lz4, lzf, Snappy), because it does LZ and then Huffman encoding on that.

But on the other hand, LZ variants are maybe 3x-6x faster to compress (and 2x-3x faster to decompress). So it's all about the trade-off: do you have more network (less need for compression) or CPU (to compress more) to spend?

Similarly there are some compression codecs like bzip2 that have yet higher compression ratio, but are really CPU-intensive (read: slow, esp. compressing). They have their uses too, esp. compress-once-uncompress-million-times (like downloading releases).
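
The ratio-versus-CPU trade-off can be explored with the JDK alone. The sketch below is a stand-in resting on stated assumptions: the JDK does not ship lz4 or Snappy, so it uses the `Deflater` levels `BEST_SPEED` and `BEST_COMPRESSION` to compare a faster and a slower setting of the same codec. `DeflateLevels` and `sampleJson` are hypothetical names and the sample data is synthetic, so the numbers it prints will not match the ones in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for comparing Deflater levels on the same input.
final class DeflateLevels {

    // Compresses with the given level (Deflater.BEST_SPEED = 1 ... Deflater.BEST_COMPRESSION = 9).
    static byte[] deflate(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt input", e);
        } finally {
            inflater.end();
        }
    }

    // Synthetic, repetitive JSON array; real payloads will compress differently.
    static byte[] sampleJson(int records) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < records; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append("{\"id\":").append(i)
              .append(",\"name\":\"item-").append(i)
              .append("\",\"active\":true}");
        }
        return sb.append(']').toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] json = sampleJson(50_000);
        for (int level : new int[] {Deflater.BEST_SPEED, Deflater.BEST_COMPRESSION}) {
            long start = System.nanoTime();
            byte[] packed = deflate(json, level);
            double millis = (System.nanoTime() - start) / 1_000_000.0;
            System.out.printf("level %d: %d -> %d bytes in %.3f ms%n",
                    level, json.length, packed.length, millis);
        }
    }
}
```

A single timed run like the one in `main` is only a rough indication; the JVM should be warmed up before the timings are taken seriously.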

Glad you have made good progress here!

-+ Tatu +-


 

On Saturday, November 24, 2018 at 6:46:45 PM UTC, oxy...@gmail.com wrote:
I cannot set input/output decorators on the SmileFactory; I get exceptions when I do that. The following works, but it is a shame I cannot use the SmileFactory with decorators:

  public static final ObjectMapper SNAPPY_OBJECT_MAPPER;
  public static final Charset UTF8 = Charset.forName("UTF-8");

  static {
    final JsonFactory jsonFactory = new JsonFactory();
    jsonFactory.setInputDecorator(new InputDecorator() {
      @Override
      public InputStream decorate(IOContext context, InputStream inputStream) throws IOException {
        return new FramedSnappyCompressorInputStream(inputStream);
      }

      @Override
      public InputStream decorate(IOContext context, byte[] bytes, int offset, int length) throws IOException {
        return new FramedSnappyCompressorInputStream(new ByteArrayInputStream(bytes, offset, length));
      }

      @Override
      public Reader decorate(IOContext context, Reader reader) throws IOException {
        return new InputStreamReader(new FramedSnappyCompressorInputStream(new ReaderInputStream(reader)), UTF8);
      }
    });

    jsonFactory.setOutputDecorator(new OutputDecorator() {
      @Override
      public OutputStream decorate(IOContext context, OutputStream outputStream) throws IOException {
        return new FramedSnappyCompressorOutputStream(outputStream);
      }

      @Override
      public Writer decorate(IOContext context, Writer writer) throws IOException {
        return new OutputStreamWriter(new FramedSnappyCompressorOutputStream(new WriterOutputStream(writer, UTF8)));
      }
    });

    SNAPPY_OBJECT_MAPPER = new ObjectMapper(jsonFactory)
        .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES)
        .disable(SerializationFeature.FAIL_ON_EMPTY_BEANS)
        .setSerializationInclusion(JsonInclude.Include.NON_NULL)
        .disable(SerializationFeature.WRITE_DATES_AS_TIMESTAMPS)
        .registerModule(INT_RANGE_MODULE)
        .registerModule(JODA_MODULE)
        .registerModule(TAGS_MODULE);
  }


This is not a JMH benchmark, but I made sure I warmed up the JVM before I took these times:

Json size: 1743kb
Json time: 17.322ms


Smile size: 624kb
Smile time: 12.619ms


Compressed Smile size: 556kb
Compressed Smile time: 12.003ms


Snappy size: 196kb
Snappy time: 89.368ms
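
The warm-up-then-measure approach mentioned above can be sketched as follows. `WarmTimer` is a hypothetical helper, not the code that produced these numbers, and it is only a rough measurement aid: it does not guard against dead-code elimination or other JIT effects the way JMH does.

```java
// Hypothetical helper: runs a task a few times untimed, then reports the mean timed duration.
final class WarmTimer {

    static double averageMillis(Runnable task, int warmupRuns, int measuredRuns) {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        // Untimed runs give the JIT compiler a chance to optimize the hot path.
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measuredRuns;
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the serialization call being measured.
        Runnable task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append(i);
            }
            if (sb.length() == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("average: %.3f ms%n", averageMillis(task, 1_000, 100));
    }
}
```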

Guido.