Comparing Hazelcast 3 Serialization methods with Kryo Serialization


Fuad Malikov

unread,
Jun 13, 2013, 12:08:42 PM6/13/13
to Hazelcast
Hi All,

Hazelcast 3 comes with new serialization methods and lets you implement and plug in a custom serialization. Custom serialization was a frequently requested feature.

Recently I implemented and compared Java serialization, the Hazelcast serialization methods, and Kryo serialization, and published the results in the following blog post.


Looking forward to seeing your feedback.

Best,

-fuad

mongonix

unread,
Jun 13, 2013, 1:16:00 PM6/13/13
to haze...@googlegroups.com
Hi Fuad,

Could you try this Kryo-based serializer? It is based on the latest Kryo trunk, and it is supposed to be much faster than the standard Kryo you used.

Do not forget to add:
kryo.register(YourClass.class);
for all the classes you want to serialize

Cheers,
  Leo

package test.hazelcast.serialization;

import java.io.IOException;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.io.Output;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.UnsafeOutput;
import com.esotericsoftware.kryo.io.UnsafeInput;
import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.TypeSerializer;
import com.hazelcast.util.ExceptionUtil;

public class KryoSerializer implements TypeSerializer<Object> {

    private static boolean useUnsafe = true;

    private Kryo kryo = new Kryo();
    // Note: the shared output buffer makes this serializer single-threaded;
    // use one instance per thread (e.g. via a ThreadLocal) in concurrent code.
    private Output kOut = useUnsafe ? new UnsafeOutput(4096 * 4) : new Output(4096 * 4);

    public KryoSerializer() {
        if (!useUnsafe)
            kryo.setAsmEnabled(true);
        kryo.setReferences(false);
        // TODO: Please register your specific classes here, like this:
        // kryo.register(YourClass.class);
    }

    public int getTypeId() {
        return 2;
    }

    public void write(ObjectDataOutput out, Object object) throws IOException {
        write(object);
        out.writeInt(kOut.position());
        out.write(kOut.getBuffer(), 0, kOut.position());
    }

    public byte[] write(Object object) {
        kOut.clear();
        kryo.writeClassAndObject(kOut, object);
        return kOut.toBytes();
    }

    public Object read(ObjectDataInput in) throws IOException {
        int len = in.readInt();
        byte[] bytes = new byte[len];
        in.readFully(bytes);
        return read(bytes);
    }

    public Object read(byte[] bytes) {
        Input kIn = useUnsafe ? new UnsafeInput(bytes) : new Input(bytes);
        try {
            return kryo.readClassAndObject(kIn);
        } catch (Throwable e) {
            throw ExceptionUtil.rethrow(e);
        } finally {
            kIn.close();
        }
    }

    public void destroy() {
        kOut.close();
    }
}

Fuad Malikov

unread,
Jun 13, 2013, 4:51:52 PM6/13/13
to Hazelcast

Here is what I get with my version:

1000000 PUT took 20082 ms
1000000 GET took 13076 ms

and here is your version, just copied and pasted. In both cases I am using the latest code from the master branch on GitHub.
1000000 PUT took 20019 ms
1000000 GET took 15063 ms

Basically, it is not better. Then I realized that I hadn't registered Customer.class. After registering it, I got
1000000 PUT took 19188 ms
1000000 GET took 14638 ms

Then I added the same registration to my code and it did improve slightly.
1000000 PUT took 19711 ms
1000000 GET took 12758 ms

The original version seems to be better. 

Fuad Malikov
Co-founder & Managing Partner
Hazelcast | Open source in-memory data grid
575 Middlefield Rd, Palo Alto, CA 94301


mongonix

unread,
Jun 13, 2013, 5:25:53 PM6/13/13
to haze...@googlegroups.com
Thanks for trying it out! 
Could you post your code somewhere, e.g. on github, so that others could play with it?
I have an Avro serializer for HZ 3.0 somewhere around. I could try to dig it out and check how it scores.

Regarding the Kryo serializer:
Your class is too "simple" to make it shine :-) Kryo would most likely win hands down if your class contained arrays of primitive types or many more fields.

BTW, I seem to remember an earlier discussion about custom serializers for HZ 3.0 (this one: https://groups.google.com/d/topic/hazelcast/IIHZ5fIAIn4/discussion)
IIRC, I mentioned at the time that I observed too much copying during the execution of the write method, which hurts performance (see bullet 3 in my message in that discussion). I'm wondering whether anything has been improved in the meantime, or whether there are plans to improve the situation?

-Leo

Fuad Malikov

unread,
Jun 13, 2013, 5:58:53 PM6/13/13
to Hazelcast
Fair enough :) Maybe later we can try with more complex objects. By the way, I updated the post with Jackson Smile, and the result was very similar to Kryo's. If you look at my implementation, you'll see that you can write and read directly from the stream; there is no need to serialize to a byte array first and then write it. I think it is pretty well optimized.

Fuad Malikov
Co-founder & Managing Partner
Hazelcast | Open source in-memory data grid
575 Middlefield Rd, Palo Alto, CA 94301


Fuad Malikov

unread,
Jun 13, 2013, 5:59:58 PM6/13/13
to Hazelcast
The code is a little bit messy at the moment, as I tried different serializations on the same class. I need to clean it up in order to post it to GitHub.

-fuad

mongonix

unread,
Jun 14, 2013, 4:35:30 AM6/14/13
to haze...@googlegroups.com
BTW, you may want to check the SerializationBenchmarkTest in Kryo. It is based on a test published on the following GridGain blog:

The interesting thing is that Kryo from trunk is much, much faster than old Kryo on this test.

It would be interesting to see how HZ3 scores here. Maybe you could extend the code from the SerializationBenchmarkTest to do it.

-Leo 

ahmet mırçık

unread,
Jun 18, 2013, 5:07:35 AM6/18/13
to haze...@googlegroups.com
Hi, when posting on GitHub / your blog, could you please also add a comparison with GridGain's serialization? I'm still seeing somewhat different results in my tests.

On my machine :
64 bit Windows 7

Based on Fuad's test:

Default Java Serialization
--------------------------
PUT 14085 ms
GET 21495 ms

Hazelcast DataSerializable
--------------------------
PUT 13027 ms
GET 12296 ms

Hazelcast IdentifiedDataSerializable
------------------------------------
PUT 11156 ms
GET 8767 ms

Kryo
----
PUT 12057 ms
GET 7825 ms

Jackson Smile
-------------
PUT 12168 ms
GET 7945 ms

GridGain
--------
PUT 13596 ms
GET 10079 ms




>>> Java serialization via Externalizable (average): 22,840 ms

>>> Kryo serialization (average): 16,471 ms

>>> GridGain serialization (average): 2,897 ms

>>> IdentifiedDataSerializable (average): 46,531 ms

>>> DataSerializable (average): 46,385 ms






--
Ahmet Mırçık

Mehmet Dogan

unread,
Jun 18, 2013, 5:38:12 AM6/18/13
to haze...@googlegroups.com
GridGain's SampleObject has large long[] and double[] fields, which kill serialization performance when writing/copying to a byte[]-based stream.

When you implement Kryo based serialization using UnsafeOutput and UnsafeInput (thanks to Leo for his implementation), you will see a significant performance boost.

We also added configurable DataInput and DataOutput implementations using the same technique that Leo used in Kryo. If you set useNativeByteOrder=true in SerializationConfig, you'll see the same performance boost for Hazelcast. (What is actually done is to copy multi-byte primitive arrays into a byte array with a single bulk memory copy.)

config.getSerializationConfig().setUseNativeByteOrder(true);

You can check my fork of Fuad's serialization benchmark here:


@mmdogan


Mehmet Dogan

unread,
Jun 18, 2013, 5:40:22 AM6/18/13
to haze...@googlegroups.com
PS: You should build Hazelcast from source or use the latest snapshot build from the Sonatype snapshot repo.


<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast</artifactId>
    <version>${hazelcast.version}</version>
</dependency>
<repository>
    <id>sonatype-snapshots</id>
    <name>Sonatype Snapshot Repository</name>
    <url>https://oss.sonatype.org/content/repositories/snapshots</url>
    <releases>
        <enabled>false</enabled>
    </releases>
    <snapshots>
        <enabled>true</enabled>
    </snapshots>
</repository>

@mmdogan


mongonix

unread,
Jun 18, 2013, 7:20:57 AM6/18/13
to haze...@googlegroups.com
Hi Mehmet,


On Tuesday, June 18, 2013 11:38:12 AM UTC+2, Mehmet Dogan wrote:
GridGain's SampleObject has large long[] and double[] fields, which kill serialization performance when writing/copying to a byte[]-based stream.

When you implement Kryo based serialization using UnsafeOutput and UnsafeInput (thanks to Leo for his implementation), you will see a significant performance boost.

You are welcome! I tested with your latest changes and I can confirm that indeed it has roughly the same speed as Kryo now on GridGain use-cases. Very nice!
 
We also added configurable DataInput and DataOutput implementations using the same technique that Leo used in Kryo. If you set useNativeByteOrder=true in SerializationConfig, you'll see the same performance boost for Hazelcast. (What is actually done is to copy multi-byte primitive arrays into a byte array with a single bulk memory copy.)

config.getSerializationConfig().setUseNativeByteOrder(true);


This is nice. I have a small improvement proposal, when it comes to configuration:
I'd avoid using setUseNativeByteOrder, because in the (rather improbable) case where your cluster contains machines with different architectures and different native byte orders, it may result in incompatibilities.

Therefore, I'd suggest that instead you allow setting the byte order via config, e.g.:

useByteOrder=LITTLE_ENDIAN
useByteOrder=BIG_ENDIAN
etc

Normally, you'd set the same byte order for all machines in your cluster. Then, if the machine where a specific HZ instance runs has the same native byte order as the useByteOrder setting and has access to Unsafe, it can use the Unsafe-based implementation. In all other cases, it should use the implementation supporting the specified byte order.
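The selection rule just described can be sketched in plain Java. The class and method names here (ByteOrderSelection, unsafeAvailable, canUseUnsafe) are illustrative, not Hazelcast API:

```java
import java.nio.ByteOrder;

public class ByteOrderSelection {

    /** Illustrative check: can sun.misc.Unsafe be loaded at all? */
    public static boolean unsafeAvailable() {
        try {
            Class.forName("sun.misc.Unsafe");
            return true;
        } catch (Throwable t) {
            return false;
        }
    }

    /**
     * A member may take the Unsafe-based fast path only when the byte order
     * configured for the whole cluster matches its own native order and
     * Unsafe is accessible; otherwise it must fall back to a portable
     * implementation honoring the configured order.
     */
    public static boolean canUseUnsafe(ByteOrder configured) {
        return configured.equals(ByteOrder.nativeOrder()) && unsafeAvailable();
    }
}
```

With the byte order fixed cluster-wide, the wire format stays compatible even if only some members qualify for the fast path.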

Does it make sense?
Thanks a lot! I played a bit with it. In fact, the serialization speed-up of the Unsafe-based implementations is much higher than what the benchmark currently shows. If you move object creation out of the loop and remove the equals check in the "test" method, you'll see the serialization speed without the overheads introduced by memory allocation and validity checks (these checks do not make much sense in a benchmark anyway). I see up to a 6-7x speedup compared to the Java version and to the Kryo version not using Unsafe, and it is almost twice as fast as before moving things out of the loop.

-Leo

Mehmet Dogan

unread,
Jun 18, 2013, 7:32:45 AM6/18/13
to haze...@googlegroups.com
Hi Leo,

SerializationConfig already has a byteOrder field (which defaults to big-endian). See:


'useNativeByteOrder' is just a shorthand, for when one knows all members in the cluster have the same architecture.


@mmdogan

mongonix

unread,
Jun 18, 2013, 7:39:57 AM6/18/13
to haze...@googlegroups.com


On Tuesday, June 18, 2013 1:32:45 PM UTC+2, Mehmet Dogan wrote:
Hi Leo,

SerializationConfig already has a byteOrder field (which defaults to big-endian). See:


'useNativeByteOrder' is just a shorthand, for when one knows all members in the cluster have the same architecture.


Ah, I overlooked it, nice! If the value of byteOrder happens to match the machine's native byte order, will HZ then pick the Unsafe-based implementation automatically?

-Leo

P.S. Mehmet, it would be very nice if you could comment on the copying-overhead issue I mentioned earlier in this discussion. Do you think it can be improved or optimized away somehow?

Mehmet Dogan

unread,
Jun 18, 2013, 8:23:51 AM6/18/13
to haze...@googlegroups.com
Yes, if byteOrder matches the native one, the Unsafe-based input and output will be used. Also, if the Unsafe object cannot be loaded (it is not available on the classpath, or for security reasons), a ByteBuffer-based implementation will be picked instead.

Currently we have a pool of output streams, which are reused to avoid creating and expanding/re-allocating the underlying byte[] or ByteBuffer each time. I think that in order not to create a copy of the written bytes (out.toByteArray()), we would have to give up that pool. The total work would be nearly the same in that case.

Do you have a suggestion?

@mmdogan

mongonix

unread,
Jun 18, 2013, 8:52:26 AM6/18/13
to haze...@googlegroups.com


On Tuesday, June 18, 2013 2:23:51 PM UTC+2, Mehmet Dogan wrote:

Currently we've a pool of output streams and they are reused to avoid creating and expanding/re-allocating (byte[] or ByteBuffer) each time. I think to not to create a copy of written bytes (out.toByteArray()), we should give up from that pool. Total work will be nearly the same in that case. 

Do you have a suggestion?


First, let me copy & paste a fragment from a comment I posted a few months ago:

> Since I use a totally different serialization framework, which does not support the DataObjectOutput stream out of the box, I first need to serialize my objects using Kryo and produce a byte array with the binary representation (this is a first pass over the binary representation). Then I have to write it into DataObjectOutput using write(byte[]) (this is a second pass over the binary representation). Later, when the "write" method returns, HZ performs DataObjectOutput.toBytes() or something like this, thus allocating a new byte array and copying the binary representation into it (this is a third pass). So, we copy our binary representation two times more than required...
>
> I think the reason for this is that there is no way to just return a byte array from the write method. Therefore we have to write byte arrays to streams first and then do the same in the opposite direction. And this kills performance. Therefore, I'd suggest extending the TypeSerializer interface with methods that return byte arrays (and maybe ByteBuffers) or take them as parameters (for the read methods). What do you think of it?

I still think that allowing a custom serializer to return a byte array with the serialized representation would be a good idea. If required, the custom serializer can do resource pooling by itself. Maybe there should also be a way for HZ to return resources to the serializer (i.e. the byte arrays, or whatever the serializer returns from its write methods).

If done this way, HZ does not need to copy the returned serialized representation and can use it directly (e.g. write it into a network packet). IMHO, such an approach would eliminate the second and third passes (of those described above) over the serialized data.

Another idea:
- The third pass over the data (i.e. out.toByteArray()) could avoid copying bytes and reuse the underlying buffer directly. This is what you actually mentioned, right? Of course, it makes pooling difficult, if not impossible. You state that "total work will be nearly the same in that case", but I'm not so sure. I'd say that allocating new byte array instances is cheaper than copying, because copying has to iterate over every byte of the serialized representation, whereas an allocation mostly manipulates pointers and does not touch all the bytes. Of course, an increased allocation rate may lead to increased GC activity, but I bet that avoiding the copy would bring higher benefits and make the overall execution faster. This needs to be tested, of course.
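For what it's worth, the allocation-vs-copy question can be probed with a throwaway micro-benchmark. This is a sketch only: JIT warm-up and GC make single-shot numbers unreliable, and note that the JVM zero-fills every new array, so a Java allocation is not pure pointer manipulation either:

```java
public class AllocVsCopy {

    /** Allocation only; the JVM still zero-fills the new array. */
    public static byte[] allocate(int size) {
        return new byte[size];
    }

    /** Allocation plus a full copy, roughly what out.toByteArray() does. */
    public static byte[] allocateAndCopy(byte[] src) {
        byte[] dst = new byte[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        byte[] src = new byte[16 * 1024];  // a payload-sized buffer
        int rounds = 100000;

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) allocate(src.length);
        long allocNs = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) allocateAndCopy(src);
        long copyNs = System.nanoTime() - t0;

        System.out.println("alloc-only: " + allocNs / 1000000
                + " ms, alloc+copy: " + copyNs / 1000000 + " ms");
    }
}
```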

-Leo

 

Mehmet Dogan

unread,
Jun 19, 2013, 4:00:11 AM6/19/13
to haze...@googlegroups.com
Thanks for the suggestions. Adding another type of TypeSerializer that works with byte[] seems the better approach to me. I'll give it a try.

@mmdogan



Mehmet Dogan

unread,
Jun 19, 2013, 4:55:32 AM6/19/13
to haze...@googlegroups.com
What about an abstract class instead of an interface:

public abstract class PlainTypeSerializer<T> implements TypeSerializer<T> {

    public final void write(ObjectDataOutput out, T object) throws IOException {
        throw new UnsupportedOperationException();
    }

    public final T read(ObjectDataInput in) throws IOException {
        throw new UnsupportedOperationException();
    }

    public abstract byte[] write(T object) throws IOException;

    public abstract T read(byte[] buffer) throws IOException;

}
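To make the shape concrete, here is a standalone sketch of a byte[]-based serializer in the style Mehmet proposes, using plain JDK serialization in place of Kryo. The Hazelcast wiring (TypeSerializer, getTypeId, etc.) is omitted, and JdkBytesSerializer is an illustrative name, not part of any API:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

/** Sketch of a byte[]-shaped serializer, backed by JDK serialization. */
public class JdkBytesSerializer {

    public byte[] write(Object object) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(bos);
        out.writeObject(object);
        out.flush();
        // the byte[] that HZ could hand to the network layer without copying
        return bos.toByteArray();
    }

    public Object read(byte[] buffer) throws IOException {
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(buffer));
            return in.readObject();
        } catch (ClassNotFoundException e) {
            throw new IOException(e);
        }
    }
}
```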

@mmdogan

mongonix

unread,
Jun 19, 2013, 5:40:33 AM6/19/13
to haze...@googlegroups.com


On Wednesday, June 19, 2013 10:55:32 AM UTC+2, Mehmet Dogan wrote:
What about an abstract class instead of an interface;

public abstract class PlainTypeSerializer<T> implements TypeSerializer<T> {

    public final void write(ObjectDataOutput out, T object) throws IOException {
        throw new UnsupportedOperationException();
    }

    public final T read(ObjectDataInput in) throws IOException {
        throw new UnsupportedOperationException();
    }

    public abstract byte[] write(T object) throws IOException;

    public abstract T read(byte[] buffer) throws IOException;

}


I'm fine with it. But how is such a serializer supposed to be invoked by HZ? How does HZ know whether it needs to call a stream-based method or a byte-array-based method? I guess there should be a way to express it somehow (e.g. via config, or via a dedicated getter method like getMode())? And what about extensibility (see the next paragraph)?

Another question (maybe I'm looking too far into the future and this is not an issue at all): what if we later want to support ByteBuffers or some other kinds of buffers/containers for serialized representations? E.g., what if we want to serialize directly into off-heap memory? (BTW, Kryo can actually do this already.) Would we need to extend this class then, or do we need a more generic approach supporting any kind of buffer? E.g., based on your proposed API:

public abstract class GenericTypeSerializer<T, B> implements TypeSerializer<T> {

    public final void write(ObjectDataOutput out, T object) throws IOException {
        throw new UnsupportedOperationException();
    }

    public final T read(ObjectDataInput in) throws IOException {
        throw new UnsupportedOperationException();
    }

    public abstract B write(T object) throws IOException;

    public abstract T read(B buffer) throws IOException;

}

Here B denotes the type of the buffer to be used. One could then provide byte[], ByteBuffer, or anything else as B.

What do you think?

-Leo

Mehmet Dogan

unread,
Jun 19, 2013, 6:34:54 AM6/19/13
to haze...@googlegroups.com
Actually, I did not like the way I suggested it; it looks ugly to me. But doing it in a cleaner way, by introducing another interface, can make the implementation a little complex. I just asked to learn how it's seen from your (and others') side.

Storing in other formats (serializing into off-heap memory, to disk, etc.) can be supported easily using the DataInput/DataOutput interfaces. But working with storages that are managed outside of Hazelcast can make things difficult for us. In Hazelcast 2 we are able to store data in off-heap memory; we'll add off-heap storage (and maybe disk) to version 3 in later releases.
 

@mmdogan

mongonix

unread,
Jun 19, 2013, 6:56:50 AM6/19/13
to haze...@googlegroups.com


On Wednesday, June 19, 2013 12:34:54 PM UTC+2, Mehmet Dogan wrote:
Actually, I did not like the way I suggested it; it looks ugly to me. But doing it in a cleaner way, by introducing another interface, can make the implementation a little complex. I just asked to learn how it's seen from your (and others') side.


But coming back to the original question of being able to use byte arrays (which is the most typical case) to reduce copying: do you still plan to introduce something, or do you tend to leave it as it is?

IMHO, having another interface is probably the most pragmatic solution. Yes, it complicates the code a bit, but only in a very few places. HZ will probably just need to check whether a given serializer implements a given interface and call the corresponding method. Then, depending on the return type (stream vs. byte array), it may need to get a byte array from the stream using the usual stream mechanisms.
 
Storing in other formats (serializing into off-heap, disk etc) can be supported easily using DataInput/DataOutput interfaces.

I see your point. Using DataInput/DataOutput definitely allows implementing such features. Of course, it does not necessarily mean that doing it this way is very efficient, but that is another story :-) E.g., Kryo can serialize directly into off-heap memory using Unsafe-based tricks, at roughly the same speed as serialization into on-heap byte arrays; almost zero overhead. Using streams you probably cannot use those tricks and you lose some speed, but that is probably OK.
 
But working with storages that are managed outside of Hazelcast can make things difficult for us. In Hazelcast 2 we are able to store data in off-heap memory; we'll add off-heap storage (and maybe disk) to version 3 in later releases.

OK. Looking forward to seeing those features in HZ 3.

Mehmet Dogan

unread,
Jun 21, 2013, 3:48:48 AM6/21/13
to haze...@googlegroups.com

mongonix

unread,
Jun 21, 2013, 4:02:14 AM6/21/13
to haze...@googlegroups.com
Hi Mehmet,
Cool! Thanks a lot! 

-Leo

Michael Sick

unread,
Jul 12, 2013, 9:33:51 PM7/12/13
to haze...@googlegroups.com
Mr Mongonix,

Do you still have the Avro serializer? Does it parse / check the schema? I would like to see it, and also to see whether I could create a Protocol Buffers equivalent.

--Mike

mongonix

unread,
Jul 13, 2013, 3:29:19 AM7/13/13
to haze...@googlegroups.com
Hi Mike,


On Saturday, July 13, 2013 3:33:51 AM UTC+2, Michael Sick wrote:
Mr Mongonix,

Do you still have the serializer for Avro? Does it parse / check schema? I would like to see it and also see if I could create a Protocol Buffer equivalent.

I'll look around to see if I still have the sources for an Avro-based serializer somewhere. If I manage to find them, I'll post them here.

-Leo

mongonix

unread,
Jul 13, 2013, 6:06:46 AM7/13/13
to haze...@googlegroups.com
Hi,


On Saturday, July 13, 2013 9:29:19 AM UTC+2, mongonix wrote:
Hi Mike,

On Saturday, July 13, 2013 3:33:51 AM UTC+2, Michael Sick wrote:
Mr Mongonix,

Do you still have the serializer for Avro? Does it parse / check schema? I would like to see it and also see if I could create a Protocol Buffer equivalent.

I'll look around to see if I still have the sources for an avro-based serializer somewhere. If I manage to find them, I'll post it here.

-Leo

Here is the old code for my Avro-based serializer test:

package test.hazelcast.serialization;

import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.io.DatumReader;
import org.apache.avro.io.DatumWriter;
import org.apache.avro.io.Decoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.Encoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.reflect.ReflectData;
import org.apache.avro.reflect.ReflectDatumReader;
import org.apache.avro.reflect.ReflectDatumWriter;

import com.hazelcast.nio.serialization.ByteArraySerializer;

public class AvroSerializer implements ByteArraySerializer<Object> {

    static final ReflectData reflectData = ReflectData.get();
    // TODO: Currently it uses a fixed schema for a concrete class.
    // To make it more universal you may want to create a run-time extendible
    // mapping from classes to their schemas.
    static final Schema schema = reflectData.getSchema(Main.class);

    final DatumWriter<Object> writer;
    final DatumReader<Object> reader;
    final ByteArrayOutputStream os;
    final DecoderFactory decoderFactory = DecoderFactory.get();
    final EncoderFactory encoderFactory = EncoderFactory.get();

    public AvroSerializer() {
        // System.out.println("Schema is:\n " + schema.toString());
        writer = new ReflectDatumWriter<Object>(schema);
        reader = new ReflectDatumReader<Object>(schema);
        os = new ByteArrayOutputStream(4096 * 4);
    }

    public int getTypeId() {
        return 122;
    }

    public byte[] write(Object object) throws IOException {
        os.reset();
        Encoder e = encoderFactory.binaryEncoder(os, null);
        writer.write(object, e);
        e.flush();
        return os.toByteArray();
    }

    public Object read(byte[] bytes) throws IOException {
        Decoder decoder = decoderFactory.binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }

    public void destroy() {
    }
}

As indicated in the code comments, this serializer works for a single user-defined class that is statically known at compile time. But it should be very straightforward to extend it to support any class that is passed to it dynamically at run time.
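The extension suggested in the TODO is essentially a memoizing per-class lookup. Here is a minimal sketch, with the Avro-specific schema lookup abstracted into a loader function so the snippet is self-contained (PerClassCache and its names are illustrative, not Avro or Hazelcast API):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Caches one expensive artifact per class, e.g. an Avro Schema or a
 *  ReflectDatumWriter/ReflectDatumReader pair, computed on first use. */
public class PerClassCache<V> {

    private final Map<Class<?>, V> cache = new ConcurrentHashMap<Class<?>, V>();
    private final Function<Class<?>, V> loader;

    public PerClassCache(Function<Class<?>, V> loader) {
        this.loader = loader;
    }

    public V get(Class<?> clazz) {
        // computeIfAbsent invokes the loader at most once per class,
        // even under concurrent access
        return cache.computeIfAbsent(clazz, loader);
    }
}
```

An extended AvroSerializer could then replace the static schema field with something like new PerClassCache<Schema>(reflectData::getSchema) and look up object.getClass() in write.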

Please let me know if this code is of any help to you, and whether you manage to do something interesting using a combination of Hazelcast and Avro.

Regards,
  Leo

P.S. BTW, the latest code in the Avro trunk contains some significant speed improvements (e.g. 3-5 times faster than before) for the org.apache.avro.reflect-based serializers used in my code above (https://issues.apache.org/jira/browse/AVRO-1282). Therefore you may want to give it a try.

da...@livemedia.com.au

unread,
Aug 8, 2013, 8:48:44 AM8/8/13
to haze...@googlegroups.com

For anyone interested, I've added an Argot (www.argot-sdk.org) implementation and put up a blog post outlining the performance. I hadn't read this thread before writing the post, but I also found that the long[] and double[] fields were the main things affecting performance in the benchmark. I really like the new serialization options; it should make developing serialization a lot easier when you can use the same code twice.

http://blog.argot-sdk.org/2013/08/argot-big-data-and-hazelcast.html

I will hopefully get my fork and changes up on git soon.

Thanks,
David.