<serialization>
<portable-version>0</portable-version>
<serializers>
<type-serializer type-class="name.of.class.to.be.serialized">name.of.serializer.class</type-serializer>
</serializers>
</serialization>
Now getting back to your question:
What would be the advantage of using a factory-based approach? The factory will not help in finding the correct TypeSerializers; it only creates instances of objects that need to be deserialized.
I checked it out and played a bit with the HZ 3.0 branch. Specifically, I tested the new and shiny support for custom serialization.
The classes that I took as a basis for my experiments are MainPortable & Co from PortableTest.java. I basically create a MainPortable object (which is pretty big and complex) and then serialize it 1000000 times, once using HZ built-in mechanisms and once using an alternative custom serializer.
So, here comes my initial feedback about the custom serialization support in HZ 3.0:
1) It works!!! HZ finally supports this feature. Very well done! No need for wrapper classes anymore.
I was able to specify my own serializer for some of my types. To do it, I used the following syntax in the config file:
<serialization>
<portable-version>0</portable-version>
<serializers>
<type-serializer type-class="name.of.class.to.be.serialized">name.of.serializer.class</type-serializer>
</serializers>
</serialization>
BTW, a few questions about this:
- Is it possible to use wildcards for specifying the class to be serialized? It could be useful if the same serializer is supposed to serialize a lot of different classes.
- Another minor thing that I noticed: even if I specify my custom serializer for a certain class, if this class implements DataSerializable or Portable, HZ ignores my custom serializer from the config file and uses the one described by the interface. I think the configuration file setting, if provided, should override those interfaces.
2) I picked Kryo as an alternative serialization framework, just to see how it would compare to HZ built-in serialization. The outcome of those tests: Kryo is 2 times faster than HZ's DataSerializable serialization.
3) When it comes to the implementation of custom serializers, I implemented TypeSerializer as prescribed by HZ. The implementation is pretty easy and straightforward, just a few lines of code. But I think it is not as efficient as it could be due to the current limitations of the TypeSerializer interface. Let me explain.
The "write" method that I override expects that I write a binary representation into an ObjectDataOutput stream. Since I use a totally different serialization framework, which does not support ObjectDataOutput streams out of the box, I first need to serialize my objects using Kryo and produce a byte array with a binary representation (this is a first pass over the binary representation). Then I have to write it into the ObjectDataOutput using write(byte[]) (this is a second pass over the binary representation). Later, when the "write" method returns, HZ would perform ObjectDataOutput.toBytes() or something like this, thus allocating a new byte array and copying the binary representation to it (this is a third pass). So, we copy our binary representation 2 times more than required...
I think the reason for this is that there is no way to just return a byte array from the write method. Therefore we have to write byte arrays to streams first and then do the same in the opposite direction. And this kills performance. Therefore, I'd suggest extending the TypeSerializer interface with methods that return byte arrays (and maybe ByteBuffers) or take them as parameters (for read methods). What do you think of it?
4) Another minor issue I noticed: I think HZ assumes that serializers are always thread-safe, i.e. the same instance of a serializer can be used by multiple threads. While this is true for many serialization frameworks, it is not always the case. Kryo, for example, is not thread-safe. Maybe there are others. Of course, it is not a big problem to implement a workaround in a custom serializer class derived from TypeSerializer. I just mention it here for the sake of completeness.
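To make the three passes from point 3 concrete, here is a minimal sketch using only java.io. It is an illustration under assumptions, not HZ code: `foreignSerialize` is a hypothetical stand-in for a framework such as Kryo that hands back its own byte[], and a plain DataOutputStream stands in for the stream that HZ passes to "write".

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CopyPassesSketch {

    // Hypothetical stand-in for an external framework (e.g. Kryo).
    // Pass 1: the binary representation is produced here.
    static byte[] foreignSerialize(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Mimics what a stream-based write method has to do with that array.
    static void write(DataOutputStream out, String value) throws IOException {
        byte[] payload = foreignSerialize(value);
        out.writeInt(payload.length); // length prefix so a reader knows where to stop
        out.write(payload);           // pass 2: copy into the stream's internal buffer
    }

    // Mimics the caller turning the stream contents back into a byte[].
    static byte[] serialize(String value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(out, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();  // pass 3: copy out of the internal buffer
    }
}
```

A write method that could return the array from pass 1 directly would avoid passes 2 and 3.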
Thanks for the feedback!
> Is it possible to use wildcards for specifying the class to be serialized? It could be useful if the same serializer is supposed to serialize a lot of different classes.
At the moment wildcards are not supported. Instead, the type class can be a super-class or an interface. Hazelcast will pick the most specific one (it first scans super-classes, then interfaces, etc.).
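As a rough illustration of that lookup order, here is a small sketch. It is not Hazelcast's actual implementation: the class `SerializerLookupSketch`, its String-valued registry, and the exact tie-breaking are assumptions made only to show "super-classes first, then interfaces".

```java
import java.util.HashMap;
import java.util.Map;

class SerializerLookupSketch {

    // Hypothetical registry: type class -> serializer name.
    // A String keeps the sketch small; a real registry would hold serializer instances.
    private final Map<Class<?>, String> registry = new HashMap<>();

    void register(Class<?> type, String serializerName) {
        registry.put(type, serializerName);
    }

    // Returns the serializer registered for the closest matching type, or null if none matches.
    String lookup(Class<?> type) {
        // 1. The class itself, then its super-classes, nearest first.
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            String found = registry.get(c);
            if (found != null) {
                return found;
            }
        }
        // 2. Interfaces declared on the class or on any of its super-classes.
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            String found = lookupInterfaces(c);
            if (found != null) {
                return found;
            }
        }
        return null;
    }

    // Checks the directly declared interfaces and, recursively, their super-interfaces.
    private String lookupInterfaces(Class<?> type) {
        for (Class<?> iface : type.getInterfaces()) {
            String found = registry.get(iface);
            if (found == null) {
                found = lookupInterfaces(iface);
            }
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}
```

With this ordering, registering one serializer for a common super-class or interface covers many concrete classes, which is the use case the wildcard question was after.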
> Another minor thing that I noticed: even if I specify my custom serializer for a certain class, if this class implements DataSerializable or Portable, HZ ignores my custom serializer from the config file and uses the one described by the interface. I think the configuration file setting, if provided, should override those interfaces.
This is intentional. For DataSerializable and Portable, we want to skip the serializer lookup phase right away. Also, making a class DataSerializable or Portable and registering a different serializer for that same type seems a bit ambiguous.
> 2) I picked Kryo as an alternative serialization framework, just to see how it would compare to HZ built-in serialization. The outcome of those tests: Kryo is 2 times faster than HZ's DataSerializable serialization.
How did you compare these two? By implementing a Kryo type serializer for Hazelcast, or in some other way?
> 3) When it comes to the implementation of custom serializers, I implemented TypeSerializer as prescribed by HZ. The implementation is pretty easy and straightforward, just a few lines of code. But I think it is not as efficient as it could be due to the current limitations of the TypeSerializer interface. Let me explain.
> The "write" method that I override expects that I write a binary representation into an ObjectDataOutput stream. Since I use a totally different serialization framework, which does not support ObjectDataOutput streams out of the box, I first need to serialize my objects using Kryo and produce a byte array with a binary representation (this is a first pass over the binary representation). Then I have to write it into the ObjectDataOutput using write(byte[]) (this is a second pass over the binary representation). Later, when the "write" method returns, HZ would perform ObjectDataOutput.toBytes() or something like this, thus allocating a new byte array and copying the binary representation to it (this is a third pass). So, we copy our binary representation 2 times more than required...
> I think the reason for this is that there is no way to just return a byte array from the write method. Therefore we have to write byte arrays to streams first and then do the same in the opposite direction. And this kills performance. Therefore, I'd suggest extending the TypeSerializer interface with methods that return byte arrays (and maybe ByteBuffers) or take them as parameters (for read methods). What do you think of it?
Yes, we are aware of this problem. Generally, most serialization frameworks are able to write to and read from streams; some are not.
Do you suggest an additional serializer interface that returns a byte[] and reads from a byte[], or adding these write/read methods to TypeSerializer? If the latter, how would we know which method to call?
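One way to picture the first option (a separate interface) is the sketch below. Everything in it is hypothetical and exists only for this illustration: `SketchSerializer`, `StreamSketchSerializer`, `ByteArraySketchSerializer` and `SerializerDispatchSketch` are made-up names, and java.io.DataOutput stands in for ObjectDataOutput. With two distinct interfaces, the "which method to call" question reduces to a type check.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical common parent, so both kinds can live in one registry.
interface SketchSerializer<T> {
    int getTypeId();
}

// Hypothetical stream-based flavour, shaped like the existing write method.
interface StreamSketchSerializer<T> extends SketchSerializer<T> {
    void write(DataOutput out, T object) throws IOException;
}

// Hypothetical byte-array flavour for frameworks that produce their own byte[].
interface ByteArraySketchSerializer<T> extends SketchSerializer<T> {
    byte[] write(T object);
}

class SerializerDispatchSketch {

    // Picks the call based on which interface the serializer implements.
    static <T> byte[] toBytes(SketchSerializer<T> serializer, T object) {
        if (serializer instanceof ByteArraySketchSerializer) {
            // The array is used as returned, with no intermediate stream.
            return ((ByteArraySketchSerializer<T>) serializer).write(object);
        }
        if (serializer instanceof StreamSketchSerializer) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                ((StreamSketchSerializer<T>) serializer).write(out, object);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return buffer.toByteArray();
        }
        throw new IllegalArgumentException("Unknown serializer kind: " + serializer.getClass());
    }
}
```

Adding both method pairs to a single interface would instead need some extra flag or convention to signal which pair is implemented, which is the ambiguity the question points at.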
> 4) Another minor issue I noticed: I think HZ assumes that serializers are always thread-safe, i.e. the same instance of a serializer can be used by multiple threads. While this is true for many serialization frameworks, it is not always the case. Kryo, for example, is not thread-safe. Maybe there are others. Of course, it is not a big problem to implement a workaround in a custom serializer class derived from TypeSerializer. I just mention it here for the sake of completeness.
Yes, TypeSerializers must be thread-safe. I think it should not be a problem to implement a type serializer in a thread-safe way (as you already mentioned). Otherwise the API would become a bit complex: implement a TypeSerializerFactory, implement a TypeSerializer, and so on.
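One common workaround of the kind mentioned above is to keep one engine instance per thread. The sketch below assumes a hypothetical `ScratchEngine` as a stand-in for a non-thread-safe framework such as Kryo (it is not Kryo's API), and `ThreadSafeSerializerSketch` only mirrors the general shape of a serializer with write and destroy methods.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a non-thread-safe engine: it reuses one internal
// scratch buffer, so two threads sharing an instance could corrupt each other's output.
class ScratchEngine {
    private final StringBuilder scratch = new StringBuilder();

    byte[] encode(String value) {
        scratch.setLength(0); // reset the shared scratch buffer
        scratch.append(value.length()).append(':').append(value);
        return scratch.toString().getBytes(StandardCharsets.UTF_8);
    }
}

class ThreadSafeSerializerSketch {

    // Each thread lazily gets its own engine, so no instance is ever shared.
    private final ThreadLocal<ScratchEngine> engines =
            ThreadLocal.withInitial(ScratchEngine::new);

    byte[] write(String value) {
        return engines.get().encode(value);
    }

    // Clears the calling thread's engine only; other threads keep theirs
    // until they call this method themselves or terminate.
    void destroy() {
        engines.remove();
    }
}
```

The serializer object itself stays a single shared instance, which matches the thread-safety requirement stated above, at the cost of one engine per thread.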
@mmdogan
Can you confirm that with the HZ3 serialization framework a custom serializer can finally 'stream' the object to/from an InputStream/OutputStream? Currently HZ2 still uses byte arrays and FastByteArray streams, and I would love to send large objects over the wire...
Thanks, Barry
public interface TypeSerializer<T> {
    int getTypeId();
    void write(ObjectDataOutput out, T object) throws IOException;
    T read(ObjectDataInput in) throws IOException;
    void destroy();
}