New to Kryo

Sandesh Kumar

Dec 4, 2013, 12:43:26 PM
to kryo-...@googlegroups.com
Hi,

I am new to Kryo, and to Java as well. I am looking to use Kryo in our project to serialize data that will be stored in a database. We would provide a library with a class/method that accepts any object, serializes it, and stores it in a database. On request, it would fetch the data back from the database, deserialize it, and return it.

According to the documentation, registering classes is optimal. But in my case I do not know up front which classes will need to be serialized/deserialized. I also see that kryo.setRegistrationRequired can be set to false if the classes that need to be serialized/deserialized are not known up front.
Questions:
1. When setRegistrationRequired is set to false, will Kryo internally try to register the class and use an integer to identify it?
2. If it does, and another instance of Kryo (probably on a different machine) tries to deserialize the object, would it succeed or cause an error?
3. Is there a performance penalty for not setting RegistrationRequired as false?

Thanks in advance.

mongonix

Dec 4, 2013, 1:40:07 PM
to kryo-...@googlegroups.com
On Wednesday, December 4, 2013 6:43:26 PM UTC+1, Sandesh Kumar wrote:
> Hi,
>
> I am new to Kryo, and to Java as well. I am looking to use Kryo in our project to serialize data that will be stored in a database. We would provide a library with a class/method that accepts any object, serializes it, and stores it in a database. On request, it would fetch the data back from the database, deserialize it, and return it.

For the sake of completeness: Kryo is not the only serialization library that can serialize/deserialize classes that are not known in advance. I think avro, protostuff-runtime and a few others can do it as well.
 

> According to the documentation, registering classes is optimal. But in my case I do not know up front which classes will need to be serialized/deserialized. I also see that kryo.setRegistrationRequired can be set to false if the classes that need to be serialized/deserialized are not known up front.
> Questions:
> 1. When setRegistrationRequired is set to false, will Kryo internally try to register the class and use an integer to identify it?

Yes. It will implicitly register the class and use an integer id.
For each object graph being serialized that contains an object of such an implicitly registered class, Kryo writes the class name and the assigned id once (!). All other objects of the same class in that object graph use only the integer id. This allows deserialization by a different Kryo instance: when it sees the first object of such a class in an object graph, it reads the class name and the assigned id, and from then on it knows how to deserialize that class by its integer id.
 
> 2. If it does, and another instance of Kryo (probably on a different machine) tries to deserialize the object, would it succeed or cause an error?

See above. Since each object graph contains the fully qualified class name and id of each such class, another instance of Kryo should be able to deserialize it, provided the class itself is available on that instance's classpath.
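To make the "name once, id afterwards" idea concrete, here is a toy model in plain Java. It is only a sketch of the scheme described above and is not Kryo's actual wire format; NameOnceWriter, NameOnceReader and the com.example class names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Toy model only: hypothetical classes, NOT Kryo's wire format.
final class NameOnceWriter {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(buffer);

    // First occurrence of a class in this graph: flag + id + name.
    // Later occurrences of the same class: flag + id only.
    void writeClassTag(String className) {
        try {
            Integer id = idsByName.get(className);
            if (id == null) {
                id = idsByName.size();
                idsByName.put(className, id);
                out.writeBoolean(true);
                out.writeInt(id);
                out.writeUTF(className);
            } else {
                out.writeBoolean(false);
                out.writeInt(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    byte[] toByteArray() {
        return buffer.toByteArray();
    }
}

// A completely separate reader: it learns the name<->id mapping from the
// stream itself, so it needs no state shared with the writer.
final class NameOnceReader {
    private final Map<Integer, String> namesById = new HashMap<>();
    private final DataInputStream in;

    NameOnceReader(byte[] data) {
        this.in = new DataInputStream(new ByteArrayInputStream(data));
    }

    String readClassTag() {
        try {
            boolean nameFollows = in.readBoolean();
            int id = in.readInt();
            if (nameFollows) {
                namesById.put(id, in.readUTF());
            }
            return namesById.get(id);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class NameOnceDemo {
    public static void main(String[] args) {
        NameOnceWriter writer = new NameOnceWriter();
        writer.writeClassTag("com.example.Order");    // name + id
        writer.writeClassTag("com.example.Order");    // id only
        writer.writeClassTag("com.example.Customer"); // name + id

        NameOnceReader reader = new NameOnceReader(writer.toByteArray());
        for (int i = 0; i < 3; i++) {
            System.out.println(reader.readClassTag());
        }
    }
}
```

With Kryo itself, the corresponding calls would be kryo.setRegistrationRequired(false) on both sides, kryo.writeClassAndObject(output, object) when writing and kryo.readClassAndObject(input) when reading.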
 
> 3. Is there a performance penalty for not setting RegistrationRequired as false?

You probably mean "Is there a performance penalty for setting RegistrationRequired as false?" Well, as I described above, for each unregistered class the serialized representation will contain the fully qualified class name (FQCN) once. IIRC, each object of such a class currently also needs 1 byte more space in the serialized representation.

The performance impact is almost invisible for big object graphs. But if you serialize a lot of small object graphs, you'll pay some price for writing the FQCN inside each object graph.
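A back-of-the-envelope way to see why graph size matters: the class names are a fixed cost per graph, so their share shrinks as the payload grows. The sketch below only does that arithmetic; FqcnOverhead is a hypothetical helper and the byte counts are made-up inputs, not measurements of Kryo.

```java
// Simplified model: serialized size = bytes spent on class names + payload bytes.
// All numbers passed in below are illustrative assumptions, not Kryo measurements.
final class FqcnOverhead {
    // Fraction of the serialized graph that consists of class names.
    static double nameFraction(int distinctClasses, int avgNameBytes, int payloadBytes) {
        long nameBytes = (long) distinctClasses * avgNameBytes;
        return (double) nameBytes / (nameBytes + payloadBytes);
    }

    public static void main(String[] args) {
        // One unregistered class with a 40-byte name in a tiny graph vs. a large one.
        System.out.printf("small graph: %.1f%%%n", 100 * nameFraction(1, 40, 60));
        System.out.printf("large graph: %.1f%%%n", 100 * nameFraction(1, 40, 100_000));
    }
}
```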
 
And a few more comments that may be relevant for your use-case:

- Kryo is very well suited for in-flight data (e.g. data sent over a network connection). When it comes to using Kryo for long-term storage, opinions differ (please read the recent discussions about this on this mailing list). Basically, if you use the same version of Kryo for serialization and deserialization, you should be safe. But Kryo does not yet guarantee that it is always backwards compatible, though we are working on it. This means that if you later upgrade to a newer version of Kryo, it may turn out that it cannot read data stored by previous versions of the library.

- For long-term storage, you may consider storing some kind of metadata in your DB as well. For example, you could assign integer ids to your classes yourself, register them with those ids, and store this FQCN<->class id mapping in your database. Later on, your deserializing Kryo instance could be configured by reading this mapping from the database and registering the classes with the proper ids. Doing it this way could provide some benefits:
  - mapping of your classes to ids is now explicit and easily understandable, since it is stored in the DB
  - you do not need to output FQCN in each object graph, i.e. you do not have any related overhead
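A minimal sketch of that explicit-mapping idea, assuming the FQCN<->id mapping has already been loaded from the database into a Map. ClassIdMapping and the ids used here are hypothetical; the registrar callback stands in for Kryo so the example runs without the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper: replays a stored "class name -> id" mapping into a registrar.
final class ClassIdMapping {
    static void registerAll(Map<String, Integer> idsByClassName,
                            BiConsumer<Class<?>, Integer> registrar) {
        for (Map.Entry<String, Integer> entry : idsByClassName.entrySet()) {
            try {
                registrar.accept(Class.forName(entry.getKey()), entry.getValue());
            } catch (ClassNotFoundException e) {
                // The stored mapping names a class this JVM cannot load.
                throw new IllegalStateException(
                        "Class from stored mapping not found: " + entry.getKey(), e);
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for rows read from the database; the ids are arbitrary examples.
        Map<String, Integer> stored = new LinkedHashMap<>();
        stored.put("java.util.ArrayList", 100);
        stored.put("java.util.HashMap", 101);

        // With Kryo on the classpath the registrar could be:
        //   (type, id) -> kryo.register(type, id)
        registerAll(stored, (type, id) ->
                System.out.println("register " + type.getName() + " as " + id));
    }
}
```

The same mapping has to be applied on every Kryo instance that writes or reads the data. Kryo assigns some low ids to primitive types and String by default, which is why the example ids start higher.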

-Leo

