Efficient Serialization of ObjectGraph pojo


Wilson

Jul 3, 2012, 8:19:04 AM
to proto...@googlegroups.com
Hi,

We are currently trying to serialize an object graph pojo using Protostuff Runtime 1.0.7. We are seeing that the serialized size with Protostuff Runtime is larger than with Java Serialization. The code snippet below illustrates this:

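import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;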

import com.dyuproject.protostuff.LinkedBuffer;
import com.dyuproject.protostuff.ProtobufIOUtil;
import com.dyuproject.protostuff.Schema;
import com.dyuproject.protostuff.runtime.RuntimeSchema;

class Counterparty implements Serializable {
    private String name;
    private String miscDatail;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getMiscDatail() {
        return miscDatail;
    }

    public void setMiscDatail(String miscDatail) {
        this.miscDatail = miscDatail;
    }
}

class EntitlementInfo implements Serializable {
    private Counterparty counterparty;

    public Counterparty getCounterparty() {
        return counterparty;
    }

    public void setCounterparty(Counterparty counterparty) {
        this.counterparty = counterparty;
    }
}

class CounterpartyInfo implements Serializable {
    private Counterparty counterparty;
    private Counterparty parentCounterparty;

    public Counterparty getCounterparty() {
        return counterparty;
    }

    public void setCounterparty(Counterparty counterparty) {
        this.counterparty = counterparty;
    }

    public Counterparty getParentCounterparty() {
        return parentCounterparty;
    }

    public void setParentCounterparty(Counterparty parentCounterparty) {
        this.parentCounterparty = parentCounterparty;
    }
}

class UserInfo implements Serializable {
    private List<EntitlementInfo> entitleList;
    private CounterpartyInfo cpInfo;

    public List<EntitlementInfo> getEntitleList() {
        return entitleList;
    }

    public void setEntitleList(List<EntitlementInfo> entitleList) {
        this.entitleList = entitleList;
    }

    public CounterpartyInfo getCpInfo() {
        return cpInfo;
    }

    public void setCpInfo(CounterpartyInfo cpInfo) {
        this.cpInfo = cpInfo;
    }
}

public class ProtostuffObjectGraphTest {

    public static void main(String[] args) {
        try {
            // Set up the test userInfo pojo: one Counterparty referenced three times.
            UserInfo userInfo = new UserInfo();
            Counterparty cp = new Counterparty();
            cp.setName("TestCP");
            StringBuilder strbuild = new StringBuilder();
            for (int i = 0; i < 100000; i++) {
                strbuild.append("a");
            }
            cp.setMiscDatail("TestMiscDetail-" + strbuild.toString());
            CounterpartyInfo cpInfo = new CounterpartyInfo();
            cpInfo.setCounterparty(cp);
            cpInfo.setParentCounterparty(cp);
            EntitlementInfo entitleInfo = new EntitlementInfo();
            entitleInfo.setCounterparty(cp);
            List<EntitlementInfo> entitleInfoList = new ArrayList<EntitlementInfo>();
            entitleInfoList.add(entitleInfo);
            userInfo.setCpInfo(cpInfo);
            userInfo.setEntitleList(entitleInfoList);

            // Serialize with Protostuff serialization.
            Schema<UserInfo> schema = RuntimeSchema.getSchema(UserInfo.class);
            LinkedBuffer buff = LinkedBuffer.allocate(LinkedBuffer.DEFAULT_BUFFER_SIZE);
            byte[] protostuffSerialzedContent = ProtobufIOUtil.toByteArray(userInfo, schema, buff);

            // Deserialize with Protostuff serialization.
            UserInfo userInfo1 = new UserInfo();
            ProtobufIOUtil.mergeFrom(protostuffSerialzedContent, userInfo1, schema);

            // Print key information from Protostuff serialization.
            System.out.println("ProtostuffSerialization serialized size: " + protostuffSerialzedContent.length);
            System.out.println("ProtostuffSerialization deserialized Counterparty references same: " + (
                    userInfo1.getCpInfo().getCounterparty() == userInfo1.getCpInfo().getParentCounterparty() &&
                    userInfo1.getCpInfo().getCounterparty() == userInfo1.getEntitleList().get(0).getCounterparty()));

            // Serialize with Java serialization.
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bos);
            out.writeObject(userInfo);
            out.close();
            byte[] javaSerialzedContent = bos.toByteArray();

            // Deserialize with Java serialization.
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(javaSerialzedContent));
            UserInfo userInfo2 = (UserInfo) in.readObject();
            in.close();

            // Print key information from Java serialization.
            System.out.println("JavaSerialization serialized size: " + javaSerialzedContent.length);
            System.out.println("JavaSerialization deserialized Counterparty references same: " + (
                    userInfo2.getCpInfo().getCounterparty() == userInfo2.getCpInfo().getParentCounterparty() &&
                    userInfo2.getCpInfo().getCounterparty() == userInfo2.getEntitleList().get(0).getCounterparty()));
        } catch (Exception ex) {
            System.err.println("Error: " + ex);
        }
    }
}

The output when you run the test class above is:
-----------------------------------------------
ProtostuffSerialization serialized size: 300104
ProtostuffSerialization deserialized Counterparty references same: false
JavaSerialization serialized size: 100461
JavaSerialization deserialized Counterparty references same: true
-----------------------------------------------

As the example shows, the constructed UserInfo object holds three references to the same Counterparty instance. Java serialization appears to serialize the object only the first time it is encountered; for each subsequent reference it records just an object reference rather than serializing the Counterparty contents again. Hence the serialized size is smaller than with Protostuff serialization. Protostuff's default behaviour seems to be to serialize the Counterparty object three times, so the serialized size is larger.
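To isolate the effect described above from the larger test class, here is a minimal sketch that uses only the JDK. The class and member names (SharedReferenceDemo, Payload, Holder, build, serializedSize) are made up for illustration and are not part of the original test. It serializes a holder whose three fields either share one payload or point at three separate copies, and compares the resulting sizes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class SharedReferenceDemo {

    /** Hypothetical payload carrying one large string. */
    public static class Payload implements Serializable {
        private static final long serialVersionUID = 1L;
        final String detail;

        Payload(String detail) {
            this.detail = detail;
        }
    }

    /** Hypothetical holder with three fields that may point at the same Payload. */
    public static class Holder implements Serializable {
        private static final long serialVersionUID = 1L;
        Payload first;
        Payload second;
        Payload third;
    }

    /** Builds a holder whose fields share one Payload, or hold three distinct copies. */
    public static Holder build(boolean shared) {
        char[] chars = new char[10000];
        Arrays.fill(chars, 'a');
        Holder holder = new Holder();
        holder.first = new Payload(new String(chars));
        // new String(chars) creates a fresh String each time, so the distinct
        // variant really contains three separate payloads and strings.
        holder.second = shared ? holder.first : new Payload(new String(chars));
        holder.third = shared ? holder.first : new Payload(new String(chars));
        return holder;
    }

    /** Returns the number of bytes ObjectOutputStream writes for the object. */
    public static int serializedSize(Serializable obj) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.size();
    }

    public static void main(String[] args) {
        System.out.println("shared references:  " + serializedSize(build(true)));
        System.out.println("distinct instances: " + serializedSize(build(false)));
    }
}
```

With shared references the 10,000-character string should be written once, so the first size stays close to 10,000 bytes, while the distinct variant writes it three times.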

With our actual object graph pojo objects, Java serialization produces ~200Mb of serialized data, while Protostuff Runtime produces ~450Mb. Because Protostuff serializes and deserializes the same object once per reference within a given object graph pojo instance, serialization/deserialization ends up taking about as long as with Java Serialization.

Can Protostuff Runtime do something similar to Java Serialization, i.e. optimize the case where several data members of an object graph reference the same instance? If Protostuff can optimize same-object references, I'm sure the serialized size will be smaller than with Java Serialization, and the serialization/deserialization time will be lower as well.
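For clarity, the kind of optimization being asked about can be sketched in a few lines. This is only an illustration of identity-based reference tracking under my own assumptions; it is not how Protostuff or the JDK implement it, and the names (ReferenceTrackingSketch, encode, the "obj#"/"ref#" text format) are hypothetical. The encoder remembers each instance by identity and emits a short back-reference the second time it sees the same instance.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class ReferenceTrackingSketch {

    /**
     * Encodes each element either in full ("obj#id=value") on first sight,
     * or as a back-reference ("ref#id") when the same instance was seen before.
     */
    public static List<String> encode(List<?> objects) {
        // IdentityHashMap compares keys with ==, so two equal but distinct
        // objects are treated as different entries.
        Map<Object, Integer> seen = new IdentityHashMap<>();
        List<String> out = new ArrayList<>();
        for (Object o : objects) {
            Integer id = seen.get(o);
            if (id == null) {
                id = seen.size();
                seen.put(o, id);
                out.add("obj#" + id + "=" + o);
            } else {
                out.add("ref#" + id);
            }
        }
        return out;
    }
}
```

A decoder would keep the mirror-image table (id to instance) so that every back-reference resolves to the one object already rebuilt, which is what would preserve `==` between the references after deserialization.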

Thanks

Regards

Wilson.

David Yu

Jul 4, 2012, 10:31:33 PM
to proto...@googlegroups.com
First off, read the wiki:

In the first place, you should have been using GraphIOUtil.
Also, why are you referring to ProtobufIOUtil as the protostuff format?

--
When the cat is away, the mouse is alone.
- David Yu

Wilson

Jul 9, 2012, 4:26:44 PM
to proto...@googlegroups.com
Hi David,

Thanks for responding. I tried GraphIOUtil in our app today and it works great. Java Serialization was producing ~330Mb; GraphIOUtil produces ~186Mb, whereas ProtobufIOUtil was producing ~520Mb. Serialization/deserialization with GraphIOUtil took about half the time of Java Serialization.

When I was referring to "Protostuff Serialization/Deserialization" in the sample program posted here, it was in the context of your product in general (given your product suite is called Protostuff), and not in reference to the specific format type name (e.g. protostuff format, protobuf format etc...). 

Thanks for your help.

Regards

Wilson.


michael...@gmail.com

Jul 10, 2012, 1:15:22 PM
to proto...@googlegroups.com