java.nio.BufferOverflowException


jeppec

Jan 15, 2010, 9:39:53 AM
to kryonet-users
Hi

I'm trying out a pretty simple serialization solution using Kryo
(latest version downloaded from the homepage) to see how it compares
to my current serialization solution.
But no matter how I try to use it (with a directly created
ByteBuffer, with ObjectBuffer and a ByteArrayOutputStream, or with the
solution below), I keep getting java.nio.BufferOverflowException
(running Java 1.6.0_17 on OS X 10.6.2).

Kryo kryo = new Kryo();
kryo.register(HashMap.class, new MapSerializer(kryo));
kryo.register(Call.class);
...
kryo.register(Stat.class);

ObjectBuffer buffer = new ObjectBuffer(kryo);
byte[] result = buffer.writeObjectData(logDataCache);

This consistently gives me the following exception:

java.nio.BufferOverflowException
    at java.nio.Buffer.nextPutIndex(Buffer.java:495)
    at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:145)
    at com.esotericsoftware.kryo.serialize.LongSerializer.put(LongSerializer.java:51)
    at com.esotericsoftware.kryo.serialize.LongSerializer.writeObjectData(LongSerializer.java:33)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:332)
    at com.esotericsoftware.kryo.serialize.MapSerializer.writeObjectData(MapSerializer.java:97)
    at com.esotericsoftware.kryo.serialize.FieldSerializer.writeObjectData(FieldSerializer.java:158)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:332)
    at com.esotericsoftware.kryo.serialize.MapSerializer.writeObjectData(MapSerializer.java:104)
    at com.esotericsoftware.kryo.serialize.FieldSerializer.writeObjectData(FieldSerializer.java:158)
    at com.esotericsoftware.kryo.Kryo.writeClassAndObject(Kryo.java:332)
    at com.esotericsoftware.kryo.serialize.MapSerializer.writeObjectData(MapSerializer.java:104)
    at com.esotericsoftware.kryo.serialize.FieldSerializer.writeObjectData(FieldSerializer.java:158)
    at com.esotericsoftware.kryo.Kryo.writeObjectData(Kryo.java:361)
    at com.esotericsoftware.kryo.ObjectBuffer.writeObjectData(ObjectBuffer.java:133)
    at com.project.LogProcessor.serializeLogDataCache(LogProcessor.java:287)

Am I doing something wrong?

Thanks in advance :)

/Jeppe

Nate

Jan 15, 2010, 1:36:00 PM
to kryonet-users
Hi Jeppe,

Just to be clear, there are two projects: Kryo which does
serialization, and KryoNet which uses Kryo to do TCP and UDP
communication. The Kryo discussion group is here:
http://groups.google.com/group/kryo-users
The projects are closely related, so it really isn't a big deal. :) I'll
go ahead and respond here.

Regarding the exception you are seeing, the ObjectBuffer must be large
enough to contain all the bytes for the object graph that is being
serialized. The default is 2048 bytes. You can increase the size using
the ObjectBuffer constructor, e.g. for 64 KB:

ObjectBuffer buffer = new ObjectBuffer(kryo, 64 * 1024);

The object graph is nearly always entirely in memory anyway, so this
isn't usually much of a limitation.
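
The overflow itself can be reproduced without Kryo. The following is a minimal sketch using only java.nio; the class and method names are made up for illustration. It shows that a ByteBuffer has a fixed capacity and throws BufferOverflowException as soon as a write does not fit:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

class FixedBufferDemo {
    /** Tries to write `count` longs into a buffer of `capacity` bytes; returns true if they all fit. */
    static boolean fits(int capacity, int count) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity); // fixed capacity, never grows
        try {
            for (int i = 0; i < count; i++) {
                buffer.putLong(i); // each long needs 8 bytes of remaining space
            }
            return true;
        } catch (BufferOverflowException e) {
            return false; // capacity was exhausted part-way through the writes
        }
    }

    public static void main(String[] args) {
        System.out.println("256 longs in 2048 bytes: " + fits(2048, 256));
        System.out.println("257 longs in 2048 bytes: " + fits(2048, 257));
        System.out.println("257 longs in 64 KB:      " + fits(64 * 1024, 257));
    }
}
```

Sizing the buffer above the largest expected serialized size avoids the exception entirely, which is what the 64 KB constructor argument does for ObjectBuffer.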

BTW, if you were to register HashMap without specifying MapSerializer,
Kryo would automatically use MapSerializer. See the default
serializers here:
http://tinyurl.com/yhc68mk

Hope that helps!

-Nate

jeppec

Jan 15, 2010, 3:46:41 PM
to kryonet-users
Ah sorry, didn't see that I had posted to the net group.
I didn't even think of looking at the constructors as I thought
ObjectBuffer would handle growing the byte array in case it wasn't
large enough.
I'm not too familiar with the nio ByteBuffer, so it seems odd that
there's no way to grow the buffer as needed. Right now I might need a
buffer of 200 KB and later on I might need 2 MB.
I could always handle the BufferOverflowException and use that to grow
the initial buffer's size, but it seems less intuitive.

Do you know if there's a way to have it grow on demand?

/Jeppe

Nate

Jan 15, 2010, 8:55:57 PM
to kryonet-users
On Jan 15, 12:46 pm, jeppec <jeppe.cra...@gmail.com> wrote:
> Do you know if there's a way to have it grow on demand?

ByteBuffers are low level, representing a chunk of memory. There isn't
a way for them to grow automatically.

One way to handle this would be for Kryo to not pass around
ByteBuffers and instead pass around something that managed a list of
buffers. As each buffer is filled, a new one is allocated and added to
the list.
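
The list-of-buffers idea could look roughly like the sketch below. It is not Kryo code; ChunkedBuffer and its methods are hypothetical names used only to illustrate allocating a new fixed-size chunk whenever the current one fills up:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class ChunkedBuffer {
    private final int chunkSize;
    private final List<ByteBuffer> chunks = new ArrayList<>();
    private ByteBuffer current;

    ChunkedBuffer(int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        this.chunkSize = chunkSize;
        addChunk();
    }

    private void addChunk() {
        current = ByteBuffer.allocate(chunkSize);
        chunks.add(current);
    }

    /** Writes all bytes, spilling into newly allocated chunks as each one fills. */
    void put(byte[] src) {
        int offset = 0;
        while (offset < src.length) {
            if (!current.hasRemaining()) addChunk();
            int n = Math.min(current.remaining(), src.length - offset);
            current.put(src, offset, n);
            offset += n;
        }
    }

    int chunkCount() {
        return chunks.size();
    }

    /** Copies the written part of every chunk into one contiguous array. */
    byte[] toByteArray() {
        int total = 0;
        for (ByteBuffer chunk : chunks) total += chunk.position();
        byte[] out = new byte[total];
        int offset = 0;
        for (ByteBuffer chunk : chunks) {
            int length = chunk.position();
            System.arraycopy(chunk.array(), 0, out, offset, length);
            offset += length;
        }
        return out;
    }
}
```

The cost of this design is that every write goes through the wrapper instead of hitting a single ByteBuffer directly, which is the efficiency concern raised in the next paragraph.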

As you mentioned, another way to handle this would be to catch the
BufferOverflowException, grow the buffer, and try again. I'm leaning
toward this because it doesn't impact efficiency in the best case
scenario, where the buffer can be sized large enough up front. The
ObjectBuffer class could handle the retries, so API users won't be
bothered.

I have checked in the changes for this. If you run Kryo from SVN, the
ObjectBuffer will automatically resize. See the new constructor for
setting the initial and maximum sizes.
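
The catch, grow, and retry approach can be sketched independently of Kryo. This is only an illustration of the technique, not the ObjectBuffer implementation; GrowingWriter, writeWithRetry, and the doubling policy are assumptions:

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.util.function.Consumer;

class GrowingWriter {
    /**
     * Runs `writer` against a buffer of `initialSize` bytes. On overflow the
     * buffer size is doubled (capped at `maxSize`) and the write is repeated
     * from the start, so `writer` must be safe to run more than once.
     */
    static byte[] writeWithRetry(Consumer<ByteBuffer> writer, int initialSize, int maxSize) {
        if (initialSize <= 0 || maxSize < initialSize) {
            throw new IllegalArgumentException("need 0 < initialSize <= maxSize");
        }
        int size = initialSize;
        while (true) {
            ByteBuffer buffer = ByteBuffer.allocate(size);
            try {
                writer.accept(buffer);
                byte[] result = new byte[buffer.position()];
                buffer.flip();      // switch from writing to reading
                buffer.get(result); // copy out only the bytes actually written
                return result;
            } catch (BufferOverflowException e) {
                if (size >= maxSize) throw e; // already at the limit, give up
                size = (int) Math.min((long) size * 2, maxSize);
            }
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeWithRetry(b -> {
            for (int i = 0; i < 100; i++) b.putLong(i);
        }, 16, 1 << 20);
        System.out.println("wrote " + bytes.length + " bytes");
    }
}
```

When the initial size is large enough, the first attempt succeeds and there is no extra cost. Each failed attempt repeats the whole write, which would explain a slowdown when the initial size is far below the serialized size.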

-Nate

jeppec

Jan 18, 2010, 11:13:45 AM
to kryonet-users
Hi Nate

That was fast - thanks :)
It works nicely - but I'm a bit surprised by the speed and resulting
size of the object serialization.

I'm serializing a fairly large object tree. If I serialize it using
Hessian, it takes around 642 ms and the resulting byte[] is 6.9 MB
(uncompressed; compressed takes 1100 ms and the size is 1.9 MB).
With Kryo it takes 2076 ms and the size is 10.3 MB (uncompressed).

I'm registering serializers (field serializers) for all my class types
and for HashMap (it complains if I don't) and TreeMap.
Do I need to do something more specialized to take advantage of Kryo's
features?

/Jeppe

Nate

Jan 18, 2010, 2:14:47 PM
to kryone...@googlegroups.com
Hi Jeppe,

That is strange. From my experience, Kryo should be much smaller and many times faster than Hessian:
http://code.google.com/p/thrift-protobuf-compare/
The time could be explained if the ObjectBuffer's initial size is too small. I don't have much of an explanation for why Kryo's output is larger. My only guess would be that there is a difference between your Kryo and Hessian benchmarks.

Just registering the classes should be all you need to do.

Can you send your code that uses Kryo? If possible, a runnable program showing the problem would be fantastic.

-Nate



jeppec

Jan 18, 2010, 3:03:51 PM
to kryonet-users
Hi Nate

You were right about the initial buffer being too small. Setting it
sufficiently high cut the time down to something comparable to
Hessian.
I tried to apply the compressor, but it failed. If I understand the
documentation correctly, I only have to apply it to the top/root-level
object, right?

Here's the exception:

java.lang.IndexOutOfBoundsException
    java.nio.Buffer.checkBounds(Buffer.java:530)
    java.nio.HeapByteBuffer.get(HeapByteBuffer.java:125)
    com.esotericsoftware.kryo.compress.ByteArrayCompressor.compress(ByteArrayCompressor.java:28)
    com.esotericsoftware.kryo.Compressor.writeObjectData(Compressor.java:67)
    com.esotericsoftware.kryo.Kryo.writeObjectData(Kryo.java:367)
    com.esotericsoftware.kryo.ObjectBuffer.writeObjectData(ObjectBuffer.java:166)

And the code:


Kryo kryo = new Kryo();

kryo.register(LogDataCache.class, new DeflateCompressor(new FieldSerializer(kryo)));
kryo.register(HashMap.class, new MapSerializer(kryo));
kryo.register(TreeMap.class, new MapSerializer(kryo));
kryo.register(Call.class);

Regarding the size issue, it will be troublesome to cut out enough
code to make a realistic example, sorry.
There are actually no differences between the two benchmarks: they're
fed the same object and are asked to serialize it.
I can't see how that should affect sizes?

/Jeppe

Nate

Jan 18, 2010, 3:18:32 PM
to kryone...@googlegroups.com
On Mon, Jan 18, 2010 at 12:03 PM, jeppec <jeppe....@gmail.com> wrote:
> You were right about the initial buffer being too small. Setting it
> sufficiently high cut the time down to something comparable to
> Hessian.

Kryo should still be much faster. Strange. Did you set it higher than the serialized size?
 
> I tried to apply the compressor, but it failed. If I understand the
> documentation correctly, I only have to apply it to the top/root-level
> object, right?

You can use a compressor anywhere you can use a serializer, even in the middle of an object graph. If you apply it to an object, the bytes for that object and all objects under it are compressed. So yes, if you apply it to the root object, the bytes for the whole graph will be compressed.

The exception you are seeing is because DeflateCompressor has a buffer size, similar to ObjectBuffer. It looks like DeflateCompressor could be rewritten to compress data in batches, in which case the default size of 2048 would be sufficient. I'll look into it. For now you can use the constructor to set a high enough buffer size.
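
Compressing in batches with a small scratch buffer can be sketched with the JDK's java.util.zip classes. This is not how DeflateCompressor is written; BatchDeflate and its methods are hypothetical, and the growable ByteArrayOutputStream stands in for wherever the compressed bytes would go:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class BatchDeflate {
    /** Compresses `input`, draining the deflater through a scratch buffer of `chunkSize` bytes. */
    static byte[] compress(byte[] input, int chunkSize) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish(); // no more input will follow
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize]; // reused for every batch
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reverses compress(), again working through a small scratch buffer. */
    static byte[] decompress(byte[] compressed, int chunkSize) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[chunkSize];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("compressed data is incomplete");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("compressed data is corrupt", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

With this shape, the scratch buffer can stay at 2048 bytes no matter how large the input is, because each batch is flushed before the next one is produced.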
 
> And the code:
> Kryo kryo = new Kryo();
> kryo.register(LogDataCache.class, new DeflateCompressor(new FieldSerializer(kryo)));
> kryo.register(HashMap.class, new MapSerializer(kryo));
> kryo.register(TreeMap.class, new MapSerializer(kryo));
> kryo.register(Call.class);

This looks good. The following is equivalent to how you are registering the maps:

kryo.register(HashMap.class);
kryo.register(TreeMap.class);

This change won't affect performance though.

If the benchmarks are indeed the same, I am very interested to see what is going on. If Hessian somehow beats Kryo at serialized size with your object graph, the thrift-protobuf-compare benchmark project could be very misleading and would need fixing.

At this point, we really need an executable example. You are only registering two classes, LogDataCache and Call. If you can tell me what fields these classes have, I should be able to build a reasonable approximation of your object graph, populate it similarly, and see where the time and space go.

-Nate
 

jeppec

Jan 18, 2010, 4:25:23 PM
to kryonet-users
Here are some of the measurements from 4 consecutive runs (each carrying
a little more data than the previous one, hence the byte array size
differences):

Hessian Serialized LogDataCache in 1404 ms. - size: 6864 kb
Kryoed LogDataCache in 425 ms. - size: 10350 kb
Hessian Serialized LogDataCache in 470 ms. - size: 6868 kb
Kryoed LogDataCache in 417 ms. - size: 10357 kb
Hessian Serialized LogDataCache in 444 ms. - size: 6872 kb
Kryoed LogDataCache in 433 ms. - size: 10363 kb

I'm registering many classes:


Kryo kryo = new Kryo();

kryo.register(LogDataCache.class, new FieldSerializer(kryo));
//kryo.register(LogDataCache.class, new DeflateCompressor(new FieldSerializer(kryo)));

kryo.register(HashMap.class, new MapSerializer(kryo));
kryo.register(TreeMap.class, new MapSerializer(kryo));
kryo.register(Call.class);

kryo.register(CallStatistics.class);
kryo.register(Host.class);
kryo.register(Component.class);
kryo.register(HostCallStatistics.class);
kryo.register(HostVolumeStatistics.class);
kryo.register(LogMessageData.class);
kryo.register(TimeInterlacedCallStatistics.class);
kryo.register(TimeInterlacedVolumeStatistics.class);
kryo.register(VolumeReport.class);
kryo.register(VolumeStatistics.class);
kryo.register(VolumeStat.class);
kryo.register(Stat.class);

I will send you a mail with more information about how to construct
the classes :)

/Jeppe


Nate

Jan 18, 2010, 7:36:59 PM
to kryone...@googlegroups.com
Thanks for sending the test case, it was very helpful. After being quite confused for some time, I finally realized your object graph has many objects that appear in many places in the graph. Kryo currently doesn't support references, so it serializes each object each time it is encountered.

I will give some thought to how to efficiently implement references and get back to you.
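
The effect of shared objects on output size can be illustrated with a small sketch. It is not Kryo code: the Node class, the sample graph, and the use of an IdentityHashMap to assign ordinals are all assumptions made for the example. It counts how many full object writes happen with and without reference tracking:

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class ReferenceDemo {
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Full writes when every occurrence of an object is serialized again. */
    static int writesWithoutReferences(Node node) {
        int count = 1;
        for (Node child : node.children) count += writesWithoutReferences(child);
        return count;
    }

    /** Full writes when a repeated object is replaced by the ordinal it was given earlier. */
    static int writesWithReferences(Node node, Map<Node, Integer> seen) {
        if (seen.containsKey(node)) return 0; // only the ordinal would be written
        seen.put(node, seen.size());          // remember by identity, not equals()
        int count = 1;
        for (Node child : node.children) count += writesWithReferences(child, seen);
        return count;
    }

    /** A root with two children that both point at the same shared node. */
    static Node sampleGraph() {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        Node shared = new Node("shared");
        a.children.add(shared);
        b.children.add(shared);
        root.children.add(a);
        root.children.add(b);
        return root;
    }

    public static void main(String[] args) {
        Node root = sampleGraph();
        System.out.println("without references: " + writesWithoutReferences(root));
        System.out.println("with references:    "
            + writesWithReferences(root, new IdentityHashMap<Node, Integer>()));
    }
}
```

In a graph where the same objects appear many times, the first count grows with every extra occurrence while the second stays at the number of unique objects.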

-Nate


Nate

Jan 18, 2010, 8:36:43 PM
to kryone...@googlegroups.com
I hacked in support for references and this is what I see:

Kryo: serialize 2243ms, deserialize 2552ms, length 7349869 bytes
Hessian: serialize 3046ms, deserialize 2092ms, length 7921806 bytes

Kryo is 26% faster at serialization, 22% slower at deserialization, and 7.2% smaller. I haven't had a chance to see why deserialization is so slow. The size difference is much larger with small object graphs. With larger graphs like yours, Hessian's overhead is a smaller percentage of the data.

To be the most efficient, rather than rely on the serialization library to handle the references, you could write only the objects necessary. When you read them back, you would have to hand-write code to reconstruct the references in your object graph. E.g., persist only Call and Volume objects, and then call addCall and addVolumeLog as you deserialize each one.

I haven't decided how to officially add support for references. Currently I have just hacked FieldSerializer. Maybe to use references you would have to use a specific serializer...

-Nate

Jeppe Cramon

Jan 19, 2010, 1:47:43 AM
to kryone...@googlegroups.com
Hi Nate

Thanks for giving it a shot :)
Regarding your suggestion about reconstruction on deserialization: I'm wondering why using references would make it so much bigger. Isn't it just a small reference/pointer that gets serialized?

/Jeppe

Nate

Jan 19, 2010, 2:13:04 PM
to kryone...@googlegroups.com
Hi Jeppe,

On Mon, Jan 18, 2010 at 10:47 PM, Jeppe Cramon <jeppe....@gmail.com> wrote:
> Regarding your suggestion about reconstruction on deserialization: I'm wondering why using references would make it so much bigger. Isn't it just a small reference/pointer that gets serialized?

Your object graph happens to have the same object in the graph many times. In your test case, I count 300,078 unique objects and I count 600,130 times those objects appear in the graph beyond the first time.

In my Kryo implementation of references, I write a zero the first time an object is encountered, so 300,078 bytes are used for that. Subsequent times an object is encountered, I write an integer ordinal that represents a reference to an object already written. With Kryo, an integer costs between 1 and 5 bytes, depending on the size of the integer:
http://tinyurl.com/ykq7242
I calculate that 600,130 sequential integers take 1783878 bytes:
128 + 16256*2 + 583746*3 = 1783878
So the total bytes needed to handle all your references:
300078 + 1783878 = 2083956
That is 1.99 megabytes being wasted, which is 28% of Kryo's serialized size for your object graph.
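
Arithmetic like this can be checked by summing the per-integer cost directly. The following is a minimal sketch, assuming a variable-length encoding that stores 7 bits of payload per byte (consistent with the 1 to 5 byte range mentioned above) and ordinals running from 0 to 600,129. The class and method names are made up for illustration, and this is not Kryo's IntSerializer:

```java
class VarIntCost {
    /** Bytes needed for a non-negative int when each byte carries 7 bits of payload. */
    static int varIntLength(int value) {
        int length = 1;
        while ((value >>>= 7) != 0) length++; // one more byte per extra 7 bits
        return length;
    }

    /** Total bytes needed to write the ordinals 0, 1, ..., count - 1. */
    static long totalBytes(int count) {
        long total = 0;
        for (int i = 0; i < count; i++) total += varIntLength(i);
        return total;
    }

    public static void main(String[] args) {
        long firstSeenMarkers = 300078;              // one zero byte per unique object
        long referenceOrdinals = totalBytes(600130); // one ordinal per repeated occurrence
        System.out.println("ordinal bytes:  " + referenceOrdinals);
        System.out.println("total overhead: " + (firstSeenMarkers + referenceOrdinals));
    }
}
```

Under this scheme the first 128 values take one byte each, the next 16,256 take two, and everything else up to 2,097,151 takes three, which is the three-term breakdown used in the calculation.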

-Nate

Nate

Jan 20, 2010, 7:22:38 PM
to kryonet-users
The serializer that supports references has been checked into SVN for
Kryo. See the javadocs for details; you can use it as shown below. It
is a little hacky to do the instanceof check and to have to muck with
the context, but I haven't yet come up with a better way to solve
these.

-Nate


kryo = new Kryo() {
    public Serializer getDefaultSerializer (Class type) {
        Serializer serializer = super.getDefaultSerializer(type);
        if (serializer instanceof FieldSerializer) serializer = new ReferenceFieldSerializer(kryo, type);
        return serializer;
    }
};

// ...register classes...

long startTime = System.currentTimeMillis();
Kryo.getContext().put("references", null);
byte[] bytes = objectBuffer.writeObjectData(logDataCache);
long serializeTime = System.currentTimeMillis() - startTime;

startTime = System.currentTimeMillis();
Kryo.getContext().put("references", null);
LogDataCache logDataCache2 = objectBuffer.readObjectData(bytes, LogDataCache.class);
long deserializeTime = System.currentTimeMillis() - startTime;

System.out.println("Kryo: serialize " + serializeTime + "ms, deserialize " + deserializeTime + "ms, length " + bytes.length + " bytes");

if (!logDataCache.equals(logDataCache2)) throw new RuntimeException("Kryo failed round trip.");

