--
When the cat is away, the mouse is alone.
- David Yu
--
-Nate
You received this message because you are subscribed to the Google Groups "java-serialization-benchmarking" group.
To post to this group, send email to java-serializat...@googlegroups.com.
To unsubscribe from this group, send email to java-serialization-be...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/java-serialization-benchmarking?hl=en.
On Sat, Apr 28, 2012 at 2:21 AM, Nate <nathan...@gmail.com> wrote:
You have missed the point about reducing overhead -- we should not be timing the growing of the buffer, ever.
On Fri, Apr 27, 2012 at 11:09 AM, David Yu <david....@gmail.com> wrote:
Here are a number of ways a library can take a shortcut:
1. If you're a compute-based serializer (any protobuf-based serializer, like wobly), you intentionally persist the computed size from the first run, so you don't need to compute it for the succeeding runs (like the others do).
2. If you're a stream/buffer-based serializer, you intentionally persist the resized buffer from the first run, so you are exempted from flushing/resizing/expanding for the succeeding runs.
It is as simple as that.
You can't compromise one for the other.
To be fair to all types of serializers mentioned, avoid any of the above.
That is easy to digest. Hopefully that clears things up from here on out.
And I agree with you, that is why media.1.cks is used on the public results? No growing of the buffer happens there.
The fact still remains that if the users themselves ever use their own dataset, kryo will have false results because it takes the shortcut.
As kannan said:
it's how we measure the other tools, so it's still not fair to publish results without fixing up the others.
You cannot ever fix the others because they were designed that way.
Stream-based serializers will always need to flush.
Being a stream-based serializer, kryo is trying to be smart by avoiding that overhead only for your benefit.
Other buffer-based serializers will always need to expand/resize and reset on every iteration, but kryo is still trying to be smart and avoid that overhead.
-Nate
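For readers following the buffer argument, here is a minimal sketch of what "persisting the resized buffer" means for a java.io.ByteArrayOutputStream. The class and method names are hypothetical, and this is not code from the benchmark or from kryo. It only illustrates that reset() keeps the already-grown backing array, so later iterations skip the growth that a fresh stream pays every time.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration only; not taken from the benchmark or from kryo.
final class BufferReuseSketch {

    /** ByteArrayOutputStream that exposes the length of its backing array. */
    static final class InspectableStream extends ByteArrayOutputStream {
        int capacity() {
            return buf.length; // 'buf' is the protected backing array of the superclass
        }
    }

    /** A fresh stream starts with the small default backing array. */
    static int capacityOfFreshStream() {
        return new InspectableStream().capacity();
    }

    /** Write once, then reset: the write position rewinds, the grown array stays. */
    static int capacityAfterWriteAndReset(InspectableStream out, byte[] payload) {
        out.write(payload, 0, payload.length);
        out.reset();
        return out.capacity();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1000]; // made-up payload size
        InspectableStream reused = new InspectableStream();
        System.out.println("fresh stream capacity:  " + capacityOfFreshStream());
        System.out.println("reused stream capacity: "
                + capacityAfterWriteAndReset(reused, payload));
    }
}
```

Whether a benchmark should reuse the stream or create one per iteration is exactly the point under dispute in this thread; the sketch only shows the mechanical difference between the two.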
What's the point of computing the average for the first single run, when the second run has completely different behavior/results from the first run?
In fact, there is no averaging occurring for any runs, see TestCaseRunner#runTakeMin.
-Nate
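To make the difference between averaging and taking the minimum concrete, here is a small sketch. TestCaseRunner#runTakeMin is not shown in this thread, so this is only an assumed illustration of the idea, with hypothetical names and made-up timings: one slow trial (for example, one that includes a warm-up cost) shifts the mean but leaves the minimum untouched.

```java
// Hypothetical illustration; not the benchmark's TestCaseRunner code.
final class TimingSummarySketch {

    /** Smallest timing across all trials. */
    static long min(long[] timings) {
        if (timings.length == 0) {
            throw new IllegalArgumentException("no timings");
        }
        long best = timings[0];
        for (long t : timings) {
            best = Math.min(best, t);
        }
        return best;
    }

    /** Arithmetic mean across all trials. */
    static double mean(long[] timings) {
        if (timings.length == 0) {
            throw new IllegalArgumentException("no timings");
        }
        long sum = 0;
        for (long t : timings) {
            sum += t;
        }
        return (double) sum / timings.length;
    }

    public static void main(String[] args) {
        // Made-up numbers: the first trial is slower than the rest.
        long[] timings = {2400, 1700, 1710, 1705};
        System.out.println("min:  " + min(timings));
        System.out.println("mean: " + mean(timings));
    }
}
```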
The first run is not included in any average, the first run is used to check correctness.
This is actually where your code cheats.
You intentionally re-use your components to collect data (persist the size) from the first run, so that you'll have an advantage over all the other serializers.
-Nate
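The "persist the size" claim can be sketched as follows. This is a hypothetical compute-based message with a memoized size, not code from any of the serializers discussed here; the 4-byte length prefix is an arbitrary assumption. It only shows that when the same object is reused across runs, the size is computed once, while a fresh object per run computes it every time.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not taken from wobly, kryo, or the benchmark.
final class MemoizedSizeSketch {

    static final class Message {
        private final String title;
        private int memoizedSize = -1; // -1 means "not computed yet"
        int sizeComputations = 0;      // counts how often the size was really computed

        Message(String title) {
            this.title = title;
        }

        int serializedSize() {
            if (memoizedSize < 0) {
                sizeComputations++;
                // Assumed layout: 4-byte length prefix plus the UTF-8 bytes.
                memoizedSize = 4 + title.getBytes(StandardCharsets.UTF_8).length;
            }
            return memoizedSize;
        }
    }

    /** Same object across runs: the size is computed on the first run only. */
    static int computationsWhenReused(int runs) {
        Message m = new Message("title");
        for (int i = 0; i < runs; i++) {
            m.serializedSize();
        }
        return m.sizeComputations;
    }

    /** New object per run: the size is computed on every run. */
    static int computationsWhenFresh(int runs) {
        int total = 0;
        for (int i = 0; i < runs; i++) {
            Message m = new Message("title");
            m.serializedSize();
            total += m.sizeComputations;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("reused object: " + computationsWhenReused(3) + " computation(s)");
        System.out.println("fresh objects: " + computationsWhenFresh(3) + " computation(s)");
    }
}
```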
On Sat, Apr 28, 2012 at 3:21 AM, Nate <nathan...@gmail.com> wrote:
Again, it is done to remove overhead. The same overhead can be removed for other serializers by reusing the ByteArrayOutputStream.
On Fri, Apr 27, 2012 at 12:19 PM, David Yu <david....@gmail.com> wrote:
The first run is not included in any average, the first run is used to check correctness.
This is actually where your code cheats.
You intentionally re-use your components to collect data (persist the size) from the first run, so that you'll have an advantage over all the other serializers.
Nope. What about the others that don't use an OutputStream? Have you thought of that?
In fact, when you changed the code and reused the OutputStream, java-manual took a hit (from 1700ms to 2400ms).
In a real project, that is how java-manual is used. OutputStreams are not reused at all.
Here's the performance of java-manual and kryo both using an outputstream (not re-used).
./run -trials=500 -include=java-manual,kryo,wobly data/media.3.cks
Checking correctness...[done]
             create    ser  +same  deser  +shal  +deep  total   size   +dfl
java-manual     135   7489   7294   3920   4045   4119  11608   1596    255
kryo            135   6937   6945   4703   4763   4918  11855   1573    254
wobly            86  11164  10979   3521   3562   3631  14796   1604    275
In this case, kryo without the shortcuts, is actually slower than java-manual.
When everything is equal (both libraries flushing to outputstream), java-manual performs better.
So when you mention that you're faster than java-manual, tell them "I did it through shortcuts".