the future of Kryo: v5


Nate

Apr 25, 2018, 9:13:00 AM
to kryo-users
Hello everyone,

I have been working on cleaning up and improving the Kryo library for a new major release. You can see the current state here:
https://github.com/EsotericSoftware/kryo/tree/kryo-5.0.0-dev

I'll ramble about some of the major changes:
  • Generics has been redone to support all scenarios. GenericsUtil thoroughly discovers the generic types that are known at compile time (tests), which avoids as much work as possible during serialization. The generic types known only at serialization time are tracked by Kryo's instance of Generics, which maintains a stack of GenericType instances (lots of nice javadocs in those classes).

    How does using the new generics API look? For a simple example with just one type parameter, see CollectionSerializer. To get the class, call kryo.getGenerics().nextGenericClass() then after reading the child objects call kryo.getGenerics().popGenericType(). Simple!

    nextGenericClass() is a shortcut for the common case of a class with a single type parameter. When there are multiple type parameters, like Map, use nextGenericTypes() instead, then call resolve() to get each class. Also, the last parameter is made current automatically (meaning it is pushed by Generics#nextGenericTypes()), so the other parameter(s) have to be pushed and popped manually. Somewhat less simple!

    It's a little tricky, but these two patterns are all serializers need to worry about. It provides full support for nested generic types, eg HashMap<ArrayList<Integer>, ArrayList<String>>, which wasn't possible before.

    Feedback on all this is welcome, but the core of it is relatively complex and may induce a headache (it certainly did when writing it!). The important bits are 1) to handle all generics scenarios, and 2) to minimize work at serialization time. Digging through just the calls for generics made from serializers, you'll find minimal work is done and without allocations. Given this, generics are always enabled.

  • Serializer method signatures have changed for issue #146 in this commit. All Serializer classes need to be changed from Class<T> to Class<? extends T>.

  • Various serialization improvements. Eg, TaggedFieldSerializer now has acceptsNull=true, which saves 1 byte for all non-null objects.

  • Unsafe had permeated the API. IMO it should be sandboxed as much as possible. I began refactoring by removing Unsafe support completely, made everything nice, then put back the input/output streams. I haven't yet looked into what serializers would need to make use of all of the Unsafe features. With Java 9 dropping Unsafe, maybe it is not worth the considerable effort to support it. FWIW, I personally don't use Unsafe but I know others do. I assume some even chose Kryo specifically for Unsafe. I don't know how those people feel about Java 9.

    I have not looked at Java 9 at all yet. I don't know if it provides APIs that Kryo can use to be more efficient.

  • FieldSerializer had gotten messy internally. It did a lot more work than necessary to build the cached fields. It did things with generics that were suspect and did expensive computations that were not even used. It had a lot of Unsafe logic.

  • Various API improvements. The Kryo class must not become a junk drawer for serializer settings. The SerializerFactory classes can be used, eg FieldSerializerFactory, which can create new, configured serializers (example). Having Input/Output varint and varlong be a hint was odd. If this ends up being needed again, maybe depending on what happens with Unsafe, it could be done in Input/Output subclasses without affecting the base class API. Javadocs are much improved.

  • Logging is improved. Some useless junk was logged, some important junk was not logged, and the formatting of log messages was not consistent.

  • Deserialization still temporarily modifies the input buffer. While in many cases this is fine, it can be quite an unexpected gotcha in some cases. I'd like to remedy this but string writing is important and any changes here need extensive benchmarks.
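To make the gotcha in the last bullet concrete, here is a minimal sketch, not Kryo's actual code. It assumes an ASCII string is stored as one byte per character with the high bit of the final byte set as an end marker; the class and method names are made up for illustration. The in-place variant clears that bit, builds the string, and restores the bit, so anything else looking at the same buffer at that moment could see altered bytes. The copying variant never writes to the buffer.

```java
import java.nio.charset.StandardCharsets;

final class AsciiReadSketch {
    /** Encodes ASCII text, flagging the end by setting the high bit of the last byte (assumed format). */
    static byte[] write(String ascii) {
        if (ascii.isEmpty()) throw new IllegalArgumentException("sketch handles non-empty strings only");
        byte[] bytes = ascii.getBytes(StandardCharsets.US_ASCII);
        bytes[bytes.length - 1] |= (byte) 0x80;
        return bytes;
    }

    /** Returns the index just past the first byte (from start) whose high bit is set. */
    static int findEnd(byte[] buffer, int start) {
        int i = start;
        while ((buffer[i] & 0x80) == 0) i++;
        return i + 1;
    }

    /** In-place style: temporarily clears the marker bit in the shared buffer, then restores it. */
    static String readInPlace(byte[] buffer, int start) {
        int end = findEnd(buffer, start);
        buffer[end - 1] &= 0x7F; // the buffer briefly differs from what was written
        String value = new String(buffer, start, end - start, StandardCharsets.US_ASCII);
        buffer[end - 1] |= (byte) 0x80; // restore the marker
        return value;
    }

    /** Copying style: masks each byte into a char array and leaves the buffer untouched. */
    static String readByCopy(byte[] buffer, int start) {
        int end = findEnd(buffer, start);
        char[] chars = new char[end - start];
        for (int i = 0; i < chars.length; i++) chars[i] = (char) (buffer[start + i] & 0x7F);
        return new String(chars);
    }
}
```

Both variants return the same string; the difference is only whether the input buffer is written to along the way.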

I'm sure everyone's first question will be about what happens to data serialized in an older Kryo version. Supporting that is noble for a minor version increment, but the changes above are too extensive for that to work. It would be dirty to have settings to disable free optimizations or enable bugs we've fixed, solely to attempt loading previously serialized bytes. I don't want the difficulties of data migration to prevent the library from evolving and I feel an improvement and maintenance pass like this is long overdue.

It may not make sense to upgrade your projects to a new major version. Kryo has been stable a long time. While it's nice to have as many projects on the latest as possible, you should not feel obligated to upgrade if the benefits do not make it worthwhile, given the pain involved. New projects of course benefit from using the latest version.

Updating Kryo versions when you care about previously serialized bytes has always been hard for all but the most trivial updates. Probably the safest way to do this is to load data with an old version, then write it with the new version. This can be done by using a class loader for the old Kryo classes. Maybe we could provide an example for this. For this to work, obviously the class files must not have changed, only the Kryo version.
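A rough sketch of the class loader idea, using only the JDK. It is not an official Kryo example: `OldKryoLoader` is a hypothetical name, the jars are whatever you pass in, and which package prefix to isolate (for instance `com.esotericsoftware.`) is an assumption you would need to check against the old version's jars. Classes under the isolated prefix are loaded only from the supplied jars, while everything else, including your own data classes, comes from the parent loader so that both Kryo versions work with the same data classes.

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Loads classes under one package prefix from the given jars only; delegates the rest to the parent. */
final class OldKryoLoader extends URLClassLoader {
    private final String isolatedPrefix;

    OldKryoLoader(URL[] oldKryoJars, ClassLoader parent, String isolatedPrefix) {
        super(oldKryoJars, parent);
        this.isolatedPrefix = isolatedPrefix;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Anything outside the isolated prefix uses the normal parent-first lookup.
        if (!name.startsWith(isolatedPrefix)) return super.loadClass(name, resolve);
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) loaded = findClass(name); // never ask the parent for isolated classes
            if (resolve) resolveClass(loaded);
            return loaded;
        }
    }
}
```

With a loader like this, the old Kryo would have to be driven through reflection (starting from something like `Class.forName("com.esotericsoftware.kryo.Kryo", true, loader)`), and the object graph it returns would then be written with the new Kryo on the normal classpath. Any libraries the old Kryo depends on would need to be isolated the same way or be compatible with the versions on the parent classpath.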

Please share your thoughts! Now is the time for us to make any big changes and improvements to both the API and serialized data that wouldn't make sense in a minor version bump. Any places in the API you don't like, are awkward, or confusing? Have any ideas for improving the API? Have any ideas for more efficient serialization?

Cheers,
-Nate

Jan Kotek

Apr 28, 2018, 12:52:26 AM
to kryo-users
Hi Nathan,

Good work!

I have non-recursive graph serialization in Elsa. How would you feel if I merged that into FieldSerializer?

Jan Kotek

Nate

May 1, 2018, 5:30:06 PM
to kryo-users
Hi Jan,

Non-recursive serialization would be nice. Whether it belongs in FieldSerializer depends on how much it complicates the class and whether it affects performance. It may be better suited as a separate serializer, possibly a FieldSerializer subclass.

Cheers,
-Nate


--
You received this message because you are subscribed to the "kryo-users" group.
http://groups.google.com/group/kryo-users
---
You received this message because you are subscribed to the Google Groups "kryo-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kryo-users+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Nate

Jun 5, 2018, 8:29:57 AM
to kryo-users
More work has been done in the kryo-5.0.0-dev branch. I think it's pretty close to finished. Some notes:
  • Reading ASCII strings no longer modifies the buffer. It has the same serialized bytes but copies bytes to chars before creating the string. It's nice to remove this gotcha.
  • Output/Input classes now use little endian everywhere (previously everything was big endian, except var ints/longs).
  • ByteBufferInput/Output no longer rely on the buffer's byte order (removes a bit of juggling, plus asXxxBuffer allocates), the input/output position and the buffer's position are kept in sync, and lots of other clean up.
  • writeInt(int, boolean) for letting the output decide if a fixed or variable length int is written is back. All inputs/outputs have setVariableLengthEncoding for ints and longs. Unsafe buffers don't turn this on by default, but doing so can be much faster (taking just 23% of the time in a very simple test, but of course producing larger output).
  • Unsafe buffers are back. As before, the downside to using these is that the deserializing computer's native byte order must match the serializing computer. With Java 10 and a simple benchmark, using unsafe buffers completes in 59% of the time as Output/Input. If variable encoding is disabled, unsafe buffers complete in just 16% of the time (woo!).
  • FieldSerializer can use Unsafe to read object fields again. Unlike unsafe buffers, using unsafe for this doesn't have any downsides. Currently it uses unsafe if possible, then ReflectASM if possible, then reflection as a last resort. Since it degrades gracefully, I'm not sure users ever need to disable unsafe or ReflectASM (though we might for tests). Are there ever cases where unsafe is worse than ReflectASM or reflection, or ReflectASM is worse than reflection? TBH with Java 10 and a simple benchmark, I don't see much difference between the three.
  • I haven't added the FieldSerializer "memory regions" features back due to this ominous comment:
    https://github.com/EsotericSoftware/kryo/blob/master/src/com/esotericsoftware/kryo/serializers/FieldSerializer.java#L92-L100

    Leo, is this feature actually usable? Is anyone using it?
Thoughts on the above and eyes on the kryo-5.0.0-dev branch would be appreciated.
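For readers unfamiliar with the trade-off behind writeInt(int, boolean) and setVariableLengthEncoding, here is a minimal sketch of fixed versus variable-length int encoding. It is a generic 7-bits-per-byte scheme written for illustration, not a copy of Kryo's Output code, and the class and method names are made up. Small non-negative values take fewer bytes in the variable form, while the fixed form always takes four bytes but needs no per-byte loop.

```java
import java.io.ByteArrayOutputStream;

final class IntEncodingSketch {
    /** Fixed length: always 4 bytes, little endian. */
    static byte[] writeFixed(int value) {
        return new byte[] {(byte) value, (byte) (value >>> 8), (byte) (value >>> 16), (byte) (value >>> 24)};
    }

    /** Variable length: 7 data bits per byte, high bit set while more bytes follow (1 to 5 bytes). */
    static byte[] writeVar(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Reads back a value written by writeVar. */
    static int readVar(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) break;
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int value : new int[] {0, 127, 128, 300, 1 << 28, -1}) {
            System.out.println(value + ": fixed=" + writeFixed(value).length
                + " bytes, var=" + writeVar(value).length + " bytes");
        }
    }
}
```

Note that in this unsigned-style scheme a negative int always takes the full five bytes, which is one reason an API might let the caller hint whether values are expected to be positive.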

Cheers,
-Nate

Martin Grotzke

Jun 5, 2018, 6:02:41 PM
to kryo-...@googlegroups.com
Hi Nate,

many thanks for this enormous effort! 

I wanted to go through the issues to see whether any remaining ones would break compatibility and should therefore be included in 5.0.
I didn't go through all issues one by one but checked some I still had in mind, and it seems you have already fixed these:
- Type registration should be required #398
- fixed API for read #146

One that _might_ affect compatibility is
- Optimizations for common special cases #439
(but I'm not sure how much this brings, maybe you want to check that one?)

Have you gone through open issues already, so that I don't have to do this completely?

Should we have a migration guide in place that tells what needs to be changed in user code? I think we should ;-)

Before finally releasing 5.0 I'd suggest publishing one or two release candidates to get some early feedback.

WDYT? 

Cheers, 
Martin 




Nate

Jun 6, 2018, 7:09:15 PM
to kryo-users
I haven't yet gone through all the issues, so your help there is appreciated.

#439 looks good, I'd like to play with that soon. Now is a good time to think about other similar improvements.

A migration guide would be great!

Agreed about release candidates, we have no need to rush.



Nate

Jun 7, 2018, 8:53:33 PM
to kryo-users
It looks a bit complex, but it's just that we only need to do checks or write bytes in certain cases. It's usually smaller than the old code and only a byte or so larger in a few cases. In other cases the savings can be 1 byte per item.

I'm not sure it's worth iterating all the keys and values in a map to do something similar for maps. Also it's quite a bit of convoluted code and would be doubled for map keys and values. I probably won't bother doing it for maps.

ObjectArraySerializer could see similar treatment, though I'd guess it gets used less often than CollectionSerializer.

Nate

Jun 8, 2018, 3:48:06 AM
to kryo-users
FWIW, I've updated the jvmserializers project. I cleaned up the benchmark, named the runs appropriately, and updated it to Kryo 4.0.2, though I didn't re-run the full test and update the wiki. Here's a very simplistic comparison of Kryo 2.23.0, 4.0.2 and 5.0.0-dev:
http://n4te.com/x/4331-charts.html
This benchmark is pretty janky for timing and doesn't exercise much of Kryo, but the size metrics are right. Nothing stands out as terribly broken.

mongonix

Jun 8, 2018, 5:24:43 PM
to kryo-users


On Tuesday, June 5, 2018 at 5:29:57 AM UTC-7, Nate wrote:
> More work has been done in the kryo-5.0.0-dev branch. I think it's pretty close to finished. Some notes:
>   • Reading ASCII strings no longer modifies the buffer. It has the same serialized bytes but copies bytes to chars before creating the string. It's nice to remove this gotcha.

Very nice!

>   • Output/Input classes now use little endian everywhere (previously everything was big endian, except var ints/longs).
>   • ByteBufferInput/Output no longer rely on the buffer's byte order (removes a bit of juggling, plus asXxxBuffer allocates), the input/output position and the buffer's position are kept in sync, and lots of other clean up.
>   • writeInt(int, boolean) for letting the output decide if a fixed or variable length int is written is back. All inputs/outputs have setVariableLengthEncoding for ints and longs. Unsafe buffers don't turn this on by default, but doing so can be much faster (taking just 23% of the time in a very simple test, but of course producing larger output).
>   • Unsafe buffers are back. As before, the downside to using these is that the deserializing computer's native byte order must match the serializing computer. With Java 10 and a simple benchmark, using unsafe buffers completes in 59% of the time as Output/Input. If variable encoding is disabled, unsafe buffers complete in just 16% of the time (woo!).

I'm glad you changed your mind. I remember you didn't like Unsafe buffers first ;-) But in terms of performance they are hard to beat!

>   • FieldSerializer can use Unsafe to read object fields again. Unlike unsafe buffers, using unsafe for this doesn't have any downsides. Currently it uses unsafe if possible, then ReflectASM if possible, then reflection as a last resort. Since it degrades gracefully, I'm not sure users ever need to disable unsafe or ReflectASM (though we might for tests). Are there ever cases where unsafe is worse than ReflectASM or reflection, or ReflectASM is worse than reflection? TBH with Java 10 and a simple benchmark, I don't see much difference between the three.

No, I'm not aware of anyone really using it. It was more of implementing an optimization that could be potentially useful. So, if you simplified the code and removed it, that's fine.


Also, I really like your clean-ups of the code for handling generics. Great job!

-Leo 

Nate

Jun 8, 2018, 6:05:27 PM
to kryo-users
>   • Unsafe buffers are back. As before, the downside to using these is that the deserializing computer's native byte order must match the serializing computer. With Java 10 and a simple benchmark, using unsafe buffers completes in 59% of the time as Output/Input. If variable encoding is disabled, unsafe buffers complete in just 16% of the time (woo!).
> I'm glad you changed your mind. I remember you didn't like Unsafe buffers first ;-) But in terms of performance they are hard to beat!

True, mostly because in my projects I can't be sure the computers that read the data will be compatible. I saw a big difference using unsafe buffers in a simple test, but interestingly almost no difference with the jvmserializers test data. Of course we know that test data only exercise a tiny part of Kryo.
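The compatibility concern can be seen with plain java.nio: the same int produces different bytes depending on byte order, so data written in one machine's native order is misread when interpreted in the other order. This sketch uses only the JDK's ByteBuffer and is not Kryo's unsafe buffer code; the class name is made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderSketch {
    /** Writes one int using the given byte order and returns the raw bytes. */
    static byte[] write(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    /** Reads one int from raw bytes, interpreting them with the given byte order. */
    static int read(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        byte[] written = write(1, ByteOrder.LITTLE_ENDIAN);
        System.out.println("native order here: " + ByteOrder.nativeOrder());
        System.out.println("read with same order:  " + read(written, ByteOrder.LITTLE_ENDIAN));
        System.out.println("read with other order: " + read(written, ByteOrder.BIG_ENDIAN));
    }
}
```

A format that fixes the byte order (as the regular Output/Input classes do) avoids this at the cost of not always matching the machine's native layout.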
 
> No, I'm not aware of anyone really using it. It was more of implementing an optimization that could be potentially useful. So, if you simplified the code and removed it, that's fine.

OK, let's leave it out then. If it makes sense, we can add it back later.

> Also, I really like your clean-ups of the code for handling generics. Great job!

Thanks! I got some gray hairs doing that. The tests are pretty thorough though, so it should handle all situations and it's pretty light at serialization time.

Cheers,
-Nate

seth/nqzero

Jun 8, 2018, 6:14:24 PM
to kryo-users
> Now is the time for us to make any big changes and improvements to both the ...

i've posted issues for each of these but advocating for them here too:

- thread safe kryo instances
- partial deserialization (or field access) would be a big efficiency win
- async support

kryo pools become very expensive when you have many threads and many different kryo configurations

is reflectASM going to get an update pass ? i have an outstanding pull request there






Nate

Jun 9, 2018, 2:25:53 AM
to kryo-users
Unfortunately none of those things are terribly easy.

- Thread safety is better done outside of Kryo.
- The most common serializer (FieldSerializer) doesn't lend itself to doing partial deserialization. Serializers are pluggable and others could be written to do this.
- I don't see a good way to provide access to only a portion of deserialized objects. If you use your own serializers, you can do your own callbacks, probably specific to your data.

ReflectASM might see some love, but I'm likely to run out of OSS steam very soon and go back into hibernation, aka real work. :(

Cheers,
-Nate

Nate

Jun 9, 2018, 10:16:45 PM
to kryo-users
Here are some results from the JMH benchmarks in kryo-5.0.0-dev:

-f 4 -wi 5 -i 3 -t 2 -w 2s -r 2s
Benchmark                            (chunked)  (references)   Mode  Cnt     Score     Error  Units
FieldSerializerBenchmark.compatible       true          true  thrpt   12  1666.434 ±  37.651  ops/s
FieldSerializerBenchmark.compatible       true         false  thrpt   12  1687.522 ±  37.220  ops/s
FieldSerializerBenchmark.compatible      false          true  thrpt   12  2006.795 ± 150.495  ops/s
FieldSerializerBenchmark.compatible      false         false  thrpt   12  2080.588 ±  23.313  ops/s
FieldSerializerBenchmark.custom            N/A          true  thrpt   12  4254.326 ± 641.937  ops/s
FieldSerializerBenchmark.custom            N/A         false  thrpt   12  3319.830 ± 120.158  ops/s
FieldSerializerBenchmark.field             N/A          true  thrpt   12  2985.312 ±  38.829  ops/s
FieldSerializerBenchmark.field             N/A         false  thrpt   12  3266.215 ± 112.011  ops/s
FieldSerializerBenchmark.tagged           true          true  thrpt   12  2494.667 ± 178.776  ops/s
FieldSerializerBenchmark.tagged           true         false  thrpt   12  2695.727 ±  78.062  ops/s
FieldSerializerBenchmark.tagged          false          true  thrpt   12  3563.448 ±  46.638  ops/s
FieldSerializerBenchmark.tagged          false         false  thrpt   12  3596.008 ±  83.828  ops/s
FieldSerializerBenchmark.version           N/A          true  thrpt   12  3963.666 ±  84.364  ops/s
FieldSerializerBenchmark.version           N/A         false  thrpt   12  3949.319 ± 169.012  ops/s
StringsBenchmark.readAsciiLong             N/A           N/A     ss   12     3.156 ±   0.070   s/op
StringsBenchmark.readString                N/A           N/A     ss   12     1.316 ±   0.015   s/op
StringsBenchmark.readStringLong            N/A           N/A     ss   12     3.876 ±   0.030   s/op
StringsBenchmark.writeAsciiLong            N/A           N/A     ss   12     2.254 ±   0.051   s/op
StringsBenchmark.writeString               N/A           N/A     ss   12     0.755 ±   0.023   s/op
StringsBenchmark.writeStringLong           N/A           N/A     ss   12     2.949 ±   0.134   s/op

For ops/s higher is better. For s/op lower is better. Benchmarking is tricky, so it would be nice to get more eyes on these. Likely the object graph (a single Sample instance) was too small to be meaningful. Some other test POJOs are committed but not wired up yet.

Not sure why tagged would beat field, maybe because TaggedFieldSerializer has no subclasses? I expected from best to worst:
custom, field, version, tagged, compatible

It's strange that custom with references would be higher than without, but then the +/- error for custom with references is very large. Probably the need for a larger test again.
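A quick arithmetic check on the two custom rows, treating each Score ± Error pair as a plain interval. This is only a rough reading aid: the class name is made up, and JMH's error column is a confidence interval half-width rather than a hard bound. The relative error of the references=true row comes out several times larger than that of the references=false row, which fits the suspicion that the test is too small to trust.

```java
final class ScoreIntervalSketch {
    /** Error as a fraction of the score. */
    static double relativeError(double score, double error) {
        return error / score;
    }

    /** True if [scoreA - errorA, scoreA + errorA] and [scoreB - errorB, scoreB + errorB] share any point. */
    static boolean overlaps(double scoreA, double errorA, double scoreB, double errorB) {
        return scoreA - errorA <= scoreB + errorB && scoreB - errorB <= scoreA + errorA;
    }

    public static void main(String[] args) {
        // Values copied from the FieldSerializerBenchmark.custom rows above.
        double withRefs = 4254.326, withRefsError = 641.937;
        double withoutRefs = 3319.830, withoutRefsError = 120.158;
        System.out.printf("relative error with references:    %.1f%%%n",
            100 * relativeError(withRefs, withRefsError));
        System.out.printf("relative error without references: %.1f%%%n",
            100 * relativeError(withoutRefs, withoutRefsError));
        System.out.println("intervals overlap: "
            + overlaps(withRefs, withRefsError, withoutRefs, withoutRefsError));
    }
}
```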

It shows the impact of chunked encoding.

For strings, everything seems pretty OK (after a few recent commits to remove CharSequence). I'd love for it to be faster, but haven't found a way.

Joachim Durchholz

Jun 10, 2018, 4:02:37 AM
to kryo-...@googlegroups.com
> Not sure why tagged would beat field, maybe because
> TaggedFieldSerializer has no subclasses?
AFAIK Hotspot compiles to native depending on call frequency and does
not care much about whether there's a subclass or not.
Details can vary considerably between JVM versions. E.g. for Java 8,
compilation strategy is described in
http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/2b2511bd3cc8/src/share/vm/runtime/advancedThresholdPolicy.hpp#l34
.
https://stackoverflow.com/a/35602023/6944068 mentions a case where
escape analysis happened after a million iterations, long after a lot of
other optimizations were done.

I think one really needs to know what the JIT of the JVM in question is
doing: What mechanisms exist, which of them have kicked in and which
haven't, that kind of stuff.
Lather, rinse, repeat for other JVM versions.
I suspect that a benchmark that aims at a single number per test is at
best futile, at worst misleading (because you start optimizing the wrong
cases).

1. I'd first define some use cases. One that comes to mind is
high-throughput with a significant time spent inside Kryo. Another would
be low-latency. All of them long-running.
2. I'd make time series. What's the metric initially, how does it change
over time? That's going to give the users much more confidence in the
results.
3. Graphs help tremendously in interpreting the numbers. If tests ran
repeatedly, smoke graphs would be helpful, too. ("Smoke graph" is not a
standard term, I mean graphs like those produced by "smoke ping", see
https://www.google.com/imgres?imgurl=https%3A%2F%2Foss.oetiker.ch%2Fsmokeping%2Fdoc%2Freading_detail.png&imgrefurl=http%3A%2F%2Foss.oetiker.ch%2Fsmokeping%2Fdoc%2Freading.en.html&docid=HYqKuuEPsC7HEM&tbnid=AMRykYXB66kdPM%3A&vet=10ahUKEwj5seHZ0MjbAhWKK5oKHbZEBVAQMwg0KAAwAA..i&w=697&h=321&client=ubuntu&bih=906&biw=1541&q=%22smoke%20graph%22&ved=0ahUKEwj5seHZ0MjbAhWKK5oKHbZEBVAQMwg0KAAwAA&iact=mrc&uact=8
for a list of examples.)

Regards,
Jo

Nate

Jun 10, 2018, 5:33:04 AM
to kryo-users
On Sun, Jun 10, 2018 at 10:02 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Not sure why tagged would beat field, maybe because TaggedFieldSerializer has no subclasses?
> AFAIK Hotspot compiles to native depending on call frequency and does not care much about whether there's a subclass or not.

It can devirtualize calls. Like a lot of JIT things, it's complex.
https://shipilev.net/blog/2015/black-magic-method-dispatch/

> Details can vary considerably between JVM versions. E.g. for Java 8, compilation strategy is described in http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/2b2511bd3cc8/src/share/vm/runtime/advancedThresholdPolicy.hpp#l34 .
> https://stackoverflow.com/a/35602023/6944068 mentions a case where escape analysis happened after a million iterations, long after a lot of other optimizations were done.
>
> I think one really needs to know what the JIT of the JVM in question is doing: What mechanisms exist, which of them have kicked in and which haven't, that kind of stuff.
> Lather, rinse, repeat for other JVM versions.

JVM versions can change things for sure. Ideally we at least address any major issues that affect all JVMs.

> I suspect that a benchmark that aims at a single number per test is at best futile, at worst misleading (because you start optimizing the wrong cases).

Agreed, benchmark ops/s (and error) are just numbers. Why they are the way they are needs to be understood (at least a little bit).

> 1. I'd first define some use cases. One that comes to mind is high-throughput with a significant time spent inside Kryo. Another would be low-latency. All of them long-running.

In the FieldSerializerBenchmark ops/s are how many round trip serializations per second. The biggest factor is likely the test data: how deep is the hierarchy, how much data are in the classes, what serializers do they use, etc.

> 2. I'd make time series. What's the metric initially, how does it change over time? That's going to give the users much more confidence in the results.

JMH handles a lot, warm up, etc. It's hard enough to measure something accurately. I'm not sure it's important to track the metric throughout the benchmark.

We can get the data out of JMH with its Java API and easily make charts (eg a Google chart URL).

Joachim Durchholz

Jun 10, 2018, 6:15:16 AM
to kryo-...@googlegroups.com
On 10.06.2018 at 11:32, Nate wrote:
> On Sun, Jun 10, 2018 at 10:02 AM, Joachim Durchholz <j...@durchholz.org> wrote:
>
> Not sure why tagged would beat field, maybe because
> TaggedFieldSerializer has no subclasses?
>
> AFAIK Hotspot compiles to native depending on call frequency and
> does not care much about whether there's a subclass or not.
>
> It can devirtualize calls.

Last time I read up on this (which is a while ago, and probably not 100%
accurate anymore), what they did is this:
* Whenever there's a polymorphic call, count the number of calls for
each target type.
* Continue doing this until the devirtualization threshold is reached.
* Pick the target type with the highest number of calls.
* Replace the call site with the equivalent of this code:
    if (target.getClass() == clazz) {
        // static call to clazz.fn with target and parameters
    } else {
        // normal polymorphic call
    }

It was too dumb to devirtualize more than one target class per call
site. (Or maybe making it smarter would have been too much overhead to
really be worth it, on the average.)
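Written out as ordinary Java, the guarded call described above would look roughly like this. It is only an illustration of the shape of the result: the JIT does this at the machine-code level based on its profile, and the Shape, Circle, and Square types are made up for the example.

```java
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double radius;
    Circle(double radius) { this.radius = radius; }
    public double area() { return Math.PI * radius * radius; }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class GuardedCallSketch {
    /** What the source says: a plain polymorphic call. */
    static double plainCall(Shape target) {
        return target.area();
    }

    /** What the guarded version amounts to when Circle was the most frequent receiver type. */
    static double guardedCall(Shape target) {
        if (target.getClass() == Circle.class) {
            Circle circle = (Circle) target;
            return Math.PI * circle.radius * circle.radius; // body of Circle.area() inlined, no dispatch
        } else {
            return target.area(); // normal polymorphic call for every other type
        }
    }
}
```

Both methods return the same values; the guarded form just trades a cheap class check for the chance to inline the common case.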

> Like a lot of JIT things, it's complex.
> https://shipilev.net/blog/2015/black-magic-method-dispatch/

Interesting page, though he uses instanceof instead of direct
.getClass() calls, which is more expensive.

One thing that might help (or maybe not) could be final classes or
methods. JIT should then know right at class load time what's
polymorphic and what isn't.
Not sure if this information is being used; very little Java code
bothers to set that keyword, and it may not be worth the extra test for
the JVM.

> 1. I'd first define some use cases. One that comes to mind is
> high-throughput with a significant time spent inside Kryo. Another
> would be low-latency. All of them long-running.
>
> In the FieldSerializerBenchmark ops/s are how many round trip
> serializations per second. The biggest factor is likely the test data:
> how deep is the hierarchy, how much data are in the classes, what
> serializers do they use, etc.
>
> 2. I'd make time series. What's the metric initially, how does it
> change over time? That's going to give the users much more
> confidence in the results.
>
> JMH handles a lot, warm up, etc. It's hard enough to measure something
> accurately. I'm not sure it's important to track the metric throughout
> the benchmark.

Users of Kryo will then know whether they need to bother about some
slowness they see.
They will also get an idea of how thorough your benchmarking was. And
from what parts of the graph you are drawing your conclusions.

It will also tell you things about the typical duration of the warm-up
phase, repeatability of test results, and similar stuff.
It's the difference between having individual data points and aggregate
data. Aggregate tends to hide details; if the details contain a
surprise, it's good to investigate more, if the details contain no
surprise, that's of interest to the users of the benchmarked code.
Sweet :-)

Martin Grotzke

Jun 10, 2018, 7:11:45 AM
to kryo-...@googlegroups.com
Re understanding numbers (such as ops/s): I've had good experiences recently with flame graphs produced by async-profiler (https://github.com/jvm-profiling-tools/async-profiler), though I haven't tried that for Kryo.

Cheers, 
Martin 



Nate

Jun 10, 2018, 10:49:45 AM
to kryo-users
I fixed up the FieldSerializer benchmarks (let JMH do the loops). Here's a proper run with Java 10:

-f 4 -wi 5 -i 3 -t 2 -w 2s -r 2s
Benchmark                            (chunked)  (references)   Mode  Cnt        Score        Error  Units
FieldSerializerBenchmark.compatible       true          true  thrpt   12   419567.981 ±  18168.102  ops/s
FieldSerializerBenchmark.compatible       true         false  thrpt   12   426442.897 ±  24351.105  ops/s
FieldSerializerBenchmark.compatible      false          true  thrpt   12   536883.949 ±  16173.512  ops/s
FieldSerializerBenchmark.compatible      false         false  thrpt   12   510899.557 ±  17140.413  ops/s
FieldSerializerBenchmark.custom            N/A          true  thrpt   12  1652782.163 ±  81337.461  ops/s
FieldSerializerBenchmark.custom            N/A         false  thrpt   12  1663496.510 ±  35768.055  ops/s
FieldSerializerBenchmark.field             N/A          true  thrpt   12  1388097.595 ± 114145.319  ops/s
FieldSerializerBenchmark.field             N/A         false  thrpt   12  1428242.840 ±  43040.012  ops/s
FieldSerializerBenchmark.tagged           true          true  thrpt   12   782582.701 ±  24919.783  ops/s
FieldSerializerBenchmark.tagged           true         false  thrpt   12   793086.996 ±  28941.039  ops/s
FieldSerializerBenchmark.tagged          false          true  thrpt   12  1263359.889 ±  28499.285  ops/s
FieldSerializerBenchmark.tagged          false         false  thrpt   12  1255801.036 ±  34290.111  ops/s
FieldSerializerBenchmark.version           N/A          true  thrpt   12  1428134.379 ±  52969.689  ops/s
FieldSerializerBenchmark.version           N/A         false  thrpt   12  1401367.543 ±  31198.617  ops/s


This is more in line with what I expected to see.

Results JSON is here:
http://n4te.com/x/4334-jmh-result.json

Can drop the JSON into this website, though I'm not a fan of how it shows the data. We need to find a better tool or do the charts ourselves.
http://jmh.morethan.io/



Nate

Jun 10, 2018, 10:47:46 PM
to kryo-users
Welp, the benchmarks in my last email still had many problems. I think they are better now (really this time). Also I've added more data classes (the "media" classes from jvmserializers) and some IO benchmarks. Benchmark code is here:
https://github.com/EsotericSoftware/kryo/tree/kryo-5.0.0-dev/benchmarks/src/main/java/com/esotericsoftware/kryo/benchmarks

We now have charts, eg:
There's a simple bash script that runs all the benchmarks and generates the charts:
https://github.com/EsotericSoftware/kryo/blob/kryo-5.0.0-dev/benchmarks/run.sh

Nate

Jun 12, 2018, 3:14:24 PM
to kryo-users