Help find problems with my LongAdder benchmark


Elazar Leibovich

May 7, 2014, 5:13:04 AM
to mechanica...@googlegroups.com
Hi,

I'm trying to get my hands dirty with Java micro-benchmarking, so I wrote a small benchmark comparing the performance of Java 8's LongAdder to a plain AtomicLong.

I wanted to show that LongAdder is good for a write-heavy workload, but falls short if you mostly read the counter.

I'm not well versed in micro-benchmarking techniques, so I probably made some embarrassing measurement errors. My main technique is pretty primitive: run the benchmark 100 times and take the minimal result. I verified that the JIT had kicked in by using -XX:+PrintCompilation, and I measured duration with System.nanoTime(). Of course, the benchmarks themselves ran without JIT logging, and with -server.

I tried to avoid doing too much work in the hot loop by using only a small amount of randomization per benchmark, hoping the total amount of irrelevant work would not be too high, yet that the input would still be random enough to give meaningful results.

I also took a look at Caliper's FAQ; it looks like running multiple times and picking the minimum addresses the problems mentioned there.

The end results are as expected:

for  1% writes=16,000 (16 threads * 100,000 ops each=1,600,000):
  LongAdder  1.518ms
  AtomicLong 0.319ms
AtomicLong faster than LongAdder by 78.99%

for 90% writes=1,440,000 (16 threads * 100,000 ops each=1,600,000):
  LongAdder  3.258ms
  AtomicLong 28.892ms
LongAdder faster than AtomicLong by 786.80%

Here's the code; I'd be happy to hear about any faults, or anything I could do better.
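In outline, the harness does something like this (a simplified sketch with illustrative names, not the exact code):

```java
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

public class AdderBench {
    static final int THREADS = 16;
    static final int OPS = 100_000;

    // Runs THREADS threads, each doing OPS operations; roughly writeRatio of
    // them call write, the rest call read. Returns the elapsed nanoseconds.
    // (Timing starts when the barrier releases, so thread start/finish edge
    // effects still leak into the measurement.)
    static long run(double writeRatio, Runnable write, Runnable read) throws Exception {
        CyclicBarrier start = new CyclicBarrier(THREADS + 1);
        Thread[] ts = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            ts[i] = new Thread(() -> {
                try { start.await(); } catch (Exception e) { throw new RuntimeException(e); }
                ThreadLocalRandom rnd = ThreadLocalRandom.current();
                for (int op = 0; op < OPS; op++) {
                    if (rnd.nextDouble() < writeRatio) write.run(); else read.run();
                }
            });
            ts[i].start();
        }
        start.await();                     // release all worker threads
        long t0 = System.nanoTime();
        for (Thread t : ts) t.join();
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) throws Exception {
        LongAdder adder = new LongAdder();
        AtomicLong atomic = new AtomicLong();
        long bestAdder = Long.MAX_VALUE, bestAtomic = Long.MAX_VALUE;
        for (int i = 0; i < 100; i++) {    // repeat and keep the minimum
            adder.reset();
            atomic.set(0);
            bestAdder = Math.min(bestAdder, run(0.9, adder::increment, adder::sum));
            bestAtomic = Math.min(bestAtomic, run(0.9, atomic::incrementAndGet, atomic::get));
        }
        System.out.printf("90%% writes: LongAdder %.3f ms, AtomicLong %.3f ms%n",
                bestAdder / 1e6, bestAtomic / 1e6);
    }
}
```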


Thanks,

Martin Thompson

May 7, 2014, 5:25:06 AM
to mechanica...@googlegroups.com
In addition to Caliper, it is worth getting familiar with JMH. Aleksey and the JMH team have worked really hard to help us all avoid the common micro-benchmarking mistakes. It is becoming the "benchmark" in micro-benchmarking, if you'll pardon the pun.

Aleksey Shipilev

May 7, 2014, 5:30:30 AM
to mechanical-sympathy


--
You received this message because you are subscribed to the Google Groups "mechanical-sympathy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mechanical-symp...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Richard Warburton

May 7, 2014, 6:19:18 AM
to mechanica...@googlegroups.com
Hi,

I wanted to show that LongAdder is good for a write-heavy workload, but falls short if you mostly read the counter.

I'm sure others will comment on the technical aspects, but I would take issue with statements like this. If you're benchmarking something, you're trying to understand its performance characteristics. If you expect a certain set of characteristics going in, it's likely to result in you accidentally biasing your benchmarking efforts. I know we all do this to some extent or other - everyone is human and everyone has their biases - but it's still something to try to minimise.

Of course I might be reading too much into a throwaway statement, but the point still stands ;)

kind regards,

  Richard Warburton

Elazar Leibovich

May 7, 2014, 6:34:57 AM
to mechanica...@googlegroups.com
Thanks,

Of course, what I meant to say was "I wanted to check my hypothesis". That said, I might of course be biased.

Also, remember that my aim was more to understand benchmarking pitfalls than to test a specific scenario.



Elazar Leibovich

May 7, 2014, 6:51:35 AM
to mechanica...@googlegroups.com
Thanks,

I'm trying to do this without a framework, for educational purposes. It's more important for me to understand the problems in the benchmark than to use a framework and remain unaware of a few of its pitfalls.


Martin Thompson

May 7, 2014, 7:00:13 AM
to mechanica...@googlegroups.com
I think it is admirable to want to understand more, and if you think this is your future, you should start studying how the likes of JMH, Caliper, managed runtimes, operating systems, hardware, etc. all work. This has been a passion of mine over the years. I can tell you from personal experience that, given everything else I need to do in my job, it is not possible to reach the level of understanding that the likes of Gil and Aleksey have achieved without dedicating yourself to it.

I guess what I'm saying is that this is one subject with a very deep rabbit hole. If going down it is for you, then best of luck. I think there is also huge value in building on the great work they have done and learning how to apply it in a larger context to amplify the benefits.

Just my 2 pence worth.

Aleksey Shipilev

May 7, 2014, 7:35:46 AM
to mechanical-sympathy
What Martin said.

If you want to learn more about benchmarking, try to digest this first:

In your particular benchmarks, there are evident pitfalls:

 * Calculating "min" across all invocations is meaningless, since you are just fishing for the lowest value in the probability distribution of the performance metric you are measuring;

 * Edge effects with starting/finishing the threads are probably messing with your measurement; relevant sample: http://hg.openjdk.java.net/code-tools/jmh/file/1815ebfef9f4/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_17_SyncIterations.java

 * DCE in adder.sum(), adder.get(), and add.addAndGet() can eliminate parts of the benchmark; relevant sample: http://hg.openjdk.java.net/code-tools/jmh/file/1815ebfef9f4/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_08_DeadCode.java

 * Loop optimizations in (read|write)(atomic|Adder) probably make them pairwise incomparable; relevant sample: http://hg.openjdk.java.net/code-tools/jmh/file/1815ebfef9f4/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_11_Loops.java

 * Doing everything in a single VM: while it arguably estimates the worst-case scenario of polluted profiles, the very first test is in advantageous position against others; relevant sample: http://hg.openjdk.java.net/code-tools/jmh/file/1815ebfef9f4/jmh-samples/src/main/java/org/openjdk/jmh/samples/JMHSample_12_Forking.java
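To make the DCE point concrete, here is a contrived sketch (not code from the benchmark in question): a read whose result nothing ever uses is a dead-code candidate, so a read loop should fold its results into something observably live.

```java
import java.util.concurrent.atomic.LongAdder;

public class ReadLoop {
    // BAD: nothing consumes sum(), so the JIT is free to eliminate the call
    // (and possibly the whole loop), making the timing meaningless.
    static long readDiscarding(LongAdder adder, int ops) {
        long t0 = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            adder.sum();  // dead-code candidate
        }
        return System.nanoTime() - t0;
    }

    // BETTER: fold every result into an accumulator and publish it, so the
    // reads are observably live. (JMH's Blackhole does this more robustly.)
    static long readConsuming(LongAdder adder, int ops, long[] sink) {
        long acc = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            acc += adder.sum();
        }
        long elapsed = System.nanoTime() - t0;
        sink[0] = acc;  // escape hatch: the accumulated value is published
        return elapsed;
    }
}
```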

My dollar against Martin's 2 pences! ;)

-Aleksey.

