Thrashing cache between benchmark invocations


Jimmy Jia

May 2, 2014, 10:51:40 AM
to mechanica...@googlegroups.com
Hi,

One of the biggest problems I have when tuning my code with microbenchmarks is that the microbenchmark almost always touches significantly less memory than the full, running process. This gives me misleadingly optimistic performance numbers: accesses that would be last-level cache (LLC) misses in practice end up warm in cache in my microbenchmarks.

Without having thought about it more deeply, it seems to me that one way to address this would be to thrash the cache between invocations of the code I am benchmarking. This could give me a pessimistic estimate from a cache-hit perspective, rather than the optimistic estimate I currently get.

Is there built-in support for this in either JMH or Caliper? At a glance it looked like no - is that because what I am proposing is a bad/impractical idea?

Thanks,
Jimmy

Aleksey Shipilev

May 2, 2014, 11:07:51 AM
to mechanical-sympathy
I don't know about Caliper, but in JMH you can have a @Setup method which trashes the cache. There are multiple caveats, though:
  - doing this before each benchmark invocation forces us to timestamp each invocation in order to separate the @Setup time from the benchmark itself, which is very bad from an overhead and omission standpoint; this is exacerbated by shorter benchmarks.
  - a long-running benchmark will hydrate the caches back rather quickly (back-of-the-envelope calculation: 100 ns per LLC miss, 64 bytes per miss, so it takes ~1.6 ms to fully hydrate a 1 MB LLC); this is exacerbated by longer benchmarks.
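
The back-of-the-envelope figure in the second caveat can be reproduced with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and the inputs (100 ns per miss, 64-byte lines, a 1 MB LLC) are the rough assumptions from the estimate, not measurements.

```java
import java.util.Locale;

/** Hypothetical helper: estimates how long it takes to refill a cold cache. */
final class RefillEstimate {

    /**
     * Time to refill the cache if every line costs one miss of the given
     * latency and the misses do not overlap (a deliberately crude model).
     */
    static double refillMillis(long cacheBytes, int lineBytes, double missNanos) {
        long lines = cacheBytes / lineBytes;     // number of cache lines to load
        return lines * missNanos / 1_000_000.0;  // nanoseconds -> milliseconds
    }

    public static void main(String[] args) {
        // 1 MB LLC, 64-byte lines, 100 ns per miss: 16384 lines, about 1.64 ms.
        double ms = refillMillis(1L << 20, 64, 100.0);
        System.out.println(String.format(Locale.ROOT, "%.2f ms", ms));
    }
}
```

A benchmark invocation that runs much longer than this estimate spends most of its time with the cache warm again, which is the point of the caveat.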

So, both caveats matter in different conditions, and I don't think it is easy to find the sweet spot between them.
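
For reference, the thrashing step itself might look something like the sketch below. It assumes a 64-byte cache line and a buffer several times larger than the LLC; `CacheThrasher` and the 64 MB size are hypothetical choices for illustration and are not part of JMH. In JMH it could be called from a method annotated with `@Setup(Level.Invocation)`, and both caveats above still apply.

```java
/**
 * Hypothetical cache "thrasher": writes one byte per cache line across a
 * buffer that is assumed to be much larger than the last-level cache,
 * so that previously cached data is likely to be evicted.
 */
final class CacheThrasher {
    // Assumed cache-line size in bytes; 64 is common but not guaranteed.
    static final int CACHE_LINE_BYTES = 64;

    private final byte[] buffer;

    CacheThrasher(int bufferBytes) {
        if (bufferBytes < CACHE_LINE_BYTES) {
            throw new IllegalArgumentException("buffer smaller than one cache line");
        }
        this.buffer = new byte[bufferBytes];
    }

    /** Touches every cache line once and returns the number of lines touched. */
    int thrash() {
        int touched = 0;
        for (int i = 0; i < buffer.length; i += CACHE_LINE_BYTES) {
            buffer[i]++;   // a write, so the JIT cannot drop the access
            touched++;
        }
        return touched;
    }

    public static void main(String[] args) {
        // 64 MB is an arbitrary size chosen to exceed typical LLC capacities.
        CacheThrasher thrasher = new CacheThrasher(64 * 1024 * 1024);
        System.out.println("cache lines touched: " + thrasher.thrash());
    }
}
```

A sequential walk like this is friendly to the hardware prefetcher, so it is fairly cheap, but it gives no guarantee about what actually gets evicted on a given machine.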

-Aleksey.

P.S. Oh yes, you can have a "spoiler" concurrent thread which will trash the cache, but I fail to see how one can use this technique without obliterating the reliability of the experiment.
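
A minimal sketch of such a "spoiler" thread, in plain Java, is shown below. The class name, buffer size, and 64-byte line size are assumptions made for illustration. The reliability concern still stands: the spoiler competes with the benchmark for cache and memory bandwidth in a way that is not controlled.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical "spoiler": a daemon thread that keeps walking a large buffer
 * until closed, so that it continually displaces other data from the cache.
 */
final class CacheSpoiler implements AutoCloseable {
    private static final int CACHE_LINE_BYTES = 64;  // assumed line size

    private final byte[] buffer;
    private final AtomicLong passes = new AtomicLong();
    private volatile boolean running = true;
    private final Thread thread;

    CacheSpoiler(int bufferBytes) {
        this.buffer = new byte[bufferBytes];
        this.thread = new Thread(this::run, "cache-spoiler");
        this.thread.setDaemon(true);   // do not keep the JVM alive
        this.thread.start();
    }

    private void run() {
        while (running) {
            for (int i = 0; i < buffer.length; i += CACHE_LINE_BYTES) {
                buffer[i]++;           // touch one byte per cache line
            }
            passes.incrementAndGet();  // count completed sweeps of the buffer
        }
    }

    /** Number of full sweeps over the buffer completed so far. */
    long passes() {
        return passes.get();
    }

    /** Stops the spoiler and waits for its thread to finish. */
    @Override
    public void close() {
        running = false;
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

It could be wrapped around a measurement with try-with-resources, for example `try (CacheSpoiler s = new CacheSpoiler(64 * 1024 * 1024)) { ... }`.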



William Louth

May 2, 2014, 11:58:34 AM
to mechanica...@googlegroups.com
I should really write up a post on this, but what I've used for many years is hoisting the micro-benchmark code up into one or more applications (small, large, and "enterprise"). You can do this with AOP or, in my case, I piggyback on the instrumentation added to the host application by the javaagent I use to meter applications. This benefits me in that the metering engine has various runtime measurement strategies, such as sampling, that I can use to control when and where the actual micro-benchmark code gets executed and, in turn, measured. Of course there is noise, but that is the point. The trick is separating out the different sources of noise. Recently I started using a simulated playback technology to hoist the code up, with the added benefit that playback can be repeated in a near-identical sequence. For library development this approach seems very natural, if not obvious.