Microbenchmarking very quick operations

Joseph Lust

Jul 31, 2016, 11:21:09 PM
to ScalaMeter
I'm trying to compare (see GitHub example) the irem and frem bytecode instructions against the bitwise-AND modulo trick (valid for non-negative dividends and power-of-2 divisors).
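
For reference, a minimal sketch of the three variants being compared. The object and method names are made up for illustration and are not taken from the GitHub example; the sketch assumes Int operands for irem and Float operands for frem (Double operands would compile to drem instead).

```scala
object ModuloTrick {
  // Integer remainder; compiles to the irem bytecode instruction.
  def remInt(x: Int, d: Int): Int = x % d

  // Float remainder; compiles to frem (Double operands would use drem).
  def remFloat(x: Float, d: Float): Float = x % d

  // Bitwise-AND variant: only equivalent when x >= 0 and d is a power of two.
  def remMask(x: Int, d: Int): Int = x & (d - 1)

  def main(args: Array[String]): Unit = {
    val d = 8
    // The two integer variants agree for every non-negative dividend tried here.
    val agree = (0 to 1000).forall(x => remInt(x, d) == remMask(x, d))
    println(s"agree for 0..1000 with d = $d: $agree")
    // For a negative dividend they differ: -3 % 8 is -3, but -3 & 7 is 5.
    println(s"${remInt(-3, d)} vs ${remMask(-3, d)}")
  }
}
```

The mask version is only a drop-in replacement under the stated restrictions, which is why the comparison is limited to non-negative dividends.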

However, when I run this with ScalaMeter, the more iterations (see repsPerRun) I dial in, the faster it runs. I fear that the JIT eventually optimizes this away. By comparison, I see about 1-6ns runtimes when I do it in a very simple loop with no frameworks or JVM warmup (see gist).

Do you have any tips for timing very short operations with ScalaMeter, or for preventing JIT code elimination? When I give the method a side effect of some kind, the running time does not appear to change. The best I've managed so far is ~1000 runs in a loop, and at most 10 items in the Gen.Range, before the JIT appears to remove the code.


Thanks,
Joe

Aleksandar Prokopec

Aug 13, 2016, 6:55:47 AM
to Joseph Lust, ScalaMeter
Hi Joe!

Yes, it's almost certain that any JIT compiler will optimize away such operations.

At the moment, the ways of communicating with the JIT compiler are limited and not standardized.
To prevent such optimizations from happening, you should assign the result of your computation to a volatile variable.
For example, you could have a loop like this:

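  // Volatile sink, so that the stores are not treated as dead code.
  @volatile var dummy: Double = 0.0

  // divisor and repsPerRun are the values defined in your benchmark.
  def doBunchDouble(v: Double): Unit = {
    var i = 0
    val divDouble = divisor.toDouble
    while (i < repsPerRun) {
      dummy = v % divDouble
      i += 1
    }
  }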

However, the modulo itself will still almost certainly be optimized away: `v % divDouble` does not change between iterations, so it will be hoisted out of the loop, leaving only the volatile writes to be timed.
This is a standard optimization, and you can count on the JIT doing it.
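
As a rough illustration of what hoisting means here, the sketch below writes out by hand the transformation the JIT is free to apply. The object name, method names, and explicit parameters are hypothetical; the actual machine code the JIT generates will differ.

```scala
object HoistingSketch {
  @volatile var dummy: Double = 0.0

  // As written: the modulo appears inside the loop body.
  def asWritten(v: Double, divDouble: Double, reps: Int): Unit = {
    var i = 0
    while (i < reps) {
      dummy = v % divDouble
      i += 1
    }
  }

  // Hand-hoisted equivalent: one modulo, then `reps` volatile stores.
  def afterHoisting(v: Double, divDouble: Double, reps: Int): Unit = {
    val r = v % divDouble // loop-invariant, so computed once outside the loop
    var i = 0
    while (i < reps) {
      dummy = r
      i += 1
    }
  }
}
```

Both methods leave the same value in `dummy`, but the second one performs a single modulo no matter how large `reps` is, so timing it says almost nothing about the cost of the modulo.
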
To prevent the hoisting, you need to use the loop index in the expression:

    while (i < repsPerRun) {
        dummy = i % divDouble
        i += 1
    }

If the volatile assignment is costly compared to a modulo operation, it might make sense to perform several modulo operations per loop iteration, so that the cost of the store is amortized:

    while (i < repsPerRun) {
        val tmp = i % divDouble
        dummy = tmp % divDouble
        i += 1
    }

Even better, you might want to keep an accumulator variable and assign it to the volatile only once, after the loop.
With luck, the JIT will not be clever enough with arithmetic optimizations to simplify anything there, and the accumulator's value should stay in a register throughout:

    var x = 0.0
    while (i < repsPerRun) {
        val tmp = i % divDouble
        x += tmp % divDouble
        i += 1
    }
    dummy = x
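
Put together, the accumulator version might look like the self-contained sketch below. It times the loop with plain `System.nanoTime` rather than ScalaMeter, and the object name, the `divisor` value, the repetition count, and the warm-up count are all assumptions chosen for illustration.

```scala
object ModuloBench {
  // Volatile sink for the final result, so the loop is not dead code.
  @volatile var dummy: Double = 0.0

  // Hypothetical divisor; the thread does not show the value actually used.
  val divisor: Int = 8

  // Runs `reps` iterations with two modulo operations each and returns the sum.
  def doBunchDouble(reps: Int): Double = {
    val divDouble = divisor.toDouble
    var i = 0
    var x = 0.0
    while (i < reps) {
      val tmp = i % divDouble // depends on the loop index, so it cannot be hoisted
      x += tmp % divDouble
      i += 1
    }
    dummy = x // single volatile store after the loop
    x
  }

  def main(args: Array[String]): Unit = {
    val reps = 1000000
    // Crude warm-up so the loop is likely JIT-compiled before it is timed.
    var w = 0
    while (w < 20) { doBunchDouble(reps); w += 1 }
    val start = System.nanoTime()
    doBunchDouble(reps)
    val elapsed = System.nanoTime() - start
    println(f"${elapsed.toDouble / reps}%.2f ns per iteration (two modulo operations each)")
  }
}
```

Returning the sum as well as storing it makes it easy to check that the loop computes what it should; the reported figure still includes loop overhead and the floating-point addition, so it is an upper bound rather than the cost of the modulo alone.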

Hope this helps!
Alex

