Bencharking issue with b.StartTimer and b.StopTimer. Was: Benchmark using b.StopTimer and b.StartTimer has unexpected behaviour

Orson Cart

unread,

Mar 24, 2020, 4:24:08 PM3/24/20

to golang-nuts

I posted this earlier but I realised that the code had a fundamental error in it. I've corrected here it but the underlying problem still exists.

I've recently started using go test's benchmarks support and I'm particularly interested in understanding the benchmark timer functions. I've been getting results that I found surprising and I was wondering if anyone could explain what's going on here.

The code below has three benchmarks that each invoke a single function (foo). The implementation of foo isn't important, it's just there to consume some time:

- foo is called once per iteration in Benchmark1.

- It's called twice per iteration in Benchmark2 so I'd expect Benchmark2's duration to be nominally twice that of Benchmark1.

- It's also called twice per iteration in Benchmark3 but the first call is wrapped in b.StopTimer and b.startTimer calls. Because of this I'd have expected Benchmark3 to be about the same duration as Benchmark1

Apologies for the length of the example but I didn't think it fair to ask the question and leave anything out.

package demo_test

import (

"strconv"

"testing"

)

var Foo1 []string

var Foo2 []string

var Count int = 32767

func Benchmark1(b *testing.B) {

for i := 0; i < b.N; i++{

Foo1 = foo(Count)

}

func Benchmark2(b *testing.B) {

for i := 0; i < b.N; i++{

Foo1 = foo(Count)

Foo2 = foo(Count)

}

func Benchmark3(b *testing.B) {

for i := 0; i < b.N; i++{

b.StopTimer()

Foo1 = foo(Count)

b.StartTimer()

Foo2 = foo(Count)

}

func foo(count int) []string{

testData := []string{}

for i:= 0; i < count; i++ {

testData = append(testData, strconv.Itoa(i))

}

return testData

}

When the benchmarks are run the results are as follows:

Benchmark1-4 351 3345215 ns/op

Benchmark2-4 166 7206582 ns/op

Benchmark3-4 334 3457907 ns/op

PASS

ok bar.com/benchmarks 6.881s

OK benchmark3 is a little slower than Benchmark1 but that's not what's bothering me. It's this: if I now change Count to something much smaller the results are a surprise, at least to me. Here are the results when Count = 8:

Benchmark1-4 2706196 442 ns/op

Benchmark2-4 1357482 873 ns/op

Benchmark3-4 840729 1387 ns/op

PASS

ok bar.com/benchmarks 23.547s

The ratio of timings for Benchmark1 and Benchmark2 are roughly in line with expectations but I was surprised to see that the timings for Benchmark3 are now larger than those for Benchmark2.

Can anyone explain this?

TIA

Orson

Robert Engels

unread,

Mar 24, 2020, 4:49:55 PM3/24/20

to Orson Cart, golang-nuts

Can you please succinctly explain the problem?

On Mar 24, 2020, at 11:24 AM, Orson Cart <objectiv...@gmail.com> wrote:

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/38f10940-a770-4fc5-86a9-19cc1371152a%40googlegroups.com.

Orson Cart

unread,

Mar 24, 2020, 5:46:44 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 16:49:55 UTC, Robert Engels wrote:

Can you please succinctly explain the problem?

Let's see. Benchmarks can yield incorrect results when b.StopTimer and b.StartTimer are used.

My reasoning:

1/ I have a benchmark that calls a single function. The reported duration is T.

2/ I have another benchmark which invokes the same function twice. I'd expect the reported duration for this to be nominally 2T and within reason it appears to be the case.

3/ I have yet another benchmark which also invokes the same function twice but it stops the benchmark timer before the first call and then starts it after the the first call has completed. I'd expect the reported duration to be nominally T again. It isn't. It's closer to 3T on my hardware and I've seen much worse correlation on other hardware - almost 5T

Now it's entirely possible that I'm misunderstanding something about how b.StopTimer and b.StartTimer are intended to be used, hence my post here :)

To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Michael Jones

unread,

Mar 24, 2020, 6:31:29 PM3/24/20

to Orson Cart, golang-nuts

You use them to stop the time charged against your benchmark.

For example:

bench:

stop timer

generate initial data

start timer

do test

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/1d291cef-420a-4f80-b719-e0654585c25d%40googlegroups.com.

--

Michael T. Jones
michae...@gmail.com

Orson Cart

unread,

Mar 24, 2020, 6:48:17 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 18:31:29 UTC, Michael Jones wrote:

You use them to stop the time charged against your benchmark.

For example:

bench:
stop timer
generate initial data
start timer
do test

Thanks Michael. What if the initial data has to be generated from scratch on each iteration? Is the following expected to work?

bench:

b.N Loop

stop timer

generate new data

start timer

do test

To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/1d291cef-420a-4f80-b719-e0654585c25d%40googlegroups.com.

Michael Jones

unread,

Mar 24, 2020, 6:50:32 PM3/24/20

to Orson Cart, golang-nuts

yes

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/c3fcc6c5-9aa2-489c-adc4-c6b070ea6617%40googlegroups.com.

Orson Cart

unread,

Mar 24, 2020, 7:24:05 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 18:50:32 UTC, Michael Jones wrote:

yes

...and that is exactly the scenario that I believe is yielding inconsistent results:

Given the following function to be benchmarked:

func foo() []string{

testData := []string{}

for i:= 0; i < 8; i++ {

testData = append(testData, strconv.Itoa(i))

}

return testData

}

On my hardware this takes either 454 ns/op or 1322 ns/op depending on how it's called:

1st case:

func Benchmark(b *testing.B) {

for i := 0; i < b.N; i++{

Foo1 = foo() // this reports 454 ns/op on my hardware

}

2nd case:

func Benchmark(b *testing.B) {

for i := 0; i < b.N; i++{

b.StopTimer()

Foo1 = foo() // This shouldn't be charged against the benchmark. Ignore the fact that it's a call to foo, I'm just using it as a means of elapsing some time

b.StartTimer()

Foo2 = foo() // this reports a whopping 1322 ns/op on my hardware

}

So a call to foo takes over three times more time if *within the loop* we stop the timer, do something that takes time and then start the timer before calling foo. That's definitely violating the law of least surprise for me.

For StopTimer and StartTimer to be of any use I have to believe that they actually do stop and start the timer reliably. From those timings I'd say that this isn't the case.

As always I may be wrong. I'd just like someone to explain how and I'll be a happy chappie :)

To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/c3fcc6c5-9aa2-489c-adc4-c6b070ea6617%40googlegroups.com.

Jake Montgomery

unread,

Mar 24, 2020, 7:51:24 PM3/24/20

to golang-nuts

Strange. I hope someone has a real answer for you.

In the meantime, you can simplify your example to demonstrate the issue:

package demo_test

import (


    "testing"
)

var Foo1 []string

var Count int = 8



func Benchmark1(b *testing.B) {
    for i := 0; i < b.N; i++ {
        Foo1 = foo(Count)
    }
}

func Benchmark2(b *testing.B) {


    b.StopTimer()


    for i := 0; i < b.N; i++ {


        // Hypothetical setup here 
        b.StartTimer()
        Foo1 = foo(Count)
        b.StopTimer()


    }
}


func foo(count int) []string {
    testData := []string{}
    for i := 0; i < count; i++ {


        testData = append(testData, "a")
    }

    return testData
}

I get:

goos: windows
goarch: amd64
Benchmark1-4     2101567               584 ns/op
Benchmark2-4     1000000              1668 ns/op
PASS

So it appears that StopTimer() and StartTimer() are introducing some overhead. I am surprised that it is this large.

Good Luck

Orson Cart

unread,

Mar 24, 2020, 8:01:17 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 19:51:24 UTC, Jake Montgomery wrote:

Strange. I hope someone has a real answer for you.

In the meantime, you can simplify your example to demonstrate the issue:

Thanks for taking a look at the code Jake. Just one question: you have the call to b.StopTimer outside of the loop and the call to b.StartTimer inside. Was that intentional?

I'd have thought that the call to b.StartTimer would be a no-op after the first iteration.

Jake Montgomery

unread,

Mar 24, 2020, 8:16:21 PM3/24/20

to golang-nuts

I took a quick glance at the code, and I think I figured it out. The benchmark code, including StartTimer() and StopTimer() use time.Now(). On my windows machine time.Now() has an accuracy of 1ms. So by calling it for every iteration, when the actual time is 600ns just makes the whole thing wildly inaccurate. That 1ms accuracy is ok when the benchmark calls it once then runs the code 2 million times. But when called each time the accuracy of the time.Now() code becomes dominant.

It is just an educated guess, but that is what it looks like to me.

Perhaps there should be some sort of warning in the docs?

Sean Liao

unread,

Mar 24, 2020, 8:26:10 PM3/24/20

to golang-nuts

see https://github.com/golang/go/issues/27217

Adrian Ratnapala

unread,

Mar 24, 2020, 8:27:39 PM3/24/20

to Jake Montgomery, golang-nuts

Between foo() exiting and the hardware timer being read, there is

* The call to StopTimer()
* A conditoinal branch
* A call to time.Since
* Another conditional
* A call to either Now() or runtimenano()

Now calls might get inlined, and the branches get predicted, but I
doubt the overhead of all that is zero and, it can be significant
compared to a very fast `foo()` function. Also, after Start/Stop
timer is called, there's a call to runtime.ReadMemStats() which does
nontrivial stuff and could thrash the caches, slowing down the actual
benchmark code.

The docs for StopTimer say:

> StopTimer stops timing a test. This can be used to pause the timer while performing complex initialization that you don't want to measure.

So that sounds like the use-case is to call Stop-, StartTimer once
before you enter the main loop of the benchmark. They not efficient
enough for a tight inner loop.

> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.

> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/feda7e38-5d1f-43cf-b0cd-98db0a94d3c9%40googlegroups.com.

--
Adrian Ratnapala

Orson Cart

unread,

Mar 24, 2020, 8:32:27 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 20:26:10 UTC, Sean Liao wrote:

see https://github.com/golang/go/issues/27217

Thanks, will do.

Orson Cart

unread,

Mar 24, 2020, 8:40:18 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 20:27:39 UTC, Adrian Ratnapala wrote:

...

So that sounds like the use-case is to call Stop-, StartTimer once
before you enter the main loop of the benchmark. They not efficient
enough for a tight inner loop.

Thanks for the input and I think that you are probably correct. It would be nice if the docs said that though.

The code that I'm profiling wasn't written by me. It's fast but it modifies its input data so it needs new data on each iteration. The implementation is non-trivial and it might be a maintenance headache. I've been asked to benchmark it along with less complicated implementations. It was whilst trying to set up per-iteration test data that I came across this issue.

> To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Orson Cart

unread,

Mar 24, 2020, 8:46:24 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 20:16:21 UTC, Jake Montgomery wrote:

I took a quick glance at the code, and I think I figured it out. The benchmark code, including StartTimer() and StopTimer() use time.Now(). On my windows machine time.Now() has an accuracy of 1ms. So by calling it for every iteration, when the actual time is 600ns just makes the whole thing wildly inaccurate. That 1ms accuracy is ok when the benchmark calls it once then runs the code 2 million times. But when called each time the accuracy of the time.Now() code becomes dominant.

It is just an educated guess, but that is what it looks like to me.

Thanks once again. Others on the thread have pointed out more potential issues. I feel a bit like a carpenter who has to measure a piece of wood but who only has a rubber rule :D

Robert Engels

unread,

Mar 24, 2020, 8:47:07 PM3/24/20

to Orson Cart, golang-nuts

One way to handle this is to generate all of the data up front in an array and then just index into the array based on the run.

On Mar 24, 2020, at 3:42 PM, Orson Cart <objectiv...@gmail.com> wrote:

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/291af5de-8a01-4c30-98bb-d4765e3a5333%40googlegroups.com.

Orson Cart

unread,

Mar 24, 2020, 8:56:19 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 20:47:07 UTC, Robert Engels wrote:

One way to handle this is to generate all of the data up front in an array and then just index into the array based on the run.

Yeah, I had thought of that before posting but then I'd have to decide on a value for b.N. I was trying to roll with the idea of the framework establishing a value. To be honest if it wasn't for the fact that I was just curious as to why it wasn't working I'd just have settled for an array with a fixed b.N.

All part of my golang learning experience

To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/291af5de-8a01-4c30-98bb-d4765e3a5333%40googlegroups.com.

Sean Liao

unread,

Mar 24, 2020, 9:00:38 PM3/24/20

to golang-nuts

setup upfront shouldn't be a problem

b.N is constant throughout the function lifetime, the entire function is called multiple times during benchmarking

Orson Cart

unread,

Mar 24, 2020, 9:10:09 PM3/24/20

to golang-nuts

On Tuesday, 24 March 2020 21:00:38 UTC, Sean Liao wrote:

setup upfront shouldn't be a problem
b.N is constant throughout the function lifetime, the entire function is called multiple times during benchmarking

Thank you! I'd missed that. That makes things so much more straightforward for me.

Robert Engels

unread,

Mar 24, 2020, 10:15:26 PM3/24/20

to Orson Cart, golang-nuts

Just an FYI - this is by no means Go related. These problems and techniques are very common when doing HPC micro benchmarking.

On Mar 24, 2020, at 4:10 PM, Orson Cart <objectiv...@gmail.com> wrote:

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/8cf5eec8-9cb9-47ed-b444-e8a689d403ff%40googlegroups.com.

Reply all

Reply to author

Forward

Bencharking issue with b.StartTimer and b.StopTimer. Was: Benchmark using b.StopTimer and b.StartTimer has unexpected behaviour - to me at least

Orson Cart

Robert Engels

Orson Cart

Michael Jones

Orson Cart

Michael Jones

Orson Cart

Jake Montgomery

Orson Cart

Jake Montgomery

Sean Liao

Adrian Ratnapala

Orson Cart

Orson Cart

Orson Cart

Robert Engels

Orson Cart

Sean Liao

Orson Cart

Robert Engels