Bencharking issue with b.StartTimer and b.StopTimer. Was: Benchmark using b.StopTimer and b.StartTimer has unexpected behaviour - to me at least

84 views
Skip to first unread message

Orson Cart

unread,
Mar 24, 2020, 4:24:08 PM3/24/20
to golang-nuts
I posted this earlier but I realised that the code had a fundamental error in it. I've corrected here it but the underlying problem still exists.

I've recently started using go test's benchmarks support and I'm particularly interested in understanding the benchmark timer functions. I've been getting results that I found surprising and I was wondering if anyone could explain what's going on here.

The code below has three benchmarks that each invoke a single function (foo). The implementation of foo isn't important, it's just there to consume some time:
- foo is called once per iteration in Benchmark1.
- It's called twice per iteration in Benchmark2 so I'd expect Benchmark2's duration to be nominally twice that of Benchmark1.
- It's also called twice per iteration in Benchmark3 but the first call is wrapped in b.StopTimer and b.startTimer calls. Because of this I'd have expected Benchmark3 to be about the same duration as Benchmark1

Apologies for the length of the example but I didn't think it fair to ask the question and leave anything out.

package demo_test

import (
"strconv"
"testing"
)

var Foo1 []string
var Foo2 []string
var Count int = 32767

func Benchmark1(b *testing.B) {
for i := 0; i < b.N; i++{
Foo1 = foo(Count)
}
}

func Benchmark2(b *testing.B) {
for i := 0; i < b.N; i++{
Foo1 = foo(Count)
Foo2 = foo(Count)
}
}

func Benchmark3(b *testing.B) {
for i := 0; i < b.N; i++{
b.StopTimer()
Foo1 = foo(Count)
b.StartTimer()
Foo2 = foo(Count)
}
}

func foo(count int) []string{
testData := []string{}
for i:= 0; i < count; i++ {
testData = append(testData, strconv.Itoa(i))
}

return testData
}


When the benchmarks are run the results are as follows:

Benchmark1-4         351           3345215 ns/op
Benchmark2-4         166           7206582 ns/op
Benchmark3-4         334           3457907 ns/op
PASS
ok      bar.com/benchmarks      6.881s

OK benchmark3 is a little slower than Benchmark1 but that's not what's bothering me. It's this: if I now change Count to something much smaller the results are a surprise, at least to me. Here are the results when Count = 8:

Benchmark1-4     2706196               442 ns/op
Benchmark2-4     1357482               873 ns/op
Benchmark3-4      840729              1387 ns/op
PASS
ok      bar.com/benchmarks      23.547s

The ratio of timings for Benchmark1 and Benchmark2 are roughly in line with expectations but I was surprised to see that the timings for Benchmark3 are now larger than those for Benchmark2.

Can anyone explain this?

TIA
Orson

Robert Engels

unread,
Mar 24, 2020, 4:49:55 PM3/24/20
to Orson Cart, golang-nuts
Can you please succinctly explain the problem? 

On Mar 24, 2020, at 11:24 AM, Orson Cart <objectiv...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/38f10940-a770-4fc5-86a9-19cc1371152a%40googlegroups.com.

Orson Cart

unread,
Mar 24, 2020, 5:46:44 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 16:49:55 UTC, Robert Engels wrote:
Can you please succinctly explain the problem?

Let's see. Benchmarks can yield incorrect results when b.StopTimer and b.StartTimer are used.

My reasoning:

1/ I have a benchmark that calls a single function. The reported duration is T.
2/ I have another benchmark which invokes the same function twice. I'd expect the reported duration for this to be nominally 2T and within reason it appears to be the case.
3/ I have yet another benchmark which also invokes the same function twice but it stops the benchmark timer before the first call and then starts it after the the first call has completed. I'd expect the reported duration to be nominally T again. It isn't. It's closer to 3T on my hardware and I've seen much worse correlation on other hardware - almost 5T

Now it's entirely possible that I'm misunderstanding something about how b.StopTimer and b.StartTimer are intended to be used, hence my post here :)
 




 
To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Michael Jones

unread,
Mar 24, 2020, 6:31:29 PM3/24/20
to Orson Cart, golang-nuts
You use them to stop the time charged against your benchmark.

For example:

bench:
  stop timer
  generate initial data
  start timer
  do test

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/1d291cef-420a-4f80-b719-e0654585c25d%40googlegroups.com.


--
Michael T. Jones
michae...@gmail.com

Orson Cart

unread,
Mar 24, 2020, 6:48:17 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 18:31:29 UTC, Michael Jones wrote:
You use them to stop the time charged against your benchmark.

For example:

bench:
  stop timer
  generate initial data
  start timer
  do test

Thanks Michael. What if the initial data has to be generated from scratch on each iteration? Is the following expected to work?

bench:
    b.N Loop
        stop timer
        generate new data
        start timer
        do test


 

Michael Jones

unread,
Mar 24, 2020, 6:50:32 PM3/24/20
to Orson Cart, golang-nuts
yes

To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/c3fcc6c5-9aa2-489c-adc4-c6b070ea6617%40googlegroups.com.

Orson Cart

unread,
Mar 24, 2020, 7:24:05 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 18:50:32 UTC, Michael Jones wrote:
yes


...and that is exactly the scenario that I believe is yielding inconsistent results:

Given the following function to be benchmarked:

func foo() []string{
testData := []string{}
for i:= 0; i < 8; i++ {
testData = append(testData, strconv.Itoa(i))
}

return testData
}

On my hardware this takes either 454 ns/op or 1322 ns/op depending on how it's called:

1st case:
func Benchmark(b *testing.B) {
    for i := 0; i < b.N; i++{
        Foo1 = foo()        // this reports 454 ns/op on my hardware
    }
}

2nd case:
func Benchmark(b *testing.B) {
    for i := 0; i < b.N; i++{
b.StopTimer()
Foo1 = foo() // This shouldn't be charged against the benchmark. Ignore the fact that it's a call to foo, I'm just using it as a means of elapsing some time
b.StartTimer()
Foo2 = foo()            // this reports a whopping 1322 ns/op on my hardware
    }
        }

So a call to foo takes over three times more time if *within the loop* we stop the timer, do something that takes time and then start the timer before calling foo. That's definitely violating the law of least surprise for me.
For StopTimer and StartTimer to be of any use I have to believe that they actually do stop and start the timer reliably. From those timings I'd say that this isn't the case.

As always I may be wrong. I'd just like someone to explain how and I'll be a happy chappie :)



 

Jake Montgomery

unread,
Mar 24, 2020, 7:51:24 PM3/24/20
to golang-nuts
Strange. I hope someone has a real answer for you.

In the meantime, you can simplify your example to demonstrate the issue:

package demo_test

import (

   
"testing"
)

var Foo1 []string
var Count int = 8


func
Benchmark1(b *testing.B) {
   
for i := 0; i < b.N; i++ {
       
Foo1 = foo(Count)
   
}
}

func
Benchmark2(b *testing.B) {

    b
.StopTimer()

   
for i := 0; i < b.N; i++ {

       
// Hypothetical setup here
        b
.StartTimer()
       
Foo1 = foo(Count)
        b
.StopTimer()

   
}
}


func foo
(count int) []string {
    testData
:= []string{}
   
for i := 0; i < count; i++ {

        testData
= append(testData, "a")
   
}

   
return testData
}

I get:

goos: windows
goarch
: amd64
Benchmark1-4     2101567               584 ns/op
Benchmark2-4     1000000              1668 ns/op
PASS

So it appears that StopTimer() and StartTimer() are introducing some overhead. I am surprised that it is this large.

Good Luck


Orson Cart

unread,
Mar 24, 2020, 8:01:17 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 19:51:24 UTC, Jake Montgomery wrote:
Strange. I hope someone has a real answer for you.

In the meantime, you can simplify your example to demonstrate the issue:

Thanks for taking a look at the code Jake. Just one question: you have the call to b.StopTimer outside of the loop and the call to b.StartTimer inside. Was that intentional?

I'd have thought that the call to b.StartTimer would be a no-op after the first iteration.

Jake Montgomery

unread,
Mar 24, 2020, 8:16:21 PM3/24/20
to golang-nuts
I took a quick glance at the code, and I think I figured it out. The benchmark code, including StartTimer() and StopTimer() use time.Now(). On my windows  machine time.Now() has an accuracy of 1ms. So by calling it for every iteration, when the actual time is 600ns just makes the whole thing wildly inaccurate. That 1ms accuracy is ok when the benchmark calls it once then runs the code 2 million times. But when called each time the accuracy of the time.Now() code becomes dominant.

It is just an educated guess, but that is what it looks like to me.

Perhaps there should be some sort of warning in the docs?

Sean Liao

unread,
Mar 24, 2020, 8:26:10 PM3/24/20
to golang-nuts

Adrian Ratnapala

unread,
Mar 24, 2020, 8:27:39 PM3/24/20
to Jake Montgomery, golang-nuts
Between foo() exiting and the hardware timer being read, there is

* The call to StopTimer()
* A conditoinal branch
* A call to time.Since
* Another conditional
* A call to either Now() or runtimenano()

Now calls might get inlined, and the branches get predicted, but I
doubt the overhead of all that is zero and, it can be significant
compared to a very fast `foo()` function. Also, after Start/Stop
timer is called, there's a call to runtime.ReadMemStats() which does
nontrivial stuff and could thrash the caches, slowing down the actual
benchmark code.

The docs for StopTimer say:

> StopTimer stops timing a test. This can be used to pause the timer while performing complex initialization that you don't want to measure.

So that sounds like the use-case is to call Stop-, StartTimer once
before you enter the main loop of the benchmark. They not efficient
enough for a tight inner loop.
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/feda7e38-5d1f-43cf-b0cd-98db0a94d3c9%40googlegroups.com.



--
Adrian Ratnapala

Orson Cart

unread,
Mar 24, 2020, 8:32:27 PM3/24/20
to golang-nuts


On Tuesday, 24 March 2020 20:26:10 UTC, Sean Liao wrote:

Thanks, will do.

Orson Cart

unread,
Mar 24, 2020, 8:40:18 PM3/24/20
to golang-nuts


On Tuesday, 24 March 2020 20:27:39 UTC, Adrian Ratnapala wrote:
...
So that sounds like the use-case is to call Stop-, StartTimer once
before you enter the main loop of the benchmark.  They not efficient
enough for a tight inner loop.

Thanks for the input and I think that you are probably correct. It would be nice if the docs said that though.

The code that I'm profiling wasn't written by me. It's fast but it modifies its input data so it needs new data on each iteration. The implementation is non-trivial and it might be a maintenance headache. I've been asked to benchmark it along with less complicated implementations. It was whilst trying to set up per-iteration test data that I came across this issue.







 
> To unsubscribe from this group and stop receiving emails from it, send an email to golan...@googlegroups.com.

Orson Cart

unread,
Mar 24, 2020, 8:46:24 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 20:16:21 UTC, Jake Montgomery wrote:
I took a quick glance at the code, and I think I figured it out. The benchmark code, including StartTimer() and StopTimer() use time.Now(). On my windows  machine time.Now() has an accuracy of 1ms. So by calling it for every iteration, when the actual time is 600ns just makes the whole thing wildly inaccurate. That 1ms accuracy is ok when the benchmark calls it once then runs the code 2 million times. But when called each time the accuracy of the time.Now() code becomes dominant.

It is just an educated guess, but that is what it looks like to me.

Thanks once again. Others on the thread have pointed out more potential issues. I feel a bit like a carpenter who has to measure a piece of wood but who only has a rubber rule :D

Robert Engels

unread,
Mar 24, 2020, 8:47:07 PM3/24/20
to Orson Cart, golang-nuts
One way to handle this is to generate all of the data up front in an array and then just index into the array based on the run. 

On Mar 24, 2020, at 3:42 PM, Orson Cart <objectiv...@gmail.com> wrote:


To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/291af5de-8a01-4c30-98bb-d4765e3a5333%40googlegroups.com.

Orson Cart

unread,
Mar 24, 2020, 8:56:19 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 20:47:07 UTC, Robert Engels wrote:
One way to handle this is to generate all of the data up front in an array and then just index into the array based on the run. 

Yeah, I had thought of that before posting but then I'd have to decide on a value for b.N. I was trying to roll with the idea of the framework establishing a value. To be honest if it wasn't for the fact that I was just curious as to why it wasn't working I'd just have settled for an array with a fixed b.N.

All part of my golang learning experience

 

Sean Liao

unread,
Mar 24, 2020, 9:00:38 PM3/24/20
to golang-nuts
setup upfront shouldn't be a problem
b.N is constant throughout the function lifetime, the entire function is called multiple times during benchmarking

Orson Cart

unread,
Mar 24, 2020, 9:10:09 PM3/24/20
to golang-nuts
On Tuesday, 24 March 2020 21:00:38 UTC, Sean Liao wrote:
setup upfront shouldn't be a problem
b.N is constant throughout the function lifetime, the entire function is called multiple times during benchmarking

Thank you! I'd missed that. That makes things so much more straightforward for me.

Robert Engels

unread,
Mar 24, 2020, 10:15:26 PM3/24/20
to Orson Cart, golang-nuts
Just an FYI - this is by no means Go related. These problems and techniques are very common when doing HPC micro benchmarking. 

On Mar 24, 2020, at 4:10 PM, Orson Cart <objectiv...@gmail.com> wrote:


To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/8cf5eec8-9cb9-47ed-b444-e8a689d403ff%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages