Interaction between Test and Benchmark functions?

98 views
Skip to first unread message

Peter Fröhlich

unread,
Apr 25, 2017, 7:48:29 PM4/25/17
to golang-nuts
Hi all,

Sorry if this is an FAQ, but then my Google-foo has failed me. :-/

It seems to me that the test functions in a bla_test.go file influence
the benchmark functions in the same file. In the snippet below, if I
comment out the body of the test function, I get better benchmark
numbers than when I leave it in. Suspecting some GC "leftovers" from
the test, I've tried adding "runtime.GC(); b.ResetTimer()" to the
benchmark but that doesn't actually help.

Before I dig through the source for Go's testing module itself, I
wanted to see if this is a well-known issue, maybe with a well-known
workaround?

Best,
Peter

-----CUT-----
$ go version
go version go1.7.5 linux/amd64
$ cat bla_test.go
package main

import "testing"
import "container/list"
import "runtime"

const size = 2270

func TestPushBack(t *testing.T) {
// /*
var q list.List
for i := 0; i < size; i++ {
q.PushBack(i)
}
for i := 0; q.Len() > 0; i++ {
e := q.Front()
x := e.Value.(int)
q.Remove(e)
if x != i {
t.Errorf("q.PopFront() = %d, want %d", x, i)
}
}
// */
}

func BenchmarkPushBack(b *testing.B) {
runtime.GC()
b.ResetTimer()
for i := 0; i < b.N; i++ {
var q list.List
for n := 0; n < size; n++ {
q.PushBack(n)
}
}
}
$ go test -bench .
BenchmarkPushBack-2 5000 378367 ns/op
PASS
ok X 1.937s
$ go test -bench .
BenchmarkPushBack-2 5000 378196 ns/op
PASS
ok X 1.936s
$ vi bla_test.go # comment out test code
$ go test -bench .
BenchmarkPushBack-2 5000 356310 ns/op
PASS
ok X 1.825s
$ go test -bench .
BenchmarkPushBack-2 5000 354530 ns/op
PASS
ok X 1.816s

Dave Cheney

unread,
Apr 25, 2017, 7:50:09 PM4/25/17
to golang-nuts
Yeah, you shouldn't combine benchmark and tests, use 

go test -run=XXX -bench=.

To disable all tests (XXX matches nothing, you might prefer ^$) 

Caleb Spare

unread,
Apr 25, 2017, 8:03:33 PM4/25/17
to Peter Fröhlich, golang-nuts
Hi Peter,

Running your code locally (Go 1.8.1 / linux / amd64), I don't get a
statistically significant difference between the two. The benchmark
does generate a ton of garbage so the GC causes the benchmark runs to
have a pretty decent variance.

To check this kind of thing, you'll want to run the benchmarks many
times and then compare the distributions using a tool like benchstat:
https://godoc.org/golang.org/x/perf/cmd/benchstat

For example:

go test -bench . -count 10 | tee old.txt
# make changes
go test -bench . -count 10 | tee new.txt
benchstat old.txt new.txt

When I've seen effects like this before, one explanation I've seen is
that the code alignment can change performance characteristics (this
only applies to the tightest of loops, IIRC). Too lazy to look up
references right now, but I've definitely seen this discussed on this
mailing list as well as various issues.

-Caleb
> --
> You received this message because you are subscribed to the Google Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Peter Fröhlich

unread,
Apr 25, 2017, 9:19:43 PM4/25/17
to Caleb Spare, golang-nuts
Thank you David and Caleb, those were helpful suggestions, especially
benchstat! Check out my conclusions though... :-/ I am running these
four:

$ go test -bench=. -count=10 >normal.txt
$ go test -run=XXX -bench=. -count=10 >filter.txt
$ go test -bench=. -count=10 >comment.txt
$ go test -run=XXX -bench=. -count=10 >comment_filter.txt

As you might guess, the last two are when I manually comment out the
body of the test function. Here's the result:

$ benchstat normal.txt filter.txt
name old time/op new time/op delta
PushBack-2 380µs ± 0% 380µs ± 1% ~ (p=0.447 n=10+9)

So filtering the test case out with -run seems to make no difference.

$ benchstat normal.txt comment.txt
name old time/op new time/op delta
PushBack-2 380µs ± 0% 355µs ± 1% -6.45% (p=0.000 n=10+9)

So commenting it out (the body!) leads to what I observed earlier,
just more reliably now thanks to running it 10 times.

$ benchstat comment.txt comment_filter.txt
name old time/op new time/op delta
PushBack-2 355µs ± 1% 357µs ± 1% ~ (p=0.278 n=9+10)

So applying the filter on top of commenting things out (just sanity
checking!) changes nothing again.

Since Caleb said he cannot detect a difference on his machine with his
version of Go, I am wondering if what I am seeing is just a weird
artifact of my ancient machine and old Go release. I'll try upgrading
soon. Also I'll try to see what I get on different machines I have
access too.

Thanks again!
--
Peter H. Froehlich <http://www.cs.jhu.edu/~phf/>
Senior Lecturer | Director, Johns Hopkins Gaming Lab

The views stated in this communication are my own and do not express
the position or views of The Johns Hopkins University.

Peter Fröhlich

unread,
May 3, 2017, 8:55:26 PM5/3/17
to golang-nuts
I can now confirm that in Go 1.8.1 this is *much* less of a problem
than in Go 1.7.5 although the details of what's happening there are
still quite unclear to me. Here's what I get now, comparing the
benchmarks with all test cases included and with some test cases
commented out:

-----CUT-----
$ benchstat bench.txt bench-commented.txt
name old time/op new time/op delta
PushFrontQueue-2 82.8µs ± 1% 82.7µs ± 1% ~ (p=0.832 n=16+19)
PushFrontList-2 162µs ± 1% 161µs ± 1% -0.48% (p=0.003 n=18+19)
PushBackQueue-2 83.4µs ± 1% 83.3µs ± 1% ~ (p=0.756 n=16+19)
PushBackList-2 158µs ± 3% 156µs ± 1% -1.02% (p=0.024 n=20+19)
PushBackChannel-2 110µs ± 2% 110µs ± 2% ~ (p=0.341 n=20+20)
RandomQueue-2 161µs ± 2% 158µs ± 4% -2.28% (p=0.000 n=20+20)
RandomList-2 281µs ± 4% 279µs ± 1% ~ (p=0.223 n=19+19)
GrowShrinkQueue-2 110µs ± 1% 111µs ± 2% +1.41% (p=0.000 n=19+18)
GrowShrinkList-2 170µs ± 5% 168µs ± 1% -1.19% (p=0.017 n=20+19)
-----CUT-----

There *is* still a difference, but at least now it's no longer 10
microseconds like it was in 1.7.5. Whoever fixed this on the way to
1.8: My thanks! Now if you'd just also fix the memory benchmark
scores... :-)

On Tue, Apr 25, 2017 at 9:19 PM, Peter Fröhlich
Reply all
Reply to author
Forward
0 new messages