Why is this simple benchmark showing zero allocations?

xi...@fictionpress.com

Jun 3, 2014, 1:52:49 AM

to golan...@googlegroups.com

We would be grateful if someone could point out the obvious to us in the following four benchmarks.

We can understand why there would be 1 allocation for A, but why do B and C show zero allocations and zero memory usage? Thanks in advance for any pointers.

Go-version: Go 1.3beta2

----------------------

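package bench // package name assumed; save in a _test.go file

import "testing"

type Dummy struct {
	A int64
	B int64
}

type Dummy2 struct {
	A int64
}

// A: pointer declared outside the loop, reassigned on every iteration.
func BenchmarkA(t *testing.B) {
	var r *Dummy
	for i := 0; i < t.N; i++ {
		r = &Dummy{A: int64(i)}
		_ = r
	}
}

// B: pointer declared inside the loop.
func BenchmarkB(t *testing.B) {
	for i := 0; i < t.N; i++ {
		r := &Dummy{A: int64(i)}
		_ = r
	}
}

// C: plain value, no pointer involved.
func BenchmarkC(t *testing.B) {
	var r Dummy
	for i := 0; i < t.N; i++ {
		r = Dummy{A: int64(i)}
		_ = r
	}
}

// D: same shape as A, but with the one-field struct.
func BenchmarkD(t *testing.B) {
	var r *Dummy2
	for i := 0; i < t.N; i++ {
		r = &Dummy2{A: int64(i)}
		_ = r
	}
}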

---- Benchmark Result -----

BenchmarkA    100000000    80.4 ns/op    16 B/op    1 allocs/op
BenchmarkB   5000000000    2.36 ns/op     0 B/op    0 allocs/op
BenchmarkC  10000000000    1.17 ns/op     0 B/op    0 allocs/op
BenchmarkD    200000000    40.9 ns/op     8 B/op    0 allocs/op

-------------------------------

Stranger still to us is D: its struct has just one fewer int64 field, and the benchmark shows half the memory, which is correct, but now zero allocations?

So we are confused on two fronts. 1) Why do B and C show zero memory and zero allocations? 2) Why do A and D differ by 1 allocation?

Thanks.

Xing

Dan Kortschak

Jun 3, 2014, 10:21:32 AM

to xi...@fictionpress.com, golan...@googlegroups.com

Try using the -m compiler flag to find out.

Xing Li

Jun 3, 2014, 8:14:28 PM

to golan...@googlegroups.com, xi...@fictionpress.com

The -m gcflag appears to show whether a variable is allocated on the stack or on the heap, but running it still does not reveal to me how the benchmark tool counts allocations.

Using -m, I see that A and D both escape to the heap while B and C are allocated locally on the stack. Is that correct? Does it mean that the benchmark tool only counts heap allocations? Or is it that stack allocations are reused in the loop, so they average out to 0 allocations over time?

It also doesn't explain to me why A and D both show memory used, yet A shows 1 allocation per op while D shows 0 over the span of the benchmark, when the only difference is 1 extra int64 in the struct.

Thanks for any clarification.
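
As a rough illustration of the counting question above: the testing package derives B/op and allocs/op from the runtime's cumulative heap counters (runtime.MemStats), divided by the iteration count using integer division. The sketch below imitates that by hand. The measure helper and the package-level sink variable are hypothetical names made up for this example, not part of any library, and storing into sink is just a deliberate way to force a heap allocation.

```go
package main

import (
	"fmt"
	"runtime"
)

type Dummy struct {
	A int64
	B int64
}

// sink is a package-level variable (an assumption of this sketch).
// Storing a pointer in it forces the pointed-to value onto the heap.
var sink *Dummy

// measure is a hypothetical helper: it runs f n times and returns the
// heap allocations and heap bytes per iteration, computed from the
// runtime's cumulative counters.
func measure(n int, f func(i int)) (allocsPerOp, bytesPerOp uint64) {
	var before, after runtime.MemStats
	runtime.GC()
	runtime.ReadMemStats(&before)
	for i := 0; i < n; i++ {
		f(i)
	}
	runtime.ReadMemStats(&after)
	// Integer division: fractional averages are truncated toward zero.
	allocsPerOp = (after.Mallocs - before.Mallocs) / uint64(n)
	bytesPerOp = (after.TotalAlloc - before.TotalAlloc) / uint64(n)
	return allocsPerOp, bytesPerOp
}

func main() {
	// The pointer is stored in a global, so each literal is heap allocated.
	a, b := measure(100000, func(i int) { sink = &Dummy{A: int64(i)} })
	fmt.Println("escaping pointer:", a, "allocs/op,", b, "B/op")

	// A local value never leaves the function, so it can live on the stack.
	a, b = measure(100000, func(i int) { d := Dummy{A: int64(i)}; _ = d })
	fmt.Println("local value:", a, "allocs/op,", b, "B/op")
}
```

Only heap activity moves these counters, so a value that stays on the stack contributes nothing to either column.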

Dan Kortschak

Jun 3, 2014, 10:47:23 PM

to Xing Li, golan...@googlegroups.com

On Tue, 2014-06-03 at 17:14 -0700, Xing Li wrote:
> Thanks for any clarification.
>
What you've surmised seems fair. Have a look at the asm output for each
of the loops to see what is being generated. That should clarify the
remaining points.

Dave Cheney

Jun 4, 2014, 4:50:39 PM

to golan...@googlegroups.com, xi...@fictionpress.com

The benchmark tool only reports heap allocations. Stack allocations, which escape analysis makes possible, are less costly, possibly free, so they are not reported.
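
A small sketch of that distinction, using testing.AllocsPerRun from the standard library, which also reports heap allocations only. The sink variable and the two helper functions are hypothetical names for this example. Storing into a package-level variable is a deliberate way to make the value escape, and is not what the benchmarks in the original post do.

```go
package main

import (
	"fmt"
	"testing"
)

type Dummy struct {
	A int64
	B int64
}

// sink is a package-level variable (an assumption of this sketch).
// Anything whose address is stored here must outlive the function,
// so it escapes to the heap.
var sink *Dummy

// heapAllocs measures a closure whose struct escapes through sink.
func heapAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = &Dummy{A: 1}
	})
}

// stackAllocs measures a closure whose struct never leaves the function.
func stackAllocs() float64 {
	return testing.AllocsPerRun(1000, func() {
		d := Dummy{A: 1}
		_ = d
	})
}

func main() {
	fmt.Println("escaping pointer:", heapAllocs(), "allocs per run")
	fmt.Println("local value:", stackAllocs(), "allocs per run")
}
```

Building with go build -gcflags=-m should print the compiler's escape decisions for each literal. Those decisions can differ between Go releases, so a result seen under Go 1.3beta2 may not reproduce on a newer toolchain.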
