Wide variance in performance when making minor changes in code

147 views
Skip to first unread message

Ugorji Nwoke

unread,
Apr 15, 2012, 12:13:10 PM4/15/12
to golan...@googlegroups.com
I'm seeing weird things in my code which I'm benchmarking.

Sometimes, if I change an if statement to a switch, performance goes up or down ~20%.

This happened in different situations also:
- changing from an if to a switch
- using one big switch instead of multiple switches (
- removing a block of code within a switch and moving into it's own function 
- predeclaring variables before using them elsewhere

In the latest one, I switched from 2 lines to one line in one place:
  rvi := rv.Interface()
  bs = rvi.([]byte)
TO
  bs = rv.Interface().([]byte)
And performance went from 321227 ns/op to  414865 ns/op. I switched back and it returned to normal. I tried this back and forth a few times, and results were same. 

Note that this is within code where I run through encoding to msgpack, so the code does a lot per op (loop). I'm surprised that just one line change can cause such wide variations. 

When I did shorter benchmarks, I couldn't reproduce this. But it happens very easily with singular changes in the code (ie change one thing, run benchmark, change back, run benchmark again, consistent variance in results).

I'm not knowledgeable enough about the gc internals, but was hoping someone could give some theories or shed some light on why this is happening.

I was hoping that gccgo would give more stable results but it actually increased run times across the board significantly, and had issues decoding BSON where it decoded fine when run under gc.


Rémy Oudompheng

unread,
Apr 15, 2012, 12:19:06 PM4/15/12
to Ugorji Nwoke, golan...@googlegroups.com
Le 15 avril 2012 18:13, Ugorji Nwoke <ugo...@gmail.com> a écrit :
> I'm seeing weird things in my code which I'm benchmarking.
>
> Sometimes, if I change an if statement to a switch, performance goes up or
> down ~20%.
>
> This happened in different situations also:
> - changing from an if to a switch
> - using one big switch instead of multiple switches (
> - removing a block of code within a switch and moving into it's own
> function
> - predeclaring variables before using them elsewhere
>
> In the latest one, I switched from 2 lines to one line in one place:
>   rvi := rv.Interface()
>   bs = rvi.([]byte)
> TO
>   bs = rv.Interface().([]byte)
> And performance went from 321227 ns/op to  414865 ns/op. I switched back and
> it returned to normal. I tried this back and forth a few times, and results
> were same.

Maybe it's caused by inlining behaviour differences? Why aren't you
using bs = rv.Bytes() ?

Rémy.

Ugorji Nwoke

unread,
Apr 15, 2012, 12:30:07 PM4/15/12
to golan...@googlegroups.com, Ugorji Nwoke
Thanks Remy.

Previously, I didn't use rv.Bytes() because I already had the "rvi := rv.Interface()" set earlier in the code. I've changed the logic so that earlier setting of rvi is no longer necessary.

I tried rv.Bytes(), and benchmark time went to 390642 ns/op. Switched back to "rvi := rv.Interface(); bs = rvi.([]byte)" and runtime went to 320996. 

For a one-line change that looks so innocuous and insignificant, this variance is pretty high, and makes it hard to do proper benchmarking.

Rémy Oudompheng

unread,
Apr 15, 2012, 12:47:00 PM4/15/12
to Ugorji Nwoke, golan...@googlegroups.com
Le 15 avril 2012 18:30, Ugorji Nwoke <ugo...@gmail.com> a écrit :
> Thanks Remy.
>
> Previously, I didn't use rv.Bytes() because I already had the "rvi :=
> rv.Interface()" set earlier in the code. I've changed the logic so that
> earlier setting of rvi is no longer necessary.
>
> I tried rv.Bytes(), and benchmark time went to 390642 ns/op. Switched back
> to "rvi := rv.Interface(); bs = rvi.([]byte)" and runtime went to 320996.
>
> For a one-line change that looks so innocuous and insignificant, this
> variance is pretty high, and makes it hard to do proper benchmarking.

- Can you give more details about what is being benchmarked?
- Did you try profiling?
- Can you count allocations to see if there is a difference?
- Did you report your problems with gccgo?

Rémy.

Ugorji Nwoke

unread,
Apr 15, 2012, 12:57:45 PM4/15/12
to golan...@googlegroups.com, Ugorji Nwoke
Thanks Remy,

Will do all of them. 

I was just looking to see if anything jumps out or if this issue is known.

I'm benchmarking the msgpack package, and seeing this variance.

I tried profiling with go tool pprof, but I couldn't figure out how to profile beyond granularity of a function. Once I move things to their own functions, the variance goes away (meaning that I had worse performance when I inlined some code blocks). 

I don't understand how allocation would be a factor here. All I did was just a pre-declared variable which caused the variance, or moving code into a function. How will allocations affect that? If anything, I'd expect it to get worse, but it gets better.

Regarding gccgo/bson, I was hoping that Gustavo and Ian see this thread and maybe Gustavo can try it out in his environment. 

I'd also update the msgpack code to github and update the thread, so that folks can reproduce it. I couldn't reproduce it with a small test.

Gustavo Niemeyer

unread,
Apr 15, 2012, 1:39:26 PM4/15/12
to Ugorji Nwoke, golan...@googlegroups.com
Hi Ugorji,

On Sun, Apr 15, 2012 at 13:13, Ugorji Nwoke <ugo...@gmail.com> wrote:
> I'm seeing weird things in my code which I'm benchmarking.

(...)


>   rvi := rv.Interface()
>   bs = rvi.([]byte)
> TO
>   bs = rv.Interface().([]byte)

If you put more or less data onto the stack, you'll affect the precise
point in which the stack is split, which will touch performance up or
down. This is specially true for recursive functions (generally the
case with marshalers) as such a single line change will in fact
accumulate.

> I was hoping that gccgo would give more stable results but it actually
> increased run times across the board significantly, and had issues decoding
> BSON where it decoded fine when run under gc.

Can you please provide a small example that breaks gccgo?

--
Gustavo Niemeyer
http://niemeyer.net
http://niemeyer.net/plus
http://niemeyer.net/twitter
http://niemeyer.net/blog

-- I'm not absolutely sure of anything.

Rémy Oudompheng

unread,
Apr 15, 2012, 1:51:21 PM4/15/12
to Ugorji Nwoke, golan...@googlegroups.com
Le 15 avril 2012 18:57, Ugorji Nwoke <ugo...@gmail.com> a écrit :
> Regarding gccgo/bson, I was hoping that Gustavo and Ian see this thread and
> maybe Gustavo can try it out in his environment.
>

I think Russ mentioned Ian was in holidays this weeks, and I support
the practice of always writing for everyone and not individual people.
Even if other people might not help, everybody could be interested
about knowing in detail the current issues. It's always a good thing
knowing in advance of possible problems.

Rémy.

Ugorji Nwoke

unread,
Apr 15, 2012, 2:33:42 PM4/15/12
to golan...@googlegroups.com, Ugorji Nwoke
Thanks Remy.

I agree. I'd upload the source to github (today or tomorrow) and update this thread with how to reproduce.

Ugorji Nwoke

unread,
Apr 15, 2012, 5:00:24 PM4/15/12
to golan...@googlegroups.com, Ugorji Nwoke
Thanks Gustavo.

It still feels weird. If these 2 lines are beside each other, shouldn't they basically compile to the same code? And if anything, shouldn't the 2nd one (one expression) be faster? I get significant speedup when I split them. I tried to read up on split stacks and saw Ian's paper on it wrt gcc, but I couldn't get any better understanding of why the speedup reduction.

I've updated the code at: 

You can get it with go get.

To see the bson error, run:
    go test -test.bench . 
With gc. Everything runs fine. With gccgo, I get Document is corrupted

To test in your own environment, you can copy the type TestStruc and the function newTestStruc(...) from msgpack_test.go, and try to Encode, and then decode that. 

To check out the issue with a line change causing a discrepancy. Look at decode.go line 360. I have the other 2 ways of converting to bytes commented. Uncomment one and re-run the benchmarks and see the variance.

BTW I'm on Ubuntu Linux 64-bit.

On Sunday, April 15, 2012 1:39:26 PM UTC-4, Gustavo Niemeyer wrote:
Hi Ugorji,
Reply all
Reply to author
Forward
0 new messages