gccgo produces much slower code than gc

Tamir Duberstein

Feb 14, 2018, 4:20:11 PM
to golan...@googlegroups.com
Running the benchmarks in github.com/cockroachdb/cockroach/pkg/roachpb:

PATH=$HOME/local/go1.10/bin:$PATH
go test -i ./pkg/roachpb && go test -run - -bench . ./pkg/roachpb -count 5 -benchmem > gc.txt

PATH=$GCCROOT/gcc/bin:$PATH
go test -i ./pkg/roachpb && go test -run - -bench . ./pkg/roachpb -count 5 -benchmem > gccgo.txt

benchstat gc.txt gccgo.txt
name                old time/op    new time/op    delta
ValueSetBytes-16      38.7ns ± 1%   188.8ns ±44%  +388.36%  (p=0.008 n=5+5)
ValueSetFloat-16      27.6ns ± 1%   112.4ns ± 4%  +306.95%  (p=0.008 n=5+5)
ValueSetBool-16       29.5ns ± 0%    69.5ns ± 7%  +135.59%  (p=0.008 n=5+5)
ValueSetInt-16        35.9ns ± 1%    89.0ns ± 5%  +147.83%  (p=0.008 n=5+5)
ValueSetProto-16      45.5ns ± 0%   127.4ns ± 0%  +180.00%  (p=0.008 n=5+5)
ValueSetTime-16       52.4ns ± 1%   136.4ns ± 0%  +160.40%  (p=0.008 n=5+5)
ValueSetDecimal-16    95.5ns ± 1%   255.0ns ± 0%  +166.96%  (p=0.008 n=5+5)
ValueSetTuple-16      38.7ns ± 1%   116.0ns ± 0%  +200.05%  (p=0.016 n=5+4)
ValueGetBytes-16      9.22ns ± 0%   31.60ns ± 0%  +242.66%  (p=0.008 n=5+5)
ValueGetFloat-16      12.0ns ± 0%    49.9ns ± 0%  +315.83%  (p=0.016 n=4+5)
ValueGetBool-16       14.4ns ± 0%    39.3ns ± 0%  +172.92%  (p=0.029 n=4+4)
ValueGetInt-16        13.7ns ± 0%    37.3ns ± 0%  +172.12%  (p=0.016 n=4+5)
ValueGetProto-16      26.1ns ± 0%    60.5ns ± 0%  +131.72%  (p=0.016 n=4+5)
ValueGetTime-16       39.6ns ± 0%   172.0ns ± 0%  +334.34%  (p=0.008 n=5+5)
ValueGetDecimal-16    95.1ns ± 0%   264.0ns ± 0%  +177.49%  (p=0.008 n=5+5)
ValueGetTuple-16      9.84ns ± 0%   31.50ns ± 0%  +220.25%  (p=0.008 n=5+5)
name                old alloc/op   new alloc/op   delta
ValueSetBytes-16       32.0B ± 0%     32.0B ± 0%      ~     (all equal)
ValueSetFloat-16       16.0B ± 0%     16.0B ± 0%      ~     (all equal)
ValueSetBool-16        8.00B ± 0%     8.00B ± 0%      ~     (all equal)
ValueSetInt-16         16.0B ± 0%     16.0B ± 0%      ~     (all equal)
ValueSetProto-16       8.00B ± 0%     8.00B ± 0%      ~     (all equal)
ValueSetTime-16        16.0B ± 0%     16.0B ± 0%      ~     (all equal)
ValueSetDecimal-16     32.0B ± 0%     32.0B ± 0%      ~     (all equal)
ValueSetTuple-16       32.0B ± 0%     32.0B ± 0%      ~     (all equal)
ValueGetBytes-16       0.00B          0.00B           ~     (all equal)
ValueGetFloat-16       0.00B          0.00B           ~     (all equal)
ValueGetBool-16        0.00B          0.00B           ~     (all equal)
ValueGetInt-16         0.00B          0.00B           ~     (all equal)
ValueGetProto-16       0.00B          0.00B           ~     (all equal)
ValueGetTime-16        0.00B          0.00B           ~     (all equal)
ValueGetDecimal-16     48.0B ± 0%     48.0B ± 0%      ~     (all equal)
ValueGetTuple-16       0.00B          0.00B           ~     (all equal)
name                old allocs/op  new allocs/op  delta
ValueSetBytes-16        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetFloat-16        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetBool-16         1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetInt-16          1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetProto-16        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetTime-16         1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetDecimal-16      1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueSetTuple-16        1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueGetBytes-16        0.00           0.00           ~     (all equal)
ValueGetFloat-16        0.00           0.00           ~     (all equal)
ValueGetBool-16         0.00           0.00           ~     (all equal)
ValueGetInt-16          0.00           0.00           ~     (all equal)
ValueGetProto-16        0.00           0.00           ~     (all equal)
ValueGetTime-16         0.00           0.00           ~     (all equal)
ValueGetDecimal-16      1.00 ± 0%      1.00 ± 0%      ~     (all equal)
ValueGetTuple-16        0.00           0.00           ~     (all equal)

I chose this package because it doesn't depend on any of the fancy Makefile magic in the CockroachDB repo; you should be able to reproduce these results using just the go tool.

Are these results expected? I did minimal digging using pprof and perf but nothing obvious jumps out - things are just slower across the board. These results are on linux amd64.

Ian Lance Taylor

Feb 14, 2018, 7:44:55 PM
to Tamir Duberstein, golang-nuts
Which version of gccgo?

Ian

Tamir Duberstein

Feb 15, 2018, 9:43:33 AM
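to Ian Lance Taylor, golang-nuts
Built at this revision:
https://github.com/gcc-mirror/gcc/commit/a82f431e184a9ac922ad43df73cdcc702ab0f279
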
Ian Lance Taylor

Feb 15, 2018, 1:59:32 PM
to Tamir Duberstein, golang-nuts
On Thu, Feb 15, 2018 at 6:42 AM, Tamir Duberstein <tam...@gmail.com> wrote:
> Built at this revision:
> https://github.com/gcc-mirror/gcc/commit/a82f431e184a9ac922ad43df73cdcc702ab0f279

Thanks. What do you see from

go test -gccgoflags="-g -O2"

?

Ian

Ian Lance Taylor

Feb 15, 2018, 2:00:34 PM
to Tamir Duberstein, golang-nuts
On Thu, Feb 15, 2018 at 10:59 AM, Ian Lance Taylor <ia...@golang.org> wrote:
> On Thu, Feb 15, 2018 at 6:42 AM, Tamir Duberstein <tam...@gmail.com> wrote:
>> Built at this revision:
>> https://github.com/gcc-mirror/gcc/commit/a82f431e184a9ac922ad43df73cdcc702ab0f279
>
> Thanks. What do you see from
>
> go test -gccgoflags="-g -O2"
>
> ?

Sorry, make that

go test -gccgoflags=all="-g -O2"

Tamir Duberstein

Feb 15, 2018, 2:32:43 PM
to Ian Lance Taylor, golang-nuts
What does all do? Anyway, the results are better, but still not "good":

name                old time/op    new time/op    delta
ValueSetBytes-16      38.5ns ± 0%   105.8ns ± 4%  +174.81%  (p=0.008 n=5+5)
ValueSetFloat-16      27.5ns ± 1%    73.2ns ± 1%  +166.38%  (p=0.008 n=5+5)
ValueSetBool-16       29.4ns ± 0%    52.2ns ± 5%   +77.77%  (p=0.016 n=4+5)
ValueSetInt-16        34.0ns ± 1%    74.8ns ± 1%  +119.62%  (p=0.008 n=5+5)
ValueSetProto-16      45.4ns ± 0%    87.8ns ± 1%   +93.57%  (p=0.008 n=5+5)
ValueSetTime-16       52.9ns ± 1%   111.4ns ±18%  +110.67%  (p=0.008 n=5+5)
ValueSetDecimal-16    94.6ns ± 0%   214.2ns ±36%  +126.43%  (p=0.008 n=5+5)
ValueSetTuple-16      38.7ns ± 0%   105.6ns ± 3%  +172.87%  (p=0.008 n=5+5)
ValueGetBytes-16      9.22ns ± 0%   11.60ns ± 0%   +25.84%  (p=0.008 n=5+5)
ValueGetFloat-16      12.0ns ± 0%    23.8ns ± 0%   +97.67%  (p=0.016 n=5+4)
ValueGetBool-16       14.4ns ± 0%    15.2ns ± 0%    +5.56%  (p=0.029 n=4+4)
ValueGetInt-16        13.7ns ± 0%    14.6ns ± 0%    +6.57%  (p=0.016 n=5+4)
ValueGetProto-16      26.1ns ± 0%    22.6ns ± 0%   -13.41%  (p=0.008 n=5+5)
ValueGetTime-16       41.0ns ± 4%    78.9ns ± 0%   +92.68%  (p=0.008 n=5+5)
ValueGetDecimal-16     130ns ±24%     183ns ± 1%   +40.29%  (p=0.008 n=5+5)
ValueGetTuple-16      9.87ns ± 1%   11.60ns ± 0%   +17.58%  (p=0.008 n=5+5)

Ian Lance Taylor

Feb 15, 2018, 7:38:21 PM
to Tamir Duberstein, golang-nuts
On Thu, Feb 15, 2018 at 11:31 AM, Tamir Duberstein <tam...@gmail.com> wrote:
> What does all do? Anyway, the results are better, but still not "good":

Using "all" applies the options to all packages, not just the one being built.

Thanks for the benchmarks, it's something to look at.

Ian

Tamir Duberstein

May 3, 2018, 3:54:12 PM
to Ian Lance Taylor, golang-nuts
Looks like performance is virtually unchanged in GCC 8.1:

name                old time/op    new time/op    delta
ValueSetBytes-16      39.1ns ± 1%   104.5ns ± 0%  +167.26%  (p=0.029 n=4+4)
ValueSetFloat-16      25.9ns ± 1%    67.8ns ± 0%  +161.78%  (p=0.029 n=4+4)
ValueSetBool-16       27.5ns ± 0%    54.0ns ± 1%   +96.18%  (p=0.029 n=4+4)
ValueSetInt-16        35.1ns ± 4%    75.0ns ± 0%  +113.51%  (p=0.029 n=4+4)
ValueSetProto-16      45.9ns ± 0%    86.7ns ± 0%   +88.89%  (p=0.029 n=4+4)
ValueSetTime-16       52.7ns ± 1%   100.2ns ± 1%   +90.32%  (p=0.029 n=4+4)
ValueSetDecimal-16    88.9ns ± 1%   177.2ns ± 0%   +99.33%  (p=0.029 n=4+4)
ValueSetTuple-16      39.4ns ± 1%   104.5ns ± 0%  +165.23%  (p=0.029 n=4+4)
ValueGetBytes-16      9.48ns ± 0%   11.62ns ± 1%   +22.66%  (p=0.029 n=4+4)
ValueGetFloat-16      13.3ns ± 0%    23.9ns ± 1%   +79.70%  (p=0.029 n=4+4)
ValueGetBool-16       14.8ns ± 1%    15.2ns ± 0%    +2.36%  (p=0.029 n=4+4)
ValueGetInt-16        14.0ns ± 0%    14.6ns ± 0%    +4.29%  (p=0.029 n=4+4)
ValueGetProto-16      26.4ns ± 0%    22.7ns ± 0%   -14.08%  (p=0.029 n=4+4)
ValueGetTime-16       39.6ns ± 0%    79.0ns ± 0%   +99.24%  (p=0.029 n=4+4)
ValueGetDecimal-16     101ns ± 3%     182ns ± 0%   +80.04%  (p=0.029 n=4+4)
ValueGetTuple-16      9.14ns ± 0%   11.60ns ± 0%   +26.85%  (p=0.029 n=4+4)