Congrats to the Go team


Robert Engels

Apr 24, 2024, 2:27:07 PM
to golang-nuts
I have a fairly stable project, github.com/robaho/fixed, which is almost 100% CPU-bound. It doesn’t change, so it makes a great way to compare the performance of different Go versions on the same hardware. I took the time to re-run the tests today.

Using 1.12.17:

BenchmarkAddFixed-8 2000000000 0.59 ns/op 0 B/op 0 allocs/op
BenchmarkAddDecimal-8 5000000 243 ns/op 176 B/op 8 allocs/op
BenchmarkAddBigInt-8 100000000 14.3 ns/op 0 B/op 0 allocs/op
BenchmarkAddBigFloat-8 20000000 78.8 ns/op 48 B/op 1 allocs/op
BenchmarkMulFixed-8 300000000 4.88 ns/op 0 B/op 0 allocs/op
BenchmarkMulDecimal-8 20000000 72.0 ns/op 80 B/op 2 allocs/op
BenchmarkMulBigInt-8 100000000 17.1 ns/op 0 B/op 0 allocs/op
BenchmarkMulBigFloat-8 30000000 35.5 ns/op 0 B/op 0 allocs/op
BenchmarkDivFixed-8 300000000 4.71 ns/op 0 B/op 0 allocs/op
BenchmarkDivDecimal-8 2000000 779 ns/op 568 B/op 21 allocs/op
BenchmarkDivBigInt-8 30000000 46.1 ns/op 8 B/op 1 allocs/op
BenchmarkDivBigFloat-8 20000000 108 ns/op 24 B/op 2 allocs/op
BenchmarkCmpFixed-8 2000000000 0.38 ns/op 0 B/op 0 allocs/op
BenchmarkCmpDecimal-8 200000000 8.05 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigInt-8 300000000 5.87 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigFloat-8 300000000 5.46 ns/op 0 B/op 0 allocs/op
BenchmarkStringFixed-8 20000000 57.4 ns/op 32 B/op 1 allocs/op
BenchmarkStringNFixed-8 20000000 55.6 ns/op 32 B/op 1 allocs/op
BenchmarkStringDecimal-8 10000000 218 ns/op 64 B/op 5 allocs/op
BenchmarkStringBigInt-8 10000000 122 ns/op 24 B/op 2 allocs/op
BenchmarkStringBigFloat-8 3000000 416 ns/op 192 B/op 8 allocs/op
BenchmarkWriteTo-8 30000000 45.8 ns/op 18 B/op 0 allocs/op

and version 1.21.5:

BenchmarkAddFixed-8 1000000000 0.9735 ns/op 0 B/op 0 allocs/op
BenchmarkAddDecimal-8 14311995 69.99 ns/op 80 B/op 2 allocs/op
BenchmarkAddBigInt-8 100000000 13.42 ns/op 0 B/op 0 allocs/op
BenchmarkAddBigFloat-8 17506702 63.84 ns/op 48 B/op 1 allocs/op
BenchmarkMulFixed-8 313983104 3.732 ns/op 0 B/op 0 allocs/op
BenchmarkMulDecimal-8 18046520 66.59 ns/op 80 B/op 2 allocs/op
BenchmarkMulBigInt-8 100000000 10.79 ns/op 0 B/op 0 allocs/op
BenchmarkMulBigFloat-8 49186024 24.30 ns/op 0 B/op 0 allocs/op
BenchmarkDivFixed-8 306888069 3.721 ns/op 0 B/op 0 allocs/op
BenchmarkDivDecimal-8 2510688 462.4 ns/op 384 B/op 12 allocs/op
BenchmarkDivBigInt-8 33993822 37.02 ns/op 8 B/op 1 allocs/op
BenchmarkDivBigFloat-8 9415330 111.5 ns/op 24 B/op 2 allocs/op
BenchmarkCmpFixed-8 1000000000 0.2548 ns/op 0 B/op 0 allocs/op
BenchmarkCmpDecimal-8 168714549 7.086 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigInt-8 234895634 4.952 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigFloat-8 260814464 4.503 ns/op 0 B/op 0 allocs/op
BenchmarkStringFixed-8 23725470 50.57 ns/op 24 B/op 1 allocs/op
BenchmarkStringNFixed-8 23666628 50.67 ns/op 24 B/op 1 allocs/op
BenchmarkStringDecimal-8 5665790 200.1 ns/op 56 B/op 4 allocs/op
BenchmarkStringBigInt-8 10596398 100.2 ns/op 16 B/op 1 allocs/op
BenchmarkStringBigFloat-8 2922332 391.2 ns/op 176 B/op 7 allocs/op
BenchmarkWriteTo-8 45734523 31.53 ns/op 23 B/op 0 allocs/op
which is pretty impressive across the board.

Only 2 tests show any degradation and most show significant improvement.

On the two that degrade, AddFixed is a fairly trivial add of 2 longs, so that is surprising. Strangely, WriteTo shows a different number of B/op on different runs (all of the other tests are stable).

Stephen Illingworth

Apr 24, 2024, 3:52:50 PM
to golang-nuts
How does it perform with v1.22.0? I found a small but measurable drop in throughput in one of my projects when compiled with 1.22.0. Issue raised here:


I have a feeling it's an issue with my older development hardware. But it's a compute-bound project, similar to yours, so I'd be interested in hearing how you think it performs with 1.22.0.

robert engels

Apr 24, 2024, 5:40:16 PM
to Stephen Illingworth, golang-nuts
Rough guess, it seems about the same.

1.22.2:

BenchmarkAddFixed-8 1000000000 0.7931 ns/op 0 B/op 0 allocs/op
BenchmarkAddDecimal-8 18156120 66.27 ns/op 80 B/op 2 allocs/op
BenchmarkAddBigInt-8 100000000 10.65 ns/op 0 B/op 0 allocs/op
BenchmarkAddBigFloat-8 18105667 66.33 ns/op 48 B/op 1 allocs/op
BenchmarkMulFixed-8 295736967 3.939 ns/op 0 B/op 0 allocs/op
BenchmarkMulDecimal-8 17827340 67.07 ns/op 80 B/op 2 allocs/op
BenchmarkMulBigInt-8 100000000 10.49 ns/op 0 B/op 0 allocs/op
BenchmarkMulBigFloat-8 49651710 24.12 ns/op 0 B/op 0 allocs/op
BenchmarkDivFixed-8 309444237 3.661 ns/op 0 B/op 0 allocs/op
BenchmarkDivDecimal-8 2426755 469.6 ns/op 384 B/op 12 allocs/op
BenchmarkDivBigInt-8 34289701 34.90 ns/op 8 B/op 1 allocs/op
BenchmarkDivBigFloat-8 9028243 113.6 ns/op 24 B/op 2 allocs/op
BenchmarkCmpFixed-8 1000000000 0.2784 ns/op 0 B/op 0 allocs/op
BenchmarkCmpDecimal-8 181467510 6.475 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigInt-8 244090252 4.805 ns/op 0 B/op 0 allocs/op
BenchmarkCmpBigFloat-8 256882512 5.081 ns/op 0 B/op 0 allocs/op
BenchmarkStringFixed-8 23666678 50.64 ns/op 24 B/op 1 allocs/op
BenchmarkStringNFixed-8 23938096 49.66 ns/op 24 B/op 1 allocs/op
BenchmarkStringDecimal-8 5196085 197.0 ns/op 56 B/op 4 allocs/op
BenchmarkStringBigInt-8 10304404 98.00 ns/op 16 B/op 1 allocs/op
BenchmarkStringBigFloat-8 2902165 395.2 ns/op 176 B/op 7 allocs/op
BenchmarkWriteTo-8 37140805 31.71 ns/op 28 B/op 0 allocs/op

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/dcaa3f55-d6f7-42cb-80c3-3fb791900f4en%40googlegroups.com.

Steven Hartland

Apr 24, 2024, 7:22:00 PM
to Robert Engels, golang-nuts
What does it look like when you run it through https://pkg.go.dev/golang.org/x/perf/cmd/benchstat, which will provide a nice side-by-side comparison?


Robert Engels

Apr 24, 2024, 7:29:57 PM
to Steven Hartland, golang-nuts
                 │ /Users/robertengels/go1.21.5.txt  │    /Users/robertengels/go1.22.2.txt    │
                 │              sec/op               │    sec/op      vs base                 │
AddFixed-8                             0.9603n ± ∞ ¹   0.7931n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddDecimal-8                            66.41n ± ∞ ¹    66.27n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigInt-8                             9.452n ± ∞ ¹   10.650n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigFloat-8                           63.26n ± ∞ ¹    66.33n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulFixed-8                              3.519n ± ∞ ¹    3.939n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulDecimal-8                            65.98n ± ∞ ¹    67.07n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigInt-8                             10.69n ± ∞ ¹    10.49n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigFloat-8                           23.72n ± ∞ ¹    24.12n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivFixed-8                              3.675n ± ∞ ¹    3.661n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivDecimal-8                            460.8n ± ∞ ¹    469.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigInt-8                             34.82n ± ∞ ¹    34.90n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigFloat-8                           110.4n ± ∞ ¹    113.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpFixed-8                             0.2529n ± ∞ ¹   0.2784n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpDecimal-8                            6.883n ± ∞ ¹    6.475n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigInt-8                             4.779n ± ∞ ¹    4.805n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigFloat-8                           4.411n ± ∞ ¹    5.081n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringFixed-8                           50.36n ± ∞ ¹    50.64n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringNFixed-8                          53.41n ± ∞ ¹    49.66n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringDecimal-8                         197.6n ± ∞ ¹    197.0n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigInt-8                          98.17n ± ∞ ¹    98.00n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigFloat-8                        386.2n ± ∞ ¹    395.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
WriteTo-8                               31.82n ± ∞ ¹    31.71n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                 22.01n          22.28n         +1.26%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                 │ /Users/robertengels/go1.21.5.txt  │  /Users/robertengels/go1.22.2.txt   │
                 │               B/op                │    B/op      vs base                │
AddFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
AddDecimal-8                             80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
AddBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
AddBigFloat-8                            48.00 ± ∞ ¹   48.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulDecimal-8                             80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivDecimal-8                             384.0 ± ∞ ¹   384.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivBigInt-8                              8.000 ± ∞ ¹   8.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivBigFloat-8                            24.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpDecimal-8                             0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringFixed-8                            24.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringNFixed-8                           24.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringDecimal-8                          56.00 ± ∞ ¹   56.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringBigInt-8                           16.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringBigFloat-8                         176.0 ± ∞ ¹   176.0 ± ∞ ¹       ~ (p=1.000 n=1) ²
WriteTo-8                                29.00 ± ∞ ¹   28.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
geomean                                            ⁴                -0.16%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

                 │ /Users/robertengels/go1.21.5.txt  │   /Users/robertengels/go1.22.2.txt   │
                 │             allocs/op             │  allocs/op   vs base                 │
AddFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
AddDecimal-8                             2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigFloat-8                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulDecimal-8                             2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivDecimal-8                             12.00 ± ∞ ¹   12.00 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigInt-8                              1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigFloat-8                            2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpDecimal-8                             0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringFixed-8                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringNFixed-8                           1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringDecimal-8                          4.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigInt-8                           1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigFloat-8                         7.000 ± ∞ ¹   7.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
WriteTo-8                                0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                            ³                 +0.00%               ³

robert engels

Apr 24, 2024, 7:35:57 PM
to Robert Engels, Steven Hartland, golang-nuts
And for the 1.12 vs 1.22:

                 │ /Users/robertengels/go1.12.17.txt │    /Users/robertengels/go1.22.2.txt    │
                 │              sec/op               │    sec/op      vs base                 │
AddFixed-8                             0.5900n ± ∞ ¹   0.7931n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddDecimal-8                           243.00n ± ∞ ¹    66.27n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigInt-8                             14.30n ± ∞ ¹    10.65n ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigFloat-8                           78.80n ± ∞ ¹    66.33n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulFixed-8                              4.880n ± ∞ ¹    3.939n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulDecimal-8                            72.00n ± ∞ ¹    67.07n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigInt-8                             17.10n ± ∞ ¹    10.49n ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigFloat-8                           35.50n ± ∞ ¹    24.12n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivFixed-8                              4.710n ± ∞ ¹    3.661n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivDecimal-8                            779.0n ± ∞ ¹    469.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigInt-8                             46.10n ± ∞ ¹    34.90n ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigFloat-8                           108.0n ± ∞ ¹    113.6n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpFixed-8                             0.3800n ± ∞ ¹   0.2784n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpDecimal-8                            8.050n ± ∞ ¹    6.475n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigInt-8                             5.870n ± ∞ ¹    4.805n ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigFloat-8                           5.460n ± ∞ ¹    5.081n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringFixed-8                           57.40n ± ∞ ¹    50.64n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringNFixed-8                          55.60n ± ∞ ¹    49.66n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringDecimal-8                         218.0n ± ∞ ¹    197.0n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigInt-8                         122.00n ± ∞ ¹    98.00n ± ∞ ¹        ~ (p=1.000 n=1) ²
StringBigFloat-8                        416.0n ± ∞ ¹    395.2n ± ∞ ¹        ~ (p=1.000 n=1) ²
WriteTo-8                               45.80n ± ∞ ¹    31.71n ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                 28.48n          22.28n        -21.75%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05

                 │ /Users/robertengels/go1.12.17.txt │  /Users/robertengels/go1.22.2.txt   │
                 │               B/op                │    B/op      vs base                │
AddFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
AddDecimal-8                            176.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
AddBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
AddBigFloat-8                            48.00 ± ∞ ¹   48.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulDecimal-8                             80.00 ± ∞ ¹   80.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
MulBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivDecimal-8                             568.0 ± ∞ ¹   384.0 ± ∞ ¹       ~ (p=1.000 n=1) ³
DivBigInt-8                              8.000 ± ∞ ¹   8.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
DivBigFloat-8                            24.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpDecimal-8                             0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
CmpBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹       ~ (p=1.000 n=1) ²
StringFixed-8                            32.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
StringNFixed-8                           32.00 ± ∞ ¹   24.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
StringDecimal-8                          64.00 ± ∞ ¹   56.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
StringBigInt-8                           24.00 ± ∞ ¹   16.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
StringBigFloat-8                         192.0 ± ∞ ¹   176.0 ± ∞ ¹       ~ (p=1.000 n=1) ³
WriteTo-8                                18.00 ± ∞ ¹   28.00 ± ∞ ¹       ~ (p=1.000 n=1) ³
geomean                                            ⁴                -8.44%               ⁴
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
³ need >= 4 samples to detect a difference at alpha level 0.05
⁴ summaries must be >0 to compute geomean

                 │ /Users/robertengels/go1.12.17.txt │   /Users/robertengels/go1.22.2.txt   │
                 │             allocs/op             │  allocs/op   vs base                 │
AddFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
AddDecimal-8                             8.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
AddBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
AddBigFloat-8                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulDecimal-8                             2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
MulBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivDecimal-8                             21.00 ± ∞ ¹   12.00 ± ∞ ¹        ~ (p=1.000 n=1) ³
DivBigInt-8                              1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
DivBigFloat-8                            2.000 ± ∞ ¹   2.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpFixed-8                               0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpDecimal-8                             0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigInt-8                              0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
CmpBigFloat-8                            0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringFixed-8                            1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringNFixed-8                           1.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
StringDecimal-8                          5.000 ± ∞ ¹   4.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
StringBigInt-8                           2.000 ± ∞ ¹   1.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
StringBigFloat-8                         8.000 ± ∞ ¹   7.000 ± ∞ ¹        ~ (p=1.000 n=1) ³
WriteTo-8                                0.000 ± ∞ ¹   0.000 ± ∞ ¹        ~ (p=1.000 n=1) ²
geomean                                            ⁴                -12.73%               ⁴

Robert Engels

Apr 24, 2024, 7:36:34 PM
to Robert Engels, Steven Hartland, golang-nuts
Sorry, need html, 1.12 vs 1.22

Steven Hartland

Apr 25, 2024, 1:24:38 PM
to Robert Engels, golang-nuts
Thanks for these, not sure if you noticed the notes from each run e.g. need >= 4 samples to detect a difference at alpha level 0.05.

Basically, run the benchmark with -count=6 or more and then run the tool, and you get the comparison values, which are typically the interesting bit.
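
For context on why the sample count matters: as far as I understand it, benchstat's comparison is based on the Mann-Whitney U-test, and with n samples per side and no ties the smallest two-sided p-value that test can report is 2/C(2n, n). The sketch below only works through that arithmetic; it is not benchstat's code, and the function names are made up for illustration.

```go
package main

import "fmt"

// binomial returns C(n, k) using exact integer arithmetic.
// After step i the running value is C(n-k+i, i), so each division is exact.
func binomial(n, k int) int {
	c := 1
	for i := 1; i <= k; i++ {
		c = c * (n - k + i) / i
	}
	return c
}

// minTwoSidedP returns the smallest two-sided p-value an exact
// Mann-Whitney U-test can produce with n samples per side and no ties
// (the case where every sample on one side beats every sample on the other).
func minTwoSidedP(n int) float64 {
	return 2 / float64(binomial(2*n, n))
}

func main() {
	for _, n := range []int{1, 3, 4, 6} {
		fmt.Printf("n=%d  min p=%.3f\n", n, minTwoSidedP(n))
	}
	// Output:
	// n=1  min p=1.000
	// n=3  min p=0.100
	// n=4  min p=0.029
	// n=6  min p=0.002
}
```

This lines up with the notes in the output: a single run can never do better than p=1.000, four runs per side is the first count that can get under 0.05, and six fully separated runs per side gives p=0.002.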

Robert Engels

Apr 25, 2024, 1:25:52 PM
to Steven Hartland, golang-nuts
Thanks. I noticed those but didn’t look into how to address. Appreciate it. I will rerun.

Robert Engels

Apr 25, 2024, 1:54:42 PM
to Steven Hartland, golang-nuts
There is a pretty significant degradation in AddFixed() which may be concerning to the Go team, because the code of AddFixed is simply:

// Add adds f0 to f producing a Fixed. If either operand is NaN, NaN is returned
func (f Fixed) Add(f0 Fixed) Fixed {
    if f.IsNaN() || f0.IsNaN() {
        return NaN
    }
    return Fixed{fp: f.fp + f0.fp}
}

Here is the combined output:

                 │ go1.12.17.txt │             go1.21.5.txt             │             go1.22.2.txt             │
                 │    sec/op     │    sec/op        vs base             │    sec/op        vs base             │
AddFixed-8         0.6000n ±  2%   0.9593n ±  1%  +59.89% (p=0.002 n=6)   0.8012n ± 12%  +33.53% (p=0.002 n=6)
AddDecimal-8       246.00n ±  1%    66.47n ± 14%  -72.98% (p=0.002 n=6)    66.23n ±  1%  -73.08% (p=0.002 n=6)
AddBigInt-8        14.400n ±  1%    9.560n ±  2%  -33.61% (p=0.002 n=6)    9.525n ±  7%  -33.85% (p=0.002 n=6)
AddBigFloat-8       79.90n ±  3%    63.09n ±  0%  -21.03% (p=0.002 n=6)    66.20n ±  1%  -17.15% (p=0.002 n=6)
MulFixed-8          4.950n ±  3%    3.512n ±  0%  -29.04% (p=0.002 n=6)    3.809n ±  2%  -23.06% (p=0.002 n=6)
MulDecimal-8        73.45n ±  3%    65.90n ±  0%  -10.29% (p=0.002 n=6)    67.20n ±  1%   -8.52% (p=0.002 n=6)
MulBigInt-8         17.45n ±  1%    10.38n ±  2%  -40.52% (p=0.002 n=6)    10.43n ±  1%  -40.23% (p=0.002 n=6)
MulBigFloat-8       36.00n ±  2%    23.85n ±  1%  -33.75% (p=0.002 n=6)    24.00n ±  1%  -33.35% (p=0.002 n=6)
DivFixed-8          4.700n ±  1%    3.689n ±  1%  -21.51% (p=0.002 n=6)    3.695n ±  2%  -21.39% (p=0.002 n=6)
DivDecimal-8        767.0n ± 11%    462.9n ±  0%  -39.65% (p=0.002 n=6)    470.4n ±  4%  -38.68% (p=0.002 n=6)
DivBigInt-8         45.25n ±  1%    34.68n ± 10%  -23.36% (p=0.002 n=6)    34.98n ±  1%  -22.70% (p=0.002 n=6)
DivBigFloat-8       108.0n ±  1%    110.8n ±  0%   +2.64% (p=0.002 n=6)    113.6n ±  0%   +5.19% (p=0.002 n=6)
CmpFixed-8         0.3800n ±  3%   0.2500n ±  1%  -34.22% (p=0.002 n=6)   0.2511n ±  1%  -33.92% (p=0.002 n=6)
CmpDecimal-8        7.925n ±  1%    6.942n ±  1%  -12.40% (p=0.002 n=6)    6.503n ±  1%  -17.94% (p=0.002 n=6)
CmpBigInt-8         5.800n ±  0%    4.795n ±  2%  -17.32% (p=0.002 n=6)    4.807n ±  1%  -17.12% (p=0.002 n=6)
CmpBigFloat-8       5.310n ±  2%    4.417n ±  1%  -16.83% (p=0.002 n=6)    4.475n ±  9%  -15.73% (p=0.002 n=6)
StringFixed-8       57.10n ±  9%    50.40n ±  1%  -11.73% (p=0.002 n=6)    50.70n ±  1%  -11.22% (p=0.002 n=6)
StringNFixed-8      55.60n ±  0%    51.41n ± 15%        ~ (p=0.061 n=6)    49.78n ±  1%  -10.48% (p=0.002 n=6)
StringDecimal-8     216.0n ±  2%    215.2n ± 21%        ~ (p=1.000 n=6)    197.2n ±  0%   -8.68% (p=0.002 n=6)
StringBigInt-8     121.00n ±  1%    98.81n ±  1%  -18.33% (p=0.002 n=6)    98.61n ±  4%  -18.50% (p=0.002 n=6)
StringBigFloat-8    413.0n ±  3%    387.6n ±  1%   -6.15% (p=0.002 n=6)    408.4n ±  2%   -1.10% (p=0.026 n=6)
WriteTo-8           37.15n ± 15%    26.14n ± 45%  -29.65% (p=0.041 n=6)    26.40n ± 40%  -28.94% (p=0.015 n=6)
geomean             28.20n          21.86n        -22.49%                  21.79n        -22.76%

                 │ go1.12.17.txt │             go1.21.5.txt              │             go1.22.2.txt              │
                 │     B/op      │     B/op       vs base                │     B/op       vs base                │
AddFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddDecimal-8        176.00 ±  0%    80.00 ±  0%  -54.55% (p=0.002 n=6)      80.00 ±  0%  -54.55% (p=0.002 n=6)
AddBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddBigFloat-8        48.00 ±  0%    48.00 ±  0%        ~ (p=1.000 n=6) ¹    48.00 ±  0%        ~ (p=1.000 n=6) ¹
MulFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulDecimal-8         80.00 ±  0%    80.00 ±  0%        ~ (p=1.000 n=6) ¹    80.00 ±  0%        ~ (p=1.000 n=6) ¹
MulBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigFloat-8        0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivDecimal-8         568.0 ±  0%    384.0 ±  0%  -32.39% (p=0.002 n=6)      384.0 ±  0%  -32.39% (p=0.002 n=6)
DivBigInt-8          8.000 ±  0%    8.000 ±  0%        ~ (p=1.000 n=6) ¹    8.000 ±  0%        ~ (p=1.000 n=6) ¹
DivBigFloat-8        24.00 ±  0%    24.00 ±  0%        ~ (p=1.000 n=6) ¹    24.00 ±  0%        ~ (p=1.000 n=6) ¹
CmpFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpDecimal-8         0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigFloat-8        0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
StringFixed-8        32.00 ±  0%    24.00 ±  0%  -25.00% (p=0.002 n=6)      24.00 ±  0%  -25.00% (p=0.002 n=6)
StringNFixed-8       32.00 ±  0%    24.00 ±  0%  -25.00% (p=0.002 n=6)      24.00 ±  0%  -25.00% (p=0.002 n=6)
StringDecimal-8      64.00 ±  0%    56.00 ±  0%  -12.50% (p=0.002 n=6)      56.00 ±  0%  -12.50% (p=0.002 n=6)
StringBigInt-8       24.00 ±  0%    16.00 ±  0%  -33.33% (p=0.002 n=6)      16.00 ±  0%  -33.33% (p=0.002 n=6)
StringBigFloat-8     192.0 ±  0%    176.0 ±  0%   -8.33% (p=0.002 n=6)      176.0 ±  0%   -8.33% (p=0.002 n=6)
WriteTo-8            21.00 ± 14%    23.00 ± 13%   +9.52% (p=0.002 n=6)      23.00 ± 13%   +9.52% (p=0.002 n=6)
geomean                        ²                  -9.89%               ²                  -9.89%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                 │ go1.12.17.txt │             go1.21.5.txt              │             go1.22.2.txt              │
                 │   allocs/op   │   allocs/op    vs base                │   allocs/op    vs base                │
AddFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddDecimal-8         8.000 ±  0%    2.000 ±  0%  -75.00% (p=0.002 n=6)      2.000 ±  0%  -75.00% (p=0.002 n=6)
AddBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddBigFloat-8        1.000 ±  0%    1.000 ±  0%        ~ (p=1.000 n=6) ¹    1.000 ±  0%        ~ (p=1.000 n=6) ¹
MulFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulDecimal-8         2.000 ±  0%    2.000 ±  0%        ~ (p=1.000 n=6) ¹    2.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigFloat-8        0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivDecimal-8         21.00 ±  0%    12.00 ±  0%  -42.86% (p=0.002 n=6)      12.00 ±  0%  -42.86% (p=0.002 n=6)
DivBigInt-8          1.000 ±  0%    1.000 ±  0%        ~ (p=1.000 n=6) ¹    1.000 ±  0%        ~ (p=1.000 n=6) ¹
DivBigFloat-8        2.000 ±  0%    2.000 ±  0%        ~ (p=1.000 n=6) ¹    2.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpFixed-8           0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpDecimal-8         0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigInt-8          0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigFloat-8        0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
StringFixed-8        1.000 ±  0%    1.000 ±  0%        ~ (p=1.000 n=6) ¹    1.000 ±  0%        ~ (p=1.000 n=6) ¹
StringNFixed-8       1.000 ±  0%    1.000 ±  0%        ~ (p=1.000 n=6) ¹    1.000 ±  0%        ~ (p=1.000 n=6) ¹
StringDecimal-8      5.000 ±  0%    4.000 ±  0%  -20.00% (p=0.002 n=6)      4.000 ±  0%  -20.00% (p=0.002 n=6)
StringBigInt-8       2.000 ±  0%    1.000 ±  0%  -50.00% (p=0.002 n=6)      1.000 ±  0%  -50.00% (p=0.002 n=6)
StringBigFloat-8     8.000 ±  0%    7.000 ±  0%  -12.50% (p=0.002 n=6)      7.000 ±  0%  -12.50% (p=0.002 n=6)
WriteTo-8            0.000 ±  0%    0.000 ±  0%        ~ (p=1.000 n=6) ¹    0.000 ±  0%        ~ (p=1.000 n=6) ¹
geomean                        ²                 -12.73%               ²                 -12.73%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

Keith Randall

Apr 26, 2024, 1:17:14 AM
to golang-nuts
> There is a pretty significant degradation in AddFixed() which may be concerning to the Go team

What is the benchmark for this?
I am usually suspicious of sub-nanosecond benchmark times. Generally that indicates that the benchmark was completely optimized away and all you are measuring is an empty loop.
Hard to know for sure without looking at the generated code for BenchmarkAddFixed.
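
To make that concern concrete, here is a minimal sketch of the usual guard: carry the result from one iteration into the next and store the final value in a package-level variable, so the compiler cannot discard the work. The `Fixed` type below is a hypothetical one-field stand-in, `nanValue`, `sink`, `benchDiscard` and `benchSink` are names invented for this example, and none of it is the actual BenchmarkAddFixed code.

```go
package main

import (
	"fmt"
	"testing"
)

// Fixed is a hypothetical stand-in for the library type: one int64 field.
type Fixed struct{ fp int64 }

// nanValue is an assumed sentinel (max int64), chosen for illustration.
const nanValue = int64(1<<63 - 1)

func (f Fixed) IsNaN() bool { return f.fp == nanValue }

func (f Fixed) Add(f0 Fixed) Fixed {
	if f.IsNaN() || f0.IsNaN() {
		return Fixed{fp: nanValue}
	}
	return Fixed{fp: f.fp + f0.fp}
}

// sink is package-level, so the final store to it cannot be discarded.
var sink Fixed

// benchDiscard ignores the result; once Add is inlined, the compiler is
// free to drop the addition and leave little more than the loop itself.
func benchDiscard(b *testing.B) {
	f0, f1 := Fixed{fp: 1}, Fixed{fp: 2}
	for i := 0; i < b.N; i++ {
		f0.Add(f1)
	}
}

// benchSink feeds each result into the next call and stores the last one.
func benchSink(b *testing.B) {
	one := Fixed{fp: 1}
	var r Fixed
	for i := 0; i < b.N; i++ {
		r = r.Add(one)
	}
	sink = r
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	fmt.Println("discarded:", testing.Benchmark(benchDiscard))
	fmt.Println("sink:     ", testing.Benchmark(benchSink))
}
```

If the two variants report clearly different ns/op on a given toolchain, that is a hint the discarded version was not measuring the addition. The numbers themselves depend on the machine and Go version, so none are shown here.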

Robert Engels

Apr 26, 2024, 7:21:18 AM
to Keith Randall, golang-nuts
I agree but in this case it is very consistent. Even if that were the case, wouldn’t that mean that 1.12 had better optimization in this case?

I will dig in today and report back with the generated code. 


robert engels

Apr 26, 2024, 8:44:01 AM
to Keith Randall, golang-nuts
There seems to be a material difference in the generated code.

The function is:

func (f Fixed) Add(f0 Fixed) Fixed {
    if f.IsNaN() || f0.IsNaN() {
        return NaN
    }
    return Fixed{fp: f.fp + f0.fp}
}

In 1.12 it appears to place the receiver and the argument in the AX and BX registers as a calling convention, but in 1.21 these are passed on the stack.

1.12 assembly:

github.com/robaho/fixed.Fixed.Add STEXT nosplit size=33 args=0x10 locals=0x0 funcid=0x0 align=0x0
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) TEXT github.com/robaho/fixed.Fixed.Add(SB), NOSPLIT|NOFRAME|ABIInternal, $0-16
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $5, github.com/robaho/fixed.Fixed.Add.arginfo1(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $6, github.com/robaho/fixed.Fixed.Add.argliveinfo(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) PCDATA $3, $1
0x0000 00000 (<unknown line number>) NOP
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) XCHGL AX, AX
0x0001 00001 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) MOVQ $9223372036854775807, CX
0x000b 00011 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) CMPQ AX, CX
0x000e 00014 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) JEQ 21
0x0010 00016 (<unknown line number>) NOP
0x0010 00016 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) CMPQ BX, CX
0x0013 00019 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) JNE 29
0x0015 00021 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:174) MOVQ github.com/robaho/fixed.NaN(SB), AX
0x001c 00028 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:174) RET
0x001d 00029 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:176) ADDQ BX, AX
0x0020 00032 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:176) RET
0x0000 90 48 b9 ff ff ff ff ff ff ff 7f 48 39 c8 74 05  .H.........H9.t.
0x0010 48 39 cb 75 08 48 8b 05 00 00 00 00 c3 48 01 d8  H9.u.H.......H..
0x0020 c3                                               .

1.21.5 assembly:

"".Fixed.Add STEXT nosplit size=54 args=0x18 locals=0x0
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) TEXT "".Fixed.Add(SB), NOSPLIT|ABIInternal, $0-24
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:172) FUNCDATA $3, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB)
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) PCDATA $2, $0
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) PCDATA $0, $0
0x0000 00000 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) XCHGL AX, AX
0x0001 00001 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) MOVQ "".f+8(SP), AX
0x0006 00006 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) MOVQ $9223372036854775807, CX
0x0010 00016 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) CMPQ AX, CX
0x0013 00019 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) JNE 34
0x0015 00021 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:174) MOVQ "".NaN(SB), AX
0x001c 00028 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:174) MOVQ AX, "".~r1+24(SP)
0x0021 00033 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:174) RET
0x0022 00034 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) XCHGL AX, AX
0x0023 00035 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) MOVQ "".f0+16(SP), DX
0x0028 00040 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:144) CMPQ DX, CX
0x002b 00043 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:173) JEQ 21
0x002d 00045 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:176) ADDQ DX, AX
0x0030 00048 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:176) MOVQ AX, "".~r1+24(SP)
0x0035 00053 (/Users/robertengels/go/src/github.com/robaho/fixed/fixed.go:176) RET
0x0000 90 48 8b 44 24 08 48 b9 ff ff ff ff ff ff ff 7f  .H.D$.H.........
0x0010 48 39 c8 75 0d 48 8b 05 00 00 00 00 48 89 44 24  H9.u.H......H.D$
0x0020 18 c3 90 48 8b 54 24 10 48 39 ca 74 e8 48 01 d0  ...H.T$.H9.t.H..
0x0030 48 89 44 24 18 c3                                H.D$..
rel 24+4 t=15 "".NaN+0



Keith Randall

Apr 26, 2024, 11:02:48 PM
to golang-nuts
Isn't that assembly exactly the opposite? The code that is passing in registers should be 1.21, the one passing on the stack would be 1.12. At least, that's what should be the case with the register ABI that launched in 1.17 (for amd64).
Please post a full program and a full command line you're using, so it is immediately obvious how one would reproduce.

Also, the assembly of Fixed.Add isn't the thing I'm curious about. It is the assembly of BenchmarkAddFixed. Fixed.Add will probably be inlined into that function, and possibly optimized away.
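
One way to separate the cost of the call from the cost of the arithmetic is to benchmark the method both as written (where it can be inlined) and through a wrapper marked `//go:noinline`. The sketch below assumes a simplified one-field `Fixed` without the NaN check; `addNoInline`, `benchInlined` and `benchCall` are hypothetical names, not part of the library.

```go
package main

import (
	"fmt"
	"testing"
)

// Fixed is a simplified, hypothetical stand-in for the library type.
type Fixed struct{ fp int64 }

// Add is small enough that the compiler will normally inline it.
func (f Fixed) Add(f0 Fixed) Fixed { return Fixed{fp: f.fp + f0.fp} }

// addNoInline forces a real function call so that argument passing
// (registers or stack, depending on the toolchain) is part of the timing.
//
//go:noinline
func addNoInline(a, b Fixed) Fixed { return a.Add(b) }

// sink keeps the final result alive.
var sink Fixed

func benchInlined(b *testing.B) {
	one := Fixed{fp: 1}
	var r Fixed
	for i := 0; i < b.N; i++ {
		r = r.Add(one)
	}
	sink = r
}

func benchCall(b *testing.B) {
	one := Fixed{fp: 1}
	var r Fixed
	for i := 0; i < b.N; i++ {
		r = addNoInline(r, one)
	}
	sink = r
}

func main() {
	fmt.Println("inlined:  ", testing.Benchmark(benchInlined))
	fmt.Println("noinline: ", testing.Benchmark(benchCall))
}
```

Building with `-gcflags=-m` prints the compiler's inlining decisions, and `-gcflags=-S` prints the generated assembly, which is what would show whether the benchmark loop still contains the add.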

Robert Engels

Apr 26, 2024, 11:33:00 PM
to Keith Randall, golang-nuts
Why would it be optimized away in 1.12 and not optimized away in 1.21?

I could have made a mistake in my recording so I’ll test it again tomorrow. 

robert engels

Apr 27, 2024, 12:12:36 AM
to Keith Randall, golang-nuts
Apologies. The assembly files were reversed. But the timings remain the same.

I will get the assembly of the compiled tests tomorrow. But if these differ, that suggests the test harness has changed between versions as well, meaning we can't compare benchmark performance across versions?

That is under the assumption that register passing should be more efficient than stack passing.
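
One way to isolate the cost of argument passing is to keep the function out of line, so that a real call remains in the loop. The sketch below is only an illustration under assumptions: addInline, addCall and sink are hypothetical names, and Fixed is a simplified stand-in. The //go:noinline directive is a standard compiler directive that prevents inlining of the function that follows it.

```go
package main

import (
	"fmt"
	"testing"
)

// Fixed is a simplified stand-in for the library type (assumption).
type Fixed struct{ fp int64 }

// addInline is small enough that the compiler will normally inline it,
// so no call (and no argument passing) is left in the loop.
func addInline(a, b Fixed) Fixed { return Fixed{fp: a.fp + b.fp} }

// addCall does the same work but is kept out of line, so every use is a
// real call and the calling convention (stack or registers) matters.
//
//go:noinline
func addCall(a, b Fixed) Fixed { return Fixed{fp: a.fp + b.fp} }

// sink keeps the final result alive.
var sink Fixed

func main() {
	testing.Init() // needed when calling testing.Benchmark outside "go test"
	one := Fixed{fp: 1}

	inlined := testing.Benchmark(func(b *testing.B) {
		acc := Fixed{}
		for i := 0; i < b.N; i++ {
			acc = addInline(acc, one)
		}
		sink = acc
	})
	called := testing.Benchmark(func(b *testing.B) {
		acc := Fixed{}
		for i := 0; i < b.N; i++ {
			acc = addCall(acc, one)
		}
		sink = acc
	})

	fmt.Println("inlined:    ", inlined)
	fmt.Println("out of line:", called)
}
```

If the method is inlined in both Go versions, the calling convention no longer shows up in the hot loop, and any difference has to come from how the loop body itself is compiled.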

Steven Hartland

Apr 27, 2024, 11:32:05 AM
to robert engels, Keith Randall, golang-nuts
Do you have the test code for that specific test?

That would allow others to have a look at it and also confirm if the test is somehow optimising the function call away.

Robert Engels

Apr 27, 2024, 4:31:14 PM
to Steven Hartland, Keith Randall, golang-nuts

robert engels

Apr 28, 2024, 11:25:57 AM
to Keith Randall, golang-nuts
I modified the benchmarks to ensure the Add test was not being optimized away, and the results are the same. I understand that these are micro-benchmarks, but I think the result would apply to any numerically intensive application.

                │ go1.12.17.txt │             go1.21.5.txt             │
                │    sec/op     │    sec/op       vs base              │
AddFixed-8        0.5850n ±  1%   0.9707n ±  3%  +65.92% (p=0.002 n=6)
AddDecimal-8      243.50n ±  3%    65.26n ±  1%  -73.20% (p=0.002 n=6)
AddBigInt-8       14.400n ±  2%    9.525n ± 13%  -33.85% (p=0.002 n=6)
AddBigFloat-8      81.80n ± 10%    63.04n ±  0%  -22.93% (p=0.002 n=6)
MulFixed-8         4.990n ±  4%    3.519n ±  1%  -29.48% (p=0.002 n=6)
MulDecimal-8       72.35n ±  2%    65.86n ±  0%   -8.97% (p=0.002 n=6)
MulBigInt-8        17.10n ±  1%    10.42n ±  0%  -39.09% (p=0.002 n=6)
MulBigFloat-8      35.65n ±  1%    23.89n ±  0%  -32.99% (p=0.002 n=6)
DivFixed-8         4.715n ±  1%    3.697n ±  6%  -21.59% (p=0.002 n=6)
DivDecimal-8       755.5n ±  5%    463.0n ±  3%  -38.72% (p=0.002 n=6)
DivBigInt-8        45.55n ±  5%    34.66n ±  1%  -23.92% (p=0.002 n=6)
DivBigFloat-8      107.5n ±  8%    111.6n ±  5%   +3.81% (p=0.030 n=6)
CmpFixed-8        0.3850n ±  1%   0.2503n ±  0%  -34.99% (p=0.002 n=6)
CmpDecimal-8       8.085n ± 30%    7.647n ±  1%   -5.42% (p=0.002 n=6)
CmpBigInt-8        5.410n ±  1%    4.882n ± 38%        ~ (p=0.058 n=6)
CmpBigFloat-8      6.520n ±  1%    4.675n ±  1%  -28.30% (p=0.002 n=6)
StringFixed-8      56.35n ±  0%    50.30n ±  0%  -10.75% (p=0.002 n=6)
StringNFixed-8     54.90n ± 32%    50.67n ±  1%   -7.70% (p=0.002 n=6)
StringDecimal-8    254.0n ± 11%    194.1n ±  1%  -23.60% (p=0.002 n=6)
StringBigInt-8    125.00n ±  5%    97.50n ±  1%  -22.00% (p=0.002 n=6)
StringBigFloat-8   412.0n ±  5%    390.7n ±  7%   -5.17% (p=0.024 n=6)
WriteTo-8          40.35n ± 37%    26.88n ± 38%  -33.40% (p=0.009 n=6)
geomean            28.67n          21.94n        -23.47%

                │ go1.12.17.txt │             go1.21.5.txt             │
                │     B/op      │    B/op        vs base               │
AddFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddDecimal-8      176.00 ±  0%     80.00 ±  0%  -54.55% (p=0.002 n=6)
AddBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddBigFloat-8      48.00 ±  0%     48.00 ±  0%        ~ (p=1.000 n=6) ¹
MulFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulDecimal-8       80.00 ±  0%     80.00 ±  0%        ~ (p=1.000 n=6) ¹
MulBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigFloat-8      0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivDecimal-8       568.0 ±  0%     384.0 ±  0%  -32.39% (p=0.002 n=6)
DivBigInt-8        8.000 ±  0%     8.000 ±  0%        ~ (p=1.000 n=6) ¹
DivBigFloat-8      24.00 ±  0%     24.00 ±  0%        ~ (p=1.000 n=6) ¹
CmpFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpDecimal-8       0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigFloat-8      0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
StringFixed-8      32.00 ±  0%     24.00 ±  0%  -25.00% (p=0.002 n=6)
StringNFixed-8     32.00 ±  0%     24.00 ±  0%  -25.00% (p=0.002 n=6)
StringDecimal-8    64.00 ±  0%     56.00 ±  0%  -12.50% (p=0.002 n=6)
StringBigInt-8     24.00 ±  0%     16.00 ±  0%  -33.33% (p=0.002 n=6)
StringBigFloat-8   192.0 ±  0%     176.0 ±  0%   -8.33% (p=0.002 n=6)
WriteTo-8          18.00 ± 17%     23.00 ± 17%  +27.78% (p=0.002 n=6)
geomean                       ²                  -9.25%               ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                │ go1.12.17.txt │             go1.21.5.txt             │
                │   allocs/op   │  allocs/op     vs base               │
AddFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddDecimal-8       8.000 ±  0%     2.000 ±  0%  -75.00% (p=0.002 n=6)
AddBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
AddBigFloat-8      1.000 ±  0%     1.000 ±  0%        ~ (p=1.000 n=6) ¹
MulFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulDecimal-8       2.000 ±  0%     2.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
MulBigFloat-8      0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
DivDecimal-8       21.00 ±  0%     12.00 ±  0%  -42.86% (p=0.002 n=6)
DivBigInt-8        1.000 ±  0%     1.000 ±  0%        ~ (p=1.000 n=6) ¹
DivBigFloat-8      2.000 ±  0%     2.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpFixed-8         0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpDecimal-8       0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigInt-8        0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
CmpBigFloat-8      0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
StringFixed-8      1.000 ±  0%     1.000 ±  0%        ~ (p=1.000 n=6) ¹
StringNFixed-8     1.000 ±  0%     1.000 ±  0%        ~ (p=1.000 n=6) ¹
StringDecimal-8    5.000 ±  0%     4.000 ±  0%  -20.00% (p=0.002 n=6)
StringBigInt-8     2.000 ±  0%     1.000 ±  0%  -50.00% (p=0.002 n=6)
StringBigFloat-8   8.000 ±  0%     7.000 ±  0%  -12.50% (p=0.002 n=6)
WriteTo-8          0.000 ±  0%     0.000 ±  0%        ~ (p=1.000 n=6) ¹
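
For readers checking the "vs base" column by hand, the sketch below recomputes two of the sec/op deltas as a plain relative change between the two medians. This is an assumption about how the column is derived, and pctChange is a hypothetical helper, not part of any tool. Because the medians shown in the table are already rounded, the recomputed values can differ from the printed ones by a few hundredths of a percent.

```go
package main

import "fmt"

// pctChange returns the relative change from base to cur, in percent.
// A positive value means cur is larger (slower, for sec/op).
func pctChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// Medians in nanoseconds, copied from the sec/op table
	// (go1.12.17 as base, go1.21.5 as the new value).
	fmt.Printf("AddFixed:   %+.2f%%\n", pctChange(0.5850, 0.9707))
	fmt.Printf("AddDecimal: %+.2f%%\n", pctChange(243.50, 65.26))
}
```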

robert engels

Apr 28, 2024, 11:34:58 AM
to Keith Randall, golang-nuts
I figured out the code that causes the performance degradation in 1.21. I created a self-contained test below.

If you change the code in Add() below to use the commented out line instead of creating the new Fixed struct, it is 30% slower in Go 1.21 than in Go 1.12

package main

import "testing"

type Fixed struct { fp int64 }

const NaN = int64((int64((1 << 63) -1)))

var NaNF = Fixed{fp: NaN}

func (a Fixed) Add(b Fixed) Fixed {
    if(a.IsNaN() || b.IsNaN()) {
        // return NaNF
        return Fixed{fp:NaN}
    }
    return Fixed{ fp: (a.fp + b.fp) }
}

func (a Fixed) IsNaN() bool {
    return a.fp == NaN
}

var Result Fixed

func BenchmarkAdd(b *testing.B) {

    var f1 Fixed = Fixed{fp : 1}
    var f2 Fixed = Fixed{fp : 1}
    for i := 0; i < b.N; i++ {
        f2 = f2.Add(f1)
    }

    Result = f2
}

robert engels

Apr 28, 2024, 11:36:15 AM
to Keith Randall, golang-nuts
One other interesting note, if I change Fixed to

type Fixed int64

and change the code accordingly, then Go 1.21 is roughly 50% faster than Go 1.12

Ah, micro-benchmarks - got to love them.
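
For anyone who wants to try that variant, here is a guess at what "change the code accordingly" could look like. It is a reconstruction, not the exact code that was measured: the method bodies and the benchmarkAdd name are assumptions, and the benchmark is driven through testing.Benchmark so the file runs with "go run".

```go
package main

import (
	"fmt"
	"testing"
)

// Fixed is a named int64 instead of a one-field struct.
type Fixed int64

// NaN keeps the same sentinel value as before: the largest int64.
const NaN = Fixed(1<<63 - 1)

// IsNaN reports whether a is the sentinel value.
func (a Fixed) IsNaN() bool { return a == NaN }

// Add returns NaN if either operand is NaN, otherwise the plain sum.
func (a Fixed) Add(b Fixed) Fixed {
	if a.IsNaN() || b.IsNaN() {
		return NaN
	}
	return a + b
}

// Result keeps the final value alive so the loop is not removed.
var Result Fixed

// benchmarkAdd has the same loop shape as BenchmarkAdd in the struct version.
func benchmarkAdd(b *testing.B) {
	f1, f2 := Fixed(1), Fixed(1)
	for i := 0; i < b.N; i++ {
		f2 = f2.Add(f1)
	}
	Result = f2
}

func main() {
	testing.Init() // needed when calling testing.Benchmark outside "go test"
	fmt.Println(testing.Benchmark(benchmarkAdd))
}
```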

robert engels

Apr 29, 2024, 1:30:17 PM
to Keith Randall, golang-nuts
I figured out the code that causes the performance degradation in 1.21. I created a self-contained test below.

If you change the code in Add() below to use the commented out line instead of creating the new Fixed struct, it is 30% slower in Go 1.21 than in Go 1.12

package main

import "testing"

type Fixed struct { fp int64 }

const NaN = int64((int64((1 << 63) -1)))

var NaNF = Fixed{fp: NaN}

func (a Fixed) Add(b Fixed) Fixed {
    if(a.IsNaN() || b.IsNaN()) {
        // return NaNF
        return Fixed{fp:NaN}
    }
    return Fixed{ fp: (a.fp + b.fp) }
}

func (a Fixed) IsNaN() bool {
    return a.fp == NaN
}

var Result Fixed

func BenchmarkAdd(b *testing.B) {

    var f1 Fixed = Fixed{fp : 1}
    var f2 Fixed = Fixed{fp : 1}
    for i := 0; i < b.N; i++ {
        f2 = f2.Add(f1)
    }

    Result = f2
}

Keith Randall

Apr 30, 2024, 1:23:48 PM
to golang-nuts
The inner loop of BenchmarkAdd changed a bit between 1.20 and 1.21. It is marginally slower (6% or so on my machine). It is the exact same set of instructions, just in a different order.

fast, 1.20.6:

  tmp2_test.go:16 0x4f8409 488d5a01 LEAQ 0x1(DX), BX
  tmp2_test.go:20 0x4f840d 48beffffffffffffff7f MOVQ $0x7fffffffffffffff, SI
  tmp2_test.go:20 0x4f8417 4839f2 CMPQ DX, SI
  tmp2_test.go:30 0x4f841a 480f44de CMOVE SI, BX
  tmp2_test.go:29 0x4f841e 48ffc1 INCQ CX
  tmp2_test.go:12 0x4f8421 90 NOPL
  tmp2_test.go:33 0x4f8422 4889da MOVQ BX, DX
  tmp2_test.go:29 0x4f8425 483988a0010000 CMPQ 0x1a0(AX), CX
  tmp2_test.go:29 0x4f842c 7fdb JG 0x4f8409

slow, 1.21.6:

  tmp2_test.go:16 0x4f9769 488d5a01 LEAQ 0x1(DX), BX
  tmp2_test.go:29 0x4f976d 48ffc1 INCQ CX
  tmp2_test.go:12 0x4f9770 90 NOPL
  tmp2_test.go:20 0x4f9771 48beffffffffffffff7f MOVQ $0x7fffffffffffffff, SI
  tmp2_test.go:20 0x4f977b 4839f2 CMPQ DX, SI
  tmp2_test.go:30 0x4f977e 480f44de CMOVE SI, BX
  tmp2_test.go:33 0x4f9782 4889da MOVQ BX, DX
  tmp2_test.go:29 0x4f9785 483988a0010000 CMPQ 0x1a0(AX), CX
  tmp2_test.go:29 0x4f978c 7fdb JG 0x4f9769

Probably caused by https://go-review.googlesource.com/c/go/+/270940 or one of its follow-ons.
I'm not sure what, if anything, we should do here. Certainly the first step would be to figure out why the second inner loop is slower.
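
As a reading aid, the two listings can be mapped back to Go roughly as follows. This is a hand-written interpretation of the assembly, not compiler output: loopBody is a hypothetical name, and the register comments are a best-effort match. Add and IsNaN have been inlined, the NaN check on the constant operand has been folded away, and the branch has become a conditional move.

```go
package main

import "fmt"

// loopBody mirrors what the inlined benchmark loop appears to compute.
// Both listings contain these same operations; only their order differs.
func loopBody(n int) int64 {
	const nan = 1<<63 - 1 // MOVQ $0x7fffffffffffffff, SI
	f2 := int64(1)        // DX holds f2 across iterations
	for i := 0; i < n; i++ { // INCQ CX; CMPQ 0x1a0(AX), CX; JG
		sum := f2 + 1 // LEAQ 0x1(DX), BX
		if f2 == nan { // CMPQ DX, SI
			sum = nan // CMOVE SI, BX (no branch is taken)
		}
		f2 = sum // MOVQ BX, DX
	}
	return f2
}

func main() {
	fmt.Println(loopBody(3))
}
```

Each iteration depends on the previous value of f2, so the loop is bound by that dependency chain, which may be why a pure reordering of the same instructions can still shift the timing slightly.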
