linux/arm64 performance improvements


Dave Cheney

Apr 16, 2015, 3:02:56 AM
to golang-dev
Hello,

Aram and I spent last week together working on the performance
of the arm64 port. We also benefited from Minux's peephole optimisation
work, and from the improvements Josh has been making to internal/gc.

The results below compare a build at an April 06 revision with a build from this morning.

rugby(~/go/src/bytes) % benchcmp {old,new}.txt
benchmark                               old ns/op   new ns/op   delta
BenchmarkCompareBytesEqual              133         27.1        -79.62%
BenchmarkCompareBytesToNil              17.9        10.0        -44.13%
BenchmarkCompareBytesEmpty              17.5        10.0        -42.86%
BenchmarkCompareBytesIdentical          133         27.3        -79.47%
BenchmarkCompareBytesSameLength         60.4        16.7        -72.35%
BenchmarkCompareBytesDifferentLength    60.4        16.3        -73.01%
BenchmarkCompareBytesBigUnaligned       9126257     1126176     -87.66%
BenchmarkCompareBytesBig                9067664     1120451     -87.64%
BenchmarkCompareBytesBigIdentical       8968409     1081455     -87.94%

benchmark                               old MB/s    new MB/s    speedup
BenchmarkCompareBytesBigUnaligned       114.90      931.10      8.10x
BenchmarkCompareBytesBig                115.64      935.86      8.09x
BenchmarkCompareBytesBigIdentical       116.92      969.61      8.29x
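
For anyone reading the columns above, here is a rough sketch of the arithmetic behind "delta" and "speedup". The helper names delta and speedup are made up for illustration and are not taken from benchcmp's source; the sketch assumes delta is the relative change in ns/op and speedup is the ratio of new to old MB/s, which matches the figures reported here.

```go
package main

import "fmt"

// delta returns the percentage change from oldV to newV.
// Negative values mean the new build takes less time per op.
// (Hypothetical helper, named for illustration only.)
func delta(oldV, newV float64) float64 {
	return (newV - oldV) / oldV * 100
}

// speedup returns how many times higher the new throughput is.
// (Hypothetical helper, named for illustration only.)
func speedup(oldV, newV float64) float64 {
	return newV / oldV
}

func main() {
	// BenchmarkCompareBytesEqual: 133 ns/op before, 27.1 ns/op after.
	fmt.Printf("%+.2f%%\n", delta(133, 27.1))

	// BenchmarkCompareBytesBigUnaligned: 114.90 MB/s before, 931.10 MB/s after.
	fmt.Printf("%.2fx\n", speedup(114.90, 931.10))
}
```

Feeding in the BenchmarkCompareBytesEqual and BenchmarkCompareBytesBigUnaligned rows should give back the -79.62% and 8.10x shown in the tables.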

-rwxr-xr-x 1 dfc warthogs 11M Apr 16 06:15 go1.golden # apr 06
-rwxr-xr-x 1 dfc warthogs 9.3M Apr 16 06:26 go1.test # apr 16

rugby(~/go/test/bench/go1) % benchcmp {old,new}.txt
benchmark                         old ns/op     new ns/op     delta
BenchmarkBinaryTree17             23690320000   15247745000   -35.64%
BenchmarkFannkuch11               17095248000   10852034000   -36.52%
BenchmarkFmtFprintfEmpty          292           182           -37.67%
BenchmarkFmtFprintfString         1017          723           -28.91%
BenchmarkFmtFprintfInt            986           696           -29.41%
BenchmarkFmtFprintfIntInt         1576          1158          -26.52%
BenchmarkFmtFprintfPrefixedInt    1330          994           -25.26%
BenchmarkFmtFprintfFloat          2302          1301          -43.48%
BenchmarkFmtManyArgs              5952          4246          -28.66%
BenchmarkGobDecode                54266840      34391478      -36.63%
BenchmarkGobEncode                37041604      27803960      -24.94%
BenchmarkGzip                     1881814300    1277625400    -32.11%
BenchmarkGunzip                   386336833     289727180     -25.01%
BenchmarkHTTPClientServer         234070        171968        -26.53%
BenchmarkJSONEncode               102261070     69771365      -31.77%
BenchmarkJSONDecode               343646420     232288720     -32.40%
BenchmarkMandelbrot200            28333376      12991695      -54.15%
BenchmarkGoParse                  21524154      15317058      -28.84%
BenchmarkRegexpMatchEasy0_32      558           456           -18.28%
BenchmarkRegexpMatchEasy0_1K      4687          4806          +2.54%
BenchmarkRegexpMatchEasy1_32      540           455           -15.74%
BenchmarkRegexpMatchEasy1_1K      5106          4949          -3.07%
BenchmarkRegexpMatchMedium_32     784           593           -24.36%
BenchmarkRegexpMatchMedium_1K     234059        170815        -27.02%
BenchmarkRegexpMatchHard_32       13161         9562          -27.35%
BenchmarkRegexpMatchHard_1K       401956        296410        -26.26%
BenchmarkRevcomp                  3822210333    2825194000    -26.08%
BenchmarkTemplate                 443013800     321078280     -27.52%
BenchmarkTimeParse                1467          992           -32.38%
BenchmarkTimeFormat               1619          1180          -27.12%

benchmark                         old MB/s    new MB/s    speedup
BenchmarkGobDecode                14.14       22.32       1.58x
BenchmarkGobEncode                20.72       27.61       1.33x
BenchmarkGzip                     10.31       15.19       1.47x
BenchmarkGunzip                   50.23       66.98       1.33x
BenchmarkJSONEncode               18.98       27.81       1.47x
BenchmarkJSONDecode               5.65        8.35        1.48x
BenchmarkGoParse                  2.69        3.78        1.41x
BenchmarkRegexpMatchEasy0_32      57.30       70.17       1.22x
BenchmarkRegexpMatchEasy0_1K      218.44      213.04      0.98x
BenchmarkRegexpMatchEasy1_32      59.17       70.24       1.19x
BenchmarkRegexpMatchEasy1_1K      200.53      206.88      1.03x
BenchmarkRegexpMatchMedium_32     1.27        1.69        1.33x
BenchmarkRegexpMatchMedium_1K     4.37        5.99        1.37x
BenchmarkRegexpMatchHard_32       2.43        3.35        1.38x
BenchmarkRegexpMatchHard_1K       2.55        3.45        1.35x
BenchmarkRevcomp                  66.50       89.96       1.35x
BenchmarkTemplate                 4.38        6.04        1.38x

Thanks

Dave