FMA (Fused Multiply Add) detector

199 views
Skip to first unread message

Nigel Tao

unread,
May 9, 2024, 12:57:18 AMMay 9
to golang-dev, d...@kortschak.io
https://golang.org/issue/67029 says that the golang.org/x/image/draw
package's BiLinear scaler (and maybe other scalers) have different
results depending on GOARCH. I assume that this is because of FMA
(Fused Multiply Add) kicking in on amd64 but not arm64 (or vice
versa).

What's the recommended approach for disabling FMA? There was a 2017
golang-dev discussion
(https://groups.google.com/g/golang-dev/c/Sti0bl2xUXQ) and CL (for
x/image/vector, not x/image/draw:
https://go-review.googlesource.com/c/image/+/61024). Is the
recommendation still the same, changing a bunch of "expr" expressions
to "float64(expr)" even though expr is nominally already a float64?

The x/image/draw package has a lot of generated code. Is there some
tooling (compiler debug-output flag??) where I can say "build the
x/image/draw package and tell me all the line numbers where FMA was
applied"?

Jorropo

unread,
May 9, 2024, 2:03:50 AMMay 9
to Nigel Tao, golang-dev, d...@kortschak.io
`git grep -F "FMA"` inside `src/cmd/compile/internal` eventually shows `a.Block.Func.useFMA(v)`
`git grep -F ") useFMA("` then shows https://github.com/golang/go/blob/49eedfb4d07ab6c0d62041ba722dfe81e73d92ce/src/cmd/compile/internal/ssa/func.go#L827-L837

which mentions:
> // useFMA allows targeted debugging w/ GOFMAHASH
> // If you have an architecture-dependent FP glitch, this will help you find it.

The easiest way to turn it off as you asked would be to follow `.UseFMA` but actually looking up refences with gopls shows it only disable FMA on p9 https://github.com/golang/go/blob/49eedfb4d07ab6c0d62041ba722dfe81e73d92ce/src/cmd/compile/internal/ssa/config.go#L371-L372 (you could add `|| true` to help in debugging).

Else searching for `git grep -Fi fmahash` shows a few results https://grep.app/search?q=FMAHASH then searching in the promisingly named `fmahash_test.go` file states https://github.com/golang/go/blob/49eedfb4d07ab6c0d62041ba722dfe81e73d92ce/src/cmd/compile/internal/ssa/fmahash_test.go#L37C46-L39

TL;DR: I think your want `GOCOMPILEDEBUG=fmahash=1/0`

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/CAOeFMNVv-7JkS%2BwM%3Dwia7AM3bDw7OYYnDZxteN1U62Ddkye5YA%40mail.gmail.com.

Keith Randall

unread,
May 9, 2024, 2:13:36 AMMay 9
to Nigel Tao, golang-dev, d...@kortschak.io
On Wed, May 8, 2024 at 9:57 PM Nigel Tao <nige...@golang.org> wrote:
https://golang.org/issue/67029 says that the golang.org/x/image/draw
package's BiLinear scaler (and maybe other scalers) have different
results depending on GOARCH. I assume that this is because of FMA
(Fused Multiply Add) kicking in on amd64 but not arm64 (or vice
versa).

What's the recommended approach for disabling FMA? There was a 2017
golang-dev discussion
(https://groups.google.com/g/golang-dev/c/Sti0bl2xUXQ) and CL (for
x/image/vector, not x/image/draw:
https://go-review.googlesource.com/c/image/+/61024). Is the
recommendation still the same, changing a bunch of "expr" expressions
to "float64(expr)" even though expr is nominally already a float64?


Yes, that is the way to disable FMA. Note for x+y*z, you need to do x+float64(y*z), not float64(x+y*z).
 
The x/image/draw package has a lot of generated code. Is there some
tooling (compiler debug-output flag??) where I can say "build the
x/image/draw package and tell me all the line numbers where FMA was
applied"?


We have a mode that will tell you where FMA was applied. It isn't terribly easy to use, but it exists.
Adding -gcflags=-d=fmahash=/ to a go build line will print something for every FMA it does.
There's some tooling at https://github.com/dr2chase/gossahash to do some more advanced things with that base functionality.

Russ Cox

unread,
May 10, 2024, 2:12:36 PMMay 10
to kei...@alum.mit.edu, Nigel Tao, golang-dev, d...@kortschak.io
Hi Nigel,

As Keith said, the compiler can help you a lot here. David's gossahash tool has been replaced by golang.org/x/tools/cmd/bisect

Running Dan's test on my arm64 mac:

% gh pr checkout 1
branch 'failure' set up to track 'origin/failure'.
Switched to a new branch 'failure' 
% bisect -compile fma go test
bisect: run: checking target with all changes disabled
bisect: run: GOCOMPILEDEBUG=fmahash=n go test... ok (1193 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=n go test... ok (1193 matches)
bisect: checking target with all changes enabled
bisect: run: GOCOMPILEDEBUG=fmahash=y go test... FAIL (1193 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=y go test... FAIL (1193 matches)
bisect: target succeeds with no changes, fails with all changes
bisect: searching for minimal set of enabled changes causing failure
bisect: run: GOCOMPILEDEBUG=fmahash=+0 go test... ok (618 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+0 go test... ok (618 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+1 go test... FAIL (575 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+1 go test... FAIL (575 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+01 go test... FAIL (282 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+01 go test... FAIL (282 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+001 go test... ok (132 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+001 go test... ok (132 matches)
...
bisect: FOUND failing change set
--- change set #1 (enabling changes causes failure)
../../pkg/mod/golang.org/x/im...@v0.15.0/draw/impl.go:6231:8
---
bisect: checking for more failures
bisect: run: GOCOMPILEDEBUG=fmahash=-x47dfdb5 go test... FAIL (1192 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=-x47dfdb5 go test... FAIL (1192 matches)
bisect: target still fails; searching for more bad changes
bisect: run: GOCOMPILEDEBUG=fmahash=+0-x47dfdb5 go test... ok (618 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+0-x47dfdb5 go test... ok (618 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=+1-x47dfdb5 go test... ok (574 matches)
...
bisect: FOUND failing change set
--- change set #2 (enabling changes causes failure)
../../pkg/mod/golang.org/x/im...@v0.15.0/draw/impl.go:6229:8
../../pkg/mod/golang.org/x/im...@v0.15.0/draw/impl.go:6230:8
---
bisect: checking for more failures
bisect: run: GOCOMPILEDEBUG=fmahash=-x8386cd4-xcdf7947-x47dfdb5 go test... ok (1190 matches)
bisect: run: GOCOMPILEDEBUG=fmahash=-x8386cd4-xcdf7947-x47dfdb5 go test... ok (1190 matches)
bisect: target succeeds with all remaining changes enabled


This output is saying that if you enable FMA usage on line 6231
that alone causes the test failure, and separately if you enable FMA
on both lines 6229 and 6230, that also causes the failure
(but either of those lines by itself is not sufficient to cause the failure).

And yes, writing float64(x*y) forces the float64 rounding after the 
multiply, which disables hoisting it into an FMA.
So we can add those to image/draw and that fixes the test:

% git clone https://go.googlesource.com/image
% go work init . image
% go test
...
FAIL
% ed image/draw/impl.go
269624
6231
                pb += p[2] * c.weight
s;p.2.*;float64(&)
                pb += float64(p[2] * c.weight)
6229
                pr += p[0] * c.weight
s;p.0.*;float64(&)
                pr += float64(p[0] * c.weight)
6230
                pg += p[1] * c.weight
s;p.1.*;float64(&)
                pg += float64(p[1] * c.weight)
w
269651
q
% go test
PASS
ok      github.com/kortschak/gifscaleissue    0.160s
% git -C image diff
diff --git a/draw/impl.go b/draw/impl.go
index 94ee826..f52ba20 100644
--- a/draw/impl.go
+++ b/draw/impl.go
@@ -6226,9 +6226,9 @@ func (z *kernelScaler) scaleY_RGBA64Image_Src(dst RGBA64Image, dr, adr image.Rec
             var pr, pg, pb, pa float64
             for _, c := range z.vertical.contribs[s.i:s.j] {
                 p := &tmp[c.coord*z.dw+dx]
-                pr += p[0] * c.weight
-                pg += p[1] * c.weight
-                pb += p[2] * c.weight
+                pr += float64(p[0] * c.weight)
+                pg += float64(p[1] * c.weight)
+                pb += float64(p[2] * c.weight)
                 pa += p[3] * c.weight
             }
 
%

Of course, there may be other multiplies (such as the one after the changed lines)
that might want to disable FMAs to keep other tests passing. They're just not
necessary to keep this test passing.

If you want a list of every FMA generated in the program, that's a subset of the
output of GOCOMPILEDEBUG=fmahash=vy go test.

Best,
Russ




Reply all
Reply to author
Forward
0 new messages