code optimization bug in s390x (Go 1.21.4) ?

309 views
Skip to first unread message

tim....@mongodb.com

unread,
Nov 28, 2023, 4:35:50 PM11/28/23
to golang-nuts
Hello,

I believe I've found a code-optimization bug in Go 1.21.4 on Linux s390x.  This is what I was able to narrow the code sample down to:

//////////
package main

import "fmt"

type myStruct struct {
A uint32
B uint32
}

func doOpOnStructElems(a, b uint32) uint64 {
return (uint64(a) << 32) | uint64(b)
}

func main() {
myVal := myStruct{0, 0xABCDEF12}

passAsMyStructAndThenDoOp(myVal)
passAsIfaceAndThenDoOp(myVal)
}

func passAsMyStructAndThenDoOp(myVal myStruct) {
fmt.Printf("%X\n", doOpOnStructElems(myVal.A, myVal.B))
}

func passAsIfaceAndThenDoOp(myIface interface{}) {
fmt.Printf("%X\n", doOpOnStructElems(myIface.(myStruct).A, myIface.(myStruct).B))
}

////////

When I run it I get:

/////

$ go run scratch_1.go 

ABCDEF12

ABCDEF12000000

//////

If I run it on Linux zSeries w/ Go 1.20.11 or on Linux AMD64, Linux ARM64, or macOS w/ Go 1.21.4, I get what I believe is the correct answer:

////

$ go run scratch_1.go 

ABCDEF12

ABCDEF12

////

In other words, with Go 1.21.4 and s390x it appears that A & B are switched in the 2nd call which passes the struct as an interface{} .

s390x is the only big endian platform I am able to test on.  So I suspect there may be an endianness issue here.  I suspect something has gone wrong with some sort of code optimization because if I insert Println() at the beginning of doOpOnStructElems(), the problem goes away.  So it's possible there's some bug with code inlining when what is being passed in was originally passed in as an interface in the caller?

If someone could confirm that this is indeed a bug I will be happy to file an issue.

Thank you,

Tim

package main import ( "fmt" ) type myStruct struct { A uint32 B uint32 } func doOpOnStructElems(a, b uint32) int64 { return int64((uint64(a) << 32) | uint64(b)) } func main() { myVal := myStruct{0, 0xFFFF} passAsMyStructAndThenDoOp(myVal) passAsIfaceAndThenDoOp(myVal) } func passAsMyStructAndThenDoOp(myVal myStruct) { fmt.Printf("%X\n", doOpOnStructElems(myVal.A, myVal.B)) } func passAsIfaceAndThenDoOp(myIface interface{}) { fmt.Printf("%X\n", doOpOnStructElems(myIface.(myStruct).A, myIface.(myStruct).B)) }

Timothy Olsen

unread,
Nov 28, 2023, 4:44:59 PM11/28/23
to tim....@mongodb.com, golang-nuts
A coworker suggested I try with optimizations off:

[tol...@rhel74-z-dev.dallasisv.build ~]$ go version
go version go1.21.4 linux/s390x
[tol...@rhel74-z-dev.dallasisv.build ~]$ go build -o /tmp/scratch_1 scratch_1.go
[tol...@rhel74-z-dev.dallasisv.build ~]$ /tmp/scratch_1
ABCDEF12
ABCDEF12000000
[tol...@rhel74-z-dev.dallasisv.build ~]$ go build -gcflags='-N' -o /tmp/scratch_1 scratch_1.go
[tol...@rhel74-z-dev.dallasisv.build ~]$ /tmp/scratch_1
ABCDEF12
ABCDEF12
[tol...@rhel74-z-dev.dallasisv.build ~]$ export GOROOT=/opt/golang/go1.20.11
[tol...@rhel74-z-dev.dallasisv.build ~]$ export PATH=${GOROOT}/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ibm/java-s390x-80/bin:/home/tolsen/.local/bin:/home/tolsen/bin
[tol...@rhel74-z-dev.dallasisv.build ~]$ go build -o /tmp/scratch_1 scratch_1.go
[tol...@rhel74-z-dev.dallasisv.build ~]$ /tmp/scratch_1
ABCDEF12
ABCDEF12
[tol...@rhel74-z-dev.dallasisv.build ~]$ 


That further confirms a bug with optimizations on s390x (and possibly other big endian machines?) .

-Tim

Keith Randall

unread,
Nov 28, 2023, 8:30:13 PM11/28/23
to golang-nuts
It seems strange that the bad result is ABCDEF12000000 and not ABCDEF1200000000. i.e., 6 zeros and not 8. Can you confirm?

Definitely sounds like a bug to me. You should open an issue.

Timothy Olsen

unread,
Nov 29, 2023, 9:40:39 AM11/29/23
to Keith Randall, golang-nuts
I can confirm that it is six zeroes.  That is indeed weird and highlights that it's not just a simple switching of the fields.

I will file a ticket now.  Thank you!

-Tim

--
You received this message because you are subscribed to a topic in the Google Groups "golang-nuts" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/golang-nuts/TYGuzUTTsN8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to golang-nuts...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/b2a37dfc-7b6a-412a-8cbb-40a2210d6921n%40googlegroups.com.

Srinivas Pokala

unread,
Nov 30, 2023, 2:34:06 AM11/30/23
to golang-nuts
Thank's for the findings.

We verified the same issue on linux/s390x for go 1.21.4  and also the issue is not reproducing in the go1.20.11. 

From our initial findings of disassembled data, we are seeing an additional instruction getting emitted i.e. SLLG instruction on go1.21.4.
This instruction is left shifting the operand logically by 24 bits. Due to this we feel that the output is incorrect.

Below is the disassembled output for the "passAsIfaceAndThenDoOp" function.

For go 1.20.11:
00000000000a8910 <main.passAsIfaceAndThenDoOp>:
   a8910:       e3 30 d0 10 00 04       lg      %r3,16(%r13)
   a8916:       ec 3f 00 61 a0 65       clgrjnl %r3,%r15,a89d8 <main.passAsIfaceAndThenDoOp+0xc8>
   a891c:       e3 e0 ff 98 ff 24       stg     %r14,-104(%r15)
   a8922:       e3 ff 0f 98 ff 71       lay     %r15,-104(%r15,%r0)
   a8928:       e3 e0 f0 00 00 24       stg     %r14,0(%r15)
   a892e:       e3 00 f0 70 00 04       lg      %r0,112(%r15)
   a8934:       c0 10 00 00 a1 66       larl    %r1,bcc00 <type:*+0xcc00>
   a893a:       ec 01 00 41 60 64       cgrjne  %r0,%r1,a89bc <main.passAsIfaceAndThenDoOp+0xac>
   a8940:       e3 10 f0 78 00 04       lg      %r1,120(%r15)
   a8946:       e3 00 10 00 00 04       lg      %r0,0(%r1)
   a894c:       e5 48 f0 58 00 00       mvghi   88(%r15),0
   a8952:       e5 48 f0 60 00 00       mvghi   96(%r15),0
   a8958:       e3 00 f0 08 00 24       stg     %r0,8(%r15)
   a895e:       c0 e5 ff fb 8f 69       brasl   %r14,1a830 <runtime.convT64>
   a8964:       e3 20 f0 10 00 04       lg      %r2,16(%r15)
   a896a:       c0 10 00 00 77 0b       larl    %r1,b7780 <type:*+0x7780>
   a8970:       eb 12 f0 58 00 24       stmg    %r1,%r2,88(%r15)
   a8976:       c4 28 00 05 7b 0d       lgrl    %r2,157f90 <os.Stdout>
   a897c:       c0 10 00 01 ec e6       larl    %r1,e6348 <go:itab.*os.File,io.Writer>
   a8982:       c0 30 00 01 16 4d       larl    %r3,cb61c <go:string.*+0x3fc4>
   a8988:       eb 13 f0 08 00 24       stmg    %r1,%r3,8(%r15)
   a898e:       e5 48 f0 20 00 1a       mvghi   32(%r15),26
   a8994:       41 0f 00 58             la      %r0,88(%r15,%r0)
   a8998:       e3 00 f0 28 00 24       stg     %r0,40(%r15)
   a899e:       e5 48 f0 30 00 01       mvghi   48(%r15),1
   a89a4:       e5 48 f0 38 00 01       mvghi   56(%r15),1
   a89aa:       c0 e5 ff ff bb 03       brasl   %r14,9ffb0 <fmt.Fprintf>
   a89b0:       e3 e0 f0 00 00 04       lg      %r14,0(%r15)
   a89b6:       a7 fb 00 68             aghi    %r15,104
   a89ba:       07 fe                   br      %r14
   a89bc:       b9 04 00 21             lgr     %r2,%r1
   a89c0:       c0 30 00 00 83 30       larl    %r3,b9020 <type:*+0x9020>
   a89c6:       b9 04 00 10             lgr     %r1,%r0
   a89ca:       eb 13 f0 08 00 24       stmg    %r1,%r3,8(%r15)
   a89d0:       c0 e5 ff fb 8d b0       brasl   %r14,1a530 <runtime.panicdottypeE>
   a89d6:       07 00                   nopr
   a89d8:       b9 04 00 5e             lgr     %r5,%r14
   a89dc:       c0 e5 ff fe 9f 0a       brasl   %r14,7c7f0 <runtime.morestack_noctxt>
   a89e2:       a7 f4 ff 97             j       a8910 <main.passAsIfaceAndThenDoOp>


For go 1.21.4:
00000000000a8250 <main.passAsIfaceAndThenDoOp>:
   a8250:       e3 30 d0 10 00 04       lg      %r3,16(%r13)
   a8256:       ec 3f 00 64 a0 65       clgrjnl %r3,%r15,a831e <main.passAsIfaceAndThenDoOp+0xce>
   a825c:       e3 e0 ff 98 ff 24       stg     %r14,-104(%r15)
   a8262:       e3 ff 0f 98 ff 71       lay     %r15,-104(%r15,%r0)
   a8268:       e3 e0 f0 00 00 24       stg     %r14,0(%r15)
   a826e:       e3 00 f0 70 00 04       lg      %r0,112(%r15)
   a8274:       c0 10 00 00 a7 16       larl    %r1,bd0a0 <type:*+0xd0a0>
   a827a:       ec 01 00 44 60 64       cgrjne  %r0,%r1,a8302 <main.passAsIfaceAndThenDoOp+0xb2>
   a8280:       e3 10 f0 78 00 04       lg      %r1,120(%r15)
   a8286:       e3 00 10 00 00 04       lg      %r0,0(%r1)
   a828c:       eb 00 00 18 00 0d       sllg    %r0,%r0,24          ----------------------------->>>>>>>>>>>>>>>>>>>>>>>>>>>   This is the extra instruction which is getting emitted in the case of go 1.21
   a8292:       e5 48 f0 58 00 00       mvghi   88(%r15),0
   a8298:       e5 48 f0 60 00 00       mvghi   96(%r15),0
   a829e:       e3 00 f0 08 00 24       stg     %r0,8(%r15)
   a82a4:       c0 e5 ff fb 94 be       brasl   %r14,1ac20 <runtime.convT64>
   a82aa:       e3 20 f0 10 00 04       lg      %r2,16(%r15)
   a82b0:       c0 10 00 00 77 b8       larl    %r1,b7220 <type:*+0x7220>
   a82b6:       eb 12 f0 58 00 24       stmg    %r1,%r2,88(%r15)
   a82bc:       c4 28 00 05 89 26       lgrl    %r2,159508 <os.Stdout>
   a82c2:       c0 10 00 01 ff bb       larl    %r1,e8238 <go:itab.*os.File,io.Writer>
   a82c8:       c0 30 00 01 1d 5a       larl    %r3,cbd7c <go:string.*+0x3884>
   a82ce:       eb 13 f0 08 00 24       stmg    %r1,%r3,8(%r15)
   a82d4:       e5 48 f0 20 00 1a       mvghi   32(%r15),26
   a82da:       41 0f 00 58             la      %r0,88(%r15,%r0)
   a82de:       e3 00 f0 28 00 24       stg     %r0,40(%r15)
   a82e4:       e5 48 f0 30 00 01       mvghi   48(%r15),1
   a82ea:       e5 48 f0 38 00 01       mvghi   56(%r15),1
   a82f0:       c0 e5 ff ff ba 78       brasl   %r14,9f7e0 <fmt.Fprintf>
   a82f6:       e3 e0 f0 00 00 04       lg      %r14,0(%r15)
   a82fc:       a7 fb 00 68             aghi    %r15,104
   a8300:       07 fe                   br      %r14  
   a8302:       b9 04 00 21             lgr     %r2,%r1
   a8306:       c0 30 00 00 89 dd       larl    %r3,b96c0 <type:*+0x96c0>
   a830c:       b9 04 00 10             lgr     %r1,%r0
   a8310:       eb 13 f0 08 00 24       stmg    %r1,%r3,8(%r15)
   a8316:       c0 e5 ff fb 93 2d       brasl   %r14,1a970 <runtime.panicdottypeE>
   a831c:       07 00                   nopr  
   a831e:       b9 04 00 5e             lgr     %r5,%r14
   a8322:       c0 e5 ff fe be 17       brasl   %r14,7ff50 <runtime.morestack_noctxt>
   a8328:       a7 f4 ff 94             j       a8250 <main.passAsIfaceAndThenDoOp>


As you can see above that the SLLG instruction is logically left shifting the second operand (r0) by 24bits and placing it in the first operand (r0).
Due to this, the 64 bit contents of "r0" register is becoming "00ABCDEF12000000".... 

Ex: r0: ==> 0x00000000ABCDEFf12
After SLLG instruction  
sllg %r0, %r0, 24  
The r0 becomes ==> 0x00ABCDEF12000000
After this we returning the r0 content.

We are further taking a look at the same. We will share our further findings when we have it.
~                       

Timothy Olsen

unread,
Nov 30, 2023, 9:53:42 AM11/30/23
to Srinivas Pokala, golang-nuts
Thank you Srinivas!

Btw, I finally filed the issue here: https://github.com/golang/go/issues/64468

Also, I've noticed that putting "//go:noinline" just before doOpOnStructElems() fixes the problem.  

-Tim

Reply all
Reply to author
Forward
0 new messages