[go/dev.simd] [dev.simd] cmd/internal/obj/arm64: add indexed element operand patterns for VFMLA/VFMLS/VFMUL

Jonathan Swinney (Gerrit)

Mar 27, 2026, 4:32:43 PM
to goph...@pubsubhelper.golang.org, Alexander Musman, golang-co...@googlegroups.com

Jonathan Swinney has uploaded the change for review

Commit message

[dev.simd] cmd/internal/obj/arm64: add indexed element operand patterns for VFMLA/VFMLS/VFMUL

Add C_ELEM optab entries for indexed element forms of VFMLA, VFMLS,
and VFMUL, supporting both scalar and vector variants:

VFMLA V6.S[0], F2, F14 // scalar indexed
VFMLA V28.S[2], V2.S2, V30.S2 // vector indexed

Add two optab entries that use AVFMLA as the root opcode; oprangeset
extends them to AVFMLS and AVFMUL:
{AVFMLA, C_ELEM, C_FREG, C_NONE, C_FREG, ...} // scalar
{AVFMLA, C_ELEM, C_ARNG, C_NONE, C_ARNG, ...} // vector

New encoding case 111 implements the ARM encoding:
Scalar: 0x5f800000 | sz<<22 | L<<21 | M<<20 | Rm<<16 | op<<12 | H<<11 | Rn<<5 | Rd
Vector: 0x0f800000 | Q<<30 | sz<<22 | L<<21 | M<<20 | Rm<<16 | op<<12 | H<<11 | Rn<<5 | Rd

For .S: sz=0, H=index[1], L=index[0], M=Vm[4]
For .D: sz=1, H=index[0], L=0, M=Vm[4]

Encodings verified against GNU as output.

Uncomment the 6 corresponding encoding tests in arm64enc.s.
Change-Id: Iab5218f34e8bc3c306bd7a720b7bea818f4f23db
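
As an illustration of the bit layout described in the commit message, the following minimal Go sketch assembles the scalar and vector words from their fields and prints them in the byte order used by the arm64enc.s comments. The names encodeIndexed, testBytes, elemS and elemD are hypothetical and are not part of asm7.go; the opcode values and base constants are copied from the change, and the little-endian rendering of the test comments is an assumption that matches the six test lines.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// elemSize selects the element type of the indexed operand (Vm.S[i] or Vm.D[i]).
type elemSize int

const (
	elemS elemSize = iota
	elemD
)

// Opcode field values (bits 15:12) as listed in case 111 of the change.
const (
	opFMLA = 0x1
	opFMLS = 0x5
	opFMUL = 0x9
)

// encodeIndexed is a hypothetical helper that mirrors the formula in the
// commit message. When scalar is true the 0x5f800000 base is used and q is
// ignored; otherwise the 0x0f800000 base is combined with the Q bit.
func encodeIndexed(op uint32, size elemSize, vm, index, rn, rd uint32, scalar bool, q uint32) uint32 {
	var sz, h, l uint32
	switch size {
	case elemS:
		sz, h, l = 0, (index>>1)&1, index&1 // H=index[1], L=index[0]
	case elemD:
		sz, h, l = 1, index&1, 0 // H=index[0], L=0
	}
	m := (vm >> 4) & 1 // M carries bit 4 of the Vm register number
	base := uint32(0x0f800000) | (q&1)<<30
	if scalar {
		base = 0x5f800000
	}
	return base | sz<<22 | l<<21 | m<<20 | (vm&0xf)<<16 | op<<12 | h<<11 | (rn&31)<<5 | (rd & 31)
}

// testBytes renders a word as four hex bytes in little-endian (memory) order.
func testBytes(w uint32) string {
	var b [4]byte
	binary.LittleEndian.PutUint32(b[:], w)
	return hex.EncodeToString(b[:])
}

func main() {
	// VFMLA V6.S[0], F2, F14
	fmt.Println(testBytes(encodeIndexed(opFMLA, elemS, 6, 0, 2, 14, true, 0))) // 4e10865f
	// VFMLA V28.S[2], V2.S2, V30.S2 (Q=0 for the S2 arrangement)
	fmt.Println(testBytes(encodeIndexed(opFMLA, elemS, 28, 2, 2, 30, false, 0))) // 5e189c0f
	// VFMLS V24.D[1], F3, F17
	fmt.Println(testBytes(encodeIndexed(opFMLS, elemD, 24, 1, 3, 17, true, 0))) // 7158d85f
	// VFMUL V26.S[2], V26.S2, V2.S2
	fmt.Println(testBytes(encodeIndexed(opFMUL, elemS, 26, 2, 26, 2, false, 0))) // 429b9a0f
}
```

The sketch masks the index in the same way as the formula, so an out-of-range index is silently truncated here; it performs no operand validation.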

Change diff

diff --git a/src/cmd/asm/internal/asm/testdata/arm64enc.s b/src/cmd/asm/internal/asm/testdata/arm64enc.s
index 0876080..24f2c2d 100644
--- a/src/cmd/asm/internal/asm/testdata/arm64enc.s
+++ b/src/cmd/asm/internal/asm/testdata/arm64enc.s
@@ -576,11 +576,11 @@
VFMINP V10.S2, F20 // 54f9b07e
VFMINP V1.D2, V10.D2, V3.D2 // 43f5e16e
VFMINV V11.S4, F9 // 69f9b06e
- //TODO VFMLA V6.S[0], F2, F14 // 4e10865f
- //TODO VFMLA V28.S[2], V2.S2, V30.S2 // 5e189c0f
+ VFMLA V6.S[0], F2, F14 // 4e10865f
+ VFMLA V28.S[2], V2.S2, V30.S2 // 5e189c0f
VFMLA V29.S2, V20.S2, V14.S2 // 8ece3d0e
- //TODO VFMLS V24.D[1], F3, F17 // 7158d85f
- //TODO VFMLS V10.S[0], V11.S2, V10.S2 // 6a518a0f
+ VFMLS V24.D[1], F3, F17 // 7158d85f
+ VFMLS V10.S[0], V11.S2, V10.S2 // 6a518a0f
VFMLS V29.S2, V27.S2, V17.S2 // 71cfbd0e
//TODO FMOVS $(-1.625), F13 // 0d503f1e
//TODO FMOVD $12.5, F30 // 1e30651e
@@ -594,8 +594,8 @@
//TODO VFMOV $3.125, V8.D2 // 28f5006f
FMSUBS F13, F21, F13, F19 // b3d50d1f
FMSUBD F11, F7, F15, F31 // ff9d4b1f
- //TODO VFMUL V9.S[2], F21, F19 // b39a895f
- //TODO VFMUL V26.S[2], V26.S2, V2.S2 // 429b9a0f
+ VFMUL V9.S[2], F21, F19 // b39a895f
+ VFMUL V26.S[2], V26.S2, V2.S2 // 429b9a0f
VFMUL V21.D2, V17.D2, V25.D2 // 39de756e
FMULS F0, F6, F24 // d808201e
FMULD F5, F29, F9 // a90b651e
diff --git a/src/cmd/internal/obj/arm64/asm7.go b/src/cmd/internal/obj/arm64/asm7.go
index 4dc93bd..d034b42 100644
--- a/src/cmd/internal/obj/arm64/asm7.go
+++ b/src/cmd/internal/obj/arm64/asm7.go
@@ -391,10 +391,16 @@
{AVADDV, C_ARNG, C_NONE, C_NONE, C_FREG, C_NONE, 85, 4, 0, 0, 0},

/* scalar pairwise reductions: vfaddp/vfmaxp/vfminp/vfmaxnmp/vfminnmp Vn.<T>, Fd
- These use AVFMLA as the root opcode so they are included in the AVFMLA oprange,
- which covers AVFADDP, AVFMAXP, AVFMINP, AVFMAXNMP, AVFMINNMP via oprangeset. */
+ Use AVFMLA as the root opcode so buildop includes these in the AVFMLA oprange,
+ which is shared with AVFADDP, AVFMAXP, AVFMINP, AVFMAXNMP, AVFMINNMP via oprangeset. */
{AVFMLA, C_ARNG, C_NONE, C_NONE, C_FREG, C_NONE, 110, 4, 0, 0, 0},

+ /* indexed element forms: vfmla/vfmls/vfmul Vm.<T>[index], Rn, Rd
+ Use AVFMLA as the root opcode so buildop includes these in the AVFMLA oprange,
+ which is shared with AVFMLS and AVFMUL via oprangeset. */
+ {AVFMLA, C_ELEM, C_FREG, C_NONE, C_FREG, C_NONE, 111, 4, 0, 0, 0},
+ {AVFMLA, C_ELEM, C_ARNG, C_NONE, C_ARNG, C_NONE, 111, 4, 0, 0, 0},
+
/* logical operations */
{AAND, C_ZREG, C_ZREG, C_NONE, C_ZREG, C_NONE, 1, 4, 0, 0, 0},
{AAND, C_ZREG, C_NONE, C_NONE, C_ZREG, C_NONE, 1, 4, 0, 0, 0},
@@ -6060,23 +6066,81 @@
switch p.As {
case AVFADDP:
opcode = 0x0d
- case AVFMAXP, AVFMAXNMP:
+ case AVFMAXP:
opcode = 0x0f
- if p.As == AVFMAXNMP {
- opcode = 0x0c
- }
- case AVFMINP, AVFMINNMP:
+ case AVFMAXNMP:
+ opcode = 0x0c
+ case AVFMINP:
sz |= 2 // set sz[1] for min variants
opcode = 0x0f
- if p.As == AVFMINNMP {
- opcode = 0x0c
- }
+ case AVFMINNMP:
+ sz |= 2 // set sz[1] for min variants
+ opcode = 0x0c
default:
c.ctxt.Diag("unsupported op %v\n", p.As)
}
rn := uint32(p.From.Reg & 31)
rd := uint32(p.To.Reg & 31)
o1 = 0x7e300800 | sz<<22 | opcode<<12 | rn<<5 | rd
+
+ case 111: /* indexed element: vfmla/vfmls/vfmul Vm.<T>[index], Rn, Rd */
+ // AdvSIMD scalar x indexed element (C_FREG dest):
+ // 0 1 0 1 1 1 1 1 1 sz L M Rm opcode H 0 Rn Rd (base: 0x5f800000)
+ // AdvSIMD vector x indexed element (C_ARNG dest):
+ // 0 Q 0 0 1 1 1 1 1 sz L M Rm opcode H 0 Rn Rd (base: 0x0f800000)
+ // For .S type: sz=0, L=index[0], M=Vm[4], Rm[3:0]=Vm[3:0], H=index[1]
+ // For .D type: sz=1, L=0, M=Vm[4], Rm[3:0]=Vm[3:0], H=index
+ af := int((p.From.Reg >> 5) & 15) // arrangement of C_ELEM (ARNG_S or ARNG_D)
+ index := int(p.From.Index)
+ rm := int(p.From.Reg & 31)
+ rn := uint32(p.Reg & 31)
+ rd := uint32(p.To.Reg & 31)
+ var sz, H, L, M uint32
+ switch af {
+ case ARNG_S:
+ sz = 0
+ H = uint32(index>>1) & 1
+ L = uint32(index) & 1
+ M = uint32(rm>>4) & 1
+ case ARNG_D:
+ sz = 1
+ H = uint32(index) & 1
+ L = 0
+ M = uint32(rm>>4) & 1
+ default:
+ c.ctxt.Diag("invalid arrangement: %v\n", p)
+ }
+ var opcode uint32
+ switch p.As {
+ case AVFMLA:
+ opcode = 0x1
+ case AVFMLS:
+ opcode = 0x5
+ case AVFMUL:
+ opcode = 0x9
+ default:
+ c.ctxt.Diag("unsupported op %v\n", p.As)
+ }
+ rmLow := uint32(rm) & 0xf
+ if p.To.Reg >= REG_F0 && p.To.Reg <= REG_F31 {
+ // scalar indexed element
+ o1 = 0x5f800000 | sz<<22 | L<<21 | M<<20 | rmLow<<16 | opcode<<12 | H<<11 | rn<<5 | rd
+ } else {
+ // vector indexed element: Q from destination arrangement
+ at := int((p.To.Reg >> 5) & 15)
+ var Q uint32
+ switch at {
+ case ARNG_2S:
+ Q = 0
+ case ARNG_4S:
+ Q = 1
+ case ARNG_2D:
+ Q = 1
+ default:
+ c.ctxt.Diag("invalid arrangement: %v\n", p)
+ }
+ o1 = 0x0f800000 | Q<<30 | sz<<22 | L<<21 | M<<20 | rmLow<<16 | opcode<<12 | H<<11 | rn<<5 | rd
+ }
}
out[0] = o1
out[1] = o2

Change information

Files:
  • M src/cmd/asm/internal/asm/testdata/arm64enc.s
  • M src/cmd/internal/obj/arm64/asm7.go
Change size: M
Delta: 2 files changed, 80 insertions(+), 16 deletions(-)

Related details

Attention set is empty
Submit Requirements:
  • Code-Review: not satisfied
  • No-Unresolved-Comments: satisfied
  • Review-Enforcement: not satisfied
  • TryBots-Pass: not satisfied
Gerrit-MessageType: newchange
Gerrit-Project: go
Gerrit-Branch: dev.simd
Gerrit-Change-Id: Iab5218f34e8bc3c306bd7a720b7bea818f4f23db
Gerrit-Change-Number: 760541
Gerrit-PatchSet: 1
Gerrit-Owner: Jonathan Swinney <jswi...@amazon.com>
Gerrit-CC: Alexander Musman <alexande...@gmail.com>

Jonathan Swinney (Gerrit)

Mar 27, 2026, 5:12:37 PM
to goph...@pubsubhelper.golang.org, golang-co...@googlegroups.com
Attention needed from Alexander Musman

Jonathan Swinney uploaded patch set #2 to this change.

Related details

Attention is currently required from:
  • Alexander Musman
Gerrit-MessageType: newpatchset
Gerrit-Project: go
Gerrit-Branch: dev.simd
Gerrit-Change-Id: Iab5218f34e8bc3c306bd7a720b7bea818f4f23db
Gerrit-Change-Number: 760541
Gerrit-PatchSet: 2
Gerrit-Owner: Jonathan Swinney <jswi...@amazon.com>
Gerrit-Reviewer: Alexander Musman <alexande...@gmail.com>
Gerrit-Attention: Alexander Musman <alexande...@gmail.com>

Jonathan Swinney (Gerrit)

Apr 8, 2026, 4:58:41 PM
to goph...@pubsubhelper.golang.org, Alexander Musman, golang-co...@googlegroups.com

Jonathan Swinney abandoned this change.

Abandoned; replaced by CL 764224.

Related details

Attention set is empty
Gerrit-MessageType: abandon