[dev.simd] cmd/compile: add masked merging ops
This CL only adds the ops; the rules to generate them will be in the
next CL.
| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Commit-Queue | +1 |
| Commit-Queue | +1 |
+2, plus a question about something that may be a leftover from an earlier CL.
w2fpkw = regInfo{inputs: []regMask{w, wz, fp, mask}, outputs: wonly} // used in resultInArg0 ops, arg0 must not be x15
See other remarks about "did we need this?"
wkwload, v21load, v31load, v11load, w21load, w31load, w2kload, w2kwload, w11load, w3kwload, w2kkload, v31x0AtIn2, w2fpkw regInfo) []opData {
Did we need to add this one? It looks like we use the pre-existing w3kw, maybe?
x.Add(y).Merge(x, mask).StoreSlice(res)
} else {
x.Add(y).Merge(x, mask).StoreSlice(res)
I might want to add a "z" vector for the Merge, say {-1, -2, -3, -4}, so that expected would be {6, 8, -3, -4}. That should work, right?
| Code-Review | +2 |
| Commit-Queue | +1 |
w2fpkw = regInfo{inputs: []regMask{w, wz, fp, mask}, outputs: wonly} // used in resultInArg0 ops, arg0 must not be x15
See other remarks about "did we need this?"
Oh yes, thank you for finding this; it is indeed a stale change from previous patch sets.
wkwload, v21load, v31load, v11load, w21load, w31load, w2kload, w2kwload, w11load, w3kwload, w2kkload, v31x0AtIn2, w2fpkw regInfo) []opData {
Did we need to add this one? It looks like we use the pre-existing w3kw, maybe?
Done
x.Add(y).Merge(x, mask).StoreSlice(res)
} else {
x.Add(y).Merge(x, mask).StoreSlice(res)
I might want to add a "z" vector for the Merge, say {-1, -2, -3, -4}, so that expected would be {6, 8, -3, -4}. That should work, right?
Yes that works, updated.
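For readers following the test discussion, here is a minimal scalar sketch of why a distinct "z" vector makes the check stronger: with z different from x, the unmasked lanes can only match if they really come from the merge operand. The helper names `add` and `merge`, the inputs x = {1, 2, 3, 4} and y = {5, 6, 7, 8}, and the mask selecting the first two lanes are assumptions chosen to reproduce the {6, 8, -3, -4} value quoted above; this is not the code in simd_test.go and does not use the simd package.

```go
package main

import "fmt"

// add is a hypothetical scalar stand-in for a lane-wise vector addition.
func add(x, y [4]int32) [4]int32 {
	var out [4]int32
	for i := range out {
		out[i] = x[i] + y[i]
	}
	return out
}

// merge is a hypothetical scalar model of v.Merge(z, mask): lane i takes
// v[i] when mask[i] is set and z[i] otherwise. The operand order follows
// the expected value discussed in the review, not a documented API.
func merge(v, z [4]int32, mask [4]bool) [4]int32 {
	var out [4]int32
	for i := range out {
		if mask[i] {
			out[i] = v[i]
		} else {
			out[i] = z[i]
		}
	}
	return out
}

func main() {
	// Assumed inputs; only z and the expected result appear in the thread.
	x := [4]int32{1, 2, 3, 4}
	y := [4]int32{5, 6, 7, 8}
	z := [4]int32{-1, -2, -3, -4}
	mask := [4]bool{true, true, false, false}

	fmt.Println(merge(add(x, y), z, mask)) // [6 8 -3 -4]
}
```

If the merge operand were x itself, as in the original test line, the unmasked lanes would be {3, 4}, which a buggy implementation could also produce by simply leaving x untouched.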
8 is the latest approved patch-set.
The change was submitted with unreviewed changes in the following files:
```
The name of the file: src/simd/internal/simd_test/simd_test.go
Insertions: 4, Deletions: 3.
The diff is too large to show. Please review the diff.
```
```
The name of the file: src/cmd/compile/internal/ssa/_gen/simdAMD64ops.go
Insertions: 1, Deletions: 1.
The diff is too large to show. Please review the diff.
```
```
The name of the file: src/cmd/compile/internal/ssa/rewriteAMD64.go
Insertions: 5624, Deletions: 5624.
The diff is too large to show. Please review the diff.
```
```
The name of the file: src/simd/_gen/simdgen/gen_simdMachineOps.go
Insertions: 2, Deletions: 2.
The diff is too large to show. Please review the diff.
```
```
The name of the file: src/cmd/compile/internal/ssa/_gen/simdAMD64.rules
Insertions: 403, Deletions: 403.
The diff is too large to show. Please review the diff.
```
```
The name of the file: src/cmd/compile/internal/ssa/_gen/AMD64Ops.go
Insertions: 15, Deletions: 16.
The diff is too large to show. Please review the diff.
```
[dev.simd] cmd/compile: add masked merging ops and optimizations
This CL generates optimizations for the masked variants of AVX512
instructions for patterns of the form:
x.Op(y).Merge(z, mask) => OpMasked(z, x, y, mask), where OpMasked is
resultInArg0.
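The rewrite above can be sanity-checked with a scalar model. The sketch below is only an illustration under stated assumptions: `vec`, `addThenMerge`, and `addMasked` are hypothetical names, Add stands in for Op, and the mask is modeled as one bit per lane. The fused form updates its first argument in place, which is the intuition behind resultInArg0 (the destination register is also the source of the unmasked lanes).

```go
package main

import "fmt"

// vec is a hypothetical 4-lane vector used only for this sketch.
type vec [4]int32

// addThenMerge models the unfused pattern x.Add(y).Merge(z, mask):
// compute the full sum, then pick sum lanes where the mask bit is set
// and z lanes elsewhere.
func addThenMerge(x, y, z vec, mask uint8) vec {
	var sum vec
	for i := range sum {
		sum[i] = x[i] + y[i]
	}
	out := z
	for i := range out {
		if mask&(uint8(1)<<uint(i)) != 0 {
			out[i] = sum[i]
		}
	}
	return out
}

// addMasked models the fused form OpMasked(z, x, y, mask): dst plays the
// role of z and is overwritten only in the masked lanes, so the result
// lives in the first argument.
func addMasked(dst *vec, x, y vec, mask uint8) {
	for i := range dst {
		if mask&(uint8(1)<<uint(i)) != 0 {
			dst[i] = x[i] + y[i]
		}
	}
}

func main() {
	x := vec{1, 2, 3, 4}
	y := vec{5, 6, 7, 8}
	z := vec{-1, -2, -3, -4}

	// Check every 4-lane mask value.
	for m := uint8(0); m < 16; m++ {
		want := addThenMerge(x, y, z, m)
		got := z // copy of z; addMasked writes into it
		addMasked(&got, x, y, m)
		if got != want {
			fmt.Println("mismatch for mask", m, got, want)
			return
		}
	}
	fmt.Println("all 16 masks agree")
}
```

Because the fused form overwrites its first argument, a caller that still needs z afterwards has to work on a copy, as `main` does here.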