| Inspect html for hidden footers to help with email filtering. To unsubscribe visit settings. |
| Code-Review | +1 |
VXORPS X1, X2, X3
Question from the CL description: does this only break sometimes, when the runtime does not save the correct state? I ask because it's not a SIGILL but a potential value corruption.
I am wondering if we should actually make `checkAVX` return something that depends on Rosetta emulating the AVX instruction set correctly.
VXORPS X1, X2, X3
The purpose of this CL is to make the Go runtime's knowledge of AVX availability match the system's, i.e. whether the instruction faults. It's not that Rosetta 2 runs the instruction incorrectly, but that the instruction runs while the Go runtime thinks it is not available (i.e. that it would fault). The data corruption comes from the Go runtime acting on incorrect information, e.g. async preempt doesn't save AVX registers, or the runtime doesn't initialize the zero register correctly.
This particular test is just to see if it faults, and see if it matches the runtime's HasAVX.
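To spell out the cases being discussed, here is a minimal sketch of the four combinations of "what the runtime believes" and "what the instruction actually does". The names `classify`, `outcome`, and the constants are hypothetical and are not part of the CL; the real test executes `VXORPS` in assembly and compares the result with the runtime's HasAVX.

```go
package main

import "fmt"

// outcome labels one combination of runtime belief and actual behavior.
// All names in this sketch are illustrative, not taken from the CL.
type outcome int

const (
	consistentAvailable   outcome = iota // runtime says AVX, instruction runs
	consistentUnavailable                // runtime says no AVX, instruction faults
	mismatchSilent                       // instruction runs, runtime does not manage AVX state
	mismatchFault                        // runtime expects AVX, instruction faults
)

// classify maps the runtime's feature flag and the observed behavior of an
// AVX instruction to one of the four outcomes above.
func classify(runtimeHasAVX, instructionFaults bool) outcome {
	switch {
	case runtimeHasAVX && !instructionFaults:
		return consistentAvailable
	case !runtimeHasAVX && instructionFaults:
		return consistentUnavailable
	case !runtimeHasAVX && !instructionFaults:
		// The Rosetta 2 case described above: no SIGILL, but the runtime
		// may not save or initialize AVX state.
		return mismatchSilent
	default:
		return mismatchFault
	}
}

func main() {
	for _, hasAVX := range []bool{false, true} {
		for _, faults := range []bool{false, true} {
			fmt.Printf("HasAVX=%v faults=%v -> outcome %d\n", hasAVX, faults, classify(hasAVX, faults))
		}
	}
}
```

Only the two "consistent" outcomes should pass a test of this kind; `mismatchSilent` is the one this CL is meant to eliminate.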
VXORPS X1, X2, X3
Yeah, that's what I meant: does it always fault? It could be that X1/X2/X3 happen not to be in use, so not saving them is fine. And async preemption doesn't always happen.
Maybe those false-negative cases are rare. Anyway, this CL is more about the functionality. Thanks! 😄
| Code-Review | +2 |
We had a desk-to-desk conversation about whether this needs a backport, and the conclusion was no: until Go SIMD extensions are a thing, AVX use is all in assembly language and already reliably gated by feature checks.
Thanks.
VXORPS X1, X2, X3
Yes, if it faults, it always faults. I don't think any hardware or VM would implement it as _sometimes_ faults.
We don't async preempt assembly code, so that would be irrelevant here.
[dev.simd] internal/cpu: report AVX1 and 2 as supported on macOS 15 Rosetta 2
Apparently, on macOS 15 or newer, Rosetta 2 supports AVX1 and 2.
However, neither CPUID nor the Apple-recommended sysctl says it
has AVX. If AVX is used without checking the CPU feature, it may
run fine without SIGILL, but the runtime doesn't know AVX is
available and therefore doesn't save and restore its state. This
may lead to value corruption.
Check if we are running under Rosetta 2 on macOS 15 or newer. If so,
report AVX1 and 2 as supported.
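
The decision described above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the code in internal/cpu: `shouldReportAVX` and `majorVersion` are hypothetical names, and the inputs are assumed to come from sysctls such as `sysctl.proc_translated` (the one Apple documents for detecting Rosetta) and a product version string like "15.1". How the actual CL obtains these values may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorVersion extracts the leading major number from a version string
// such as "15.1" or "14.6.1". It reports false if the string is malformed.
func majorVersion(v string) (int, bool) {
	head, _, _ := strings.Cut(strings.TrimSpace(v), ".")
	n, err := strconv.Atoi(head)
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// shouldReportAVX reports whether AVX1 and AVX2 should be treated as
// supported: only when running under Rosetta 2 on macOS 15 or newer.
// Both parameters are assumed inputs gathered elsewhere (hypothetical).
func shouldReportAVX(underRosetta bool, productVersion string) bool {
	major, ok := majorVersion(productVersion)
	return underRosetta && ok && major >= 15
}

func main() {
	fmt.Println(shouldReportAVX(true, "15.1"))   // Rosetta on macOS 15
	fmt.Println(shouldReportAVX(true, "14.6.1")) // Rosetta on an older macOS
	fmt.Println(shouldReportAVX(false, "15.0"))  // not translated
}
```

Treating an unparsable version as "do not report" keeps the sketch conservative: a false negative only loses performance, while a false positive could lead to SIGILL.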
// TODO: check if any other feature is actually supported.
I was debugging the feature support due to https://go-review.googlesource.com/c/sys/+/737260 and found some additional information about this.
Rosetta reports the AVX and AVX2 features only when `ROSETTA_ADVERTISE_AVX=1` is set. Once it is set, reading CPUID works as intended.
For example `golang.org/x/sys/cpu.X86` will then additionally report:
```
HasAVX
HasAVX2
HasBMI1
HasBMI2
HasFMA
HasOSXSAVE
HasRDRAND
```
If I had to hazard a guess, the reason they didn't enable these by default is that SSE code can be faster. AVX uses 256-bit instructions, but Macs only have 128-bit registers, so the AVX instructions are translated to several NEON instructions.
In other words, the instructions should still work under Rosetta; however, they might be slower than the SSE equivalents.
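For anyone who wants to try this from Go, a minimal sketch of building a child-process environment with the variable set follows. `withAdvertiseAVX` is a hypothetical helper; I am assuming the variable has to be present in the environment when the translated process starts, which I have not verified beyond the observation above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

const advertiseAVX = "ROSETTA_ADVERTISE_AVX"

// withAdvertiseAVX returns a copy of env with ROSETTA_ADVERTISE_AVX=1 set,
// dropping any existing assignment of that variable. The input slice is
// not modified.
func withAdvertiseAVX(env []string) []string {
	out := make([]string, 0, len(env)+1)
	for _, kv := range env {
		if !strings.HasPrefix(kv, advertiseAVX+"=") {
			out = append(out, kv)
		}
	}
	return append(out, advertiseAVX+"=1")
}

func main() {
	env := withAdvertiseAVX(os.Environ())
	fmt.Println(env[len(env)-1]) // prints ROSETTA_ADVERTISE_AVX=1
}
```

The returned slice can be assigned to the `Env` field of an `exec.Cmd` before starting an amd64 binary, after which that binary could print `golang.org/x/sys/cpu.X86` to compare against the list above.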