Set Ready For Review
Ready for review.
> Existing tests/lib/typed_data/simd_*_test.dart still pass in JIT and

Please add `TEST=` here at the start of the line.
> Existing tests/lib/typed_data/simd_*_test.dart still pass in JIT and

Done in patchset 2.
| Auto-Submit | +1 |
Thanks for the reviews!
[vm/compiler] Specialize Int32x4 operators in AOT.
The five binary operators on Int32x4 (+, -, |, &, ^) were never added
to the recognized-method list as graph intrinsics, so calls to them
were left as runtime calls through external-name bodies. In JIT the
call specializer picked kInt32x4Cid from IC feedback and emitted a
native SimdOpInstr, but AOT has no IC feedback and therefore fell back
to boxed calls, making Int32x4List inner loops 10-70x slower than both
the JIT version and a hand-written scalar equivalent.
This CL wires up the same specialization paths that already exist for
the Float32x4 operators +, -, *, /:
- recognize the five operators as graph intrinsics and mark them
with `@pragma("vm:recognized", "graph-intrinsic")` plus an
exact-result-type pragma;
- add Build_Int32x4{Add,Sub,BitAnd,BitOr,BitXor} helpers that
delegate to the existing BuildSimdOp;
- extend SimdOpInstr::KindForOperator and CreateFromCall;
- extend CallSpecializer::InlineSimdOp and TryInlineRecognizedMethod
so the non-speculative null-check path used for Float32x4 operators
in AOT also applies here.
Measured on macOS arm64 (M-series), `dart compile exe`:
Issue 63217 orSimd : 12.58 -> 0.32 us/iter (39x)
Issue 63217 andNotSimd : 23.51 -> 0.34 us/iter (69x)
Issue 53662 mandelbrot : 4038.5 -> 55.5 ms (72x)
A new benchmark benchmarks/SimdInt32x4 exercises all five operators
with a scalar and a SIMD variant so the specialization stays covered
by the benchmark bots; it is registered in Omnibus and OmnibusDeferred.
Existing tests/lib/typed_data/simd_*_test.dart still pass in JIT and
AOT.
TEST=tests/lib/typed_data/int32x4_arithmetic_test; benchmarks/SimdInt32x4
Bug: https://github.com/dart-lang/sdk/issues/53662
Bug: https://github.com/dart-lang/sdk/issues/63217
Change-Id: I9b76ab4fff228ff1a5e3d3c86f4bfc059e66a49a
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/497000
Reviewed-by: Slava Egorov <veg...@google.com>
Commit-Queue: Slava Egorov <veg...@google.com>
Auto-Submit: Modestas Valauskas <valauska...@gmail.com>
Reviewed-by: Alexander Aprelev <a...@google.com>