I'd like to add some pattern matching to Turboshaft to recognise add + shuffle patterns which correspond to a horizontal pairwise reduction. I've started doing this with wasm::SimdShuffle helpers, with the matching done during arm64 instruction selection, but it feels like the pattern matching should be done in a generic place too. So I was thinking about adding four more kinds (I32x4, I64x2, F32x4 and F64x2 PairwiseReduction) to Simd128UnaryOp and then performing the combining in machine-optimization-reducer.
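For concreteness, here is a rough standalone C++ sketch of the kind of add + shuffle chain I mean, for the I32x4 case. It is not V8 code: the Add, Shuffle and PairwiseAdd helpers and the particular shuffle indices are made up for illustration, and PairwiseAdd only loosely models an arm64 ADDP with the same register as both inputs.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector of four 32-bit lanes.
// Unsigned lanes give the wrapping behaviour of a wasm i32x4 add.
using I32x4 = std::array<uint32_t, 4>;

// Lane-wise add.
I32x4 Add(const I32x4& a, const I32x4& b) {
  I32x4 r{};
  for (int i = 0; i < 4; ++i) r[i] = a[i] + b[i];
  return r;
}

// One-input shuffle: result lane i takes input lane idx[i].
I32x4 Shuffle(const I32x4& v, const std::array<int, 4>& idx) {
  I32x4 r{};
  for (int i = 0; i < 4; ++i) r[i] = v[idx[i]];
  return r;
}

// The add + shuffle chain a matcher would look for (indices are an
// assumption; real inputs may use other, equivalent shuffles).
uint32_t ReduceWithShuffles(const I32x4& v) {
  // Lanes 0 and 1 now hold v0+v2 and v1+v3; lanes 2 and 3 are don't-care.
  I32x4 step1 = Add(v, Shuffle(v, {2, 3, 2, 3}));
  // Lane 0 now holds (v0+v2) + (v1+v3); the other lanes are don't-care.
  I32x4 step2 = Add(step1, Shuffle(step1, {1, 1, 1, 1}));
  return step2[0];
}

// Adds adjacent lane pairs, with the input used as both operands.
I32x4 PairwiseAdd(const I32x4& v) {
  return {v[0] + v[1], v[2] + v[3], v[0] + v[1], v[2] + v[3]};
}

// The same reduction expressed as two pairwise adds.
uint32_t ReduceWithPairwise(const I32x4& v) {
  return PairwiseAdd(PairwiseAdd(v))[0];
}
```

Both functions leave the same value in lane 0, which is the equivalence the combining step would rely on.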
Does this sound reasonable? Or is plumbing this into the TS IR likely to be significantly more complicated than backend pattern matching?
Thanks,
Sam