Hi Matt,
I tried -debug-only=isel and have some more information.
All the steps before 'Legalized selection' (initial, lowered, type-legalized) use a v2i32 load. At the 'Legalized selection' step, one v2i32 load is replaced by two i32 loads + shl + or + bitcast (I have a pattern for converting from v2i32 to 2*i32).
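To make the expanded sequence concrete, here is a rough standalone C++ sketch of what I understand it to compute. This is not LLVM code: the function names are made up for illustration, and it assumes a little-endian layout with the first i32 load ending up in lane 0.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: mimics "two i32 loads + shl + or + bitcast".
// Assumes little-endian byte order.
std::array<std::uint32_t, 2> load_v2i32_expanded(const unsigned char *p) {
    std::uint32_t lo, hi;
    std::memcpy(&lo, p, sizeof lo);      // first i32 load
    std::memcpy(&hi, p + 4, sizeof hi);  // second i32 load, 4 bytes further on

    // shl + or: pack both halves into a single 64-bit integer
    std::uint64_t packed = (static_cast<std::uint64_t>(hi) << 32) | lo;

    // bitcast: reinterpret the 64-bit integer as two 32-bit lanes
    std::array<std::uint32_t, 2> v;
    std::memcpy(v.data(), &packed, sizeof packed);
    return v;
}

// Hypothetical helper: what a single v2i32 load would produce.
std::array<std::uint32_t, 2> load_v2i32_direct(const unsigned char *p) {
    std::array<std::uint32_t, 2> v;
    std::memcpy(v.data(), p, sizeof v);
    return v;
}
```

On a little-endian host both functions should return the same two lanes for the same 8 bytes, so the expansion looks functionally equivalent to the single load, just longer.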
Can you think of any other places where something has to be declared legal?
Thanks,
Xiaochu