Hi,
MachineVerifier allows a use of a physical register iff any of its
subregisters is defined:
https://github.com/llvm/llvm-project/blob/f302e0b5dd402e629620a58f9115a3441c65d60f/llvm/lib/CodeGen/MachineVerifier.cpp#L2305
This means that a wide COPY like this is legal, even if only some of
the individual subregisters ($sgpr0, $sgpr1, $sgpr2) are defined:
$vgpr1_vgpr2_vgpr3 = COPY killed $sgpr0_sgpr1_sgpr2
But if the target’s copyPhysReg splits it into multiple word-sized
copies like this, then some of them may no longer be legal, because
the corresponding source is completely undefined:
$vgpr3 = V_MOV_B32_e32 $sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1
$vgpr1 = V_MOV_B32_e32 $sgpr0
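To make the rule concrete, here is a toy Python model of it. This is not LLVM code: the table `SUBREGS` and the helper `use_is_legal` are hypothetical names invented for illustration, and they mimic only the "any subregister is defined" check described above, assuming that just $sgpr0 was written before the copy.

```python
# Toy model of the "any subregister is defined" rule; not LLVM code.
# SUBREGS and use_is_legal are hypothetical names used for illustration.
SUBREGS = {
    "sgpr0_sgpr1_sgpr2": {"sgpr0", "sgpr1", "sgpr2"},
    "sgpr0": {"sgpr0"},
    "sgpr1": {"sgpr1"},
    "sgpr2": {"sgpr2"},
}


def use_is_legal(reg, defined_units):
    """Accept a use iff at least one 32-bit part of reg is defined."""
    return bool(SUBREGS[reg] & defined_units)


# Assumption for this example: only $sgpr0 has been defined so far.
defined = {"sgpr0"}

# The wide use is accepted because one of its parts is defined.
wide_ok = use_is_legal("sgpr0_sgpr1_sgpr2", defined)

# After splitting, each word-sized use is checked on its own.
split_ok = {r: use_is_legal(r, defined) for r in ("sgpr2", "sgpr1", "sgpr0")}

print("wide copy legal:", wide_ok)
print("split copies legal:", split_ok)
```

In this model the wide use passes, while the split uses of $sgpr2 and $sgpr1 fail because nothing defined them.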
Various targets (AMDGPU, Sparc, SystemZ) work around this in
copyPhysReg by adding extra implicit operands to the word-sized copy
instructions to satisfy MachineVerifier. But these extra operands make
it look like there are dependencies between instructions that have
none. This restricts post-RA scheduling freedom and confuses other
late codegen passes. For example, AMDGPU actually adds all of these
implicit operands for the example above [edited slightly to remove
irrelevant stuff]:
$vgpr3 = V_MOV_B32_e32 $sgpr2, implicit-def $vgpr1_vgpr2_vgpr3, implicit $sgpr0_sgpr1_sgpr2
$vgpr2 = V_MOV_B32_e32 $sgpr1, implicit $sgpr0_sgpr1_sgpr2
$vgpr1 = V_MOV_B32_e32 $sgpr0, implicit killed $sgpr0_sgpr1_sgpr2
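The effect on scheduling can be sketched with a toy dependency check. This is not LLVM's scheduler: `UNITS`, `units`, `depends` and `dep_pairs` are hypothetical helpers, the model treats an implicit-def as a full definition of every 32-bit part, and it ignores `killed` flags entirely.

```python
# Toy dependency check between instructions; not LLVM's scheduler.
# All names here are hypothetical and exist only for this sketch.
UNITS = {
    "vgpr1_vgpr2_vgpr3": {"vgpr1", "vgpr2", "vgpr3"},
    "sgpr0_sgpr1_sgpr2": {"sgpr0", "sgpr1", "sgpr2"},
}


def units(regs):
    """Expand register names into the set of 32-bit parts they cover."""
    out = set()
    for r in regs:
        out |= UNITS.get(r, {r})
    return out


def depends(earlier, later):
    """True if later must stay after earlier (RAW, WAR or WAW overlap)."""
    e_defs, e_uses = units(earlier["defs"]), units(earlier["uses"])
    l_defs, l_uses = units(later["defs"]), units(later["uses"])
    return bool(e_defs & l_uses or e_uses & l_defs or e_defs & l_defs)


def dep_pairs(insts):
    """List every ordered pair (i, j), i < j, that carries a dependency."""
    return [(i, j)
            for i in range(len(insts))
            for j in range(i + 1, len(insts))
            if depends(insts[i], insts[j])]


# The three word-sized copies without any extra operands.
plain = [
    {"defs": ["vgpr3"], "uses": ["sgpr2"]},
    {"defs": ["vgpr2"], "uses": ["sgpr1"]},
    {"defs": ["vgpr1"], "uses": ["sgpr0"]},
]

# The same copies with the implicit operands shown above.
with_implicit = [
    {"defs": ["vgpr3", "vgpr1_vgpr2_vgpr3"],
     "uses": ["sgpr2", "sgpr0_sgpr1_sgpr2"]},
    {"defs": ["vgpr2"], "uses": ["sgpr1", "sgpr0_sgpr1_sgpr2"]},
    {"defs": ["vgpr1"], "uses": ["sgpr0", "sgpr0_sgpr1_sgpr2"]},
]

print("plain:", dep_pairs(plain))
print("with implicit operands:", dep_pairs(with_implicit))
```

In this model the plain copies are mutually independent, while the implicit-def on the first copy ties both later copies to it. Taking the `killed` flag into account would add further ordering constraints that the sketch does not capture.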
Because of these drawbacks, I would like to find a better solution. I can think of three options:
1. Use subreg liveness information in copyPhysReg to only copy the
parts of the wide register that are live. I tried that in D113017
"[AMDGPU] Avoid copying dead subregisters in copyPhysReg" and it seems
to work, but it also has a measurable (0.7%) compile-time cost, which
seems unfortunate.
2. Change the physical subreg liveness rules to say that all parts of
a physical register have to be defined. I’m not sure how to implement
this, but I suppose it would mean we would need more IMPLICIT_DEF
instructions to satisfy the verifier.
3. Change the physical subreg liveness rules to say that no part of a
physical register has to be defined. I guess this would be unpopular
because it means we end up with no liveness verification at all.
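For reference, the idea behind option 1 can be sketched on a toy model. This is not the D113017 patch: `split_copy` is a hypothetical helper, and a plain set of defined 32-bit parts stands in for the real subreg liveness information that copyPhysReg would have to query.

```python
# Sketch of option 1 on a toy model; not the D113017 patch itself.
# split_copy is a hypothetical helper, and defined_units is a stand-in
# for real subregister liveness information.
def split_copy(dst_parts, src_parts, defined_units):
    """Emit one 32-bit move per part whose source is defined; skip the rest."""
    moves = []
    # Walk parts from high to low, matching the order in the example above.
    for dst, src in reversed(list(zip(dst_parts, src_parts))):
        if src in defined_units:
            moves.append(f"${dst} = V_MOV_B32_e32 ${src}")
    return moves


# Assumption for this example: $sgpr1 was never defined.
moves = split_copy(
    ["vgpr1", "vgpr2", "vgpr3"],
    ["sgpr0", "sgpr1", "sgpr2"],
    {"sgpr0", "sgpr2"},
)

for m in moves:
    print(m)
```

Because the move from the undefined $sgpr1 is never emitted, no implicit operands are needed to keep the remaining copies legal under the current rule.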
Thoughts?
Thanks,
Jay.
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev