I'd like to define a pseudo-instruction whose expansion will, as a
side effect, overwrite an input register's value: the pseudo-instruction

    ldw r1:r2, P

to load 2 bytes from memory address P is to be expanded to

    ld r1, P+
    ld r2, P

where "ld _, P+" is an instruction that loads a single byte from P and
post-increments P by one.
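The side effect described above can be made concrete with a small stand-alone C++ model. This is not LLVM code: the `Machine` struct and the helper names are hypothetical and exist only for this sketch, which assumes a flat byte-addressed memory and a 16-bit pointer register.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy machine state (hypothetical, for illustration only):
// a byte-addressed memory and a 16-bit pointer register P.
struct Machine {
  std::vector<std::uint8_t> Mem;
  std::uint16_t P = 0;
};

// Models "ld rd, P+": load one byte from P, then post-increment P by one.
inline std::uint8_t ldPostInc(Machine &M) { return M.Mem.at(M.P++); }

// Models "ld rd, P": load one byte from P, leaving P unchanged.
inline std::uint8_t ld(const Machine &M) { return M.Mem.at(M.P); }

// The two destination bytes of "ldw r1:r2, P".
struct LdwResult {
  std::uint8_t R1;
  std::uint8_t R2;
};

// Models the expansion of "ldw r1:r2, P" into "ld r1, P+; ld r2, P".
// Note that M.P is modified even though P is nominally an input.
inline LdwResult ldwExpanded(Machine &M) {
  std::uint8_t R1 = ldPostInc(M);
  std::uint8_t R2 = ld(M);
  return {R1, R2};
}
```

Running `ldwExpanded` on a machine with P = 0 leaves P = 1, so any later code that still expects the original address in P reads the wrong value. That is the clobber the instruction definition has to expose.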
How can I represent this behaviour in LLVM? Currently, I have
    let Constraints = "@earlyclobber $reg" in
    def LDWRdPtr : Pseudo<(outs DREGS:$reg),
                          (ins PTRREGS:$ptrreg),
                          "ldw\t$reg, $ptrreg",
                          [(set i16:$reg, (load i16:$ptrreg))]>,
                   Requires<[HasSRAM]>;
The problem, of course, is that with this definition I end up with code
that assumes saving P before 'ldw r1:r2, P' is equivalent to saving it
after. I tried adding "@earlyclobber $ptrreg" as a Constraint, but that just
leads to an assertion failure during codegen (I assume because @earlyclobber
is only meant for output ports):
void llvm::MachineOperand::setIsEarlyClobber(bool): Assertion `isReg() && IsDef && "Wrong MachineOperand accessor"' failed.
Thanks,
Gergo
_______________________________________________
LLVM Developers mailing list
llvm...@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
You need to express P as both an input and an output operand, and add a constraint that both must be the same register.
David
> You need to express the P as both an input and output operand and add a constraint that both must be the same register.
OK, but then the pattern will have to include that extra output operand
somehow, right? What would the pattern need to be so that during ISel,
this LDWRdPtr instruction with the extra output still matches?
> This is typically accomplished with something like PPC's `RegConstraint` and
> `NoEncode`. You can see examples of it that are very similar to what you're after in
> PPC's load/store with update forms (i.e. load a value and update the base register
> with the effective address - these are used for pre-increment loads/stores).
> For example: the definition of LBZU and friends in lib/Target/PowerPC/PPCInstrInfo.td.
> For a simpler example of just the `RegConstraint` usage (as it doesn't use a compound
> node like PPC's address nodes), you can look at the fused multiply-add instructions such as
> XSMADDADP in lib/Target/PowerPC/PPCInstrVSX.td.
>
> Hope this helps.
Thanks!
However, none of the NoEncode examples in PPCInstrInfo.td seem to have an
isel pattern; and the VSX examples, like XSMADDADP, seem to match on
setting a single output:
    let BaseName = "XSMADDADP" in {
    let isCommutable = 1 in
    def XSMADDADP : XX3Form<60, 33,
                            (outs vsfrc:$XT), (ins vsfrc:$XTi, vsfrc:$XA, vsfrc:$XB),
                            "xsmaddadp $XT, $XA, $XB", IIC_VecFP,
                            [(set f64:$XT, (fma f64:$XA, f64:$XB, f64:$XTi))]>,
                    RegConstraint<"$XTi = $XT">, NoEncode<"$XTi">,
                    AltVSXFMARel;
If I'm reading this right, this matches an instruction that updates $XT
by taking the current $XT, and two extra args in $XA and $XB. However, my
situation would be something akin to
(set f64:$XC, (fma f64:$XA, f64:$XB, f64:$XTi))
with the extra constraint that $XTi is overwritten in the process.
Is there maybe a way to write a pattern like
(set (tuple f64:$XC, f64:$XT), (fma f64:$XA, f64:$XB, f64:$XTi))
that would match
(set f64:$XC, (fma f64:$XA, f64:$XB, f64:$XTi))
by automatically lifting it to store $XT as well? (of course, with a
RegConstraint that $XT = $XTi)
> So the idea is that you specify your `outs` in the instruction definition, one of
> those will have a `RegConstraint` on them and finally, you emit these nodes in your
> <TargetName>ISelDAGToDAG.cpp.
OK, thanks, I now get the basic idea -- but I'm still struggling with the
implementation.
In my ISelDAGToDAG, if I match something like
Selecting: t17: i16,ch = load<LD2[%v25](align=1)(dereferenceable)> t16:1, t2, undef:i16
then whatever I return as the machine node, it will have to be of the same
type, i.e. (i16, ch), right? But if I have this extra output port for the
changed address register, that means my output is now (i16, i16, ch). It
is unclear to me how to reconcile that with the original abstract node
that I'm matching on.
In more concrete terms, I tried ignoring this and just copying the address
argument and the chain:
    const LoadSDNode *LD = cast<LoadSDNode>(N);
    int Offs = cast<ConstantSDNode>(LD->getOffset())->getSExtValue();
    if (AM == ISD::UNINDEXED && Offs == 0) {
      SDNode* LDW = CurDAG->getMachineNode(
          AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other,
          LD->getBasePtr(), LD->getChain());
      ReplaceNode(N, LDW);
      return true;
    }
but this fails with
/home/cactus/prog/rust/rust-avr/llvm/include/llvm/Support/Casting.h:222:
typename std::enable_if<(! llvm::is_simple_type<Y>::value), typename
llvm::cast_retty<X, const Y>::ret_type>::type llvm::cast(const Y&)
[with X = llvm::ConstantSDNode;
Y = llvm::SDValue;
typename std::enable_if<(!llvm::is_simple_type<Y>::value),
typename llvm::cast_retty<X, const Y>::ret_type>::type = llvm::ConstantSDNode*]:
Assertion `isa<X>(Val) && "cast<Ty>() argument of incompatible type!"' failed.
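A note on the assertion above: it reports a failed cast from an `SDValue` to `ConstantSDNode`, and the only such cast in the snippet is on `LD->getOffset()`. In the DAG dump the matched load's offset operand is `undef:i16`, not a constant, which is consistent with that failure. The stand-alone C++ sketch below is a toy model of ordering the checks so that the offset is inspected only when it is known to be a constant. `ToyLoad`, `ToyOffset`, `ToyIndexMode` and the function names are hypothetical and are not LLVM types. Treating an unindexed load as having an effective offset of zero is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not LLVM's types.
enum class ToyIndexMode { Unindexed, PostInc };

struct ToyOffset {
  bool IsConstant;     // false models an `undef` offset operand
  std::int64_t Value;  // meaningful only when IsConstant is true
};

struct ToyLoad {
  ToyIndexMode Mode;
  ToyOffset Offset;
};

// Checked accessor, similar in spirit to a checked cast:
// it asserts when the offset is not a constant.
inline std::int64_t constantOffset(const ToyLoad &L) {
  assert(L.Offset.IsConstant && "offset operand is not a constant");
  return L.Offset.Value;
}

// Tests the addressing mode first, so the checked accessor is never
// reached for an unindexed load whose offset operand is undef.
// Assumption of this sketch: an unindexed load has effective offset 0.
inline std::int64_t effectiveOffset(const ToyLoad &L) {
  if (L.Mode == ToyIndexMode::Unindexed)
    return 0;
  return constantOffset(L);
}
```

The point of the ordering is that the mode test guards the accessor. The original snippet does it the other way round: it reads the offset unconditionally and only then compares the addressing mode.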
Any more hints, please?
Thanks,
Gergo
llc: CodeGen/SelectionDAG/SelectionDAG.cpp:6518:
void llvm::SelectionDAG::ReplaceAllUsesWith(llvm::SDNode*, llvm::SDNode*):
Assertion `(!From->hasAnyUseOfValue(i) || From->getValueType(i) == To->getValueType(i))
&& "Cannot use this version of ReplaceAllUsesWith!"'
failed.
which I assume is because of the type difference I mentioned: (i16, i16,
ch) vs (i16, ch) on the output ports.
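The assertion text says the node-to-node form of ReplaceAllUsesWith requires each used result of the old node to have the same type as the same-numbered result of the new node. With (i16, ch) on one side and (i16, i16, ch) on the other, result 1 is `ch` in the old node but `i16` in the new one. The usual way around this is to replace uses one result at a time, mapping old result 0 to new result 0 and old result 1 (the chain) to new result 2. As far as I know, LLVM offers per-value helpers for this (`ReplaceUses(SDValue, SDValue)` in SelectionDAGISel and `SelectionDAG::ReplaceAllUsesOfValueWith`). The stand-alone C++ sketch below only models the idea; `ToyNode`, `ToyUse` and `replaceResult` are hypothetical names and not part of LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not LLVM's classes.
enum class Ty { I16, Ch };

struct ToyNode {
  std::vector<Ty> ResultTypes;  // one entry per result of the node
};

// A use refers to one particular result (by index) of one node.
struct ToyUse {
  const ToyNode *Node;
  std::size_t ResNo;
};

// Redirects every use of (From, FromRes) to (To, ToRes).
// Refuses (returns false) when the two results differ in type,
// mirroring the per-result type check in the assertion.
inline bool replaceResult(std::vector<ToyUse> &Uses,
                          const ToyNode &From, std::size_t FromRes,
                          const ToyNode &To, std::size_t ToRes) {
  if (From.ResultTypes.at(FromRes) != To.ResultTypes.at(ToRes))
    return false;
  for (ToyUse &U : Uses)
    if (U.Node == &From && U.ResNo == FromRes)
      U = ToyUse{&To, ToRes};
  return true;
}
```

For the case in this thread, the old load would be modelled as `{I16, Ch}` and the new LDWRdPtr as `{I16, I16, Ch}`. Calling `replaceResult` with (0, 0) and then (1, 2) succeeds, while (1, 1) is rejected because it would connect a chain user to the i16 pointer result.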
> the type difference I mentioned ((i16, i16, ch) vs (i16, ch) on the
> output ports)
OK I managed to make some progress on this, by using a MergeValues node:
    SDNode* LDW = CurDAG->getMachineNode(
        AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other,
        LD->getBasePtr(), LD->getChain());
    SDValue Unpack[] = { SDValue(LDW, 0), SDValue(LDW, 2) };
    SDNode* NN = CurDAG->getMergeValues(Unpack, SDLoc(N)).getNode();
    ReplaceNode(N, NN);
which gets me from
t17: i16,ch = load<LD2[%v25](align=1)(dereferenceable)> t16:1, t2, undef:i16
to
t24: i16,i16,ch = LDWRdPtr t2, t16:1
which looks good; but then it fails during scheduling with
llc: CodeGen/SelectionDAG/InstrEmitter.cpp:303:
unsigned int llvm::InstrEmitter::getVR(
llvm::SDValue,
llvm::DenseMap<llvm::SDValue, unsigned int>&):
Assertion `I != VRBaseMap.end() && "Node emitted out of order - late"' failed.
For reference, the full DAG before and after ISEL:
SelectionDAG has 22 nodes:
t0: ch = EntryToken
t2: i16,ch = CopyFromReg t0, Register:i16 %vreg0
t5: i16,ch = load<LD2[%v25](align=1)(dereferenceable)> t0, t2, undef:i16
t9: ch,glue = callseq_start t5:1, TargetConstant:i16<0>
t11: ch,glue = CopyToReg t9, Register:i16 %R25R24, t5
t13: ch,glue = CALL t11, TargetGlobalAddress:i16<i8 (i16)* @read_ram> 0, Register:i16 %R25R24, RegisterMask:Untyped, t11:1
t14: ch,glue = callseq_end t13, TargetConstant:i16<0>, TargetConstant:i16<0>, t13:1
t16: i8,ch,glue = CopyFromReg t14, Register:i8 %R24, t14:1
t17: i16,ch = load<LD2[%v25](align=1)(dereferenceable)> t16:1, t2, undef:i16
t18: ch,glue = callseq_start t17:1, TargetConstant:i16<0>
t19: ch,glue = CopyToReg t18, Register:i16 %R25R24, t17
t20: ch,glue = CALL t19, TargetGlobalAddress:i16<i8 (i16)* @read_ram> 0, Register:i16 %R25R24, RegisterMask:Untyped, t19:1
t21: ch,glue = callseq_end t20, TargetConstant:i16<0>, TargetConstant:i16<0>, t20:1
t22: i8,ch,glue = CopyFromReg t21, Register:i8 %R24, t21:1
t23: ch = RET_FLAG t22:1
SelectionDAG has 23 nodes:
t0: ch = EntryToken
t2: i16,ch = CopyFromReg t0, Register:i16 %vreg0
t9: i16,ch,glue = ADJCALLSTACKDOWN TargetConstant:i16<0>, t27:1
t11: ch,glue = CopyToReg t9:1, Register:i16 %R25R24, t27
t13: ch,glue = CALLk TargetGlobalAddress:i16<i8 (i16)* @read_ram> 0, Register:i16 %R25R24, RegisterMask:Untyped, t11, t11:1
t14: i16,ch,glue = ADJCALLSTACKUP TargetConstant:i16<0>, TargetConstant:i16<0>, t13, t13:1
t18: i16,ch,glue = ADJCALLSTACKDOWN TargetConstant:i16<0>, t25:1
t19: ch,glue = CopyToReg t18:1, Register:i16 %R25R24, t25
t20: ch,glue = CALLk TargetGlobalAddress:i16<i8 (i16)* @read_ram> 0, Register:i16 %R25R24, RegisterMask:Untyped, t19, t19:1
t21: i16,ch,glue = ADJCALLSTACKUP TargetConstant:i16<0>, TargetConstant:i16<0>, t20, t20:1
t16: i8,ch,glue = CopyFromReg t14:1, Register:i8 %R24, t14:2
t24: i16,i16,ch = LDWRdPtr t2, t16:1
t25: i16,ch = merge_values t24, t24:2
t26: i16,i16,ch = LDWRdPtr t2, t0
t27: i16,ch = merge_values t26, t26:2
t22: i8,ch,glue = CopyFromReg t21:1, Register:i8 %R24, t21:2
t23: ch = RET t22:1