See this bug.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Unfortunately there is a codegen issue with this patch. Since the Instruction would no longer have an old operand, we couldn't create the DPP version (See GCNDPPCombine.cpp:244). I have just pushed a test for that 50dfd3e9e4930834c3c77a289000fd57bbc16727
So for _e64_dpp we need both old and src2 and both these operands must be tied to dst, correct?
Yes, it seems that would be correct to me.
The FIXME hints the presence of src2 at all is a hack, but I'm not sure how practical it is to remove src2 completely.
Unfortunately there is a codegen issue with this patch.
It looks like we have never supported dpp variants of MAC and FMAC opcodes in codegen. e32 dpp definitions also lack $old argument. See https://github.com/llvm/llvm-project/blob/2812a1413f19b9ebd954b57ecbd021656508eade/llvm/lib/Target/AMDGPU/VOP2Instructions.td#L417
So for _e64_dpp we need both old and src2 and both these operands must be tied to dst, correct?
Yes, it seems that would be correct to me.
After a bit of digging I found that we cannot tie more than one operand to another. See https://github.com/llvm/llvm-project/blob/2812a1413f19b9ebd954b57ecbd021656508eade/llvm/utils/TableGen/CodeGenInstruction.cpp#L372
I believe it is the reason why codegen does not support dpp variants of MAC and FMAC opcodes.
If I understand the current state: e32 dpp is not creatable but is assemblable, e64 dpp is creatable but not assemblable. Perhaps a way to fix _e64 is in the asmParser in cvtVOP3DPP ~L8900? I think it's treating the operand as src2 when it is actually the old operand.
If I understand the current state: e32 dpp is not creatable but is assemblable, e64 dpp is creatable but not assemblable.
Correct. I'm not sure e64 dpp is creatable but at least it has $old operand.
Perhaps a way to fix _e64 is in the asmParser in cvtVOP3DPP ~L8900? I think it's treating the operand as src2 when it is actually the old operand.
cvtVOP3DPP incorrectly handles $old because it is not tied to $dst.
It seems possible to change cvtVOP3DPP to handle v_fmac opcodes correctly. But if we go that way, $src2 will be tied to $dst, but $old will be present but not tied. Is this acceptable?
I believe it is actually that way currently in main. And it seems to work in GCNDPPCombine at least. Though I have not strongly tested it, maybe it's wasting a register? @rampitec
Tablegen dump yields :
def V_FMAC_F32_e64_dpp
... dag InOperandList = (ins anonymous_9483:$old ... ... string Constraints = "$vdst = $src2"; string DisableEncoding = "$src2";
I think the most correct thing to do is remove src2 operand from FMAC. But the FIXME at L410 implies we had good reason for adding src2. If we don't want to do that, we should either fix GCNDPPCombine to work without an old operand for these instruction, or fix the Asm/Dasm to work for them.
Correct assembler and disassembler to handle special fmac operands (old, src2, src2_modifiers).