V_FMAC_F32 and V_DOT2C_F32_F16 have a dummy src2 operand tied to vdst to
inform passes that the instructions read the dst operand. The VOPD
versions of these instructions lacked the dummy operand, which was a
problem for inserting s_delay_alu.
Introduce the dummy src2 operand on the VOPD versions, and fix the VOPD operand
tracking logic to account for it.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Overall looks fine, but I want somebody to look at codegen changes. You could leave MC code refactoring to me (but I'm not sure if I'll be able to improve code that much).
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp | ||
---|---|---|
8567 | This may be moved out of the loop (together with CInfo.hasSrc2Acc()). | |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
540 | hasRegularSrcOperand looks counterintuitive now because tied src2 is actually a quite special operand. Counting this operand in ComponentProps.SrcOperandsNum is also debatable. Maybe handling this operand explicitly would make code more clear. It will also make error position correct. But it will require changes in parser and codegen which I believe you tried to avoid. | |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
580 | I believe this change may be avoided by making ComponentLayout a subclass of ComponentProps. | |
llvm/test/MC/AMDGPU/gfx11_asm_vopd_errs.s | ||
269 | It's a pity. |
Overall looks fine, but I want somebody to look at codegen changes.
The changes in test/CodeGen/AMDGPU/vopd-src2acc-delay.mir are good.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
---|---|---|
563–565 | You are right, but I'm not sure if ... : Kind(Kind), ... is a good and clear style. |
LGTM (with or without MC code corrections).
llvm/lib/Target/AMDGPU/VOP2Instructions.td | ||
---|---|---|
439 | I think these two lines are too long even for td files. |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
---|---|---|
563–565 | It is ubiquitous, and I think it's much better than inventing lots of different conventions (like trailing underscore) to make the argument name different frmo the field name. |
use const auto &, hoist loop invariant code, shorten td line length, delete parameter name underscores
Thanks for the comments. I will plan to land this after rebasing and testing, and leave aside the remaining refactoring opportunities for now.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp | ||
---|---|---|
540 | Yes, I have tried to avoid special case code in the parser and codegen. Counting the operand in SrcOperandsNum is indeed debatable but it is a standard tied operand from a CodeGen perspective and the VOPD rules are the same for dst and src2, so it works out cleanly. I'm open to an NFC (except for error caret position) refactor in the future. | |
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
580 | It seems natural to delete ComponentLayout and incorporate its functionaliy into ComponentInfo. I'd prefer that was in a separate patch. |
This may be moved out of the loop (together with CInfo.hasSrc2Acc()).