Page MenuHomePhabricator

[AMDGPU] Fix delay alu for VOPD with src2acc

Authored by Joe_Nash on Oct 24 2022, 10:59 AM.



V_FMAC_F32 and V_DOT2C_F32_F16 have a dummy src2 operand tied to vdst to
inform passes that the instructions read the dst operand. The VOPD
versions of these instructions lacked the dummy operand, which was a
problem for inserting s_delay_alu.
Introduce the dummy src2 operand on the VOPD versions, and fix the VOPD operand
tracking logic to account for it.

Diff Detail

Event Timeline

Joe_Nash created this revision.Oct 24 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 10:59 AM
Joe_Nash requested review of this revision.Oct 24 2022, 10:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 10:59 AM
dp added a comment.Oct 24 2022, 12:32 PM

Overall looks fine, but I want somebody to look at codegen changes. You could leave MC code refactoring to me (but I'm not sure if I'll be able to improve code that much).


This may be moved out of the loop (together with CInfo.hasSrc2Acc()).


hasRegularSrcOperand looks counterintuitive now because tied src2 is actually a quite special operand. Counting this operand in ComponentProps.SrcOperandsNum is also debatable. Maybe handling this operand explicitly would make code more clear. It will also make error position correct. But it will require changes in parser and codegen which I believe you tried to avoid.


I believe this change may be avoided by making ComponentLayout a subclass of ComponentProps.


It's a pity.

foad added inline comments.Oct 25 2022, 3:22 AM

Maybe const auto & here and on the declaration of InstInfo?


Don't need any of these trailing underscores.

foad added a comment.Oct 25 2022, 3:24 AM

Overall looks fine, but I want somebody to look at codegen changes.

The changes in test/CodeGen/AMDGPU/vopd-src2acc-delay.mir are good.

dp added inline comments.Oct 25 2022, 6:20 AM

You are right, but I'm not sure if ... : Kind(Kind), ... is a good and clear style.

dp accepted this revision.Oct 25 2022, 7:28 AM

LGTM (with or without MC code corrections).


I think these two lines are too long even for td files.

This revision is now accepted and ready to land.Oct 25 2022, 7:28 AM
foad added inline comments.Oct 25 2022, 7:40 AM

It is ubiquitous, and I think it's much better than inventing lots of different conventions (like trailing underscore) to make the argument name different frmo the field name.

Joe_Nash updated this revision to Diff 470508.Oct 25 2022, 8:05 AM
Joe_Nash marked 6 inline comments as done.

use const auto &, hoist loop invariant code, shorten td line length, delete parameter name underscores

Thanks for the comments. I will plan to land this after rebasing and testing, and leave aside the remaining refactoring opportunities for now.


Yes, I have tried to avoid special case code in the parser and codegen. Counting the operand in SrcOperandsNum is indeed debatable but it is a standard tied operand from a CodeGen perspective and the VOPD rules are the same for dst and src2, so it works out cleanly. I'm open to an NFC (except for error caret position) refactor in the future.


It seems natural to delete ComponentLayout and incorporate its functionaliy into ComponentInfo. I'd prefer that was in a separate patch.

This revision was landed with ongoing or failed builds.Oct 25 2022, 10:11 AM
This revision was automatically updated to reflect the committed changes.