This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add missing testcase for SGPR to AGPR copy
ClosedPublic

Authored by hsmhsm on Mar 22 2022, 10:12 PM.

Details

Summary

and, also update the function indirectCopyToAGPR() to ensure that it is called only on GFX908 sub-target.

Diff Detail

Event Timeline

hsmhsm created this revision.Mar 22 2022, 10:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2022, 10:12 PM
hsmhsm requested review of this revision.Mar 22 2022, 10:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 22 2022, 10:12 PM
hsmhsm updated this revision to Diff 417536.Mar 23 2022, 2:14 AM

hasGFX90AInsts() is true for both GFX90A and GFX940 subtargets.

foad added a comment.Mar 23 2022, 2:27 AM

Can you simplify the patch? I think all it is doing is (a) adding a new test and (b) simplifying the "if" on line 609 because the condition is always true. All the other changes make it harder to see what the patch is really doing.

Can you simplify the patch? I think all it is doing is (a) adding a new test and (b) simplifying the "if" on line 609 because the condition is always true. All the other changes make it harder to see what the patch is really doing.

For sure, I need to change review message now based on my latest update where I lately realized that "hasGFX90AInsts() is true for both GFX90A and GFX940 sub-targets".

However, the whole intention of this patch is to make sure that the function indirectCopyToAGPR() is meaningful only in case of GFX908, and should be called only in case of GFX908(). Hence updated indirectCopyToAGPR() accordingly. I am not sure what else can be simplified here apart from meaningfully updating the review message.

hsmhsm retitled this revision from [AMDGPU] VGPR need to be reserved for AGPR copy *only* on subtarget GFX908 to [AMDGPU] Add missing testcase for SGPR to AGPR copy.Mar 23 2022, 2:50 AM
hsmhsm edited the summary of this revision. (Show Details)
foad added inline comments.Mar 23 2022, 3:22 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
554–560

Either do this or don't. Adding another FIXME is just more clutter. Anyway I think it is more of a TODO than a FIXME since it is not really broken.

606

I don't think you need a FIXME comment for this. It looks like the code is using the scavenger to try to find a low-numbered vgpr, and only using the reserved one if that fails.

hsmhsm updated this revision to Diff 417553.Mar 23 2022, 3:31 AM
hsmhsm edited the summary of this revision. (Show Details)

Fix review comments by Jay.

hsmhsm marked 2 inline comments as done.Mar 23 2022, 3:32 AM
cdevadas added inline comments.Mar 23 2022, 3:43 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
628

Shouldn't this be Tmp = MIN (Tmp, Tmp2) if Tmp2 is a valid RegNo?
RS might return a higher number than the reserved.
(Unrelated to this patch)

rampitec added inline comments.Mar 23 2022, 8:46 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
628

No. It should rotate 3 different VGPRs with each call.

rampitec accepted this revision.Mar 23 2022, 8:48 AM
This revision is now accepted and ready to land.Mar 23 2022, 8:48 AM
This revision was landed with ongoing or failed builds.Mar 23 2022, 9:08 AM
This revision was automatically updated to reflect the committed changes.