This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix using physical registers in vector instructions
AbandonedPublic

Authored by Flakebi on Mar 9 2020, 9:34 AM.

Details

Summary

We can only get the register classes of virtual registers, not of
physical registers. The previous code crashed for vector instructions that
use physical registers.

Diff Detail

Event Timeline

Flakebi created this revision.Mar 9 2020, 9:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2020, 9:34 AM
Flakebi updated this revision to Diff 249133.Mar 9 2020, 9:36 AM

Remove accidentally added empty line

arsenm added a comment.Mar 9 2020, 9:46 AM

Need test and I would expect a physical register to never appear in this context

foad added a comment.Mar 9 2020, 9:47 AM

Test case? Is this related to the atomic optimizer and/or the ballot intrinsic?

Harbormaster failed remote builds in B48569: Diff 249132!

I’ll add a test case. Yes, this is related to the atomic optimizer and ballot intrinsic. There we get e.g. %0:vgpr_32 = V_MBCNT_LO_U32_B32_e64 $exec_lo, 0, implicit $exec.

arsenm added a comment.Mar 9 2020, 3:02 PM

I’ll add a test case. Yes, this is related to the atomic optimizer and ballot intrinsic. There we get e.g. %0:vgpr_32 = V_MBCNT_LO_U32_B32_e64 $exec_lo, 0, implicit $exec.

That should probably be reading a copy from exec_lo

I still think theses should not be seeing physical register operands, and it would be better to fix this by avoiding that situation

I still think theses should not be seeing physical register operands, and it would be better to fix this by avoiding that situation

Why is introducing a copy a better fix? And does this mean I should return a COPY when doing (ballot 1) -> EXEC/EXEC_LO here? https://reviews.llvm.org/D65088#C1855323NL8643

Wouldn’t a copy use an SGPR and increase the SGPR-count? Then we would generate less optimal code, right?

I still think theses should not be seeing physical register operands, and it would be better to fix this by avoiding that situation

Why is introducing a copy a better fix? And does this mean I should return a COPY when doing (ballot 1) -> EXEC/EXEC_LO here? https://reviews.llvm.org/D65088#C1855323NL8643

Wouldn’t a copy use an SGPR and increase the SGPR-count? Then we would generate less optimal code, right?

It should be eliminated later. Introducing physical register constraints earlier is generally bad. I would also worry about earlier passes not expected a non-implicit exec use

Flakebi abandoned this revision.Mar 13 2020, 8:54 AM

Thanks Matt, I’ll use CopyFromReg.