This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Stop using getMinimalPhysRegClass in LowerFormalArguments
ClosedPublic

Authored by foad on Mar 17 2022, 7:52 AM.

Details

Summary

NFCI. The motivation for this is avoid problems in future if we add new
classes containing only a subset of all VGPRs, or a subset of all SGPRs.
getMinimalPhysRegClass would favour these smaller classes, which is not
what we want here.

Diff Detail

Event Timeline

foad created this revision.Mar 17 2022, 7:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2022, 7:52 AM
foad requested review of this revision.Mar 17 2022, 7:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2022, 7:52 AM
foad added inline comments.Mar 17 2022, 7:54 AM
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
2551

As an alternative, I did try implementing a getMaximalAllocatablePhysRegClass and using it here. That seemed to cause other problems, but I can pursue it further if you think it's the right way to go.

arsenm accepted this revision.Mar 17 2022, 7:55 AM
This revision is now accepted and ready to land.Mar 17 2022, 7:55 AM

I think a switch case over known RegClasses that we want to use is the way to go. The only question is do we need to handle more cases. Is this passing testing? When I put assert(VT.getSizeInBits() == 32) here I get massive test failures. So what happens to wider types, they all become VGPR_32?

I think a switch case over known RegClasses that we want to use is the way to go. The only question is do we need to handle more cases. Is this passing testing? When I put assert(VT.getSizeInBits() == 32) here I get massive test failures. So what happens to wider types, they all become VGPR_32?

All arguments are supposed to be split into 32 bit pieces before they reach here

I think a switch case over known RegClasses that we want to use is the way to go. The only question is do we need to handle more cases. Is this passing testing? When I put assert(VT.getSizeInBits() == 32) here I get massive test failures. So what happens to wider types, they all become VGPR_32?

All arguments are supposed to be split into 32 bit pieces before they reach here

I'm assuming your failures are on 16-bit types which still use a 32-bit register class

foad added a comment.Mar 17 2022, 8:06 AM

Is this passing testing?

Yes.

When I put assert(VT.getSizeInBits() == 32) here I get massive test failures. So what happens to wider types, they all become VGPR_32?

See the calling conv definitions in AMDGPUCallingConv.td. All supported types (including i16 and f16) are passed in 32-bit GPRs, either VGPRn or SGPRn.

Joe_Nash accepted this revision.Mar 17 2022, 8:11 AM

I think a switch case over known RegClasses that we want to use is the way to go. The only question is do we need to handle more cases. Is this passing testing? When I put assert(VT.getSizeInBits() == 32) here I get massive test failures. So what happens to wider types, they all become VGPR_32?

All arguments are supposed to be split into 32 bit pieces before they reach here

I'm assuming your failures are on 16-bit types which still use a 32-bit register class

Yep. When I put assert(VT.getSizeInBits() <= 32) they go away.

Ok. LGTM

foad added a comment.Mar 17 2022, 8:14 AM

I think a switch case over known RegClasses that we want to use is the way to go.

Do you prefer this (cribbed from SITargetLowering::insertCopiesSplitCSR)?

const TargetRegisterClass *RC = nullptr;
if (AMDGPU::SGPR_32RegClass.contains(Reg))
  RC = &AMDGPU::SGPR_32RegClass;
else if (AMDGPU::VGPR_32RegClass.contains(Reg))
  RC = &AMDGPU::VGPR_32RegClass;
else
  llvm_unreachable("Unexpected register class in LowerFormalArguments!");

As you say, it seems guaranteed to be SGPR or VGPR by the calling convention. They compile to the same thing under optimization right? If so I slightly prefer the llvm_unreachable version because the intent is locally clear.

This revision was landed with ongoing or failed builds.Mar 17 2022, 8:24 AM
This revision was automatically updated to reflect the committed changes.