This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Restrict machine copy propagation from creating unaligned classes
ClosedPublic

Authored by rampitec on Mar 11 2022, 1:44 PM.

Diff Detail

Event Timeline

rampitec created this revision.Mar 11 2022, 1:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2022, 1:44 PM
rampitec requested review of this revision.Mar 11 2022, 1:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2022, 1:44 PM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm added inline comments.Mar 11 2022, 2:01 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4807

RI is already a class member

4808–4815

Why fix this up here when adjustAllocatableRegClass can return the aligned version in the first place?

arsenm added inline comments.Mar 11 2022, 2:04 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4748

This reservedRegsFrozen check is probably obsolete

rampitec updated this revision to Diff 414753.Mar 11 2022, 2:33 PM
rampitec marked 2 inline comments as done.
arsenm added inline comments.Mar 11 2022, 2:37 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4774

Why can't you just select the aligned class above to begin with?

rampitec added inline comments.Mar 11 2022, 2:41 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4774

This is only VGPRs, and not all VGPRs. Then it will use a condition at every case, while the code below is still needed.

rampitec edited the summary of this revision. (Show Details)Mar 11 2022, 3:17 PM
arsenm added inline comments.Mar 11 2022, 6:30 PM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4774

I don't think RC can be null here

4774

I don't understand. What do you mean not all VGPRs? You need the aligned class or not?

rampitec marked 2 inline comments as done.Mar 11 2022, 8:30 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4774

RC can be null here and it does for many tests.

The switch above only covers AV and only certain sizes. It does not cover neither vgprs nor agprs. The very first test I have added would not pass because it uses VReg_64.

rampitec marked an inline comment as done.Mar 11 2022, 8:57 PM

There seems to be some confusion here, it is not because AV classes has become allocatable. It is an issue from day 1 aligned classes were introduced.

In general, invoking adjustAllocatableRegClass post-regalloc doesn't sound good to me.
I'm in favor of the code that you added to get the aligned regclasses.
But what's the need to adjust the AV operands beforehand? They'll meet the condition as we have the _align2 versions for the superclasses available now.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4777

Factor this out into a separate function?
We have the following code already used at least 2 more places in our backend.

llvm/test/CodeGen/AMDGPU/mcp-aligned-vgprs.mir
13

Why does such a copy exist in the first place?
The register pair given for 64-bit Dst operand of COPY itself is unaligned.
One situation I can imagine is when the Dst is the odd subpart of a higher tuple.
But in that case, shouldn't that info be given in the instruction as an implicit operand?

rampitec added inline comments.Mar 12 2022, 1:02 AM
llvm/test/CodeGen/AMDGPU/mcp-aligned-vgprs.mir
13

This is RA created copy of vreg_96:sub1_sub2. A perfectly valid and aligned case.

In general, invoking adjustAllocatableRegClass post-regalloc doesn't sound good to me.

Which we do already.

rampitec updated this revision to Diff 415162.Mar 14 2022, 10:56 AM
rampitec marked an inline comment as done.

Factored out getProperlyAlignedRC().

But what's the need to adjust the AV operands beforehand? They'll meet the condition as we have the _align2 versions for the superclasses available now.

MCP sees a physreg which can be propagated, like in the test: aligned physreg copied into an unaligned physreg. It queries TII of get instruction's operand regclass and then checks if copy destination fits that regclass. If it does MCP performs propagation. Therefore, to prevent it we shall adjust the regclass extracted from the operand desc.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4777

Extracted, although I do not see other places where we need exactly the same interface.

cdevadas accepted this revision.Mar 14 2022, 1:30 PM

LGTM except for the typo.

llvm/lib/Target/AMDGPU/SIRegisterInfo.h
382

s/correcsponding/corresponding

This revision is now accepted and ready to land.Mar 14 2022, 1:30 PM
rampitec updated this revision to Diff 415202.Mar 14 2022, 1:35 PM
rampitec marked an inline comment as done.

Fixed typo.

This revision was landed with ongoing or failed builds.Mar 14 2022, 2:10 PM
This revision was automatically updated to reflect the committed changes.