This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] use scalar shift for SALU users in frame index elimination
ClosedPublic

Authored by alex-t on Mar 12 2022, 7:28 AM.

Details

Summary

In the frame index lowering we have to insert shift and add
instructions to adjust stack object access. We need to take care of the stack
object user kind and use scalar shift/add for scalar users.

Diff Detail

Event Timeline

alex-t created this revision.Mar 12 2022, 7:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 12 2022, 7:28 AM
alex-t requested review of this revision.Mar 12 2022, 7:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 12 2022, 7:28 AM
Herald added a subscriber: wdng. · View Herald Transcript
rampitec added inline comments.Mar 14 2022, 10:17 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2242–2243

Can probably be another mov?

2242–2243

Formatting.

2244

Formatting.

2256

What if $scc is alive?
$scc shall also be marked dead-def.

llvm/test/CodeGen/AMDGPU/frame-index.mir
2

No need to pass -mattr=-promote-alloca -amdgpu-sroa=0.

8

IR is also not needed.

60

Just remove memory operand (part starting from ::) so you do not need IR.

89

Generate the checks. This one misses inlicit-def dead $scc.

arsenm added inline comments.Mar 17 2022, 6:29 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2242–2243

It would be a bit more appropriate to check the class of the use operand

alex-t updated this revision to Diff 416472.Mar 18 2022, 5:32 AM

SCC clobbering case added. MIR test simplified.

alex-t updated this revision to Diff 416490.Mar 18 2022, 7:02 AM
alex-t marked 6 inline comments as done.

IsSALU check chamged to query operand register class.
MIR test were made auto-generated

alex-t marked 2 inline comments as done.Mar 18 2022, 7:03 AM
alex-t added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2242–2243

Never :)

alex-t updated this revision to Diff 416518.Mar 18 2022, 8:33 AM

formatting

Need a testcase with alive scc.

llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2242–2243

Why cannot it be v_mov_b32_e64?

2243

Is not isRegUsed for the whole function and will always return true on practice?

2262

Weird formatting.

llvm/test/CodeGen/AMDGPU/frame-index.mir
2

Not done.

alex-t added inline comments.Mar 18 2022, 11:41 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

This function is used to determine if SCC is live in all other places around the code.
Another option is to scan upwards until the block start is reached.

2262

This is clang-format. Otherwise lint complains

rampitec added inline comments.Mar 18 2022, 11:48 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

All other places shall mean the whole function? I'd suggest using MBB->computeRegisterLiveness().

alex-t added inline comments.Mar 18 2022, 12:50 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

I mean checks like this:

CanClobberSCC = !RS->isRegUsed(AMDGPU::SCC);

they really interested if SCC is live at exact point.

In case this is wrong we have lots of problems

rampitec added inline comments.Mar 18 2022, 12:52 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

But what point is it? The call does not take an iterator.

arsenm added inline comments.Mar 18 2022, 1:05 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

It's at the scavenging point, which is separately advanced

rampitec added inline comments.Mar 18 2022, 1:12 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

Was it advanced before the call?

arsenm added inline comments.Mar 18 2022, 1:44 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

It should have been advanced by the scavengeRegister call (assuming there is one on the path that reaches here). Plus PrologEpilogInserter advances before the call in the normal usage

rampitec added inline comments.Mar 18 2022, 1:47 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

OK, thanks.

alex-t updated this revision to Diff 416693.Mar 19 2022, 6:30 AM

test case for SCC multiple defs added. Unnecessary compiler options removed from the test command line.

alex-t marked an inline comment as done.Mar 19 2022, 6:32 AM
alex-t added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
2243

RegisterScavenger::forward takes care of that. It is called from PEI::eliminateFrameIndices on each iteration. So it updates live information stepping each instruction.
The very first call of the SIRegisterInfo::eliminateFrameIndex happens with the LiveRegs updated on the BB entrance with LiveIns.
Each next call happens after the RS->forward updated live info.
I have a test for this.

alex-t marked an inline comment as done.Mar 19 2022, 6:33 AM

test case for SCC multiple defs added. Unnecessary compiler options removed from the test command line.

There is a case where you are creating readfirstlane, but I do not see a test for it.

alex-t updated this revision to Diff 417094.Mar 21 2022, 2:08 PM

test updated

This revision is now accepted and ready to land.Mar 21 2022, 2:24 PM
This revision was landed with ongoing or failed builds.Mar 22 2022, 3:43 AM
This revision was automatically updated to reflect the committed changes.