Download Raw Diff

Details

Reviewers

rampitec
arsenm
foad

Commits

rG0d849b8249e8: AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

cdevadas created this revision.Mar 2 2022, 6:51 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 6:51 AM

Herald added subscribers: foad, kerbowa, hiraditya and 7 others. · View Herald Transcript

cdevadas requested review of this revision.Mar 2 2022, 6:51 AM

Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2022, 6:51 AM

Herald added subscribers: llvm-commits, wdng. · View Herald Transcript

What's the reasoning for this? Won't this break chains of REG_SEQUENCE? In particular I do expect to see some REG_SEQUENCE (REG_SEQUENCE), (REG_SEQUENCE) patterns

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4758 ↗	(On Diff #412398)	isUnknownRegClass?

foad added inline comments.Mar 2 2022, 6:59 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4761 ↗	(On Diff #412398)	Why do you need to test isVariadic here()? Surely as long as you compare OpIdx against Desc.getNumOperands() (not MI.getNumOperands()) you're OK?

Harbormaster completed remote builds in B152154: Diff 412398.Mar 2 2022, 7:43 AM

In D120813#3354292, @arsenm wrote:

What's the reasoning for this? Won't this break chains of REG_SEQUENCE? In particular I do expect to see some REG_SEQUENCE (REG_SEQUENCE), (REG_SEQUENCE) patterns

Currently skipping it to avoid a crash that occurred with REG_SEQUENCE(REG_SEQUENCE) while trying to look for the regclass @OpIdx from the UseMI's MCInstrDesc.
These special cases involving target-independent and/or variadic instructions should be handled separately.
This folding will take place only when the regclass at OpIdx is a vectorSuperClass https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/SIFoldOperands.cpp#L1603
Currently, we don't select any AV operands and I guess it is ok to skip them.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4758 ↗	(On Diff #412398)	Will use
4761 ↗	(On Diff #412398)	No, `OpIdx >= Desc.getNumOperands()` didn't hold true when I checked the case REG_SEQUENCE (REG_SEQUENCE). Is it because the passed Op index is already adjusted from `UseMI->getOperand(0)`?

Used a better name for the function.

Harbormaster completed remote builds in B152286: Diff 412591.Mar 2 2022, 7:45 PM

rampitec added inline comments.Mar 3 2022, 9:47 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4761 ↗	(On Diff #412398)	Even variadic instructions shall have proper operand description before the variadic part starts.

Removed variadic instruction check.
Desc.OpInfo[OpIdx].RegClass == -1 will catch those invalid cases.

Harbormaster completed remote builds in B152491: Diff 412876.Mar 3 2022, 6:16 PM

foad added inline comments.Mar 4 2022, 4:00 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4760 ↗	(On Diff #412876)	Just use MI.getDesc()?

Addressed Jay's suggestion.

Harbormaster completed remote builds in B152572: Diff 412982.Mar 4 2022, 5:04 AM

foad added inline comments.Mar 4 2022, 6:04 AM

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1604	You could just use TargetInstrInfo::getRegClass(InstDesc, OpIdx, TRI, MF) instead of TRI->getRegClass and it would do all these checks for you already.

cdevadas added inline comments.Mar 4 2022, 6:14 AM

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1604	We can't use that equivalent of `getRegClass` here. The call to `adjustAllocatableRegClass` inside it would change AV classes to V thereby disabling the folding in almost all valid cases.

foad added inline comments.Mar 4 2022, 6:21 AM

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1604	The call to `adjustAllocatableRegClass` is in the AMDGPU-specific getOpRegClass, not in the generic getRegClass.

cdevadas added inline comments.Mar 4 2022, 10:03 AM

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
1604	@foad thanks for pointing that generic function. We have some variants in SIInstrInfo.cpp to get regclass value from InstDesc. I initially tried to use SIInstrInfo::getOpRegClass(const MachineInstr &MI, unsigned OpNo) and found the `adjustAllocatableRegClass` calls inside it prevented the folding for some tests involving loads and stores. But SIInstrInfo::getRegClass(InstDesc, OpIdx, TRI, MF) seems to be working. Though it also has a call to `adjustAllocatableRegClass`, it is more restrained and appeared to be not missing any folding opportunity for loads/stores that failed earlier with the other variant. I am going to use this function. Will update the patch.

Used existing 'SIInstrInfo::getRegClass' function to get the RC at OpIdx.

Harbormaster completed remote builds in B152629: Diff 413064.Mar 4 2022, 10:49 AM

arsenm accepted this revision.Mar 7 2022, 10:01 AM

This revision is now accepted and ready to land.Mar 7 2022, 10:01 AM

This revision was landed with ongoing or failed builds.Mar 7 2022, 8:44 PM

Closed by commit rG0d849b8249e8: AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users (authored by cdevadas). · Explain Why

This revision was automatically updated to reflect the committed changes.

cdevadas added a commit: rG0d849b8249e8: AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users.

This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 413694

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

llvm/test/CodeGen/AMDGPU/skip-fold-regsequence.mir

This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 413694

llvm/lib/Target/AMDGPU/SIFoldOperands.cpp

llvm/test/CodeGen/AMDGPU/skip-fold-regsequence.mir

AMDGPU: Skip folding REG_SEQUENCE if found unknown regclasses for its users.
ClosedPublic