This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GlobalISel] Eliminate cross regbank copies of constants
Needs ReviewPublic

Authored by mbrkusanin on Dec 17 2021, 9:04 AM.

Details

Reviewers
foad
arsenm
Summary

Consider uses when selecting regbank for G_CONSTANT:
By default we assign SGPR because we do not know what the uses will be at the
time of selecting. If it turns out we need it as a VGPR then a repairing is done
by inserting a copy, same as for any other instruction.

For constants with multiple uses we add a new regbank combiner. If all uses are
the same non SGPR bank (will be copies to that new bank inserted by the
RegBankSelect), we change original constant to new bank, remove copies and
update uses.

Diff Detail

Event Timeline

mbrkusanin created this revision.Dec 17 2021, 9:04 AM
mbrkusanin requested review of this revision.Dec 17 2021, 9:04 AM
foad added inline comments.Dec 20 2021, 8:12 AM
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
406

Use use_nodbg_instructions here, like you did in the match function?

arsenm added a comment.Feb 9 2022, 9:26 AM

Could use a dedicated mir test

llvm/lib/Target/AMDGPU/AMDGPUCombine.td
111

Should also apply to G_FCONSTANT?

llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
360–361

Shouldn't really need to specially check this in an individual combine

371

This level of heuristic could have/should have been done by greedy regbankselect.

However, I think we want something a bit smarter here. We should rematerialize all uses that are inline constants. For literal constants, we can make a code size tradeoff based on the number of uses and if other operands are scalars

405

Don't need llvm::

foad added inline comments.Feb 9 2022, 9:41 AM
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
371

For literal constants, we can make a code size tradeoff based on the number of uses and if other operands are scalars

Can't we do something simpler here and leave that kind of tradeoff to SIFoldOperands, which already makes that kind of decision?

arsenm added inline comments.Feb 9 2022, 9:51 AM
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
371

No, because these copies are also confusing selection patterns. I think we need to do both

foad added inline comments.Feb 10 2022, 3:58 AM
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
371

I agree we need to do *something* here, it's just not clear to me that any kind of heuristic is useful. The code size tradeoff is whether to fold literal constants into their uses or not, but here we're talking about whether to copy a constant from sgpr to vgpr or not. I don't really see how they relate.

Why don't we just remat *all* constants, by removing the hasOneNonDBGUse check from AMDGPURegisterBankInfo::canRematerialize? What's the worst that can happen -- you'll get a non-inlinable constant materialized in both an sgpr and a vgpr, and SIFoldOperands will be unable to replace uses of that vgpr with the sgpr because it does not track the fact that they hold the same constant value?

arsenm added inline comments.Feb 10 2022, 6:44 AM
llvm/lib/Target/AMDGPU/AMDGPURegBankCombiner.cpp
371

Yes. Plus I would like to write a new version of SIFoldOperands which collects all operands and considers the entire instruction to choose the optimal set, which could replace v constants with s constants