This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GlobalISel] Try to rematerialize when repairing regbank
Needs ReviewPublic

Authored by mbrkusanin on Nov 12 2021, 10:29 AM.

Details

Summary

Constants with a single use or G_IMPLICIT_DEF can be repaired by simply
cloning instruction that defined them with the appropriate register bank.

Diff Detail

Event Timeline

mbrkusanin created this revision.Nov 12 2021, 10:29 AM
mbrkusanin requested review of this revision.Nov 12 2021, 10:29 AM

This is another attempt at trying to get of cross reg bank copies for which I needed manual checks when matching (like in https://reviews.llvm.org/D110937, https://reviews.llvm.org/D98040)

I'm wondering if this would be considered more of a hack then an appropriate way of getting rid of these annoying copies.

We get decreased instruction count in a lot of test.

I'll highlight below (inline) cases that I noticed where instruction count actually increases.

llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i16.ll
2101 ↗(On Diff #386883)

This is because of SMEMtoVectorWriteHazards on gfx10. Inserted by GCNHazardRecognizer::fixSMEMtoVectorWriteHazards.

llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.intersect_ray.ll
75

Longer code due to a vgpr (v5) used for const.

llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
33–36

SIFoldOperands won't inline this constant because it has multiple uses. It does this because this might increase code size. Previously this were two different constants (two different instructions) but they had same value. SDag also does not inline const but uses sgpr.

llvm/test/CodeGen/AMDGPU/cttz.ll
974–982

Extra vgpr used here (now 3 was 2)
Previously we had "S_MOV_B32 0" and "V_MOV_B32_e32 0". Now both are V_MOV*.
CSE will combine them and keep first one (earlier one).
SIFoldOperands inlines first use (the one that was originally sgpr) leaving us with less optimal scheduling the what we originally had.
The only "V_MOV_B32_e32 0" now need to be allocated to a v2 because const is def'd earlier than it really needs to.

Last time I thought about this I thought it would be easier to have the post-regbank combiner handle this. In some situations it makes most sense to completely rematerialize the constant value in each regbank, not just reassign it. If you have an inline constant, it would be better to just emit a new constant for each bank. For multiple uses of literals its trickier since there's a code size or instruction count tradeoff based on the uses

llvm/include/llvm/CodeGen/GlobalISel/RegBankSelect.h
324 ↗(On Diff #386883)

Typo prevously

325 ↗(On Diff #386883)

I would expect to make the right decision upfront, not have to go back and erase copies

llvm/include/llvm/CodeGen/GlobalISel/RegisterBankInfo.h
625–630

I think this is way too specific of a target hook

mbrkusanin edited the summary of this revision. (Show Details)

I've considered this regbank combiner solution before but it was too late for D98040. Thankfully those cases are single use constants needed for offset for load/stores, a case which can be easily solved in RegBankSelect.

Now updating of regbanks is split between RegBankSelect with no target hooks and AMDGPURegBankCombiner.

mbrkusanin updated this revision to Diff 391263.Dec 2 2021, 2:24 AM

Clang-format + Rebase + Ping

foad added a comment.Dec 16 2021, 4:02 AM

As discussed offline, maybe split this patch into two?

  1. Just the generic change to RegBankSelect
  2. Add the AMDGPU combine
mbrkusanin retitled this revision from [RFC][AMDGPU][GlobalISel] Fix RegBanks for G_CONSTANT to [AMDGPU][GlobalISel] Rematerialize consts when repairing regbank .
mbrkusanin edited the summary of this revision. (Show Details)

Removed combiner. Now only constants that have single use are handled here.
Combiner that handles constants with multiple uses is now: https://reviews.llvm.org/D115945

arsenm added inline comments.Dec 17 2021, 1:16 PM
llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
166

the same applies for G_FCONSTANT and G_IMPLICIT_DEF, but I'm not sure this should be unconditionally done for all targets, and in all situations. For instance on AMDGPU we may want to choose a code size tradeoff with non-inlineable constants

mbrkusanin retitled this revision from [AMDGPU][GlobalISel] Rematerialize consts when repairing regbank to [AMDGPU][GlobalISel] Try to rematerialize when repairing regbank .
mbrkusanin edited the summary of this revision. (Show Details)
  • Added target check and covered G_FCONSTANT and G_IMPLICIT_DEF

Rebase + Ping

mbrkusanin updated this revision to Diff 405271.Feb 2 2022, 7:41 AM

Rebase + Ping

I'm currently thinking rematerialized in regbankselect is a good strategy. We would still want to have a post-isel operand optimizer pass try to re-scalarize constants when there's open constant bus use.

llvm/include/llvm/CodeGen/GlobalISel/RegisterBankInfo.h
631

It would be better to report the rematerializability of an opcode as part of the reported mapping rather than add another callback for it

llvm/lib/CodeGen/GlobalISel/RegBankSelect.cpp
166–170

I think it would be cleaner to handle this as a separate RepairingPlacement case in applyMapping

Herald added a project: Restricted Project. · View Herald TranscriptSep 30 2022, 8:29 AM
Herald added a subscriber: kosarev. · View Herald Transcript