This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Add support for wide loads >= 256-bits
ClosedPublic

Authored by tstellar on Jan 29 2019, 10:37 AM.

Details

Summary

This adds support for the most commonly used wide load types:
<8xi32>, <16xi32>, <4xi64>, and <8xi64>

Diff Detail

Repository
rL LLVM

Event Timeline

tstellar created this revision.Jan 29 2019, 10:37 AM
arsenm added inline comments.Jan 29 2019, 10:48 AM
lib/Target/AMDGPU/AMDGPUGenRegisterBankInfo.def
185–188 ↗(On Diff #184120)

I have most of a patch for this

209 ↗(On Diff #184120)

Extra space before (

lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
141 ↗(On Diff #184120)

extra newline

198–199 ↗(On Diff #184120)

No, the verifier rejects physical registers in G_ instructions

407 ↗(On Diff #184120)

This should be split to a new function

451–454 ↗(On Diff #184120)

You can do just MF.getMachineMemOperand(MMO, Offset, Size). I hit a bug in it though for D57122

451–454 ↗(On Diff #184120)

Actually for this entire part I think once my other load/store patches are committed, you could just re-use the functions in LegalizerHelper

569–570 ↗(On Diff #184120)

No, the verifier rejects physical registers in G_ instructions

test/CodeGen/AMDGPU/GlobalISel/regbankselect-load.mir
4 ↗(On Diff #184120)

This isn't needed anymore

7 ↗(On Diff #184120)

I want to split the memory tests per address space, so split this into regbankselect-load-global, regbankselect-load-constant

tstellar updated this revision to Diff 208577.Jul 8 2019, 7:52 PM
tstellar marked 7 inline comments as done.

Address review comments and rebase.

Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2019, 7:52 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
tstellar marked 2 inline comments as done.Jul 8 2019, 7:55 PM
arsenm added inline comments.Jul 8 2019, 8:02 PM
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
82 ↗(On Diff #208577)

This is identical to ApplySALUMapping, so maybe just add an argument to the constructor for the register bank?

930–932 ↗(On Diff #208577)

I don't understand this, because in SSA there must be exactly 1 def

933 ↗(On Diff #208577)

s/asset/assert

969 ↗(On Diff #208577)

Registser

tstellar marked an inline comment as done.Jul 8 2019, 8:17 PM
tstellar added inline comments.
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
930–932 ↗(On Diff #208577)

It's just a temporary state, because RegBankSelect::applyMapping() calls RegBankSelect::repairReg() which inserts a "Repair Instruction" that writes to the register being repaired. This is done before calling applyMapping() which it assumes will delete the original instruction writing the register.

tstellar updated this revision to Diff 208587.Jul 8 2019, 8:44 PM
tstellar marked 3 inline comments as done.

Address more review comments.

arsenm added inline comments.Jul 9 2019, 7:08 AM
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
930–932 ↗(On Diff #208577)

So you should be checking the registers from OpdMapper instead? This needs a comment at leasts

tstellar marked an inline comment as done.Jul 9 2019, 8:17 AM
tstellar added inline comments.
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
930–932 ↗(On Diff #208577)

I can try that for getting the RepairInst, but we still need a way to get the LegalizedInst a little lower in the function. The other solution I had for this was to use the Observer and just record the last instruction added by the LegalizeHelper, but this seemed like we would be relying too much on the implementation details of LegalizeHelper::fewerVectorElements.

arsenm added inline comments.Jul 9 2019, 8:34 AM
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
930–932 ↗(On Diff #208577)

There are definitely issues trying to using the LegalizerHelper from RegBankSelect. In another case we're now relying on the behavior that the LegalizerHelper happens to modify the instruction in place.

arsenm added inline comments.Jul 9 2019, 8:51 AM
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
67 ↗(On Diff #208587)

This should use SCC for the SALU version and VCC for the VALU version

tstellar updated this revision to Diff 208820.Jul 9 2019, 2:48 PM

I added a better comment to getOtherVRegDef. I tried to come up with
a solution using the OperandsMapper, but in the end I had to implement
something similar to getOtherVRegDef to in order to verify that we retreived the
correct LegalizedInst from the LegalizeHelper. It was also way more code and
might be more confusing. Hopefully the new comment on getOtherVRegDef makes
it clear what is going on.

arsenm accepted this revision.Jul 9 2019, 2:53 PM

LGTM

This revision is now accepted and ready to land.Jul 9 2019, 2:53 PM
This revision was automatically updated to reflect the committed changes.