This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Fixed handling of non-standard vectors
ClosedPublic

Authored by rampitec on May 22 2020, 1:33 PM.

Details

Summary

We do not have register classes for all possible vector
sizes, so round it up for extract vector element.

Also fixes selection of G_MERGE_VALUES when vectors are
not a power of two.

This has required to refactor getRegSplitParts() in way
that it can handle not just power of two vectors.

Ideally we would like RegSplitParts to be generated by
tablegen.

Diff Detail

Event Timeline

rampitec created this revision.May 22 2020, 1:33 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 22 2020, 1:33 PM
arsenm requested changes to this revision.May 22 2020, 2:01 PM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
555

The selector shouldn't be trying to fix these up. The selector should only see directly selectable vector sizes. We can either widen the vector sources in the legalizer, or add the missing register classes. It's also not valid to use G_MERGE_VALUES on vectors. You have to use G_BUILD_VECTOR or G_CONCAT_VECTORS depending on whether the source is scalar or vector

This revision now requires changes to proceed.May 22 2020, 2:01 PM
rampitec marked an inline comment as done.May 26 2020, 3:33 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
555

In fact it does not seem to be needed at all. REG_SEQUENCE does it right even if incomplete.

rampitec updated this revision to Diff 266354.May 26 2020, 3:34 PM

Removed handling of undef elements.

rampitec marked an inline comment as done.May 26 2020, 3:41 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
1332

NB: Nothing breaks even if I remove Exact argument and always use if statements.

rampitec updated this revision to Diff 266363.May 26 2020, 3:54 PM

Simplified the patch:

  • Removed Exact argument.
  • Changed vector<vector<int_16_t>> into vector<int_16_t>[16].
arsenm added inline comments.May 26 2020, 4:08 PM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
58

Should probably put this into its own function with a FIXME that it should be a table produced by tablegen, also with a call_once?

llvm/lib/Target/AMDGPU/SIRegisterInfo.h
40

std::array?

rampitec marked an inline comment as done.May 26 2020, 4:38 PM
rampitec added inline comments.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
58

I am not really sure we initialize TRI exactly once, especially in a JIT, where llvm can be initialized and finalized multiple times.

arsenm added inline comments.May 27 2020, 6:15 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
58

It's initialized once per subtarget, but you can make it static. This is the un-tablegened hack from AMDGPURegisterBankInfo:

// HACK: Until this is fully tablegen'd.
static llvm::once_flag InitializeRegisterBankFlag;

static auto InitializeRegisterBankOnce = [this]() {
  assert(&getRegBank(AMDGPU::SGPRRegBankID) == &AMDGPU::SGPRRegBank &&
         &getRegBank(AMDGPU::VGPRRegBankID) == &AMDGPU::VGPRRegBank &&
         &getRegBank(AMDGPU::AGPRRegBankID) == &AMDGPU::AGPRRegBank);
  (void)this;
};

llvm::call_once(InitializeRegisterBankFlag, InitializeRegisterBankOnce);

}

rampitec updated this revision to Diff 266596.May 27 2020, 10:28 AM
rampitec marked 3 inline comments as done.

Moved RegSplitParts into a static std::array initialized once.

arsenm added inline comments.May 27 2020, 10:53 AM
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
47–50

Should be a static member of TRI. I think here you'll get a global constructor for the std::array

rampitec updated this revision to Diff 266611.May 27 2020, 11:33 AM
rampitec marked an inline comment as done.

Moved RegSplitParts into the TRI as a static member.

arsenm accepted this revision.May 27 2020, 3:19 PM
This revision is now accepted and ready to land.May 27 2020, 3:19 PM
This revision was automatically updated to reflect the committed changes.