Use a multiclass to consistently define SReg/SGPR/TTMP register classes.
Add missing TTMP registers for 96b, 160b, 192b, 224b.
Repository: rG LLVM Github Monorepo
Looks like a nice refactoring.
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1242–1243 ↗ | (On Diff #357861) | Was SGPR_96 some kind of anomaly among the existing classes? |
llvm/lib/Target/AMDGPU/SIRegisterInfo.td | ||
725 | Don't you want to divide by two rounding up, so !sra(!add(numRegs, 1), 1)? | |
740 | Should we be defining TTMP classes for all sizes instead of working around it like this? |
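The rounding question raised on line 725 can be checked with a small sketch. This is Python standing in for the TableGen `!sra` and `!add` operators, and the function names are hypothetical; it only illustrates the arithmetic, not the actual definition in SIRegisterInfo.td.

```python
def half_rounded_down(num_regs: int) -> int:
    # Plain arithmetic shift right by one: truncates odd counts.
    return num_regs >> 1


def half_rounded_up(num_regs: int) -> int:
    # Add one before shifting, mirroring !sra(!add(numRegs, 1), 1).
    return (num_regs + 1) >> 1


if __name__ == "__main__":
    # Odd register counts (e.g. 3 for 96b, 5 for 160b) are where the two differ.
    for n in (2, 3, 4, 5, 7):
        print(n, half_rounded_down(n), half_rounded_up(n))
```

For even counts both forms agree; only odd counts such as 3, 5 and 7 change.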
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1242–1243 ↗ | (On Diff #357861) | I think the *GPR_* class names are better where applicable |
llvm/lib/Target/AMDGPU/SIRegisterInfo.td | ||
746 | Typo: "inherence" |
llvm/lib/Target/AMDGPU/SIInstructions.td | ||
---|---|---|
1242–1243 ↗ | (On Diff #357861) | OK, I've put these back. Pretty much everything else in this file uses SReg_* |
llvm/lib/Target/AMDGPU/SIRegisterInfo.td | ||
725 | Thanks. | |
740 | At the moment, if I create the TTMP_96Regs set, SGPR_128 stops working correctly, so I need to investigate further if we want to do that. Also, SGPR_1024 will never have an associated TTMP set because there are only 16 trap registers. |
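The point about SGPR_1024 comes down to register counts. As a rough sketch, assuming 32-bit scalar registers and taking the figure of 16 trap registers from the comment above (the helper names and the list of widths are illustrative, not taken from the patch):

```python
REG_BITS = 32   # assumption: each scalar register is 32 bits wide
NUM_TTMP = 16   # number of trap registers, as stated in the review comment


def regs_needed(width_bits: int) -> int:
    # Number of consecutive 32-bit registers in a tuple of this width.
    return width_bits // REG_BITS


def fits_in_ttmp(width_bits: int) -> bool:
    # A TTMP tuple class is only possible if the tuple fits in the TTMP file.
    return regs_needed(width_bits) <= NUM_TTMP


if __name__ == "__main__":
    for width in (64, 96, 128, 160, 192, 224, 256, 512, 1024):
        print(width, regs_needed(width), fits_in_ttmp(width))
```

Under these assumptions a 1024-bit tuple needs 32 registers, which is more than the 16 available, while every narrower width in the list fits.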
I don't think we can at present.
If I define TTMP_96Regs this implicitly causes TTMP_128_with_sub0_sub1_sub2 to be created.
This then creates SReg_128_with_sub0_sub1_sub2.
Now SGPR_128_with_sub0_sub1_sub2 inherits from SReg_128_with_sub0_sub1_sub2 instead of from SGPR_128.
(The superclass of SReg_128_with_sub0_sub1_sub2 is SReg_128, but the superclasses of SGPR_128_with_sub0_sub1_sub2 are SReg_128, SGPR_128, SReg_128_with_sub0_sub1_sub2.)
The net result is that SGPR_128_with_sub0_sub1_sub2 is marked isAllocatable = 0, causing new allocation failures.
The bottom line is that I don't think isAllocatable = 0 currently supports what we are trying to do.
Best guess is it is only appropriate for isolated or leaf register classes.
I can pursue a TableGen change to try and address this, but it would be good to get the immediate issue of allocation failures with SGPR_192 fixed.
- Rework based on TableGen changes in D105967
- Define all applicable TTMP register classes
Wouldn't it be a bit simpler to use [] as the default, and test it with !empty?