There should be no need to reserve all SGPR hi16/lo16 halves, or all
AGPR hi16 halves. This should be done by marking the corresponding
register classes as not allocatable instead.
Details
- Reviewers
rampitec Joe_Nash arsenm - Group Reviewers
Restricted Project - Commits
- rG8bad806f298c: [AMDGPU] Do not reserve 16-bit registers
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
SGPR_HI16 is already marked as not allocatable. So L611 was partially redundant with that. But this patch does appear to remove a barrier to allocating into the hi or lo half of TMA_LO or some other non-GPR SReg.
I guess there is other code somewhere preventing allocating to those non-GPR SReg? If so, this patch can be marked NFC and the commit reworded to say it skips redundant reservations.
It appears that this patch does not change anything about the legality of allocating registers in class SGPR_LO16.
I think you are right about that. Perhaps that is a a good reason not to merge this patch unless there is some immediate benefit.
I am not sure about this. It is "legal" in that the SGPR_LO16 is allocatable, but I don't think anything uses it - except for one handwritten MIR test in test/CodeGen/AMDGPU/fold_16bit_imm.mir.
It appears that this patch does not change anything about the legality of allocating registers in class SGPR_LO16.
But if we will allow SGPRs this code will be needed on older targets again.
I think you are right about that. Perhaps that is a a good reason not to merge this patch unless there is some immediate benefit.
I don't like guessing what will be required in future. Even with the GFX11 True16 work in progress it is unclear to me whether 16-bit SGPRs will be used.
My motivation is to clean up and simplify SIRegisterInfo::getReservedRegs based on what is required today. It seems like a lot of stuff has crept in and I am trying to understand what it is all for.
llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp | ||
---|---|---|
615 | I've tried to preserve the intent of this code by making SReg_LO16 unallocatable but leaving SGPR_LO16 allocatable. |
Here is the thing:
v_mov_b16 v0.h, vcc_lo.h 0xd59c4800 0x0000006a v_mov_b16 v0.l, vcc_hi.l 0x7e00386b
So this all seems legal.
Right, VALU instructions can access 16-bit SGPRs, and the assembler/disassembler has to accept that. The tricky question is whether we want to expose that to codegen. and I think it will be hard to do that because there is no S_MOV_B16 instruction and in general no way to write to a 16-bit SGPR; you can only read from them.
Thanks for the review.
Good point about the scalar move. It explains the problem. Now I think I agree about the codegen support.
I assume this comment refers to a vcc liveness error in test/CodeGen/AMDGPU/frame-index-elimination.ll. I have fixed that in a different way with D147157.