- When an unconditional branch is expanded into an indirect branch and no register can be scavenged, an SGPR pair needs to be spilled to enable the destination PC calculation. In addition, that clobbered SGPR pair needs to be restored before jumping to the destination.
- As an SGPR cannot be spilled to or restored from memory directly, the spilling/restoring of that SGPR pair reuses the regular SGPR spilling support, but without spilling it into memory. As the spill and restore points are fully controlled, we only need to spill that SGPR pair into the temporary VGPR, which needs spilling into its emergency slot.
- The target-specific hook is revised to take an additional restore block, which is filled with the restoring code. After that, the relaxation places that restore block directly before the destination block and inserts an unconditional branch to the destination block in any block that falls through to it.
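
The block placement described in the last point can be sketched with a toy model. This is only an illustration under stated assumptions, not the code in `BranchRelaxation.cpp`: `Block`, `indexOf`, and `placeRestoreBlock` are hypothetical names, blocks are plain structs rather than `MachineBasicBlock`s, and the sketch assumes the expanded branch is retargeted to the restore block, which then falls through into the destination.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a machine basic block (not LLVM's class).
struct Block {
  std::string name;
  std::vector<std::string> insts;
  // Explicit unconditional branch target at the end of the block, if any.
  // When empty, the block falls through to the next block in the layout.
  std::optional<std::string> branchTarget;
};

// Returns the layout index of the block called `name`.
std::size_t indexOf(const std::vector<Block> &layout, const std::string &name) {
  for (std::size_t i = 0; i < layout.size(); ++i)
    if (layout[i].name == name)
      return i;
  assert(false && "block not found in layout");
  return layout.size();
}

// Places `restore` directly before `dest` in the layout.
// - A block that used to fall through into `dest` gets an explicit branch to
//   `dest`, so it does not run the restore code.
// - `source` (the block holding the expanded branch) is retargeted to
//   `restore`; this retargeting is an assumption of the sketch.
void placeRestoreBlock(std::vector<Block> &layout, const std::string &source,
                       const std::string &dest, Block restore) {
  const std::size_t d = indexOf(layout, dest);
  if (d > 0 && !layout[d - 1].branchTarget)
    layout[d - 1].branchTarget = dest;
  layout[indexOf(layout, source)].branchTarget = restore.name;
  restore.branchTarget.reset(); // the restore block falls through into dest
  layout.insert(layout.begin() + static_cast<std::ptrdiff_t>(d),
                std::move(restore));
}
```

With a layout of `entry` (branching to `dest`), `mid` (falling through), and `dest`, calling `placeRestoreBlock(layout, "entry", "dest", restore)` yields `entry`, `mid`, `restore`, `dest`, where `mid` now branches explicitly to `dest` and `entry` branches to `restore`.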
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/CodeGen/BranchRelaxation.cpp | ||
---|---|---|
484–485 | yeah, you are absolutely right! | |
llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll | ||
331 | That's the scavenge frame index previously added in https://reviews.llvm.org/D96336. Here, we need to spill an SGPR into a VGPR, which needs spilling into a frame slot when no VGPR can be scavenged. |
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | ||
---|---|---|
677–678 | Should turn this into an assert | |
678 | Can delete the return | |
llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll | ||
2 | Can you switch this to an amdhsa triple? I want to be sure the use of the sgpr0_sgpr1 ordinarily used for the scratch buffer is tested. I think you would need to add a test variant that is a non-kernel function too |
LGTM. Should still add a non-kernel case to the same test so that s[0:3] really is the SRD
llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll | ||
---|---|---|
2 | The non-kernel case is added. However, due to the calling convention, v0 is reused for spilling SGPRs and is always available for that purpose, so we cannot test the no-scavenged-register case. Do you have suggestions for fabricating a test case for that purpose? |
llvm/test/CodeGen/AMDGPU/branch-relax-spill.ll | ||
---|---|---|
2 | It's not available for spilling if there's a call that clobbers VGPRs |
Update non-kernel test case
- It turned out that we need a uniform condition to prevent vcc from being *really* used.
typo Optiionally