When spilling in the entry function, we should be able to borrow
StackPtrOffsetReg as a last resort. This restores behaviour
removed in D75138 and fixes failures when shaders use all
SGPRs and VGPRs and spill in the entry function.
Repository: rG LLVM Github Monorepo
Event Timeline
LGTM, although the tests would benefit from CHECK lines confirming that, e.g., the first involves a scavenged register and s_add, the second involves the stack pointer and s_mov, and that this happens at the expected offset. That would make them easier to read and more likely to catch something that breaks this in the future. Right now it isn't clear why there are two nearly identical tests.
llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll | ||
---|---|---|
39 ↗ | (On Diff #263440) | Needs to check something |
45 ↗ | (On Diff #263440) | A more directed test would be better. We have a few others that rely on splitting up giant vector loads, and they're some of the slowest tests in the entire lit suite. |
65 ↗ | (On Diff #263440) | Can you use amdgpu-waves-per-eu? I'm trying to move away from the direct register limit attributes. |
- Fix bug where the offset register would not be restored correctly
- Modify test to use smaller array (compiles much faster)
- Add checks in test
llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll | ||
---|---|---|
65 ↗ | (On Diff #263440) | Unfortunately amdgpu-waves-per-eu does not allow us to constrain the SGPRs low enough to trigger the bug without making the test much bigger. |
llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll | ||
---|---|---|
65 ↗ | (On Diff #263440) | Don't you just need a big block of SGPR-using inline asm to cover the difference? |
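
For illustration only, the suggestion above might look roughly like the sketch below. It is not taken from the patch: the function name, the number of values, and the asm strings are all hypothetical, and it assumes the usual AMDGPU inline asm `s` constraint for SGPR operands. The idea is to define several SGPR values and keep them live across the region of interest by consuming them in a single asm block, which raises SGPR pressure without a large array.

```llvm
; Hypothetical sketch, not part of the reviewed diff.
; Each "=s" output asks the register allocator for an SGPR.
define amdgpu_kernel void @sgpr_pressure_sketch() {
entry:
  %s0 = call i32 asm sideeffect "s_mov_b32 $0, 0", "=s"()
  %s1 = call i32 asm sideeffect "s_mov_b32 $0, 1", "=s"()
  %s2 = call i32 asm sideeffect "s_mov_b32 $0, 2", "=s"()
  %s3 = call i32 asm sideeffect "s_mov_b32 $0, 3", "=s"()

  ; ... code that is expected to spill would go here ...

  ; A single asm "use" keeps all four values live up to this point.
  call void asm sideeffect "; use $0 $1 $2 $3", "s,s,s,s"(i32 %s0, i32 %s1, i32 %s2, i32 %s3)
  ret void
}
```

A real test would need many more such values (or wider tuples) to consume whatever SGPR budget remains after the waves-per-EU limit is applied; the exact count depends on the target and is not specified in this thread.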