Specifying -regalloc=fast is not reliable. With fast register allocation,
LIS = getAnalysisIfAvailable<LiveIntervals>(); get nullptr
in "si-lower-sgpr-spills" pass, so the slot index is not created in the
pass for new inserted instructions. When verifying the machine
instruction, it fails on checking slot index. While greedy-ra is time
consuming basic-ra can be used to reduce compiling time for this test case.
Details
- Reviewers
foad arsenm rampitec - Commits
- rG28733d86cf7b: [amdgpu] Change the RA to basic
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Typo in description: "intruction".
I also think spilling should be fixed to work with fast-ra instead.
Fixed, thanks.
I also think spilling should be fixed to work with fast-ra instead.
Fast-ra doesn't depened on SlotIndexes and LiveIntervals, but "si-lower-sgpr-spills" depend on LiveIntervals, so I am not sure we can fix. On the other hand basic-ra fit this case very well, it reduce the compiling time and depend on SlotIndexes and LiveIntervals. Even we use fast-ra the pass pipeline doesn't change much which means it fast-ra won't reduce the compiling time for other passes, it only reduce the compiling time of ra itself.
OK, using fast-ra is not a point of this test anyway, and that shall not be your problem to fix sgpr spilling even if we need it.
Specifying -regalloc=fast is not reliable. With fast register allocation,
LIS = getAnalysisIfAvailable<LiveIntervals>(); get nullptr
in "si-lower-sgpr-spills" pass, so the slot index is not created in the
pass for new inserted instructions.
That sounds like a bug in SILowerSGPRSpills. It claims to preserve all analyses, so it should be able to preserve SlotIndexex even if LiveIntervals is not available.
When verifying the machine
instruction, it fails on checking slot index.
No, this test does not fail - it has been passing for over two years. So what problem are you trying to fix? Are you testing some local changes that provoked a failure?
No, this test does not fail - it has been passing for over two years. So what problem are you trying to fix? Are you testing some local changes that provoked a failure?
This case doesn't fail, but the similar case (as below) would fail in SILowerSGPRSpills pass. Replacing it with basic-ra has no side effect.
define internal void @use256vgprs() { %v0 = call i32 asm sideeffect "; def $0", "=v"() %v1 = call i32 asm sideeffect "; def $0", "=v"() call void asm sideeffect "; use $0", "v"(i32 %v0) call void asm sideeffect "; use $0", "v"(i32 %v1) ret void } define amdgpu_kernel void @f256() #256 { call void @use256vgprs() ret void } attributes #256 = { nounwind "amdgpu-flat-work-group-size"="256,256" } define amdgpu_kernel void @f512() #512 { call void @foo() call void @use256vgprs() ret void } attributes #512 = { nounwind "amdgpu-flat-work-group-size"="512,512" } define amdgpu_kernel void @f1024() #1024 { call void @foo() call void @use256vgprs() ret void } attributes #1024 = { nounwind "amdgpu-flat-work-group-size"="1024,1024" } declare void @foo()