Promoting an alloca to a vector and promoting an alloca to LDS are two independent ways of handling an alloca, and they should not affect each other.
As a result, we should not give up on promoting to a vector just because there is not enough LDS. This patch factors out the local memory usage
check and places it after the calling convention check.
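The intended ordering of the checks can be sketched as follows. This is a simplified, self-contained model and not the code in the patch: `handleAlloca` and `tryPromoteAllocaToVector` are names used in this review, while `AllocaModel`, `CallingConv`, `Outcome`, and `runExamples` are hypothetical stand-ins that use no LLVM API.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class CallingConv { Kernel, Shader };
enum class Outcome { Vector, LDS, NotPromoted };

struct AllocaModel {
  bool Vectorizable; // assumed: whether vector promotion would succeed
};

// Models tryPromoteAllocaToVector: true means the alloca has been handled
// and no further promotion should be attempted.
bool tryPromoteAllocaToVector(const AllocaModel &A) { return A.Vectorizable; }

Outcome handleAlloca(const AllocaModel &A, CallingConv CC, bool SufficientLDS) {
  // 1. Vector promotion does not consume LDS, so try it unconditionally.
  if (tryPromoteAllocaToVector(A))
    return Outcome::Vector;

  // 2. LDS promotion is only attempted for kernel calling conventions.
  if (CC != CallingConv::Kernel)
    return Outcome::NotPromoted;

  // 3. Only at this point does the LDS budget matter.
  if (!SufficientLDS)
    return Outcome::NotPromoted;

  return Outcome::LDS;
}

// A few cases spelled out as assertions.
void runExamples() {
  // Not enough LDS, but the alloca can still become a vector.
  assert(handleAlloca({true}, CallingConv::Kernel, false) == Outcome::Vector);
  // Not vectorizable and not enough LDS: nothing to do.
  assert(handleAlloca({false}, CallingConv::Kernel, false) ==
         Outcome::NotPromoted);
  // Not vectorizable, enough LDS, but a shader calling convention.
  assert(handleAlloca({false}, CallingConv::Shader, true) ==
         Outcome::NotPromoted);
}
```

The first assertion in `runExamples` is the case this patch is about: an LDS shortage no longer prevents vector promotion.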
Event Timeline
lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | ||
---|---|---|
605 | There's a possible hazard with aliases of globals but I think that's been an existing problem | |
609–610 | The user could be a constant expression which is transitively used by an instruction but I guess you're just moving this | |
704–705 | This is being called for every single alloca in the function. This should probably be checked once earlier | |
test/CodeGen/AMDGPU/vector-alloca.ll | ||
149–153 | Should use the GCN instruction checks |
lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | ||
---|---|---|
605 | Will take a look and fix it in a separate patch if needed! | |
609–610 | Will take a look and fix it in a separate patch if needed! | |
704–705 | Done! Moved the check before the loop, and added an argument to handleAlloca to carry the check result. Will update the diff later. | |
test/CodeGen/AMDGPU/vector-alloca.ll | ||
149–153 | What is a GCN instruction check? Do you mean I should use llc to compile to ISA and check that? |
- Move the check for available LDS outside the loop.
- Remove the ISA checks in the newly added test (I think the OPT checks are sufficient).
- The other two pre-existing issues Matt mentioned will be investigated in a separate patch.
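A minimal sketch of the hoisting described in the first item, under stated assumptions: `FunctionModel`, `computeSufficientLDS`, `runOnFunction`, and the `NumBudgetChecks` counter are hypothetical names chosen for illustration, and the budget comparison is a placeholder for the real LDS accounting. Only the shape matters: the per-function check runs once and its result is passed to `handleAlloca` as an argument.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a function; not an LLVM type.
struct FunctionModel {
  unsigned LDSInUse;                    // assumed: bytes of LDS already used
  unsigned LDSLimit;                    // assumed: bytes of LDS available
  std::vector<bool> AllocaVectorizable; // one entry per alloca
};

// Counts how often the per-function check runs (illustration only).
unsigned NumBudgetChecks = 0;

// Placeholder for the per-function LDS budget check.
bool computeSufficientLDS(const FunctionModel &F) {
  ++NumBudgetChecks;
  return F.LDSInUse < F.LDSLimit;
}

// Returns true if the alloca was promoted, either to a vector or to LDS.
// The calling convention check is omitted to keep the sketch short.
bool handleAlloca(bool Vectorizable, bool SufficientLDS) {
  if (Vectorizable)
    return true;        // promoted to vector, LDS is irrelevant
  return SufficientLDS; // otherwise LDS promotion needs the budget
}

// Returns the number of promoted allocas.
unsigned runOnFunction(const FunctionModel &F) {
  // Checked once per function instead of once per alloca.
  const bool SufficientLDS = computeSufficientLDS(F);

  unsigned Promoted = 0;
  for (bool Vectorizable : F.AllocaVectorizable)
    if (handleAlloca(Vectorizable, SufficientLDS))
      ++Promoted;
  return Promoted;
}
```

With three allocas in a function, `computeSufficientLDS` runs once rather than three times, and the two vectorizable allocas are still promoted when LDS is exhausted.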
lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | ||
---|---|---|
688–689 | This debug message is misleading! If tryPromoteAllocaToVector returns true, the alloca has already been vectorized and we should return true here. If tryPromoteAllocaToVector returns false, we should continue and try to promote the alloca to LDS. I will update the debug message! | |
test/CodeGen/AMDGPU/vector-alloca.ll | ||
149–153 | I know. That's what I copied and pasted from a previous test. |
Remove an incorrect debug message! We do not actually need a debug message at the call site of tryPromoteAllocaToVector, because
tryPromoteAllocaToVector itself already prints a debug message in every case.
lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp | ||
---|---|---|
688–689 | So I completely removed this debug message here at the call site because for every possible case in tryPromoteAllocaToVector, | |
test/CodeGen/AMDGPU/vector-alloca.ll | ||
149–153 | Let me know if you still want GCN checks for this new test. Thanks. |