After https://reviews.llvm.org/D119743 added the AutomaticAllocationScope
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as async.execute).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, this interferes with
downstream optimizations that expect allocas to be at the function level.
Step over loops when looking for the closest allocation scope in the vector
transfer full/partial splitting pass, thus restoring the original behavior.
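
The difference between the two lookups can be illustrated with a small self-contained model. This is only a sketch: the `Op` struct and both function names below are hypothetical stand-ins invented for illustration, not the MLIR `Operation` class or the code in this diff. It assumes that each operation knows its parent and whether it is an allocation scope and/or a loop-like construct.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for an operation (hypothetical, not the MLIR Operation class).
struct Op {
  std::string name;
  bool isAllocationScope; // models the AutomaticAllocationScope trait
  bool isLoopLike;        // models a loop-like construct such as scf.for
  Op *parent;             // enclosing operation, or nullptr at the top level
};

// Lookup as it behaved after D119743: the closest enclosing allocation
// scope wins, even if it is a loop.
Op *closestAllocationScope(Op *op) {
  assert(op && "expected a non-null operation");
  for (Op *cur = op->parent; cur; cur = cur->parent)
    if (cur->isAllocationScope)
      return cur;
  return nullptr;
}

// Lookup as described in this change: the same upward walk, but loop-like
// allocation scopes are stepped over.
Op *closestNonLoopAllocationScope(Op *op) {
  assert(op && "expected a non-null operation");
  for (Op *cur = op->parent; cur; cur = cur->parent)
    if (cur->isAllocationScope && !cur->isLoopLike)
      return cur;
  return nullptr;
}
```

For a transfer op nested in a loop inside a function, the first lookup returns the loop and the second returns the function, so the temporary ends up at the function level again.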
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
What is the longer term plan here? Do all places that introduce alloca have to perform this local optimization? Should we instead have a pass that hoists alloca out of loops? Or rethink the introduction of allocation scope for loops?
We should add a facility (a pass presumably) that hoists allocas up to some user-specified level. Maybe the allocation scope that is not nested in another allocation scope, but is nested in the closest isolated-from-above operation. I believe @wsmoses has a prototype for this.
This might be useful for FIR as well since there is a plan to implement such a pass in Flang/FIR.