This is an archive of the discontinued LLVM Phabricator instance.

[mlir][vector] insert `alloca`s outside of loops
Closed · Public

Authored by ftynse on Apr 25 2022, 1:26 AM.

Details

Summary

After https://reviews.llvm.org/D119743 added the AutomaticAllocationScope
trait to loop-like constructs, the vector transfer full/partial splitting pass
started inserting allocations for temporaries within the closest loop rather
than the closest function (or other allocation scope such as async.execute).
While this is correct as long as the lowered code takes care of automatic
deallocation at the end of each iteration of the loop, it interferes with
downstream optimizations that expect allocas to be at the function level.
Step over loops when looking for the closest allocation scope in the vector
transfer full/partial splitting pass, thus restoring the original behavior.
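
The "step over loops" lookup described above can be illustrated with a small, self-contained sketch. This is not the code from the patch and does not use the MLIR API: `ToyOp`, its fields, and `closestNonLoopAllocationScope` are hypothetical names introduced only to model the parent walk, assuming each operation knows its parent, whether it is an allocation scope, and whether it is a loop.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an operation; NOT the MLIR Operation class.
struct ToyOp {
  std::string name;
  ToyOp *parent;          // enclosing operation, nullptr at the top level
  bool isAllocationScope; // models the AutomaticAllocationScope trait
  bool isLoop;            // models a loop-like construct such as scf.for
};

// Walks up the parent chain and returns the closest enclosing allocation
// scope that is not a loop, or nullptr if no such ancestor exists.
ToyOp *closestNonLoopAllocationScope(ToyOp *op) {
  for (ToyOp *p = op->parent; p != nullptr; p = p->parent) {
    // Loops are allocation scopes too, but they are stepped over here.
    if (p->isAllocationScope && !p->isLoop)
      return p;
  }
  return nullptr;
}
```

With a transfer op nested in two loops inside a function, the walk in this sketch skips both loops and returns the function; with an `async.execute`-like non-loop scope in between, it stops there instead.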

Event Timeline

ftynse created this revision. Apr 25 2022, 1:26 AM
Herald added a project: Restricted Project.
ftynse requested review of this revision. Apr 25 2022, 1:26 AM
hanchung accepted this revision. Apr 25 2022, 1:47 AM

Thanks!

This revision is now accepted and ready to land. Apr 25 2022, 1:47 AM
This revision was landed with ongoing or failed builds. Apr 25 2022, 1:49 AM
This revision was automatically updated to reflect the committed changes.
herhut added a subscriber: herhut. Apr 25 2022, 5:17 AM

What is the longer term plan here? Do all places that introduce alloca have to perform this local optimization? Should we instead have a pass that hoists alloca out of loops? Or rethink the introduction of allocation scope for loops?

ftynse added a subscriber: wsmoses. Apr 25 2022, 5:25 AM

We should add a facility (a pass presumably) that hoists allocas up to some user-specified level. Maybe the allocation scope that is not nested in another allocation scope, but is nested in the closest isolated-from-above operation. I believe @wsmoses has a prototype for this.
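
One possible reading of the hoisting target suggested above, as a rough sketch only: walk up from the alloca to the closest isolated-from-above ancestor and pick the outermost allocation scope seen on the way. `ToyOp` and `hoistTarget` are hypothetical names, not MLIR API or the prototype mentioned above, and treating the isolated-from-above operation itself as an eligible scope is an assumption.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an operation; NOT the MLIR Operation class.
struct ToyOp {
  std::string name;
  ToyOp *parent;            // enclosing operation, nullptr at the top level
  bool isAllocationScope;   // models the AutomaticAllocationScope trait
  bool isIsolatedFromAbove; // models the IsolatedFromAbove trait
};

// Returns the outermost allocation scope between `op` and its closest
// isolated-from-above ancestor (inclusive, by assumption), or nullptr.
ToyOp *hoistTarget(ToyOp *op) {
  ToyOp *target = nullptr;
  for (ToyOp *p = op->parent; p != nullptr; p = p->parent) {
    if (p->isAllocationScope)
      target = p; // keep overwriting so the outermost scope wins
    if (p->isIsolatedFromAbove)
      break; // never hoist past the isolation boundary
  }
  return target;
}
```

In this sketch, an alloca nested in loops inside a function-like operation that is both an allocation scope and isolated from above would be hoisted to that operation.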

This might be useful for FIR as well since there is a plan to implement such a pass in Flang/FIR.