Expanding hlfir.assign operations with a scalar RHS late in the MLIR
optimization pipeline allows LLVM to recognize most of them as simple
memset loops. This is especially important for small LHS arrays, because
the assign loop nest may be completely unrolled, enabling more value
propagation.
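As a rough illustration of why the expanded form helps, the following C++ sketch mimics a scalar assignment to a small fixed-shape array written as an explicit loop nest. It is only an analogy, not the MLIR that the pass produces; `Small2D`, `assignScalar`, and `sumAfterFill` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a small fixed-shape Fortran array such as y(3,3).
using Small2D = std::array<std::array<float, 3>, 3>;

// Analogue of an expanded scalar assignment "y(:,:) = value":
// a plain loop nest with constant trip counts and no calls in the body.
inline void assignScalar(Small2D &lhs, float value) {
  for (std::size_t j = 0; j < lhs.size(); ++j)
    for (std::size_t i = 0; i < lhs[j].size(); ++i)
      lhs[j][i] = value;
}

// Because every store is visible here, an optimizer can in principle unroll
// the nest and forward the stored scalar to the reads below.
inline float sumAfterFill(float value) {
  Small2D y{};
  assignScalar(y, value);
  float sum = 0.0f;
  for (const auto &row : y)
    for (float v : row)
      sum += v;
  return sum;
}
```

If the assignment were instead hidden behind an opaque runtime call, the optimizer would generally not be able to see which elements were written, so the later reads could not be simplified in the same way.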
Repository: rG LLVM Github Monorepo
Event Timeline
This addresses one performance problem in Polyhedron/fatigue2.
Another problem is in assignments like this:
    subroutine test(x)
      real :: x(:,:)
      real :: y(3,3)
      y(:,:) = x(:,:)
    end subroutine test
So we also have to optimize hlfir.assign where the RHS is not an elemental but a "variable". We just need to prove that the LHS and RHS do not conflict. I am going to work on this next.
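To make the "do not conflict" condition concrete, here is a minimal C++ sketch of an array-to-array assignment that copies element by element when the two storage ranges are disjoint and goes through a temporary otherwise. It assumes contiguous storage and a runtime check, whereas the pass would have to prove the property statically with alias analysis; `rangesOverlap` and `assignArray` are hypothetical helpers, not flang APIs.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: true if [a, a+n) and [b, b+m) share any element.
// std::less gives a total order over pointers even across different objects.
inline bool rangesOverlap(const float *a, std::size_t n,
                          const float *b, std::size_t m) {
  if (n == 0 || m == 0)
    return false;
  std::less<const float *> before;
  return before(a, b + m) && before(b, a + n);
}

// Analogue of "lhs(:) = rhs(:)" for contiguous arrays of n elements.
inline void assignArray(float *lhs, const float *rhs, std::size_t n) {
  if (!rangesOverlap(lhs, n, rhs, n)) {
    // No conflict: a direct copy loop is safe and easy to optimize.
    for (std::size_t i = 0; i < n; ++i)
      lhs[i] = rhs[i];
    return;
  }
  // Possible conflict: read the whole RHS before writing any LHS element.
  std::vector<float> tmp(rhs, rhs + n);
  for (std::size_t i = 0; i < n; ++i)
    lhs[i] = tmp[i];
}
```

In the example from the comment, `y` is a local array and `x` is a dummy argument, so the two cannot refer to the same storage; that is the kind of fact that would let the direct loop be used without a temporary.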
flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp, line 433:
@tblah I wanted to check with you before making this change. If you are extending the pass for cam, then I would like to postpone it. Please let me know.
LGTM, thanks for this!
flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp, line 433:
I have no changes to this pass in progress, so feel free to go ahead with changes. I wonder if running it on hlfir.assign might be premature, because we might find optimizable bufferizations which don't involve a hlfir.assign (although I don't have any in mind currently). But yeah, I like the idea of having some central heuristic for deciding which pattern to apply.
flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp, line 433:
Thanks! I think I will investigate other benchmarks' performance and postpone the reordering. I will proceed with it if nothing new appears.