This is an archive of the discontinued LLVM Phabricator instance.

[LV] Recognize store of invariant value to invariant address as uniform
ClosedPublic

Authored by reames on Jul 22 2022, 7:53 AM.

Details

Summary

This extends the handling of uniform memory operations to handle the case where a store is storing a loop invariant value. Unlike the general case of a store to an invariant address where we must use the last active lane, in this case we can use any lane since all lanes must produce the same result.

For context, the basic structure of the existing code and how the change fits in:

  • First, we select a widening strategy. (The result is irrelevant for this patch.)
  • Then we determine if a computation is uniform within all lanes of VF. (Note this is the uniform-per-part definition, not LAI's uniform across all unrolled iterations definition.)
  • If it is, we overrule the widening strategy, and unconditionally scalarize.
  • VPReplicationRecipe - which is what actually does the scalarization - knows how to handle unform-per-part values including for scalable vectors. However, we do need to know that the expression is safe to execute without predication - e.g. the uniform mem op was unconditional in the original loop. (This part was split off and already landed.)

An obvious question is why not simply implement the generic case? The answer is that I'm going to, but doing so without a canonicalization towards uniform causes regressions due to bad interaction with scalarization/uniformity of values feeding the uniform mem-op. This patch is needed to avoid those regressions.

Diff Detail

Event Timeline

reames created this revision.Jul 22 2022, 7:53 AM
reames requested review of this revision.Jul 22 2022, 7:53 AM
david-arm added inline comments.Jul 27 2022, 5:14 AM
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll
96 ↗(On Diff #446832)

Hi @reames, something doesn't look right about this change because each store instruction is storing out a different value.

reames added inline comments.Jul 27 2022, 7:02 AM
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll
96 ↗(On Diff #446832)

This is correct, but not directly related to the thrust of the patch. This is a side effect of the change in isScalarWithPredication. We'd previously been considering these stores to be predicated. They are unconditional in the original IR, so this should be correct.

If you want, I can split the patch further to do a pre-change with just the change in isScalarWithPredication.

fhahn added inline comments.Jul 27 2022, 7:13 AM
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll
96 ↗(On Diff #446832)

Yeah it would probably be good to split off the change to ‘ isPredicatedInst’, especially if it reduces the test changes per patch

reames added inline comments.Jul 27 2022, 7:43 AM
llvm/test/Transforms/LoopVectorize/X86/consecutive-ptr-uniforms.ll
96 ↗(On Diff #446832)

Split off as https://reviews.llvm.org/D130637. Will rebase this once that lands.

This did turn out to be more test churn than I'd realized. Clearly should have split that from the start. Oh well.

reames updated this revision to Diff 448397.Jul 28 2022, 11:36 AM

Rebase over split off and landed change.

reames edited the summary of this revision. (Show Details)Jul 28 2022, 11:37 AM
david-arm accepted this revision.Aug 2 2022, 6:22 AM

LGTM!

This revision is now accepted and ready to land.Aug 2 2022, 6:22 AM
fhahn accepted this revision.Aug 2 2022, 6:33 AM

LGTM, thanks!

This revision was landed with ongoing or failed builds.Aug 2 2022, 8:10 AM
This revision was automatically updated to reflect the committed changes.