This is an archive of the discontinued LLVM Phabricator instance.

[llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable vectors.
ClosedPublic

Authored by fpetrogalli on Aug 19 2020, 2:57 PM.

Details

Summary

This patch prevents the llvm.masked.gather and llvm.masked.scatter intrinsics to be scalarized when invoked on scalable vectors.

The change in Function.cpp is needed to prevent the warning that is raised when getNumElements is used in place of getElementCount on VectorType instances. The tests guards for regressions on this change.

The tests makes sure that calls to llvm.masked.[gather|scatter] are still scalarized when:

  1. the intrinsics are operating on fixed size vectors, and
  2. the compiler is not targeting fixed length SVE code generation.

Diff Detail

Event Timeline

fpetrogalli created this revision.Aug 19 2020, 2:57 PM
fpetrogalli requested review of this revision.Aug 19 2020, 2:57 PM

ScalarizeMaskedMemIntrin can't scalarize scalable vectors on any target; would it make sense to check there?

We probably want the TTI change eventually for the sake of cost modeling in the vectorizer, but I'd prefer to be defensive about inappropriate transforms on scalable vectors.

ScalarizeMaskedMemIntrin can't scalarize scalable vectors on any target; would it make sense to check there?

Sure, good idea, working on it. It is a trivial change, but I would like to add a comment as an explanation. What is the reason for not being able to scalarize anything that work on scalable vectors? Is it because this scalarization pass cannot (shouldn't?) produce loops?

We probably want the TTI change eventually for the sake of cost modeling in the vectorizer, but I'd prefer to be defensive about inappropriate transforms on scalable vectors.

OK - but I think I should remove them from this patch. We will introduce them when needed. Does that sound right to you?

Thank you!

Francesco

What is the reason for not being able to scalarize anything that work on scalable vectors? Is it because this scalarization pass cannot (shouldn't?) produce loops?

The scalarization pass currently doesn't produce loops, so the resulting code is nonsense. We could, in theory, teach the scalarization pass how to generate loops, but it would be hard to do well.

OK - but I think I should remove them from this patch. We will introduce them when needed. Does that sound right to you?

Makes sense

I lifted the implemementation to the target-agnostic part of the scalarize pass.

sdesmalen added inline comments.
llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
894 ↗(On Diff #287394)

Should this check be hoisted out of the switch statement and have it return false if the result or any of the operands to the intrinsic is scalable? The intrinsic itself doesn't really matter, given that the pass doesn't produce loops for any of the loads/stores of scalable vector types, not just gather/scatter.

efriedma added inline comments.Aug 27 2020, 3:54 AM
llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
894 ↗(On Diff #287394)

I doubt anyone is going to care about scalable masked_expandloads anytime soon, but sure, hosting it out makes sense.

fpetrogalli added inline comments.Aug 28 2020, 12:42 PM
llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
894 ↗(On Diff #287394)

Given that we know this pass doesn't produce loop, I am happy to hoist this even outside the if (II), and test it on a generic CallInst instead of IntrinsicInst. Does that make sense?

[llvm][sve] Make llvm.masked.[gather|scatter] legal for SVE.

The title of the patch seems wrong, as this patch doesn't do any legalization of masked.gather/scatter for SVE?

llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
894 ↗(On Diff #287394)

Sure, that sounds fine.

fpetrogalli added inline comments.Sep 7 2020, 8:15 AM
llvm/lib/CodeGen/ScalarizeMaskedMemIntrin.cpp
894 ↗(On Diff #287394)

Hi both - sorry for the long wait. I had a go at hoisting the check outside the switch statement, and on a generic CallInst. I actually prefer the original version in this patch, for the following reasons:

  1. The check on the generic CallInst is a bit weird, because the pass is called "ScalarizeMaskedMemInstrinsics". Hence, the pass is already expected to ignore anything that is not a masked memory intrinsic, I don't see why we should create CallInst specific behavior. I suggested this, but I don't think it is the right thing to do anymore after seeing a LIT test with a generic call function that is invoking opt with -scalarize-masked-mem-intrin.
  2. Hosting the check outside the switch statement is viable, but than that would make sense if the scalarization process would be something generic over the masked mem intrinsics. Instead, as things are, scalarization is performed on a per intrinsic base. For this reason, I think we should perform this extra test in each case (like it is done in this patch), as we will be able to enable scalarization on each intrinsic individually without having to revert the generic check for all the masked memory intrinsics.
fpetrogalli retitled this revision from [llvm][sve] Make `llvm.masked.[gather|scatter]` legal for SVE. to [llvm][CodeGen] Do not scalarize `llvm.masked.[gather|scatter]` operating on scalable vectors..Sep 7 2020, 8:43 AM
fpetrogalli edited the summary of this revision. (Show Details)
efriedma accepted this revision.Sep 7 2020, 5:14 PM

LGTM

This revision is now accepted and ready to land.Sep 7 2020, 5:14 PM

@sdesmalen - I have moved the check outside the switch statement as discussed in our phone call.

sdesmalen accepted this revision.Sep 14 2020, 3:22 AM

Thanks @fpetrogalli, LGTM!

llvm/test/CodeGen/AArch64/llvm-masked-gather-legal-for-sve.ll
55

nit: s/passthro/passthru

This revision was landed with ongoing or failed builds.Sep 16 2020, 9:03 AM
This revision was automatically updated to reflect the committed changes.