This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine][SVE] Fix visitInsertElementInst for scalable type.
ClosedPublic

Authored by huihuiz on Apr 26 2020, 9:43 PM.

Details

Summary

This patch fixes the following issues in visitInsertElementInst:

  1. Bail out for scalable type when analysis requires fixed size number of vector elements.
  2. Use cast<FixedVectorType> to get vector number of elements. This ensure assertion on scalable vector type.
  3. For scalable type, avoid folding a chain of insertelement into splat: insertelt(insertelt(insertelt(insertelt X, %k, 0), %k, 1), %k, 2) ... -> shufflevector(insertelt(X, %k, 0), undef, zero) The length of scalable vector is unknown at compile-time, therefore we don't know if given insertelement sequence is valid for splat.

Diff Detail

Event Timeline

huihuiz created this revision.Apr 26 2020, 9:43 PM
  1. Current upstream crash at:

llvm::ShuffleVectorInst::ShuffleVectorInst(llvm::Value *, llvm::Value *, ArrayRef<int>, const llvm::Twine &, llvm::Instruction *): Assertion `isValidOperands(V1, V2, Mask) && "Invalid shuffle vector instruction operands!"' failed.
take test.ll , run opt -S -instcombine test.ll

define <vscale x 4 x i32> @insertelement_extractelement(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
  %t0 = extractelement <vscale x 4 x i32> %a, i32 1
  %t1 = insertelement <vscale x 4 x i32> %b, i32 %t0, i32 0
  ret <vscale x 4 x i32> %t1
}
  1. Current upstream is doing wrong fold, trying to fold a chain of insertelement into splat for scalable type. The vector length of scalable type is unknown at compile-time. Therefore a valid sequence of insertelement for fixed-length vector, may not be valid for scalable type to fold into splat.

take test.ll, run opt -S -instcombine test.ll

define <vscale x 4 x float> @insertelement_sequene_may_not_be_splat(float %x) {
  %t0 = insertelement <vscale x 4 x float> undef, float %x, i32 0
  %t1 = insertelement <vscale x 4 x float> %t0, float %x, i32 1
  %t2 = insertelement <vscale x 4 x float> %t1, float %x, i32 2
  %t3 = insertelement <vscale x 4 x float> %t2, float %x, i32 3
  ret <vscale x 4 x float> %t3
}

currently generating:

define <vscale x 4 x float> @insertelement_sequene_may_not_be_splat(float %x) {
  %t0 = insertelement <vscale x 4 x float> undef, float %x, i32 0
  %t3 = shufflevector <vscale x 4 x float> %t0, <vscale x 4 x float> undef, <vscale x 4 x i32> zeroinitializer
  ret <vscale x 4 x float> %t3
}

Thanks for working on this @huihuiz! I've left some comments on the patch.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1059–1060

nit: !isa<ScalableVectorType> -> isa<FixedVectorType>

1064

This is missing a && isa<FixedVectorType>(ExtVecOp->getType()), because the vector that the value is extracted from could be scalable.
(which should be guarded by an extra test case)

1109

This probably also needs a change in SimplifyDemandedVectorElts to bail out early on scalable vectors.

llvm/test/Transforms/InstCombine/vscale_insertelement.ll
5

nit: Can you add a comment what this is testing <=> what it is trying to prevent?

18

nit: This comment doesn't explain what causes the invalid isValidOperands to occur. Can you explain what code-path/optimization you're trying to prevent in this test?

31

Same here.

huihuiz updated this revision to Diff 260484.Apr 27 2020, 3:54 PM
huihuiz marked 7 inline comments as done.

Thanks Sander for the feedback!

Addressed review comments.

efriedma added inline comments.May 4 2020, 11:35 AM
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
847

I'd prefer to bail out more aggressively here; I'm not sure this makes sense for scalable vectors.

880

I'd prefer to bail out more aggressively here; I'm not sure this makes sense for scalable vectors.

huihuiz updated this revision to Diff 262692.May 7 2020, 10:28 AM
huihuiz marked 2 inline comments as done.
huihuiz edited the summary of this revision. (Show Details)

Address review comments.

efriedma accepted this revision.May 7 2020, 11:08 AM

LGTM with one minor nit

llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
1193–1198

Cannot

This revision is now accepted and ready to land.May 7 2020, 11:08 AM
sdesmalen accepted this revision.May 7 2020, 12:38 PM

LGTM as well. Thanks for adding the extra comments to the tests!

huihuiz marked an inline comment as done.May 7 2020, 12:46 PM

Thanks! Fixed in the committed patch.

This revision was automatically updated to reflect the committed changes.