This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix PR38339: Instruction does not dominate all uses!
ClosedPublic

Authored by ABataev on Jul 27 2018, 12:09 PM.

Details

Summary

If the ExtractElement instructions can be optimized out during the
vectorization and we need to reshuffle the parent vector, this
ShuffleInstruction may be inserted in the wrong place causing compiler
to produce incorrect code.

Diff Detail

Event Timeline

ABataev created this revision.Jul 27 2018, 12:09 PM

Someone more familiar with SLP should take a look, but this seems ok to me.

The testcase can be simplified to a single BB:

define void @f1(<2 x i16> %x, i16* %a) {
  %t2 = extractelement <2 x i16> %x, i32 0
  %t3 = extractelement <2 x i16> %x, i32 1
  %ptr0 = getelementptr inbounds [4 x i16], [4 x i16]* undef, i16 0, i16 0
  %ptr1 = getelementptr inbounds [4 x i16], [4 x i16]* undef, i16 0, i16 1
  %ptr2 = getelementptr inbounds [4 x i16], [4 x i16]* undef, i16 0, i16 2
  %ptr3 = getelementptr inbounds [4 x i16], [4 x i16]* undef, i16 0, i16 3
  store i16 %t2, i16* %a
  store i16 %t2, i16* %ptr0
  store i16 %t3, i16* %ptr1
  store i16 %t3, i16* %ptr2
  store i16 %t2, i16* %ptr3
  ret void
}
RKSimon accepted this revision.Jul 31 2018, 2:23 AM

LGTM with @spatel test simplification

This revision is now accepted and ready to land.Jul 31 2018, 2:23 AM
This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp
3116 ↗(On Diff #158246)

I don't think std::next(I->getIterator()) is a valid insertion point in all cases. V could be a phi, or an invoke, or some exception-handling instruction.

efriedma added inline comments.Jul 31 2018, 1:16 PM
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp
3119 ↗(On Diff #158246)

Also, you can't just stick the instruction at the beginning of the function in this case; it's illegal to hoist a reference to a canTrap() constant.

ABataev added inline comments.Jul 31 2018, 1:25 PM
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp
3116 ↗(On Diff #158246)

What if something like this:

auto It = std::next(I->getIterator());
for (auto E = I->getParent()->end(); It != E; ++It) {
  if (!isa<PHINode>(&*It)) {
    if (It->isEHPad())
      ++It;
    break; 
  }
}
Builder.SetInsertPoint(I->getParent(), It);

?

efriedma added inline comments.Jul 31 2018, 1:51 PM
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp
3116 ↗(On Diff #158246)

That doesn't really help... among other things, there are some blocks which have no legal insertion points.

Maybe there's some constraint on SLP vectorization which would help choose a better insertion point? Why can't we just use VL0 as the insertion point?

ABataev added inline comments.Jul 31 2018, 1:57 PM
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp
3116 ↗(On Diff #158246)

Yes, maybe that's the best option here. Will fix this.