This is an archive of the discontinued LLVM Phabricator instance.

[SLP]Improve cost model for the shuffled extracts.
ClosedPublic

Authored by ABataev on Nov 12 2021, 10:03 AM.

Details

Summary

Improved the calculation of the shuffled extracts, where possible. Need
to calculate the cost for the extracted scalars if some users are not
insertelements + improved the total estimation of the shuffled scalars
used in insertelements build vectors.

Diff Detail

Event Timeline

ABataev created this revision.Nov 12 2021, 10:03 AM
ABataev requested review of this revision.Nov 12 2021, 10:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 12 2021, 10:03 AM
RKSimon added inline comments.Nov 13 2021, 8:30 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5381

Pull out this NFC?

5394

Pull out this NFC?

5486

Why perform the OR with zero?

ABataev updated this revision to Diff 387246.Nov 15 2021, 7:08 AM

Rebase + address comments.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5486

Replaced by an assignment instead

RKSimon added inline comments.Nov 30 2021, 8:13 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5480

can FirstUsers be empty here ?

ABataev added inline comments.Nov 30 2021, 8:25 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5480

Yes, if no insertelement users (check line 5381) or insert index is non-constant (line 5384-5385)

RKSimon added inline comments.Dec 1 2021, 3:01 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5425

Should this be put inside the while() loop before the break? AFAICT that's the only time that cast<InsertElementInst>(Base) is valid.

ABataev added inline comments.Dec 1 2021, 5:32 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5425

No, not quite so. It can be insertelement also if ScalarToTreeEntry.count(Base) is true, i.e. there is a tree entry for this insertelement in the graph.

RKSimon added inline comments.Dec 1 2021, 6:01 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5425

I don't quite follow, this is what I had in mind:

// Find the insertvector, vectorized in tree, if any.
Value *Base = VU;
while (isa<InsertElementInst>(Base)) {
  if (ScalarToTreeEntry.count(Base)) {
    VU = Base;

    // Build the mask for the vectorized insertelement instructions.
    if (const TreeEntry *E = getTreeEntry(Base)) {
      do {
        int Idx = E->findLaneForValue(Base);
        ShuffleMask.back()[Idx] = Idx;
        Base = cast<InsertElementInst>(Base)->getOperand(0);
      } while (E == getTreeEntry(Base));
    }
    break;
  }
  Base = cast<InsertElementInst>(Base)->getOperand(0);
}
ABataev added inline comments.Dec 1 2021, 6:03 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5425

Oh, I see. Actually, I was going to rework this block of code in a pretty similar way.

ABataev updated this revision to Diff 391007.Dec 1 2021, 6:42 AM

Rebase + address comments

RKSimon accepted this revision.Dec 1 2021, 6:54 AM

LGTM - cheers

This revision is now accepted and ready to land.Dec 1 2021, 6:54 AM
This revision was landed with ongoing or failed builds.Dec 1 2021, 8:12 AM
This revision was automatically updated to reflect the committed changes.