Improved the cost calculation for shuffled extracts, where possible. The
cost of the extracted scalars must be calculated if some of their users
are not insertelement instructions. Also improved the total cost estimate
for the shuffled scalars used in insertelement build vectors.
Repository: rG LLVM Github Monorepo
Event Timeline
Rebase + address comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
5396 | Replaced by an assignment instead |
5390 | Can FirstUsers be empty here? |
5390 | Yes, if there are no insertelement users (see line 5381) or the insert index is non-constant (lines 5384-5385). |
5335 | Should this be put inside the while() loop before the break? AFAICT that's the only time that cast<InsertElementInst>(Base) is valid. |
5335 | Not quite. Base can also be an insertelement if ScalarToTreeEntry.count(Base) is true, i.e. there is a tree entry for this insertelement in the graph. |
5335 | I don't quite follow, this is what I had in mind: |

    // Find the insertvector, vectorized in tree, if any.
    Value *Base = VU;
    while (isa<InsertElementInst>(Base)) {
      if (ScalarToTreeEntry.count(Base)) {
        VU = Base;
        // Build the mask for the vectorized insertelement instructions.
        if (const TreeEntry *E = getTreeEntry(Base)) {
          do {
            int Idx = E->findLaneForValue(Base);
            ShuffleMask.back()[Idx] = Idx;
            Base = cast<InsertElementInst>(Base)->getOperand(0);
          } while (E == getTreeEntry(Base));
        }
        break;
      }
      Base = cast<InsertElementInst>(Base)->getOperand(0);
    }

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
5335 | Oh, I see. Actually, I was going to rework this block of code in a pretty similar way. |
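
For readers following the line 5335 thread without the patch at hand, the loop suggested there can be modelled outside LLVM. The sketch below is a toy stand-in and not the SLPVectorizer code: `Node`, `EntryMap`, `UndefElt` and `findVectorizedInsert` are hypothetical names invented for illustration, `IsInsert` models `isa<InsertElementInst>`, `VecOperand` models `getOperand(0)`, and the map lookup models `getTreeEntry`. It assumes every node registered in a tree entry is an insert with a constant lane.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an IR value; not an LLVM type.
struct Node {
  bool IsInsert;          // models isa<InsertElementInst>(V)
  const Node *VecOperand; // models getOperand(0), the vector being inserted into
  int Lane;               // constant insert index, -1 if not an insert
};

// Maps a node to the id of the tree entry that vectorizes it (assumed model
// of getTreeEntry); nodes missing from the map are not vectorized.
using EntryMap = std::unordered_map<const Node *, int>;

constexpr int UndefElt = -1;

// Walks up the insert chain starting at VU. On reaching the first insert that
// belongs to a tree entry, marks an identity lane in Mask for every insert of
// that same entry further up the chain, then stops. Returns that first
// vectorized insert, or nullptr if the chain contains none.
const Node *findVectorizedInsert(const Node *VU, const EntryMap &Entries,
                                 std::vector<int> &Mask) {
  const Node *Base = VU;
  while (Base && Base->IsInsert) {
    auto It = Entries.find(Base);
    if (It != Entries.end()) {
      const Node *Found = Base;
      const int Entry = It->second;
      do {
        // Assumption of this model: entry members are inserts with valid lanes.
        assert(Base->Lane >= 0 &&
               static_cast<std::size_t>(Base->Lane) < Mask.size());
        Mask[Base->Lane] = Base->Lane;
        Base = Base->VecOperand;
        It = Base ? Entries.find(Base) : Entries.end();
      } while (It != Entries.end() && It->second == Entry);
      return Found;
    }
    // Only reached while Base is known to be an insert, which is the point
    // raised in the review about when the cast is valid.
    Base = Base->VecOperand;
  }
  return nullptr;
}
```

With a four-insert chain in which only the two innermost inserts (lanes 0 and 1) share a tree entry, a walk from the outermost insert returns the lane-1 insert and leaves the mask as `{0, 1, -1, -1}`. With no tree entries at all, the function returns `nullptr` and the mask is untouched.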
Pull this out as an NFC patch?