Get rid of non-constant and undef indices of insertelements
at buildTree() stage. Fix bugs.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Something to take it account with this and D119536 - we're going to need to cherrypick these fixes for PR52275 as they affect 14.x as well - will they be able to apply cleanly to the 14.x branch?
What kind of bugs do you see with these test cases and current version of the function? I would try to keep it, better to generate UndefMaskElem if possible, rather than cut of the vectorization early.
All these test cases are crashing without patch. There are several issues with this code. First of all, see lines 4028 and 4053 -- we use already dereferenced Optional<int> Idx = *getInsertIndex(V); (it should be without *). Also MinIdx is computed but not used at all.
But the main problem is that UndefMaskElem generally makes no sense for index: the function getInsertIndex() returns _index_ rather than element, if it is undef/poison/out of range then whole vector is poison. Also this return value UndefMaskElem isn't used now anywhere, if somehow (how?) it could be useful in future, it can be added.
No, they doesn't apply cleanly right now, but I'm to backport them through new workflow (https://discourse.llvm.org/t/rfc-new-automated-release-workflow-part-2/59981)
Ok, got it. Could you also add a check to the buildvector detection function for such kind of insertelement (+ possibly insertvalue) instructions? They should not be treated as a part of buildvector, need to treat them as an initial/terminating instruction and exclude from possibly vectorizable buildvector instructions.
Ok, got it. Could you also add a check to the buildvector detection function for such kind of insertelement (+ possibly insertvalue) instructions? They should not be treated as a part of buildvector, need to treat them as an initial/terminating instruction and exclude from possibly vectorizable buildvector instructions.
Do you mean findBuildAggregate()? This check is already there, see line 9603.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9603 | @ABataev Do you mean this check? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9603 | Yes, but I thought about returning true here. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9603 | Hmm, but true means that this insertelement is a part of buildvector. Do you want this? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9603 | True means that we have a buildvector, but do not include this insertelement into the list of the vectorisable insertelements, i.e. treat it as an initial vector of the buildvector. Could you try this? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
9607–9609 | I mean, maybe return true here too, if we have at least 2 insertelements in the InsertElts vector? |
Would be good to test it using test suite at least, just in case we missed something.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4026–4032 | With an early check for insertelements in tryVectorizePair this can be turned to an assert rather than check and schedule cancellation. |
LG
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4026–4032 | Ok, do not forget to remove this check then. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4026–4032 | Sure, already removed. |
With an early check for insertelements in tryVectorizePair this can be turned to an assert rather than check and schedule cancellation.