For rare exceptional case vector tree node (insertelements for now only)
is marked as NeedToGather, this case is processed by patch. Follow-up
of D98714 to fix bug reported here https://reviews.llvm.org/D98714#2764135.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3693–3695 | Not sure we should return 0 here, maybe better to return some max cost, like INT_MAX? | |
4597–4599 | Is this correct? Need to return the last insertelement instruction in the sequence | |
llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector-inseltpoison.ll | ||
45–80 | Why the check lines are removed? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3693–3695 | There is no difference for now (the only one is compile time). We could also just prohibit marking NeedToGather for vector node (it could be done while checking for ephemeral values only). But I've done it this way since it looks more transparent and also we could use zero cost for the future cases when/if such NeedToGather vector node is not the only member of tree and whole tree vectorization is beneficial. | |
4597–4599 | Sure, fixed! | |
llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector-inseltpoison.ll | ||
45–80 | Fixed |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3693–3695 | I think it is better to use TTI::TCC_Expensive to mark it explicitly as an expensive operation. Or even invalid cost, if possible. |
Address comment
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3693–3695 | Ok, used Invalid cost. |
llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector-inseltpoison.ll | ||
---|---|---|
149–156 | Why are the instructions reordered if no vectorization should happen? |
llvm/test/Transforms/SLPVectorizer/X86/insert-element-build-vector-inseltpoison.ll | ||
---|---|---|
149–156 | Vectorization of inserts is prohibited now, yes, but compiler still "vectorizes" small trees of size 1 with any valid cost due to options -slp-threshold=-10000 -slp-min-tree-size=0. This is not related to this patch itself. |
Not sure we should return 0 here, maybe better to return some max cost, like INT_MAX?