Added an extra analysis for better choosing of shuffle kind in
getShuffleCost functions for better cost estimation if mask was
provided.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Hello @ABataev,
Thank you for the patch and for adding me as a reviewer.
I did a comment on your patch, hope you don't mind.
Carol
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
797 | Why are you changing this to broadcast I don't see that happening in SLPVectorizer.cpp. Kind == TTI::SK_PermuteSingleSrc would call: return getPermuteShuffleOverhead(cast<FixedVectorType>(Tp)); But now it calls: return getBroadcastShuffleOverhead(cast<FixedVectorType>(Tp)); Is that correct? In my view, some tests would fail because now it has another cost model being called. Oddly, there is not test failure, despite, in my view and with my knowledge, it would make some tests fail because you are changing the code path for the cost model. Do you mind adding some tests for that as well? |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
797 | Yes, there was no such thing in SLP vectorizer, I just extended functionality. Will try to add the tests for it. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
790 | nit: maybe improveShuffleKindFromMask is more descriptive? | |
792–806 | nit: This is probably just personal preference, but can this be implemented with a switch? i.e. if (Mask.empty()) return Kind; switch (Kind) { default: break; case Kind == TTI::SK_PermuteSingleSrc: if (ShuffleVectorInst::isReverseMask(Mask)) return TargetTransformInfo::SK_Reverse; if (ShuffleVectorInst::isZeroEltSplatMask(Mask)) return TargetTransformInfo::SK_Broadcast; break; case TTI::SK_PermuteTwoSrc: if (ShuffleVectorInst::isSelectMask(Mask)) return TargetTransformInfo::SK_Select; if (ShuffleVectorInst::isTransposeMask(Mask)) return TargetTransformInfo::SK_Transpose; break; } return Kind; | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
4317 | Is this change supposed to be in this patch? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4317 | Ye, without it SLP is unable to detect broadcast case. I can prepare a separate patch for it. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 | Some of these Mask functions do not handle all sizes of mask. Do we need to be careful about shuffles that changesLength? |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 | Here we don't have info about length changes, looks like all these functions use the size of the Mask array as vector factor. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 | I may have the wrong impression of what is going on exactly, but I think it's about the elements being out of range for the mask size. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 | This function does not use 2 different length, it has just one, deduced out of Mask size. It is just impossible to have a change of size situation here, I assume. We do not work with actual shuffles, functions are static. |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 |
It should not crash with this patch :-) It says for me: SLP: Adding cost 0 for bundle that starts with %add.i33 = fadd fast half %21, %22. SLP: Current total cost = 0 SLP: Adding cost 6 for bundle that starts with %21 = extractelement <8 x half> %20, i32 0. SLP: Current total cost = 6 clang: /work/llvm-project/llvm/lib/IR/Instructions.cpp:2122: bool isSingleSourceMaskImpl(llvm::ArrayRef<int>, int): Assertion `I >= 0 && I < (NumOpElts * 2) && "Out-of-bounds shuffle mask element"' failed. ... 0. Program arguments: bin/clang -target arm-arm-none-eabi -mcpu=cortex-m55 -Ofast tmp.c -S -o - -mllvm -debug -mllvm -print- after-all ... #9 0x0000558afa5bc6de llvm::ShuffleVectorInst::isSingleSourceMask(llvm::ArrayRef<int>) (bin/clang+0x728e6de) #10 0x0000558afa5bc911 llvm::ShuffleVectorInst::isSelectMask(llvm::ArrayRef<int>) (bin/clang+0x728e911) #11 0x0000558af90f9c40 llvm::ARMTTIImpl::getShuffleCost(llvm::TargetTransformInfo::ShuffleKind, llvm::VectorType*, llvm::ArrayRef <int>, int, llvm::VectorType*) ARMTargetTransformInfo.cpp:0:0 #12 0x0000558af9fcac6a llvm::TargetTransformInfo::getShuffleCost(llvm::TargetTransformInfo::ShuffleKind, llvm::VectorType*, llvm: :ArrayRef<int>, int, llvm::VectorType*) const (bin/clang+0x6c9cc6a) #13 0x0000558afb0d4f50 llvm::slpvectorizer::BoUpSLP::getEntryCost(llvm::slpvectorizer::BoUpSLP::TreeEntry*) (bin/clang+0x7da6f50) |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
802 |
| |
802 | Ok, I see. I added some checks in SLP vectorizer but looks like we need some extra checks in other places. Or relax asserts in isSingleSourceMaskImpl function (maybe, using an extra parameter) |
nit: maybe improveShuffleKindFromMask is more descriptive?