This adds an Mask ArrayRef to getShuffleCost, so that if an exact mask can be provided, a more accurate cost can be provided by the backend. For example VREV costs could be returned by the ARM backend. This should be an NFC until then, laying the groundwork for that to be added.
Details
Diff Detail
Event Timeline
What is VREV?
How is that different from TargetTransformInfo::ShuffleKind::SK_Reverse?
Can a new shuffle kind be added instead of increasing problem surface?
Hello. A VREV is a little different from a SK_Reverse, as far as I understand. A vrev mask will reverse lanes in a block, not the whole register. So they have masks like <3, 2, 1, 0, 7, 6, 5, 4> or <1, 0, 3, 2, 5, 4, 7, 6>. In MVE there are vrev16.8, vrev32.8, vrev32.16, vrev64.8, vrev64.16 and vrev64.32. The way I think of them is that taking a 128 bit legal vector, they reverse the bits in the first digit, then reverse back in the second.
This seemed like the most general way to fix it to me, giving the backend the exact shuffle mask where available can help it provide better info. Adding a collection of SK_VREVXY might work, but I was hoping to add some other shuffles cheap for the Arm backend like VMOVN's, which would make further use of it eventually.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3482 | Should isShuffle() create a shuffle mask to pass on to getShuffleCost? |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
3470 | I don't know the SLP vectorizer very well. Could these use E->ReuseShuffleIndices? |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
1047–1048 | ArrayRef<int> Mask = llvm::None? How about make it a param with the default value? | |
llvm/include/llvm/CodeGen/BasicTTIImpl.h | ||
1260 | Better to use llvm::None here and in the other places | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
293 | SmalVectorImpl<int> &? | |
312 | UndefMaskElem | |
3470 | I think yes. | |
3550 | You can call inversePermutation to get a mask out of E->ReorderIndices | |
3776 | Same, call inversePermutation to get a mask out of E->ReorderIndices | |
3793 | Same, call inversePermutation to get a mask out of E->ReorderIndices | |
3861 | The mask here is something like 1010101..., just in case |
Thanks for the comments. None is a new one to me, that's nice that it works for ArrayRefs.
This now uses None, including as a default parameter value, and tries to use more accurate masks from the SLP vectorizer.
ArrayRef<int> Mask = llvm::None? How about make it a param with the default value?