Cost modeling for GEPs should actually be target dependent but is currently
done inside SLP target-independent way.
Sinking it into TTI enables target dependent implementation.
This patch adds new TTI interface and implementation of the basic functionality
trying to retain existing cost modeling.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'd expect this to cause cost changes - have you had any luck creating tests demonstrating that?
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
2024 | Pass by const ref? | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
2817 | Drop the nullptr (maybe convert to StringRef?) and add "Calculated costs for Tree" to the only existing dumpTreeCosts call | |
7228 | This kind of thing always feels like a cheat - is the only reason you don't return TCC_Free here because we still need to do the dumpTreeCosts call below? | |
7251 | It's amazing how even a patch that tries to pull functionality out of SLP increases the size and complexity of SLPVectorizer.cpp :-( |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
7251 | It not just tries to pull something out, it introduces new interfaces, which will improve cost modeling. This, of course, may increase the complexity. Unfortunate, but required. |
I was not able to make one. Existing tests that are sensitive to this cost modeling are:
- for plain wide loads:
llvm/test/Transforms/SLPVectorizer/X86/remark_horcost.ll
llvm/test/Transforms/SLPVectorizer/X86/remark_not_all_parts.ll
- for gather loads:
llvm/test/Transforms/SLPVectorizer/X86/remark_gather-load-redux-cost.ll
But yep, I agree that this is not a great coverage.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
7228 | This is just a short cut. If we have all the scalar GEPs to remain in vectorized code then vector cost would be the same. We could just return TCC_Free here but as you noted I'd like to have costs printed. | |
7251 | I tried to add/reword some comments. May be that helps a bit. |
I'm not sure I understand. All these tests do already check for pass-remarks output which includes vec tree cost. Did you mean something else?
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
284–291 | Better to represent them as bitfields: bool IsSameBaseAddress:1 = false; bool IsUniformStride:1 = false; bool IsKnownStride:1 = false; unsigned Reserved: 29; | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
2816–2826 | This change may be committed separately | |
7201 | Use areAllUsersVectorized? |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
284–291 | That's fair | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
2816–2826 | Split into https://reviews.llvm.org/D144992 | |
7201 | Yep. That what I was considering. Just though about addressing it in a follow up patch. |
I thought they don't have remarks. Maybe add remarks to some other tests?
Please let me know if you have some specific tests in mind to beef up with pass remark checks.
May be tests with specific patterns? Finding them by looking through every test manually is a bit tedious work.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
7201 | I experimented with replacing single use check with areAllUsersVectorized(). Yes, that's what we want but it makes the patch non-NFC. So I'd like to make that change separately. |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
1045 | will do | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
7179 | We do not call this routine for scatter vectorize loads node. We did that in the past but that was a bug (since we evaluated GEPs twice as the result). For scatter vectorize we evaluate GEPs only when calculating cost for tree node that correspond to pointer operands of the loads. The comment above explains exactly that distinction. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
7179 | Could you add an assertion for the expected cases? Like ScatterVectorize is not expected, etc. |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
7179 | Sure |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
1047–1048 | Technically, yes. Ptrs is not prohibited to contain just single Base entry (which can be a value or a non-GEP instruction). How practical such use would be is the different kind of question. | |
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
7176 | I'd like to not bypass the debug printer in this case. |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
283 | Can you static_assert that the size of PointerChainInfo is 4? |
LG with a nit
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
316 | Add the meaningful message |
llvm/include/llvm/Analysis/TargetTransformInfo.h | ||
---|---|---|
316 | Yep. I'll add it. I was not sure about wording. How about "Was size increase justified?" ? BTW. I need to make another change to the struct (bool => unsigned) as on Windows that works differently and the size appears 8. | |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
1055 | There is actually a problem with this assertion. I suspected about it but only revealed for sure when tried to test the patch more extensively. The problem is that it is very possible that GEPS are not in the same block. |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
1055 | For the vectorized blocks it is not possible - we expect that vectorized GEPs are from the same block, otherwise they will be gathered |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
1055 | For GEPs that are operands of scatter vectorize node we probably have this check. I did not verify that but I'm not arguing that it is not true. |
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h | ||
---|---|---|
1055 | Right, for vectorized loads/store we do not do it, we just don't need it. Then you need to remove it. |
Fixed issues revealed with more extensive test run.
Removed asserts from base implementation. Same GEPs origin is not guaranteed and and I don't find asserting for kinds of incoming pointers as quite useful.
Added assertion that Base is provided if sameBase is set.
PointersChainInfo changed to satisfy static assert condition on Windows.
Rebased.
Can you static_assert that the size of PointerChainInfo is 4?