This implements TTI hook canTailPredicateLoop for MVE.
This is a first version, with the following restrictions:
- only operate on single block loops,
- only accept integer types, I will follow this up soon to add support for floats.
Paths
| Differential D69845
[ARM][MVE] canTailPredicateLoop ClosedPublic Authored by SjoerdMeijer on Nov 5 2019, 7:31 AM.
Details
Summary This implements TTI hook canTailPredicateLoop for MVE.
Diff Detail Event Timeline
SjoerdMeijer retitled this revision from [ARM][MVE] WIP: canTailPredicateLoop to [ARM][MVE] canTailPredicateLoop. SjoerdMeijer marked 2 inline comments as done. Comment Actions
Comment Actions Thanks for checking. Got myself confused about the loads/stores, but I think we indeed want to accept extending loads and narrowing stores, so I've added checks and a bunch of tests:
And have also allowed the float types. Comment Actions Have you checked the current code generation for masked fpext/fptrunc load/stores? I have a strong recollection that the codegen is absolutely awful. Comment Actions In an ideal world / in the long run, it would be better for the option of whether to predicate to be dependant on the vectorisation factor. I believe we will have cases where we can pick between a slower tail predicated loop or a quicker non-tail predicated loop, and we should be able to pick the quicker one. This implies that it should really be incorporated into the cost mechanism inside the vectoriser. (Also the current method seems a bit fuzzy, even by vectoriser standards. It seems to essentially be trying to pre-computing what vectorisation factor it thinks will be produced, choosing whether to predicate and hoping that the vectoriser does end up picking the same factor. Without doing something it doesn't expect). But I think you are correct that this is a sensible way forward, to get something working. We can adjust it as needed, and perhaps it will work OK for almost all cases. It does need to option to turn it off though. I would expect that should be off by default for the moment until the backend part is in and we have verified that this is working well enough.
Comment Actions
SjoerdMeijer added a child revision: D70125: [LV] PreferPredicateOverEpilog respecting predicate loop hint.Nov 12 2019, 7:32 AM Comment Actions LGTM. Something I'll like to see in a follow-up patch would be preventing predication in the presence of fcmps. This revision is now accepted and ready to land.Nov 13 2019, 2:26 AM Closed by commit rGd90804d26bef: [ARM][MVE] canTailPredicateLoop (authored by SjoerdMeijer). · Explain WhyNov 13 2019, 5:30 AM This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 228255 llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
llvm/test/Transforms/LoopVectorize/ARM/prefer-tail-loop-folding.ll
|