This patch allows loop vectorization with function calls in cases where masks are not required but the only available vectorized function variants are masked.
Some of the code was originally written by @paulwalker-arm
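To illustrate the idea in the summary, here is a minimal sketch in plain C++ (no LLVM APIs). It assumes a hypothetical scalar function `foo` whose only vector variant, `foo_masked`, takes a mask operand. The call in the loop is unconditional, so the vectorized loop passes an all-true mask. The names, the fixed `VF` of 4, and the zero value for inactive lanes are assumptions for illustration, not what the patch generates.

```cpp
#include <array>
#include <cstddef>
#include <vector>

constexpr std::size_t VF = 4;            // assumed vectorization factor
using Vec = std::array<double, VF>;
using Mask = std::array<bool, VF>;

// Hypothetical scalar function called inside the loop.
double foo(double x) { return x * 2.0 + 1.0; }

// Hypothetical masked vector variant: only active lanes are computed;
// inactive lanes are left at zero in this sketch.
Vec foo_masked(const Vec &in, const Mask &mask) {
  Vec out{};
  for (std::size_t lane = 0; lane < VF; ++lane)
    if (mask[lane])
      out[lane] = foo(in[lane]);
  return out;
}

// "Vectorized" loop: the call is unconditional, so no mask is required,
// but the only variant is masked. Synthesize an all-true mask for it.
std::vector<double> run_vectorized(const std::vector<double> &a) {
  std::vector<double> b(a.size());
  Mask all_true;
  all_true.fill(true);

  std::size_t i = 0;
  for (; i + VF <= a.size(); i += VF) {
    Vec in;
    for (std::size_t lane = 0; lane < VF; ++lane)
      in[lane] = a[i + lane];
    Vec out = foo_masked(in, all_true);
    for (std::size_t lane = 0; lane < VF; ++lane)
      b[i + lane] = out[lane];
  }
  for (; i < a.size(); ++i)  // scalar remainder
    b[i] = foo(a[i]);
  return b;
}
```

With an all-true mask the masked variant computes every lane, so the result matches the scalar loop element for element.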
Differential D132458
[LoopVectorize] Synthesize mask operands for vector variants as needed
Closed, Public
Authored by huntergr on Aug 23 2022, 3:45 AM.
Diff Detail
Unit Tests: Failed

Event Timeline

I recommend you split this patch into the following patches:
The first and second steps are independent, and could be done in either order.
Split out the functionality for synthesizing a mask when required, added a cost for mask generation.

Thanks for the review. I'm tempted to add a masking equivalent to -force-target-supports-scalable-vectors=true in order to have target-independent tests, but I can add that in another patch.

huntergr added a child revision: D134422: Scalarize calls to masked functions in LV. Sep 22 2022, 2:17 AM

Ping. @reames -- is this roughly what you expected for the case of allowing a masked variant to be used when no mask is required? I've added a cost for generating the mask (per-call for now, instead of potentially sharing it) so that we can compare costs for different VFs with and without a masked variant, but I think we would always prefer the non-masked variant for the same VF if a mask is not required. In any case, I'm now working on the third patch.

fhahn added inline comments.
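The cost comparison described in the ping above can be sketched roughly as follows. This is not the LoopVectorize cost model: `Variant`, `kMaskCost`, `variantCost`, and `pickVariant` are hypothetical names, and the per-call mask cost of 1 is an arbitrary assumption. It only shows the stated idea that a masked variant pays an extra per-call cost for mask generation, and that at equal cost and VF the unmasked variant is preferred when no mask is required.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical description of one vector variant of a scalar function.
struct Variant {
  unsigned vf;             // vectorization factor the variant supports
  bool masked;             // true if the variant takes a mask operand
  std::uint64_t callCost;  // assumed cost of the call itself
};

// Assumed cost of materializing an all-true mask, charged per call.
constexpr std::uint64_t kMaskCost = 1;

std::uint64_t variantCost(const Variant &v) {
  return v.callCost + (v.masked ? kMaskCost : 0);
}

// Pick the cheapest variant for a given VF when the loop itself needs no
// mask. On equal cost, the unmasked variant wins.
std::optional<Variant> pickVariant(const std::vector<Variant> &variants,
                                   unsigned vf) {
  std::optional<Variant> best;
  for (const Variant &v : variants) {
    if (v.vf != vf)
      continue;
    const bool cheaper = !best || variantCost(v) < variantCost(*best);
    const bool tieUnmasked = best && variantCost(v) == variantCost(*best) &&
                             best->masked && !v.masked;
    if (cheaper || tieUnmasked)
      best = v;
  }
  return best;
}
```

Because the mask cost is added per call, a masked variant at a larger VF can still win a comparison across VFs, while at the same VF an equally priced unmasked variant always comes out ahead.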
huntergr retitled this revision from [LoopVectorize] Support masked function vectorization to [LoopVectorize] Synthesize mask operands for vector variants as needed.

Updated to store the pointer to the vector function in the recipe rather than looking it up again during recipe execution. Forced generation of a plan per VF when there are variants available for those VFs. Added some new tests for masked vs. unmasked variants. I'm not fond of adding the optional parameters to LoopVectorizationCostModel::getVectorCallCost -- it feels like function lookup needs to be split out of it, but I'd like to get some feedback from others before doing so.

Hi @huntergr, I've only partially reviewed this I'm afraid, but here are the comments I have so far. I'll try to review more this week!
huntergr added inline comments.
Thanks a lot for addressing all the comments @huntergr! I just have a few more minor comments then I think it's good to go. :)

This revision is now accepted and ready to land. Feb 9 2023, 1:18 AM

This revision was landed with ongoing or failed builds. Feb 14 2023, 6:52 AM

Closed by commit rG0fa5df1959fa: [LV] Synthesize all true masks for masked vector function variants (authored by huntergr).

This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 490161
llvm/include/llvm/Analysis/VectorUtils.h
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
llvm/lib/Transforms/Vectorize/VPRecipeBuilder.h
llvm/lib/Transforms/Vectorize/VPlan.h
llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
llvm/test/Transforms/LoopVectorize/AArch64/masked-call.ll
llvm/test/Transforms/LoopVectorize/AArch64/synthesize-mask-for-call.ll
llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
Might be worth adding /// comments here, since the others all have them?