This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Permit tail-folding for low trip counts using scalable vectors
Closed · Public

Authored by david-arm on Mar 14 2022, 4:44 AM.

Details

Summary

When the loop vectoriser encounters a known low trip count it tries
to create a single predicated loop in order to get the benefit of
vectorisation and eliminate the scalar tail. However, until now the
vectoriser prevented the use of scalable vectors in this case due
to concerns in the past about stability. I believe that tail-folded
loops using scalable vectors are now sufficiently well tested that
we can enable this. For the same reason I've also enabled it when
optimising for code size.
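
To illustrate what "a single predicated loop" means here, the following is a minimal scalar model of tail-folding, not the code the vectoriser emits. The names `addTailFolded` and `vectorIterations` and the parameter `vf` are hypothetical; `vf` stands in for the vector length, which for scalable vectors is only known at run time. Each outer iteration models one vector iteration, and lanes past the trip count are masked off rather than handled by a separate scalar tail loop.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model: number of vector iterations needed when the tail is
// folded into the vector loop (ceiling division of trip count by VF).
std::size_t vectorIterations(std::size_t n, std::size_t vf) {
    if (vf == 0) vf = 1;  // guard against a zero vector length
    return (n + vf - 1) / vf;
}

// Hypothetical model of a tail-folded loop computing out[i] = a[i] + b[i].
// The inner loop plays the role of the vector lanes; `active` plays the role
// of the per-lane predicate, so no scalar remainder loop is required.
std::vector<int> addTailFolded(const std::vector<int>& a,
                               const std::vector<int>& b,
                               std::size_t vf) {
    if (vf == 0) vf = 1;  // guard against a zero vector length
    const std::size_t n = std::min(a.size(), b.size());  // trip count
    std::vector<int> out(n, 0);
    for (std::size_t i = 0; i < n; i += vf) {            // one "vector" iteration
        for (std::size_t lane = 0; lane < vf; ++lane) {
            const bool active = (i + lane) < n;          // lane predicate
            if (active) {
                out[i + lane] = a[i + lane] + b[i + lane];
            }
        }
    }
    return out;
}
```

With a trip count of 5 and `vf` of 4, this model runs two predicated iterations, the second with only one active lane. With a trip count of 3 and `vf` of 8, the whole loop fits in a single predicated iteration, which is the low trip count case the summary describes.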

Tests added here:

Transforms/LoopVectorize/AArch64/sve-low-trip-count.ll
Transforms/LoopVectorize/AArch64/sve-tail-folding-optsize.ll
Transforms/LoopVectorize/RISCV/low-trip-count.ll

Event Timeline

david-arm created this revision. Mar 14 2022, 4:44 AM
Herald added a project: Restricted Project. Mar 14 2022, 4:44 AM
david-arm requested review of this revision. Mar 14 2022, 4:44 AM
david-arm updated this revision to Diff 416123. Mar 17 2022, 3:36 AM
  • Added more CHECK lines to the new tests because they will be useful for a future patch.
sdesmalen added inline comments. Mar 17 2022, 3:59 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp, lines 5249–5250

I think we can remove the condition entirely, so that we consider using scalable vectors + tail folding when optimising for code size as well.

I know that for SVE we'll want to improve code quality to avoid the redundant compare, but when optimising for code size the user has decided that code size is more important than performance. The cost model will still have a say in which option is more beneficial (scalar, fixed or scalable), and may still choose a fixed-width VF if the ScalableVF is not legal for the loop.
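
The selection described in this comment can be pictured with a small sketch. This is not LLVM's cost model: the `Candidate` struct, the `Kind` enum, the `pickCheapestLegal` function and the cost numbers are all hypothetical, and only show the idea that an illegal candidate is skipped and the cheapest remaining one wins.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate kinds, mirroring "scalar, fixed or scalable".
enum class Kind { Scalar, Fixed, Scalable };

// Hypothetical candidate description: whether it is legal for the loop and
// an abstract cost (lower is better).
struct Candidate {
    Kind kind;
    bool legal;
    unsigned cost;
};

// Returns the index of the cheapest legal candidate, or -1 if none is legal.
// Ties keep the earliest candidate in the list.
int pickCheapestLegal(const std::vector<Candidate>& candidates) {
    int best = -1;
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        if (!candidates[i].legal) {
            continue;  // e.g. a scalable VF that is not legal for this loop
        }
        if (best < 0 || candidates[i].cost < candidates[best].cost) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

In this model, if the scalable candidate has the lowest cost but is marked illegal, the fixed-width candidate is chosen whenever it is cheaper than the scalar one, which matches the fallback the comment describes.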

Matt added a subscriber: Matt. Mar 17 2022, 5:48 PM
david-arm updated this revision to Diff 425526. Apr 27 2022, 8:20 AM
david-arm edited the summary of this revision.
  • Completely removed all restrictions on using tail-folding for scalable vectors.
  • Added test to show we apply tail-folding when compiling with -Os.
david-arm marked an inline comment as done. Apr 27 2022, 8:20 AM
sdesmalen accepted this revision. May 12 2022, 6:37 AM
This revision is now accepted and ready to land. May 12 2022, 6:37 AM
This revision was landed with ongoing or failed builds. May 16 2022, 1:14 AM
This revision was automatically updated to reflect the committed changes.