This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add target hook for preferPredicateOverEpilogue

Authored by david-arm on Jul 12 2022, 5:19 AM.



This patch adds the AArch64 hook for preferPredicateOverEpilogue,
which currently returns true if SVE is enabled and one of the
following conditions (non-exhaustive) is met:

  1. The "sve-tail-folding" option is set to "all", or
  2. The "sve-tail-folding" option is set to "all+noreductions"

and the loop does not contain reductions,

  1. The "sve-tail-folding" option is set to "all+norecurrences"

and the loop has no first-order recurrences.

Currently the default option is "disabled", but this will be
changed in a later patch.

I've added new tests to show the options behave as expected here:


Diff Detail

Event Timeline

david-arm created this revision.Jul 12 2022, 5:19 AM
david-arm requested review of this revision.Jul 12 2022, 5:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2022, 5:19 AM
Matt added a subscriber: Matt.Jul 12 2022, 8:23 PM
david-arm updated this revision to Diff 444226.Jul 13 2022, 5:01 AM
david-arm edited the summary of this revision. (Show Details)
david-arm added a reviewer: dmgreen.
  • Removed previous test changes.
  • Added a new option called "EnabledNoReductions" that only uses tail-predication if the loop does not contain reductions or first-order recurrences.
  • Enhanced the TTI hook to take a pointer to LoopVectorizationLegality, which is used to detect the presence of reductions and recurrences.

Assuming you agree with my suggestion I think it's worth breaking the LoopAccessInfo->LoopVectorizationLegality change into a separate NFC patch because it will make things easier if you miss the branch point and want to back port this change since you'll only be changing AArch64TargetTransformInfo.cpp.


This doesn't really scale. What happens when there's another reason to allow the user to control tail predication? You'll need to add EnabledNoBlob2, then EnabledNoReductionOrBlob2.....

Is it worth adding a custom parsing class (assuming something doesn't already exist) so that we can do -sve-tail-predication for the default and then allow users to add a comma separated list of <option>s to enable or no-<option> to disable, along with your existing disabled/enabled options.

I'm not asking you to add new reasons to disable, only to make it easier to add then if necessary. That said, with this approach you could split the reduction and first order recurrences.

By extension I'm suggesting TailPredication wants to be a bit field like enum with Disabled=0 and Enabled=AllOnes.

david-arm updated this revision to Diff 445485.Jul 18 2022, 7:23 AM
  • Renamed sve-tail-predication -> sve-tail-folding in line with other terminology used in tests and in the codebase.
  • Changed sve-tail-folding option into an option list so that we can fine-tune various behaviours, i.e. "enabled+noreductions" or "enabled+noreductions". This allows us to represent the different flavours of tail-folding in the form of a bitmask, which makes this more scalable.
david-arm edited the summary of this revision. (Show Details)Jul 18 2022, 7:23 AM
david-arm marked an inline comment as done.
david-arm added inline comments.

Good suggestion @paulwalker-arm - I've had a go at doing this!

paulwalker-arm added inline comments.Jul 19 2022, 6:17 AM

I think you want a couple more cases here:

  • default for whatever is used when no option is provided
  • all instead of enabled
  • simple for the styles which currently have no name. I'm not that bothered about the name used, simple was just the first thing that popped into my head and isn't particularly great.

My thinking is that if the user wants to enable a non-default case they shouldn't need to know what the default is. Likewise you should be able to be explicit when enabling a subset (i.e. you shouldn't need to guess at what needs to be disable and so can just start by disabling all and then adding the cases they care about.


Is it worth having a version of this written as -sve-tail-folding=disabled+simple+reductions+recurrences, where simple is just whatever you decide to call the set which have no name. I don't think we need all combinations but having one line which enables everything manually will ensure we've no holes or typos.

If you agree with adding a default option then we should test that as well because those are the check lines that will change as we support more cases.

david-arm updated this revision to Diff 445826.Jul 19 2022, 7:55 AM
david-arm marked an inline comment as done.
david-arm edited the summary of this revision. (Show Details)
  • Added/renamed sve-tail-folding options as per suggestions and added RUN lines to test.
david-arm marked 2 inline comments as done.Jul 19 2022, 7:56 AM
paulwalker-arm added inline comments.Jul 20 2022, 5:04 PM

Can this be add(TFSimple) so it's not position dependent?


I don't think this works now we have the expanded bitfield. I think you need logic like:

TailFoldingKind Required = 0;
if (LVL->getReductionVars().size())
if (LVL->getReductionVars().size())
if (!Required)

return TailFoldingKindLoc & Required
david-arm added inline comments.Jul 21 2022, 1:04 AM

Hmm, the existing code I have does work since the tests I added all pass so I don't believe there is a bug. With the existing version if the user didn't request reductions and the loop contains at least one reduction then !(TailFoldingKindLoc & TailFoldingKind::TFReductions) && LVL->getReductionVars().size() is true and so we return false. This is the same as your suggested code because Required != TailFoldingKindLoc in that case. We then return true at the end if either: a) the loop is 'simple', or b) the user has explicitly permitted tail-folding with reductions and/or recurrences.

Having said that, I'm happy to give your suggestion a try if you think it reads better!

paulwalker-arm added inline comments.Jul 21 2022, 2:27 AM

If TailFoldingKind::TFReductions is set and the loop contains no reduction then tail folding will be enabled regardless of whether TailFoldingKind::TFSimple is set. Which I think is wrong?

david-arm added inline comments.Jul 21 2022, 2:29 AM

Oh I see what you mean now. In which case this patch is missing a test. :)

david-arm updated this revision to Diff 446421.Jul 21 2022, 3:55 AM
  • Fixed bug in preferPredicateOverEpilogue and added new RUN line to existing test file.
david-arm marked 3 inline comments as done.Jul 21 2022, 3:55 AM
david-arm updated this revision to Diff 446453.Jul 21 2022, 5:46 AM
  • Fixed another issue with the logic in preferPredicateOverEpilogue for the case when a loop contains both reductions and recurrences.
paulwalker-arm accepted this revision.Jul 21 2022, 5:52 AM
This revision is now accepted and ready to land.Jul 21 2022, 5:52 AM
This revision was landed with ongoing or failed builds.Jul 21 2022, 9:20 AM
This revision was automatically updated to reflect the committed changes.