This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize][TTI] NFCI: Clarify enum for the tail folding style.

Authored by sdesmalen on Jan 30 2023, 5:22 AM.



This NFC (intended) patch has several small changes:

  • It renames PredicationStyle to TailFoldingStyle.
  • It renames TTI.emitActiveLaneMask() to TTI.getPreferredTailFoldingStyle()
  • Simplifies some of its uses in the LoopVectorizer

Rationale: To my surprise PredicationStyle::None did not mean 'no
predication', but rather 'no active lane mask intrinsic', such that the
predicate is created using a splat + compare with stepvector. The enum is
also highly specific to tail folding, so it seems better to name this
around that feature, i.e. 'tail folding style'.

This also makes it more amenable to extend it to other tail folding styles,
such as the one added in D142109.

Diff Detail

Event Timeline

sdesmalen created this revision.Jan 30 2023, 5:22 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 30 2023, 5:22 AM
sdesmalen requested review of this revision.Jan 30 2023, 5:22 AM
david-arm added a subscriber: paulwalker-arm.

This patch looks good and it saves us having two orthogonal states! I just had a few minor comments ...


Given @paulwalker-arm was involved in the naming scheme for the original enum ( it's probably worth adding him as a reviewer on the patch.


I'm not sure this statement is quite true. I think the tripcount will overflow, but will overflow to 0 for powers of 2. The induction variable will also overflow to 0 and so the comparison used to branch back is safe, i.e. = add i64 %iv, %vl
%keep_going = icmp ne, %rounded.up.trip.count
br i1 %keep_going, %vector.body, %middle.block

I am not sure if this was by design, or that it's just pure luck that this works. :)


Hi @sdesmalen, it seems a little odd to have an accessor method foldTailByMasking that doesn't return the underlying variable of the same name FoldTailByMasking. It's more an observation really - I just wondered if the function should be named differently to the variable?


Can this be more simply written as

return getTailFoldingStyle() == TailFoldingStyle::DataAndControlFlow



Previously this was setting HasNUW=true when tail-folding without the active lane mask, whereas now in addCanonicalIVRecipes we only set to true if not tail-folding at all. I'm not sure if this matters? Certainly no tests seem to break. :)

sdesmalen added inline comments.Jan 31 2023, 3:32 AM

Correct, the tripcount could overflow but then so would the up-rounded tripcount. When the VL is a power-of-2 then both values would overflow to '0' and because it tests for inequality, the loop would break when both values are '0' so it just works without the runtime check. I'll update the comment!


Fair point, the meaning of FoldTailByMasking is derived from LoopVectorizationLegality::prepareToFoldTailByMasking which checks whether the loop can use tail folding, so perhaps we can rename it to LoopCanFoldTailByMasking ?


Having this property based on the tail folding style seemed to be right thing to do and none of the tests break, so I suspect this was an omission in the previous implementation. Or it is never tested, but it seems odd to do something different in the two code paths to this function.

sdesmalen updated this revision to Diff 493577.Jan 31 2023, 5:37 AM
  • Renamed LoopVectorizationCostModel::FoldTailByMasking to LoopVectorizationCostModel::CanFoldTailByMasking.
  • Fixed description of TailFoldingStyle::Data.
  • Simplified LoopVectorizationCostModel::useActiveLaneMaskForControlFlow()
david-arm accepted this revision.Feb 1 2023, 1:57 AM

LGTM! Thanks for making the changes. :)

This revision is now accepted and ready to land.Feb 1 2023, 1:57 AM