Page MenuHomePhabricator

[LV] Account for minimum vscale when rejecting scalable vectorization of short loops

Authored by reames on Nov 2 2022, 1:17 PM.



The vectorizer has code to reject scalable vectorization of loops with very short trip counts, and instead use fixed length vectors. The current code doesn't account for the minimum vscale value known, and thus under estimates the number of lanes in the scalable type for RISCV's default configuration. This results in use of predication and a trivially dead loop where a single straight line piece of code would suffice.

Note that the code quality of the original scalable vectorization could (and probably should) be improved other ways as well. This patch is solely about whether the scalable vectorization was the right choice to begin with.

This bit of code - both with and without my change - does make the unchecked assumption that the target knows how to lower fixed length vectors whose length is provably less than the vector length.

Diff Detail

Event Timeline

reames created this revision.Nov 2 2022, 1:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 2 2022, 1:17 PM
reames requested review of this revision.Nov 2 2022, 1:17 PM
david-arm added inline comments.Dec 6 2022, 6:49 AM

nit: I think you can combine these two if statements into one, i.e.

if (MaxVectorElementCount.isScalable() && TheFunction->hasFnAttribute(Attribute::VScaleRange)) {
  auto Attr = TheFunction->getFnAttribute(Attribute::VScaleRange);
  auto Min = Attr.getVScaleRangeMin();
  WidestRegisterMinEC *= Min;

I think this comment is no longer valid.


Perhaps it's better to have two tests now - one with min vscale = 4, and one with min vscale = 1, so that you can demonstrate the expected behaviour of using tail-folded loops for the latter as well?

reames updated this revision to Diff 480921.Dec 7 2022, 8:10 AM

Addressed two review comments, will reply to third in a moment.

reames updated this revision to Diff 480927.Dec 7 2022, 8:17 AM

Address third review comment.

In the process of writing my comment explaining why I didn't understand what was meant, I realized I did understand what was meant. :)

david-arm accepted this revision.Dec 7 2022, 8:58 AM



Ah ok I hadn't realised this is incompatible. I mainly suggested min=1 as a way to still show the tail-folding behaviour for low trip counts, but I guess I could also have suggested raising the known trip count to a higher value. Anyway, thank you!

This revision is now accepted and ready to land.Dec 7 2022, 8:58 AM