This is an archive of the discontinued LLVM Phabricator instance.

LoopVectorize: introduce RecurKind::Induction(I|F)(Max|Min)
AbandonedPublic

Authored by artagnon on Jun 12 2023, 3:44 AM.

Details

Summary

LoopVectorize's SelectCmp pattern suffers from the deficiency that it
cannot handle non-invariant statements. In order to support
vectorization of the following example,

int src[n] = {4, 5, 2};
int r = 331;
for (int i = 0; i < n; i++) {
  if (src[i] > 3)
    r = i;
}
return r;

introduce RecurKind::InductionIMax, RecurKind::InductionIMin,
RecurKind::InductionFMax, and RecurKind::InductionFMin. We currently
only support assignments to the induction variable; in particular, we do
not support the following case:

int src[n] = {4, 5, 2};
int r = 331;
for (int i = 0; i < n; i++) {
  if (src[i] > 3)
    r = src[i];
}
return r;

CodeGen'ing for our original example involves checking the SCEV AddRec
expression (the min/max is inverted if the assignment is r = -i, in
place of r = i). Indeed, once we determine whether it's a min or max
reduction, CodeGen'ing involves the following:

  1. Create a Splat with the int_min/int_max values, depending on the RecurKind.
  2. The Src is filled with values: {0, 1, 2, 3, 4, ...}.
  3. The Right is filled with values: {0, 1, 331, 331, 331, ...}.
  4. The CmpVector is filled with values: {1, 1, 0, 0, 0, ...}.
  5. Select using this CmpVector between Src and the Splat with int_min/int_max values.
  6. Use a max-reduce/min-reduce on the result of the Select.
  7. Use the exising Cmp which determines whether or not any assignment took place in the loop to select between the result of the max-reduce/min-reduce, and the initial value (InitVal).

Hence, the original example is vectorized.

Diff Detail

Event Timeline

artagnon created this revision.Jun 12 2023, 3:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 12 2023, 3:44 AM
artagnon requested review of this revision.Jun 12 2023, 3:44 AM
fhahn added a subscriber: Mel-Chen.Jun 14 2023, 6:25 AM

It looks like this addresses a similar issue as D150851 by @Mel-Chen? It would be good to iterate on D150851 first and then extend it to handle additional cases from here

I'm glad that others have also noticed this, especially the implementation of the select-cmp pattern for decreasing induction variables. The select-cmp pattern for decreasing induction variables is a function that I haven't implemented yet. Have you come across any real applications or benchmarks that make use of it.

llvm/include/llvm/Analysis/IVDescriptors.h
55–62

I can understand the literal meaning of "the induction variable to be maximized/minimized," but perhaps there is a better way to describe this pattern, such as "monotonic increasing" and "monotonic decreasing."

llvm/lib/Analysis/IVDescriptors.cpp
741

I recommend using isKnownNegative.

bool 	isKnownNegative (const SCEV *S)
 	Test if the given expression is known to be negative.
1021–1026

Actually, I've had an idea: for this types of RecurKind, we should only need to do AddReductionVar once.

1174–1177

Based on my understanding, in a broader sense, ternary operations do not have an identity. Although I used the term "identity" for convenience in implementation to refer to the sentinel value, I have been considering whether to separate the concept of the sentinel value from identity.

llvm/lib/Transforms/Utils/LoopUtils.cpp
1094

Based on your implementation, Right would not be what you stated as {0, 1, 331, 331, 331, ...}, but rather {331, 331, 331, 331, 331, ...}.

1112

The behavior of isSigned is different from what you have in mind. (Indeed, the name is really misleading.)

bool 	isSigned () const
 	Returns true if all source operands of the recurrence are SExtInsts.
llvm/test/Transforms/LoopVectorize/select-min-index.ll
2–3

Why skip interleave?

I'm glad that others have also noticed this, especially the implementation of the select-cmp pattern for decreasing induction variables. The select-cmp pattern for decreasing induction variables is a function that I haven't implemented yet. Have you come across any real applications or benchmarks that make use of it.

I implemented this patch based on a TSVC run: gcc-aarch64 vectorizes increasing and decreasing induction variables, while llvm-riscv does not.

Thanks for your review comments: I will keep them in mind when developing a follow-up patch, when your patch lands.

llvm/lib/Analysis/IVDescriptors.cpp
1021–1026

Yes, your approach of extending isSelectCmpPattern saves us these.

llvm/lib/Transforms/Utils/LoopUtils.cpp
1094

Ah yes, my error.

1112

Oh, ouch.

Hi Ram,

To fix the issue mentioned in https://reviews.llvm.org/D150851#4480312,
is it possible to introduce a vector to determine the array has element equal to the initial value or not?
So the vector could be used to adjust the result if the result comes from max reduction and it smaller than the initial value.

Something like:

    Init_val = 331;

Loop_header:
    AnyInitVal = {0, 0, 0, 0}
 
Loop_body:
    ...
    last_AnyInitVal = PHI (AnyInitVal, next_AnyInitVal), 
    current_AnyInitVal = veq  current_array_vec, Init_val   // if current_array_vec is (1, 2, 331, 331) then current_AnyInitVal = (0, 0, 1, 1)
    next_AnyInitVal = vor  current_AnyInitVal, last_AnyInitVal;
    ...
Loop Exit:

    if (result_come_from_max_reduction)
       if ((max_reduction < InitVal) && (next_AnyInitVal has non-zero element))
          return InitVal;
       else return max_reduction;
artagnon abandoned this revision.Aug 29 2023, 2:35 AM

Subsumed by https://reviews.llvm.org/D150851 and follow-ups.