This is an archive of the discontinued LLVM Phabricator instance.

LoopVectorize: extend SelectCmp pattern for non-invariants
AbandonedPublic

Authored by artagnon on Jun 2 2023, 4:24 AM.

Details

Summary

Extend LoopVectorize's SelectCmp pattern for non-invariant statements,
and get LoopUtils' createCmpSelectTargetReduction to CodeGen for this
case. Consider the following example:

int src[n] = {4, 5, 2};
int r = 331;
for (int i = 0; i < n; i++) {
  if (src[i] > 3)
    r = i;
}
return r;

Here, r = i is non-invariant, and CodeGen'ing for this case involves the
following:

  1. Create a Splat with the int_min/int_max values, depending on the loop direction.
  2. The Src is filled with values: {0, 1, 2, 3, 4, ...}.
  3. The Right is filled with values: {0, 1, 331, 331, 331, ...}.
  4. The CmpVector is filled with values: {1, 1, 0, 0, 0, ...}.
  5. Select using this CmpVector between Src and the Splat with int_min/int_max values.
  6. Use a max-reduce/min-reduce on the result of the Select.
  7. Use the exising Cmp which determines whether or not any assignment took place in the loop to select between the result of the max-reduce/min-reduce, and the initial value (InitVal).

Hence, the above case is vectorized.

Diff Detail

Event Timeline

artagnon created this revision.Jun 2 2023, 4:24 AM
artagnon requested review of this revision.Jun 2 2023, 4:24 AM
artagnon updated this revision to Diff 528757.Jun 6 2023, 2:42 AM

More comments in LoopUtils.cpp.

shiva0217 added inline comments.Jun 6 2023, 2:58 AM
llvm/lib/Analysis/IVDescriptors.cpp
678

For the following case, the final value might not be able to pick by the min/max reduction.

int test(int a[32000]) {
   int k = -1;
   for (int i = 0; i < 32000; i++)
     if (a[i] < 0)
        k = a[i];
 
     return k;
 }

Extra checking of NonPhi might be needed. So the reduction can generate min/max.

auto *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(NonPhi));
if (AR) {
  const SCEV *Step = AR->getStepRecurrence(*SE);
  const SCEVConstant *ConstStep = dyn_cast<SCEVConstant>(Step);
  if (!ConstStep && !SE->isLoopInvariant(Step, Loop))
    return InstDesc(false, I);
} else
  return InstDesc(false, I);

Would it need to check AR->hasNoUnsignedWrap?

artagnon updated this revision to Diff 528823.Jun 6 2023, 6:10 AM

Address @shiva0217's comment, and fix miscompile for the following case:

int a[n] = {0, 0, 4, 3};
int k = a[0];
for (int i = 0; i < n; i++)
  if (a[i] > 2)
    k = a[i];
return k;

Here, instead of i being maximized, a[i] is maximized. To forbid this
case, let us check that the SCEV AddRec expression has a constant step,
and not vectorize this case at all.

artagnon marked an inline comment as done.Jun 6 2023, 6:18 AM
artagnon abandoned this revision.Jun 7 2023, 2:40 AM

Will redo this patch and re-open a new review request.