This is an archive of the discontinued LLVM Phabricator instance.

[Loop Vectorizer] Abandon vectorization when no integer IV found
ClosedPublic

Authored by wristow on Sep 20 2018, 3:32 PM.

Details

Summary

Support for vectorizing loops with secondary floating-point induction variables was added in r276554. A primary integer IV is still required for vectorization to be done. If an FP IV was found, but no integer IV was found at all (primary or secondary), the attempt to vectorize still went forward, causing a compiler-crash.

This fixes PR38800.

Diff Detail

Repository
rL LLVM

Event Timeline

wristow created this revision.Sep 20 2018, 3:32 PM
hsaito added a subscriber: hsaito.Sep 21 2018, 11:55 AM

Sorry, I totally forgot about my old patch https://reviews.llvm.org/D47216.

  1. I like your two different message version better than changing Line 788 condition to if (!WidestIndTy). That's good.
  2. Please add non-NULL assertion after "Type *IdxTy = Legal->getWidestInductionType();" in InnerLoopVectorizer::getOrCreateTripCount().
  3. Please include LIT test from D47216.

Thanks,
Hideki

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Is there a practical need (i.e., beyond academic interest) to vectorize such code? Examples?

Sorry, I totally forgot about my old patch https://reviews.llvm.org/D47216.

  1. I like your two different message version better than changing Line 788 condition to if (!WidestIndTy). That's good.
  2. Please add non-NULL assertion after "Type *IdxTy = Legal->getWidestInductionType();" in InnerLoopVectorizer::getOrCreateTripCount().
  3. Please include LIT test from D47216.

Thanks Hideki! I'll add the non-NULL assertion for idxTy. Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Makes sense to me, but I'm not experienced in working on the vectorizer, so I'm hesitant to jump into that. And I'd view it as a separate patch to add that capability. As Hideki said in D47216:

Whether it's legal to convert FP primary induction to INT primary induction and if so under what conditions are debatable, but bailing
out when it's not proven safe (and currently never proven to be safe as far as LV's existing code is concerned) is a valid thing to do.

I don't know how difficult it is to extend it this way, but given that there are explicit expectations that the primary induction is of an integer type, such as:

assert((IV->getType()->isIntegerTy() || IV != OldInduction) &&
       "Primary induction variable must have an integer type");

it doesn't feel like it's a small tweak. (But again, I'm not experienced in this area.)
In short, tackling that seems like a fine idea, but a separate task.

anna added a subscriber: anna.Sep 21 2018, 1:41 PM

Why do we need an integer induction variable? If one doesn't exist, it should be straightforward to create one.

Is there a practical need (i.e., beyond academic interest) to vectorize such code? Examples?

I think that's a separate question, i.e. improving the vectorizer without depending on real world benchmarks is a good thing (especially if it's not a fundamental change). So, it looks like the LV is currently limited in vectorizing FP loops where we don't have an integer IV. It's worthwhile to see whether adding the integer IV and related cost model changes allows us to vectorize such loops on some targets? Maybe the cost model will prove that even though vectorizing is possible, it may not be beneficial. Disclaimer: This is speculation, I have not done any analysis here.

That stated, I believe fixing the PR and the crash is a good change (and orthogonal to the question of adding integer IV).

If there's some actual implementation work involved in supporting it, fine, I guess... but it seems like it can't be much work to support properly: even just adding an unused i1 induction variable is enough to get around the crash for the given testcase.

If there's some actual implementation work involved in supporting it, fine, I guess... but it seems like it can't be much work to support properly: even just adding an unused i1 induction variable is enough to get around the crash for the given testcase.

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly? The real question is: can we compute the trip count for these loops? I'd guess that we need to do that too.

wristow updated this revision to Diff 166560.Sep 21 2018, 2:38 PM

As the discussion on the possibility of creating an integer IV goes on, I'm updating the patch to include the non-NULL assertion for IdxTy.

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly?

The i1 variable doesn't actually get used for anything; the vectorizer makes a new induction variable to track the iteration count.

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.

Interesting, although given that the loop runs for more than one iteration, would that i1 variable work correctly?

The i1 variable doesn't actually get used for anything; the vectorizer makes a new induction variable to track the iteration count.

Ah, okay. Interesting. I certainly see your point - if it is never used for anything, then we don't need to require it.

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.

Sure. That will do it.

can we compute the trip count for these loops?

SCEV can compute the trip count for this testcase, yes. See ScalarEvolution::computeExitCountExhaustively.

Sure. That will do it.

Still doesn't move a needle on my priority list. We might be able to do this for fast-math mode, but that means loop trip count may be different depending on the FP optimizations performed. Not a very good program to begin with. Vectorizer has a long list of TODOs for optimizing practically-useful well-written code.

In any case, I think the proposed fix (or some other mechanism) is still needed since I don't think SCEV can compute trip count in all cases of FP primary inductions.

Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

I say yes and no. One induction versus two inductions. Given that there were none before, it's nice to have more than one, especially so if another one is handily available. I don't insist, though.

hsaito accepted this revision.Sep 21 2018, 3:25 PM

LGTM. We can continue discussing about not bailing out for the subset of cases, but we don't have to let the compiler crash while we do that.

This revision is now accepted and ready to land.Sep 21 2018, 3:25 PM

Regarding the LIT test from D47216, isn't that essentially the same as the LIT test I have here?

I say yes and no. One induction versus two inductions. Given that there were none before, it's nice to have more than one, especially so if another one is handily available. I don't insist, though.

Makes sense. I'll add that second test.

LGTM. We can continue discussing about not bailing out for the subset of cases, but we don't have to let the compiler crash while we do that.

Thanks! I'll commit after adding that test.

wristow updated this revision to Diff 166569.Sep 21 2018, 3:49 PM

Added test from D47216.

This revision was automatically updated to reflect the committed changes.