Page MenuHomePhabricator

[LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions.
ClosedPublic

Authored by congh on Dec 2 2015, 10:10 PM.

Details

Summary

LoopVectorizationCostModel::calculateRegisterUsage() is used to estimate the register usage for specific VFs. However, it takes into account many instructions that won't be vectorized, such as induction variables, GetElementPtr instruction, etc.. This makes the loop vectorizer too conservative when choosing VF. In this patch, the induction variables that won't be vectorized plus GetElementPtr instruction will be added to ValuesToIgnore set so that their register usage won't be considered any more.

Diff Detail

Repository
rL LLVM

Event Timeline

congh updated this revision to Diff 41718.Dec 2 2015, 10:10 PM
congh retitled this revision from to [LoopVectorizer] Refine loop vectorizer's register usage calculator by ignoring specific instructions..
congh updated this object.
congh added reviewers: hfinkel, spatel, RKSimon, davidxl.
congh added a subscriber: llvm-commits.
jmolloy requested changes to this revision.Dec 3 2015, 1:11 AM
jmolloy added a reviewer: jmolloy.
jmolloy added a subscriber: jmolloy.

Hi Cong,

Thanks for doing this. I have a bunch of comments below.

Cheers,

James

lib/Transforms/Vectorize/LoopVectorize.cpp
1526 ↗(On Diff #41718)

Everything else has a (pointless) comment - let's add a (pointless) comment here for consistency's sake.

5174 ↗(On Diff #41718)

Why this change?

5655 ↗(On Diff #41718)

This is not the right way to determine which user (if any) updates the PHI. In fact, it could be a chain of users, and they could be in any order.

The right way is to check the incoming value of the induction PHI from the loop latch.

5667 ↗(On Diff #41718)

Redefining this variable is confusing - it should be given a unique name.

5680 ↗(On Diff #41718)

I can't help but feel this is all a bit long-winded. Something like this should work and is a lot shorter (if you don't suffer from STL allergies):

for (auto &Induction : *Legal->getInductionVars()) {
  auto *PN = KV.first;
  auto *UpdateV = PN->getIncomingValueForBlock(L->getLoopLatch());

  // Check that the PHI is only used by the induction increment (UpdateV) or
  // by GEPs.
  // Then, check that UpdateV is only used by a compare instruction or the loop
  // header PHI.
  if (std::all_of(PN->user_begin(), PN->user_end(), [&](const User *U) {
        return U == UpdateV || isa<GetElementPtrInst>(U);
      }) &&
      std::all_of(UpdateV->user_begin(), UpdateV->user_end(), [&](const User *U) {
        return U == PN || isa<ICmpInst>(U);
      })) {
    ValuesToIgnore.insert(PN);
    ValuesToIgnore.insert(UpdateV);
  }
}
5687 ↗(On Diff #41718)

Surely this isn't right - GEPs can be vectorized if they have pointer type, right? as in scatter gather?

test/Transforms/LoopVectorize/X86/vector_max_bandwidth.ll
19 ↗(On Diff #41718)

I don't understand why your changes should affect the VF being selected. It should affect the UF only, right?

This revision now requires changes to proceed.Dec 3 2015, 1:11 AM
congh added a comment.Dec 3 2015, 2:57 PM

Thank you very much for the review, James! Please check my inline replies.

lib/Transforms/Vectorize/LoopVectorize.cpp
1526 ↗(On Diff #41718)

OK.

5174 ↗(On Diff #41718)

This is actually fixing a previous bug: we need to update OpenIntervals by removing instructions that end at the current location, even when we ignore values in ValuesToIgnore. Those instructions to remove won't be considered later.

5680 ↗(On Diff #41718)

Thank you very much for the suggested code. I love STL. I have applied your code in this patch.

5687 ↗(On Diff #41718)

I have added the check and now we only ignore this instruction if its last operand is an induction var. Do you have better implementation suggestion here?

BTW, does LLVM support scatter/gather now? It seems that vectorizeBlockInLoop doesn't vectorize this operation.

test/Transforms/LoopVectorize/X86/vector_max_bandwidth.ll
19 ↗(On Diff #41718)

With the patch http://reviews.llvm.org/D8943, VF is also determined by register usage when -vectorizer-maximize-bandwidth is turned on (as is in this test file).

congh updated this revision to Diff 41809.Dec 3 2015, 2:58 PM
congh edited edge metadata.

Update the patch according to James's comments.

RKSimon resigned from this revision.Dec 9 2015, 6:52 AM
RKSimon removed a reviewer: RKSimon.

Sorry but I don't know enough about the vectorizers to review.

hfinkel added inline comments.Dec 10 2015, 2:46 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
5634 ↗(On Diff #41809)

You need to have a different behavior here for VF == 1 vs. VF > 1. The VF == 1 case is important because it is used to control the interleave factor (calculateRegisterUsage is called from LoopVectorizationCostModel::selectInterleaveCount). For the VF == 1 case, a number of these exclusions don't apply.

5647 ↗(On Diff #41809)

Aren't these also used by add or sub by a constant?

5675 ↗(On Diff #41809)

Typo: inductoin

5678 ↗(On Diff #41809)

dyn_cast -> cast (it is known not to fail here)

5680 ↗(On Diff #41809)

I suspect you want to use getGEPInductionOperand (from include/llvm/Analysis/VectorUtils.h) to pick the operand index so that you ignore trailing zero operands.

congh updated this revision to Diff 42593.Dec 11 2015, 3:35 PM
congh edited edge metadata.
congh marked 2 inline comments as done.

Update the patch based on Hal's comments.

congh added inline comments.Dec 11 2015, 3:35 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
5634 ↗(On Diff #41809)

Good catch! I have created a new set to store values to ignore when VF > 1 and only add my newly added values to ignore to this new set. This set is then used to calculate register usages for VF > 1.

5647 ↗(On Diff #41809)

Yes, induction variables can participate in arithmetic operations. But I still have to follow the def-use chain and check if the value that is defined directly/indirectly is stored (it seems that the reduction with an induction variable cannot be vectorized now). Do you have any idea except doing it in a DFS manner?

5680 ↗(On Diff #41809)

I think getGEPInductionOperand is what I want here. I have updated this part. Thanks!

hfinkel added inline comments.Dec 12 2015, 4:11 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
5645 ↗(On Diff #42593)

Shouldn't this be VecValuesToIgnore? These casts are not vectorized, but they are scalar instructions that maintain a live value in a register.

(Is this the only contributor to ValuesToIgnore?)

congh added inline comments.Dec 12 2015, 4:59 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
5645 ↗(On Diff #42593)

OK. This is actually from previous code that I didn't change, which is used in all cases no matter what VF is. Updated.

congh updated this revision to Diff 42645.Dec 12 2015, 4:59 PM

Update the patch according to Hal's comment.

congh added inline comments.Dec 12 2015, 5:00 PM
lib/Transforms/Vectorize/LoopVectorize.cpp
5645 ↗(On Diff #42645)

Now the only contributor are ephemeral values that are added above.

hfinkel accepted this revision.Dec 12 2015, 5:17 PM
hfinkel edited edge metadata.

LGTM.

lib/Transforms/Vectorize/LoopVectorize.cpp
5645 ↗(On Diff #42645)

That makes sense.

This revision was automatically updated to reflect the committed changes.