This is an archive of the discontinued LLVM Phabricator instance.

[LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions
ClosedPublic

Authored by mssimpso on Jul 11 2016, 2:41 PM.

Details

Reviewers
anemet
Summary

This patch prevents increases in the number of instructions, pre-instcombine, due to induction variable scalarization. An increase in instructions can lead to an increase in the compile-time required to simplify the induction variables. We now maintain a new map for scalarized induction variables to prevent us from converting between the scalar and vector forms.

Diff Detail

Event Timeline

mssimpso updated this revision to Diff 63576.Jul 11 2016, 2:41 PM
mssimpso retitled this revision from to [LV] Avoid unnecessary IV scalar-to-vector-to-scalar conversions.
mssimpso updated this object.
mssimpso added a reviewer: anemet.
mssimpso added subscribers: cmatthews, llvm-commits.
anemet edited edge metadata.Jul 11 2016, 3:33 PM

Hi Matt,

Looks basically good.

A bit more high-level info about the approach would be nice next time. I think I understand that you are using the new Map in the scalarizeInst.

You're missing tests.

Adam

lib/Transforms/Vectorize/LoopVectorize.cpp
604–607

Please explain the relation to WidenMap. It might also be a good idea to move it closer there.

mssimpso updated this revision to Diff 63687.Jul 12 2016, 8:55 AM
mssimpso edited edge metadata.
mssimpso marked an inline comment as done.

Addressed Adam's comments.

  • Renamed ScalarMap to ScalarIVMap since we're currently only using it for induction variables (though it's possible we may want to reuse this in the future to further limit the scalar/vector conversions for other cases).
  • Updated ScalarIVMap comment and moved declaration.
  • Added comment in vectorizeMemoryInstruction
  • Added some pre-instcombine tests.
  • Verified compile-time improvement. I see significant speedups in >25 tests from the test-suite with no slowdowns.

Adam,

Thanks for the feedback, and sorry for the lack of explanation. Yes, you understand correctly; here's a quick overview.

When we build the steps for the induction variables we want to be scalar, instead of inserting them into a vector to be maintained in WidenMap, we now leave them as-is and store the steps in ScalarIVMap. Then, whenever we need to scalarize a use of an induction variable, instead of accessing WidenMap and creating an extractelement instruction, we just grab the appropriate value from ScalarIVMap. The relevant extracts are in two places: one in scalarizeInstruction, and one in vectorizeMemoryInstruction.

The approach cuts the number of instructions, pre-instcombine, related to induction variable scalarization by 2/3 since we eliminate, for each step, an insertelement and an extractelement instruction. Compared to always vectorizing the induction variables, scalarizing now doesn't increase the number of instructions. This is because where we previously would have had an extract from the induction variable vector, we now have a scalar step computation. I hope that makes sense!

Thanks for the changes, just one more thing. For absolute paranoia, can you please also add a test with interleaving>1 (no instcombine)?

lib/Transforms/Vectorize/LoopVectorize.cpp
2475

This sounds a bit confusing. How about a comment talking about this IV having a scalarized version, like what you say in the next hunk?

mssimpso updated this revision to Diff 63863.Jul 13 2016, 2:54 PM
mssimpso marked an inline comment as done.

Addressed Adam's comments.

  • Updated comment about GepOperand
  • Added tests with interleave > 1 and no instcombine

Thanks!

anemet accepted this revision.Jul 13 2016, 2:56 PM
anemet edited edge metadata.

LGTM!

This revision is now accepted and ready to land.Jul 13 2016, 2:56 PM
mssimpso closed this revision.Jul 14 2016, 7:47 AM

Committed in rL275419. Thanks, Adam!