This is an archive of the discontinued LLVM Phabricator instance.

[LSV] Look through selects for consecutive addresses
ClosedPublic

Authored by rtereshin on Jul 17 2018, 9:22 AM.

Details

Summary

In some cases LSV sees (load/store _ (select _ <pointer expression> <pointer expression>))
patterns in input IR, often due to sinking and other forms of CFG simplification, sometimes
interspersed with bitcasts and all-constant-indices GEPs. With this patch`areConsecutivePointers`
method would attempt to handle select instructions. This leads to an increased number of
successful vectorizations.

Technically, select instructions could appear in index arithmetic as well, however, we don't see
those in our test suites / benchmarks. Also, there is a lot more freedom in IR shapes computing
integral indices in general than in what's common in pointer computations, and it appears that
it's quite unreliable to do anything short of making select instructions first class citizens of
Scalar Evolution, which for the purposes of this patch is most definitely an overkill.

Diff Detail

Repository
rL LLVM

Event Timeline

rtereshin created this revision.Jul 17 2018, 9:22 AM
This revision is now accepted and ready to land.Jul 17 2018, 3:18 PM
arsenm added inline comments.Jul 19 2018, 5:22 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
140 ↗(On Diff #155897)

Why is this an enum?

440–441 ↗(On Diff #155897)

Braces

arsenm added inline comments.Jul 19 2018, 5:40 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
315 ↗(On Diff #155897)

Const method?

674–679 ↗(On Diff #155897)

I'm a bit confused by this ChainID concept. Why is the select condition used for this and not the select itself?

rtereshin added inline comments.Jul 19 2018, 10:39 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
140 ↗(On Diff #155897)

Sure, will change that to static const or static constexpr. Which one would you prefer?

315 ↗(On Diff #155897)

Maybe, I never checked. Most likely not, it calls methods on ScalarEvolution &SE, which is part of the object, and those methods change the SCEVs' cache that is owned by SE, so most likely they aren't const methods themselves.

If it is a const method though, will do.

440–441 ↗(On Diff #155897)

Where exactly would you like braces? First level if, second level if, or both?

I believe no curly braces around 1 statement bodies of if/then/else/for/while/do-while statements is the standard LLVM code style.

674–679 ↗(On Diff #155897)

I'm a bit confused by this ChainID concept.

ChainID is an arbitrary token that is allowed to be different only for the accesses that are guaranteed to be non-consecutive. It's used (and it's predecessor was used) for grouping instructions together and reducing the number of instructions the main search operates on at a time, i.e. this is to reduce compile time and nothing else as the main search has (and had) O(n^2) time complexity.

The exact way of determining accesses that are guaranteed to be non-consecutive does not affect the rest of the pass as soon as it doesn't return different ChainIDs for consecutive accesses. I played with a few different ones and I expect it to change in the future. All one needs to do to change it is to change the type definition and the getChainID function implementation. In particular, returning the same ChainID for every Ptr works too, but increases compile time.

Of course, "guaranteed to be non-consecutive" above means "guaranteed to be classified as non-consecutive by isConsecutiveAccess method".

Why is the select condition used for this and not the select itself?

The select's themselves are distinct instructions even if they share the same condition and evaluate to consecutive pointers for true and false values of the condition. Therefore using the select's themselves for grouping instructions would put consecutive accesses into different lists and they wouldn't be even checked for being consecutive, and won't be vectorized. Which defies the purpose of this patch.

rtereshin added inline comments.Jul 19 2018, 11:56 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
140 ↗(On Diff #155897)

Going with static const for now.

315 ↗(On Diff #155897)

Could be done, good catch, doing that.

440–441 ↗(On Diff #155897)

Going with braces around second level if for now.

  1. Turned methods touched to const methods
  2. Added curly braces
  3. Turned single-value enum into static const

@arsenm

Hi Matt,

Do you think this is good to go now?

Thanks,
Roman

arsenm added inline comments.Jul 24 2018, 11:32 AM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
440–441 ↗(On Diff #155897)

I would put both, as both are surrounding multi-line things

674–679 ↗(On Diff #155897)

This should be in a comment somewhere

rtereshin updated this revision to Diff 157153.Jul 24 2018, 4:00 PM
  1. Added comments explaining partitioning of memory access instructions into groups of candidates for better performance
  2. Added curly braces
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
440–441 ↗(On Diff #155897)

Done.

674–679 ↗(On Diff #155897)

Done.

Hi Roman,

I added a few comments. Could you take a look?

lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
147 ↗(On Diff #157153)

Replace RecursionLimit with MaxDepth maybe?

323 ↗(On Diff #157153)

Replace MaxRecurse with Depth maybe?

360 ↗(On Diff #157153)

Missing --MaxRecurse?

368 ↗(On Diff #157153)

You can check if isa<SelectInst>(A) && isa<SelectInst>(B) here.

454 ↗(On Diff #157153)

You can replace Value * with SelectInst & and remove the dyn_casts below.

rtereshin added inline comments.Jul 25 2018, 2:04 PM
lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
323 ↗(On Diff #157153)

ok

360 ↗(On Diff #157153)

no, lookThroughComplexAddresses does not make recursive calls directly, lookThroughSelects does later.

368 ↗(On Diff #157153)

I can, but it gets messier.

454 ↗(On Diff #157153)

I can, but it gets messier on the caller's side.

rtereshin updated this revision to Diff 157360.Jul 25 2018, 2:21 PM
  1. Renamed MaxRecurse to Depth as requested and updated the logic around it as MaxRecurse was going from the positive limit down to 0 which makes no sense for a parameter called Depth, so Depth is going from 0 to MaxDepth now.
  2. Renamed RecursionLimit to MaxDepth to make the connection with Depth parameter clearer (which was MaxRecurse previously).
volkan accepted this revision.Jul 25 2018, 2:26 PM

LGTM

LGTM

Thanks!

@arsenm Hopefully it's fine to move forward with this one at this point.

Roman

This revision was automatically updated to reflect the committed changes.