Page MenuHomePhabricator

[LV] Support gaps, overlaps, and inexact sizes in speculation logic
Needs ReviewPublic

Authored by reames on Sep 12 2019, 11:28 AM.

Details

Reviewers
Ayal
hsaito
Summary

Implement last set of TODOs from rL371452. Three cases:

  1. A loop which loads N out of every M bytes. (i.e. there are gaps)
  2. A loop which loads N bytes every M bytes where N > M. (i.e. the loads overlap, and the alignment must be less than natural alignment)
  3. An exact constant trip count is not known, but an upper bound is and the entire region is dereferenceable.

The SCEV change is needed for (3) above. I'm going to separate that into it's own review (with it's own tests) and then rebase once that's landed.

Diff Detail

Event Timeline

reames created this revision.Sep 12 2019, 11:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2019, 11:28 AM
reames marked an inline comment as done.Sep 12 2019, 12:29 PM
reames added inline comments.
lib/Analysis/ScalarEvolution.cpp
5599 ↗(On Diff #219955)

This SCEV change is under review here: https://reviews.llvm.org/D67514

Once landed, I will rebase.

Ayal added inline comments.Sep 12 2019, 2:13 PM
lib/Analysis/Loads.cpp
228

Note that StepC may be negative; LV can vectorize accesses such as A[N-i], and also group together accesses with negative non-unit stride into "reversed" interleave groups.

230–238

try [the] an

232

then > than

246

Note that interleave groups (of loads with positive step currently) with gaps at the end may benefit from checking dereferenceability across AccessSize of TC * StepC. But that warrants a separate patch.

256

tha[n]

reames marked an inline comment as done.Sep 12 2019, 2:43 PM
reames added inline comments.
lib/Analysis/Loads.cpp
246

I don't understand the case you're describing here. What do you mean by a "gap at the end"?

p.s. Is there a definition somewhere in code of what an interleave group is? At the moment, I'm assuming it's an access pattern with periodic gaps, but is there something more specific?

reames updated this revision to Diff 220006.Sep 12 2019, 2:43 PM

Address review comments.

Ayal added inline comments.Sep 14 2019, 9:52 AM
lib/Analysis/Loads.cpp
246

What do you mean by a "gap at the end"?

Vectorizing with "gap at the end" was originally described in D19487, and extended under fold-tail in D53668. Simple examples are

for (int i = 0; i < N; i++)
  B[i] = A[2*i];

and

for (int i = 0; i < N; i++)
  C[i] = D[4*i] + D[4*i+1];

These currently trigger requiresScalarEpilogue(), which is better to avoid if dereferenceability can be proven (for missing elements A[2*(N-1)+1] and D[4*(N-1)+2], D[4*(N-1)+3]).

Is there a definition somewhere in code of what an interleave group is?

Yes, an interleave group is defined in Analysis/VectorUtils.h, above the definition of class InterleaveGroup.
Descriptions how interleave groups are vectorized appear above
InterleavedAccessInfo::analyzeInterleaving() in Analysis/VectorUtils.cpp and above
InnerLoopVectorizer::vectorizeInterleaveGroup() in LoopVectorize.cpp.

At the moment, I'm assuming it's an access pattern with periodic gaps, but is there something more specific?

It's usually several access patterns with the same periodic step that can be optimized together using one common wider unit-stride load/store followed-by/preceded-by shuffles, instead of independent gathers/scatters. Plenty of more specific information related to vectorizing interleave groups is available, e.g., "Automatic Vectorization of Interleaved Data Revisited" TACO 2015, "Exploiting mixed SIMD parallelism by reducing data reorganization overhead" CGO 2016.

Ayal added inline comments.Sep 15 2019, 2:03 AM
lib/Analysis/Loads.cpp
225

St[r]ide of zero should be covered by the loop-invariant case above.

226

Worth adding test with negative step?

254

This "if" condition is both necessary and sufficient ("iff"); i.e., nothing TODO/generalize, right?