Page MenuHomePhabricator

[BasicAA] Generalize base offset modulus handling
ClosedPublic

Authored by nikic on Sun, Nov 8, 1:49 AM.

Details

Summary

The GEP aliasing implementation currently has two pieces of code that solve two different subsets of the same basic problem: If you have GEPs with offsets 4*x + 0 and 4*y + 1 (assuming access size 1), then they do not alias regardless of whether x and y are the same.

One implementation is in aliasSameBasePointerGEPs(), which looks at this in a limited structural way. It requires both GEP base pointers to be exactly the same, then (optionally) a number of equal indexes, then an unknown index, then a non-equal index into a struct. This set of limitations works, but it's overly restrictive and hides the core property we're trying to exploit.

The second implementation is part of aliasGEP() itself and tries to find a common modulus in the scales, so it can then check that the constant offset doesn't overlap under modular arithmetic. The second implementation has the right idea of what the general problem is, but effectively only considers power of two factors in the scales (while aliasSameBasePointerGEPs also works with non-pow2 struct sizes.)

What this patch does is to adjust the aliasGEP() implementation to instead find the largest common factor in all the scales (i.e. the GCD) and use that as the modulus.

Diff Detail

Event Timeline

nikic created this revision.Sun, Nov 8, 1:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptSun, Nov 8, 1:49 AM
nikic requested review of this revision.Sun, Nov 8, 1:49 AM
jdoerfert added inline comments.Mon, Nov 9, 7:43 AM
llvm/lib/Analysis/BasicAliasAnalysis.cpp
1402

Maybe a comment above this to explain, I don't get the reasoning TBH.

Here is my idea of what is happening but it is not complete:

// X[0:N] is a vector of N+1 unknowns
GEP1 = GEP1BaseOffset + X[0:N] * GEP1.scale[0:N]
GCD = gcd(GEP1.scale[0], ..., GEP1.scale[N])
ModOffset = GEP1BaseOffset % GCD
// Every value of GEP1 looks like: GEP1 := Y * GCD + ModOffset
// and ModOffset < GCD.
Diff = GCD - ModOffset
if (ModOffset >=u V2Size  // I guess V2 fits in the size between Y * GCD and GEP1
 &&      Diff >=u V1Size  // this I don't understand.
  return NoAlias;
nikic updated this revision to Diff 303915.Mon, Nov 9, 9:59 AM

Update comment.

nikic added inline comments.Mon, Nov 9, 10:05 AM
llvm/lib/Analysis/BasicAliasAnalysis.cpp
1402

Does the new comment make sense to you? We have the second access at [0..V2Size) with the remaining [V2Size..GCD) not being accessed. We don't alias if the first access fits in there, i.e. ModOffset >= V2Size and ModOffset+V1Size <= GCD.

Yes, now I get it. I think I was also missing the "V2 has offset 0" precondition.

I like this, nice cleanup that adds more power. I think it is correct and good to go, would prefer a second opinion though.

asbirlea accepted this revision.Tue, Nov 17, 5:10 PM

This makes sense to me too. The additional comment was helpful with the gcd/difference condition.

This revision is now accepted and ready to land.Tue, Nov 17, 5:10 PM
This revision was automatically updated to reflect the committed changes.