This is an archive of the discontinued LLVM Phabricator instance.

llvm.noalias - The AA implementaton
Needs ReviewPublic

Authored by hfinkel on Apr 30 2015, 9:28 AM.

Details

Summary

This is part of the series started by D9375, and adds the actual AA implementation which uses the intrinsics.

Roughly, we need to check for an access through pointer derived from a llvm.noalias call, which compatible noalias metadata, where the other access also has compatible metadata, but is not derived from that llvm.noalias call (or any other with the same scope). This needs to include the possibility that the pointer has been captured and we're deriving from some use of the captured pointer. Precise capturing analysis requires a DT to be available, and if it is not, and there are captures, the answers will not be as good (as demonstrated by the noalias-dup-scope.ll test case). When looking for compatible llvm.noalias calls, we might need to look through others. If we hit a PHI or select, we need to combine the compatibility results from all possible inputs.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 24740.Apr 30 2015, 9:28 AM
hfinkel retitled this revision from to llvm.noalias - The AA implementaton.
hfinkel updated this object.
hfinkel edited the test plan for this revision. (Show Details)
hfinkel added reviewers: chandlerc, reames, dberlin.
hfinkel added a subscriber: Unknown Object (MLST).
reames resigned from this revision.Oct 8 2015, 10:25 AM
reames removed a reviewer: reames.

Resigning as a reviewer to get a very stale review off my list of blocking tasks in phabricator. Please reopen when desired.

hfinkel updated this revision to Diff 63335.Jul 8 2016, 3:43 PM

Rebased.

Note that there is some hackery around making the DominatorTree available to the pass when run through its legacy-pass-manager wrapper; this can be removed once legacy-pass-manager support goes away.

majnemer added inline comments.
lib/Analysis/ScopedNoAliasAA.cpp
273

llvm::find_if ?

295–298

const auto *

299–300

The following would be a little more compact:

for (Value *IncommingValue : PN->incoming_values())

Children.push_back(IncommingValue);
346

Perhaps:

for (Value *Arg : CSA.args())
396

Ditto.

428

auto *

454

Ditto.

hfinkel added inline comments.Jul 9 2016, 5:20 PM
lib/Analysis/ScopedNoAliasAA.cpp
273

Nice! I didn't know we had that :-)

295–298

Sure, and I'll make all of the updates you've suggested below. Thanks!

hfinkel updated this revision to Diff 63408.Jul 9 2016, 5:24 PM

Updated per review comments.

I've reviewed this to my satisfaction but you will need a more thorough review by an AA expert.

lib/Analysis/ScopedNoAliasAA.cpp
307

Might be good to hoist this out of the loop and clear it at the start of the loop. Saves you having to reallocate if the set goes big.

310

Ditto with this vector.

hfinkel updated this revision to Diff 73352.Oct 3 2016, 3:16 PM

Rebased (and addressed reviewer feedback).

jeroen.dobbelaere added inline comments.
lib/Analysis/ScopedNoAliasAA.cpp
158–163

Use CS2NoAlias and CS1NoAlias, in stead of CS2.getInstru.... and CS1.getInstru....

Hi Hal, I found an issue with this part when analyzing two pointers that are 'derived from each other', but are used in different loops.
Following code would go wrong:

int* restrict p=malloc(sizeof(int)*20);
int i;
for (i=0; i<20; ++i) {p[i]=i; }
for (i=0; i<20; ++i) { if (p[i] != i) should_never_happen(); }

Do you think that the proposed fix is ok ?

lib/Analysis/ScopedNoAliasAA.cpp
444

We need an extra check here to see if APtr is directly depending on BPtr. Something like following code should do the trick:

{
  // Check if A depends directly on B
  SmallVector<Value *, 8> AObjs;
  GetUnderlyingObjects(const_cast<Value*>(APtr), AObjs, DL, nullptr, 0, nullptr);
  DEBUG(dbgs() << "SNA: underlying objects:\n");
#ifndef NDEBUG
  for (auto *A : AObjs)
    DEBUG(dbgs() << "\t" << *A << "\n");
  DEBUG(dbgs() << "\n"); 
#endif
  if (CSB) {
    for (Value* Arg : CSB.args())
      if (llvm::is_contained(AObjs, Arg)) {
        DEBUG(dbgs() << "SNA : A depends directly on B:"; Arg->dump());

        return false;
      }
  } else if (llvm::is_contained(AObjs, BPtr)) {
    DEBUG(dbgs() << "SNA : A depends directly on B:"; BPtr->dump());
    return false;
  }
}
marksl added a subscriber: marksl.Aug 8 2017, 1:48 PM

Hi Hal, I found an issue with this part when analyzing two pointers that are 'derived from each other', but are used in different loops.
Following code would go wrong:

int* restrict p=malloc(sizeof(int)*20);
int i;
for (i=0; i<20; ++i) {p[i]=i; }
for (i=0; i<20; ++i) { if (p[i] != i) should_never_happen(); }

Do you think that the proposed fix is ok ?

I don't yet understand what's wrong. Can you provide a full test case? Locally, the results on this look fine.

FYI: There is currently some problem because the current patchset miscompiles MultiSource/Benchmarks/tramp3d-v4.

lib/Analysis/ScopedNoAliasAA.cpp
444

I don't see why this should be needed. If A is derived from B, and A is also derived from some some noalias intrinsic N, then either B is derived from N or N is derived from B. The former case is one for which we explicitly check (it checks whether B is derived from the same intrinsic, or any other compatible one) and the latter case correctly returns noalias.

amehsan added inline comments.
lib/Analysis/ScopedNoAliasAA.cpp
372

I think it should be if (CSB)

426–441

IIUC, with the logic that we have here, removing a llvm.noalias from the IR could be a functional problem. The example below can expose the issue (AFAICT, the example has no UB...but please double check.

void foo(int *a, int *b, int *c, int *d, int n)  {

  int *restrict s;

  for (int i = 10; i < 800; ++i) {
    s = c + b[i];
    *a += *s;
  }


  int *r = s;
  s = d + 8;

  *s = *r + *s;

}

I compile this with the following options: -O2 -S -emit-llvm -fno-unroll-loops. Then I manually remove the llvm.noalias intrinsic inside the loop body and make necessary changes. This is the interesting parts of the IR

entry:
  br label %for.body

for.cond.cleanup:                                 ; preds = %for.body
  %add.ptr1 = getelementptr inbounds i32, i32* %d, i64 8
  %0 = tail call i32* @llvm.noalias.p0i32(i32* %add.ptr1, metadata !1)

for.body:                                         ; preds = %for.body, %entry
  %add.ptr = getelementptr inbounds i32, i32* %c, i64 %idx.ext
}

Then I feed the manually modified IR to opt, with options -O2 -debug and I see this debug msg:

SNA: A:   %0 = tail call i32* @llvm.noalias.p0i32(i32* %add.ptr1, metadata !5)
SNA: B:   %add.ptr = getelementptr inbounds i32, i32* %c, i64 %idx.ext
SNA: Found a compatible set!
          %0 = tail call i32* @llvm.noalias.p0i32(i32* %add.ptr1, metadata !5)

SNA: B/CSB noalias:

SNA: B does not derive from the compatible set!
SNA: DT is available
 SNA: noalias!

So we seem to prove %0 and %add.ptr are not aliased, but they might be aliased legally, IIUC.

For some more interesting examples, the problem is not exposed because currently at the very end of BasicAAResult::aliasGEP we conservatively return PartialAlias.

troyj added a subscriber: troyj.Aug 22 2018, 9:30 AM