This is an archive of the discontinued LLVM Phabricator instance.

Differential D10472

[LAA] Try to prove non-wrapping of pointers if SCEV cannot
Closed · Public

Authored by anemet on Jun 15 2015, 10:34 PM.

Details

Reviewers

nadav
aschwaighofer
atrick
sanjoy

Commits

rGc4866d29dd74: [LAA] Try to prove non-wrapping of pointers if SCEV cannot
rL240798: [LAA] Try to prove non-wrapping of pointers if SCEV cannot

Summary

Scalar evolution does not propagate the non-wrapping flags to values
that are derived from a non-wrapping induction variable because
the non-wrapping property could be flow-sensitive.

This change is a first attempt to establish the non-wrapping property in
some simple cases. The main idea is to look through the operations
defining the pointer. As long as we arrive at a non-wrapping AddRec via
a small chain of non-wrapping instructions, the pointer should not wrap
either.

I believe that this essentially is what Andy described in
http://article.gmane.org/gmane.comp.compilers.llvm.cvs/220731 as the way
forward.

Diff Detail

Repository: rL LLVM

Event Timeline

anemet updated this revision to Diff 27748. Jun 15 2015, 10:34 PM

anemet retitled this revision to [LAA] Try to prove non-wrapping of pointers if SCEV cannot.

anemet updated this object.

anemet edited the test plan for this revision.

anemet added reviewers: aschwaighofer, nadav, atrick, sanjoy.

anemet added a subscriber: Unknown Object (MLST).

sanjoy requested changes to this revision. Jun 15 2015, 11:07 PM

sanjoy edited edge metadata.

sanjoy added inline comments.

lib/Analysis/LoopAccessAnalysis.cpp
509 ↗	(On Diff #27748)	I'm not clear on how LAA uses this property, but I think this function should mention what kind of no-wrap (signed or unsigned) behavior it is trying to prove. IOW, Ptr is supposed to be monotonically increasing/decreasing in the signed or unsigned sense?
540 ↗	(On Diff #27748)	What if `OBO` is `nuw` (and not `nsw`) and `OpAR` is `nsw` (and not `nuw`)? Or vice-versa?

This revision now requires changes to proceed. Jun 15 2015, 11:07 PM

anemet added inline comments. Jun 16 2015, 3:30 PM

lib/Analysis/LoopAccessAnalysis.cpp
509 ↗	(On Diff #27748)	It's unsigned. Pointers are unsigned in LLVM IR as mentioned in the GEP FAQ. I'll add a comment.
540 ↗	(On Diff #27748)	Yeah, this certainly required more thinking on my part, thanks for pressing it. To document why this was wrong, consider this counterexample:

Take the AddRec {0,+,100} <nuw> in i8. Its first three iterations yield 0, 100, 200. Put this through a signed add of 3 and the input is now interpreted as signed: 0, 100, -56. There is no signed overflow on the result (3, 103, -53), yet the result has wrapped.

Similarly, we can't accept <nuw>-only for the index, because the index is interpreted as signed. With the above (i8) {0,+,100} example we'd get a wrapping range even though it may be inbounds for the array.

Let me know if you or others disagree or have further comments. Otherwise I'll update the patch accordingly.

sanjoy added inline comments. Jun 17 2015, 4:08 PM

lib/Analysis/LoopAccessAnalysis.cpp
540 ↗	(On Diff #27748)	I think you're right w.r.t. the (no-)wrapping logic. I have not worked on LAA so I cannot comment on how `isNoWrapAddRec` is used.

Updated to only check NSW when analyzing the (signed) index.

Also added a FIXME for what I think is a pre-existing problem with the code.

anemet mentioned this in D10161: [SCEV][LoopVectorize] Allow ScalarEvolution to make assumptions about overflows. Jun 23 2015, 1:25 PM

LGTM. Sorry for the delay.

It's not clear to me how comprehensive this analysis is, but at least it covers some important cases.

lib/Analysis/LoopAccessAnalysis.cpp
511 ↗	(On Diff #27890)	I agree with you here. If NSW/NW are valid conditions, it's not obvious to me and the original author needs to explain it.

Closed by commit rL240798: [LAA] Try to prove non-wrapping of pointers if SCEV cannot (authored by anemet). Jun 26 2015, 10:26 AM

This revision was automatically updated to reflect the committed changes.

Thanks for confirming and the review, Andy!

Revision Contents

Path	Size
llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp	50 lines
llvm/trunk/test/Analysis/LoopAccessAnalysis/non-wrapping-pointer.ll	41 lines

Diff 28576

llvm/trunk/lib/Analysis/LoopAccessAnalysis.cpp

[498 lines not shown]
 }

 static bool isInBoundsGep(Value *Ptr) {
   if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Ptr))
     return GEP->isInBounds();
   return false;
 }

+/// \brief Return true if an AddRec pointer \p Ptr is unsigned non-wrapping,
+/// i.e. monotonically increasing/decreasing.
+static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,
+                           ScalarEvolution *SE, const Loop *L) {
+  // FIXME: This should probably only return true for NUW.
+  if (AR->getNoWrapFlags(SCEV::NoWrapMask))
+    return true;
+
+  // Scalar evolution does not propagate the non-wrapping flags to values that
+  // are derived from a non-wrapping induction variable because non-wrapping
+  // could be flow-sensitive.
+  //
+  // Look through the potentially overflowing instruction to try to prove
+  // non-wrapping for the specific value of Ptr.
+
+  // The arithmetic implied by an inbounds GEP can't overflow.
+  auto *GEP = dyn_cast<GetElementPtrInst>(Ptr);
+  if (!GEP || !GEP->isInBounds())
+    return false;
+
+  // Make sure there is only one non-const index and analyze that.
+  Value *NonConstIndex = nullptr;
+  for (auto Index = GEP->idx_begin(); Index != GEP->idx_end(); ++Index)
+    if (!isa<ConstantInt>(*Index)) {
+      if (NonConstIndex)
+        return false;
+      NonConstIndex = *Index;
+    }
+  if (!NonConstIndex)
+    // The recurrence is on the pointer, ignore for now.
+    return false;
+
+  // The index in GEP is signed. It is non-wrapping if it's derived from a NSW
+  // AddRec using a NSW operation.
+  if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(NonConstIndex))
+    if (OBO->hasNoSignedWrap() &&
+        // Assume constant for other the operand so that the AddRec can be
+        // easily found.
+        isa<ConstantInt>(OBO->getOperand(1))) {
+      auto *OpScev = SE->getSCEV(OBO->getOperand(0));
+
+      if (auto *OpAR = dyn_cast<SCEVAddRecExpr>(OpScev))
+        return OpAR->getLoop() == L && OpAR->getNoWrapFlags(SCEV::FlagNSW);
+    }
+
+  return false;
+}
+
 /// \brief Check whether the access through \p Ptr has a constant stride.
 int llvm::isStridedPtr(ScalarEvolution *SE, Value *Ptr, const Loop *Lp,
                        const ValueToValueMap &StridesMap) {
   const Type *Ty = Ptr->getType();
   assert(Ty->isPointerTy() && "Unexpected non-ptr");

   // Make sure that the pointer does not point to aggregate types.
   const PointerType *PtrTy = cast<PointerType>(Ty);
[21 lines not shown]
   // The address calculation must not wrap. Otherwise, a dependence could be
   // inverted.
   // An inbounds getelementptr that is a AddRec with a unit stride
   // cannot wrap per definition. The unit stride requirement is checked later.
   // An getelementptr without an inbounds attribute and unit stride would have
   // to access the pointer value "0" which is undefined behavior in address
   // space 0, therefore we can also vectorize this case.
   bool IsInBoundsGEP = isInBoundsGep(Ptr);
-  bool IsNoWrapAddRec = AR->getNoWrapFlags(SCEV::NoWrapMask);
+  bool IsNoWrapAddRec = isNoWrapAddRec(Ptr, AR, SE, Lp);
   bool IsInAddressSpaceZero = PtrTy->getAddressSpace() == 0;
   if (!IsNoWrapAddRec && !IsInBoundsGEP && !IsInAddressSpaceZero) {
     DEBUG(dbgs() << "LAA: Bad stride - Pointer may wrap in the address space "
                  << *Ptr << " SCEV: " << *PtrScev << "\n");
     return 0;
   }

   // Check the step is constant.
[955 lines not shown]

llvm/trunk/test/Analysis/LoopAccessAnalysis/non-wrapping-pointer.ll

; RUN: opt -basicaa -loop-accesses -analyze < %s | FileCheck %s

; For this loop:
;   for (int i = 0; i < n; i++)
;     A[2 * i] = A[2 * i] + B[i];
;
; , SCEV is unable to prove that A[2 * i] does not overflow. However,
; analyzing the IR helps us to conclude it and in turn allow dependence
; analysis.

target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

; CHECK: Memory dependences are safe{{$}}

define void @f(i16* noalias %a,
               i16* noalias %b, i64 %N) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %ind = phi i64 [ 0, %entry ], [ %inc, %for.body ]

  %mul = mul nuw nsw i64 %ind, 2

  %arrayidxA = getelementptr inbounds i16, i16* %a, i64 %mul
  %loadA = load i16, i16* %arrayidxA, align 2

  %arrayidxB = getelementptr inbounds i16, i16* %b, i64 %ind
  %loadB = load i16, i16* %arrayidxB, align 2

  %add = mul i16 %loadA, %loadB

  store i16 %add, i16* %arrayidxA, align 2

  %inc = add nuw nsw i64 %ind, 1
  %exitcond = icmp eq i64 %inc, %N
  br i1 %exitcond, label %for.end, label %for.body

for.end:                                          ; preds = %for.body
  ret void
}