This is an archive of the discontinued LLVM Phabricator instance.

Differential D17080

[LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs
Closed, Public

Authored by sbaranga on Feb 10 2016, 9:41 AM.

Details

Reviewers

anemet
mzolotukhin
mkuper
sanjoy
hfinkel

Commits

rGac920f7716ef: [LAA] Allow more run-time alias checks by coercing pointer expressions to…
rL313012: [LAA] Allow more run-time alias checks by coercing pointer expressions to…

Summary

LAA can only emit run-time alias checks for pointers with affine AddRec
SCEV expressions. However, non-AddRecExprs can now be converted to
affine AddRecExprs using SCEV predicates.

This change tries to add the minimal set of SCEV predicates in order
to enable run-time alias checking.
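
As a rough illustration of why a wrapping index blocks the run-time check, here is a standalone sketch (not LLVM code; every function name in it is hypothetical). It models the byte offset of an access whose index is a 32-bit unsigned counter widened to 64 bits, compares it with the affine form {4*start,+,4}, and shows a predicate under which the two agree. The real predicates are the SCEV wrap predicates mentioned above; this only mimics the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of a 4-byte element access after `iter` iterations, when the
// index is a 32-bit unsigned counter that starts at `start` and may wrap.
uint64_t wrappingOffset(uint32_t start, uint64_t iter) {
  uint32_t idx = start + static_cast<uint32_t>(iter); // wraps modulo 2^32
  return 4 * static_cast<uint64_t>(idx);              // zero-extend, then scale
}

// The affine model {4*start,+,4}: what the offset would be without wrapping.
uint64_t affineOffset(uint32_t start, uint64_t iter) {
  return 4 * static_cast<uint64_t>(start) + 4 * iter;
}

// Hypothetical stand-in for a run-time predicate: the 32-bit index does not
// wrap (unsigned) during `tripCount` iterations.
bool indexDoesNotWrap(uint32_t start, uint64_t tripCount) {
  if (tripCount == 0)
    return true;
  return static_cast<uint64_t>(start) + (tripCount - 1) <= UINT32_MAX;
}

// When the predicate holds, both models agree on the last iteration.
void checkLastIteration(uint32_t start, uint64_t tripCount) {
  if (tripCount == 0 || !indexDoesNotWrap(start, tripCount))
    return;
  assert(wrappingOffset(start, tripCount - 1) ==
         affineOffset(start, tripCount - 1));
}
```

Under the predicate, the affine form gives simple low and high bounds for the accessed range, which is what a run-time alias check needs.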

Diff Detail

Build Status

Buildable 9564
Build 9564: arc lint + arc unit

Event Timeline

sbaranga updated this revision to Diff 47472. Feb 10 2016, 9:41 AM

sbaranga retitled this revision to [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs.

sbaranga updated this object.

sbaranga added reviewers: anemet, mzolotukhin.

sbaranga added a subscriber: llvm-commits.

Herald added a subscriber: mzolotukhin. Feb 10 2016, 9:41 AM

We have a dependency on http://reviews.llvm.org/D17078 (without it the compiler crashes while trying to build 464.h264ref in spec2006).

roman.shirokiy added a subscriber: roman.shirokiy. Mar 11 2016, 2:48 AM

This rebases the patch and moves the lambda to a separate method.

Rebase the patch since it wasn't applying to ToT anymore.

dorit added a subscriber: dorit. Feb 13 2017, 5:46 AM

This patch still applies almost entirely as is (it only needs a tiny update in the testcase), and seems to have a positive performance effect on a couple of benchmarks.
It is also a prerequisite for PR30654.
Let's revive this review!

dorit added reviewers: mkuper, sanjoy. Feb 20 2017, 3:49 AM

dorit added a subscriber: Ayal.

dorit added inline comments. Feb 21 2017, 12:41 AM

lib/Analysis/LoopAccessAnalysis.cpp
537	I just have a small suggestion, to maybe change "Force" to "Assume", just because "Force" here has the same effect as "Assume" in the getPtrStride API, namely to allow adding new runtime tests. (Right?). But if you prefer Force that's fine with me too.

dorit added a reviewer: hfinkel. Feb 22 2017, 7:34 PM

ping

Ayal added inline comments. Mar 1 2017, 3:09 AM

test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll
6	"i can ove[r]flow": better clarify which i can overflow - in the test below the induction variable of the loop i is a signed 64bit (as is the bound 'len') whose bump has an nsw, so it is free of overflow concerns. The indices of A[i] and A[i+1] are (signed or unsigned?) 32bit idx and idx.inc, whose zext-add-trunc bump has no nuw nor nsw and are therefore subject to overflow concerns. Is it clear why the Added "SCEV assumptions" Flags for these should be <nssw>, rather than <nusw>? Starting from 0 and 1, nssw is more conservative. "When len has a type": these assumptions are needed regardless of the type of len.
10	Suggest to clarify the corresponding C code using both i and idx.
12	"emmit" >> "emit", multiple occurrences.
20	Why sext's and not zext's?
22	"We" >> "The". Hopefully the transformed expressions are i32 {0,+,1} i32 {1,+,1} based on the added flags, right?
56	It doesn't really matter if we zext or sext here, right? Or should zext indicate the type of idx was originally unsigned?
140	See comments above.

sbaranga added inline comments. Mar 7 2017, 3:05 AM

test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll
6	Thanks for the observations. There is a discrepancy here between the C code and the IR being tested (and the text is misleading). We are actually looking at the expressions for %inc.ptr0 and %inc.ptr1. The sign extension comes from the getelementptr instruction. Because the SCEV expression that we are trying to linearize is something like (sext i32 {0,+,1}<%for.body> to i64), we can only add the nssw flag. I'll update the text to clarify things.
22	i64 should be correct. We're essentially sinking the sext to get an i64 linear expression.
56	Correct (for both).

Renaming Force to Assume.

The tests have been re-written, with the test IR coming directly from C so it should now match the C code.

I've removed one of tests which was only testing that this also works for NSW.
This functionality is already covered by other tests and doesn't show up often from C.

This looks ok to me, added only minor comments, but it should be approved by someone who understands all this better...

IIUC the current scheme originally computes CanDoRT and NeedRTCheck "independently", or rather intertwinedly, checking first if CanDoRT without Assume in parallel to determining if NeedRTCheck, and if the latter is true revisit if CanDoRT with Assume as needed by Retries. A simpler albeit possibly slower approach is to first determine separately if NeedRTCheck, returning true if not; else check if CanDoRT with Assume. But Need[RT]Check seems to depend on CanDo[AliasSet]RT, and we may return false if !NeedRTCheck when there are incomparable pointers of distinct address spaces. The use of "isDependencyCheckNeeded()" also raises an eyebrow, or two.

lib/Analysis/LoopAccessAnalysis.cpp
645–648	The tests added below check this other added condition, namely PSE.hasNoOverflow(), right?
658	IsDepCheckNeeded was originally taken out of the AS : AST loop. Here you can simply ask "if(isDependencyCheckNeeded())" later.
660	IsWrite is needed only later by RtCheck.insert(), can move it there.
749	This goes back to the original code: it may be slightly easier to always bump ++NumTotalPtrChecks instead of ++NumReadPtrChecks and set NeedsCheck = (NumTotalPtrChecks >= 2 && NumWritePtrChecks >= 1); But not sure I follow the original logic here; it's equivalent to doing: bool NeedsCheck = (NumTotalPtrChecks >= 2 && NumWritePtrChecks >= 1); if (IsDepCheckNeeded && CanDoAliasSetRT && RunningDepId == 2) NeedsCheck = false; Having CanDoAliasSetRT && RunningDepId == 2 implies that NumTotalPtrChecks == 2, but why set NeedsCheck to false when this happens and IsDepCheckNeeded? What if NumWritePtrChecks >= 1? May want to rename NeedsCheck >> NeedsAliasSetRTCheck, analogous to the CanDo's.
795	ok, but what if !NeedRTCheck?
test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll
6	Thanks! This looks fine to me.
25	the second one should be: ; i64 {(%b),+,4}<%for.body>
76	ove[r]flow

dorit mentioned this in D30041: [PSCEV] Create AddRec for Phis in cases of possible integer overflow, using runtime checks. Apr 13 2017, 9:00 AM

In D17080#725468, @Ayal wrote:

This looks ok to me, added only minor comments, but it should be approved by someone who understands all this better...

@mkuper, @hfinkel - can you help out here?

I think @anemet has reviewed patches in this part of the vectorizer… @anemet, would you be able to please take a look?

(it's not a big patch: the heart of the patch is just one line: adding a call to
PSE.getAsAddRec(Ptr) under "if (Assume)" in hasComputeableBounds()
(which would make us more aggressive in adding runtime checks).
All the rest is an attempt to minimize the cases in which we pass Assume=true…)

In D17080#729013, @Ayal wrote:

In D17080#725468, @Ayal wrote:

This looks ok to me, added only minor comments, but it should be approved by someone who understands all this better...

@mkuper, @hfinkel - can you help out here?

@anemet, @mssimpso - can you help out here?

@sbaranga, can you clarify my comments and help me understand this better? Hopefully this could move forward.

Sorry for not replying earlier. It looks like there are only minor changes left? I plan to push an update after this gets unblocked.

It's also a little bit odd that I haven't been able to get a definitive review for this. It would be good to know if there's something fundamentally wrong with the approach (at least we could start thinking about workarounds).

lib/Analysis/LoopAccessAnalysis.cpp
645–648	Correct.
749	I've tried to preserve the original logic, so if there are any issues with the original logic I've also pulled them in. The only difference here is that we're doing the same logic per alias set. It is possible that the logic can be simplified. RunningDepId == 2 means that we only have one dependence set. There is a comment above: " But there is no need to checks if there is only one dependence set for this alias set.". If I remember correctly this covers cases such as: for (int i = 0; i < n; ++i) a[i] = a[i] + 1; where we don't need any checks. For renaming NeedsCheck >> NeedsAliasSetRTCheck: I have no objections.
795	If !NeedRTCheck I don't think we need to do this check. However, the old code was doing it, so I left it in.
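
The condition discussed for line 749 can be modeled in isolation. The sketch below is a standalone approximation, not the LLVM implementation; the function names are hypothetical and the parameters mirror the counters in the diff. It encodes the per-alias-set decision, the alternative "total count" formulation suggested in the review, and a small exhaustive comparison showing that the two counting conditions agree.

```cpp
#include <cassert>

// Model of the per-alias-set decision: no check is needed when dependence
// analysis covers the set and all accesses fall into a single dependence set.
// RunningDepId starts at 1 and is bumped once per new dependence set, so a
// value of 2 means exactly one set was created.
bool needsAliasSetRTCheck(int numReadPtrChecks, int numWritePtrChecks,
                          bool isDepCheckNeeded, bool canDoAliasSetRT,
                          unsigned runningDepId) {
  if (isDepCheckNeeded && canDoAliasSetRT && runningDepId == 2)
    return false;
  return numWritePtrChecks >= 2 ||
         (numReadPtrChecks >= 1 && numWritePtrChecks >= 1);
}

// The formulation suggested in the review: at least two pointers in total,
// at least one of them written.
bool needsCheckByTotals(int numReadPtrChecks, int numWritePtrChecks) {
  int total = numReadPtrChecks + numWritePtrChecks;
  return total >= 2 && numWritePtrChecks >= 1;
}

// Compare both counting conditions for all counts in [0, maxCount].
bool formulationsAgree(int maxCount) {
  assert(maxCount >= 0);
  for (int r = 0; r <= maxCount; ++r)
    for (int w = 0; w <= maxCount; ++w) {
      bool original = w >= 2 || (r >= 1 && w >= 1);
      if (original != needsCheckByTotals(r, w))
        return false;
    }
  return true;
}
```

The two counting conditions are logically equivalent, so the choice between them is a matter of readability; the single-dependence-set shortcut is the part that carries the extra meaning.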

hfinkel added inline comments. Aug 21 2017, 1:42 PM

lib/Analysis/LoopAccessAnalysis.cpp
614–616	Please add a comment on what Assume means in practice for this function.
754	Can you extend this comment to note why, even if they failed previously, they might succeed now.

Address comments received so far.

Harbormaster completed remote builds in B9564: Diff 112377. Aug 23 2017, 8:57 AM

LGTM

lib/Analysis/LoopAccessAnalysis.cpp
740	Please note here that this is the `RunningDepId == 2` check below.

This revision is now accepted and ready to land. Sep 2 2017, 5:23 PM

sbaranga closed this revision. Sep 12 2017, 12:49 AM

Thanks! Committed in r313012.

Hi Silviu,
I came across the similar gap, but I aggressively tried converting the PtrSCEV to AddRec in hasComputableBounds, which worked on our internal benchmark. Could you please clarify why you have the Assume to be false in the first call, i.e. how do we know that it's not useful to try and convert PtrSCEV to AddRec?

This was my patch over the original code.

diff --git a/lib/Analysis/LoopAccessAnalysis.cpp b/lib/Analysis/LoopAccessAnalysis.cpp
index d2dbecd..a1c7584 100644
--- a/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/lib/Analysis/LoopAccessAnalysis.cpp
@@ -608,6 +608,11 @@ static bool hasComputableBounds(PredicatedScalarEvolution &PSE,
     return true;
 
   const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
+  // Sometimes the PtrSCEV can be converted into the AddRec, try to retrieve it
+  // if possible.
+  if (!AR)
+    AR = PSE.getAsAddRec(Ptr);
+    if (!AR)
         return false;

Hi Anna,

In D17080#869566, @anna wrote:
Hi Silviu,
I came across the similar gap, but I aggressively tried converting the PtrSCEV to AddRec in hasComputableBounds, which worked on our internal benchmark. Could you please clarify why you have the Assume to be false in the first call, i.e. how do we know that it's not useful to try and convert PtrSCEV to AddRec?

This was my patch over the original code.
diff --git a/lib/Analysis/LoopAccessAnalysis.cpp b/lib/Analysis/LoopAccessAnalysis.cpp
index d2dbecd..a1c7584 100644
--- a/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/lib/Analysis/LoopAccessAnalysis.cpp
@@ -608,6 +608,11 @@ static bool hasComputableBounds(PredicatedScalarEvolution &PSE,
     return true;
 
   const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);
+  // Sometimes the PtrSCEV can be converted into the AddRec, try to retrieve it
+  // if possible.
+  if (!AR)
+    AR = PSE.getAsAddRec(Ptr);
+    if (!AR)
         return false;

We only need to convert to an AddRec if we need to emit the runtime checks (see the comment above the original logic for computing NeedRTCheck).

One example would be if we only have reads in the loop. If we convert in hasComputableBounds we end up having unnecessary versioning (at least with regards to the dependence analysis).

Would the current solution also work in your internal benchmark? If not it might be because something else would need the expression to be an AddRec?

Thanks,
Silviu
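
The reasoning in the reply above (only add predicates once a run-time check is known to be needed) can be sketched as a small standalone model. This is not the LLVM code: the Access and Result types and the analyzeAliasSet function are hypothetical, and the dependence-set shortcut is left out to keep the example short.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of one pointer access in an alias set.
struct Access {
  bool isWrite;
  bool affineWithoutPredicates; // bounds computable as-is (Assume = false)
  bool affineWithPredicates;    // bounds computable once predicates are added
};

struct Result {
  bool canDoRT;
  bool needRTCheck;
  int predicatesAdded; // how many accesses required new predicates
};

Result analyzeAliasSet(const std::vector<Access> &accesses) {
  int reads = 0, writes = 0, predicates = 0;
  bool canDo = true;
  std::vector<Access> retries;

  // First pass: never add predicates, just remember what failed.
  for (const Access &a : accesses) {
    // Adding predicates can only help, never hurt.
    assert(!(a.affineWithoutPredicates && !a.affineWithPredicates));
    if (a.isWrite)
      ++writes;
    else
      ++reads;
    if (!a.affineWithoutPredicates) {
      retries.push_back(a);
      canDo = false;
    }
  }

  bool need = writes >= 2 || (reads >= 1 && writes >= 1);

  // Second pass: only when checks are required, retry with predicates.
  if (need && !canDo) {
    canDo = true;
    for (const Access &a : retries) {
      if (!a.affineWithPredicates) {
        canDo = false;
        break;
      }
      ++predicates;
    }
  }
  return {canDo, need, predicates};
}
```

For a set containing only reads, the model adds no predicates even if the pointers are not affine, which corresponds to avoiding unnecessary loop versioning.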

Hi Silviu,

In D17080#869615, @sbaranga wrote:

We only need to convert to an AddRec if we need to emit the runtime checks (see the comment above the original logic for computing NeedRTCheck).

One example would be if we only have reads in the loop. If we convert in hasComputableBounds we end up having unnecessary versioning (at least with regards to the dependence analysis).

Thanks for the clarification. Makes sense.

Would the current solution also work in your internal benchmark? If not it might be because something else would need the expression to be an AddRec?

Yes, I tried your patch and it works. I was just curious why we needed all the extra code rather than always checking if we could convert to an AddRec.

Revision Contents

Path	Size
lib/Analysis/LoopAccessAnalysis.cpp	123 lines
test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll	107 lines

Diff 112377

lib/Analysis/LoopAccessAnalysis.cpp

(516 lines hidden)	public:

/// \brief Register a store.		/// \brief Register a store.
void addStore(MemoryLocation &Loc) {		void addStore(MemoryLocation &Loc) {
Value *Ptr = const_cast<Value*>(Loc.Ptr);		Value *Ptr = const_cast<Value*>(Loc.Ptr);
AST.add(Ptr, MemoryLocation::UnknownSize, Loc.AATags);		AST.add(Ptr, MemoryLocation::UnknownSize, Loc.AATags);
Accesses.insert(MemAccessInfo(Ptr, true));		Accesses.insert(MemAccessInfo(Ptr, true));
}		}

		/// \brief Check if we can emit a run-time no-alias check for \p Access.
		///
		/// Returns true if we can emit a run-time no alias check for \p Access.
		/// If we can check this access, this also adds it to a dependence set and
		/// adds a run-time to check for it to \p RtCheck. If \p Assume is true,
		/// we will attempt to use additional run-time checks in order to get
		/// the bounds of the pointer.
		bool createCheckForAccess(RuntimePointerChecking &RtCheck,
		MemAccessInfo Access,
		const ValueToValueMap &Strides,
		DenseMap<Value *, unsigned> &DepSetId,
		Loop *TheLoop, unsigned &RunningDepId,
		unsigned ASId, bool ShouldCheckStride,
		bool Assume);

/// \brief Check whether we can check the pointers at runtime for		/// \brief Check whether we can check the pointers at runtime for
/// non-intersection.		/// non-intersection.
///		///
/// Returns true if we need no check or if we do and we can generate them		/// Returns true if we need no check or if we do and we can generate them
/// (i.e. the pointers have computable bounds).		/// (i.e. the pointers have computable bounds).
bool canCheckPtrAtRT(RuntimePointerChecking &RtCheck, ScalarEvolution *SE,		bool canCheckPtrAtRT(RuntimePointerChecking &RtCheck, ScalarEvolution *SE,
Loop *TheLoop, const ValueToValueMap &Strides,		Loop *TheLoop, const ValueToValueMap &Strides,
bool ShouldCheckWrap = false);		bool ShouldCheckWrap = false);
(58 lines hidden)	private:
bool IsRTCheckAnalysisNeeded;		bool IsRTCheckAnalysisNeeded;

/// The SCEV predicate containing all the SCEV-related assumptions.		/// The SCEV predicate containing all the SCEV-related assumptions.
PredicatedScalarEvolution &PSE;		PredicatedScalarEvolution &PSE;
};		};

} // end anonymous namespace		} // end anonymous namespace

/// \brief Check whether a pointer can participate in a runtime bounds check.		/// \brief Check whether a pointer can participate in a runtime bounds check.
		/// If \p Assume, try harder to prove that we can compute the bounds of \p Ptr
		/// by adding run-time checks (overflow checks) if necessary.
static bool hasComputableBounds(PredicatedScalarEvolution &PSE,		static bool hasComputableBounds(PredicatedScalarEvolution &PSE,
const ValueToValueMap &Strides, Value *Ptr,		const ValueToValueMap &Strides, Value *Ptr,
Loop *L) {		Loop *L, bool Assume) {
const SCEV *PtrScev = replaceSymbolicStrideSCEV(PSE, Strides, Ptr);		const SCEV *PtrScev = replaceSymbolicStrideSCEV(PSE, Strides, Ptr);

// The bounds for loop-invariant pointer is trivial.		// The bounds for loop-invariant pointer is trivial.
if (PSE.getSE()->isLoopInvariant(PtrScev, L))		if (PSE.getSE()->isLoopInvariant(PtrScev, L))
return true;		return true;

const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);		const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(PtrScev);

		if (!AR && Assume)
		AR = PSE.getAsAddRec(Ptr);

if (!AR)		if (!AR)
return false;		return false;

return AR->isAffine();		return AR->isAffine();
}		}

/// \brief Check whether a pointer address cannot wrap.		/// \brief Check whether a pointer address cannot wrap.
static bool isNoWrap(PredicatedScalarEvolution &PSE,		static bool isNoWrap(PredicatedScalarEvolution &PSE,
const ValueToValueMap &Strides, Value *Ptr, Loop *L) {		const ValueToValueMap &Strides, Value *Ptr, Loop *L) {
const SCEV *PtrScev = PSE.getSCEV(Ptr);		const SCEV *PtrScev = PSE.getSCEV(Ptr);
if (PSE.getSE()->isLoopInvariant(PtrScev, L))		if (PSE.getSE()->isLoopInvariant(PtrScev, L))
return true;		return true;

int64_t Stride = getPtrStride(PSE, Ptr, L, Strides);		int64_t Stride = getPtrStride(PSE, Ptr, L, Strides);
return Stride == 1;		if (Stride == 1 || PSE.hasNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW))
		return true;

		return false;
		}

		bool AccessAnalysis::createCheckForAccess(RuntimePointerChecking &RtCheck,
		MemAccessInfo Access,
		const ValueToValueMap &StridesMap,
		DenseMap<Value *, unsigned> &DepSetId,
		Loop *TheLoop, unsigned &RunningDepId,
		unsigned ASId, bool ShouldCheckWrap,
		bool Assume) {
		Value *Ptr = Access.getPointer();

		if (!hasComputableBounds(PSE, StridesMap, Ptr, TheLoop, Assume))
		return false;

		// When we run after a failing dependency check we have to make sure
		// we don't have wrapping pointers.
		if (ShouldCheckWrap && !isNoWrap(PSE, StridesMap, Ptr, TheLoop)) {
		auto *Expr = PSE.getSCEV(Ptr);
		if (!Assume || !isa<SCEVAddRecExpr>(Expr))
		return false;
		PSE.setNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW);
		}

		// The id of the dependence set.
		unsigned DepId;

		if (isDependencyCheckNeeded()) {
		Value *Leader = DepCands.getLeaderValue(Access).getPointer();
		unsigned &LeaderId = DepSetId[Leader];
		if (!LeaderId)
		LeaderId = RunningDepId++;
		DepId = LeaderId;
		} else
		// Each access has its own dependence set.
		DepId = RunningDepId++;

		bool IsWrite = Access.getInt();
		RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap, PSE);
		DEBUG(dbgs() << "LAA: Found a runtime check ptr:" << *Ptr << '\n');

		return true;
}		}

bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,		bool AccessAnalysis::canCheckPtrAtRT(RuntimePointerChecking &RtCheck,
ScalarEvolution *SE, Loop *TheLoop,		ScalarEvolution *SE, Loop *TheLoop,
const ValueToValueMap &StridesMap,		const ValueToValueMap &StridesMap,
bool ShouldCheckWrap) {		bool ShouldCheckWrap) {
// Find pointers with computable bounds. We are going to use this information		// Find pointers with computable bounds. We are going to use this information
// to place a runtime bound check.		// to place a runtime bound check.
bool CanDoRT = true;		bool CanDoRT = true;

bool NeedRTCheck = false;		bool NeedRTCheck = false;
if (!IsRTCheckAnalysisNeeded) return true;		if (!IsRTCheckAnalysisNeeded) return true;

bool IsDepCheckNeeded = isDependencyCheckNeeded();		bool IsDepCheckNeeded = isDependencyCheckNeeded();

// We assign a consecutive id to access from different alias sets.		// We assign a consecutive id to access from different alias sets.
// Accesses between different groups doesn't need to be checked.		// Accesses between different groups doesn't need to be checked.
unsigned ASId = 1;		unsigned ASId = 1;
for (auto &AS : AST) {		for (auto &AS : AST) {
int NumReadPtrChecks = 0;		int NumReadPtrChecks = 0;
int NumWritePtrChecks = 0;		int NumWritePtrChecks = 0;
		bool CanDoAliasSetRT = true;

// We assign consecutive id to access from different dependence sets.		// We assign consecutive id to access from different dependence sets.
// Accesses within the same set don't need a runtime check.		// Accesses within the same set don't need a runtime check.
unsigned RunningDepId = 1;		unsigned RunningDepId = 1;
DenseMap<Value *, unsigned> DepSetId;		DenseMap<Value *, unsigned> DepSetId;

		SmallVector<MemAccessInfo, 4> Retries;

for (auto A : AS) {		for (auto A : AS) {
Value *Ptr = A.getValue();		Value *Ptr = A.getValue();
bool IsWrite = Accesses.count(MemAccessInfo(Ptr, true));		bool IsWrite = Accesses.count(MemAccessInfo(Ptr, true));
MemAccessInfo Access(Ptr, IsWrite);		MemAccessInfo Access(Ptr, IsWrite);

if (IsWrite)		if (IsWrite)
++NumWritePtrChecks;		++NumWritePtrChecks;
else		else
++NumReadPtrChecks;		++NumReadPtrChecks;

if (hasComputableBounds(PSE, StridesMap, Ptr, TheLoop) &&		if (!createCheckForAccess(RtCheck, Access, StridesMap, DepSetId, TheLoop,
// When we run after a failing dependency check we have to make sure		RunningDepId, ASId, ShouldCheckWrap, false)) {
// we don't have wrapping pointers.
(!ShouldCheckWrap || isNoWrap(PSE, StridesMap, Ptr, TheLoop))) {
// The id of the dependence set.
unsigned DepId;

if (IsDepCheckNeeded) {
Value *Leader = DepCands.getLeaderValue(Access).getPointer();
unsigned &LeaderId = DepSetId[Leader];
if (!LeaderId)
LeaderId = RunningDepId++;
DepId = LeaderId;
} else
// Each access has its own dependence set.
DepId = RunningDepId++;

RtCheck.insert(TheLoop, Ptr, IsWrite, DepId, ASId, StridesMap, PSE);

DEBUG(dbgs() << "LAA: Found a runtime check ptr:" << *Ptr << '\n');
} else {
DEBUG(dbgs() << "LAA: Can't find bounds for ptr:" << *Ptr << '\n');		DEBUG(dbgs() << "LAA: Can't find bounds for ptr:" << *Ptr << '\n');
CanDoRT = false;		Retries.push_back(Access);
		CanDoAliasSetRT = false;
}		}
}		}

// If we have at least two writes or one write and a read then we need to		// If we have at least two writes or one write and a read then we need to
// check them. But there is no need to checks if there is only one		// check them. But there is no need to checks if there is only one
// dependence set for this alias set.		// dependence set for this alias set.
//		//
// Note that this function computes CanDoRT and NeedRTCheck independently.		// Note that this function computes CanDoRT and NeedRTCheck independently.
// For example CanDoRT=false, NeedRTCheck=false means that we have a pointer		// For example CanDoRT=false, NeedRTCheck=false means that we have a pointer
// for which we couldn't find the bounds but we don't actually need to emit		// for which we couldn't find the bounds but we don't actually need to emit
// any checks so it does not matter.		// any checks so it does not matter.
if (!(IsDepCheckNeeded && CanDoRT && RunningDepId == 2))		bool NeedsAliasSetRTCheck = false;
NeedRTCheck |= (NumWritePtrChecks >= 2 || (NumReadPtrChecks >= 1 &&		if (!(IsDepCheckNeeded && CanDoAliasSetRT && RunningDepId == 2))
NumWritePtrChecks >= 1));		NeedsAliasSetRTCheck = (NumWritePtrChecks >= 2 ||
		(NumReadPtrChecks >= 1 && NumWritePtrChecks >= 1));

		// We need to perform run-time alias checks, but some pointers had bounds
		// that couldn't be checked.
		if (NeedsAliasSetRTCheck && !CanDoAliasSetRT) {
		// Reset the CanDoSetRt flag and retry all accesses that have failed.
		// We know that we need these checks, so we can now be more aggressive
		// and add further checks if required (overflow checks).
		CanDoAliasSetRT = true;
		for (auto Access : Retries)
		if (!createCheckForAccess(RtCheck, Access, StridesMap, DepSetId,
		TheLoop, RunningDepId, ASId,
		ShouldCheckWrap, /*Assume=*/true)) {
		CanDoAliasSetRT = false;
		break;
		}
		}

		CanDoRT &= CanDoAliasSetRT;
		NeedRTCheck |= NeedsAliasSetRTCheck;
++ASId;		++ASId;
}		}

// If the pointers that we would use for the bounds comparison have different		// If the pointers that we would use for the bounds comparison have different
// address spaces, assume the values aren't directly comparable, so we can't		// address spaces, assume the values aren't directly comparable, so we can't
// use them for the runtime check. We also have to assume they could		// use them for the runtime check. We also have to assume they could
// overlap. In the future there should be metadata for whether address spaces		// overlap. In the future there should be metadata for whether address spaces
// are disjoint.		// are disjoint.
(10 lines hidden)	for (unsigned j = i + 1; j < NumPointers; ++j) {

Value *PtrI = RtCheck.Pointers[i].PointerValue;		Value *PtrI = RtCheck.Pointers[i].PointerValue;
Value *PtrJ = RtCheck.Pointers[j].PointerValue;		Value *PtrJ = RtCheck.Pointers[j].PointerValue;

unsigned ASi = PtrI->getType()->getPointerAddressSpace();		unsigned ASi = PtrI->getType()->getPointerAddressSpace();
unsigned ASj = PtrJ->getType()->getPointerAddressSpace();		unsigned ASj = PtrJ->getType()->getPointerAddressSpace();
if (ASi != ASj) {		if (ASi != ASj) {
DEBUG(dbgs() << "LAA: Runtime check would require comparison between"		DEBUG(dbgs() << "LAA: Runtime check would require comparison between"
" different address spaces\n");		" different address spaces\n");
return false;		return false;
}		}
}		}
}		}

if (NeedRTCheck && CanDoRT)		if (NeedRTCheck && CanDoRT)
RtCheck.generateChecks(DepCands, IsDepCheckNeeded);		RtCheck.generateChecks(DepCands, IsDepCheckNeeded);

(1,457 lines hidden)

test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll

This file was added.

				; RUN: opt -basicaa -loop-accesses -analyze < %s | FileCheck %s

				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

				; i and i + 1 can overflow in the following kernel:
				; void test1(unsigned long long x, int *a, int *b) {
				; for (unsigned i = 0; i < x; ++i)
				; b[i] = a[i+1] + 1;
				; }
				;
				; If accesses to a and b can alias, we need to emit a run-time alias check
				; between accesses to a and b. However, when i and i + 1 can wrap, their
				; SCEV expression is not an AddRec. We need to create SCEV predicates and
				; coerce the expressions to AddRecs in order to be able to emit the run-time
				; alias check.
				;
				; The accesses at b[i] and a[i+1] correspond to the addresses %arrayidx and
				; %arrayidx4 in the test. The SCEV expressions for these are:
				; ((4 * (zext i32 {1,+,1}<%for.body> to i64))<nuw><nsw> + %a)<nsw>
				; ((4 * (zext i32 {0,+,1}<%for.body> to i64))<nuw><nsw> + %b)<nsw>
				Ayal: Why sext's and not zext's?
				;
				; The transformed expressions are:
				Ayal: "We" >> "The". Hopefully the transformed expressions are i32 {0,+,1} and i32 {1,+,1} based on the added flags, right?
				sbaranga (author): i64 should be correct. We're essentially sinking the sext to get an i64 linear expression.
				; i64 {(4 + %a),+,4}<%for.body>
				; i64 {(4 + %b),+,4}<%for.body>

				Ayal: The second one should be: ; i64 {(%b),+,4}<%for.body>
				; CHECK-LABEL: test1
				; CHECK: Memory dependences are safe with run-time checks
				; CHECK-NEXT: Dependences:
				; CHECK-NEXT: Run-time memory checks:
				; CHECK-NEXT: Check 0:
				; CHECK-NEXT: Comparing group
				; CHECK-NEXT: %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				; CHECK-NEXT: Against group
				; CHECK-NEXT: %arrayidx4 = getelementptr inbounds i32, i32* %b, i64 %conv11
				; CHECK-NEXT: Grouped accesses:
				; CHECK-NEXT: Group
				; CHECK-NEXT: (Low: (4 + %a) High: (4 + (4 * (1 umax %x)) + %a))
				; CHECK-NEXT: Member: {(4 + %a),+,4}<%for.body>
				; CHECK-NEXT: Group
				; CHECK-NEXT: (Low: %b High: ((4 * (1 umax %x)) + %b))
				; CHECK-NEXT: Member: {%b,+,4}<%for.body>
				; CHECK: Store to invariant address was not found in loop.
				; CHECK-NEXT: SCEV assumptions:
				; CHECK-NEXT: {1,+,1}<%for.body> Added Flags: <nusw>
				; CHECK-NEXT: {0,+,1}<%for.body> Added Flags: <nusw>
				; CHECK: Expressions re-written:
				; CHECK-NEXT: [PSE] %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom:
				; CHECK-NEXT: ((4 * (zext i32 {1,+,1}<%for.body> to i64))<nuw><nsw> + %a)<nsw>
				; CHECK-NEXT: --> {(4 + %a),+,4}<%for.body>
				; CHECK-NEXT: [PSE] %arrayidx4 = getelementptr inbounds i32, i32* %b, i64 %conv11:
				; CHECK-NEXT: ((4 * (zext i32 {0,+,1}<%for.body> to i64))<nuw><nsw> + %b)<nsw>
				; CHECK-NEXT: --> {%b,+,4}<%for.body>
				define void @test1(i64 %x, i32* %a, i32* %b) {
				entry:
				br label %for.body

				Ayal: It doesn't really matter if we zext or sext here, right? Or should zext indicate the type of idx was originally unsigned?
				sbaranga (author): Correct (for both).
				for.body: ; preds = %for.body.preheader, %for.body
				%conv11 = phi i64 [ %conv, %for.body ], [ 0, %entry ]
				%i.010 = phi i32 [ %add, %for.body ], [ 0, %entry ]
				%add = add i32 %i.010, 1
				%idxprom = zext i32 %add to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				%ld = load i32, i32* %arrayidx, align 4
				%add2 = add nsw i32 %ld, 1
				%arrayidx4 = getelementptr inbounds i32, i32* %b, i64 %conv11
				store i32 %add2, i32* %arrayidx4, align 4
				%conv = zext i32 %add to i64
				%cmp = icmp ult i64 %conv, %x
				br i1 %cmp, label %for.body, label %exit

				exit:
				ret void
				}
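
As an illustration of why the index expressions in test1 are not AddRecs without the added predicates, here is a minimal Python sketch. It is not part of the patch, and the function names are hypothetical. It models the 32-bit index with explicit wraparound and compares (zext i32 {1,+,1} to i64) against the i64 recurrence {1,+,1}. The two agree only while the i32 add does not wrap, which is the condition the <nusw> predicate asserts.

```python
MASK32 = (1 << 32) - 1  # i32 arithmetic wraps modulo 2**32


def zext_of_i32_addrec(start, n):
    """Value of (zext i32 {start,+,1} to i64) at iteration n.

    The add happens in i32, so it may wrap before the zero extension.
    """
    return (start + n) & MASK32


def i64_addrec(start, n):
    """Value of the i64 recurrence {start,+,1} at iteration n.

    The iteration counts used here are far below 2**64, so no wrap occurs.
    """
    return start + n


def byte_offset(index):
    """Offset of an i32 element, matching the (4 * index) term of the pointer SCEV."""
    return 4 * index


last_ok = (1 << 32) - 2    # the a[i+1] index reaches 2**32 - 1, still representable
first_bad = (1 << 32) - 1  # the i32 add wraps to 0 on this iteration

# Before the wrap, both forms give the same offset from %a.
same = byte_offset(zext_of_i32_addrec(1, last_ok)) == byte_offset(i64_addrec(1, last_ok))
# After the wrap, the original expression jumps back to offset 0 while the affine form keeps growing.
differ = byte_offset(zext_of_i32_addrec(1, first_bad)) != byte_offset(i64_addrec(1, first_bad))
```

Under the predicate, the pointer expression can be treated as {(4 + %a),+,4}, which is the form the run-time check bounds in the CHECK lines are computed from.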

				; i can overflow in the following kernel:
				; void test2(unsigned long long x, int *a) {
				Ayal: ove[r]flow
				; for (unsigned i = 0; i < x; ++i)
				; a[i] = a[i] + 1;
				; }
				;
				; We need to check that i doesn't wrap, but we don't need a run-time alias
				; check. We also need an extra no-wrap check to get the backedge taken count.

				; CHECK-LABEL: test2
				; CHECK: Memory dependences are safe
				; CHECK: SCEV assumptions:
				; CHECK-NEXT: {1,+,1}<%for.body> Added Flags: <nusw>
				; CHECK-NEXT: {0,+,1}<%for.body> Added Flags: <nusw>
				define void @test2(i64 %x, i32* %a) {
				entry:
				br label %for.body

				for.body:
				%conv11 = phi i64 [ %conv, %for.body ], [ 0, %entry ]
				%i.010 = phi i32 [ %inc, %for.body ], [ 0, %entry ]
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %conv11
				%ld = load i32, i32* %arrayidx, align 4
				%add = add nsw i32 %ld, 1
				store i32 %add, i32* %arrayidx, align 4
				%inc = add i32 %i.010, 1
				%conv = zext i32 %inc to i64
				%cmp = icmp ult i64 %conv, %x
				br i1 %cmp, label %for.body, label %exit

				exit:
				ret void
				}
				Ayal: See comments above.
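
The test2 comment says an extra no-wrap check is needed to get the backedge-taken count. The following Python sketch illustrates this by simulating the loop shape of the IR, where the body runs at least once and exits when zext(inc) is no longer below x. The helper name, the `bits` parameter and the `cap` parameter are assumptions made for illustration. The counter is scaled down to 8 bits so the wrapping case is cheap to simulate.

```python
def trip_count(x, bits=32, cap=None):
    """Count body executions of the do-while loop used in the test.

    Each iteration computes inc = (i + 1) mod 2**bits and continues while
    zext(inc) < x. Returns None if the loop is still running after `cap`
    iterations, which stands in for a loop that never terminates.
    """
    mask = (1 << bits) - 1
    i = 0
    count = 0
    while True:
        count += 1
        inc = (i + 1) & mask  # narrow add, may wrap
        if not inc < x:       # exit test on the zero-extended value
            return count
        i = inc
        if cap is not None and count >= cap:
            return None


# Within the counter's range, the trip count is max(1, x).
small = trip_count(5, bits=8)
# Once x exceeds the largest counter value, the counter wraps and the exit test never fails.
wrapped = trip_count(256, bits=8, cap=10000)
```

When the counter cannot wrap, the count matches the (1 umax %x) term that appears in the test1 CHECK lines. When x exceeds the counter's range, no finite count exists, so the count is only valid under the no-wrap assumption.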