This is an archive of the discontinued LLVM Phabricator instance.

[BasicAliasAnalysis] Allow isAddOfNonZero() for values coming from the same loop iteration.
ClosedPublic

Authored by Farhana on Jun 21 2017, 2:43 PM.

Details

Reviewers

• dberlin
hfinkel
craig.topper

Commits

rG2ff973f2a5ad: Avoid doing conservative phi checks in aliasSameBasePointerGEPs() if no phis…
rL307581: Avoid doing conservative phi checks in aliasSameBasePointerGEPs() if no phis…

Summary

This is a fix for the case in PR33549.

The current basic alias analysis is very conservative when it analyzes two GEP indices with the same base pointer. This patch relaxes that a bit. It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet.
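To make the same-iteration assumption concrete, here is a minimal sketch in plain C++. It is a hypothetical model, not the PR33549 testcase and not LLVM code: idx1 stands for a loop phi `i` and idx2 for `i + 1`, and both function names are invented for illustration. Within one iteration the two indices can never be equal, but an index taken from a later iteration can equal the other index taken from an earlier one, which is why the relationship may only be used when both indices refer to the same copy of the phi.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model: idx1 stands for a loop phi `i`, idx2 for `i + 1`.
// Same-iteration comparison: both indices read the same copy of `i`.
bool neverEqualWithinIteration(std::size_t tripCount) {
  for (std::size_t i = 0; i < tripCount; ++i) {
    std::size_t idx1 = i;
    std::size_t idx2 = i + 1;
    if (idx1 == idx2)
      return false;
  }
  return true;
}

// Cross-iteration comparison: idx1 is taken from iteration `later`, idx2 from
// iteration `earlier`, so the two indices read different copies of `i`.
bool canBeEqualAcrossIterations(std::size_t tripCount) {
  for (std::size_t earlier = 0; earlier < tripCount; ++earlier)
    for (std::size_t later = earlier + 1; later < tripCount; ++later)
      if (later == earlier + 1) // idx1 == idx2 for this pair of iterations
        return true;
  return false;
}
```

With a trip count of at least two, the first function reports that the indices never collide within an iteration, while the second finds a collision across iterations.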

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Farhana created this revision. Jun 21 2017, 2:43 PM

Farhana updated this revision to Diff 103471. Jun 21 2017, 3:04 PM

Farhana added reviewers: dberlin, hfinkel, craig.topper.

"It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet."

Can you prove that this is actually true for all codepaths through here?

I'm still of the mind the right answer is to rip more of this code out, and let scev-aa handle base + loop variant ptr, rather than try to have a half-working implementation of loop analysis in basicaa.
SCEV was made for this. It knows what it is doing, and is used widely enough that it is likely to be correct.

This cycle handling is also the most expensive part of basicaa. I have testcases where basicaa takes as long to answer simple intraprocedural queries as cfl-steens takes to do whole-module interprocedural analysis :)

In D34478#787319, @dberlin wrote:

"It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet."

Can you prove that this is actually true for all codepaths through here?

I agree that my comment about loop iteration is misleading.

In order to call isAddOfNonZero() on index1 and index2 and check whether index2 equals index1 + x (and vice versa), we need to prove that we are working on the same copy of index1, which means the instance used in index1 is the same as the one used in index2.
If we haven't visited any phis yet, we can assume that part of the code is flat and we have the same copy, right?

I'm still of the mind the right answer is to rip more of this code out, and let scev-aa handle base + loop variant ptr, rather than try to have a half-working implementation of loop analysis in basicaa.
SCEV was made for this. It knows what it is doing, and is used widely enough that it is likely to be correct.

This cycle handling is also the most expensive part of basicaa. I have testcases where basicaa takes as long to answer simple intraprocedural queries as cfl-steens takes to do whole-module interprocedural analysis :)

I totally agree with you. I was just thinking we might want a temporary solution (specifically a cheap one) while we are waiting on that.

dmgreen added a subscriber: dmgreen. Jun 22 2017, 2:38 AM

In D34478#787355, @Farhana wrote:

In D34478#787319, @dberlin wrote:

"It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet."

Can you prove that this is actually true for all codepaths through here?

I agree that my comment about loop iteration is misleading.

In order to call isAddOfNonZero() on index1 and index2 and check whether index2 equals index1 + x (and vice versa), we need to prove that we are working on the same copy of index1, which means the instance used in index1 is the same as the one used in index2.
If we haven't visited any phis yet, we can assume that part of the code is flat and we have the same copy, right?

This is precisely my question: Are you guaranteed that all codepaths end up here only in that situation?

I'm still of the mind the right answer is to rip more of this code out, and let scev-aa handle base + loop variant ptr, rather than try to have a half-working implementation of loop analysis in basicaa.
SCEV was made for this. It knows what it is doing, and is used widely enough that it is likely to be correct.

This cycle handling is also the most expensive part of basicaa. I have testcases where basicaa takes as long to answer simple intraprocedural queries as cfl-steens takes to do whole-module interprocedural analysis :)

I totally agree with you. I was just thinking we might want a temporary solution (specifically a cheap one) while we are waiting on that.

With no offense meant (and i'm happy to accept the patch if we can prove it right): This is precisely how BasicAA got into this situation.
We add stuff thinking "well, it's probably right", and then it turns out not to be ;)

So i want to be super careful that we can prove to ourselves that this is right.

In D34478#789005, @dberlin wrote:

In D34478#787355, @Farhana wrote:

In D34478#787319, @dberlin wrote:

"It makes the assumption that the values(GEP indices) are from the same iteration if the analysis has not visited any phi's yet."

Can you prove that this is actually true for all codepaths through here?

I agree that my comment about loop iteration is misleading.

In order to call isAddOfNonZero() on index1 and index2 and check whether index2 equals index1 + x (and vice versa), we need to prove that we are working on the same copy of index1, which means the instance used in index1 is the same as the one used in index2.
If we haven't visited any phis yet, we can assume that part of the code is flat and we have the same copy, right?

This is precisely my question: Are you guaranteed that all codepaths end up here only in that situation?

Sorry, I misunderstood your question earlier.

Yes, it is guaranteed that BasicAA hasn't done any phi translation so far for these two GEP indices when the code path reaches that point. Not only has phi translation not happened for these two indices, it also hasn't happened for any of their parent expressions, starting from the two root memory locations.

More Details:

After each alias query for a pair of memory locations, BasicAA always clears VisitedPhiBBs. Checking only whether VisitedPhiBBs is empty still leaves the analysis pretty conservative, since we don't check whether the phi translation has anything to do with these two GEP indices, but it's probably good enough for a short-term solution. There is a tradeoff: checking for the GEP indices during phi translation would cost more compile time.

AliasResult BasicAAResult::alias(..) {
  ...
  VisitedPhiBBs.clear();
  return Alias;
}

alias():
VisitedPhiBBs = empty

    MemLoc1
     /  \
  base  idx1
          |
         phi

    MemLoc2
     /  \
  base  idx2
        /  \
     idx1   N
       |
      phi

(No phis have been translated so far. Therefore, idx1 in MemLoc2 and idx1 in MemLoc1 have the same live-in value.)
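The guard described above can be sketched as a small standalone C++ model. This is only an illustration under stated assumptions: `IndexInfo`, `Strategy`, and `pickStrategy` are hypothetical names, and a `std::set<int>` of block ids stands in for VisitedPhiBBs. It shows the decision only: fall back to the conservative known-bits-only reasoning when a phi index is involved and some phi has already been translated, otherwise allow the relational check between the two indices.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for a GEP index; not an LLVM type.
struct IndexInfo {
  bool isPhi; // true if the index value is itself a phi node
};

// Which kind of reasoning the analysis may use for the two last indices.
enum class Strategy { KnownBitsOnly, RelationalCheck };

// visitedPhiBBs models the set of phi blocks translated during this query
// (block ids are plain ints here, an assumption made for the sketch).
Strategy pickStrategy(const IndexInfo &idx1, const IndexInfo &idx2,
                      const std::set<int> &visitedPhiBBs) {
  bool anyPhi = idx1.isPhi || idx2.isPhi;
  // Be conservative only if a phi index is involved AND a phi was already
  // translated, i.e. the indices may come from different iterations.
  if (anyPhi && !visitedPhiBBs.empty())
    return Strategy::KnownBitsOnly;
  return Strategy::RelationalCheck;
}
```

In the diagram's situation (idx1 is a phi, nothing translated yet) the model picks the relational check; once any block is in the visited set, it falls back to known bits only.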

I'm still of the mind the right answer is to rip more of this code out, and let scev-aa handle base + loop variant ptr, rather than try to have a half-working implementation of loop analysis in basicaa.
SCEV was made for this. It knows what it is doing, and is used widely enough that it is likely to be correct.

This cycle handling is also the most expensive part of basicaa. I have testcases where basicaa takes as long to answer simple intraprocedural queries as cfl-steens takes to do whole-module interprocedural analysis :)

I totally agree with you. I was just thinking we might want a temporary solution (specifically a cheap one) while we are waiting on that.

With no offense meant (and i'm happy to accept the patch if we can prove it right): This is precisely how BasicAA got into this situation.
We add stuff thinking "well, it's probably right", and then it turns out not to be ;)

So i want to be super careful that we can prove to ourselves that this is right.

I understand your concern :).

Hi Daniel,

Any news on this?

Farhana

I'm fine with this for the moment, but, as you know, i think we gotta stop thinking "bandaids" :)

This revision is now accepted and ready to land. Jul 5 2017, 9:38 AM

Farhana added a subscriber: llvm-commits. Jul 10 2017, 1:07 PM

Closed by commit rL307581: Avoid doing conservative phi checks in aliasSameBasePointerGEPs() if no phis… (authored by faaleen). Jul 10 2017, 1:16 PM

This revision was automatically updated to reflect the committed changes.

Reverted in r307613. This caused miscompilation.
See http://bb.pgr.jp/builders/bootstrap-clang-libcxx-lld-i686-linux/builds/172
(Note, reproduced locally w/o modules)

This revision is now accepted and ready to land. Jul 10 2017, 8:37 PM

Closed by commit rG2ff973f2a5ad: Avoid doing conservative phi checks in aliasSameBasePointerGEPs() if no phis… (authored by Farhana). Oct 7 2019, 5:49 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Oct 7 2019, 5:49 AM

Herald added a subscriber: hiraditya.

Revision Contents

Path                                               Size
llvm/include/llvm/Analysis/BasicAliasAnalysis.h    6 lines
llvm/lib/Analysis/BasicAliasAnalysis.cpp           18 lines
llvm/lib/Analysis/ValueTracking.cpp                2 lines

Diff 223534

llvm/include/llvm/Analysis/BasicAliasAnalysis.h

@@ ... @@ private:
   void GetIndexDifference(SmallVectorImpl<VariableGEPIndex> &Dest,
                           const SmallVectorImpl<VariableGEPIndex> &Src);
 
   AliasResult aliasGEP(const GEPOperator *V1, uint64_t V1Size,
                        const AAMDNodes &V1AAInfo, const Value *V2,
                        uint64_t V2Size, const AAMDNodes &V2AAInfo,
                        const Value *UnderlyingV1, const Value *UnderlyingV2);
 
+  AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1,
+                                       uint64_t V1Size,
+                                       const GEPOperator *GEP2,
+                                       uint64_t V2Size,
+                                       const DataLayout &DL);
+
   AliasResult aliasPHI(const PHINode *PN, uint64_t PNSize,
                        const AAMDNodes &PNAAInfo, const Value *V2,
                        uint64_t V2Size, const AAMDNodes &V2AAInfo,
                        const Value *UnderV2);
 
   AliasResult aliasSelect(const SelectInst *SI, uint64_t SISize,
                           const AAMDNodes &SIAAInfo, const Value *V2,
                           uint64_t V2Size, const AAMDNodes &V2AAInfo,

llvm/lib/Analysis/BasicAliasAnalysis.cpp

@@ ... @@ if (isIntrinsicCall(CS2, Intrinsic::experimental_guard))
     return getModRefBehavior(CS1) & MRI_Mod ? MRI_Mod : MRI_NoModRef;
 
   // The AAResultBase base class has some smarts, lets use them.
   return AAResultBase::getModRefInfo(CS1, CS2);
 }
 
 /// Provide ad-hoc rules to disambiguate accesses through two GEP operators,
 /// both having the exact same pointer operand.
-static AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1,
-                                            uint64_t V1Size,
-                                            const GEPOperator *GEP2,
-                                            uint64_t V2Size,
-                                            const DataLayout &DL) {
+AliasResult BasicAAResult::aliasSameBasePointerGEPs(const GEPOperator *GEP1,
+                                                    uint64_t V1Size,
+                                                    const GEPOperator *GEP2,
+                                                    uint64_t V2Size,
+                                                    const DataLayout &DL) {
 
   assert(GEP1->getPointerOperand()->stripPointerCastsAndBarriers() ==
              GEP2->getPointerOperand()->stripPointerCastsAndBarriers() &&
          GEP1->getPointerOperandType() == GEP2->getPointerOperandType() &&
          "Expected GEPs with the same pointer operand");
 
   // Try to determine whether GEP1 and GEP2 index through arrays, into structs,
   // such that the struct field accesses provably cannot alias.
@@ ... @@ for (unsigned i = 0, e = GEP1->getNumIndices() - 1; i != e; ++i)
     if (GEP1->getOperand(i + 1) != GEP2->getOperand(i + 1))
       return MayAlias;
 
   // Now we know that the array/pointer that GEP1 indexes into and that
   // that GEP2 indexes into must either precisely overlap or be disjoint.
   // Because they cannot partially overlap and because fields in an array
   // cannot overlap, if we can prove the final indices are different between
   // GEP1 and GEP2, we can conclude GEP1 and GEP2 don't alias.
 
   // If the last indices are constants, we've already checked they don't
   // equal each other so we can exit early.
   if (C1 && C2)
     return NoAlias;
   {
     Value *GEP1LastIdx = GEP1->getOperand(GEP1->getNumOperands() - 1);
     Value *GEP2LastIdx = GEP2->getOperand(GEP2->getNumOperands() - 1);
-    if (isa<PHINode>(GEP1LastIdx) || isa<PHINode>(GEP2LastIdx)) {
+    if ((isa<PHINode>(GEP1LastIdx) || isa<PHINode>(GEP2LastIdx)) &&
+        !VisitedPhiBBs.empty()) {
       // If one of the indices is a PHI node, be safe and only use
       // computeKnownBits so we don't make any assumptions about the
       // relationships between the two indices. This is important if we're
       // asking about values from different loop iterations. See PR32314.
+      // But, with empty visitedPhiBBs we can guarantee that the values are
+      // from the same iteration. Therefore, we can avoid doing this
+      // conservative check.
       // TODO: We may be able to change the check so we only do this when
       // we definitely looked through a PHINode.
       if (GEP1LastIdx != GEP2LastIdx &&
           GEP1LastIdx->getType() == GEP2LastIdx->getType()) {
         KnownBits Known1 = computeKnownBits(GEP1LastIdx, DL);
         KnownBits Known2 = computeKnownBits(GEP2LastIdx, DL);
         if (Known1.Zero.intersects(Known2.One) ||
             Known1.One.intersects(Known2.Zero))
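The known-bits test at the end of the hunk above can be illustrated with a tiny standalone model. This is a sketch under assumptions, not LLVM's KnownBits class: `MiniKnownBits` and `provablyDifferent` are hypothetical names, and plain 64-bit masks replace APInt. Two values are provably different if some bit is known to be 0 in one and known to be 1 in the other; the check needs no assumption about how the two values relate to each other.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of a known-bits record (not llvm::KnownBits).
struct MiniKnownBits {
  std::uint64_t Zero; // mask of bits known to be 0
  std::uint64_t One;  // mask of bits known to be 1
};

// True if some bit is known 0 in one value and known 1 in the other,
// which means the two values cannot be equal.
bool provablyDifferent(const MiniKnownBits &A, const MiniKnownBits &B) {
  return (A.Zero & B.One) != 0 || (A.One & B.Zero) != 0;
}
```

For example, a value known to be even (low bit known 0) and a value known to be odd (low bit known 1) are provably different, while nothing can be concluded when one side has no known bits at all.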

llvm/lib/Analysis/ValueTracking.cpp

@@ ... @@ else if (match(V, m_Shr(m_Value(X), m_Value(Y)))) {
     // out are known to be zero, and X is known non-zero then at least one
     // non-zero bit must remain.
     if (ConstantInt *Shift = dyn_cast<ConstantInt>(Y)) {
       auto ShiftVal = Shift->getLimitedValue(BitWidth - 1);
       // Is there a known one in the portion not shifted out?
       if (Known.countMaxLeadingZeros() < BitWidth - ShiftVal)
         return true;
       // Are all the bits to be shifted out known zero?
-      if (Known.countMinTrailingZeros() >= ShiftVal)
+      if (Known.isUnknown() || Known.countMinTrailingZeros() >= ShiftVal)
         return isKnownNonZero(X, Depth, Q);
     }
   }
   // div exact can only produce a zero if the dividend is zero.
   else if (match(V, m_Exact(m_IDiv(m_Value(X), m_Value())))) {
     return isKnownNonZero(X, Depth, Q);
   }
   // X + Y.