This is an archive of the discontinued LLVM Phabricator instance.

[MemoryDependenceAnalysis] Support -analyze
Abandoned, Public

Authored by anemet on May 7 2015, 12:55 AM.

Details

Reviewers

reames
hfinkel

Summary

I was trying to debug a missed Load-PRE optimization (see the testcase)
and found that there was no convenient way to do this. This
adds -analyze support to memdep and a testcase which demonstrates a
trick to do this in case one wants non-local dependence analysis (like
GVN).

Admittedly, this is pretty non-local dep focused right now but it's
certainly better than what we have currently.

Event Timeline

anemet updated this revision to Diff 25140. May 7 2015, 12:55 AM

anemet retitled this revision to [MemoryDependenceAnalysis] Support -analyze.

anemet updated this object.

anemet edited the test plan for this revision.

anemet added reviewers: hfinkel, reames.

anemet added a subscriber: Unknown Object (MLST).

I believe that we have a MemDepPrinter class. How is this different?

Err ... it's not, thanks for pointing it out! I got misled by not seeing anything in the testsuite :(. I'll commit the test for -print-memdeps.

Any chance I could get you to either improve documentation or comments to save the next person the same mistake? Possibly the -analyze option could even be implemented in terms of MemDepPrinter or vice versa?

As an aside, relying on MemoryDependence to give you inter-iteration
access dependence is ... not a great idea.
I know load-pre currently relies on this + PHITranslation hacks[1]
(and I've made memoryssa do it so new load pre can do it), but PRE of loop
iterations like this is probably a problem that should be solved in a
better way.

[1] PHITransAddr's logic to do gep simplification by looking for
things that have the same value (it literally does local value
numbering), and propagating constants through additions that appear
in PHI translated operations is pretty much a hack, and if you want to
get more cases, we shouldn't be piling on hacks.
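
To make footnote [1] concrete, here is a minimal sketch of what PHI translation of an address means for the loop in the attached test. It is a toy model and not LLVM's PHITransAddr: the `Address` struct, the `Phi` map, and `phiTranslate` are hypothetical names, and only addresses of the form base + index are handled. Translating the load address `%A + %indvars.iv` across the back edge substitutes the PHI's incoming value and yields `%A + %indvars.iv.next`, which is the address the store writes.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an address of the form `base + index`.
// Real LLVM addresses are arbitrary Value expressions; this is a toy.
struct Address {
  std::string base;
  std::string index;
  bool operator==(const Address &O) const {
    return base == O.base && index == O.index;
  }
};

// A PHI node modeled as: predecessor block name -> incoming value name.
using Phi = std::map<std::string, std::string>;

// Rewrite Addr as it would look at the end of predecessor Pred.
// If the index is the PHI named PhiName, it is replaced by the value
// flowing in from Pred; otherwise the address is returned unchanged.
Address phiTranslate(const Address &Addr, const std::string &PhiName,
                     const Phi &P, const std::string &Pred) {
  Address Out = Addr;
  if (Addr.index == PhiName) {
    auto It = P.find(Pred);
    if (It != P.end())
      Out.index = It->second;
  }
  return Out;
}
```

With the PHI from the test (`0` from `entry`, `%indvars.iv.next` from `for.body`), translating the load's address into `for.body` gives the store's address, and translating into `entry` gives `%A + 0`. Anything beyond this direct substitution, such as simplifying GEPs or folding constants through adds, is where the logic criticized above comes in.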

In D9548#172166, @dberlin wrote:

(and I've made memoryssa do it so new load pre can do it), but PRE of loop
iterations like this is probably a problem that should be solved in a
better way.

Noted. I would also like this optimization to get improved by coupling it with memchecks. Probably a more loop-focused pass would work better.

In D9548#172121, @reames wrote:

Any chance I could get you to either improve documentation or comments to save the next person the same mistake? Possibly the -analyze option could even be implemented in terms of MemDepPrinter or vice versa?

That was my intention too by keeping the testcase (that's where I looked initially) but sure let me see how far I can push this.

Revision Contents

Path                                                        Size
include/llvm/Analysis/MemoryDependenceAnalysis.h            7 lines
lib/Analysis/MemoryDependenceAnalysis.cpp                   49 lines
test/Analysis/MemoryDependenceAnalysis/analyze-option.ll    43 lines

Diff 25140

include/llvm/Analysis/MemoryDependenceAnalysis.h

[157 lines not shown]	Instruction *getInst() const {
if (Value.getInt() == Other) return nullptr;		if (Value.getInt() == Other) return nullptr;
return Value.getPointer();		return Value.getPointer();
}		}

bool operator==(const MemDepResult &M) const { return Value == M.Value; }		bool operator==(const MemDepResult &M) const { return Value == M.Value; }
bool operator!=(const MemDepResult &M) const { return Value != M.Value; }		bool operator!=(const MemDepResult &M) const { return Value != M.Value; }
bool operator<(const MemDepResult &M) const { return Value < M.Value; }		bool operator<(const MemDepResult &M) const { return Value < M.Value; }
bool operator>(const MemDepResult &M) const { return Value > M.Value; }		bool operator>(const MemDepResult &M) const { return Value > M.Value; }

		void print(raw_ostream &OS, unsigned Depth = 0) const;

private:		private:
friend class MemoryDependenceAnalysis;		friend class MemoryDependenceAnalysis;
/// Dirty - Entries with this marker occur in a LocalDeps map or		/// Dirty - Entries with this marker occur in a LocalDeps map or
/// NonLocalDeps map when the instruction they previously referenced was		/// NonLocalDeps map when the instruction they previously referenced was
/// removed from MemDep. In either case, the entry may include an		/// removed from MemDep. In either case, the entry may include an
/// instruction pointer. If so, the pointer is an instruction in the		/// instruction pointer. If so, the pointer is an instruction in the
/// block where scanning can start from, saving some work.		/// block where scanning can start from, saving some work.
///		///
[147 lines not shown]	private:
// A reverse mapping from dependencies to the non-local dependees.		// A reverse mapping from dependencies to the non-local dependees.
ReverseDepMapType ReverseNonLocalDeps;		ReverseDepMapType ReverseNonLocalDeps;

/// Current AA implementation, just a cache.		/// Current AA implementation, just a cache.
AliasAnalysis *AA;		AliasAnalysis *AA;
DominatorTree *DT;		DominatorTree *DT;
AssumptionCache *AC;		AssumptionCache *AC;
std::unique_ptr<PredIteratorCache> PredCache;		std::unique_ptr<PredIteratorCache> PredCache;
		Function *Func;

public:		public:
MemoryDependenceAnalysis();		MemoryDependenceAnalysis();
~MemoryDependenceAnalysis();		~MemoryDependenceAnalysis();
static char ID;		static char ID;

/// Pass Implementation stuff. This doesn't do any analysis eagerly.		/// Pass Implementation stuff. This doesn't do any analysis eagerly.
bool runOnFunction(Function &) override;		bool runOnFunction(Function &) override;
[79 lines not shown]	public:
/// 2) safe for the target, and 3) would provide the specified memory		/// 2) safe for the target, and 3) would provide the specified memory
/// location value, then this function returns the size in bytes of the		/// location value, then this function returns the size in bytes of the
/// load width to use. If not, this returns zero.		/// load width to use. If not, this returns zero.
static unsigned getLoadLoadClobberFullWidthSize(const Value *MemLocBase,		static unsigned getLoadLoadClobberFullWidthSize(const Value *MemLocBase,
int64_t MemLocOffs,		int64_t MemLocOffs,
unsigned MemLocSize,		unsigned MemLocSize,
const LoadInst *LI);		const LoadInst *LI);

		/// \brief Print the result of the analysis when invoked with -analyze.
		void print(raw_ostream &OS, const Module *M = nullptr) const override;

private:		private:
MemDepResult getCallSiteDependencyFrom(CallSite C, bool isReadOnlyCall,		MemDepResult getCallSiteDependencyFrom(CallSite C, bool isReadOnlyCall,
BasicBlock::iterator ScanIt,		BasicBlock::iterator ScanIt,
BasicBlock *BB);		BasicBlock *BB);
bool getNonLocalPointerDepFromBB(Instruction *QueryInst,		bool getNonLocalPointerDepFromBB(Instruction *QueryInst,
const PHITransAddr &Pointer,		const PHITransAddr &Pointer,
const AliasAnalysis::Location &Loc,		const AliasAnalysis::Location &Loc,
bool isLoad, BasicBlock *BB,		bool isLoad, BasicBlock *BB,
[20 lines not shown]

lib/Analysis/MemoryDependenceAnalysis.cpp

[59 lines not shown]
INITIALIZE_PASS_BEGIN(MemoryDependenceAnalysis, "memdep",		INITIALIZE_PASS_BEGIN(MemoryDependenceAnalysis, "memdep",
"Memory Dependence Analysis", false, true)		"Memory Dependence Analysis", false, true)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_AG_DEPENDENCY(AliasAnalysis)		INITIALIZE_AG_DEPENDENCY(AliasAnalysis)
INITIALIZE_PASS_END(MemoryDependenceAnalysis, "memdep",		INITIALIZE_PASS_END(MemoryDependenceAnalysis, "memdep",
"Memory Dependence Analysis", false, true)		"Memory Dependence Analysis", false, true)

MemoryDependenceAnalysis::MemoryDependenceAnalysis()		MemoryDependenceAnalysis::MemoryDependenceAnalysis()
: FunctionPass(ID), PredCache() {		: FunctionPass(ID), PredCache(), Func(nullptr) {
initializeMemoryDependenceAnalysisPass(*PassRegistry::getPassRegistry());		initializeMemoryDependenceAnalysisPass(*PassRegistry::getPassRegistry());
}		}
MemoryDependenceAnalysis::~MemoryDependenceAnalysis() {		MemoryDependenceAnalysis::~MemoryDependenceAnalysis() {
}		}

/// Clean up memory in between runs		/// Clean up memory in between runs
void MemoryDependenceAnalysis::releaseMemory() {		void MemoryDependenceAnalysis::releaseMemory() {
LocalDeps.clear();		LocalDeps.clear();
[9 lines not shown]
///		///
void MemoryDependenceAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {		void MemoryDependenceAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequired<AssumptionCacheTracker>();		AU.addRequired<AssumptionCacheTracker>();
AU.addRequiredTransitive<AliasAnalysis>();		AU.addRequiredTransitive<AliasAnalysis>();
}		}

bool MemoryDependenceAnalysis::runOnFunction(Function &F) {		bool MemoryDependenceAnalysis::runOnFunction(Function &F) {
		Func = &F;
AA = &getAnalysis<AliasAnalysis>();		AA = &getAnalysis<AliasAnalysis>();
AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);		AC = &getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
DominatorTreeWrapperPass *DTWP =		DominatorTreeWrapperPass *DTWP =
getAnalysisIfAvailable<DominatorTreeWrapperPass>();		getAnalysisIfAvailable<DominatorTreeWrapperPass>();
DT = DTWP ? &DTWP->getDomTree() : nullptr;		DT = DTWP ? &DTWP->getDomTree() : nullptr;
if (!PredCache)		if (!PredCache)
PredCache.reset(new PredIteratorCache());		PredCache.reset(new PredIteratorCache());
return false;		return false;
[1,595 lines not shown]	for (ReverseNonLocalPtrDepTy::const_iterator

for (ValueIsLoadPair P : I->second)		for (ValueIsLoadPair P : I->second)
assert(P != ValueIsLoadPair(D, false) &&		assert(P != ValueIsLoadPair(D, false) &&
P != ValueIsLoadPair(D, true) &&		P != ValueIsLoadPair(D, true) &&
"Inst occurs in ReverseNonLocalPtrDeps map");		"Inst occurs in ReverseNonLocalPtrDeps map");
}		}
#endif		#endif
}		}

		void MemDepResult::print(raw_ostream &OS, unsigned Depth) const {
		const char *type;
		switch (Value.getInt()) {
		case Clobber: type = "Clobber:"; break;
		case Def: type = "Def:"; break;
		case Other:
		if (isNonLocal())
		type = "NonLocal";
		else if (isNonFuncLocal())
		type = "NonFuncLocal";
		else if (isUnknown())
		type = "Unknown";
		break;
		default:
		llvm_unreachable("unknown deptype");
		}

		OS.indent(Depth) << type;
		Instruction *Inst = getInst();
		if (Inst)
		OS << " " << *getInst();
		OS << "\n";
		}

		void MemoryDependenceAnalysis::print(raw_ostream &OS,
		const Module *M) const {
		MemoryDependenceAnalysis *MD = const_cast<MemoryDependenceAnalysis *>(this);

		for (auto &BB: *Func)
		for (auto &Inst : BB) {
		if (!isa<LoadInst>(&Inst) && !isa<StoreInst>(&Inst))
		continue;
		OS << Inst << "\n";
		MemDepResult Dep = MD->getDependency(&Inst);
		Dep.print(OS, 4);
		if (Dep.isNonLocal()) {
		SmallVector<NonLocalDepResult, 4> Deps;
		MD->getNonLocalPointerDependency(&Inst, Deps);
		for (const NonLocalDepResult &NonLocalDep : Deps) {
		OS.indent(6) << NonLocalDep.getBB()->getName() << ": ";
		NonLocalDep.getResult().print(OS);
		}
		}
		}
		}

test/Analysis/MemoryDependenceAnalysis/analyze-option.ll

This file was added.

				; GVN is used to ensure that domtree which is an optional analysis pass does
				; not get freed before memdep. Without domtree, memdep won't perform
				; non-local dependence analysis.

				; RUN: opt -analyze -basicaa -domtree -memdep -gvn < %s | FileCheck %s

				; Test that for this loop:
				;
				; for (unsigned i = 0; i < 100; i++)
				; A[i+1] = A[i] + B[i];
				;
				; memdep discovers the non-local dependence between the store of A[i+1] and the
				; load of A[i] in the subsequent iteration. This is necessary so that GVN
				; Load-PRE will promote this store-to-load forwarding case to a register.

				target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"

				define void @f(i8* noalias nocapture %A, i8* noalias nocapture readonly %B) {
				entry:
				br label %for.body

				; CHECK: Printing analysis 'Memory Dependence Analysis' for function 'f':
				; CHECK-NEXT: %load = load i8, i8* %arrayidx, align 1
				; CHECK-NEXT: NonLocal
				; CHECK-NEXT: for.body: Def: store i8 %add, i8* %arrayidx_next, align 1

				for.body: ; preds = %for.body, %entry
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%arrayidx = getelementptr inbounds i8, i8* %A, i64 %indvars.iv
				%load = load i8, i8* %arrayidx, align 1
				%arrayidx2 = getelementptr inbounds i8, i8* %B, i64 %indvars.iv
				%load_1 = load i8, i8* %arrayidx2, align 1
				%add = add i8 %load_1, %load
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%arrayidx_next = getelementptr inbounds i8, i8* %A, i64 %indvars.iv.next
				store i8 %add, i8* %arrayidx_next, align 1
				%exitcond = icmp eq i64 %indvars.iv.next, 100
				br i1 %exitcond, label %for.end, label %for.body

				for.end: ; preds = %for.body
				ret void
				}
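
For reference, the loop under test can be written out in C++ next to the form Load-PRE is expected to reach once memdep reports the loop-carried Def. This is a hand-written sketch and not compiler output: the function names are made up for illustration, and the two vectors are assumed to be distinct objects, mirroring the `noalias` arguments in the IR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The loop as written in the test: every iteration reloads A[i] from
// memory, although the previous iteration just stored that value.
// Precondition: A has at least one more element than B.
void recurrenceWithReload(std::vector<std::uint8_t> &A,
                          const std::vector<std::uint8_t> &B) {
  assert(A.size() >= B.size() + 1 && "A needs one more element than B");
  for (std::size_t i = 0; i < B.size(); ++i)
    A[i + 1] = static_cast<std::uint8_t>(A[i] + B[i]);
}

// Hand-forwarded variant: the stored value is kept in a local variable
// and reused as the next iteration's A[i], so A is loaded only once,
// before the loop. Only valid because A and B do not alias.
void recurrenceForwarded(std::vector<std::uint8_t> &A,
                         const std::vector<std::uint8_t> &B) {
  assert(A.size() >= B.size() + 1 && "A needs one more element than B");
  if (B.empty())
    return;
  std::uint8_t Carried = A[0]; // the single remaining load of A
  for (std::size_t i = 0; i < B.size(); ++i) {
    Carried = static_cast<std::uint8_t>(Carried + B[i]);
    A[i + 1] = Carried; // store stays; the reload of A[i] is gone
  }
}
```

Both functions should leave A in the same state for any non-aliasing inputs; the second one just makes the store-to-load forwarding across iterations explicit.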
