This is an archive of the discontinued LLVM Phabricator instance.

LVI: Add a per-value worklist limit to LazyValueInfo.
ClosedPublic

Authored by dberlin on Feb 8 2017, 6:00 AM.

Details

Summary

LVI is now depth first, which is the optimal iteration strategy in
terms of work per call. However, the way the results get cached means
it can still go badly N^2 or worse. The overdefined cache is
per-block, because LVI tries to get different results for the same
name in different blocks (i.e., to solve the problem PredicateInfo
solves). This means that even if we discover a value is overdefined
after going very deep, that information is not cached globally, so we
end up trying to rediscover it again and again. The same is true for
the values along the way. In practice, overdefined anywhere should
mean overdefined everywhere (this is how, for example, SCCP works).
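
To make the caching issue concrete, here is a minimal, self-contained
sketch (not the actual LazyValueInfo code; the types, names, and
numbers are hypothetical) of why a cache keyed per block keeps
rediscovering an overdefined value that a single global cache would
remember:

    #include <cstdio>
    #include <map>
    #include <set>

    // Hypothetical stand-ins for LLVM's BasicBlock and Value.
    using Block = int;
    using Value = int;

    // Per-block overdefined cache: a value has to be marked in every
    // block it is queried in before queries stop doing work.
    std::map<Block, std::set<Value>> PerBlockOverdefined;

    // Global overdefined cache: once a value is known overdefined
    // anywhere, no block ever re-walks the CFG for it.
    std::set<Value> GlobalOverdefined;

    // Pretend "solve" that walks Depth predecessor blocks; returns the
    // amount of work done.
    int solvePerBlock(Value V, Block BB, int Depth) {
      if (PerBlockOverdefined[BB].count(V))
        return 0;                        // cached, but only for this block
      PerBlockOverdefined[BB].insert(V); // only the queried block learns it
      return Depth;
    }

    int solveGlobal(Value V, Block BB, int Depth) {
      if (GlobalOverdefined.count(V))
        return 0;                        // cached for every block
      GlobalOverdefined.insert(V);
      return Depth;
    }

    int main() {
      int PerBlockWork = 0, GlobalWork = 0;
      // Query the same value from 100 different blocks, each walk 1000 deep.
      for (Block BB = 0; BB < 100; ++BB) {
        PerBlockWork += solvePerBlock(/*V=*/0, BB, /*Depth=*/1000);
        GlobalWork += solveGlobal(/*V=*/0, BB, /*Depth=*/1000);
      }
      std::printf("per-block: %d units of work, global: %d\n",
                  PerBlockWork, GlobalWork); // prints 100000 vs 1000
    }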

Until we get around to reworking the overdefined cache, we need to
limit the size of the worklist we process. Note that permanently
reverting the DFS exploration strategy seems like the wrong move
(reverting temporarily would be fine if we really want to). BFS is
clearly the wrong approach; it just gets luckier on some testcases.
It is also very hard to design an effective throttle for BFS. For
DFS, the throttle is directly related to the depth of the CFG, so
really deep CFGs get cut off and smaller ones do not. As the CFG
simplifies, you get better results. For BFS, the limit is related to
the fan-out times the average block size, which is much harder to
reason about or make good choices for.
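
The following is a minimal sketch of the kind of per-value throttle
being described, not the actual LazyValueInfo implementation; the
limit of 500 matches the figure discussed in the review below, but
the data structures and names here are simplified stand-ins:

    #include <cstdio>
    #include <set>
    #include <utility>
    #include <vector>

    using Block = int;
    using Value = int;

    // Hypothetical per-value limit (the ~500 figure discussed below).
    static const unsigned MaxProcessedPerValue = 500;

    struct Solver {
      std::vector<std::pair<Block, Value>> BlockValueStack; // DFS worklist
      std::set<std::pair<Block, Value>> OverDefinedCache;
      std::set<std::pair<Block, Value>> Solved; // non-overdefined results

      bool hasCachedResult(Block BB, Value V) {
        return Solved.count({BB, V}) || OverDefinedCache.count({BB, V});
      }

      // Stand-in for the real transfer function: pretend each block depends
      // only on its single predecessor (BB - 1) until block 0 is reached.
      bool solveBlockValue(Value V, Block BB) {
        if (BB > 0 && !hasCachedResult(BB - 1, V)) {
          BlockValueStack.push_back({BB - 1, V});
          return false; // dependency pushed; this entry isn't solved yet
        }
        Solved.insert({BB, V});
        return true;
      }

      void solve() {
        // Remember what was on the stack when the query started, so that on
        // cutoff only those original entries get force-marked overdefined.
        std::vector<std::pair<Block, Value>> StartingStack = BlockValueStack;
        unsigned Processed = 0;
        while (!BlockValueStack.empty()) {
          if (Processed++ > MaxProcessedPerValue) {
            // Cutoff: mark the original query values overdefined and give
            // up, rather than exploring an arbitrarily deep CFG.
            for (const auto &Entry : StartingStack)
              OverDefinedCache.insert(Entry);
            BlockValueStack.clear();
            return;
          }
          std::pair<Block, Value> Item = BlockValueStack.back();
          if (solveBlockValue(Item.second, Item.first))
            BlockValueStack.pop_back(); // fully solved, pop it
          // otherwise its dependency was pushed; keep iterating
        }
      }
    };

    int main() {
      Solver S;
      S.BlockValueStack.push_back({/*Block=*/1000, /*Value=*/0}); // deep query
      S.solve();
      std::printf("entries force-marked overdefined: %zu\n",
                  S.OverDefinedCache.size()); // 1: the original query entry
    }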

A bug is being filed about the overdefined cache, but fixing it will
require major surgery (plumbing PredicateInfo through CVP or LVI).

Note: I did not make this number configurable because I'm not sure
anyone really needs to tweak this knob. We run CVP 3 times. On the
testcases I have, the slow cases happen in the middle run, where CVP
is doing cleanup work that other passes are also effective at. Over
the course of the 3 runs, we don't seem to have any real loss of
performance.

I haven't gotten a minimized testcase yet, but imagine a testcase
where, going *up* the CFG, you have branches, one of which leads
50000 blocks deep, and the other to something where the answer is
overdefined immediately. BFS would discover the overdefined answer
faster than DFS, but would do more work to do so. In practice, the
right answer is "once DFS discovers overdefined for a value, stop
trying to get more info about that value" (and so DFS would normally
cache the overdefined results for every value it passed through in
those 50k blocks and never do that work again, but it doesn't,
because of the naming problem).
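
As a rough illustration of that CFG shape (a hypothetical example,
shrunk to a handful of blocks instead of 50000), consider a query for
the range of x at the final comparison: one predecessor path walks
back through a long chain of blocks, while on the other path x comes
straight from an opaque call and is overdefined immediately:

    extern bool cond(int);   // assumed external predicates
    extern int opaque();     // assumed external; its result is overdefined

    int pathological(int n) {
      int x = 0;
      if (cond(n)) {
        // Deep path: each branch adds more blocks between the definitions
        // of x and its use, so DFS walks a long predecessor chain. Imagine
        // this chain being 50000 blocks deep.
        if (cond(n + 1))      x = 1;
        else if (cond(n + 2)) x = 2;
        else if (cond(n + 3)) x = 3;
        else                  x = 4;
      } else {
        // Shallow path: x is overdefined as soon as LVI looks at it.
        x = opaque();
      }
      return x == 12345 ? 1 : 0; // the query point for x's range
    }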

Diff Detail

Repository
rL LLVM

Event Timeline

dberlin created this revision. Feb 8 2017, 6:00 AM
djasper added inline comments. Feb 8 2017, 6:08 AM
lib/Analysis/LazyValueInfo.cpp
659 ↗(On Diff #87641)

Wouldn't this potentially lead to adding Blocks to the cache where we haven't done any search (i.e., ones that got added after 499 processed values)? Is that a problem?

dberlin updated this revision to Diff 87646. Feb 8 2017, 6:33 AM

Add comments and make it fill in only the original stack values.

djasper accepted this revision. Feb 8 2017, 6:40 AM

lg

lib/Analysis/LazyValueInfo.cpp
661 ↗(On Diff #87646)

nit: naming-related.

668 ↗(On Diff #87646)

Have you intentionally changed this from DEBUG(dbgs() << ..)?

This revision is now accepted and ready to land. Feb 8 2017, 6:40 AM
dberlin marked 3 inline comments as done. Feb 8 2017, 6:54 AM
dberlin added inline comments.
lib/Analysis/LazyValueInfo.cpp
668 ↗(On Diff #87646)

No, sorry, that was just from the pastebin'ing I was doing with you offline. I'll put it back.

This revision was automatically updated to reflect the committed changes.
vitalybuka added inline comments.
llvm/trunk/lib/Analysis/LazyValueInfo.cpp
696

MSAN reported an invalid memory access, which was fixed with https://reviews.llvm.org/rL294572.