This is an archive of the discontinued LLVM Phabricator instance.

Differential D123586

[clang][dataflow] Weaken abstract comparison to enable loop termination.
Closed · Public

Authored by ymandel on Apr 12 2022, 4:33 AM.

Details

Reviewers

xazax.hun
sgatev

Commits

rGbbcf11f5af98: [clang][dataflow] Weaken abstract comparison to enable loop termination.

Summary

Currently, when the framework is used with an analysis that does not override
compareEquivalent, it does not terminate for most loops. The root cause is the
interaction of (the default implementation of) environment comparison
(compareEquivalent) and the means by which locations and values are
allocated. Specifically, the creation of certain values (including: reference
and pointer values; merged values) results in allocations of fresh locations in
the environment. As a result, analysis of even trivial loop bodies produces
different (if isomorphic) environments, on identical inputs. At the same time,
the default analysis relies on strict equality (versus some relaxed notion of
equivalence). Together, when the analysis compares these isomorphic, yet
unequal, environments, to determine whether the successors of the given block
need to be (re)processed, the result is invariably "yes", thus preventing loop
analysis from reaching a fixed point.
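
To make the failure mode concrete, here is a minimal toy sketch. It is not the framework's code: `ToyValue`, `ToyContext`, `ToyEnv`, `analyzeBody`, `strictlyEqual`, and `isomorphic` are hypothetical names invented for illustration. The only assumptions carried over from the summary are that each pass over the loop body allocates a fresh value and that the default comparison is by pointer identity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a framework value; its identity is its address.
struct ToyValue {
  bool Known;
};

// Hypothetical environment: maps a variable name to the value assigned to it.
using ToyEnv = std::map<std::string, const ToyValue *>;

// Owns every value ever allocated, playing the role of an analysis context.
struct ToyContext {
  std::vector<std::unique_ptr<ToyValue>> Values;
  const ToyValue *create(bool Known) {
    Values.push_back(std::unique_ptr<ToyValue>(new ToyValue{Known}));
    return Values.back().get();
  }
};

// "Analyzes" a trivial loop body: every pass allocates a fresh value for Bar.
ToyEnv analyzeBody(ToyContext &Ctx) {
  ToyEnv Env;
  Env["Bar"] = Ctx.create(/*Known=*/false);
  return Env;
}

// Strict comparison: equal only if both environments hold the same pointers.
bool strictlyEqual(const ToyEnv &A, const ToyEnv &B) { return A == B; }

// Relaxed comparison: same variables, and the pointed-to contents match.
bool isomorphic(const ToyEnv &A, const ToyEnv &B) {
  if (A.size() != B.size())
    return false;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It == B.end() || It->second->Known != Entry.second->Known)
      return false;
  }
  return true;
}
```

Two calls to `analyzeBody` on the same context yield environments that are `isomorphic` but never `strictlyEqual`, which is the situation that keeps the successors of the block on the worklist.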

There are many possible solutions to this problem, including a notion of
equivalence weaker than strict pointer equality (like structural equivalence)
and/or the introduction of an explicit widening operation. However, these
solutions will require care to be implemented correctly. While that work is a
high priority, it seems more urgent to fix the current default implementation so
that the analysis terminates. Therefore, this patch proposes, essentially, to
change the default comparison to trivially equate any two values. As a result, we
can say precisely that the analysis will process the loop exactly twice: once to
establish an initial result state and a second time to produce an updated result
which will (always) compare equal to the previous one. While this is clearly
unsound (we are not reaching a fixed point of the transfer function), in
practice this level of analysis will find many practical issues where a single
iteration of the loop impacts abstract program state.
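
The "exactly twice" claim can be illustrated with a toy fixpoint driver. This is a sketch under stated assumptions, not the framework's worklist algorithm: `ToyState`, `Comparator`, `runLoopAnalysis`, `pointerEqual`, and `alwaysEqual` are hypothetical names, the loop is modeled as a single block, and the transfer function does nothing except allocate a fresh value on every pass.

```cpp
#include <cassert>
#include <functional>
#include <memory>

// Hypothetical abstract state: a pointer to a freshly allocated value.
struct ToyState {
  std::shared_ptr<int> Val;
};

using Comparator = std::function<bool(const ToyState &, const ToyState &)>;

// Processes a single-block loop until the new state compares equivalent to the
// previous one, or until MaxIterations passes have run. Returns the number of
// passes performed, or MaxIterations + 1 if the cap was hit.
int runLoopAnalysis(const Comparator &Equivalent, int MaxIterations) {
  ToyState Previous;
  bool HavePrevious = false;
  for (int Pass = 1; Pass <= MaxIterations; ++Pass) {
    // The toy transfer function allocates a fresh value on every pass.
    ToyState Current{std::make_shared<int>(0)};
    if (HavePrevious && Equivalent(Previous, Current))
      return Pass;
    Previous = Current;
    HavePrevious = true;
  }
  return MaxIterations + 1;
}

// Strict pointer equality, as in the old default behavior.
bool pointerEqual(const ToyState &A, const ToyState &B) {
  return A.Val == B.Val;
}

// Trivial equivalence, mirroring a comparison that always returns true.
bool alwaysEqual(const ToyState &, const ToyState &) { return true; }
```

With `alwaysEqual`, `runLoopAnalysis` returns after the second pass; with `pointerEqual`, it runs until the iteration cap, since the previous state is still alive and the fresh allocation necessarily has a different address.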

Note, however, that the change to the default merge operation does not affect
soundness, because the framework already produces a fresh (sound) abstraction of
the value when the two values are distinct. The previous setting was overly
conservative.
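
A rough sketch of why returning true from the default merge is safe, under the assumption (taken from the header comment in the diff) that a true result means the merged value is kept in the joined environment. Everything here is hypothetical and greatly simplified: `Tri`, `ToyEnv`, `ToyModel`, `DroppingModel`, and `join` do not exist in the framework, and a three-valued enum stands in for real values.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical three-valued abstraction; Unknown approximates both others.
enum class Tri { False, True, Unknown };
using ToyEnv = std::map<std::string, Tri>;

// Hypothetical model hook: returning true keeps Merged in the joined result.
struct ToyModel {
  virtual ~ToyModel() = default;
  virtual bool merge(Tri, Tri, Tri &) { return true; } // new-style default
};

// Mirrors the old default: the merged value is discarded.
struct DroppingModel : ToyModel {
  bool merge(Tri, Tri, Tri &) override { return false; }
};

// Joins two environments over the variables they share.
ToyEnv join(const ToyEnv &A, const ToyEnv &B, ToyModel &Model) {
  ToyEnv Result;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It == B.end())
      continue;
    if (Entry.second == It->second) {
      Result[Entry.first] = Entry.second;
      continue;
    }
    // Distinct inputs: start from a fresh value that approximates both.
    Tri Merged = Tri::Unknown;
    if (Model.merge(Entry.second, It->second, Merged))
      Result[Entry.first] = Merged;
  }
  return Result;
}
```

In this toy, the fresh `Unknown` value already over-approximates both inputs, so keeping it loses nothing in soundness; dropping it only removes information about the variable.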

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

ymandel created this revision. Apr 12 2022, 4:33 AM

Herald added a project: Restricted Project. Apr 12 2022, 4:33 AM

Herald added subscribers: tschuett, steakhal, rnkovacs.

ymandel requested review of this revision. Apr 12 2022, 4:33 AM

Harbormaster completed remote builds in B159198: Diff 422177. Apr 12 2022, 5:01 AM

Yeah, this is a hard problem in general. This looks like a sensible workaround for the short term, but I'm looking forward to a better solution. I'm a bit worried that the memory model will need some upgrades to properly solve this problem.

This revision is now accepted and ready to land. Apr 12 2022, 4:16 PM

In D123586#3446956, @xazax.hun wrote:

Yeah, this is a hard problem in general. This looks like a sensible workaround for the short term, but I'm looking forward to a better solution. I'm a bit worried that the memory model will need some upgrades to properly solve this problem.

Thanks for the quick review! Yes, I have my concerns as well. It seems like some amount of a) additional allocation stabilization/memoization, b) introduction of explicit widening operator and c) structural comparison will fully solve the problem. Solving this properly is a high priority.
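
Of the three directions listed above, explicit widening (item b) is the easiest to illustrate in isolation. The following is a textbook interval-widening sketch, not a design proposed in this review: `Interval`, `widen`, `step`, and `iterateWithWidening` are hypothetical names, and the modeled program is assumed to be a loop that starts with `i = 0` and executes `i = i + 1`.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Hypothetical interval abstraction; the limits of long stand in for infinity.
struct Interval {
  long Lo;
  long Hi;
};
const long NegInf = std::numeric_limits<long>::min();
const long PosInf = std::numeric_limits<long>::max();

bool operator==(const Interval &A, const Interval &B) {
  return A.Lo == B.Lo && A.Hi == B.Hi;
}

// Classic widening: any bound that is still moving jumps straight to infinity.
Interval widen(const Interval &Prev, const Interval &Next) {
  return Interval{Next.Lo < Prev.Lo ? NegInf : Prev.Lo,
                  Next.Hi > Prev.Hi ? PosInf : Prev.Hi};
}

// Transfer for `i = i + 1`, joined with the loop-entry state i = 0.
Interval step(const Interval &In) {
  Interval Inc{In.Lo == NegInf ? NegInf : In.Lo + 1,
               In.Hi == PosInf ? PosInf : In.Hi + 1};
  return Interval{std::min(0L, Inc.Lo), std::max(0L, Inc.Hi)};
}

// Iterates to a fixed point; returns the number of passes performed, or
// MaxIterations + 1 if the cap was hit.
int iterateWithWidening(Interval &State, int MaxIterations) {
  for (int Pass = 1; Pass <= MaxIterations; ++Pass) {
    Interval Next = widen(State, step(State));
    if (Next == State)
      return Pass;
    State = Next;
  }
  return MaxIterations + 1;
}
```

Starting from `[0, 0]`, the first pass widens the upper bound to infinity and the second pass confirms that the state is stable. Unlike trivially equating all values, this reaches a genuine fixed point of the (widened) transfer function.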

In D123586#3449256, @ymandel wrote:

Thanks for the quick review! Yes, I have my concerns as well. It seems like some amount of a) additional allocation stabilization/memoization, b) introduction of explicit widening operator and c) structural comparison will fully solve the problem. Solving this properly is a high priority.

This is a complicated topic. If you have a plan I think it might be a good idea to share it on the forums just in case someone has some input before fully implementing it.

In D123586#3449291, @xazax.hun wrote:

This is a complicated topic. If you have a plan I think it might be a good idea to share it on the forums just in case someone has some input before fully implementing it.

Yes, definitely! At the least, I was hoping for *your* input before we start sending you patches. :)

This revision was landed with ongoing or failed builds. Apr 13 2022, 12:51 PM

Closed by commit rGbbcf11f5af98: [clang][dataflow] Weaken abstract comparison to enable loop termination. (authored by ymandel).

This revision was automatically updated to reflect the committed changes.

ymandel added a commit: rGbbcf11f5af98: [clang][dataflow] Weaken abstract comparison to enable loop termination.

Revision Contents

Path                                                                         Size
clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h             8 lines
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp                     15 lines
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp                      67 lines
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp    6 lines

Diff 422615

clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h

@@ 71 lines hidden @@ public:
     ///
     /// `Val1` and `Val2` must be assigned to the same storage location in
     /// `Env1` and `Env2` respectively.
     virtual bool compareEquivalent(QualType Type, const Value &Val1,
                                    const Environment &Env1, const Value &Val2,
                                    const Environment &Env2) {
       // FIXME: Consider adding QualType to StructValue and removing the Type
       // argument here.
-      return false;
+      //
+      // FIXME: default to a sound comparison and/or expand the comparison logic
+      // built into the framework to support broader forms of equivalence than
+      // strict pointer equality.
+      return true;
     }

     /// Modifies `MergedVal` to approximate both `Val1` and `Val2`. This could
     /// be a strict lattice join or a more general widening operation.
     ///
     /// If this function returns true, `MergedVal` will be assigned to a storage
     /// location of type `Type` in `MergedEnv`.
     ///
     /// `Env1` and `Env2` can be used to query child values and path condition
     /// implications of `Val1` and `Val2` respectively.
     ///
     /// Requirements:
     ///
     /// `Val1` and `Val2` must be distinct.
     ///
     /// `Val1`, `Val2`, and `MergedVal` must model values of type `Type`.
     ///
     /// `Val1` and `Val2` must be assigned to the same storage location in
     /// `Env1` and `Env2` respectively.
     virtual bool merge(QualType Type, const Value &Val1,
                        const Environment &Env1, const Value &Val2,
                        const Environment &Env2, Value &MergedVal,
                        Environment &MergedEnv) {
-      return false;
+      return true;
     }
   };

   /// Creates an environment that uses `DACtx` to store objects that encompass
   /// the state of a program.
   explicit Environment(DataflowAnalysisContext &DACtx) : DACtx(&DACtx) {}

   /// Creates an environment that uses `DACtx` to store objects that encompass
@@ 238 lines hidden @@

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp

@@ 53 lines hidden @@ static bool equivalentValues(QualType Type, Value *Val1,
                              const Environment &Env2,
                              Environment::ValueModel &Model) {
   if (Val1 == Val2)
     return true;

   if (auto *IndVal1 = dyn_cast<IndirectionValue>(Val1)) {
     auto *IndVal2 = cast<IndirectionValue>(Val2);
     assert(IndVal1->getKind() == IndVal2->getKind());
-    return &IndVal1->getPointeeLoc() == &IndVal2->getPointeeLoc();
+    if (&IndVal1->getPointeeLoc() == &IndVal2->getPointeeLoc())
+      return true;
   }

   return Model.compareEquivalent(Type, Val1, Env1, Val2, Env2);
 }

 /// Attempts to merge distinct values `Val1` and `Val1` in `Env1` and `Env2`,
 /// respectively, of the same type `Type`. Merging generally produces a single
 /// value that (soundly) approximates the two inputs, although the actual
@@ 12 lines hidden @@ static Value *mergeDistinctValues(QualType Type, Value *Val1, Environment &Env1,
   // FC1 <=> C1 ^ C2
   // FC2 <=> C2 ^ C3 ^ C4
   // FC3 <=> (FC1 v FC2) ^ C5
   // \code
   // Then, we can track dependencies between flow conditions (e.g. above `FC3`
   // depends on `FC1` and `FC2`) and modify `flowConditionImplies` to construct
   // a formula that includes the bi-conditionals for all flow condition atoms in
   // the transitive set, before invoking the solver.
+  //
+  // FIXME: Does not work for backedges, since the two (or more) paths will not
+  // have mutually exclusive conditions.
   if (auto *Expr1 = dyn_cast<BoolValue>(Val1)) {
     for (BoolValue *Constraint : Env1.getFlowConditionConstraints()) {
       Expr1 = &Env1.makeAnd(Expr1, Constraint);
     }
     auto *Expr2 = cast<BoolValue>(Val2);
     for (BoolValue *Constraint : Env2.getFlowConditionConstraints()) {
       Expr2 = &Env1.makeAnd(Expr2, Constraint);
     }
@@ 181 lines hidden @@ if (DeclToLoc != Other.DeclToLoc)
     return false;

   if (ExprToLoc != Other.ExprToLoc)
     return false;

   if (MemberLocToStruct != Other.MemberLocToStruct)
     return false;

-  if (LocToVal.size() != Other.LocToVal.size())
-    return false;
+  // Compare the contents for the intersection of their domains.

   for (auto &Entry : LocToVal) {
     const StorageLocation *Loc = Entry.first;
     assert(Loc != nullptr);

     Value *Val = Entry.second;
     assert(Val != nullptr);

     auto It = Other.LocToVal.find(Loc);
     if (It == Other.LocToVal.end())
-      return false;
+      continue;
     assert(It->second != nullptr);

     if (!equivalentValues(Loc->getType(), Val, *this, It->second, Other, Model))
       return false;
   }

   return true;
 }
@@ 32 lines hidden @@ for (auto &Entry : OldLocToVal) {
     Value *Val = Entry.second;
     assert(Val != nullptr);

     auto It = Other.LocToVal.find(Loc);
     if (It == Other.LocToVal.end())
       continue;
     assert(It->second != nullptr);

-    if (equivalentValues(Loc->getType(), Val, *this, It->second, Other,
-                         Model)) {
+    if (Val == It->second) {
       LocToVal.insert({Loc, Val});
       continue;
     }

     if (Value *MergedVal = mergeDistinctValues(Loc->getType(), Val, *this,
                                                It->second, Other, Model))
       LocToVal.insert({Loc, MergedVal});
   }
@@ 264 lines hidden @@
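
The change to the environment comparison in the diff above (dropping the size check and replacing `return false` with `continue` for locations missing from the other map) amounts to comparing only the intersection of the two domains. A minimal sketch of that idea follows, with plain integers standing in for locations and values; `ToyLocToVal` and `equivalentOnSharedLocations` are hypothetical names, not framework APIs.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for a location-to-value map: location id -> value id.
using ToyLocToVal = std::map<int, int>;

// Returns true if the two maps agree on every location that both contain.
// Locations present in only one of the maps are ignored.
bool equivalentOnSharedLocations(const ToyLocToVal &A, const ToyLocToVal &B) {
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It == B.end())
      continue; // Only in A: not part of the intersection.
    if (Entry.second != It->second)
      return false;
  }
  return true;
}
```

One consequence of this design is that a location freshly allocated on one path no longer forces the environments to compare unequal, which matters when each pass over a loop body allocates new locations.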

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp

@@ 2,938 lines hidden @@ runDataflow(
           ASSERT_THAT(Results[0], Pair("p2", _));
           const Environment &Env = Results[0].second.Env;
           auto &CVal = *cast<BoolValue>(Env.getValue(*CDecl, SkipPast::None));
           EXPECT_TRUE(Env.flowConditionImplies(CVal));
         }
       });
 }

+TEST_F(TransferTest, LoopWithAssignmentConverges) {
+  std::string Code = R"(
+
+    bool &foo();
+
+    void target() {
+      do {
+        bool Bar = foo();
+        if (Bar) break;
+        (void)Bar;
+        /*[[p]]*/
+      } while (true);
+    }
+  )";
+  // The key property that we are verifying is implicit in `runDataflow` --
+  // namely, that the analysis succeeds, rather than hitting the maximum number
+  // of iterations.
+  runDataflow(
+      Code, [](llvm::ArrayRef<
+                   std::pair<std::string, DataflowAnalysisState<NoopLattice>>>
+                   Results,
+               ASTContext &ASTCtx) {
+        ASSERT_THAT(Results, ElementsAre(Pair("p", _)));
+        const Environment &Env = Results[0].second.Env;
+
+        const ValueDecl *BarDecl = findValueDecl(ASTCtx, "Bar");
+        ASSERT_THAT(BarDecl, NotNull());
+
+        auto &BarVal = *cast<BoolValue>(Env.getValue(*BarDecl, SkipPast::None));
+        EXPECT_TRUE(Env.flowConditionImplies(Env.makeNot(BarVal)));
+      });
+}
+
+TEST_F(TransferTest, LoopWithReferenceAssignmentConverges) {
+  std::string Code = R"(
+
+    bool &foo();
+
+    void target() {
+      do {
+        bool& Bar = foo();
+        if (Bar) break;
+        (void)Bar;
+        /*[[p]]*/
+      } while (true);
+    }
+  )";
+  // The key property that we are verifying is implicit in `runDataflow` --
+  // namely, that the analysis succeeds, rather than hitting the maximum number
+  // of iterations.
+  runDataflow(
+      Code, [](llvm::ArrayRef<
+                   std::pair<std::string, DataflowAnalysisState<NoopLattice>>>
+                   Results,
+               ASTContext &ASTCtx) {
+        ASSERT_THAT(Results, ElementsAre(Pair("p", _)));
+        const Environment &Env = Results[0].second.Env;
+
+        const ValueDecl *BarDecl = findValueDecl(ASTCtx, "Bar");
+        ASSERT_THAT(BarDecl, NotNull());
+
+        auto &BarVal =
+            *cast<BoolValue>(Env.getValue(*BarDecl, SkipPast::Reference));
+        EXPECT_TRUE(Env.flowConditionImplies(Env.makeNot(BarVal)));
+      });
+}
+
 } // namespace

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp

@@ 359 lines hidden @@ bool merge(QualType Type, const Value &Val1, const Environment &Env1,
     if (HasValue1 == nullptr)
       return false;

     auto *HasValue2 = cast_or_null<BoolValue>(
         cast<StructValue>(&Val2)->getProperty("has_value"));
     if (HasValue2 == nullptr)
       return false;

-    assert(HasValue1 != HasValue2);
-    cast<StructValue>(&MergedVal)->setProperty("has_value", HasValueTop);
+    if (HasValue1 == HasValue2)
+      cast<StructValue>(&MergedVal)->setProperty("has_value", *HasValue1);
+    else
+      cast<StructValue>(&MergedVal)->setProperty("has_value", HasValueTop);
     return true;
   }

   BoolValue &HasValueTop;
 };

 class WideningTest : public Test {
 protected:
@@ 502 lines hidden @@
