This is an archive of the discontinued LLVM Phabricator instance.

Differential D131645

[clang][dataflow] Allow user-provided lattices to provide a widen operator
Accepted · Public

Authored by li.zhe.hua on Aug 10 2022, 9:57 PM.

Details

Reviewers

NoQ
ymandel
xazax.hun
sgatev
gribozavr2

Summary

In order to better support convergence in a sound way, we allow users
to provide a widen operator for their lattice type. This can be
implemented for lattices of infinite (or sufficiently large) height in
order to reach convergence in loops.

If no widen operator is provided, the framework falls back to the
existing join operation, which every lattice is already required to
define. This is a sound default, as join is at least as precise as any
widen would be.
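
To make the motivation concrete, here is a minimal sketch of a lattice of unbounded height for which `join` alone never reaches a fixpoint in a loop, while `widen` does. `IntervalLattice`, `Effect`, and `iterateLoop` are hypothetical names used only for illustration and are not part of this patch; `Effect` merely stands in for the framework's `LatticeJoinEffect`.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

// Stand-in for the framework's LatticeJoinEffect (assumption, illustration only).
enum class Effect { Unchanged, Changed };

// Hypothetical interval lattice [Lo, Hi] over the values of one variable.
struct IntervalLattice {
  long Lo;
  long Hi;

  bool operator==(const IntervalLattice &Other) const {
    return Lo == Other.Lo && Hi == Other.Hi;
  }

  // Least upper bound: the smallest interval containing both operands.
  Effect join(const IntervalLattice &Other) {
    const IntervalLattice Old = *this;
    Lo = std::min(Lo, Other.Lo);
    Hi = std::max(Hi, Other.Hi);
    return *this == Old ? Effect::Unchanged : Effect::Changed;
  }

  // Upper bound that pushes any growing bound straight to the extreme value,
  // so repeated widening can only change the object a bounded number of times.
  void widen(const IntervalLattice &Other) {
    if (Other.Lo < Lo)
      Lo = std::numeric_limits<long>::min();
    if (Other.Hi > Hi)
      Hi = std::numeric_limits<long>::max();
  }
};

// Models the loop-head state of `for (i = 0; ; ++i)`. Returns the iteration
// at which the state stopped changing, or -1 if it never did.
int iterateLoop(bool UseWiden, int MaxIters) {
  assert(MaxIters > 0 && "need at least one iteration");
  const long Max = std::numeric_limits<long>::max();
  IntervalLattice Head{0, 0};
  for (int Iter = 1; Iter <= MaxIters; ++Iter) {
    // State arriving over the back edge after `++i` (saturating at Max).
    IntervalLattice Back{Head.Lo + 1, Head.Hi == Max ? Max : Head.Hi + 1};
    const IntervalLattice Old = Head;
    if (UseWiden)
      Head.widen(Back);
    else
      Head.join(Back);
    if (Head == Old)
      return Iter; // Fixpoint reached.
  }
  return -1; // Did not converge within MaxIters.
}
```

In this sketch, `iterateLoop(true, 1000)` stabilizes on the second iteration, whereas `iterateLoop(false, 1000)` keeps growing the upper bound by one per iteration and gives up.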

Tracking issue: #56931

Depends on D131644

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

li.zhe.hua created this revision. · Aug 10 2022, 9:57 PM

Herald added a reviewer: NoQ. · Aug 10 2022, 9:57 PM

Herald added a project: Restricted Project.

Herald added subscribers: martong, xazax.hun.

li.zhe.hua requested review of this revision. · Aug 10 2022, 9:57 PM

Herald added a project: Restricted Project. · Aug 10 2022, 9:57 PM

Herald added a subscriber: cfe-commits.

li.zhe.hua added a child revision: D131646: [clang][dataflow] Restructure loops to call widen on back edges. · Aug 10 2022, 9:58 PM

li.zhe.hua edited the summary of this revision. · Aug 10 2022, 10:05 PM

li.zhe.hua added reviewers: ymandel, xazax.hun.

Herald added a subscriber: rnkovacs. · Aug 10 2022, 10:07 PM

Harbormaster completed remote builds in B180587: Diff 451719. · Aug 10 2022, 11:22 PM

ymandel added a reviewer: sgatev. · Aug 11 2022, 5:10 AM

ymandel added a subscriber: gribozavr.

ymandel accepted this revision. · Aug 11 2022, 5:50 AM

ymandel added inline comments.

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
Line 119: nit: why c++11 (vs something later)?

This revision is now accepted and ready to land. · Aug 11 2022, 5:50 AM

li.zhe.hua marked an inline comment as done. · Aug 11 2022, 7:52 AM

li.zhe.hua added inline comments.

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
Line 119: Consistency with `runAnalysis` defined above. FWIW, I don't think `int x = 0;` is materially different in any standard, and all I really want out of this is an `ASTContext` to pass to `TypeErasedDataflowAnalysis` to satisfy its constructor.

sgatev accepted this revision. · Aug 11 2022, 9:11 AM

Very cool! Thanks for investing into fixing the soundness issues!

A couple of ideas for the future (probably you are already aware of most of this, but open communication can help other members of the community to jump in):

Some frameworks have the option to delay widening. I.e., only invoke the widening operator when the analysis of a loop did not converge after a certain number of iterations.
Some frameworks implement narrowing. In the narrowing phase, we would do a couple more iterations but using the regular join instead of the widening operator. Sometimes doing multiple narrow/widen phases can be beneficial.
Writing a widening operator can be tricky. It would be nice to add some guidance to the documentation. E.g., listing some of the algebraic properties that we expect a widening operator to have can be a lot of help for newcomers (like bottom.widen(lat) == lat).
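
The first two ideas (delayed widening and a narrowing phase) can be sketched on a toy interval analysis of `for (i = 0; i < 10; ++i)`. Everything below (`Interval`, `backEdge`, `analyzeLoop`, and its `WidenDelay`/`NarrowSteps` parameters) is hypothetical and only illustrates the general technique; it does not reflect how the Clang dataflow framework schedules `widen`.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>

constexpr long Inf = std::numeric_limits<long>::max();

// Hypothetical interval [Lo, Hi]; Hi == Inf means "unbounded above".
struct Interval {
  long Lo, Hi;
  bool operator==(const Interval &O) const { return Lo == O.Lo && Hi == O.Hi; }
  void join(const Interval &O) {
    Lo = std::min(Lo, O.Lo);
    Hi = std::max(Hi, O.Hi);
  }
  void widen(const Interval &O) {
    if (O.Lo < Lo)
      Lo = std::numeric_limits<long>::min();
    if (O.Hi > Hi)
      Hi = Inf;
  }
};

// Effect of one trip around `for (i = 0; i < 10; ++i)` on the loop-head
// state: apply the guard `i < 10`, then the increment.
Interval backEdge(const Interval &Head) {
  const long GuardedHi = std::min(Head.Hi, 9L);
  return Interval{Head.Lo + 1, GuardedHi + 1};
}

// Computes the loop-head state for `i`. The first `WidenDelay` rounds use
// join; later rounds use widen. Afterwards, `NarrowSteps` extra rounds with
// plain join try to win back precision lost to widening.
Interval analyzeLoop(int WidenDelay, int NarrowSteps) {
  assert(WidenDelay >= 0 && NarrowSteps >= 0);
  const Interval Entry{0, 0};
  Interval Head = Entry;
  // Ascending phase: iterate until the loop-head state stops changing.
  for (int Iter = 0;; ++Iter) {
    const Interval Old = Head;
    const Interval Back = backEdge(Head);
    if (Iter < WidenDelay)
      Head.join(Back);
    else
      Head.widen(Back);
    if (Head == Old)
      break;
  }
  // Narrowing phase: recompute the head from the entry state and the back
  // edge using join only.
  for (int Step = 0; Step < NarrowSteps; ++Step) {
    Interval Refined = Entry;
    Refined.join(backEdge(Head));
    Head = Refined;
  }
  return Head;
}
```

Under these assumptions, immediate widening without narrowing yields `[0, Inf]`, a single narrowing step recovers `[0, 10]`, and a delay longer than the loop's trip count reaches `[0, 10]` without ever widening.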

Add FIXME

In D131645#3716464, @xazax.hun wrote:

probably you are already aware of most of this

Actually, more likely than not I don't! I don't have a strong background in PL (or, really any background), so I'm really just learning as I go, mostly from talking to @ymandel and reading through Introduction to Static Analysis (I'm still struggling through Chapter 3!).

Some frameworks have the option to delay widening. I.e., only invoke the widening operator when the analysis of a loop did not converge after a certain number of iterations.

There's a FIXME in D131646 that talks about unrolling, which sounds like this.

Some frameworks implement narrowing. In the narrowing phase, we would do a couple more iterations but using the regular join instead of the widening operator. Sometimes doing multiple narrow/widen phases can be beneficial.

Good to know. I'll keep this in mind and try to dig up some more about this.

Writing a widening operator can be tricky. It would be nice to add some guidance to the documentation. E.g., listing some of the algebraic properties that we expect a widening operator to have can be a lot of help for newcomers (like bottom.widen(lat) == lat).

Agreed. FWIW, I'm still struggling to grok this in the abstract, and don't have a concrete example readily available that makes sense to me.
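
One way to make such properties concrete is to check them on a small lattice. The sketch below assumes three properties commonly expected of a widening operator: the result is an upper bound of both operands, `bottom.widen(lat) == lat` (the example from the comment above), and repeated widening along a growing chain eventually stops changing. `ValueSet` and the helper functions are hypothetical names, and this is not a contract documented by the framework.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical lattice of "possible values": a finite set of ints, or Top.
// An empty set that is not Top plays the role of bottom.
struct ValueSet {
  bool IsTop = false;
  std::set<int> Values; // Kept empty whenever IsTop is true.

  static ValueSet top() {
    ValueSet V;
    V.IsTop = true;
    return V;
  }
  static ValueSet of(std::set<int> Vals) {
    ValueSet V;
    V.Values = std::move(Vals);
    return V;
  }

  bool operator==(const ValueSet &O) const {
    return IsTop == O.IsTop && Values == O.Values;
  }

  // Partial order: *this <= O iff O covers every value *this covers.
  bool leq(const ValueSet &O) const {
    if (O.IsTop)
      return true;
    if (IsTop)
      return false;
    return std::includes(O.Values.begin(), O.Values.end(), Values.begin(),
                         Values.end());
  }

  void widen(const ValueSet &O) {
    if (O.leq(*this))
      return; // Already subsumes the argument.
    if (!IsTop && Values.empty()) {
      *this = O; // bottom.widen(x) == x
      return;
    }
    IsTop = true; // Otherwise give up all precision at once.
    Values.clear();
  }
};

// Checks the upper-bound and bottom properties on the given sample elements.
bool widenLooksSane(const std::vector<ValueSet> &Samples) {
  for (const ValueSet &A : Samples) {
    ValueSet FromBottom;
    FromBottom.widen(A);
    if (!(FromBottom == A))
      return false;
    for (const ValueSet &B : Samples) {
      ValueSet W = A;
      W.widen(B);
      if (!A.leq(W) || !B.leq(W))
        return false;
    }
  }
  return true;
}

// Widens an accumulator with {0}, {0,1}, {0,1,2}, ... and counts how often
// the accumulator actually changes.
int changesAlongChain(int Length) {
  assert(Length >= 0);
  ValueSet Acc, Next;
  int Changes = 0;
  for (int I = 0; I < Length; ++I) {
    Next.Values.insert(I);
    const ValueSet Old = Acc;
    Acc.widen(Next);
    if (!(Acc == Old))
      ++Changes;
  }
  return Changes;
}
```

With this deliberately coarse `widen`, an arbitrarily long growing chain changes the accumulator only twice (bottom to the first set, then to Top), which is the kind of stabilization a widening operator is meant to guarantee.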

In D131645#3716508, @li.zhe.hua wrote:

In D131645#3716464, @xazax.hun wrote:

probably you are already aware of most of this

Actually, more likely than not I don't! I don't have a strong background in PL (or, really any background), so I'm really just learning as I go, mostly from talking to @ymandel and reading through Introduction to Static Analysis (I'm still struggling through Chapter 3!).

That is a really great book, I liked it a lot :) I implemented a small interpreter + static analysis framework for the graphical language in the book: https://github.com/Xazax-hun/domains
It is still half-baked, but it is doing something and can generate some ugly graphics.

Harbormaster completed remote builds in B180701: Diff 451887. · Aug 11 2022, 10:56 AM

gribozavr2 accepted this revision. · Aug 11 2022, 1:06 PM

ymandel mentioned this in D137948: [clang][dataflow] Add widening API and implement it for built-in boolean model. · Nov 14 2022, 7:19 AM

ymandel mentioned this in rG84dd12b29064: [clang][dataflow] Add widening API and implement it for built-in boolean model. · Nov 22 2022, 8:10 AM

Revision Contents

Path                                                                        Size
clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h              37 lines
clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h    4 lines
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp  94 lines

Diff 451887

clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h

[49 lines not shown]
 /// `LatticeT` is a bounded join-semilattice that is used by `Derived` and must
 /// provide the following public members:
 /// * `LatticeJoinEffect join(const LatticeT &)` - joins the object and the
 /// argument by computing their least upper bound, modifies the object if
 /// necessary, and returns an effect indicating whether any changes were
 /// made to it;
 /// * `bool operator==(const LatticeT &) const` - returns true if and only if
 /// the object is equal to the argument.
+///
+/// `LatticeT` may also optionally provide the following members:
+/// * `void widen(const LatticeT &)` - relaxes the object to subsume the
+/// argument, computing an upper bound. Widen is called during loops, where
+/// a join operation may prevent convergence for a lattice of infinite
+/// height. If `widen` is not provided, `join` is used by default.
+///
+/// FIXME: Provide concrete guidance on how to write a good widen function,
+/// which can be tricky.
 template <typename Derived, typename LatticeT>
 class DataflowAnalysis : public TypeErasedDataflowAnalysis {
 public:
   /// Bounded join-semilattice that is used in the analysis.
   using Lattice = LatticeT;

   explicit DataflowAnalysis(ASTContext &Context) : Context(Context) {}

[14 lines not shown; context: public:]

   LatticeJoinEffect joinTypeErased(TypeErasedLattice &E1,
                                    const TypeErasedLattice &E2) final {
     Lattice &L1 = llvm::any_cast<Lattice &>(E1.Value);
     const Lattice &L2 = llvm::any_cast<const Lattice &>(E2.Value);
     return L1.join(L2);
   }

+  void widenTypeErased(TypeErasedLattice &E1,
+                       const TypeErasedLattice &E2) final {
+    Lattice &L1 = llvm::any_cast<Lattice &>(E1.Value);
+    const Lattice &L2 = llvm::any_cast<const Lattice &>(E2.Value);
+    widenInternal(Rank0{}, L1, L2);
+  }
+
   bool isEqualTypeErased(const TypeErasedLattice &E1,
                          const TypeErasedLattice &E2) final {
     const Lattice &L1 = llvm::any_cast<const Lattice &>(E1.Value);
     const Lattice &L2 = llvm::any_cast<const Lattice &>(E2.Value);
     return L1 == L2;
   }

   void transferTypeErased(const Stmt *Stmt, TypeErasedLattice &E,
                           Environment &Env) final {
     Lattice &L = llvm::any_cast<Lattice &>(E.Value);
     static_cast<Derived *>(this)->transfer(Stmt, L, Env);
   }

 private:
+  struct Rank1 {};
+  struct Rank0 : Rank1 {};
+
+  // We first try to use the widen operator if the lattice defines it, and fall
+  // back to using join if not.
+  //
+  // Widen (relative to join) trades precision for convergence in loops. A
+  // lattice with a reasonable, finite height may prefer to continue using the
+  // join operation.
+  template <typename T>
+  static auto widenInternal(Rank0, T &L1, const T &L2)
+      -> llvm::detail::void_t<decltype(L1.widen(L2))> {
+    L1.widen(L2);
+  }
+
+  template <typename T>
+  static auto widenInternal(Rank1, T &L1, const T &L2)
+      -> llvm::detail::void_t<decltype(L1.join(L2))> {
+    L1.join(L2);
+  }
+
   ASTContext &Context;
 };

 // Model of the program at a given program point.
 template <typename LatticeT> struct DataflowAnalysisState {
   // Model of a program property.
   LatticeT Lattice;

[68 lines not shown]
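
The `Rank0`/`Rank1` dispatch used by `widenInternal` can be tried in isolation. The sketch below reproduces the idea with stand-alone names: `VoidT` is a C++11-compatible substitute for `llvm::detail::void_t`, and `WithWiden`, `JoinOnly`, `widenOrJoin`, and `dispatch` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>

// Rank0 derives from Rank1, so for a Rank0 argument an overload taking Rank0
// is an exact match and beats one taking Rank1 (derived-to-base conversion).
struct Rank1 {};
struct Rank0 : Rank1 {};

// C++11-compatible stand-in for std::void_t / llvm::detail::void_t.
template <typename...> struct MakeVoid { using type = void; };
template <typename... Ts> using VoidT = typename MakeVoid<Ts...>::type;

// Hypothetical lattices that record which operation was invoked.
struct WithWiden {
  std::string Log;
  void join(const WithWiden &) { Log += "join;"; }
  void widen(const WithWiden &) { Log += "widen;"; }
};
struct JoinOnly {
  std::string Log;
  void join(const JoinOnly &) { Log += "join;"; }
};

// Preferred overload: removed from the overload set by SFINAE unless
// `L1.widen(L2)` is a well-formed expression.
template <typename T>
auto widenOrJoin(Rank0, T &L1, const T &L2) -> VoidT<decltype(L1.widen(L2))> {
  L1.widen(L2);
}

// Fallback overload: viable whenever `L1.join(L2)` is well-formed, but ranked
// lower because the Rank0 tag has to convert to Rank1.
template <typename T>
auto widenOrJoin(Rank1, T &L1, const T &L2) -> VoidT<decltype(L1.join(L2))> {
  L1.join(L2);
}

// Runs the dispatch on two default-constructed lattices and reports which
// member function ended up being called on the first one.
template <typename T> std::string dispatch() {
  T A, B;
  widenOrJoin(Rank0{}, A, B);
  assert(B.Log.empty() && "the argument must not be modified");
  return A.Log;
}
```

Here `dispatch<WithWiden>()` selects `widen` even though `join` is also available, and `dispatch<JoinOnly>()` falls back to `join`, which is the behavior the two unit tests in this revision check with mocks.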

clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h

[71 lines not shown; context: public:]
   virtual TypeErasedLattice typeErasedInitialElement() = 0;

   /// Joins two type-erased lattice elements by computing their least upper
   /// bound. Places the join result in the left element and returns an effect
   /// indicating whether any changes were made to it.
   virtual LatticeJoinEffect joinTypeErased(TypeErasedLattice &,
                                            const TypeErasedLattice &) = 0;

+  /// Relaxes the constraints in `A` to subsume the state in `B`.
+  virtual void widenTypeErased(TypeErasedLattice &A,
+                               const TypeErasedLattice &B) = 0;
+
   /// Returns true if and only if the two given type-erased lattice elements are
   /// equal.
   virtual bool isEqualTypeErased(const TypeErasedLattice &,
                                  const TypeErasedLattice &) = 0;

   /// Applies the analysis transfer function for a given statement and
   /// type-erased lattice element.
   virtual void transferTypeErased(const Stmt *, TypeErasedLattice &,
[61 lines not shown]
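
How a type-erased entry point such as `widenTypeErased` forwards to a concrete lattice can be sketched with `std::any` (C++17). The names `ErasedLattice`, `ErasedAnalysis`, `TypedAnalysis`, `UpperBound`, and `widenThroughErasure` are hypothetical; the real classes use `llvm::Any` and carry many more members than shown here.

```cpp
#include <any>
#include <cassert>
#include <climits>

// Stand-in for TypeErasedLattice (assumption: a thin wrapper around an `any`).
struct ErasedLattice {
  std::any Value;
};

// The analysis engine only sees this interface, never the concrete lattice.
struct ErasedAnalysis {
  virtual ~ErasedAnalysis() = default;
  virtual void widenErased(ErasedLattice &A, const ErasedLattice &B) = 0;
};

// Typed layer: recovers the concrete lattice type and forwards to `widen`.
template <typename LatticeT> struct TypedAnalysis : ErasedAnalysis {
  void widenErased(ErasedLattice &A, const ErasedLattice &B) final {
    assert(A.Value.has_value() && B.Value.has_value());
    // any_cast to a reference throws std::bad_any_cast on a type mismatch.
    LatticeT &L1 = std::any_cast<LatticeT &>(A.Value);
    const LatticeT &L2 = std::any_cast<const LatticeT &>(B.Value);
    L1.widen(L2);
  }
};

// Hypothetical lattice: an upper bound on a counter, where INT_MAX means
// "unbounded". Widening jumps to INT_MAX as soon as the bound grows.
struct UpperBound {
  int Max;
  void widen(const UpperBound &Other) {
    if (Other.Max > Max)
      Max = INT_MAX;
  }
};

// Widens `Current` with `Incoming` through the erased interface and returns
// the resulting bound.
int widenThroughErasure(int Current, int Incoming) {
  TypedAnalysis<UpperBound> Typed;
  ErasedAnalysis &Engine = Typed;
  ErasedLattice A{UpperBound{Current}};
  ErasedLattice B{UpperBound{Incoming}};
  Engine.widenErased(A, B);
  return std::any_cast<const UpperBound &>(A.Value).Max;
}
```

The result is written into the left operand, matching the convention described for `joinTypeErased` above.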

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp

[42 lines not shown]
 using namespace test;
 using namespace ast_matchers;
 using ::testing::_;
 using ::testing::ElementsAre;
 using ::testing::IsEmpty;
 using ::testing::IsNull;
 using ::testing::NotNull;
 using ::testing::Pair;
+using ::testing::Ref;
 using ::testing::Test;
 using ::testing::UnorderedElementsAre;

 template <typename AnalysisT>
 llvm::Expected<std::vector<
     llvm::Optional<DataflowAnalysisState<typename AnalysisT::Lattice>>>>
 runAnalysis(llvm::StringRef Code, AnalysisT (*MakeAnalysis)(ASTContext &)) {
   std::unique_ptr<ASTUnit> AST =
[22 lines not shown; context: auto BlockStates = llvm::cantFail(]
       runAnalysis<NoopAnalysis>("void target() {}", [](ASTContext &C) {
         return NoopAnalysis(C, false);
       }));
   EXPECT_EQ(BlockStates.size(), 2u);
   EXPECT_TRUE(BlockStates[0].has_value());
   EXPECT_TRUE(BlockStates[1].has_value());
 }

+struct HasWidenLattice {
+  HasWidenLattice() {
+    ON_CALL(*this, join).WillByDefault([](const HasWidenLattice &) {
+      return LatticeJoinEffect::Unchanged;
+    });
+  }
+  // Mock objects are not copyable by default. Since this is a monostate,
+  // delegate to the default ctor.
+  HasWidenLattice(const HasWidenLattice &) : HasWidenLattice() {}
+  HasWidenLattice &operator=(const HasWidenLattice &) { return *this; }
+
+  MOCK_METHOD(LatticeJoinEffect, join, (const HasWidenLattice &));
+  MOCK_METHOD(void, widen, (const HasWidenLattice &));
+
+  friend bool operator==(const HasWidenLattice &, const HasWidenLattice &) {
+    return true;
+  }
+};
+
+class HasWidenAnalysis
+    : public DataflowAnalysis<HasWidenAnalysis, HasWidenLattice> {
+public:
+  static HasWidenLattice initialElement() { return {}; }
+  using DataflowAnalysis::DataflowAnalysis;
+  void transfer(const Stmt *, HasWidenLattice &, Environment &) {}
+};
+
+TEST(DataflowAnalysisTest, WidenPrefersWidenWhenProvided) {
+  std::unique_ptr<ASTUnit> AST =
+      tooling::buildASTFromCodeWithArgs("int x = 0;", {"-std=c++11"});
+  HasWidenAnalysis Analysis(AST->getASTContext(),
+                            /*ApplyBuiltinTransfer=*/false);
+
+  TypeErasedLattice A = {HasWidenLattice()};
+  TypeErasedLattice B = {HasWidenLattice()};
+  HasWidenLattice &LA = *llvm::any_cast<HasWidenLattice>(&A.Value);
+  HasWidenLattice &LB = *llvm::any_cast<HasWidenLattice>(&B.Value);
+
+  // Expect only `LA.widen(LB)` is called, and nothing else.
+  EXPECT_CALL(LA, widen).Times(0);
+  EXPECT_CALL(LB, widen).Times(0);
+  EXPECT_CALL(LA, join).Times(0);
+  EXPECT_CALL(LB, join).Times(0);
+  EXPECT_CALL(LA, widen(Ref(LB))).Times(1);
+
+  Analysis.widenTypeErased(A, B);
+}
+
+struct OnlyJoinLattice {
+  OnlyJoinLattice() {
+    ON_CALL(*this, join).WillByDefault([](const OnlyJoinLattice &) {
+      return LatticeJoinEffect::Unchanged;
+    });
+  }
+  // Mock objects are not copyable by default. Since this is a monostate,
+  // delegate to the default ctor.
+  OnlyJoinLattice(const OnlyJoinLattice &) : OnlyJoinLattice() {}
+  OnlyJoinLattice &operator=(const OnlyJoinLattice &) { return *this; }
+
+  MOCK_METHOD(LatticeJoinEffect, join, (const OnlyJoinLattice &));
+
+  friend bool operator==(const OnlyJoinLattice &, const OnlyJoinLattice &) {
+    return true;
+  }
+};
+
+class OnlyJoinAnalysis
+    : public DataflowAnalysis<OnlyJoinAnalysis, OnlyJoinLattice> {
+public:
+  static OnlyJoinLattice initialElement() { return {}; }
+  using DataflowAnalysis::DataflowAnalysis;
+  void transfer(const Stmt *, OnlyJoinLattice &, Environment &) {}
+};
+
+TEST(DataflowAnalysisTest, WidenFallsBackToJoin) {
+  std::unique_ptr<ASTUnit> AST =
+      tooling::buildASTFromCodeWithArgs("int x = 0;", {"-std=c++11"});
+  OnlyJoinAnalysis Analysis(AST->getASTContext(),
+                            /*ApplyBuiltinTransfer=*/false);
+
+  TypeErasedLattice A = {OnlyJoinLattice()};
+  TypeErasedLattice B = {OnlyJoinLattice()};
+  OnlyJoinLattice &LA = *llvm::any_cast<OnlyJoinLattice>(&A.Value);
+  OnlyJoinLattice &LB = *llvm::any_cast<OnlyJoinLattice>(&B.Value);
+
+  // Expect only `LA.join(LB)` is called, and nothing else.
+  EXPECT_CALL(LA, join).Times(0);
+  EXPECT_CALL(LB, join).Times(0);
+  EXPECT_CALL(LA, join(Ref(LB))).Times(1);
+
+  Analysis.widenTypeErased(A, B);
+}
+
 struct NonConvergingLattice {
   int State;

   bool operator==(const NonConvergingLattice &Other) const {
     return State == Other.State;
   }

   LatticeJoinEffect join(const NonConvergingLattice &Other) {
[1,082 lines not shown]