This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	Are we sure that the memory addresses of CFGBlocks are stable enough for a deterministic order? Alternatively, we could use the block ids for the ordering.

xazax.hun added inline comments. Jan 20 2022, 2:44 AM

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	Also, could you describe where the flakiness originates from? Naively, I'd expect that the order in which we process the predecessors should not change the results of the analysis.

Address reviewers' comments.

sgatev marked an inline comment as done. Jan 20 2022, 3:35 AM

sgatev added inline comments.

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	You're right, using block ids for ordering is better. I updated the code.

> Also, could you describe where the flakiness originates from?

Say we have a block `B1` with predecessors `B2` and `B3`. Let the environment of `B2` after evaluating all of its statements be `B2Env = { Expr1 -> Loc1 }` and the environment of `B3` after evaluating all of its statements be `B3Env = { Expr2 -> Loc2 }`, where `ExprX -> LocX` refers to a particular mapping of an expression to a storage location. What we want for the input environment of `B1` is `{}`, because `B2Env` and `B3Env` do not contain common assignments of storage locations to expressions. What we got before this patch is either `B2Env.join(B3Env) = { Expr1 -> Loc1 }` or `B3Env.join(B2Env) = { Expr2 -> Loc2 }`. Without deterministic ordering of predecessors, the test that I'm introducing in this patch is flaky.
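The `B2Env`/`B3Env` example can be sketched in standalone C++. This is only an illustration under stated assumptions: `ExprToLocMap`, `intersectMaps`, and `joinWithoutIntersection` are hypothetical names, and `std::map` with string keys stands in for the `llvm::DenseMap` keyed by AST nodes that the real `Environment` uses.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the environment's expression-to-location map.
// The real code uses llvm::DenseMap keyed by AST node pointers.
using ExprToLocMap = std::map<std::string, std::string>;

// Keeps only the entries that appear, with equal values, in both maps.
// This mirrors the idea of intersecting ExprToLoc when joining.
ExprToLocMap intersectMaps(const ExprToLocMap &A, const ExprToLocMap &B) {
  ExprToLocMap Result;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It != B.end() && It->second == Entry.second)
      Result.insert(Entry);
  }
  return Result;
}

// Behaviour described for the code before the patch: the left operand's
// map is left untouched, so the result depends on the operand order.
ExprToLocMap joinWithoutIntersection(const ExprToLocMap &Self,
                                     const ExprToLocMap & /*Other*/) {
  return Self;
}
```

With `B2Env = { Expr1 -> Loc1 }` and `B3Env = { Expr2 -> Loc2 }`, `intersectMaps` returns an empty map for either argument order, while `joinWithoutIntersection` returns whichever map is passed first.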

xazax.hun added inline comments. Jan 20 2022, 3:47 AM

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	> What we got before this patch is either `B2Env.join(B3Env) = { Expr1 -> Loc1 }` or `B3Env.join(B2Env) = { Expr2 -> Loc2 }`.

I think I'm still missing something. With this patch, wouldn't both `B2Env.join(B3Env)` and `B3Env.join(B2Env)` produce the empty environment? If that is the case, do we still care about a deterministic order?

Harbormaster completed remote builds in B144541: Diff 401581. Jan 20 2022, 4:06 AM

sgatev marked 2 inline comments as done. Jan 20 2022, 5:11 AM

sgatev added inline comments.

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	That's right. With the patch in `DataflowEnvironment.cpp`, the particular order of predecessors doesn't affect the result. However, one of the properties that I'm looking for in tests is being able to remove functionality from the code and have the tests that exercise this functionality fail. This won't necessarily be the case here if the order isn't deterministic. I don't have a strong preference, so please let me know if you have concerns about it. I should also note that we expect all of this to be removed once temporary destructors are handled better in the CFG.

xazax.hun added inline comments. Jan 20 2022, 5:45 AM

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	Strictly speaking, making this deterministic is not a requirement; it should not have any observable effect for the end user. On the other hand, we could think of this non-determinism as a feature. A well-behaved analysis should produce the same answer regardless of the order in which we process the nodes. (This requirement follows from the algebraic properties of the join operation.) So in the future I would even anticipate a feature that deliberately randomizes the order to ensure that the clients are well-behaved. I think eliminating this non-determinism could potentially mask bugs in the future, and it also requires extra code. I prefer the original version until we see some evidence that determinism is desired.
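The order-independence property mentioned here can be checked mechanically. The following is a rough sketch, not part of the patch: `Env`, `intersectJoin`, `keepLeftJoin`, and `isOrderIndependent` are hypothetical names, and `std::map` again stands in for the real environment. It folds a join over every permutation of the predecessor states and reports whether all orders agree.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical stand-in for an analysis environment.
using Env = std::map<std::string, std::string>;

// Intersection-style join: keeps entries present with equal values in both.
Env intersectJoin(const Env &A, const Env &B) {
  Env Result;
  for (const auto &Entry : A) {
    auto It = B.find(Entry.first);
    if (It != B.end() && It->second == Entry.second)
      Result.insert(Entry);
  }
  return Result;
}

// Order-sensitive join used as a counterexample: ignores the right operand.
Env keepLeftJoin(const Env &A, const Env & /*B*/) { return A; }

// Returns true if folding `Join` over `Preds` gives the same result for
// every processing order of the predecessors.
template <typename JoinFn>
bool isOrderIndependent(const std::vector<Env> &Preds, JoinFn Join) {
  if (Preds.empty())
    return true;
  std::vector<std::size_t> Order(Preds.size());
  std::iota(Order.begin(), Order.end(), 0); // sorted, so all permutations are visited
  bool HaveReference = false;
  Env Reference;
  do {
    Env Acc = Preds[Order[0]];
    for (std::size_t I = 1; I < Order.size(); ++I)
      Acc = Join(Acc, Preds[Order[I]]);
    if (!HaveReference) {
      Reference = Acc;
      HaveReference = true;
    } else if (Acc != Reference) {
      return false;
    }
  } while (std::next_permutation(Order.begin(), Order.end()));
  return true;
}
```

For the two predecessor states from the earlier example, `isOrderIndependent` holds for `intersectJoin` and fails for `keepLeftJoin`. Enumerating all permutations is only practical for a handful of predecessors; a randomized order, as suggested above, would scale better.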

Address reviewers' comments.

sgatev marked an inline comment as done. Jan 20 2022, 6:05 AM

sgatev added inline comments.

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
49	Sure. Reverted that part.

sgatev marked an inline comment as done. Jan 20 2022, 6:06 AM

Thanks, this looks good to me!

Harbormaster completed remote builds in B144568: Diff 401615. Jan 20 2022, 6:29 AM

Closed by commit rGc95cb4de1b66: [clang][dataflow] Intersect ExprToLoc when joining environments (authored by sgatev). Jan 20 2022, 6:31 AM

This revision was automatically updated to reflect the committed changes.

sgatev added a commit: rGc95cb4de1b66: [clang][dataflow] Intersect ExprToLoc when joining environments.

Revision Contents

Path                                                               Size
clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp           5 lines
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp    1 line
clang/unittests/Analysis/FlowSensitive/TransferTest.cpp            37 lines

Diff 401618

clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp

[preceding lines not shown]
 LatticeJoinEffect Environment::join(const Environment &Other) {
   auto Effect = LatticeJoinEffect::Unchanged;

   const unsigned DeclToLocSizeBefore = DeclToLoc.size();
   DeclToLoc = intersectDenseMaps(DeclToLoc, Other.DeclToLoc);
   if (DeclToLocSizeBefore != DeclToLoc.size())
     Effect = LatticeJoinEffect::Changed;

+  const unsigned ExprToLocSizeBefore = ExprToLoc.size();
+  ExprToLoc = intersectDenseMaps(ExprToLoc, Other.ExprToLoc);
+  if (ExprToLocSizeBefore != ExprToLoc.size())
+    Effect = LatticeJoinEffect::Changed;
+
   // FIXME: Add support for joining distinct values that are assigned to the
   // same storage locations in `LocToVal` and `Other.LocToVal`.
   const unsigned LocToValSizeBefore = LocToVal.size();
   LocToVal = intersectDenseMaps(LocToVal, Other.LocToVal);
   if (LocToValSizeBefore != LocToVal.size())
     Effect = LatticeJoinEffect::Changed;

   return Effect;
[remaining lines not shown]

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp

[first 17 lines not shown]
 #include "clang/AST/DeclCXX.h"
 #include "clang/Analysis/Analyses/PostOrderCFGView.h"
 #include "clang/Analysis/CFG.h"
 #include "clang/Analysis/FlowSensitive/DataflowEnvironment.h"
 #include "clang/Analysis/FlowSensitive/DataflowWorklist.h"
 #include "clang/Analysis/FlowSensitive/Transfer.h"
 #include "clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h"
 #include "clang/Analysis/FlowSensitive/Value.h"
+#include "llvm/ADT/DenseSet.h"
 #include "llvm/ADT/None.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/Support/raw_ostream.h"

 namespace clang {
 namespace dataflow {

 /// Computes the input state for a given basic block by joining the output
 /// states of its predecessors.
 ///
 /// Requirements:
 ///
 ///   All predecessors of `Block` except those with loop back edges must have
 ///   already been transferred. States in `BlockStates` that are set to
 ///   `llvm::None` represent basic blocks that are not evaluated yet.
 static TypeErasedDataflowAnalysisState computeBlockInputState(
     const ControlFlowContext &CFCtx,
     std::vector<llvm::Optional<TypeErasedDataflowAnalysisState>> &BlockStates,
     const CFGBlock &Block, const Environment &InitEnv,
     TypeErasedDataflowAnalysis &Analysis) {
   llvm::DenseSet<const CFGBlock *> Preds;
   Preds.insert(Block.pred_begin(), Block.pred_end());
   if (Block.getTerminator().isTemporaryDtorsBranch()) {
     // This handles a special case where the code that produced the CFG includes
     // a conditional operator with a branch that constructs a temporary and
     // calls a destructor annotated as noreturn. The CFG models this as follows:
     //
     // B1 (contains the condition of the conditional operator) - succs: B2, B3
     // B2 (contains code that does not call a noreturn destructor) - succs: B4
     // B3 (contains code that calls a noreturn destructor) - succs: B4
     // B4 (has temporary destructor terminator) - succs: B5, B6
[remaining lines not shown]

clang/unittests/Analysis/FlowSensitive/TransferTest.cpp

[preceding lines not shown]
             dyn_cast<StructValue>(Env.getValue(*BazDecl, SkipPast::None));
         ASSERT_THAT(BazVal, NotNull());

         EXPECT_NE(BazVal, FooVal);
         EXPECT_NE(BazVal, BarVal);
       });
 }

+TEST_F(TransferTest, VarDeclInDoWhile) {
+  std::string Code = R"(
+    void target(int *Foo) {
+      do {
+        int Bar = *Foo;
+      } while (true);
+      (void)0;
+      /*[[p]]*/
+    }
+  )";
+  runDataflow(Code,
+              [](llvm::ArrayRef<
+                     std::pair<std::string, DataflowAnalysisState<NoopLattice>>>
+                     Results,
+                 ASTContext &ASTCtx) {
+                ASSERT_THAT(Results, ElementsAre(Pair("p", _)));
+                const Environment &Env = Results[0].second.Env;
+
+                const ValueDecl *FooDecl = findValueDecl(ASTCtx, "Foo");
+                ASSERT_THAT(FooDecl, NotNull());
+
+                const ValueDecl *BarDecl = findValueDecl(ASTCtx, "Bar");
+                ASSERT_THAT(BarDecl, NotNull());
+
+                const auto *FooVal =
+                    cast<PointerValue>(Env.getValue(*FooDecl, SkipPast::None));
+                const auto *FooPointeeVal =
+                    cast<IntegerValue>(Env.getValue(FooVal->getPointeeLoc()));
+
+                const auto *BarVal = dyn_cast_or_null<IntegerValue>(
+                    Env.getValue(*BarDecl, SkipPast::None));
+                ASSERT_THAT(BarVal, NotNull());
+
+                EXPECT_EQ(BarVal, FooPointeeVal);
+              });
+}
+
 } // namespace

[clang][dataflow] Intersect ExprToLoc when joining environments (Closed, Public)
