This is an archive of the discontinued LLVM Phabricator instance.

clang/include/clang/Analysis/FlowSensitive/Solver.h
40	Which `Result` is expected if the `Solver` gives up, assuming that's allowed in this API.
clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
194
374	Why this constraint?
457	nit: `++I`, for consistency?
530
595

This revision is now accepted and ready to land.Feb 21 2022, 6:28 PM

Address reviewers' comments.

clang/include/clang/Analysis/FlowSensitive/Solver.h
40	Added a `TimedOut` result for such cases.
clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
374	That's because `solve` must be called at most once for `WatchedLiteralsSolverImpl` - it initializes its data structures in the constructor and then mutates them in `solve`.

Harbormaster completed remote builds in B150817: Diff 410460.Feb 22 2022, 12:26 AM

Hi!

Just a quick high level question. Did you look into reusing existing SAT solvers like miniSAT? What was the main reason for rolling our own instead of picking something up off the selves?

Also, LLVM already has a SMT API (https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/SMTAPI.h). I wonder if it would make sense to have a SAT base class for the SMT API and reuse that here? Specifically, because solvers could be useful for all of the frontends and also optimizations in the backends, it would be great to make sure everyone can benefit from these features. What do you think?

Note that, there are already precedents for Clang using solver tech from LLVM. E.g., the clang static analyzer's Z3-based refutation support is using the SMT API in LLVM, and there are experiments integrating yet another solver with the SA: https://reviews.llvm.org/D110125

Did you look into reusing existing SAT solvers like miniSAT? What was the main reason for rolling our own instead of picking something up off the shelves?

Mainly to provide a lightweight out-of-the-box alternative. The solver interface is simple so one should be able to roll out an implementation for their favorite solver with little effort. One concrete use case that I have in mind for this solver implementation is unit tests, but we've also used it successfully to analyze a large codebase.

I wonder if it would make sense to have a SAT base class for the SMT API and reuse that here?

I think that on a high level we can either 1) integrate the SMT API types deeply into the dataflow framework and use that solver interface directly or 2) have a layer that translates from the dataflow BoolValue types to the SMT API types. At this point I'm not convinced that we should go for 1). For 2) we can simply provide an adapter that implements the dataflow solver API using the SMT API. This way the dataflow framework can be used with SMT API-compatible solvers. What do you think?

In D120289#3338244, @sgatev wrote:

I wonder if it would make sense to have a SAT base class for the SMT API and reuse that here?

I think that on a high level we can either 1) integrate the SMT API types deeply into the dataflow framework and use that solver interface directly or 2) have a layer that translates from the dataflow BoolValue types to the SMT API types. At this point I'm not convinced that we should go for 1).

Could you elaborate on what would be the main disadvantage of using 1)? (The advantage would be that we could very easily switch to Z3 any time.)

In D120289#3338262, @xazax.hun wrote:

In D120289#3338244, @sgatev wrote:

I wonder if it would make sense to have a SAT base class for the SMT API and reuse that here?

I think that on a high level we can either 1) integrate the SMT API types deeply into the dataflow framework and use that solver interface directly or 2) have a layer that translates from the dataflow BoolValue types to the SMT API types. At this point I'm not convinced that we should go for 1).

Could you elaborate on what would be the main disadvantage of using 1)? (The advantage would be that we could very easily switch to Z3 any time.)

Let me clarify 1) because I see two options there as well:

1.a) We replace the BoolValue hierarchy in the dataflow framework with SMTExprRef.
This feels too committal. The dataflow interfaces are still not sufficiently developed and there isn't a clear benefit (e.g. performance) with this approach so I'd rather not couple them for now. There might also be benefits to having a structured representation (as opposed to using opaque values such as SMTExprRef) that we need to consider. I think this is a viable long-term option, though.

1.b) We add an SMTExprRef member to each boolean value (atom, conjunction, disjunction, and negation) in the dataflow framework.
I like this better than 1.a). The main disadvantage I see is that SMTSolver is a huge interface and on the surface it seems tailored to the Z3 API. From that perspective it also feels more committal than necessary, given that we can make do with a simple one-method interface for now. I think this is a good long-term option if we think the SMT API is the most appropriate solver interface.

I find 2) least committal as it keeps the dataflow framework and the solver separate. For a simple integration with Z3 I can provide an implementation of clang::dataflow::Solver using the SMT API. That should boil down to translating dataflow types to SMT types.

What do you think about this? Any other options we should consider?

In D120289#3341008, @sgatev wrote:

it seems tailored to the Z3 API.

As far as I understand, there were downstream patches that used the same API with other SMT solvers. The authors did not upstream it because they did not see any major improvement on Z3 for any of the use cases.

What do you think about this? Any other options we should consider?

After reading your analysis, I'm fine with the current approach.

Overall, looks good to me. Some nits inline.

Are there plans to get a model/assignment of the variables from the solver? That could be helpful for generating warning messages in the future :)

clang/include/clang/Analysis/FlowSensitive/Solver.h
40	`= default`? To make this trivially destructible.
clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
160	Shouldn't we make this function variadic (or taking a container) instead of having multiple overloads?
214	Switch on the kind instead? That way `DisjunctionValue` and `ConjunctionValue` could be handled by the same case.
242	While I get that this should be empty before this loop due to the loop condition at the previous use, but I find such code confusing. I'd prefer to either use a separate queue or have at least an assert to remind us that this one should be a clean container before the loop.
356	Shouldn't we call the right ctor directly in the initialization list instead of resizing a default constructed vector?

Address reviewers' comments.

Are there plans to get a model/assignment of the variables from the solver? That could be helpful for generating warning messages in the future :)

Absolutely! That was only discussed briefly so far. One challenge would be distilling this model to present only relevant information to the user. That could also be best effort, of course. It's certainly something I think we should provide at some point.

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
160	I'd like to avoid containers here to minimize allocations. I'd love to use a fold expression, but that's C++ 17. I opted for setting default values for the last two arguments.
214	Unfortunately, I don't think we can handle them in the same case because they don't share a parent that represents a binary operation. Perhaps an improvement worth considering, as we discussed in one of the previous patches.
242	I agree that this would be an improvement. I opted for narrowing the scope of the first container and adding a second container for this operation.
356	Makes complete sense. Updated.

This revision was landed with ongoing or failed builds.Feb 25 2022, 6:48 AM

Closed by commit rG53dcd9efd16f: [clang][dataflow] Add SAT solver interface and implementation (authored by sgatev). · Explain Why

This revision was automatically updated to reflect the committed changes.

sgatev added a commit: rG53dcd9efd16f: [clang][dataflow] Add SAT solver interface and implementation.

Harbormaster completed remote builds in B151473: Diff 411400.Feb 25 2022, 6:56 AM

Revision Contents

Path

Size

clang/

include/

clang/

Analysis/

FlowSensitive/

Solver.h

56 lines

WatchedLiteralsSolver.h

37 lines

lib/

Analysis/

FlowSensitive/

CMakeLists.txt

1 line

WatchedLiteralsSolver.cpp

600 lines

unittests/

Analysis/

FlowSensitive/

CMakeLists.txt

1 line

SolverTest.cpp

274 lines

Diff 411403

clang/include/clang/Analysis/FlowSensitive/Solver.h

This file was added.

				//===- Solver.h -------------------------------------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines an interface for a SAT solver that can be used by
				// dataflow analyses.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SOLVER_H
				#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SOLVER_H

				#include "clang/Analysis/FlowSensitive/Value.h"
				#include "llvm/ADT/DenseSet.h"

				namespace clang {
				namespace dataflow {

				/// An interface for a SAT solver that can be used by dataflow analyses.
				class Solver {
				public:
				enum class Result {
				/// Indicates that there exists a satisfying assignment for a boolean
				/// formula.
				Satisfiable,

				/// Indicates that there is no satisfying assignment for a boolean formula.
				Unsatisfiable,

				/// Indicates that the solver gave up trying to find a satisfying assignment
				/// for a boolean formula.
				TimedOut,
				};

				virtual ~Solver() = default;

				ymandelUnsubmitted Done Reply Inline Actions Which `Result` is expected if the `Solver` gives up, assuming that's allowed in this API. ymandel: Which `Result` is expected if the `Solver` gives up, assuming that's allowed in this API.
				sgatevAuthorUnsubmitted Done Reply Inline Actions Added a `TimedOut` result for such cases. sgatev: Added a `TimedOut` result for such cases.
				xazax.hunUnsubmitted Done Reply Inline Actions `= default`? To make this trivially destructible. xazax.hun: `= default`? To make this trivially destructible.
				/// Checks if the conjunction of `Vals` is satisfiable and returns the
				/// corresponding result.
				///
				/// Requirements:
				///
				/// All elements in `Vals` must not be null.
				///
				/// FIXME: Consider returning a model in case the conjunction of `Vals` is
				/// satisfiable so that it can be used to generate warning messages.
				virtual Result solve(llvm::DenseSet<BoolValue *> Vals) = 0;
				};

				} // namespace dataflow
				} // namespace clang

				#endif // LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SOLVER_H

clang/include/clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h

This file was added.

				//===- WatchedLiteralsSolver.h ----------------------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines a SAT solver implementation that can be used by dataflow
				// analyses.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_WATCHEDLITERALSSOLVER_H
				#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_WATCHEDLITERALSSOLVER_H

				#include "clang/Analysis/FlowSensitive/Solver.h"
				#include "clang/Analysis/FlowSensitive/Value.h"
				#include "llvm/ADT/DenseSet.h"

				namespace clang {
				namespace dataflow {

				/// A SAT solver that is an implementation of Algorithm D from Knuth's The Art
				/// of Computer Programming Volume 4: Satisfiability, Fascicle 6. It is based on
				/// the Davis-Putnam-Logemann-Loveland (DPLL) algorithm, keeps references to a
				/// single "watched" literal per clause, and uses a set of "active" variables
				/// for unit propagation.
				class WatchedLiteralsSolver : public Solver {
				public:
				Result solve(llvm::DenseSet<BoolValue *> Vals) override;
				};

				} // namespace dataflow
				} // namespace clang

				#endif // LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_WATCHEDLITERALSSOLVER_H

clang/lib/Analysis/FlowSensitive/CMakeLists.txt

	add_clang_library(clangAnalysisFlowSensitive			add_clang_library(clangAnalysisFlowSensitive
	ControlFlowContext.cpp			ControlFlowContext.cpp
	DataflowEnvironment.cpp			DataflowEnvironment.cpp
	Transfer.cpp			Transfer.cpp
	TypeErasedDataflowAnalysis.cpp			TypeErasedDataflowAnalysis.cpp
				WatchedLiteralsSolver.cpp

	LINK_LIBS			LINK_LIBS
	clangAnalysis			clangAnalysis
	clangAST			clangAST
	)			)

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp

This file was added.

//===- WatchedLiteralsSolver.cpp --------------------------------*- C++ -*-===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

// This file defines a SAT solver implementation that can be used by dataflow

// analyses.

//===----------------------------------------------------------------------===//

#include <cassert>

#include <cstdint>

#include <iterator>

#include <queue>

#include <vector>

#include "clang/Analysis/FlowSensitive/Solver.h"

#include "clang/Analysis/FlowSensitive/Value.h"

#include "clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h"

#include "llvm/ADT/DenseMap.h"

#include "llvm/ADT/DenseSet.h"

#include "llvm/ADT/STLExtras.h"

namespace clang {

namespace dataflow {

// `WatchedLiteralsSolver` is an implementation of Algorithm D from Knuth's

// The Art of Computer Programming Volume 4: Satisfiability, Fascicle 6. It is

// based on the backtracking DPLL algorithm [1], keeps references to a single

// "watched" literal per clause, and uses a set of "active" variables to perform

// unit propagation.

// The solver expects that its input is a boolean formula in conjunctive normal

// form that consists of clauses of at least one literal. A literal is either a

// boolean variable or its negation. Below we define types, data structures, and

// utilities that are used to represent boolean formulas in conjunctive normal

// form.

// [1] https://en.wikipedia.org/wiki/DPLL_algorithm

/// Boolean variables are represented as positive integers.

using Variable = uint32_t;

/// A null boolean variable is used as a placeholder in various data structures

/// and algorithms.

static constexpr Variable NullVar = 0;

/// Literals are represented as positive integers. Specifically, for a boolean

/// variable `V` that is represented as the positive integer `I`, the positive

/// literal `V` is represented as the integer `2*I` and the negative literal

/// `!V` is represented as the integer `2*I+1`.

using Literal = uint32_t;

/// A null literal is used as a placeholder in various data structures and

/// algorithms.

static constexpr Literal NullLit = 0;

/// Returns the positive literal `V`.

static constexpr Literal posLit(Variable V) { return 2 * V; }

/// Returns the negative literal `!V`.

static constexpr Literal negLit(Variable V) { return 2 * V + 1; }

/// Returns the negated literal `!L`.

static constexpr Literal notLit(Literal L) { return L ^ 1; }

/// Returns the variable of `L`.

static constexpr Variable var(Literal L) { return L >> 1; }

/// Clause identifiers are represented as positive integers.

using ClauseID = uint32_t;

/// A null clause identifier is used as a placeholder in various data structures

/// and algorithms.

static constexpr ClauseID NullClause = 0;

/// A boolean formula in conjunctive normal form.

struct BooleanFormula {

/// `LargestVar` is equal to the largest positive integer that represents a

/// variable in the formula.

const Variable LargestVar;

/// Literals of all clauses in the formula.

///

/// The element at index 0 stands for the literal in the null clause. It is

/// set to 0 and isn't used. Literals of clauses in the formula start from the

/// element at index 1.

///

/// For example, for the formula `(L1 v L2) ^ (L2 v L3 v L4)` the elements of

/// `Clauses` will be `[0, L1, L2, L2, L3, L4]`.

std::vector<Literal> Clauses;

/// Start indices of clauses of the formula in `Clauses`.

///

/// The element at index 0 stands for the start index of the null clause. It

/// is set to 0 and isn't used. Start indices of clauses in the formula start

/// from the element at index 1.

///

/// For example, for the formula `(L1 v L2) ^ (L2 v L3 v L4)` the elements of

/// `ClauseStarts` will be `[0, 1, 3]`. Note that the literals of the first

/// clause always start at index 1. The start index for the literals of the

/// second clause depends on the size of the first clause and so on.

std::vector<size_t> ClauseStarts;

/// Maps literals (indices of the vector) to clause identifiers (elements of

/// the vector) that watch the respective literals.

///

/// For a given clause, its watched literal is always its first literal in

/// `Clauses`. This invariant is maintained when watched literals change.

std::vector<ClauseID> WatchedHead;

/// Maps clause identifiers (elements of the vector) to identifiers of other

/// clauses that watch the same literals, forming a set of linked lists.

///

/// The element at index 0 stands for the identifier of the clause that

/// follows the null clause. It is set to 0 and isn't used. Identifiers of

/// clauses in the formula start from the element at index 1.

std::vector<ClauseID> NextWatched;

explicit BooleanFormula(Variable LargestVar) : LargestVar(LargestVar) {

Clauses.push_back(0);

ClauseStarts.push_back(0);

NextWatched.push_back(0);

const size_t NumLiterals = 2 * LargestVar + 1;

WatchedHead.resize(NumLiterals + 1, 0);

}

/// Adds the `L1 v L2 v L3` clause to the formula. If `L2` or `L3` are

/// `NullLit` they are respectively omitted from the clause.

///

/// Requirements:

///

/// `L1` must not be `NullLit`.

///

/// All literals in the input that are not `NullLit` must be distinct.

void addClause(Literal L1, Literal L2 = NullLit, Literal L3 = NullLit) {

// The literals are guaranteed to be distinct from properties of BoolValue

// and the construction in `buildBooleanFormula`.

assert(L1 != NullLit && L1 != L2 && L1 != L3 &&

(L2 != L3 || L2 == NullLit));

const ClauseID C = ClauseStarts.size();

const size_t S = Clauses.size();

ClauseStarts.push_back(S);

Clauses.push_back(L1);

if (L2 != NullLit)

Clauses.push_back(L2);

if (L3 != NullLit)

Clauses.push_back(L3);

// Designate the first literal as the "watched" literal of the clause.

NextWatched.push_back(WatchedHead[L1]);

WatchedHead[L1] = C;

}

/// Returns the number of literals in clause `C`.

xazax.hunUnsubmitted

Done

Shouldn't we make this function variadic (or taking a container) instead of having multiple overloads?

xazax.hun: Shouldn't we make this function variadic (or taking a container) instead of having multiple…

sgatevAuthorUnsubmitted

Done

I'd like to avoid containers here to minimize allocations. I'd love to use a fold expression, but that's C++ 17. I opted for setting default values for the last two arguments.

sgatev: I'd like to avoid containers here to minimize allocations. I'd love to use a fold expression…

size_t clauseSize(ClauseID C) const {

return C == ClauseStarts.size() - 1 ? Clauses.size() - ClauseStarts[C]

: ClauseStarts[C + 1] - ClauseStarts[C];

}

/// Returns the literals of clause `C`.

llvm::ArrayRef<Literal> clauseLiterals(ClauseID C) const {

return llvm::ArrayRef<Literal>(&Clauses[ClauseStarts[C]], clauseSize(C));

}

};

/// Converts the conjunction of `Vals` into a formula in conjunctive normal

/// form where each clause has at least one and at most three literals.

BooleanFormula buildBooleanFormula(const llvm::DenseSet<BoolValue *> &Vals) {

// The general strategy of the algorithm implemented below is to map each

// of the sub-values in `Vals` to a unique variable and use these variables in

// the resulting CNF expression to avoid exponential blow up. The number of

// literals in the resulting formula is guaranteed to be linear in the number

// of sub-values in `Vals`.

// Map each sub-value in `Vals` to a unique variable.

llvm::DenseMap<BoolValue *, Variable> SubValsToVar;

Variable NextVar = 1;

{

std::queue<BoolValue *> UnprocessedSubVals;

for (BoolValue *Val : Vals)

UnprocessedSubVals.push(Val);

while (!UnprocessedSubVals.empty()) {

BoolValue *Val = UnprocessedSubVals.front();

UnprocessedSubVals.pop();

if (!SubValsToVar.try_emplace(Val, NextVar).second)

continue;

++NextVar;

ymandelUnsubmitted

Done

// The general strategy of the algorithm implemented below is to map each

- // sub-values in `Vals` to a unique variable and use these variables in

+ // of the sub-values in `Vals` to a unique variable and use these variables in

// the resulting CNF expression to avoid exponential blow up. The number of

ymandel:

// Visit the sub-values of `Val`.

if (auto *C = dyn_cast<ConjunctionValue>(Val)) {

UnprocessedSubVals.push(&C->getLeftSubValue());

UnprocessedSubVals.push(&C->getRightSubValue());

} else if (auto *D = dyn_cast<DisjunctionValue>(Val)) {

UnprocessedSubVals.push(&D->getLeftSubValue());

UnprocessedSubVals.push(&D->getRightSubValue());

} else if (auto *N = dyn_cast<NegationValue>(Val)) {

UnprocessedSubVals.push(&N->getSubVal());

}

auto GetVar = [&SubValsToVar](const BoolValue *Val) {

auto ValIt = SubValsToVar.find(Val);

assert(ValIt != SubValsToVar.end());

return ValIt->second;

};

xazax.hunUnsubmitted

Done

Switch on the kind instead? That way DisjunctionValue and ConjunctionValue could be handled by the same case.

xazax.hun: Switch on the kind instead? That way `DisjunctionValue` and `ConjunctionValue` could be handled…

sgatevAuthorUnsubmitted

Done

Unfortunately, I don't think we can handle them in the same case because they don't share a parent that represents a binary operation. Perhaps an improvement worth considering, as we discussed in one of the previous patches.

sgatev: Unfortunately, I don't think we can handle them in the same case because they don't share a…

BooleanFormula Formula(NextVar - 1);

std::vector<bool> ProcessedSubVals(NextVar, false);

// Add a conjunct for each variable that represents a top-level conjunction

// value in `Vals`.

for (BoolValue *Val : Vals)

Formula.addClause(posLit(GetVar(Val)));

// Add conjuncts that represent the mapping between newly-created variables

// and their corresponding sub-values.

std::queue<BoolValue *> UnprocessedSubVals;

for (BoolValue *Val : Vals)

UnprocessedSubVals.push(Val);

while (!UnprocessedSubVals.empty()) {

const BoolValue *Val = UnprocessedSubVals.front();

UnprocessedSubVals.pop();

const Variable Var = GetVar(Val);

if (ProcessedSubVals[Var])

continue;

ProcessedSubVals[Var] = true;

if (auto *C = dyn_cast<ConjunctionValue>(Val)) {

const Variable LeftSubVar = GetVar(&C->getLeftSubValue());

const Variable RightSubVar = GetVar(&C->getRightSubValue());

// `X <=> (A ^ B)` is equivalent to `(!X v A) ^ (!X v B) ^ (X v !A v !B)`

// which is already in conjunctive normal form. Below we add each of the

xazax.hunUnsubmitted

Done

While I get that this should be empty before this loop due to the loop condition at the previous use, but I find such code confusing. I'd prefer to either use a separate queue or have at least an assert to remind us that this one should be a clean container before the loop.

xazax.hun: While I get that this should be empty before this loop due to the loop condition at the…

sgatevAuthorUnsubmitted

Done

I agree that this would be an improvement. I opted for narrowing the scope of the first container and adding a second container for this operation.

sgatev: I agree that this would be an improvement. I opted for narrowing the scope of the first…

// conjuncts of the latter expression to the result.

Formula.addClause(negLit(Var), posLit(LeftSubVar));

Formula.addClause(negLit(Var), posLit(RightSubVar));

Formula.addClause(posLit(Var), negLit(LeftSubVar), negLit(RightSubVar));

// Visit the sub-values of `Val`.

UnprocessedSubVals.push(&C->getLeftSubValue());

UnprocessedSubVals.push(&C->getRightSubValue());

} else if (auto *D = dyn_cast<DisjunctionValue>(Val)) {

const Variable LeftSubVar = GetVar(&D->getLeftSubValue());

const Variable RightSubVar = GetVar(&D->getRightSubValue());

// `X <=> (A v B)` is equivalent to `(!X v A v B) ^ (X v !A) ^ (X v !B)`

// which is already in conjunctive normal form. Below we add each of the

// conjuncts of the latter expression to the result.

Formula.addClause(negLit(Var), posLit(LeftSubVar), posLit(RightSubVar));

Formula.addClause(posLit(Var), negLit(LeftSubVar));

Formula.addClause(posLit(Var), negLit(RightSubVar));

// Visit the sub-values of `Val`.

UnprocessedSubVals.push(&D->getLeftSubValue());

UnprocessedSubVals.push(&D->getRightSubValue());

} else if (auto *N = dyn_cast<NegationValue>(Val)) {

const Variable SubVar = GetVar(&N->getSubVal());

// `X <=> !Y` is equivalent to `(!X v !Y) ^ (X v Y)` which is already in

// conjunctive normal form. Below we add each of the conjuncts of the

// latter expression to the result.

Formula.addClause(negLit(Var), negLit(SubVar));

Formula.addClause(posLit(Var), posLit(SubVar));

// Visit the sub-values of `Val`.

UnprocessedSubVals.push(&N->getSubVal());

}

return Formula;

}

class WatchedLiteralsSolverImpl {

/// A boolean formula in conjunctive normal form that the solver will attempt

/// to prove satisfiable. The formula will be modified in the process.

BooleanFormula Formula;

/// The search for a satisfying assignment of the variables in `Formula` will

/// proceed in levels, starting from 1 and going up to `Formula.LargestVar`

/// (inclusive). The current level is stored in `Level`. At each level the

/// solver will assign a value to an unassigned variable. If this leads to a

/// consistent partial assignment, `Level` will be incremented. Otherwise, if

/// it results in a conflict, the solver will backtrack by decrementing

/// `Level` until it reaches the most recent level where a decision was made.

size_t Level = 0;

/// Maps levels (indices of the vector) to variables (elements of the vector)

/// that are assigned values at the respective levels.

///

/// The element at index 0 isn't used. Variables start from the element at

/// index 1.

std::vector<Variable> LevelVars;

/// State of the solver at a particular level.

enum class State : uint8_t {

/// Indicates that the solver made a decision.

Decision = 0,

/// Indicates that the solver made a forced move.

Forced = 1,

};

/// State of the solver at a particular level. It keeps track of previous

/// decisions that the solver can refer to when backtracking.

///

/// The element at index 0 isn't used. States start from the element at index

/// 1.

std::vector<State> LevelStates;

enum class Assignment : int8_t {

Unassigned = -1,

AssignedFalse = 0,

AssignedTrue = 1

};

/// Maps variables (indices of the vector) to their assignments (elements of

/// the vector).

///

/// The element at index 0 isn't used. Variable assignments start from the

/// element at index 1.

std::vector<Assignment> VarAssignments;

/// A set of unassigned variables that appear in watched literals in

/// `Formula`. The vector is guaranteed to contain unique elements.

std::vector<Variable> ActiveVars;

public:

explicit WatchedLiteralsSolverImpl(const llvm::DenseSet<BoolValue *> &Vals)

: Formula(buildBooleanFormula(Vals)), LevelVars(Formula.LargestVar + 1),

LevelStates(Formula.LargestVar + 1) {

assert(!Vals.empty());

// Initialize the state at the root level to a decision so that in

// `reverseForcedMoves` we don't have to check that `Level >= 0` on each

// iteration.

LevelStates[0] = State::Decision;

// Initialize all variables as unassigned.

VarAssignments.resize(Formula.LargestVar + 1, Assignment::Unassigned);

// Initialize the active variables.

for (Variable Var = Formula.LargestVar; Var != NullVar; --Var) {

if (isWatched(posLit(Var)) || isWatched(negLit(Var)))

ActiveVars.push_back(Var);

}

xazax.hunUnsubmitted

Done

Shouldn't we call the right ctor directly in the initialization list instead of resizing a default constructed vector?

xazax.hun: Shouldn't we call the right ctor directly in the initialization list instead of resizing a…

sgatevAuthorUnsubmitted

Done

Makes complete sense. Updated.

sgatev: Makes complete sense. Updated.

Solver::Result solve() && {

size_t I = 0;

while (I < ActiveVars.size()) {

// Assert that the following invariants hold:

// 1. All active variables are unassigned.

// 2. All active variables form watched literals.

// 3. Unassigned variables that form watched literals are active.

// FIXME: Consider replacing these with test cases that fail if the any

// of the invariants is broken. That might not be easy due to the

// transformations performed by `buildBooleanFormula`.

assert(activeVarsAreUnassigned());

assert(activeVarsFormWatchedLiterals());

assert(unassignedVarsFormingWatchedLiteralsAreActive());

const Variable ActiveVar = ActiveVars[I];

// Look for unit clauses that contain the active variable.

const bool unitPosLit = watchedByUnitClause(posLit(ActiveVar));

ymandelUnsubmitted

Done

Why this constraint?

ymandel: Why this constraint?

sgatevAuthorUnsubmitted

Done

That's because solve must be called at most once for WatchedLiteralsSolverImpl - it initializes its data structures in the constructor and then mutates them in solve.

sgatev: That's because `solve` must be called at most once for `WatchedLiteralsSolverImpl` - it…

const bool unitNegLit = watchedByUnitClause(negLit(ActiveVar));

if (unitPosLit && unitNegLit) {

// We found a conflict!

// Backtrack and rewind the `Level` until the most recent non-forced

// assignment.

reverseForcedMoves();

// If the root level is reached, then all possible assignments lead to

// a conflict.

if (Level == 0)

return WatchedLiteralsSolver::Result::Unsatisfiable;

// Otherwise, take the other branch at the most recent level where a

// decision was made.

LevelStates[Level] = State::Forced;

const Variable Var = LevelVars[Level];

VarAssignments[Var] = VarAssignments[Var] == Assignment::AssignedTrue

? Assignment::AssignedFalse

: Assignment::AssignedTrue;

updateWatchedLiterals();

} else if (unitPosLit || unitNegLit) {

// We found a unit clause! The value of its unassigned variable is

// forced.

++Level;

LevelVars[Level] = ActiveVar;

LevelStates[Level] = State::Forced;

VarAssignments[ActiveVar] =

unitPosLit ? Assignment::AssignedTrue : Assignment::AssignedFalse;

// Remove the variable that was just assigned from the set of active

// variables.

if (I + 1 < ActiveVars.size()) {

// Replace the variable that was just assigned with the last active

// variable for efficient removal.

ActiveVars[I] = ActiveVars.back();

} else {

// This was the last active variable. Repeat the process from the

// beginning.

I = 0;

}

ActiveVars.pop_back();

updateWatchedLiterals();

} else if (I + 1 == ActiveVars.size()) {

// There are no remaining unit clauses in the formula! Make a decision

// for one of the active variables at the current level.

++Level;

LevelVars[Level] = ActiveVar;

LevelStates[Level] = State::Decision;

VarAssignments[ActiveVar] = decideAssignment(ActiveVar);

// Remove the variable that was just assigned from the set of active

// variables.

ActiveVars.pop_back();

updateWatchedLiterals();

// This was the last active variable. Repeat the process from the

// beginning.

I = 0;

} else {

++I;

}

return WatchedLiteralsSolver::Result::Satisfiable;

}

private:

// Reverses forced moves until the most recent level where a decision was made

// on the assignment of a variable.

void reverseForcedMoves() {

for (; LevelStates[Level] == State::Forced; --Level) {

const Variable Var = LevelVars[Level];

VarAssignments[Var] = Assignment::Unassigned;

// If the variable that we pass through is watched then we add it to the

// active variables.

if (isWatched(posLit(Var)) || isWatched(negLit(Var)))

ymandelUnsubmitted

Done

nit: ++I, for consistency?

ymandel: nit: `++I`, for consistency?

ActiveVars.push_back(Var);

}

// Updates watched literals that are affected by a variable assignment.

void updateWatchedLiterals() {

const Variable Var = LevelVars[Level];

// Update the watched literals of clauses that currently watch the literal

// that falsifies `Var`.

const Literal FalseLit = VarAssignments[Var] == Assignment::AssignedTrue

? negLit(Var)

: posLit(Var);

ClauseID FalseLitWatcher = Formula.WatchedHead[FalseLit];

Formula.WatchedHead[FalseLit] = NullClause;

while (FalseLitWatcher != NullClause) {

const ClauseID NextFalseLitWatcher = Formula.NextWatched[FalseLitWatcher];

// Pick the first non-false literal as the new watched literal.

const size_t FalseLitWatcherStart = Formula.ClauseStarts[FalseLitWatcher];

size_t NewWatchedLitIdx = FalseLitWatcherStart + 1;

while (isCurrentlyFalse(Formula.Clauses[NewWatchedLitIdx]))

++NewWatchedLitIdx;

const Literal NewWatchedLit = Formula.Clauses[NewWatchedLitIdx];

const Variable NewWatchedLitVar = var(NewWatchedLit);

// Swap the old watched literal for the new one in `FalseLitWatcher` to

// maintain the invariant that the watched literal is at the beginning of

// the clause.

Formula.Clauses[NewWatchedLitIdx] = FalseLit;

Formula.Clauses[FalseLitWatcherStart] = NewWatchedLit;

// If the new watched literal isn't watched by any other clause and its

// variable isn't assigned we need to add it to the active variables.

if (!isWatched(NewWatchedLit) && !isWatched(notLit(NewWatchedLit)) &&

VarAssignments[NewWatchedLitVar] == Assignment::Unassigned)

ActiveVars.push_back(NewWatchedLitVar);

Formula.NextWatched[FalseLitWatcher] = Formula.WatchedHead[NewWatchedLit];

Formula.WatchedHead[NewWatchedLit] = FalseLitWatcher;

// Go to the next clause that watches `FalseLit`.

FalseLitWatcher = NextFalseLitWatcher;

}

/// Returns true if and only if one of the clauses that watch `Lit` is a unit

/// clause.

bool watchedByUnitClause(Literal Lit) const {

for (ClauseID LitWatcher = Formula.WatchedHead[Lit];

LitWatcher != NullClause;

LitWatcher = Formula.NextWatched[LitWatcher]) {

llvm::ArrayRef<Literal> Clause = Formula.clauseLiterals(LitWatcher);

// Assert the invariant that the watched literal is always the first one

// in the clause.

// FIXME: Consider replacing this with a test case that fails if the

// invariant is broken by `updateWatchedLiterals`. That might not be easy

// due to the transformations performed by `buildBooleanFormula`.

assert(Clause.front() == Lit);

if (isUnit(Clause))

return true;

}

return false;

}

/// Returns true if and only if `Clause` is a unit clause.

bool isUnit(llvm::ArrayRef<Literal> Clause) const {

return llvm::all_of(Clause.drop_front(),

[this](Literal L) { return isCurrentlyFalse(L); });

}

ymandelUnsubmitted

Done

// Assert the invariant that the watched literal is always the first one

- // in the clause holds.

+ // in the clause.

// FIXME: Consider replacing this with a test case that fails if the

ymandel:

/// Returns true if and only if `Lit` evaluates to `false` in the current

/// partial assignment.

bool isCurrentlyFalse(Literal Lit) const {

return static_cast<int8_t>(VarAssignments[var(Lit)]) ==

static_cast<int8_t>(Lit & 1);

}

/// Returns true if and only if `Lit` is watched by a clause in `Formula`.

bool isWatched(Literal Lit) const {

return Formula.WatchedHead[Lit] != NullClause;

}

/// Returns an assignment for an unassigned variable.

Assignment decideAssignment(Variable Var) const {

return !isWatched(posLit(Var)) || isWatched(negLit(Var))

? Assignment::AssignedFalse

: Assignment::AssignedTrue;

}

/// Returns a set of all watched literals.

llvm::DenseSet<Literal> watchedLiterals() const {

llvm::DenseSet<Literal> WatchedLiterals;

for (Literal Lit = 2; Lit < Formula.WatchedHead.size(); Lit++) {

if (Formula.WatchedHead[Lit] == NullClause)

continue;

WatchedLiterals.insert(Lit);

}

return WatchedLiterals;

}

/// Returns true if and only if all active variables are unassigned.

bool activeVarsAreUnassigned() const {

return llvm::all_of(ActiveVars, [this](Variable Var) {

return VarAssignments[Var] == Assignment::Unassigned;

});

}

/// Returns true if and only if all active variables form watched literals.

bool activeVarsFormWatchedLiterals() const {

const llvm::DenseSet<Literal> WatchedLiterals = watchedLiterals();

return llvm::all_of(ActiveVars, [&WatchedLiterals](Variable Var) {

return WatchedLiterals.contains(posLit(Var)) ||

WatchedLiterals.contains(negLit(Var));

});

}

/// Returns true if and only if all unassigned variables that are forming

/// watched literals are active.

bool unassignedVarsFormingWatchedLiteralsAreActive() const {

const llvm::DenseSet<Variable> ActiveVarsSet(ActiveVars.begin(),

ActiveVars.end());

for (Literal Lit : watchedLiterals()) {

const Variable Var = var(Lit);

if (VarAssignments[Var] != Assignment::Unassigned)

continue;

if (ActiveVarsSet.contains(Var))

continue;

return false;

}

return true;

}

};

Solver::Result WatchedLiteralsSolver::solve(llvm::DenseSet<BoolValue *> Vals) {

return Vals.empty() ? WatchedLiteralsSolver::Result::Satisfiable

ymandelUnsubmitted

Done

/// Returns true if and only if all unassigned variables that are forming

- /// warched literals are active.

+ /// watched literals are active.

bool unassignedVarsFormingWatchedLiteralsAreActive() const {

ymandel:

: WatchedLiteralsSolverImpl(Vals).solve();

}

} // namespace dataflow

} // namespace clang

clang/unittests/Analysis/FlowSensitive/CMakeLists.txt

	set(LLVM_LINK_COMPONENTS			set(LLVM_LINK_COMPONENTS
	FrontendOpenMP			FrontendOpenMP
	Support			Support
	)			)

	add_clang_unittest(ClangAnalysisFlowSensitiveTests			add_clang_unittest(ClangAnalysisFlowSensitiveTests
	MapLatticeTest.cpp			MapLatticeTest.cpp
	MultiVarConstantPropagationTest.cpp			MultiVarConstantPropagationTest.cpp
	SingleVarConstantPropagationTest.cpp			SingleVarConstantPropagationTest.cpp
	TestingSupport.cpp			TestingSupport.cpp
	TestingSupportTest.cpp			TestingSupportTest.cpp
	TransferTest.cpp			TransferTest.cpp
	TypeErasedDataflowAnalysisTest.cpp			TypeErasedDataflowAnalysisTest.cpp
				SolverTest.cpp
	)			)

	clang_target_link_libraries(ClangAnalysisFlowSensitiveTests			clang_target_link_libraries(ClangAnalysisFlowSensitiveTests
	PRIVATE			PRIVATE
	clangAnalysis			clangAnalysis
	clangAnalysisFlowSensitive			clangAnalysisFlowSensitive
	clangAST			clangAST
	clangASTMatchers			clangASTMatchers
	Show All 12 Lines

clang/unittests/Analysis/FlowSensitive/SolverTest.cpp

This file was added.

				//===- unittests/Analysis/FlowSensitive/SolverTest.cpp --------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "clang/Analysis/FlowSensitive/Solver.h"
				#include "clang/Analysis/FlowSensitive/Value.h"
				#include "clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h"
				#include "gmock/gmock.h"
				#include "gtest/gtest.h"
				#include <memory>
				#include <utility>
				#include <vector>

				namespace {

				using namespace clang;
				using namespace dataflow;

				class SolverTest : public ::testing::Test {
				protected:
				// Checks if the conjunction of `Vals` is satisfiable and returns the
				// corresponding result.
				Solver::Result solve(llvm::DenseSet<BoolValue *> Vals) {
				return WatchedLiteralsSolver().solve(std::move(Vals));
				}

				// Creates an atomic boolean value.
				BoolValue *atom() {
				Vals.push_back(std::make_unique<AtomicBoolValue>());
				return Vals.back().get();
				}

				// Creates a boolean conjunction value.
				BoolValue conj(BoolValue LeftSubVal, BoolValue *RightSubVal) {
				Vals.push_back(
				std::make_unique<ConjunctionValue>(LeftSubVal, RightSubVal));
				return Vals.back().get();
				}

				// Creates a boolean disjunction value.
				BoolValue disj(BoolValue LeftSubVal, BoolValue *RightSubVal) {
				Vals.push_back(
				std::make_unique<DisjunctionValue>(LeftSubVal, RightSubVal));
				return Vals.back().get();
				}

				// Creates a boolean negation value.
				BoolValue neg(BoolValue SubVal) {
				Vals.push_back(std::make_unique<NegationValue>(*SubVal));
				return Vals.back().get();
				}

				// Creates a boolean implication value.
				BoolValue impl(BoolValue LeftSubVal, BoolValue *RightSubVal) {
				return disj(neg(LeftSubVal), RightSubVal);
				}

				// Creates a boolean biconditional value.
				BoolValue iff(BoolValue LeftSubVal, BoolValue *RightSubVal) {
				return conj(impl(LeftSubVal, RightSubVal), impl(RightSubVal, LeftSubVal));
				}

				private:
				std::vector<std::unique_ptr<BoolValue>> Vals;
				};

				TEST_F(SolverTest, Var) {
				auto X = atom();

				// X
				EXPECT_EQ(solve({X}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, NegatedVar) {
				auto X = atom();
				auto NotX = neg(X);

				// !X
				EXPECT_EQ(solve({NotX}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, UnitConflict) {
				auto X = atom();
				auto NotX = neg(X);

				// X ^ !X
				EXPECT_EQ(solve({X, NotX}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, DistinctVars) {
				auto X = atom();
				auto Y = atom();
				auto NotY = neg(Y);

				// X ^ !Y
				EXPECT_EQ(solve({X, NotY}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, DoubleNegation) {
				auto X = atom();
				auto NotX = neg(X);
				auto NotNotX = neg(NotX);

				// !!X ^ !X
				EXPECT_EQ(solve({NotNotX, NotX}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, NegatedDisjunction) {
				auto X = atom();
				auto Y = atom();
				auto XOrY = disj(X, Y);
				auto NotXOrY = neg(XOrY);

				// !(X v Y) ^ (X v Y)
				EXPECT_EQ(solve({NotXOrY, XOrY}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, NegatedConjunction) {
				auto X = atom();
				auto Y = atom();
				auto XAndY = conj(X, Y);
				auto NotXAndY = neg(XAndY);

				// !(X ^ Y) ^ (X ^ Y)
				EXPECT_EQ(solve({NotXAndY, XAndY}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, DisjunctionSameVars) {
				auto X = atom();
				auto NotX = neg(X);
				auto XOrNotX = disj(X, NotX);

				// X v !X
				EXPECT_EQ(solve({XOrNotX}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, ConjunctionSameVarsConflict) {
				auto X = atom();
				auto NotX = neg(X);
				auto XAndNotX = conj(X, NotX);

				// X ^ !X
				EXPECT_EQ(solve({XAndNotX}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, PureVar) {
				auto X = atom();
				auto Y = atom();
				auto NotX = neg(X);
				auto NotXOrY = disj(NotX, Y);
				auto NotY = neg(Y);
				auto NotXOrNotY = disj(NotX, NotY);

				// (!X v Y) ^ (!X v !Y)
				EXPECT_EQ(solve({NotXOrY, NotXOrNotY}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, MustAssumeVarIsFalse) {
				auto X = atom();
				auto Y = atom();
				auto XOrY = disj(X, Y);
				auto NotX = neg(X);
				auto NotXOrY = disj(NotX, Y);
				auto NotY = neg(Y);
				auto NotXOrNotY = disj(NotX, NotY);

				// (X v Y) ^ (!X v Y) ^ (!X v !Y)
				EXPECT_EQ(solve({XOrY, NotXOrY, NotXOrNotY}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, DeepConflict) {
				auto X = atom();
				auto Y = atom();
				auto XOrY = disj(X, Y);
				auto NotX = neg(X);
				auto NotXOrY = disj(NotX, Y);
				auto NotY = neg(Y);
				auto NotXOrNotY = disj(NotX, NotY);
				auto XOrNotY = disj(X, NotY);

				// (X v Y) ^ (!X v Y) ^ (!X v !Y) ^ (X v !Y)
				EXPECT_EQ(solve({XOrY, NotXOrY, NotXOrNotY, XOrNotY}),
				Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, IffSameVars) {
				auto X = atom();
				auto XEqX = iff(X, X);

				// X <=> X
				EXPECT_EQ(solve({XEqX}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, IffDistinctVars) {
				auto X = atom();
				auto Y = atom();
				auto XEqY = iff(X, Y);

				// X <=> Y
				EXPECT_EQ(solve({XEqY}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, IffWithUnits) {
				auto X = atom();
				auto Y = atom();
				auto XEqY = iff(X, Y);

				// (X <=> Y) ^ X ^ Y
				EXPECT_EQ(solve({XEqY, X, Y}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, IffWithUnitsConflict) {
				auto X = atom();
				auto Y = atom();
				auto XEqY = iff(X, Y);
				auto NotY = neg(Y);

				// (X <=> Y) ^ X !Y
				EXPECT_EQ(solve({XEqY, X, NotY}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, IffTransitiveConflict) {
				auto X = atom();
				auto Y = atom();
				auto Z = atom();
				auto XEqY = iff(X, Y);
				auto YEqZ = iff(Y, Z);
				auto NotX = neg(X);

				// (X <=> Y) ^ (Y <=> Z) ^ Z ^ !X
				EXPECT_EQ(solve({XEqY, YEqZ, Z, NotX}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, DeMorgan) {
				auto X = atom();
				auto Y = atom();
				auto Z = atom();
				auto W = atom();

				// !(X v Y) <=> !X ^ !Y
				auto A = iff(neg(disj(X, Y)), conj(neg(X), neg(Y)));

				// !(Z ^ W) <=> !Z v !W
				auto B = iff(neg(conj(Z, W)), disj(neg(Z), neg(W)));

				// A ^ B
				EXPECT_EQ(solve({A, B}), Solver::Result::Satisfiable);
				}

				TEST_F(SolverTest, RespectsAdditionalConstraints) {
				auto X = atom();
				auto Y = atom();
				auto XEqY = iff(X, Y);
				auto NotY = neg(Y);

				// (X <=> Y) ^ X ^ !Y
				EXPECT_EQ(solve({XEqY, X, NotY}), Solver::Result::Unsatisfiable);
				}

				TEST_F(SolverTest, ImplicationConflict) {
				auto X = atom();
				auto Y = atom();
				auto *XImplY = impl(X, Y);
				auto *XAndNotY = conj(X, neg(Y));

				// X => Y ^ X ^ !Y
				EXPECT_EQ(solve({XImplY, XAndNotY}), Solver::Result::Unsatisfiable);
				}

				} // namespace

This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Add SAT solver interface and implementationClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 411403

clang/include/clang/Analysis/FlowSensitive/Solver.h

clang/include/clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h

clang/lib/Analysis/FlowSensitive/CMakeLists.txt

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp

clang/unittests/Analysis/FlowSensitive/CMakeLists.txt

clang/unittests/Analysis/FlowSensitive/SolverTest.cpp

[clang][dataflow] Add SAT solver interface and implementation
ClosedPublic