This is an archive of the discontinued LLVM Phabricator instance.

Differential D158407

[clang][dataflow] #llvm #flow-analysis Simplify formula at CNF construction time, and short-cut solving of known contradictory formulas.
Closed · Public

Authored by burakemir on Aug 21 2023, 4:15 AM.

Details

Reviewers

sammccall
xazax.hun
NoQ

Commits

rGb50c87d1e63f: [clang][dataflow] #llvm #flow-analysis Simplify formula at CNF construction…

Summary

In the dataflow analysis SAT solver: simplify the formula during CNF construction, and short-cut
solving when the formula has already been recognized as contradictory.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

burakemir created this revision. · Aug 21 2023, 4:15 AM

Herald added a reviewer: NoQ. · Aug 21 2023, 4:15 AM

Herald added a project: Restricted Project.

burakemir requested review of this revision. · Aug 21 2023, 4:15 AM

Herald added a project: Restricted Project. · Aug 21 2023, 4:15 AM

Herald added a subscriber: cfe-commits.

burakemir edited reviewers, added: sammccall, xazax.hun; removed: NoQ. · Aug 21 2023, 4:19 AM

Herald added a reviewer: NoQ. · Aug 21 2023, 4:19 AM

Harbormaster completed remote builds in B253814: Diff 551975. · Aug 21 2023, 4:48 AM

This simplification makes sense, but if we're adding the layer, we're missing an opportunity to apply it completely.
(Intuitively, I don't see any reason that two passes is "enough")
To do this, we'd want to simplify existing formulas when we learn something new.

Curious whether you think this is worth doing and why - I don't think I'm opposed to the current version if it solves the practical problem, but I am concerned we might be adding simplification layers that solve some test cases but are defeated by relatively simple changes like reordering inputs.

For a complete version, I think the algorithm we'd want is something like:

clauses_so_far = {}
clauses_to_add = [...]

for clause in clauses_to_add:
  clause = simplify(clause)
  if clause is trivially true or clause in clauses_so_far:
    continue # adds nothing
  if clause is {lit}:
    # reprocess clauses containing the variable we just resolved
    affected_clauses = [c in clauses_so_far if lit in c or negate(lit) in c]
    clauses_so_far -= affected_clauses
    clauses_to_add += affected_clauses
  clauses_so_far.add(clause)
  if clause is trivially false:
    break

def simplify(clause):
  if clause.any(lit => {lit} in clauses_so_far):
    return trivially true 
  return clause.filter(lit => not {negate(lit)} in clauses_so_far)

So the builder data structure (clauses_so_far) needs to support:

- test for clause presence
- remove and return clauses containing a literal

I can't come up with anything particularly clever, but throwing some hashtables at it always works...

(A more powerful version would recognize that AvB can simplify AvBvC and so do removal by subset rather than a single literal, but that seems even further beyond scope)
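
The pseudocode above can be made concrete as the following C++ sketch. It is only an illustration of the proposed worklist scheme, not code from the patch: the names `simplifyToFixedPoint` and `SimplifyResult` are hypothetical, clauses are modelled as `std::set<int>` rather than the solver's flat `Clauses` vector, and the literal encoding (`2 * V` for `V`, `2 * V + 1` for `!V`) is borrowed from `WatchedLiteralsSolver.cpp`.

```cpp
#include <cassert>
#include <deque>
#include <optional>
#include <set>
#include <utility>
#include <vector>

using Literal = int;
using Clause = std::set<Literal>;

// Literal encoding borrowed from the patch; variable 0 is left unused.
constexpr Literal posLit(int V) { return 2 * V; }
constexpr Literal negLit(int V) { return 2 * V + 1; }
constexpr Literal notLit(Literal L) { return L ^ 1; }

// Hypothetical result type: the simplified clause set plus a flag that is
// set when an empty (trivially false) clause was derived.
struct SimplifyResult {
  std::set<Clause> Clauses;
  bool Contradictory = false;
};

// Returns std::nullopt if `C` is trivially true given the unit clauses in
// `SoFar`; otherwise returns `C` without the literals known to be false.
static std::optional<Clause> simplify(const Clause &C,
                                      const std::set<Clause> &SoFar) {
  Clause Out;
  for (Literal L : C) {
    if (SoFar.count(Clause{L}))
      return std::nullopt; // `L` is known true, so the clause is true.
    if (!SoFar.count(Clause{notLit(L)}))
      Out.insert(L); // Keep `L` unless `!L` is a known unit clause.
  }
  return Out;
}

SimplifyResult simplifyToFixedPoint(const std::vector<Clause> &Input) {
  SimplifyResult R;
  std::deque<Clause> ToAdd(Input.begin(), Input.end());
  while (!ToAdd.empty()) {
    Clause Raw = std::move(ToAdd.front());
    ToAdd.pop_front();
    std::optional<Clause> C = simplify(Raw, R.Clauses);
    if (!C || R.Clauses.count(*C))
      continue; // Adds nothing.
    if (C->empty()) {
      R.Contradictory = true; // Trivially false clause: stop early.
      R.Clauses.insert(*C);
      break;
    }
    if (C->size() == 1) {
      // New unit clause: re-queue every stored clause that mentions its
      // variable so that it gets simplified again.
      const Literal L = *C->begin();
      for (auto It = R.Clauses.begin(); It != R.Clauses.end();) {
        if (It->count(L) || It->count(notLit(L))) {
          ToAdd.push_back(*It);
          It = R.Clauses.erase(It);
        } else {
          ++It;
        }
      }
    }
    R.Clauses.insert(std::move(*C));
  }
  return R;
}
```

On the clauses `D`, `Av!B`, `Bv!C`, `Cv!D` this sketch reduces everything to the four unit clauses `A`, `B`, `C`, `D`, and appending `!A` makes it report a contradiction. Removing the affected clauses is a linear scan over an ordered set here, so the sketch says nothing about the efficiency a real implementation would need.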

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
181	It would be useful to briefly describe what kind of simplifications. (this drives e.g. the fact we use it twice). AIUI: It tracks variables that must be true/false, based on trivial clauses seen so far. Such variables can be dropped from subsequently-added clauses, which may render such clauses trivial and give us more known variables.
197	you're already testing this in the higher-level loop, so checking on every call to addClause doesn't seem to actually save anything significant. IMO it makes responsibilities less clear, so I'd prefer to drop it here and addClauseLiterals
205	nit: capitalize variable names here & elsewhere
421	the issue is that info only propagates forward (earlier to later clauses, right?) so by running this again, and sorting units first, we allow simplifications that propagate info backwards once, but we still don't have all simplifications.

  D
  Av!B
  Bv!C
  Cv!D
  // first simplification pass
  Av!B
  Bv!C
  C // hoist new unit
  // second simplification pass
  Av!B
  B // hoist new unit
  // third simplification pass
  A

I think this is worth being explicit about: we're going to find some more simplifications, but we won't find them all, because running this to fixed point is too inefficient. Is 2 experimentally determined to be the right number of passes? a guess? or am I misunderstanding :-)
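
The pass-count argument in the comment on line 421 can be checked with a small C++ model. This is a sketch under assumptions, not the patch's `CNFFormulaBuilder`: `forwardPass` and `passesUntilFixedPoint` are hypothetical names, clauses are plain `std::vector`s, and each pass hoists unit clauses to the front before walking the clauses in order, using only the units seen so far.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>
#include <utility>
#include <vector>

using Literal = unsigned;
using Clause = std::vector<Literal>;

// Literal encoding borrowed from the patch; variable 0 is left unused.
constexpr Literal posLit(unsigned V) { return 2 * V; }
constexpr Literal negLit(unsigned V) { return 2 * V + 1; }
constexpr Literal notLit(Literal L) { return L ^ 1; }

// Runs one forward-only simplification pass over `Clauses` in place.
// Returns true if a literal or a whole clause was dropped.
bool forwardPass(std::vector<Clause> &Clauses) {
  // Hoist unit clauses to the front, keeping the relative order otherwise.
  std::stable_partition(Clauses.begin(), Clauses.end(),
                        [](const Clause &C) { return C.size() == 1; });
  std::unordered_set<Literal> KnownTrue; // Literals fixed by unit clauses.
  std::vector<Clause> Out;
  bool Changed = false;
  for (const Clause &C : Clauses) {
    Clause Simplified;
    bool Satisfied = false;
    for (Literal L : C) {
      if (KnownTrue.count(L)) {
        Satisfied = true; // Clause is already true: drop it entirely.
        break;
      }
      if (KnownTrue.count(notLit(L)))
        continue; // Literal is known false: drop it from the clause.
      Simplified.push_back(L);
    }
    if (Satisfied) {
      Changed = true;
      continue;
    }
    Changed |= Simplified.size() != C.size();
    if (Simplified.size() == 1)
      KnownTrue.insert(Simplified.front()); // Learned a new unit clause.
    Out.push_back(std::move(Simplified));
  }
  Clauses = std::move(Out);
  return Changed;
}

// Counts the passes that still simplify something before a fixed point.
int passesUntilFixedPoint(std::vector<Clause> Clauses) {
  int Passes = 0;
  while (forwardPass(Clauses))
    ++Passes;
  return Passes;
}
```

With `A`, `B`, `C`, `D` numbered 1 to 4, the input `D`, `Av!B`, `Bv!C`, `Cv!D` needs three simplifying passes in this model, and after two passes the clause `Av!B` is still present. Longer chains of the same shape need correspondingly more passes, which is the concern about any fixed number of passes.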

burakemir retitled this revision from #llvm #flow-analysis Simplify formula at CNF construction time, and short-cut solving of known contradictory formulas. to [clang][dataflow] #llvm #flow-analysis Simplify formula at CNF construction time, and short-cut solving of known contradictory formulas. · Aug 28 2023, 1:18 AM

Herald added a subscriber: martong. · Aug 28 2023, 1:18 AM

Addressing reviewer comments.

Thanks for the review. PTAL.

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
421	You are right that one could do more work but it is better to leave this to the solver algorithm. We know empirically that there will be a few unit clauses, so might as well spend linear time (in number of unit clauses) to save some work. This won't be enough to determine whether all formulas are satisfiable, but it catches a few obvious contradictions. Doing this twice (as opposed to once) catches more formulas that are obvious contradictions in our unit tests and some real sources. I picked two simply because when we obtain unit clauses "later", we had no opportunity to apply them to earlier clauses. Doing full-blown mutations seems more complicated, esp. given that the Clauses data structure has been written for the actual solver algorithm. I think your concern on optimizing for a certain pattern of input formulas, which may well change in the future, is valid; therefore one should leave the "real" solving work to the solver algorithm, which systematically explores all cases.
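
To illustrate why a second pass catches contradictions that a single pass misses, here is a minimal C++ sketch. It is a simplified stand-in for the approach described in this reply, not the committed code: `SketchBuilder`, `onePassFindsContradiction` and `twoPassesFindContradiction` are hypothetical names, and unlike the patch the sketch does not store a placeholder clause when a contradiction is found.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

using Literal = unsigned;
using Clause = std::vector<Literal>;

// Literal encoding borrowed from the patch; variable 0 is left unused.
constexpr Literal posLit(unsigned V) { return 2 * V; }
constexpr Literal negLit(unsigned V) { return 2 * V + 1; }
constexpr Literal notLit(Literal L) { return L ^ 1; }

// Hypothetical builder: tracks literals fixed by unit clauses and uses
// them to simplify every clause added afterwards.
struct SketchBuilder {
  std::vector<Clause> Clauses;
  std::unordered_set<Literal> KnownTrue;
  bool KnownContradictory = false;

  void addClause(const Clause &C) {
    assert(!C.empty());
    Clause Simplified;
    for (Literal L : C) {
      if (KnownTrue.count(L))
        return; // Clause is already true: omit it.
      if (KnownTrue.count(notLit(L)))
        continue; // Literal is known false: omit it.
      Simplified.push_back(L);
    }
    if (Simplified.empty()) {
      KnownContradictory = true; // Every literal was known to be false.
      return;
    }
    if (Simplified.size() == 1)
      KnownTrue.insert(Simplified.front()); // New unit clause.
    Clauses.push_back(Simplified);
  }
};

// Single pass: unit clauses only affect clauses added after them.
bool onePassFindsContradiction(const std::vector<Clause> &Input) {
  SketchBuilder B;
  for (const Clause &C : Input)
    B.addClause(C);
  return B.KnownContradictory;
}

// Two passes: the second pass adds the unit clauses found so far first,
// then re-adds every clause in its original order.
bool twoPassesFindContradiction(const std::vector<Clause> &Input) {
  SketchBuilder First;
  for (const Clause &C : Input) {
    First.addClause(C);
    if (First.KnownContradictory)
      return true;
  }
  SketchBuilder Final;
  for (const Clause &C : First.Clauses)
    if (C.size() == 1)
      Final.addClause(C);
  for (const Clause &C : First.Clauses) {
    Final.addClause(C);
    if (Final.KnownContradictory)
      return true;
  }
  return false;
}
```

For the clause order `!XvY`, `X`, `!Y`, the single pass sees both unit clauses only after `!XvY` has been stored, so it reports nothing. The second pass starts with `X` and `!Y` already known, reduces `!XvY` to the empty clause, and flags the contradiction.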

Harbormaster completed remote builds in B255297: Diff 554040. · Aug 28 2023, 2:55 PM

sammccall accepted this revision. · Aug 31 2023, 2:01 PM

sammccall added inline comments.

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
155–157	lits => Lits
186	I'm confused about this comment: the preprocessing looks like O(N): we process every clause once, and addClauseLiterals() runs in constant time. What am I missing?
204	if you're going to form an ArrayRef in any case, might as well skip this indirection and have the callsites pass `addClause({L1, L2})` or so?
223	literal => L or so
261	IsKnownContradictory => isKnownContradictory

This revision is now accepted and ready to land. · Aug 31 2023, 2:01 PM

Applied reviewer comments.

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp
186	I am sorry for the misleading text: I only talked about addClause complexity. And I estimated the lookup at a conservative O(log(K)) worst case complexity. And this is just completely wrong, since we can expect lookup to be on a hash-table. It would be O(1) average and worst case O(K). I guess my mind was wandering off thinking about improving worst case lookup complexity. Changed to O(N).
204	Thanks, this looks much nicer. I chose to preserve the assert condition to check that there are <= 3 literals. I believe the solver may work with clauses that have more literals but I don't know whether any trade-offs were made to focus on 3SAT in the solver.

This revision was landed with ongoing or failed builds. · Sep 4 2023, 11:23 PM

Closed by commit rGb50c87d1e63f: [clang][dataflow] #llvm #flow-analysis Simplify formula at CNF construction… (authored by burakemir, committed by mboehme).

This revision was automatically updated to reflect the committed changes.

mboehme added a commit: rGb50c87d1e63f: [clang][dataflow] #llvm #flow-analysis Simplify formula at CNF construction….

Revision Contents

Path	Size
clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp	211 lines
clang/unittests/Analysis/FlowSensitive/SolverTest.cpp	28 lines

Diff 555810

clang/lib/Analysis/FlowSensitive/WatchedLiteralsSolver.cpp

//===- WatchedLiteralsSolver.cpp --------------------------------- C++ --===//		//===- WatchedLiteralsSolver.cpp --------------------------------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file defines a SAT solver implementation that can be used by dataflow		// This file defines a SAT solver implementation that can be used by dataflow
// analyses.		// analyses.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include <cassert>		#include <cassert>
		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <iterator>
#include <queue>		#include <queue>
#include <vector>		#include <vector>

#include "clang/Analysis/FlowSensitive/Formula.h"		#include "clang/Analysis/FlowSensitive/Formula.h"
#include "clang/Analysis/FlowSensitive/Solver.h"		#include "clang/Analysis/FlowSensitive/Solver.h"
#include "clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h"		#include "clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"


namespace clang {		namespace clang {
namespace dataflow {		namespace dataflow {

// `WatchedLiteralsSolver` is an implementation of Algorithm D from Knuth's		// `WatchedLiteralsSolver` is an implementation of Algorithm D from Knuth's
// The Art of Computer Programming Volume 4: Satisfiability, Fascicle 6. It is		// The Art of Computer Programming Volume 4: Satisfiability, Fascicle 6. It is
// based on the backtracking DPLL algorithm [1], keeps references to a single		// based on the backtracking DPLL algorithm [1], keeps references to a single
// "watched" literal per clause, and uses a set of "active" variables to perform		// "watched" literal per clause, and uses a set of "active" variables to perform
// unit propagation.		// unit propagation.
Show All 21 Lines

/// A null literal is used as a placeholder in various data structures and		/// A null literal is used as a placeholder in various data structures and
/// algorithms.		/// algorithms.
static constexpr Literal NullLit = 0;		static constexpr Literal NullLit = 0;

/// Returns the positive literal `V`.		/// Returns the positive literal `V`.
static constexpr Literal posLit(Variable V) { return 2 * V; }		static constexpr Literal posLit(Variable V) { return 2 * V; }

		static constexpr bool isPosLit(Literal L) { return 0 == (L & 1); }

		static constexpr bool isNegLit(Literal L) { return 1 == (L & 1); }

/// Returns the negative literal `!V`.		/// Returns the negative literal `!V`.
static constexpr Literal negLit(Variable V) { return 2 * V + 1; }		static constexpr Literal negLit(Variable V) { return 2 * V + 1; }

/// Returns the negated literal `!L`.		/// Returns the negated literal `!L`.
static constexpr Literal notLit(Literal L) { return L ^ 1; }		static constexpr Literal notLit(Literal L) { return L ^ 1; }

/// Returns the variable of `L`.		/// Returns the variable of `L`.
static constexpr Variable var(Literal L) { return L >> 1; }		static constexpr Variable var(Literal L) { return L >> 1; }
Show All 47 Lines	struct CNFFormula {
/// follows the null clause. It is set to 0 and isn't used. Identifiers of		/// follows the null clause. It is set to 0 and isn't used. Identifiers of
/// clauses in the formula start from the element at index 1.		/// clauses in the formula start from the element at index 1.
std::vector<ClauseID> NextWatched;		std::vector<ClauseID> NextWatched;

/// Stores the variable identifier and Atom for atomic booleans in the		/// Stores the variable identifier and Atom for atomic booleans in the
/// formula.		/// formula.
llvm::DenseMap<Variable, Atom> Atomics;		llvm::DenseMap<Variable, Atom> Atomics;

		/// Indicates that we already know the formula is unsatisfiable.
		/// During construction, we catch simple cases of conflicting unit-clauses.
		bool KnownContradictory;

explicit CNFFormula(Variable LargestVar,		explicit CNFFormula(Variable LargestVar,
llvm::DenseMap<Variable, Atom> Atomics)		llvm::DenseMap<Variable, Atom> Atomics)
: LargestVar(LargestVar), Atomics(std::move(Atomics)) {		: LargestVar(LargestVar), Atomics(std::move(Atomics)),
		KnownContradictory(false) {
Clauses.push_back(0);		Clauses.push_back(0);
ClauseStarts.push_back(0);		ClauseStarts.push_back(0);
NextWatched.push_back(0);		NextWatched.push_back(0);
const size_t NumLiterals = 2 * LargestVar + 1;		const size_t NumLiterals = 2 * LargestVar + 1;
WatchedHead.resize(NumLiterals + 1, 0);		WatchedHead.resize(NumLiterals + 1, 0);
}		}

/// Adds the `L1 v L2 v L3` clause to the formula. If `L2` or `L3` are		/// Adds the `L1 v ... v Ln` clause to the formula.
/// `NullLit` they are respectively omitted from the clause.
///
/// Requirements:		/// Requirements:
///		///
/// `L1` must not be `NullLit`.		/// `Li` must not be `NullLit`.
///		///
/// All literals in the input that are not `NullLit` must be distinct.		/// All literals in the input that are not `NullLit` must be distinct.
void addClause(Literal L1, Literal L2 = NullLit, Literal L3 = NullLit) {		void addClause(ArrayRef<Literal> lits) {
// The literals are guaranteed to be distinct from properties of Formula		assert(!lits.empty());
// and the construction in `buildCNF`.		assert(llvm::all_of(lits, [](Literal L) { return L != NullLit; }));
		sammccall (Not Done): lits => Lits
assert(L1 != NullLit && L1 != L2 && L1 != L3 &&
(L2 != L3 || L2 == NullLit));

const ClauseID C = ClauseStarts.size();		const ClauseID C = ClauseStarts.size();
const size_t S = Clauses.size();		const size_t S = Clauses.size();
ClauseStarts.push_back(S);		ClauseStarts.push_back(S);
		Clauses.insert(Clauses.end(), lits.begin(), lits.end());
Clauses.push_back(L1);
if (L2 != NullLit)
Clauses.push_back(L2);
if (L3 != NullLit)
Clauses.push_back(L3);

// Designate the first literal as the "watched" literal of the clause.		// Designate the first literal as the "watched" literal of the clause.
NextWatched.push_back(WatchedHead[L1]);		NextWatched.push_back(WatchedHead[lits.front()]);
WatchedHead[L1] = C;		WatchedHead[lits.front()] = C;
}		}

/// Returns the number of literals in clause `C`.		/// Returns the number of literals in clause `C`.
size_t clauseSize(ClauseID C) const {		size_t clauseSize(ClauseID C) const {
return C == ClauseStarts.size() - 1 ? Clauses.size() - ClauseStarts[C]		return C == ClauseStarts.size() - 1 ? Clauses.size() - ClauseStarts[C]
: ClauseStarts[C + 1] - ClauseStarts[C];		: ClauseStarts[C + 1] - ClauseStarts[C];
}		}

/// Returns the literals of clause `C`.		/// Returns the literals of clause `C`.
llvm::ArrayRef<Literal> clauseLiterals(ClauseID C) const {		llvm::ArrayRef<Literal> clauseLiterals(ClauseID C) const {
return llvm::ArrayRef<Literal>(&Clauses[ClauseStarts[C]], clauseSize(C));		return llvm::ArrayRef<Literal>(&Clauses[ClauseStarts[C]], clauseSize(C));
}		}
};		};

		/// Applies simplifications while building up a BooleanFormula.
		sammccall (Done): It would be useful to briefly describe what kind of simplifications. (this drives e.g. the fact we use it twice). AIUI: It tracks variables that must be true/false, based on trivial clauses seen so far. Such variables can be dropped from subsequently-added clauses, which may render such clauses trivial and give us more known variables.
		/// We keep track of unit clauses, which tell us variables that must be
		/// true/false in any model that satisfies the overall formula.
		/// Such variables can be dropped from subsequently-added clauses, which
		/// may in turn yield more unit clauses or even a contradiction.
		/// The total added complexity of this preprocessing is O(N) where we
		sammccall (Not Done): I'm confused about this comment: the preprocessing looks like O(N): we process every clause once, and addClauseLiterals() runs in constant time. What am I missing?
		burakemir (Done): I am sorry for the misleading text: I only talked about addClause complexity. And I estimated the lookup at a conservative O(log(K)) worst case complexity. And this is just completely wrong, since we can expect lookup to be on a hash-table. It would be O(1) average and worst case O(K). I guess my mind was wandering off thinking about improving worst case lookup complexity. Changed to O(N).
		/// for every clause, we do a lookup for each unit clauses.
		/// The lookup is O(1) on average. This method won't catch all
		/// contradictory formulas, more passes can in principle catch
		/// more cases but we leave all these and the general case to the
		/// proper SAT solver.
		struct CNFFormulaBuilder {
		// Formula should outlive CNFFormulaBuilder.
		explicit CNFFormulaBuilder(CNFFormula &CNF)
		: Formula(CNF) {}

		/// Adds the `L1 v ... v Ln` clause to the formula. Applies
		sammccall (Done): you're already testing this in the higher-level loop, so checking on every call to addClause doesn't seem to actually save anything significant. IMO it makes responsibilities less clear, so I'd prefer to drop it here and addClauseLiterals
		/// simplifications, based on single-literal clauses.
		///
		/// Requirements:
		///
		/// `Li` must not be `NullLit`.
		///
		/// All literals must be distinct.
		sammccall (Done): if you're going to form an ArrayRef in any case, might as well skip this indirection and have the callsites pass `addClause({L1, L2})` or so?
		burakemir (Done): Thanks, this looks much nicer. I chose to preserve the assert condition to check that there are <= 3 literals. I believe the solver may work with clauses that have more literals but I don't know whether any trade-offs were made to focus on 3SAT in the solver.
		void addClause(ArrayRef<Literal> Literals) {
		sammccall (Not Done): nit: capitalize variable names here & elsewhere
		// We generate clauses with up to 3 literals in this file.
		assert(!Literals.empty() && Literals.size() <= 3);
		// Contains literals of the simplified clause.
		llvm::SmallVector<Literal> Simplified;
		for (auto L : Literals) {
		assert(L != NullLit &&
		llvm::all_of(Simplified,
		[L](Literal S) { return S != L; }));
		auto X = var(L);
		if (trueVars.contains(X)) { // X must be true
		if (isPosLit(L))
		return; // Omit clause `(... v X v ...)`, it is `true`.
		else
		continue; // Omit `!X` from `(... v !X v ...)`.
		}
		if (falseVars.contains(X)) { // X must be false
		if (isNegLit(L))
		return; // Omit clause `(... v !X v ...)`, it is `true`.
		sammccall (Done): literal => L or so
		else
		continue; // Omit `X` from `(... v X v ...)`.
		}
		Simplified.push_back(L);
		}
		if (Simplified.empty()) {
		// Simplification made the clause empty, which is equivalent to `false`.
		// We already know that this formula is unsatisfiable.
		Formula.KnownContradictory = true;
		// We can add any of the input literals to get an unsatisfiable formula.
		Formula.addClause(Literals[0]);
		return;
		}
		if (Simplified.size() == 1) {
		// We have new unit clause.
		const Literal lit = Simplified.front();
		const Variable v = var(lit);
		if (isPosLit(lit))
		trueVars.insert(v);
		else
		falseVars.insert(v);
		}
		Formula.addClause(Simplified);
		}

		/// Returns true if we observed a contradiction while adding clauses.
		/// In this case then the formula is already known to be unsatisfiable.
		bool isKnownContradictory() { return Formula.KnownContradictory; }

		private:
		CNFFormula &Formula;
		llvm::DenseSet<Variable> trueVars;
		llvm::DenseSet<Variable> falseVars;
		};

/// Converts the conjunction of `Vals` into a formula in conjunctive normal		/// Converts the conjunction of `Vals` into a formula in conjunctive normal
/// form where each clause has at least one and at most three literals.		/// form where each clause has at least one and at most three literals.
CNFFormula buildCNF(const llvm::ArrayRef<const Formula *> &Vals) {		CNFFormula buildCNF(const llvm::ArrayRef<const Formula *> &Vals) {
		sammccall (Done): IsKnownContradictory => isKnownContradictory
// The general strategy of the algorithm implemented below is to map each		// The general strategy of the algorithm implemented below is to map each
// of the sub-values in `Vals` to a unique variable and use these variables in		// of the sub-values in `Vals` to a unique variable and use these variables in
// the resulting CNF expression to avoid exponential blow up. The number of		// the resulting CNF expression to avoid exponential blow up. The number of
// literals in the resulting formula is guaranteed to be linear in the number		// literals in the resulting formula is guaranteed to be linear in the number
// of sub-formulas in `Vals`.		// of sub-formulas in `Vals`.

// Map each sub-formula in `Vals` to a unique variable.		// Map each sub-formula in `Vals` to a unique variable.
llvm::DenseMap<const Formula *, Variable> SubValsToVar;		llvm::DenseMap<const Formula *, Variable> SubValsToVar;
Show All 23 Lines	CNFFormula buildCNF(const llvm::ArrayRef<const Formula *> &Vals) {
auto GetVar = [&SubValsToVar](const Formula *Val) {		auto GetVar = [&SubValsToVar](const Formula *Val) {
auto ValIt = SubValsToVar.find(Val);		auto ValIt = SubValsToVar.find(Val);
assert(ValIt != SubValsToVar.end());		assert(ValIt != SubValsToVar.end());
return ValIt->second;		return ValIt->second;
};		};

CNFFormula CNF(NextVar - 1, std::move(Atomics));		CNFFormula CNF(NextVar - 1, std::move(Atomics));
std::vector<bool> ProcessedSubVals(NextVar, false);		std::vector<bool> ProcessedSubVals(NextVar, false);
		CNFFormulaBuilder builder(CNF);

// Add a conjunct for each variable that represents a top-level formula in		// Add a conjunct for each variable that represents a top-level conjunction
// `Vals`.		// value in `Vals`.
for (const Formula *Val : Vals)		for (const Formula *Val : Vals)
CNF.addClause(posLit(GetVar(Val)));		builder.addClause(posLit(GetVar(Val)));

// Add conjuncts that represent the mapping between newly-created variables		// Add conjuncts that represent the mapping between newly-created variables
// and their corresponding sub-formulas.		// and their corresponding sub-formulas.
std::queue<const Formula *> UnprocessedSubVals;		std::queue<const Formula *> UnprocessedSubVals;
for (const Formula *Val : Vals)		for (const Formula *Val : Vals)
UnprocessedSubVals.push(Val);		UnprocessedSubVals.push(Val);
while (!UnprocessedSubVals.empty()) {		while (!UnprocessedSubVals.empty()) {
const Formula *Val = UnprocessedSubVals.front();		const Formula *Val = UnprocessedSubVals.front();
Show All 10 Lines	while (!UnprocessedSubVals.empty()) {
case Formula::And: {		case Formula::And: {
const Variable LHS = GetVar(Val->operands()[0]);		const Variable LHS = GetVar(Val->operands()[0]);
const Variable RHS = GetVar(Val->operands()[1]);		const Variable RHS = GetVar(Val->operands()[1]);

if (LHS == RHS) {		if (LHS == RHS) {
// `X <=> (A ^ A)` is equivalent to `(!X v A) ^ (X v !A)` which is		// `X <=> (A ^ A)` is equivalent to `(!X v A) ^ (X v !A)` which is
// already in conjunctive normal form. Below we add each of the		// already in conjunctive normal form. Below we add each of the
// conjuncts of the latter expression to the result.		// conjuncts of the latter expression to the result.
CNF.addClause(negLit(Var), posLit(LHS));		builder.addClause({negLit(Var), posLit(LHS)});
CNF.addClause(posLit(Var), negLit(LHS));		builder.addClause({posLit(Var), negLit(LHS)});
} else {		} else {
// `X <=> (A ^ B)` is equivalent to `(!X v A) ^ (!X v B) ^ (X v !A v !B)`		// `X <=> (A ^ B)` is equivalent to `(!X v A) ^ (!X v B) ^ (X v !A v
// which is already in conjunctive normal form. Below we add each of the		// !B)` which is already in conjunctive normal form. Below we add each
// conjuncts of the latter expression to the result.		// of the conjuncts of the latter expression to the result.
CNF.addClause(negLit(Var), posLit(LHS));		builder.addClause({negLit(Var), posLit(LHS)});
CNF.addClause(negLit(Var), posLit(RHS));		builder.addClause({negLit(Var), posLit(RHS)});
CNF.addClause(posLit(Var), negLit(LHS), negLit(RHS));		builder.addClause({posLit(Var), negLit(LHS), negLit(RHS)});
}		}
break;		break;
}		}
case Formula::Or: {		case Formula::Or: {
const Variable LHS = GetVar(Val->operands()[0]);		const Variable LHS = GetVar(Val->operands()[0]);
const Variable RHS = GetVar(Val->operands()[1]);		const Variable RHS = GetVar(Val->operands()[1]);

if (LHS == RHS) {		if (LHS == RHS) {
// `X <=> (A v A)` is equivalent to `(!X v A) ^ (X v !A)` which is		// `X <=> (A v A)` is equivalent to `(!X v A) ^ (X v !A)` which is
// already in conjunctive normal form. Below we add each of the		// already in conjunctive normal form. Below we add each of the
// conjuncts of the latter expression to the result.		// conjuncts of the latter expression to the result.
CNF.addClause(negLit(Var), posLit(LHS));		builder.addClause({negLit(Var), posLit(LHS)});
CNF.addClause(posLit(Var), negLit(LHS));		builder.addClause({posLit(Var), negLit(LHS)});
} else {		} else {
// `X <=> (A v B)` is equivalent to `(!X v A v B) ^ (X v !A) ^ (X v		// `X <=> (A v B)` is equivalent to `(!X v A v B) ^ (X v !A) ^ (X v
// !B)` which is already in conjunctive normal form. Below we add each		// !B)` which is already in conjunctive normal form. Below we add each
// of the conjuncts of the latter expression to the result.		// of the conjuncts of the latter expression to the result.
CNF.addClause(negLit(Var), posLit(LHS), posLit(RHS));		builder.addClause({negLit(Var), posLit(LHS), posLit(RHS)});
CNF.addClause(posLit(Var), negLit(LHS));		builder.addClause({posLit(Var), negLit(LHS)});
CNF.addClause(posLit(Var), negLit(RHS));		builder.addClause({posLit(Var), negLit(RHS)});
}		}
break;		break;
}		}
case Formula::Not: {		case Formula::Not: {
const Variable Operand = GetVar(Val->operands()[0]);		const Variable Operand = GetVar(Val->operands()[0]);

// `X <=> !Y` is equivalent to `(!X v !Y) ^ (X v Y)` which is		// `X <=> !Y` is equivalent to `(!X v !Y) ^ (X v Y)` which is
// already in conjunctive normal form. Below we add each of the		// already in conjunctive normal form. Below we add each of the
// conjuncts of the latter expression to the result.		// conjuncts of the latter expression to the result.
CNF.addClause(negLit(Var), negLit(Operand));		builder.addClause({negLit(Var), negLit(Operand)});
CNF.addClause(posLit(Var), posLit(Operand));		builder.addClause({posLit(Var), posLit(Operand)});
break;		break;
}		}
case Formula::Implies: {		case Formula::Implies: {
const Variable LHS = GetVar(Val->operands()[0]);		const Variable LHS = GetVar(Val->operands()[0]);
const Variable RHS = GetVar(Val->operands()[1]);		const Variable RHS = GetVar(Val->operands()[1]);

// `X <=> (A => B)` is equivalent to		// `X <=> (A => B)` is equivalent to
// `(X v A) ^ (X v !B) ^ (!X v !A v B)` which is already in		// `(X v A) ^ (X v !B) ^ (!X v !A v B)` which is already in
// conjunctive normal form. Below we add each of the conjuncts of		// conjunctive normal form. Below we add each of the conjuncts of
// the latter expression to the result.		// the latter expression to the result.
CNF.addClause(posLit(Var), posLit(LHS));		builder.addClause({posLit(Var), posLit(LHS)});
CNF.addClause(posLit(Var), negLit(RHS));		builder.addClause({posLit(Var), negLit(RHS)});
CNF.addClause(negLit(Var), negLit(LHS), posLit(RHS));		builder.addClause({negLit(Var), negLit(LHS), posLit(RHS)});
break;		break;
}		}
case Formula::Equal: {		case Formula::Equal: {
const Variable LHS = GetVar(Val->operands()[0]);		const Variable LHS = GetVar(Val->operands()[0]);
const Variable RHS = GetVar(Val->operands()[1]);		const Variable RHS = GetVar(Val->operands()[1]);

if (LHS == RHS) {		if (LHS == RHS) {
// `X <=> (A <=> A)` is equvalent to `X` which is already in		// `X <=> (A <=> A)` is equivalent to `X` which is already in
// conjunctive normal form. Below we add each of the conjuncts of the		// conjunctive normal form. Below we add each of the conjuncts of the
// latter expression to the result.		// latter expression to the result.
CNF.addClause(posLit(Var));		builder.addClause(posLit(Var));

// No need to visit the sub-values of `Val`.		// No need to visit the sub-values of `Val`.
continue;		continue;
}		}
// `X <=> (A <=> B)` is equivalent to		// `X <=> (A <=> B)` is equivalent to
// `(X v A v B) ^ (X v !A v !B) ^ (!X v A v !B) ^ (!X v !A v B)` which		// `(X v A v B) ^ (X v !A v !B) ^ (!X v A v !B) ^ (!X v !A v B)` which
// is already in conjunctive normal form. Below we add each of the		// is already in conjunctive normal form. Below we add each of the
// conjuncts of the latter expression to the result.		// conjuncts of the latter expression to the result.
CNF.addClause(posLit(Var), posLit(LHS), posLit(RHS));		builder.addClause({posLit(Var), posLit(LHS), posLit(RHS)});
CNF.addClause(posLit(Var), negLit(LHS), negLit(RHS));		builder.addClause({posLit(Var), negLit(LHS), negLit(RHS)});
CNF.addClause(negLit(Var), posLit(LHS), negLit(RHS));		builder.addClause({negLit(Var), posLit(LHS), negLit(RHS)});
CNF.addClause(negLit(Var), negLit(LHS), posLit(RHS));		builder.addClause({negLit(Var), negLit(LHS), posLit(RHS)});
break;		break;
}		}
}		}
		if (builder.isKnownContradictory()) {
		return CNF;
		}
for (const Formula *Child : Val->operands())		for (const Formula *Child : Val->operands())
UnprocessedSubVals.push(Child);		UnprocessedSubVals.push(Child);
}		}

return CNF;		// Unit clauses that were added later were not
		// considered for the simplification of earlier clauses. Do a final
		// pass to find more opportunities for simplification.
		sammccallUnsubmitted Done Reply Inline Actions the issue is that info only propagates forward (earlier to later clauses, right?) so by running this again, and sorting units first, we allow simplifications that propagate info backwards once, but we still don't have all simplifications. D Av!B Bv!C Cv!D // first simplification pass Av!B Bv!C C // hoist new unit // second simplification pass Av!B B // hoist new unit // third simplification pass A I think this is worth being explicit about: we're going to find some more simplifications, but we won't find them all, because running this to fixed point is too inefficient. Is 2 experimentally determined to be the right number of passes? a guess? or am I misunderstanding :-) sammccall: the issue is that info only propagates forward (earlier to later clauses, right?) so by…
burakemir (author, inline comment): You are right that one could do more work, but it is better to leave this to the solver algorithm. We know empirically that there will be a few unit clauses, so we might as well spend linear time (in the number of unit clauses) to save some work. This won't be enough to determine whether all formulas are satisfiable, but it catches a few obvious contradictions.

Doing this twice (as opposed to once) catches more formulas that are obvious contradictions in our unit tests and some real sources. I picked two simply because when we obtain unit clauses "later", we had no opportunity to apply them to earlier clauses. Doing full-blown mutations seems more complicated, especially given that the Clauses data structure has been written for the actual solver algorithm.

I think your concern about optimizing for a certain pattern of input formulas, which may well change in the future, is valid; therefore one should leave the "real" solving work to the solver algorithm, which systematically explores all cases.
		CNFFormula FinalCNF(NextVar - 1, std::move(CNF.Atomics));
		CNFFormulaBuilder FinalBuilder(FinalCNF);

		// Collect unit clauses.
		for (ClauseID C = 1; C < CNF.ClauseStarts.size(); ++C) {
		if (CNF.clauseSize(C) == 1) {
		FinalBuilder.addClause(CNF.clauseLiterals(C)[0]);
		}
		}

		// Add all clauses that were added previously, preserving the order.
		for (ClauseID C = 1; C < CNF.ClauseStarts.size(); ++C) {
		FinalBuilder.addClause(CNF.clauseLiterals(C));
		if (FinalBuilder.isKnownContradictory()) {
		break;
		}
		}
		// It is possible there were new unit clauses again, but
		// we stop here and leave the rest to the solver algorithm.
		return FinalCNF;
}		}
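
The limit discussed in the inline comments above (unit clauses only simplify clauses added after them, so a fixed number of passes cannot find every simplification) can be reproduced with a small standalone sketch. This is not the patch's `CNFFormulaBuilder`: the names `forwardPass` and `hoistAndPass`, the `PassResult` struct, and the DIMACS-style integer literals (`+v` for a variable, `-v` for its negation) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical representation: +v is variable v, -v is its negation.
using Literal = int;
using Clause = std::vector<Literal>;

struct PassResult {
  std::vector<Clause> Clauses; // simplified clauses, in input order
  bool Contradictory = false;  // true if some clause simplified to empty
};

// One forward pass: a unit clause only simplifies clauses that come after it.
PassResult forwardPass(const std::vector<Clause> &In) {
  PassResult Out;
  std::set<Literal> TrueLits; // literals forced true by units seen so far
  for (const Clause &C : In) {
    Clause Simplified;
    bool Satisfied = false;
    for (Literal L : C) {
      assert(L != 0 && "0 is not a valid literal in this sketch");
      if (TrueLits.count(L)) { // clause is already true, omit it
        Satisfied = true;
        break;
      }
      if (TrueLits.count(-L)) // literal is known false, drop it
        continue;
      Simplified.push_back(L);
    }
    if (Satisfied)
      continue;
    if (Simplified.empty()) { // empty clause: formula is unsatisfiable
      Out.Contradictory = true;
      return Out;
    }
    if (Simplified.size() == 1) // new unit, usable by later clauses only
      TrueLits.insert(Simplified[0]);
    Out.Clauses.push_back(Simplified);
  }
  return Out;
}

// Move unit clauses to the front, then run another forward pass, so that
// units discovered late can simplify clauses that originally preceded them.
PassResult hoistAndPass(const std::vector<Clause> &In) {
  std::vector<Clause> Reordered;
  for (const Clause &C : In)
    if (C.size() == 1)
      Reordered.push_back(C);
  for (const Clause &C : In)
    if (C.size() != 1)
      Reordered.push_back(C);
  return forwardPass(Reordered);
}
```

Applied to the reviewer's example with A..D numbered 1..4 (`{4}, {1,-2}, {2,-3}, {3,-4}`), the forward pass only derives the unit C, one hoisted pass then derives B, and a further hoisted pass is needed before A becomes a unit. Each extra pass moves information back by one step of the chain, which is why the patch stops after two passes and leaves the rest to the solver.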

class WatchedLiteralsSolverImpl {		class WatchedLiteralsSolverImpl {
/// A boolean formula in conjunctive normal form that the solver will attempt		/// A boolean formula in conjunctive normal form that the solver will attempt
/// to prove satisfiable. The formula will be modified in the process.		/// to prove satisfiable. The formula will be modified in the process.
CNFFormula CNF;		CNFFormula CNF;

/// The search for a satisfying assignment of the variables in `Formula` will		/// The search for a satisfying assignment of the variables in `Formula` will
…
	for (Variable Var = CNF.LargestVar; Var != NullVar; --Var) {
if (isWatched(posLit(Var)) || isWatched(negLit(Var)))		if (isWatched(posLit(Var)) || isWatched(negLit(Var)))
ActiveVars.push_back(Var);		ActiveVars.push_back(Var);
}		}
}		}

// Returns the `Result` and the number of iterations "remaining" from		// Returns the `Result` and the number of iterations "remaining" from
// `MaxIterations` (that is, `MaxIterations` - iterations in this call).		// `MaxIterations` (that is, `MaxIterations` - iterations in this call).
std::pair<Solver::Result, std::int64_t> solve(std::int64_t MaxIterations) && {		std::pair<Solver::Result, std::int64_t> solve(std::int64_t MaxIterations) && {
		if (CNF.KnownContradictory) {
		// Short-cut the solving process. We already found out at CNF
		// construction time that the formula is unsatisfiable.
		return std::make_pair(Solver::Result::Unsatisfiable(), MaxIterations);
		}
size_t I = 0;		size_t I = 0;
while (I < ActiveVars.size()) {		while (I < ActiveVars.size()) {
if (MaxIterations == 0)		if (MaxIterations == 0)
return std::make_pair(Solver::Result::TimedOut(), 0);		return std::make_pair(Solver::Result::TimedOut(), 0);
--MaxIterations;		--MaxIterations;

// Assert that the following invariants hold:		// Assert that the following invariants hold:
// 1. All active variables are unassigned.		// 1. All active variables are unassigned.
…
	while (I < ActiveVars.size()) {

// This was the last active variable. Repeat the process from the		// This was the last active variable. Repeat the process from the
// beginning.		// beginning.
I = 0;		I = 0;
} else {		} else {
++I;		++I;
}		}
}		}
return std::make_pair(Solver::Result::Satisfiable(buildSolution()), MaxIterations);		return std::make_pair(Solver::Result::Satisfiable(buildSolution()),
		MaxIterations);
}		}

private:		private:
/// Returns a satisfying truth assignment to the atoms in the boolean formula.		/// Returns a satisfying truth assignment to the atoms in the boolean formula.
llvm::DenseMap<Atom, Solver::Result::Assignment> buildSolution() {		llvm::DenseMap<Atom, Solver::Result::Assignment> buildSolution() {
llvm::DenseMap<Atom, Solver::Result::Assignment> Solution;		llvm::DenseMap<Atom, Solver::Result::Assignment> Solution;
for (auto &Atomic : CNF.Atomics) {		for (auto &Atomic : CNF.Atomics) {
// A variable may have a definite true/false assignment, or it may be		// A variable may have a definite true/false assignment, or it may be
…
	bool unassignedVarsFormingWatchedLiteralsAreActive() const {
return true;		return true;
}		}
};		};

Solver::Result		Solver::Result
WatchedLiteralsSolver::solve(llvm::ArrayRef<const Formula *> Vals) {		WatchedLiteralsSolver::solve(llvm::ArrayRef<const Formula *> Vals) {
if (Vals.empty())		if (Vals.empty())
return Solver::Result::Satisfiable({{}});		return Solver::Result::Satisfiable({{}});
auto [Res, Iterations] =		auto [Res, Iterations] = WatchedLiteralsSolverImpl(Vals).solve(MaxIterations);
WatchedLiteralsSolverImpl(Vals).solve(MaxIterations);
MaxIterations = Iterations;		MaxIterations = Iterations;
return Res;		return Res;
}		}

} // namespace dataflow		} // namespace dataflow
} // namespace clang		} // namespace clang
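
The short-cut added to `solve` above returns `Unsatisfiable` without consuming any of the iteration budget, which is what the new unit test checks via `reachedLimit()`. The toy model below sketches that contract under stated assumptions: `ToyFormula`, `solveToy`, and the `WorkUnits` counter are hypothetical stand-ins, and the loop merely burns budget instead of running the real watched-literals search.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

enum class Status { Satisfiable, Unsatisfiable, TimedOut };

// Hypothetical stand-in for a CNF formula: only tracks the flag set at
// construction time and how many search steps a real solver would need.
struct ToyFormula {
  bool KnownContradictory = false;
  std::int64_t WorkUnits = 0;
};

// Returns the status and the iterations remaining from MaxIterations.
std::pair<Status, std::int64_t> solveToy(const ToyFormula &F,
                                         std::int64_t MaxIterations) {
  assert(MaxIterations >= 0);
  if (F.KnownContradictory) {
    // Short-cut: the budget is returned untouched.
    return {Status::Unsatisfiable, MaxIterations};
  }
  for (std::int64_t I = 0; I < F.WorkUnits; ++I) {
    if (MaxIterations == 0)
      return {Status::TimedOut, 0};
    --MaxIterations;
  }
  return {Status::Satisfiable, MaxIterations};
}
```

With a budget of 10 and 10000 work units, the toy solver times out unless the formula was flagged as contradictory beforehand, in which case it reports `Unsatisfiable` with all 10 iterations still available.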

clang/unittests/Analysis/FlowSensitive/SolverTest.cpp

…
	TEST(SolverTest, ReachedLimitsReflectsTimeouts) {
// !(Z ^ W) <=> !Z v !W		// !(Z ^ W) <=> !Z v !W
auto B = Ctx.iff(Ctx.neg(Ctx.conj(Z, W)), Ctx.disj(Ctx.neg(Z), Ctx.neg(W)));		auto B = Ctx.iff(Ctx.neg(Ctx.conj(Z, W)), Ctx.disj(Ctx.neg(Z), Ctx.neg(W)));

// A ^ B		// A ^ B
ASSERT_EQ(solver.solve({A, B}).getStatus(), Solver::Result::Status::TimedOut);		ASSERT_EQ(solver.solve({A, B}).getStatus(), Solver::Result::Status::TimedOut);
EXPECT_TRUE(solver.reachedLimit());		EXPECT_TRUE(solver.reachedLimit());
}		}

		TEST(SolverTest, SimpleButLargeContradiction) {
		// This test ensures that the solver takes a short-cut on known
		// contradictory inputs, without using max_iterations. At the time
		// this test is added, formulas that are easily recognized to be
		// contradictory at CNF construction time would lead to timeout.
		WatchedLiteralsSolver solver(10);
		ConstraintContext Ctx;
		auto first = Ctx.atom();
		auto last = first;
		for (int i = 1; i < 10000; ++i) {
		last = Ctx.conj(last, Ctx.atom());
		}
		last = Ctx.conj(Ctx.neg(first), last);
		ASSERT_EQ(solver.solve({last}).getStatus(),
		Solver::Result::Status::Unsatisfiable);
		EXPECT_FALSE(solver.reachedLimit());

		first = Ctx.atom();
		last = Ctx.neg(first);
		for (int i = 1; i < 10000; ++i) {
		last = Ctx.conj(last, Ctx.neg(Ctx.atom()));
		}
		last = Ctx.conj(first, last);
		ASSERT_EQ(solver.solve({last}).getStatus(),
		Solver::Result::Status::Unsatisfiable);
		EXPECT_FALSE(solver.reachedLimit());
		}

} // namespace		} // namespace