This is an archive of the discontinued LLVM Phabricator instance.

Differential D124758

[analyzer] Implement assume in terms of assumeDual
Closed · Public

Authored by martong on May 2 2022, 4:47 AM.

Details

Reviewers

NoQ
steakhal
ASDenysPetrov
Szelethus

Commits

rG1c1c1e25f94f: [analyzer] Implement assume in terms of assumeDual

Summary

By evaluating both children states, we are now capable of discovering
infeasible parent states. In this patch, assume is implemented in terms
of assumeDual. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls; subsequent patches address that). This
patch fixes a real crash.
Fixes https://github.com/llvm/llvm-project/issues/54272
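
To illustrate the idea in the summary, here is a minimal, self-contained sketch. It is a toy model, not the analyzer's code: ToyState, makeState and the interval-based condition `x < K` are hypothetical stand-ins, and a contradictory interval plays the role of a parent state that the real solver failed to recognize as infeasible. Only the names assume, assumeInternal and assumeDual mirror the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-in for ProgramState: one symbol x constrained to [Lo, Hi].
struct ToyState {
  int Lo;
  int Hi;
  bool PosteriorlyOverconstrained;
};
using ToyStateRef = std::shared_ptr<const ToyState>;
using ToyStatePair = std::pair<ToyStateRef, ToyStateRef>;

ToyStateRef makeState(int Lo, int Hi) {
  return std::make_shared<ToyState>(ToyState{Lo, Hi, false});
}

// Single-branch query: add `x < K` (Assumption == true) or `x >= K`
// (Assumption == false). Returns nullptr when the branch is infeasible.
// K is assumed to be far from INT_MIN so that K - 1 does not overflow.
ToyStateRef assumeInternal(const ToyStateRef &State, int K, bool Assumption) {
  assert(State && "parent state must not be null");
  ToyState Next = *State;
  if (Assumption)
    Next.Hi = std::min(Next.Hi, K - 1);
  else
    Next.Lo = std::max(Next.Lo, K);
  if (Next.Lo > Next.Hi)
    return nullptr;
  return std::make_shared<ToyState>(Next);
}

// Evaluates both children. If neither is feasible, the parent itself was
// infeasible, so a flagged clone is returned for both branches.
ToyStatePair assumeDual(const ToyStateRef &State, int K) {
  ToyStateRef StTrue = assumeInternal(State, K, true);
  ToyStateRef StFalse = assumeInternal(State, K, false);
  if (!StTrue && !StFalse) {
    ToyState Flagged = *State;
    Flagged.PosteriorlyOverconstrained = true;
    ToyStateRef StInfeasible = std::make_shared<ToyState>(Flagged);
    return ToyStatePair(StInfeasible, StInfeasible);
  }
  return ToyStatePair(StTrue, StFalse);
}

// The single-branch entry point is just a projection of the pair.
ToyStateRef assume(const ToyStateRef &State, int K, bool Assumption) {
  ToyStatePair R = assumeDual(State, K);
  return Assumption ? R.first : R.second;
}
```

With a feasible parent such as makeState(0, 10), assume behaves like a plain single-branch query. With a contradictory parent such as makeState(7, 3), both branches come back as the same flagged state instead of two null pointers, which is the property the crash fix relies on.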

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

martong created this revision. · May 2 2022, 4:47 AM

Herald added a reviewer: Szelethus. · May 2 2022, 4:47 AM

Herald added a project: Restricted Project.

Herald added subscribers: manas, gamesh411, dkrupp and 8 others.

martong requested review of this revision. · May 2 2022, 4:47 AM

Herald added a project: Restricted Project. · May 2 2022, 4:47 AM

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B162216: Diff 426376. · May 2 2022, 4:47 AM

martong added a parent revision: D124674: [analyzer] Indicate if a parent state is infeasible. · May 2 2022, 4:48 AM

Although there is a visible slowdown, the new run-times seem promising. My guess is that it is usually not more than 1-2%.

Full report:

stats.html (169 KB)

The patch looks great. Thanks for the stats.

Beyond that, I feel these tests are very fragile, although I have no suggestion for improving that, so be it.

clang/test/Analysis/infeasible-crash.c
4	Do we really need this?

This revision is now accepted and ready to land. · May 2 2022, 6:15 AM

martong mentioned this in D124674: [analyzer] Indicate if a parent state is infeasible. · May 3 2022, 3:53 AM

martong added a child revision: D124761: [analyzer] Replace adjacent assumeInBound calls to assumeInBoundDual. · May 6 2022, 6:57 AM

Rebase to parent revision

Remove ExprInspection checker from the test

clang/test/Analysis/infeasible-crash.c
4	No, thx. Removed.

Harbormaster completed remote builds in B163145: Diff 427636. · May 6 2022, 8:46 AM

This revision was landed with ongoing or failed builds. · May 10 2022, 1:17 AM

Closed by commit rG1c1c1e25f94f: [analyzer] Implement assume in terms of assumeDual (authored by martong).

This revision was automatically updated to reflect the committed changes.

martong added a commit: rG1c1c1e25f94f: [analyzer] Implement assume in terms of assumeDual.

infeasible-crash.c fails on both ARM and Windows. The reason is the incompatible memmove declaration. I am going to fix this ASAP.

Armv7:

error: 'warning' diagnostics seen but not expected: 
  File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove'
error: 'note' diagnostics seen but not expected: 
  File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned int)'
2 errors generated.

Windows x64:

error: 'warning' diagnostics seen but not expected: 
  File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove'
error: 'note' diagnostics seen but not expected: 
  File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned long long)'
2 errors generated.

Hopefully this commit fixes the failure:
https://github.com/llvm/llvm-project/commit/21feafaeb85aad2847db44aa2208999b166ba4a9

In D124758#3502913, @martong wrote:

Hopefully this commit fixes the failure:
https://github.com/llvm/llvm-project/commit/21feafaeb85aad2847db44aa2208999b166ba4a9

Yes, it fixed the test case on both clang-armv7-quick and clang-x64-windows-msvc.
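
For readers wondering why the declaration only failed on some bots: the two builtin types in the logs differ in the third parameter, which is size_t (unsigned int on Armv7, unsigned long long on Windows x64), so a hard-coded unsigned long matches only on targets where size_t happens to be unsigned long. The sketch below shows one way to avoid hard-coding the width, by deriving the type from a sizeof expression. PortableSize and unsignedLongIsSizeT are hypothetical names, and this is not necessarily what the linked commit does.

```cpp
#include <cstddef>
#include <type_traits>

// The type of a sizeof expression is size_t on every target, so deriving the
// parameter type from it avoids hard-coding a width.
using PortableSize = decltype(sizeof(0));

static_assert(std::is_same<PortableSize, std::size_t>::value,
              "sizeof always yields size_t");

// Hypothetical helper: reports whether a hard-coded `unsigned long` would
// happen to match size_t on the current target. According to the logs above
// it would not on the two failing bots.
constexpr bool unsignedLongIsSizeT() {
  return std::is_same<unsigned long, std::size_t>::value;
}
```

In a C test file the same idea is commonly spelled as `typedef __typeof(sizeof(int)) size_t;` followed by a memmove declaration that takes size_t.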

martong added a child revision: D125892: [analyzer] Implement assumeInclusiveRange in terms of assumeInclusiveRangeDual. · May 18 2022, 8:03 AM

This commit introduced a serious runtime regression on this code:

#define DEMONSTRATE_HANG

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned long uint64_t;

void clang_analyzer_numTimesReached(void);

int filter_slice_word(int sat_linesize, int sigma, int radius, uint64_t *sat,
                      uint64_t *square_sat, int width, int height,
                      int src_linesize, int dst_linesize, const uint16_t *src,
                      uint16_t *dst, int jobnr, int nb_jobs) {
  const int starty = height * jobnr / nb_jobs;
  const int endy = height * (jobnr + 1) / nb_jobs;

  clang_analyzer_numTimesReached(); // 1 times
  for (int y = starty; y < endy; y++) {
    clang_analyzer_numTimesReached(); // 285 times

    int lower_y = y - radius < 0 ? 0 : y - radius;
    int higher_y = y + radius + 1 > height ? height : y + radius + 1;
    int dist_y = higher_y - lower_y;
    clang_analyzer_numTimesReached(); // 1128 times

    for (int x = 0; x < width; x++) {
      clang_analyzer_numTimesReached(); // 560 times
      
      int lower_x = x - radius < 0 ? 0 : x - radius;
      int higher_x = x + radius + 1 > width ? width : x + radius + 1;
      int count = dist_y * (higher_x - lower_x);
#ifdef DEMONSTRATE_HANG
      uint64_t sum = sat[higher_y * sat_linesize + higher_x] -
                     sat[higher_y * sat_linesize + lower_x] -
                     sat[lower_y * sat_linesize + higher_x] +
                     sat[lower_y * sat_linesize + lower_x];
      uint64_t square_sum = square_sat[higher_y * sat_linesize + higher_x] -
                            square_sat[higher_y * sat_linesize + lower_x] -
                            square_sat[lower_y * sat_linesize + higher_x] +
                            square_sat[lower_y * sat_linesize + lower_x];
      uint64_t mean = sum / count;
      uint64_t var = (square_sum - sum * sum / count) / count;
      dst[y * dst_linesize + x] =
          (sigma * mean + var * src[y * src_linesize + x]) / (sigma + var);
#endif
    }
  }
  return 0;
}

/build/release/bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c

Prior to this commit, the analysis of this code took about 2.6 seconds on my machine (release build with shared libs).
After this commit, it takes more than 67 minutes, and I think most of that time is spent in the constraint solver doing simplification.

Recall that the ArrayBoundV2 checker tries to isolate the symbolic atom of the index expression:
0 <= (x + 3) < extent => -3 <= x < extent - 3 (this is more complex in the wild; it is simplified here for demonstration).
So it keeps creating new states during this process, while gathering constraints to make the indexing well-formed.
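
The rearrangement in the example can be sketched on concrete integers as follows. This is only an illustration of the arithmetic, assuming no overflow or wraparound, which the real checker has to deal with on symbolic values; Bounds, isolateSymbol and the two predicates are hypothetical names.

```cpp
#include <cassert>

// Half-open bound pair for the isolated symbol: Lo <= x < Hi.
struct Bounds {
  long Lo;
  long Hi;
};

// Rewrites 0 <= (x + Offset) < Extent into Lo <= x < Hi by moving the
// constant to the other side of both inequalities.
Bounds isolateSymbol(long Offset, long Extent) {
  assert(Extent >= 0 && "an extent is a size and cannot be negative");
  return Bounds{0 - Offset, Extent - Offset};
}

// The original, un-rearranged bounds check for a concrete x.
bool inBoundsOriginal(long X, long Offset, long Extent) {
  return 0 <= X + Offset && X + Offset < Extent;
}

// The same check expressed on the isolated symbol.
bool inBoundsRearranged(long X, Bounds B) { return B.Lo <= X && X < B.Hi; }
```

For Offset 3 and Extent 10, isolateSymbol yields the pair (-3, 7), matching -3 <= x < extent - 3 from the example.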

However, the nested loops and the eager bifurcation over the < and > operators cause a significant path explosion, and on each of those paths we perform this complex reorganization of the constraint system, aggregating constraints, which in turn triggers constraint system simplifications down the line.
I'm not sure how to tackle this problem ATM.

Thanks Balazs for the report.

Here is my analysis. It looks like during the recursive simplification, reAssume produces States that had already been created by a previous reAssume. Before this change, to stop the recursion it was enough to check whether OldState equals the actual State in reAssume. Now, with this change, each assume call evaluates both the true and the false branches, so a subsequent reAssume is not guaranteed to detect an already "visited" State.
So, the obvious solution would be to have a State cache in the reAssume machinery, though the implementation details are not clear yet.
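
One possible shape of such a cache, sketched on a toy model: states are plain integers, Step stands in for one round of simplification, and a visited set stops the recursion even when the states cycle without ever hitting a fixed point. All names here (StateId, reAssumeWithCache, Step, Visited) are hypothetical; this is not a proposal for the actual reAssume signature.

```cpp
#include <cassert>
#include <functional>
#include <unordered_set>

using StateId = int; // hypothetical stand-in for the identity of a State

// Applies Step repeatedly until a fixed point or an already visited state is
// reached. Returns the number of productive Step applications.
int reAssumeWithCache(StateId State,
                      const std::function<StateId(StateId)> &Step,
                      std::unordered_set<StateId> &Visited) {
  assert(Step && "Step must be callable");
  if (!Visited.insert(State).second)
    return 0; // already seen during this simplification: stop recursing
  StateId Next = Step(State);
  if (Next == State)
    return 0; // fixed point: the OldState == State style check
  return 1 + reAssumeWithCache(Next, Step, Visited);
}
```

With a Step that cycles through three states, the fixed-point check alone would never fire, while the visited set ends the recursion after three steps.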

There is another really important thing. We should not continue with reAssume if the State is posteriorly overconstrained.

 LLVM_NODISCARD ProgramStateRef reAssume(ProgramStateRef State,
                                         const RangeSet *Constraint,
                                         SVal TheValue) {
+  if (State->isPosteriorlyOverconstrained())
+    return nullptr;
   if (!Constraint)
     return State;

This change in itself reduced the run-time of the analysis to 16 seconds on my machine. However, the repetition of States should still be addressed. I am going to upload the patch above as a start.

Sorry, that 16s also included the rebuild and linking of the Clang binary. The actual time is much better, 2.8s, which is quite close to the values we had before this change. So perhaps it is not even necessary to bother with the cache mechanism mentioned above.

 time ./bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c
test.c:14:3: warning: 1 [debug.ExprInspection]
  clang_analyzer_numTimesReached(); // 1 times
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:16:5: warning: 253 [debug.ExprInspection]
    clang_analyzer_numTimesReached(); // 285 times
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:21:5: warning: 805 [debug.ExprInspection]
    clang_analyzer_numTimesReached(); // 1128 times
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:24:7: warning: 487 [debug.ExprInspection]
      clang_analyzer_numTimesReached(); // 560 times
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 warnings generated.
./bin/clang --analyze -Xclang  test.c  2.74s user 0.07s system 99% cpu 2.811 total

martong added a child revision: D126406: [analyzer] Return from reAssume if State is posteriorly overconstrained. · May 25 2022, 12:30 PM

As a heads up, because I'm not sure how often folks look at GitHub Issues: this patch causes a stack overflow on some Objective-C++ code. I have filed https://github.com/llvm/llvm-project/issues/55851. Could you take a look, @martong?

Revision Contents

Path                                                                               Size
clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h          8 lines
clang/include/clang/StaticAnalyzer/Core/PathSensitive/SimpleConstraintManager.h    4 lines
clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp                                12 lines
clang/lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp                          10 lines
clang/test/Analysis/infeasible-crash.c                                             38 lines

Diff 427635

clang/include/clang/StaticAnalyzer/Core/PathSensitive/ConstraintManager.h

[76 lines not shown]
 class ConstraintManager {
 public:
   ConstraintManager() = default;
   virtual ~ConstraintManager();

   virtual bool haveEqualConstraints(ProgramStateRef S1,
                                     ProgramStateRef S2) const = 0;

-  virtual ProgramStateRef assume(ProgramStateRef state,
-                                 DefinedSVal Cond,
-                                 bool Assumption) = 0;
+  ProgramStateRef assume(ProgramStateRef state, DefinedSVal Cond,
+                         bool Assumption);

   using ProgramStatePair = std::pair<ProgramStateRef, ProgramStateRef>;

   /// Returns a pair of states (StTrue, StFalse) where the given condition is
   /// assumed to be true or false, respectively.
   /// (Note that these two states might be equal if the parent state turns out
   /// to be infeasible. This may happen if the underlying constraint solver is
   /// not perfectly precise and this may happen very rarely.)
[60 lines not shown; context: protected:]
   /// A flag to indicate that clients should be notified of assumptions.
   /// By default this is the case, but sometimes this needs to be restricted
   /// to avoid infinite recursions within the ConstraintManager.
   ///
   /// Note that this flag allows the ConstraintManager to be re-entrant,
   /// but not thread-safe.
   bool NotifyAssumeClients = true;

+  virtual ProgramStateRef assumeInternal(ProgramStateRef state,
+                                         DefinedSVal Cond, bool Assumption) = 0;
+
   /// canReasonAbout - Not all ConstraintManagers can accurately reason about
   /// all SVal values. This method returns true if the ConstraintManager can
   /// reasonably handle a given SVal value. This is typically queried by
   /// ExprEngine to determine if the value should be replaced with a
   /// conjured symbolic value in order to recover some precision.
   virtual bool canReasonAbout(SVal X) const = 0;

   /// Returns whether or not a symbol is known to be null ("true"), known to be
[16 lines not shown]

clang/include/clang/StaticAnalyzer/Core/PathSensitive/SimpleConstraintManager.h

[30 lines not shown; context: public:]
   ~SimpleConstraintManager() override;

   //===------------------------------------------------------------------===//
   // Implementation for interface from ConstraintManager.
   //===------------------------------------------------------------------===//

   /// Ensures that the DefinedSVal conditional is expressed as a NonLoc by
   /// creating boolean casts to handle Loc's.
-  ProgramStateRef assume(ProgramStateRef State, DefinedSVal Cond,
-                         bool Assumption) override;
+  ProgramStateRef assumeInternal(ProgramStateRef State, DefinedSVal Cond,
+                                 bool Assumption) override;

   ProgramStateRef assumeInclusiveRange(ProgramStateRef State, NonLoc Value,
                                        const llvm::APSInt &From,
                                        const llvm::APSInt &To,
                                        bool InRange) override;

 protected:
   //===------------------------------------------------------------------===//
[44 lines not shown]

clang/lib/StaticAnalyzer/Core/ConstraintManager.cpp

[38 lines not shown; context: if (P.first && !P.second)]
     return ConditionTruthVal(false);
   if (!P.first && P.second)
     return ConditionTruthVal(true);
   return {};
 }

 ConstraintManager::ProgramStatePair
 ConstraintManager::assumeDual(ProgramStateRef State, DefinedSVal Cond) {
-  ProgramStateRef StTrue = assume(State, Cond, true);
+  ProgramStateRef StTrue = assumeInternal(State, Cond, true);

   if (!StTrue) {
-    ProgramStateRef StFalse = assume(State, Cond, false);
+    ProgramStateRef StFalse = assumeInternal(State, Cond, false);
     if (LLVM_UNLIKELY(!StFalse)) { // both infeasible
       ProgramStateRef StInfeasible = State->cloneAsPosteriorlyOverconstrained();
       assert(StInfeasible->isPosteriorlyOverconstrained());
       // Checkers might rely on the API contract that both returned states
       // cannot be null. Thus, we return StInfeasible for both branches because
       // it might happen that a Checker uncoditionally uses one of them if the
       // other is a nullptr. This may also happen with the non-dual and
       // adjacent `assume(true)` and `assume(false)` calls. By implementing
       // assume in therms of assumeDual, we can keep our API contract there as
       // well.
       return ProgramStatePair(StInfeasible, StInfeasible);
     }
     return ProgramStatePair(nullptr, StFalse);
   }

-  ProgramStateRef StFalse = assume(State, Cond, false);
+  ProgramStateRef StFalse = assumeInternal(State, Cond, false);
   if (!StFalse) {
     return ProgramStatePair(StTrue, nullptr);
   }

   return ProgramStatePair(StTrue, StFalse);
 }
+
+ProgramStateRef ConstraintManager::assume(ProgramStateRef State,
+                                          DefinedSVal Cond, bool Assumption) {
+  ConstraintManager::ProgramStatePair R = assumeDual(State, Cond);
+  return Assumption ? R.first : R.second;
+}

clang/lib/StaticAnalyzer/Core/SimpleConstraintManager.cpp

[16 lines not shown]
 #include "clang/StaticAnalyzer/Core/PathSensitive/ProgramState.h"

 namespace clang {

 namespace ento {

 SimpleConstraintManager::~SimpleConstraintManager() {}

-ProgramStateRef SimpleConstraintManager::assume(ProgramStateRef State,
-                                                DefinedSVal Cond,
-                                                bool Assumption) {
+ProgramStateRef SimpleConstraintManager::assumeInternal(ProgramStateRef State,
+                                                        DefinedSVal Cond,
+                                                        bool Assumption) {
   // If we have a Loc value, cast it to a bool NonLoc first.
   if (Optional<Loc> LV = Cond.getAs<Loc>()) {
     SValBuilder &SVB = State->getStateManager().getSValBuilder();
     QualType T;
     const MemRegion *MR = LV->getAsRegion();
     if (const TypedRegion *TR = dyn_cast_or_null<TypedRegion>(MR))
       T = TR->getLocationType();
     else
[45 lines not shown; context: ProgramStateRef SimpleConstraintManager::assumeAux(ProgramStateRef State,]

   case nonloc::PointerToMemberKind: {
     bool IsNull = !Cond.castAs<nonloc::PointerToMember>().isNullMemberPointer();
     bool IsFeasible = IsNull ? Assumption : !Assumption;
     return IsFeasible ? State : nullptr;
   }

   case nonloc::LocAsIntegerKind:
-    return assume(State, Cond.castAs<nonloc::LocAsInteger>().getLoc(),
-                  Assumption);
+    return assumeInternal(State, Cond.castAs<nonloc::LocAsInteger>().getLoc(),
+                          Assumption);
   } // end switch
 }

 ProgramStateRef SimpleConstraintManager::assumeInclusiveRange(
     ProgramStateRef State, NonLoc Value, const llvm::APSInt &From,
     const llvm::APSInt &To, bool InRange) {

   assert(From.isUnsigned() == To.isUnsigned() &&
[34 lines not shown]

clang/test/Analysis/infeasible-crash.c

This file was added.

// RUN: %clang_analyze_cc1 %s \
// RUN: -analyzer-checker=core \
// RUN: -analyzer-checker=alpha.unix.cstring.OutOfBounds,alpha.unix.cstring.UninitializedRead \
// RUN: -analyzer-checker=debug.ExprInspection \
// RUN: -analyzer-config eagerly-assume=false \
// RUN: -verify

// expected-no-diagnostics

void *memmove(void *, const void *, unsigned long);

typedef struct {
  char a[1024];
} b;
int c;
b *invalidate();
int d() {
  b *a = invalidate();
  if (c < 1024)
    return 0;
  int f = c & ~3, g = f;
  g--;
  if (g)
    return 0;

  // Parent state is already infeasible.
  // clang_analyzer_printState();
  // "constraints": [
  //   { "symbol": "(derived_$3{conj_$0{int, LC1, S728, #1},c}) & -4", "range": "{ [1, 1] }" },
  //   { "symbol": "derived_$3{conj_$0{int, LC1, S728, #1},c}", "range": "{ [1024, 2147483647] }" }
  // ],

  // This sould not crash!
  // It crashes in baseline, since there both true and false states are nullptr!
  memmove(a->a, &a->a[f], c - f);

  return 0;
}
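
Why the state before the memmove call is infeasible can be checked on concrete values. The sketch below mirrors the path conditions of d() and searches a range of inputs for one that would reach the call. reachesMemmove and anyWitness are hypothetical helper names, and the search is only a bounded sanity check, not a proof: the general argument is that c & ~3 always has its two lowest bits cleared, so it can never equal 1.

```cpp
#include <cassert>
#include <climits>

// Mirrors the path conditions of d(): the c < 1024 branch was not taken,
// f = c & ~3, g = f - 1, and the path continues only when g == 0.
bool reachesMemmove(int C) {
  if (C < 1024)
    return false;
  int F = C & ~3;
  int G = F - 1;
  return G == 0;
}

// Exhaustively searches [Lo, Hi] for a value of c that reaches the call.
bool anyWitness(int Lo, int Hi) {
  assert(Hi < INT_MAX && "keeps the loop counter from overflowing");
  for (int C = Lo; C <= Hi; ++C)
    if (reachesMemmove(C))
      return true;
  return false;
}
```

No value in the searched range reaches the call, which is consistent with the two constraints printed in the test comment contradicting each other.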