We should track non-equivalency (disequality) in the case of greater-than or
less-than assumptions.
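To illustrate the intent, a minimal test case (hypothetical, in the spirit of the repro further below): once the analyzer assumes a > b, it should also know a != b, so the equality branch becomes infeasible:

    void clang_analyzer_warnIfReached(); // analyzer test helper

    void impliedDisequality(int a, int b) {
      if (a > b) {
        // Assuming a > b implies a != b; with the disequality tracked,
        // this branch is recognized as infeasible.
        if (a == b)
          clang_analyzer_warnIfReached(); // no-warning
      }
    }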
I came up with exactly the same fix! Great job!
I just wanted to refactor it so that we don't have

    if (New.isEmpty()) // this is an infeasible assumption
      return nullptr;
    ProgramStateRef NewState = setConstraint(St, Sym, New);
    return trackNE(NewState, Sym, Int, Adjustment);

repeated in different places.
After this, "accepted" and a huge thank you 😄
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1349–1379
I suggest changing these two functions this way, so we can avoid the same pattern in four different functions.
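For reference, a sketch of the kind of helper that could fold the repeated pattern into one place (the method name and its placement on RangeConstraintManager are invented; setConstraint and trackNE are the functions quoted above):

    // Hypothetical helper: apply the narrowed range, bail out if the
    // assumption is infeasible, and record the implied disequality.
    ProgramStateRef RangeConstraintManager::setRangeAndTrackNE(
        ProgramStateRef St, SymbolRef Sym, RangeSet New,
        const llvm::APSInt &Int, const llvm::APSInt &Adjustment) {
      if (New.isEmpty()) // infeasible assumption
        return nullptr;
      ProgramStateRef NewState = setConstraint(St, Sym, New);
      return trackNE(NewState, Sym, Int, Adjustment);
    }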
Thanks for the quick review!
I updated the patch according to your suggestion, so we avoid repeating the same pattern.
What are our options for mitigating anything similar in the future?
As it stands, any change touching the SymbolicRangeInferrer and any related parts of the analyzer seems way too fragile.
Especially since we might want to add support for comparing SymSyms, just like we try to do in D77792.
What about changing the EXPENSIVE_CHECKS behavior of the assume function in the following way:
convert all range constraints into a Z3 model and check whether it is UNSAT.
If it is, we were about to return a state with contradictions, so this check would prevent this particular bug from lurking around to bite us later.
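A rough sketch of what that check could look like, modeled on the constraint encoding in FalsePositiveRefutationBRVisitor; the access to the ConstraintRange trait and the surrounding plumbing are assumptions, since the trait is private to RangeConstraintManager.cpp:

    // Sketch: encode every range constraint of the state and ask Z3 whether
    // their conjunction is satisfiable. Assumed to live inside
    // RangeConstraintManager.cpp, where the ConstraintRange trait is visible.
    static bool isStateFeasibleByZ3(ProgramStateRef State, ASTContext &Ctx) {
      llvm::SMTSolverRef Solver = llvm::CreateZ3Solver();
      for (const auto &C : State->get<ConstraintRange>()) {
        // A RangeSet is a union of intervals; encode it as a disjunction of
        // [From, To] membership tests over the symbol. Assumes the stored
        // RangeSet is never empty, which setConstraint should guarantee.
        auto I = C.second.begin();
        llvm::SMTExprRef Expr = SMTConv::getRangeExpr(
            Solver, Ctx, C.first, I->From(), I->To(), /*InRange=*/true);
        for (++I; I != C.second.end(); ++I)
          Expr = Solver->mkOr(Expr,
                              SMTConv::getRangeExpr(Solver, Ctx, C.first,
                                                    I->From(), I->To(),
                                                    /*InRange=*/true));
        Solver->addConstraint(Expr);
      }
      // check() returns None for "unknown"; treat that as feasible here.
      return Solver->check().getValueOr(true);
    }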
Another possibility would be to create a debug checker that registers for the assume callback and does the same conversion and check.
This is somewhat more appealing to me, as it decouples the Z3 dependency from the ConstraintManager header.
Which approach should I prefer? @NoQ @vsavchenko @martong @xazax.hun @Szelethus
I like the second approach, i.e. to have a debug checker. But I don't see how this checker would validate all constraints at the moment they are added to the State. And if we don't check all constraints, then we might end up checking a state that has been invalid for a while (i.e. one that became invalid earlier, and not because of the last added constraint). So, essentially, my gut feeling is that both approaches should validate all newly added constraints against Z3. And that might be too slow: it would have the same speed as using Z3 instead of the range-based solver.
The ProgramStateRef evalAssume(ProgramStateRef State, SVal Cond, bool Assumption) checker callback is called after any assume call.
So we would have a State which has all(?) the constraints stored in the appropriate GDM. I'm not sure if we should aggregate all the constraints up to this node, like we do for refutation. I think in this case we should just make sure that the current state does not contain any contradiction.
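A skeleton of such a debug checker (the checker name and registration function are hypothetical; it reuses the isStateFeasibleByZ3 helper sketched above):

    // Hypothetical debug checker: after every assume, cross-check the
    // resulting state's range constraints with Z3 and abort on contradiction.
    class DebugUnsatStateChecker : public Checker<eval::Assume> {
    public:
      ProgramStateRef evalAssume(ProgramStateRef State, SVal Cond,
                                 bool Assumption) const {
        ASTContext &Ctx = State->getStateManager().getContext();
        if (!isStateFeasibleByZ3(State, Ctx))
          llvm::report_fatal_error("analyzer produced an UNSAT state");
        return State; // feasible: leave the state untouched
      }
    };

    void ento::registerDebugUnsatStateChecker(CheckerManager &Mgr) {
      Mgr.registerChecker<DebugUnsatStateChecker>();
    }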
Here is my proof-of-concept:
I've dumped the state and the arguments of evalAssume for the repro example, pretending that on the infeasible path we don't reach a code path that would crash:
    void avoidInfeasibleConstraintforGT(int a, int b) {
      int c = b - a;
      if (c <= 0)
        return;
      if (a != b) {
        clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
        return;
      }
      clang_analyzer_warnIfReached(); // eh, we will reach this..., #1
      // These are commented out, to pretend that we don't find out that
      // we are on an infeasible path...
      // a == b
      // if (c < 0)
      //   ;
    }
We reach line #1, but more importantly, we get the following state dumps:
    evalAssume: assuming Cond: (reg_$1<int a>) != (reg_$0<int b>) to be true in state:
    "program_state": {
      "constraints": [
        { "symbol": "(reg_$0<int b>) - (reg_$1<int a>)", "range": "{ [1, 2147483647] }" },
        { "symbol": "(reg_$1<int a>) != (reg_$0<int b>)", "range": "{ [-2147483648, -1], [1, 2147483647] }" }
      ]
    }

    evalAssume: assuming Cond: (reg_$1<int a>) != (reg_$0<int b>) to be false in state:
    "program_state": {
      "constraints": [
        { "symbol": "(reg_$0<int b>) - (reg_$1<int a>)", "range": "{ [1, 2147483647] }" },
        { "symbol": "(reg_$1<int a>) != (reg_$0<int b>)", "range": "{ [0, 0] }" }
      ]
    }
As you can see, the latter is the infeasible path: b - a is constrained to [1, 2147483647], while a != b being false means b - a == 0, which is a contradiction. If we had serialized the constraints and let Z3 check them, it would have reported them unsatisfiable.
At this point the checker can dump any useful data for debugging and crash the analyzer to let us know that something really bad happened.
> [...] And that might be too slow: it would have the same speed as using Z3 instead of the range-based solver.
Yes, it will probably be freaking slow, but at least we would have something.