This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Support generating and reasoning over more symbolic constraint types
Needs RevisionPublic

Authored by ddcc on Jul 14 2017, 11:20 PM.

Details

Summary

Generate more IntSymExpr constraints, perform SVal simplification for IntSymExpr and SymbolCast constraints, and create fully symbolic SymExprs. Also fixes crashes and adds support for tests that should not be run with Z3ConstraintManager.

Event Timeline

ddcc created this revision.Jul 14 2017, 11:20 PM
ddcc updated this revision to Diff 106758.Jul 15 2017, 1:21 AM

Modify Z3RangeConstraintManager::fixAPSInt() and add Expr::isCommutativeOp()

ddcc added a comment.Jul 15 2017, 1:24 AM

Compared with D28953, this revision fixes the test failure with PR3991.m with RangeConstraintManager, and a few other failures with Z3ConstraintManager. However, there's one remaining test failure in range_casts.c that I'm not sure how to resolve. For reference, this is the simplified code snippet from that testcase:

void f15(long foo) {
  unsigned index = -1;
  if (index < foo) index = foo;
  unsigned int tmp = index + 1;
  if (tmp == 0)
    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
  else
    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
}

In debug mode, an assertion about the system being over-constrained is thrown from ConstraintManager.h. This is because the new Simplifier::VisitSymbolCast function will attempt to evaluate the cast (unsigned int) (reg_$0<long foo>), which is suppressed by the call to haveSameType() in SimpleSValBuilder::evalCastFromNonLoc (D28955 fixes this, but only for Z3ConstraintManager), generating just the symbol reg_$0<long foo>. Subsequently, the analyzer will attempt to evaluate the expression ((reg_$0<long foo>) + 1U) == 0U with the range reg_$0<long foo> : { [4294967296, 9223372036854775807] }, or [UINT_MAX + 1, LONG_MAX]. However, in the case where the assumption is true, RangeConstraintManager takes the intersection of the previous range with [UINT_MAX, UINT_MAX] and produces the empty set, and likewise where the assumption is false, the intersection with [UINT_MAX - 1, 0] again produces the empty set. As a result, both program states are NULL, triggering the assertion.

I'm now somewhat inclined to drop the addition of Simplified::VisitSymbolCast() and those associated testsuite changes, because ignoring type casts is clearly incorrect and will introduce both false negatives and false positives.

NoQ edited edge metadata.Jul 17 2017, 8:15 AM

I'd have a look soon.

Is diff 1 the original diff from D28953? It was ok to reopen it, but the new revision is also fine.

Regarding 1-bit bools: did you notice D32328 and D35041, do they accidentally help?

include/clang/AST/Expr.h
3096

There seems to be a typo somewhere around the last comparison.

ddcc updated this revision to Diff 106883.Jul 17 2017, 8:46 AM

Fix typo

ddcc added a comment.Jul 17 2017, 8:46 AM

Is diff 1 the original diff from D28953? It was ok to reopen it, but the new revision is also fine.

No, diff 1 is already different; it contains most of the bugfixes. I couldn't find any way to reopen the previous review, and arc diff wouldn't let me update a closed review.

Regarding 1-bit bools: did you notice D32328 and D35041, do they accidentally help?

I did see those changes, but I think they're a little different. The test failures I ran into were cases where the input is a 1-bit APSInt, and attempting to retrieve the type for that gives a null QualType.

ddcc updated this revision to Diff 106896.Jul 17 2017, 9:55 AM

Fix tests after typo fix

ddcc added a comment.Jul 17 2017, 10:01 AM

As an update, after fixing the typo and updating the tests, the assertion in range_casts.c is no longer triggered and everything seems fine now.

ddcc updated this revision to Diff 107190.Jul 18 2017, 3:19 PM

Minor style fix

zaks.anna added inline comments.Aug 25 2017, 9:44 AM
lib/StaticAnalyzer/Core/SValBuilder.cpp
364

As a follow up to the previous version of this patch, I do not think we should set the default complexity limit to 10000.

What is the relation between this limit and the limit in VisitNonLocSymbolVal? If they are related, would it be worthwhile turning these into an analyzer option?

ddcc added inline comments.Aug 25 2017, 4:26 PM
lib/StaticAnalyzer/Core/SValBuilder.cpp
364

To clarify, the current version of this patch does not modify the MaxComp parameter.

My understanding is that MaxComp is the upper bound for building a new NonLoc symbol from two children, based on the sum of the number of child symbols (complexity) across both children.

In contrast, the limit in VisitNonLocSymbolVal (@NoQ, correct me if I'm wrong), is the upper bound for recursively evaluating and inlining a NonLoc symbol, triggered from simplifySVal by evalBinOpNN. Note that these two latter functions indirectly call each other recursively (through evalBinOp), causing the previous runtime blowup. Furthermore, each call to computeComplexitywill re-iterate through all child symbols of the current symbol, but only the first complexity check at the root symbol is actually necessary, because by definition the complexity of a child symbol at each recursive call is monotonically decreasing.

I think it'd be useful to allow both to be configurable, but I don't see a direct relationship between the two.

zaks.anna added inline comments.Aug 27 2017, 1:03 AM
lib/StaticAnalyzer/Core/SValBuilder.cpp
364

To clarify, the current version of this patch does not modify the MaxComp parameter.

I know. Also, currently, it is only used in the unsupported taint tracking mode and only for tainted symbols. I would like a different smaller default for all expressions.

ddcc added inline comments.Aug 28 2017, 10:11 AM
lib/StaticAnalyzer/Core/SValBuilder.cpp
364

Ok. I can make both configurable as part of this patch, with a new default of 10 for VisitNonLocSymbolVal. But I've never used the taint tracking mode, so I don't know what would be a reasonable default for MaxComp.

zaks.anna edited edge metadata.Aug 28 2017, 11:37 AM

But I've never used the taint tracking mode, so I don't know what would be a reasonable default for MaxComp.

that one is very experimental anyway. I'd just keep the functional changes to tain out of this patch and use the current default that taint uses.

ddcc updated this revision to Diff 113505.Aug 31 2017, 9:31 PM

Rebase, make complexity limits configurable

ddcc marked 5 inline comments as done.Aug 31 2017, 9:35 PM

All testcases pass, except the issue with range_casts.c. The cause is still the range intersection discussed in https://reviews.llvm.org/D35450#810469.

ddcc edited the summary of this revision. (Show Details)Aug 31 2017, 9:58 PM

@ddcc Hi, are you still interested in landing the fixes associated with this patch? I can take a look as I'm currently reviewing https://reviews.llvm.org/D45517, but it is likely that the patch would need to be changed substantially before it could land.

ddcc added a comment.May 30 2018, 1:43 PM

@ddcc Hi, are you still interested in landing the fixes associated with this patch? I can take a look as I'm currently reviewing https://reviews.llvm.org/D45517, but it is likely that the patch would need to be changed substantially before it could land.

@george.karpenkov Yeah, I've got this and a couple of other patches still awaiting review. If it's easier, I can also split out the APSInt fix into a separate patch.

I can also split out the APSInt fix into a separate patch.

Yes please.

In general I really dislike arbitrary "complexity cutoffs", as they make further debugging hard and may lead to weird bugs: could you explain why is that required and can not be avoided?

george.karpenkov requested changes to this revision.May 30 2018, 4:59 PM

@ddcc so would be great if we could split this patch into smaller chunks.

This revision now requires changes to proceed.May 30 2018, 4:59 PM