Generate more IntSymExpr constraints, perform SVal simplification for IntSymExpr and SymbolCast constraints, and create fully symbolic SymExprs. Also fix crashes and add support for marking tests that should not be run with Z3ConstraintManager.
Compared with D28953, this revision fixes the test failure in PR3991.m under RangeConstraintManager, and a few other failures with Z3ConstraintManager. However, there is one remaining test failure in range_casts.c that I'm not sure how to resolve. For reference, here is a simplified snippet from that testcase:
void f15(long foo) {
  unsigned index = -1;
  if (index < foo)
    index = foo;
  unsigned int tmp = index + 1;
  if (tmp == 0)
    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
  else
    clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
}
In debug mode, an assertion about the system being over-constrained is thrown from ConstraintManager.h. This happens because the new Simplifier::VisitSymbolCast function will attempt to evaluate the cast (unsigned int) (reg_$0<long foo>), which is suppressed by the call to haveSameType() in SimpleSValBuilder::evalCastFromNonLoc (D28955 fixes this, but only for Z3ConstraintManager), generating just the symbol reg_$0<long foo>.

Subsequently, the analyzer will attempt to evaluate the expression ((reg_$0<long foo>) + 1U) == 0U with the range reg_$0<long foo> : { [4294967296, 9223372036854775807] }, i.e. [UINT_MAX + 1, LONG_MAX]. However, in the case where the assumption is true, RangeConstraintManager takes the intersection of the previous range with [UINT_MAX, UINT_MAX] and produces the empty set; likewise, where the assumption is false, the intersection with [UINT_MAX - 1, 0] again produces the empty set. As a result, both program states are NULL, triggering the assertion.
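For concreteness, here is a minimal standalone sketch of the two intersections described above. This is plain C++, not analyzer code: overlaps is a made-up helper, and the false branch is modeled as the unwrapped complement [0, UINT_MAX - 1] of the single point UINT_MAX.

#include <algorithm>
#include <cstdint>
#include <iostream>

// Closed-interval overlap test over unsigned 64-bit bounds.
static bool overlaps(uint64_t aLo, uint64_t aHi, uint64_t bLo, uint64_t bHi) {
  return std::max(aLo, bLo) <= std::min(aHi, bHi);
}

int main() {
  const uint64_t Lo = 4294967296ULL;          // UINT_MAX + 1
  const uint64_t Hi = 9223372036854775807ULL; // LONG_MAX
  const uint64_t UMax = 4294967295ULL;        // UINT_MAX

  // "tmp == 0" branch: the symbol would have to equal UINT_MAX.
  std::cout << overlaps(Lo, Hi, UMax, UMax) << '\n';  // 0 -> empty set
  // "tmp != 0" branch, modeled as the complement [0, UINT_MAX - 1].
  std::cout << overlaps(Lo, Hi, 0, UMax - 1) << '\n'; // 0 -> empty set
}

Both branches intersect to the empty set against [UINT_MAX + 1, LONG_MAX], which is why neither program state survives.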
I'm now somewhat inclined to drop the addition of Simplifier::VisitSymbolCast() and the associated testsuite changes, because ignoring type casts is clearly incorrect and will introduce both false negatives and false positives.
Is diff 1 the original diff from D28953? It would have been OK to reopen it, but the new revision is also fine.
No, diff 1 is already different; it contains most of the bugfixes. I couldn't find any way to reopen the previous review, and arc diff wouldn't let me update a closed review.
Regarding 1-bit bools: did you notice D32328 and D35041? Do they incidentally help here?
I did see those changes, but I think they're a little different. The test failures I ran into were cases where the input is a 1-bit APSInt, and attempting to retrieve the type for that gives a null QualType.
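A minimal sketch of that failure mode, assuming the type is recovered through the width-based ASTContext lookup (the sketch function and its call site are hypothetical; the actual analyzer path may differ):

#include "clang/AST/ASTContext.h"
#include "llvm/ADT/APSInt.h"

// A 1-bit APSInt has no matching builtin integer type, so the width-based
// lookup yields a null QualType; anything that uses it unchecked crashes.
void sketch(clang::ASTContext &Ctx) {
  llvm::APSInt V(/*BitWidth=*/1, /*isUnsigned=*/true);
  clang::QualType T = Ctx.getIntTypeForBitwidth(V.getBitWidth(), V.isSigned());
  if (T.isNull()) {
    // The fix needs to special-case 1-bit values (e.g. map them to bool)
    // rather than assume a valid QualType comes back.
  }
}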
As an update, after fixing the typo and updating the tests, the assertion in range_casts.c is no longer triggered and everything seems fine now.
lib/StaticAnalyzer/Core/SValBuilder.cpp:370

As a follow-up to the previous version of this patch, I do not think we should set the default complexity limit to 10000. What is the relation between this limit and the limit in VisitNonLocSymbolVal? If they are related, would it be worthwhile turning these into an analyzer option?
lib/StaticAnalyzer/Core/SValBuilder.cpp:370

To clarify, the current version of this patch does not modify the MaxComp parameter. My understanding is that MaxComp is the upper bound for building a new NonLoc symbol from two children, based on the sum of the number of child symbols (complexity) across both children. In contrast, the limit in VisitNonLocSymbolVal (@NoQ, correct me if I'm wrong) is the upper bound for recursively evaluating and inlining a NonLoc symbol, triggered from simplifySVal by evalBinOpNN. Note that these two functions indirectly call each other recursively (through evalBinOp), causing the previous runtime blowup. Furthermore, each call to computeComplexity will re-iterate through all child symbols of the current symbol, but only the first complexity check at the root symbol is actually necessary, because by definition the complexity of a child symbol decreases monotonically at each recursive call. I think it would be useful to make both configurable, but I don't see a direct relationship between the two.
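To illustrate the last point, here is a hedged sketch with made-up names (not the actual SValBuilder code) of why a single complexity check at the root is sufficient:

#include <vector>

// Illustrative stand-in for a symbolic expression tree.
struct Expr {
  std::vector<const Expr *> Children;
};

// Counts every node in the subtree, like computeComplexity(); calling it
// again at every level of a recursion re-walks the same nodes repeatedly.
unsigned complexity(const Expr *E) {
  unsigned C = 1;
  for (const Expr *Ch : E->Children)
    C += complexity(Ch);
  return C;
}

const Expr *simplifyRec(const Expr *E) {
  // ... rewrite children here; no further complexity checks are needed,
  // since a child's complexity is strictly smaller than its parent's.
  return E;
}

const Expr *simplify(const Expr *Root, unsigned MaxComp) {
  if (complexity(Root) > MaxComp) // one check at the root suffices
    return Root;                  // too complex: bail out unsimplified
  return simplifyRec(Root);
}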
lib/StaticAnalyzer/Core/SValBuilder.cpp:370

I know. Also, currently, it is only used in the unsupported taint tracking mode and only for tainted symbols. I would like a different, smaller default for all expressions.
lib/StaticAnalyzer/Core/SValBuilder.cpp:370

Ok. I can make both configurable as part of this patch, with a new default of 10 for VisitNonLocSymbolVal. But I've never used the taint tracking mode, so I don't know what would be a reasonable default for MaxComp.
> But I've never used the taint tracking mode, so I don't know what would be a reasonable default for MaxComp.

That one is very experimental anyway. I'd just keep the functional changes to taint out of this patch and use the current default that taint uses.
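If both limits do become configurable, the wiring could look roughly like the sketch below. The option names and helper functions are hypothetical; getOptionAsInteger is the AnalyzerOptions accessor available at the time of this review, and the defaults mirror the values discussed above.

#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"

// Hypothetical option names; not part of the actual patch.
int getMaxCompLimit(clang::AnalyzerOptions &Opts) {
  return Opts.getOptionAsInteger("max-symexpr-complexity", 10000);
}

int getMaxInlineLimit(clang::AnalyzerOptions &Opts) {
  return Opts.getOptionAsInteger("max-nonloc-inline-complexity", 10);
}

Users could then tune them with, e.g., -analyzer-config max-symexpr-complexity=25 on the analyzer command line.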
All testcases pass except for the known issue in range_casts.c; the cause is still the range intersection discussed in https://reviews.llvm.org/D35450#810469.
@ddcc Hi, are you still interested in landing the fixes associated with this patch? I can take a look as I'm currently reviewing https://reviews.llvm.org/D45517, but it is likely that the patch would need to be changed substantially before it could land.
@george.karpenkov Yeah, I've got this and a couple of other patches still awaiting review. If it's easier, I can also split out the APSInt fix into a separate patch.
> I can also split out the APSInt fix into a separate patch.

Yes, please.
In general I really dislike arbitrary "complexity cutoffs", as they make further debugging hard and may lead to weird bugs: could you explain why this is required and cannot be avoided?
There seems to be a typo somewhere around the last comparison.