This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][solver] Use all sources of constraints
ClosedPublic

Authored by vsavchenko on Jul 5 2021, 9:37 AM.

Details

Summary

Prior to this patch, we always gave priority to constraints that we
actually know about symbols in question. However, these can get
outdated and we can get better results if we look at all possible
sources of knowledge, including sub-expressions.

Diff Detail

Event Timeline

vsavchenko created this revision.Jul 5 2021, 9:37 AM
vsavchenko requested review of this revision.Jul 5 2021, 9:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 5 2021, 9:37 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

I compared issues produced by this patch to the issues produced before that on all projects from clang/utils/analyzer/projects, and didn't find any difference.

Performance measurements also show the we are within the same margins.

NoQ accepted this revision.Jul 5 2021, 9:43 PM

such passes
@
much LLVM

Performance measurements also show the we are within the same margins.

Great! I'd expect massive constraint solver improvements to actually make performance better because they cut infeasible paths. This one's probably not that massive but it's still amazing.

This revision is now accepted and ready to land.Jul 5 2021, 9:43 PM
This revision was automatically updated to reflect the committed changes.
martong added inline comments.Jul 6 2021, 1:40 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
908

Alright. So, this is correct because Visit boils down finally to either infer(Sym->getType) or to VisitBinaryOperator. And both of them do a correct over-approximation of the ranges. Please confirm.

First, I was a bit concerned b/c it is not immediate and not documented here. And it is easy to think by the first look that this might be faulty if we take the approximation of one operand of a binop that might not be true for the whole binop expression. Again, that is not the case because we approximate only in case of such ops where we can do a correct over-approximation (i.e. |, & and %).

My point is, I'd like to see more explanatory comments here.

vsavchenko added inline comments.Jul 6 2021, 2:46 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
908

I'm sorry, but I don't really understand your point here.

Everything that this solver provides is conservative ranges, from whatever source it comes. If you intersect two conservative ranges, you get a conservative range.
It doesn't matter what we do in Visit as long as it is correct. If Visit is incorrect then the previous version of this code that gave preference to some sources over the other ones was also incorrect.

martong added inline comments.Jul 6 2021, 3:03 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
908

Thanks for your reply. So, with other words, I didn't see why it is immediate that a range for a sub-expression is a good approximation for the whole expression. Maybe it's just me, but that's not obvious until one checks that what exactly happens in Visit.

vsavchenko added inline comments.Jul 6 2021, 3:10 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
908

Oh, I mean, it's not correct. Symbolic expressions are N-ary operators, and if we know constraints for at least some of these N operands, we can provide a conservative range for the whole symbol using some knowledge of the operator. It doesn't say anywhere that we use a range for a sub-expression as an approximation for the whole range.

Actually I want to move some of these other sources inside of Visit as well because they trigger only to very specific kinds of symbolic expressions (e.g. binary minus, equality/disequality, comparisons).