- Add a new function to delete points from range sets.
- Introduce an internal generic interface for range set intersections.
- Remove unnecessary bits from a couple of solver functions.
- Add in-code sections.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
558–567 | I'm a bit confused, why aren't these invocations of getRangeForInvertedSub() and getRangeForComparisonSymbol() implemented simply as steps in Visit()? What's so different about them, other then our desire not to go into infinite recursion (which should probably be addressed via memoization)? |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
558–567 | I even had a patch that moves these two functions inside of the Visit, but then I reverted that change. So what does that mean in here: that we will need to make a performance impacting [probably nothing serious, but still a functional] change - to intersect the range assigned to the symbol and whatever we can infer from its subtrees. It would be good to have when we support binary operators +/-/==/<=/etc. because it can help narrowing it down. As of now, it will give no information, only possible overhead of traversing a tree. For this reason, I suggest to move those separately and AFTER we support more operators, because that will also have a couple of good motivation examples. |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
467–470 | Why do we have these representations in the first place? It sounds like you're treating null pointers / empty optionals equivalently to full ranges (i.e., intersecting with them has no effect). How hard would it be to simply construct an actual full range in all the places in which we currently have empty optionals? |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
373 | Sorry if my question is dumb, but I do not really have a mental model at this point about where do we actually handle types and overflows. Will this method work when we delete the last or the first element of the full range of a type? I think having unit tests would be a great way to make this clear. I always felt that the solver is actually something that should be really easy to test separately and those tests would also help a lot to understand how the solver is actually working. | |
467–470 | +1 I also find this confusing. Should I interpret a None as a full or empty range? Does this depend on the context or a general rule? Is there any reason we need to handle nullable pointers or could we actually make call sites more uniform to get rid of some of the complexity here? |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
467–470 |
It is not actually true. Empty optionals is the information that "this range information is not available for this symbol". It is true that intersecting with them has no effect, but they are semantically different in other aspects. Currently solver RELIES on the information that whether the range is available or not (see my previous comment), and we simply couldn't get rid of this behavior as part of NFC.
It is not hard architecturally, but it WILL make the change functional and possibly impact the performance.
Neither of those. That is the problem right now, that we rely on different sources of information for the ranges and behave differently depending on whether one piece of information is available or not.
Semantics are always the same - this info is not available.
State->get<XYZ>(abc) returns a nullable pointer meaning optional object, it is hard to avoid it especially because we currently behave very differently when this information is not available.
Yes we can, but later on. See the explanations above and in my other comment. |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
373 |
No problem at all, I also had this question when I looked at that.
APInt class basically does it all for us in terms of overflows, number of bits, and signedness.
I am onboard! I will add a framework for testing the solver in later patches. |
Hi @vsavchenko , sorry for the late review.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
367–378 | Useful function. But I'd better rename it to subtract as we are working with sets (as a mathimatical collection). We should have a such one for the Ranges not only for Points. | |
734 | As for me, I'd call this like getRangeForNegatedSymSymExpr, since you do Negate operation inside. | |
841–844 | Don't you think this is too complicated for such a simple getter? |
Hi @ASDenysPetrov no problem at all! Thanks for taking your time and checking it.
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
367–378 | I agree that we should have a full set of functions. I don't agree however, that this function is a subtract. Subtract is an operation on two sets and here we have a set and a point. One might argue that a point is just a very simple set, that's true, but real subtract would be more complex in its implementation. Naming it delete, on the other hand, I was coming from a notion of deleting points or neighbourhoods (https://en.wikipedia.org/wiki/Neighbourhood_(mathematics)#Deleted_neighbourhood). | |
734 | I'm not super happy about my name either, but I feel like it describes it better than the previous name and your version. That function doesn't work for any SymSymExpr and it doesn't simply negate whatever we gave it. It works specifically for symbolic subtractions and this is the information I want to be reflected in the name. | |
841–844 | It is more complex than a false range but there is a reason for it. First of all, RangeSet can't have ranges where the end is greater than its start. Only Intersect can handle such ranges correctly. Another thing is that ranges like that mean [MIN, --Zero], [++Zero, MAX] and without a type we can't really say what MIN and MAX are, so such constructor for RangeSet simply cannot exist. Another point is that we do need to have [MIN, -1], [+1, MAX] as opposed to [-1, -1], [+1, +1] because of C language (it doesn't have boolean type), and because of the cases like a - b where we know that a != b. I hope that answers the question. |
@vsavchenko
Thank you.
Despite of all of my nits, LGTM!
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
367–378 |
That's actually what I mean :) and for this particular case you may leave the implementation as is. And for me it still does what subtract does. But I'm OK, I don't insist. | |
734 | Oh, I just assumed ...Sub at the end as a subexpression but you mean subtraction. What I'm trying to say is that we can rename it like getRangeFor...the expression which this function can handle. E.g. getRangeForNegatedSubtractionSymSymExpr. My point is in a speaking name. I think invertion is not something appropriate in terms of applying minus operator. I think invertion of zero should be something opposite but not a zero. Because when you would like to implement the function which turns [A, B] into [MIN, A)U(B, MAX], what would be the name of it? I think this is an invertion. But this is not a big deal, it's just my thoughts. | |
841–844 | I just want this function has low complexity and be more lightweight as getFalseRange. And if we have any chance to simplify it (and you have all the data to get MIN and MAX), it'd be cool. |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
734 | My thought process here was that we are trying to get range for A - B and there is also information on B - A, so we can get something for A - B based on that. So, it doesn't really matter what it does under the hood with ranges, it matters what its semantics are. Here I called B - A an inverted subtraction. | |
841–844 | infer(QualType T) does just that ☺️ So it is a pretty low complexity. |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
467–470 |
Mmm, yeah, i guess the necessity to construct a full range and then do the possibly O(N) intersection when it'll do nothing most of the time is indeed daunting. Maybe we should implement full ranges as a special case in the RangeSet class? Like, don't actually construct the single range, just remember the type; special-case all operations on such sets to perform trivially in O(1), etc.? | |
734 |
https://en.wikipedia.org/wiki/Complement_(set_theory) "Negated subtraction" is definitely my favorite so far. "Mirrored" might be a good layman term as well. |
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp | ||
---|---|---|
734 |
Thanks for the links, @NoQ. That's much definite now. |
Sorry if my question is dumb, but I do not really have a mental model at this point about where do we actually handle types and overflows. Will this method work when we delete the last or the first element of the full range of a type?
I think having unit tests would be a great way to make this clear. I always felt that the solver is actually something that should be really easy to test separately and those tests would also help a lot to understand how the solver is actually working.