Page MenuHomePhabricator

[analyzer] Implemented RangeSet::Factory::castTo function to perform promotions, truncations and conversions.
Needs ReviewPublic

Authored by ASDenysPetrov on May 25 2021, 9:03 AM.

Details

Summary

Handle casts for ranges working similarly to APSIntType::apply function but for the whole range set. Support promotions, truncations and conversions.
Example:
Promotion: char [0, 42] -> short [0, 42] -> int [0, 42] -> llong [0, 42]
Truncation: llong [4295033088, 4295033130] -> int [65792, 65834] -> short [256, 298] -> char [0, 42]
Conversion: char [-42, 42] -> uint [0, 42]U[4294967254, 4294967295] -> short[-42, 42]

Diff Detail

Event Timeline

ASDenysPetrov created this revision.May 25 2021, 9:03 AM
ASDenysPetrov requested review of this revision.May 25 2021, 9:03 AM

OK, about the quadratic algorithm. I can think of one solution that is actually linear asymptotically, but because of the constant factor can be worse in practice (we usually have extremely small range sets). So, while I don't like it, it's probably fine.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h
245–246

That is a bit of a red flag because instantly my intuition tells me that there should be a linear option.

248

If we talk about types for ranges, the interface we have is incomplete.
We should introduce something like getType for range sets, so that the user can have a more holistic picture about types and ranges.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
630

We have LHS (left-hand side) and RHS (right-hand side) as a common convention for naming operands of binary operators.
The fact that you later on use it as an RHS for unite, doesn't make it a good name.

635

That's incorrect. uint64_t holds exactly this many values. If you count every number from 0 to UINT64_MAX, you get UINT64_MAX + 1, which is the number you are talking about.
Maybe you want to talk about the maximum stored value?

639–668

I think we should have a shortcut for widening casts, where we don't need to go and unite them. We just literally need to convert every range from one type to another and be done with it.

clang/unittests/StaticAnalyzer/RangeSetTest.cpp
136–148

I don't really understand what's going on here.

647–648

Yes, please.

696

Test cases don't belong in RangeSetTest.

Each piece of code like this should be a separate test case.

@vsavchenko Repled in inlines. Preparing an update.

clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h
245–246

Unfortunately, castTo is N^2 because technically we can call unite(which is N) inside it N times.

248

+1. Is there any place where we can put tasks as tech debt or just keep noting with FIXME/TODO's?

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
630

What about Tmp?

635

Maybe you want to talk about the maximum stored value?

Yes. I should rephrase the comment.
E.g.: 128 is an amount of all possible values of char and we can't keep it inside char.

639–668

we don't need to go and unite them.

Um, yes. 'unite' only needed for truncated range sets which have more then one range. I'll try to refactor this.
That will help make the algorithm linear for most cases.

clang/unittests/StaticAnalyzer/RangeSetTest.cpp
136–148

This is a generalized version of from function. But this one you can use independently from which type the current test case uses. You can set a custom type to get a range set for a given type. This allows to get range sets based on different types (e.g. char and int) in a single test case.
Previously from produced range sets only for the type specified by a test case.

ASDenysPetrov added inline comments.Jun 17 2021, 5:44 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h
245–246

Corrected.
Unfortunately, castTo is N^2 because technically we can call unite(which is N+M) inside it N times.

vsavchenko added inline comments.Jun 17 2021, 5:55 AM
clang/include/clang/StaticAnalyzer/Core/PathSensitive/RangedConstraintManager.h
245–246

Unfortunately, castTo is N^2 because technically we can call unite(which is N) inside it N times.

I know why you put N^2, and what I meant was that there is probably a better algorithm than that.

And as I said there is a linear algorithm:
If you have a smaller type that is, let's say, 8 times smaller than the larger type. We can split the original range set into 8 range sets and unite them.
This difference between type sizes is limited (because we don't support integer types of arbitrary bit width) aka is a constant. However, IRL the size of range sets is very small, and this constant factor can make it much slower than your N^2 algorithm.

But it doesn't mean that there is no better algorithm and that you HAVE TO call unite in a loop N times.

vsavchenko added inline comments.Jun 17 2021, 6:06 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
630

Maybe something like Dummy or Intermediate.

clang/unittests/StaticAnalyzer/RangeSetTest.cpp
136–148

OK then, if you make this from templated, make all of them templated and you won't need to duplicate the logic with sizeof * 8 everywhere.

Made changes according to comments. Optimized castTo function for each cast case. Simplified unit test.

vsavchenko added inline comments.Jun 23 2021, 7:47 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
299–309

We should add assert(!isEmpty()); in all three

650–681

Maybe we can extract this into something like truncateTo and the next if to convertTo private methods?

clang/unittests/StaticAnalyzer/RangeSetTest.cpp
117–118

Great, that works too!

120

Default to BaseType?

126

Default to BaseType?

130–131

Default to BaseType?

131

Unused parameter?

722–800

If loop and promotion share the same test case, why should we split them into two groups?

I'll update tommorow.

clang/unittests/StaticAnalyzer/RangeSetTest.cpp
120

It's implicitly deduced. I think it is not necessary.

131

Oh... :-)

722–800

Yes, the tests are identical but they use different testing::Types in the suites. Also I don't want to mix them to make problem localizing easier if occurs. But still I haven't strong preferences on that.

Added assertions. Added two helper functions for castTo method. Moved some distinct code to them. Fixed clang-tidy complaints.