This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][solver] Fix assertion on (NonLoc, Op, Loc) expressions
ClosedPublic

Authored by steakhal on Dec 6 2021, 5:34 AM.

Details

Summary

Previously, the SValBuilder could not encounter expressions of the
following kind:

NonLoc OP Loc
Loc OP NonLoc

Where the Op is other than BO_Add.

As of now, due to the smarter simplification and the fixedpoint
iteration, it turns out we can.
It can happen if the Loc was perfectly constrained to a concrete
value (nonloc::ConcreteInt), thus the simplifier can do
constant-folding in these cases as well.

Unfortunately, this could cause assertion failures, since we assumed
that the operator must be BO_Add, causing a crash.


In the patch, I decided to preserve the original behavior (aka. swap the
operands (if the operator is commutative), but if the RHS was a
loc::ConcreteInt call evalBinOpNN().

I think this interpretation of the arithmetic expression is closer to
reality.

I also tried naively introducing a separate handler for
loc::ConcreteInt RHS, before doing handling the more generic Loc RHS
case. However, it broke the zoo1backwards() test in the nullptr.cpp
file. This highlighted for me the importance to preserve the original
behavior for the BO_Add at least.

PS: Sorry for introducing yet another branch into this evalBinOpXX
madness. I've got a couple of ideas about refactoring these.
We'll see if I can get to it.

The test file demonstrates the issue and makes sure nothing similar
happens. The no-crash annotated lines show, where we crashed before
applying this patch.

Diff Detail

Event Timeline

steakhal created this revision.Dec 6 2021, 5:34 AM
steakhal requested review of this revision.Dec 6 2021, 5:34 AM

For the record, all touched lines are covered by tests.

martong added inline comments.Dec 6 2021, 6:15 AM
clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
475

const auto ?

clang/test/Analysis/symbol-simplification-nonloc-loc.cpp
72

Did we have an existing test in nullptr.cpp in the same form? If yes, then why that did not crash before this patch?

martong added inline comments.Dec 6 2021, 6:20 AM
clang/test/Analysis/symbol-simplification-nonloc-loc.cpp
72

I see now, it did not crash because the op is +. But then the warnings should have been changed when simplification came in, or what am I missing here?

steakhal updated this revision to Diff 392068.Dec 6 2021, 8:03 AM
steakhal marked 3 inline comments as done.
steakhal edited the summary of this revision. (Show Details)
  • updated summary
  • added const here and there
clang/test/Analysis/symbol-simplification-nonloc-loc.cpp
72

I decided to preserve the current behavior in case of commutative operators, aka. swap the parameters and do the evalBinOpLN().
Originally, I tried to handle NonLoc OP loc::ConcreteInt by delegating to evalBinOpNN(), but that triggered an assertion within the evalLoad() expecting a Loc in the zoo1backwards test-case.

When I realized this, I thought I have two ways of fixing this:

  1. Hack the evalLoad() to expect nonloc::ConcreteInts besides Locs, but it doesn't feel like the right thing to do.
  2. Preserve the previous behavior whenever we can, and do this as last resort.

I chose the latter. I should have mentioned this in the commit message, my bad. I updated the summary accordingly.

martong accepted this revision.Dec 6 2021, 8:20 AM

LGTM! Thanks!

This revision is now accepted and ready to land.Dec 6 2021, 8:20 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 6 2021, 9:39 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
NoQ added a comment.Dec 6 2021, 4:25 PM

It can happen if the Loc was perfectly constrained to a concrete
value (nonloc::ConcreteInt)

This shouldn't happen. It should be loc::ConcreteInt which is, well, a Loc.

It can happen if the Loc was perfectly constrained to a concrete
value (nonloc::ConcreteInt)

This shouldn't happen. It should be loc::ConcreteInt which is, well, a Loc.

I see your point and thank you for taking a look. Perhaps there is a problem in SimpleSValBuilder::simplifySVal which is responsible to use the constraint and build up the ConcreteInt. Especially, VisitSymbolData and getConst is worth to continue with the investigation, we might have the wrong type associated to the symbol:

if (Const)
  return Loc::isLocType(Sym->getType()) ? (SVal)SVB.makeIntLocVal(*Const)
                                        : (SVal)SVB.makeIntVal(*Const);

We are going to further investigate.

ASDenysPetrov added inline comments.Dec 7 2021, 4:00 AM
clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
463

IMO it should be in a family of BinaryOperator::is.. methods.

steakhal marked an inline comment as done.Dec 8 2021, 8:07 AM

Sorry for my late response, I have a bunch of other tasks to do.


It can happen if the Loc was perfectly constrained to a concrete
value (nonloc::ConcreteInt)

This shouldn't happen. It should be loc::ConcreteInt which is, well, a Loc.

Well, yes.

However, unless we implement evalBinOpLN() for all possible arithmetic operators and do the same for the evalBinOpNL() (which currently doesn't exist), we cannot calculate this operation accurately.
In the arithmetic operation, we have two integral operands, which is basically modeled by the domain of NonLocs. However, the address of a pointer is modeled by Locs. There is the reinterpret-cast operation which is capable of crossing these two domains, producing an expression that can participate in arithmetic operations, but on the abstract domain side, we still stick to Locs, which breaks the assumption that should only handle the BO_Add for mixed domain symbolic computations.

By combining this with the improved simplifier, which substitutes concretized subexpressions we can get to a situation where we have to evaluate other operators besides BO_Add.
I had no time to implement evalBinOpLN() for all possible arithmetic operators nor implement evalBinOpNL() in a similar way. The current implementation in this patch is like a hotfix, doing the best possible thing given that the mentioned functions/behavior is not implemented yet.
IMO since it did not crash too frequently, it is probably not that critical to have this slight inaccuracy in representation - given that zero is zero in both (NonLocs, Locs) domains xD.
Alternatively to this, we could return Unknown, but I feel like that approach is even worse than this.

clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
463

It isn't as easy as it might seem at first glance. I've considered doing it that way.
In our context, we don't really deal with floating-point numbers, but in the generic implementation, we should have the type in addition to the OpKind. We don't need that complexity, so I sticked to inlining them here.

It seems like your test file passes even before your patch. I've just checked it. My last pull from the baseline was on Nov 15.

NoQ added a comment.EditedDec 8 2021, 5:05 PM

There is the reinterpret-cast operation which is capable of crossing these two domains, producing an expression that can participate in arithmetic operations, but on the abstract domain side, we still stick to Locs

Such cast should turn the loc::ConcreteInt into a nonloc::ConcreteInt with the same integral value.

The distinction between Loc and NonLoc is very important. It's at the core of our type correctness. We should fight tooth and nail to preserve it because assertions about these things (such as the one removed here) help us discover a lot of bugs in other places (such as genuinely misplacing a value).

NoQ added a comment.Dec 8 2021, 5:12 PM

Like, that's the whole reason why nonloc::LocAsInteger exists: so that we could cast a pointer to an integer and actually have a way to represent the resulting value as NonLoc.

I'm confident that there's a way to get it right, simply because the program under analysis is type-correct. If it's the simplifier, let's fix the simplifier. If the original value is not verbose enough, let's fix the original value. But I really think we should keep this assertion working 24/7. I'm sure when you find the root cause it'll all make sense to you.

ASDenysPetrov added a comment.EditedDec 9 2021, 1:11 AM

@NoQ wrote:

I'm confident that there's a way to get it right, simply because the program under analysis is type-correct. If it's the simplifier, let's fix the simplifier.

I agree with your main thought. But I also believe there definitely are thing to improve everywhere. And this one could wait until we find the root cause and correct it somewhere before.

@steakhal
Please provide a case which asserts before your patch.

ASDenysPetrov added inline comments.Dec 9 2021, 1:25 AM
clang/lib/StaticAnalyzer/Core/SValBuilder.cpp
463

I see your concerns about floats but the operators don't stop to be commutative themselves. Yes, it may not work with some types but it will still be commutative. Leave the context for a user outside the function. It may bother you only a few operators from the set and you just check whether they belong to the set or not.

steakhal marked an inline comment as done.Dec 13 2021, 2:59 AM

@steakhal
Please provide a case which asserts before your patch.

I don't get this one. I've provided a bunch of tests, even annotated with no-crash comments where we crashed prior to this change.

There is the reinterpret-cast operation which is capable of crossing these two domains, producing an expression that can participate in arithmetic operations, but on the abstract domain side, we still stick to Locs

Such cast should turn the loc::ConcreteInt into a nonloc::ConcreteInt with the same integral value.

The distinction between Loc and NonLoc is very important. It's at the core of our type correctness. We should fight tooth and nail to preserve it because assertions about these things (such as the one removed here) help us discover a lot of bugs in other places (such as genuinely misplacing a value).

Yes, it makes sense to look for bugs in the cast handling. Ideally, we should not transform loc::ConcreteInt to nonloc::ConcreteInt by hand. The reinterpret-cast should have done this transformation for us.
If that is fixed, I'm fine reverting this one. Sorry for rushing.

@steakhal
I don't get this one. I've provided a bunch of tests, even annotated with no-crash comments where we crashed prior to this change.

I wasn't able to catch any crashes with your tests file (symbol-simplification-nonloc-loc.cpp) on the baseline before your patch (D115149#3180005). So I ask you to provide the concrete example you've caught which promted you to do this patch.

@steakhal
I don't get this one. I've provided a bunch of tests, even annotated with no-crash comments where we crashed prior to this change.

I wasn't able to catch any crashes with your tests file (symbol-simplification-nonloc-loc.cpp) on the baseline before your patch (D115149#3180005). So I ask you to provide the concrete example you've caught which promted you to do this patch.

Denis, you can see in the Revision Contents that Diff 3 has the baseline commit 63a6348. When I check out 63a6348 then the newly added test file triggers the assertion about BO_Add.

@martong wrote:

Denis, you can see in the Revision Contents that Diff 3 has the baseline commit 63a6348. When I check out 63a6348 then the newly added test file triggers the assertion about BO_Add.

Yes is see it:


I don't get this feature. Is it a parent or something? Please explain how to interpret this table. And how can I use it myself and when?

@martong wrote:

Denis, you can see in the Revision Contents that Diff 3 has the baseline commit 63a6348. When I check out 63a6348 then the newly added test file triggers the assertion about BO_Add.

Yes is see it:


I don't get this feature. Is it a parent or something? Please explain how to interpret this table. And how can I use it myself and when?

Each time you update the patch a new row appears in this table. The Base is the parent of the Diff connected to the row. The Base is auto-filled if you use arcanist; it will not be set if you just simply upload the patch via https. Arcanist deduces the base from your local git repository by taking HEAD~. It might happen, however, that your parent commit is local-only (i.e. not available in the remote copy of github/llvm/llvm-project). In this case you see the hash id of the commit, but that is not a clickable URL.

If you'd like to checkout the given patch and the Base is available in the remote, you can simply do

arc patch D115149

inside the top-most directory of the llvm-project of your local copy.
This will create a branch for the base commit and will apply the latest diff on top of that.

If the base is not available or you want more control then you can export the patch:

cd llvm-project
arc export --revision D85528 --git > /tmp/p
patch -p1 < /tmp/p

TLDR, it is recommended to use the arcanist.

@martong
Thanks for clarification.

TLDR, it is recommended to use the arcanist.

I'm not able to use arcanist. It doesn't work on Windows (at least I've tryed several ways to set up it).

BTW. I found a revision from which the test(symbol-simplification-nonloc-loc.cpp) file starts to assert. It's your one D113754. You can checkout a revision before and test the file. I think we should investigate it deeper to understand the real cause.

@martong
Thanks for clarification.

TLDR, it is recommended to use the arcanist.

I'm not able to use arcanist. It doesn't work on Windows (at least I've tryed several ways to set up it).

BTW. I found a revision from which the test(symbol-simplification-nonloc-loc.cpp) file starts to assert. It's your one D113754. You can checkout a revision before and test the file. I think we should investigate it deeper to understand the real cause.

Yes, as of D113754, we handle IntSymExprs in the SValBuilder, and the half of the test cases added by Balazs are such (nonloc_OP_loc). These are the cases that triggers the assertion.

Like, that's the whole reason why nonloc::LocAsInteger exists: so that we could cast a pointer to an integer and actually have a way to represent the resulting value as NonLoc.
I'm confident that there's a way to get it right, simply because the program under analysis is type-correct. If it's the simplifier, let's fix the simplifier. If the original value is not verbose enough, let's fix the original value. But I really think we should keep this assertion working 24/7. I'm sure when you find the root cause it'll all make sense to you.

OK. Actually, we do create an SVal for the pointer-to-integer cast that is indeed an LocAsInteger. However, we loose this information when we build up the symbol tree in SValBuilder::makeSymExprValNN. I think, the only way to preserve this information is to create a SymbolCast node in the symbol tree. So, later the SValBuilder can build-up the correct SVal that represents the cast properly even when the symbol is constrained. I have created a new child patch that reverts this patch and which creates the SymbolCast.