This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][solver] Introduce reasoning for not equal to operator
ClosedPublic

Authored by manas on Jul 15 2021, 2:29 PM.

Details

Summary

Prior to this, the solver was only able to verify whether two symbols
are equal/unequal, only when constants were involved. This patch allows
the solver to work over ranges as well.

Diff Detail

Event Timeline

manas created this revision.Jul 15 2021, 2:29 PM
manas requested review of this revision.Jul 15 2021, 2:29 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2021, 2:29 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
vsavchenko added inline comments.Jul 16 2021, 1:53 AM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1252–1254

I think it should be a specialization for another VisitBinaryOperator.
In the switch, you can see that we give range sets for LHS and RHS, so how does it work?
There is a function in between (also VisitBinaryOperator) that creates simple ranges out of range sets and ask to visit binary operator for those. You can specialize it instead since we can simply check for empty intersection of range sets.

1270

Empty range set means "This situation is IMPOSSIBLE". Is that what you want here?

clang/test/Analysis/constant-folding.c
471–505

Did you try it in debugger, do we get inside of your function?

manas added inline comments.Jul 16 2021, 2:04 PM
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1252–1254

I went through the code and understood that part. I agree that this should be a specialized case for VisitBinaryOperator. So I understand it in this way:

  1. Creating a new VisitBinaryOperator(Range,Range,QualType) which can check for empty intersections.
  2. It will then call to appropriate functions (like VisittBinaryOperator<BO_NE> and others.)
1270

True! That's incorrect. We cannot find a good rangeset. Should I rather return the entire RangeSet inferred from T?

clang/test/Analysis/constant-folding.c
471–505

Yes, the function is being reached.

This is promising! Gentle ping @manas @vsavchenko

I haven't tried specializing that VisitBinaryOperator method which converts Ranges from RangeSets (as @vsavchenko mentioned). Should this case for NE stay here in the switch or move?

manas marked 2 inline comments as done.Sep 24 2021, 12:55 AM

I haven't tried specializing that VisitBinaryOperator method which converts Ranges from RangeSets (as @vsavchenko mentioned). Should this case for NE stay here in the switch or move?

Please try what @vsavchenko mentioned, seems like with RangeSet::getMinValue and getMaxValue we can achieve the same before coersing to simple Ranges.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
1252–1254

No. You need the following specialization for RangeSet:

template <>
RangeSet VisitBinaryOperator<BO_NE>(RangeSet LHS, RangeSet RHS, QualType T) {
  // your original code here...
  return infer(T);
}

Besides, I think you should use getMaxValue() and getMinValue() instead of To() and From().

1270

True! That's incorrect. We cannot find a good rangeset. Should I rather return the entire RangeSet inferred from T?

Yes, that would work! Because the entire RangeSet implies both the logical True (non-zero values) and the logica False values.

Adding @steakhal as reviewer because of other reviewers inactivity.

manas updated this revision to Diff 375682.Sep 28 2021, 1:26 PM

Move method to a specialized template for VisitBinaryOperator

manas added a comment.Sep 28 2021, 1:58 PM

The pre-merge checks fail due to the patch being unable to get applied. The troubleshooting suggest to update the patch via arc diff `git merge-base HEAD origin` --update D106102 and it would include all local changes into that patch. Shouldn't I avoid including all local changes, as some local changes are still WIP?

manas marked 3 inline comments as done.Sep 28 2021, 7:22 PM
martong accepted this revision.Sep 29 2021, 2:10 AM

LGTM! Thanks!

This revision is now accepted and ready to land.Sep 29 2021, 2:10 AM

The pre-merge checks fail due to the patch being unable to get applied. The troubleshooting suggest to update the patch via arc diff `git merge-base HEAD origin` --update D106102 and it would include all local changes into that patch. Shouldn't I avoid including all local changes, as some local changes are still WIP?

I guess HEAD should refer to the commit that represents this patch, in this case that is ce82443c69be. And I believe you local changes that are not committed should not affect arc diff at all. But, as a precaution, you could first stash your local changes before calling arc. Hope this helps.

This patch seems pretty straightforward, good job @manas, and for the folks giving review comments.
Aside from polishing the tests, I think it's good to go.

clang/test/Analysis/constant-folding.c
470
470

I would prefer u1, u2, s1, s2 respectively.
This way the name would signify the signess of the variable.

473

Isn't a != b enough? Why do you need the (..) != 0 part?

476–478

How is this different compared to the previous case? The difference I can see is that now we use different constants and the == operator in the outer expression. None of which should really change the behavior AFAICT.

481

You could have a comment that a: [20,50] b:[20,50].
It would be easier to comprehend than the chain of conjunctions.
Similarly, how at L464 does.

manas added a comment.Sep 29 2021, 2:21 PM

The pre-merge checks fail due to the patch being unable to get applied. The troubleshooting suggest to update the patch via arc diff `git merge-base HEAD origin` --update D106102 and it would include all local changes into that patch. Shouldn't I avoid including all local changes, as some local changes are still WIP?

I guess HEAD should refer to the commit that represents this patch, in this case that is ce82443c69be. And I believe you local changes that are not committed should not affect arc diff at all. But, as a precaution, you could first stash your local changes before calling arc. Hope this helps.

Thanks. That was helpful.

clang/test/Analysis/constant-folding.c
470

Makes sense. I have amended it.

473

Sometime ago, Valeriy and I were discussing about how solver can't reason about sub-expressions containing logical operators with ranges. Currently, only ==/!= operators can reason about them, as they are trivial. So, this case is for != operator used with sub-expressions.

476–478

I wanted to put a signed case. Although I believe, I should have replaced == with !=.

481

Sure. I have amended.

manas updated this revision to Diff 376035.Sep 29 2021, 2:21 PM

Apply mentioned edits in testcases

manas marked 3 inline comments as done.Sep 29 2021, 2:22 PM

Okay, I don't see any problems besides this typo.

BTW do you plan to implement other rules besides this in the future?
E.g. we currently miss this:
'x < y' is true if max(x) < min(y); and false if min(x) >= max(y)
'x > y', 'x <= y', etc. in a similar way.

clang/test/Analysis/constant-folding.c
478

Okay, I don't see any problems besides this typo.

Oops!

BTW do you plan to implement other rules besides this in the future?
E.g. we currently miss this:
'x < y' is true if max(x) < min(y); and false if min(x) >= max(y)
'x > y', 'x <= y', etc. in a similar way.

I think these cases need to be specialized as well. As, they aren't returning some complex RangeSet, so we can get away with a specialized template for BO_LT/BO_GT etc., instead of a class method (like BO_Rem has). I can work on this.

manas updated this revision to Diff 376363.Sep 30 2021, 2:12 PM

Fix typo

manas marked 3 inline comments as done.Sep 30 2021, 2:13 PM
steakhal accepted this revision.Sep 30 2021, 11:29 PM

Good work. Land it.

manas added a comment.Oct 1 2021, 5:35 AM

Good work. Land it.

I do not have landing rights.

I do not have landing rights.

Please add your name and email on whom behalf I should commit this patch. Mine is Balazs Benics<balazs.benics@sigmatechnology.se>

manas added a comment.Oct 1 2021, 7:36 AM

I do not have landing rights.

Please add your name and email on whom behalf I should commit this patch. Mine is Balazs Benics<balazs.benics@sigmatechnology.se>

It is Manas <manas18244@iiitd.ac.in>

Thanks!

It seems like it doesn't build with GCC 8.3.0 on a Debian system.
Could you investigate?

llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1029:13: error: explicit specialization in non-namespace scope 'class {anonymous}::SymbolicRangeInferrer'
   template <>
             ^
llvm-project/clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp:1030:77: error: template-id 'VisitBinaryOperator<BO_NE>' in declaration of primary template
   RangeSet VisitBinaryOperator<BO_NE>(RangeSet LHS, RangeSet RHS, QualType T) {
                                                                             ^

I think the issue is this: https://stackoverflow.com/questions/3052579/explicit-specialization-in-non-namespace-scope

manas added a comment.Oct 1 2021, 4:04 PM

I'll take a look at this over the weekend.

This revision was automatically updated to reflect the committed changes.

I've fixed the compilation error and pushed.

manas added a comment.Oct 22 2021, 4:11 AM

I have received a failed build on clang-ppc64be-linux (https://lab.llvm.org/buildbot/#/builders/52/builds/11704). Although, I haven't verified the commit. Is there something wrong with this commit? Is the function VisitBinaryOperator<BO_NE> reachable?

I have received a failed build on clang-ppc64be-linux (https://lab.llvm.org/buildbot/#/builders/52/builds/11704). Although, I haven't verified the commit. Is there something wrong with this commit? Is the function VisitBinaryOperator<BO_NE> reachable?

No worries, that seems to be a test suite python error that is unrelated:

Traceback (most recent call last):
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit.py", line 6, in <module>
    main()
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit/main.py", line 44, in main
    opts.indirectlyRunCheck)
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit/discovery.py", line 279, in find_tests_for_inputs
    local_config_cache, indirectlyRunCheck)[1])
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit/discovery.py", line 130, in getTests
    ts,path_in_suite = getTestSuite(path, litConfig, testSuiteCache)
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit/discovery.py", line 91, in getTestSuite
    ts, relative = search(item)
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/llvm/llvm/utils/lit/lit/discovery.py", line 75, in search
    cache[real_path] = res = search1(path)
  File "/home/buildbots/ppc64be-clang-test/clang-ppc64be/l
bjope added a subscriber: bjope.Oct 22 2021, 10:39 AM

I get failures after having merged this patch:

/llvm/build-all/bin/clang -Xanalyzer -analyzer-output=text -Xanalyzer -analyzer-checker='nullability,optin.portability.UnixAPI,security,valist' -Xanalyzer -analyzer-disable-checker='security.insecureAPI.strcpy' --analyze crash.c
clang: ../include/llvm/ADT/APSInt.h:148: bool llvm::APSInt::operator<(const llvm::APSInt &) const: Assertion `IsUnsigned == RHS.IsUnsigned && "Signedness mismatch!"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /llvm/build-all/bin/clang -Xanalyzer -analyzer-output=text -Xanalyzer -analyzer-checker=nullability,optin.portability.UnixAPI,security,valist -Xanalyzer -analyzer-disable-checker=security.insecureAPI.strcpy --analyze crash.c
1.      <eof> parser at end of file
2.      While analyzing stack: 
        #0 Calling func_59
3.      build-all/runtest/csmith/csmith-3790728623.c:1368:79: Error evaluating statement
4.      build-all/runtest/csmith/csmith-3790728623.c:1368:79: Error evaluating statement
 #0 0x0000000002cd99a3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/llvm/build-all/bin/clang+0x2cd99a3)
 #1 0x0000000002cd761e llvm::sys::RunSignalHandlers() (/llvm/build-all/bin/clang+0x2cd761e)
 #2 0x0000000002cd8d42 llvm::sys::CleanupOnSignal(unsigned long) (/llvm/build-all/bin/clang+0x2cd8d42)
 #3 0x0000000002c4c893 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0
 #4 0x0000000002c4ca3d CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0
 #5 0x00007f877102c630 __restore_rt sigaction.c:0:0
 #6 0x00007f876e75f387 raise (/lib64/libc.so.6+0x36387)
 #7 0x00007f876e760a78 abort (/lib64/libc.so.6+0x37a78)
 #8 0x00007f876e7581a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6)
 #9 0x00007f876e758252 (/lib64/libc.so.6+0x2f252)
#10 0x000000000435bd4f (anonymous namespace)::SymbolicRangeInferrer::VisitBinaryOperator(clang::ento::RangeSet, clang::BinaryOperatorKind, clang::ento::RangeSet, clang::QualType) RangeConstraintManager.cpp:0:0
#11 0x000000000435a757 (anonymous namespace)::SymbolicRangeInferrer::infer(clang::ento::SymExpr const*) RangeConstraintManager.cpp:0:0
#12 0x000000000434e7b5 (anonymous namespace)::RangeConstraintManager::assumeSymNE(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, llvm::APSInt const&, llvm::APSInt const&) RangeConstraintManager.cpp:0:0
#13 0x0000000004361e52 clang::ento::RangedConstraintManager::assumeSymUnsupported(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SymExpr const*, bool) (/llvm/build-all/bin/clang+0x4361e52)
#14 0x000000000437e419 clang::ento::SimpleConstraintManager::assumeAux(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (/llvm/build-all/bin/clang+0x437e419)
#15 0x000000000437e278 clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::NonLoc, bool) (/llvm/build-all/bin/clang+0x437e278)
#16 0x000000000437e18b clang::ento::SimpleConstraintManager::assume(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal, bool) (/llvm/build-all/bin/clang+0x437e18b)
#17 0x0000000003f4967c clang::ento::ConstraintManager::assumeDual(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::DefinedSVal) crtstuff.c:0:0
#18 0x00000000042ed7cf clang::ento::ExprEngine::evalEagerlyAssumeBinOpBifurcation(clang::ento::ExplodedNodeSet&, clang::ento::ExplodedNodeSet&, clang::Expr const*) (/llvm/build-all/bin/clang+0x42ed7cf)
#19 0x00000000042ea5db clang::ento::ExprEngine::Visit(clang::Stmt const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/llvm/build-all/bin/clang+0x42ea5db)
#20 0x00000000042e58ee clang::ento::ExprEngine::ProcessStmt(clang::Stmt const*, clang::ento::ExplodedNode*) (/llvm/build-all/bin/clang+0x42e58ee)
#21 0x00000000042e5609 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) (/llvm/build-all/bin/clang+0x42e5609)
#22 0x00000000042cd850 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const*, unsigned int, clang::ento::ExplodedNode*) (/llvm/build-all/bin/clang+0x42cd850)
#23 0x00000000042cc8e4 clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (/llvm/build-all/bin/clang+0x42cc8e4)
#24 0x0000000003f1fa5c (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*> >*) AnalysisConsumer.cpp:0:0
#25 0x0000000003f035c7 (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) AnalysisConsumer.cpp:0:0

Here is the crash.c input

I see lots of these. Probably worth a revert.

bjope added a comment.EditedOct 22 2021, 10:41 AM

I actually see lots of these as well:

clang: ../lib/Support/APInt.cpp:284: int llvm::APInt::compareSigned(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Bit widths must be same for comparison"' failed.

I get failures after having merged this patch:
[...]
Here is the crash.c input

I see lots of these. Probably worth a revert.

Yeah, actually our nightlies are also crying for the same reason, thus I'm reverting this.
@martong, we should have run some measurement jobs. We definitely need tighter integration with the upstream changes, regarding our build jobs.

@bjope, thanks for the quick report.

manas added a comment.Oct 23 2021, 3:33 PM

Based on the information from debugger, the logs show RangeSets [0, 255] and [INT_MIN, 0] from different types are causing the failure.
I tried producing a compact test case. The essential part is the comparison between different types, as in this case unsigned int and int. In other binary operators (BO_And, BO_Rem, etc.), this gets handled by coarsing RangeSets to Ranges and converting those Ranges to resulting type for comparison. This is missing in BO_NE. I suppose, this may be the reason behind the signedness mismatch.

One way to solve this would be to let the specialization of VisitBinaryOperator (definition here) handle the coarse/convert, and then VisitBinaryOperator<BO_NE> checking for any reasonable result which can be inferred from those Ranges.

Based on the information from debugger, the logs show RangeSets [0, 255] and [INT_MIN, 0] from different types are causing the failure.
I tried producing a compact test case. The essential part is the comparison between different types, as in this case unsigned int and int. In other binary operators (BO_And, BO_Rem, etc.), this gets handled by coarsing RangeSets to Ranges and converting those Ranges to resulting type for comparison. This is missing in BO_NE. I suppose, this may be the reason behind the signedness mismatch.

One way to solve this would be to let the specialization of VisitBinaryOperator (definition here) handle the coarse/convert, and then VisitBinaryOperator<BO_NE> checking for any reasonable result which can be inferred from those Ranges.

I think you are right. I'm gonna have a deeper look at this tomorrow. Until then you shouls figure out how this was avoided in other operations. Dealing with this for one and all seems promising. Ive seen multiple fixup commits resolving similar issues, so it seems to bite back regularly.

Thanks for the report @bjope and thanks for the action @manas and @steakhal !
Sorry guys, I should have run our CI safety net jobs for this patch. I tend to do that with all my solver related patches, I really don't know why I thought this was different.