This PR changes the SymIntExpr so the expression that uses
negative value as RHS, for example: x +/- (-N), are modeled as
as x -/+ N. This avoid producing very large RHS in case when the
symbol is cased to unsigned number, and as consequence makes the
value more robust in presence of cast.
This change is not apllied if N is lowest negative value for
which negation would not be representable.
Details
- Reviewers
NoQ steakhal martong ASDenysPetrov Szelethus - Commits
- rG14742443a258: Reland "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
rGda5b5ae852c4: Revert "[analyzer] Canonicalize SymIntExpr so the RHS is positive when possible"
rGf0d6cb4a5cf5: [analyzer] Canonicalize SymIntExpr so the RHS is positive when possible
Diff Detail
Event Timeline
Without the changes, out of the following unsatisfiable conditions:
A) (l - 1000) > 0
B) l > 1000
C) l > 1000L
D) (l + 0L) > 1000
CSA was considering A and B to satisable. The l is (long)(x + 1 - 10U) and x is in range [10, 100] so no overflow happens.
It looks good at first glance.
For the review, it would be nice to see which clang_analyzer_dumps and reachables change with the patch.
Could you mark the ones which have different results depending on whether you apply your fix or not?
E.g put a no-warning where we previously had a report, but now we don't.
Ah, I see. It would be still nice to see a comment in the test :D
PS: we tend to use regular expression matching for masking out the number in the symbol dump: expected-warning-re {{(reg_${{[0-9]+}}<int x>) - 1}}
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
221 | I would rather swap these branches though, to leave the default case (aka. this) to the end. | |
229–236 | Somehow I miss a check for signedness here. I have a guess, that since we already handled x +-0, SymIntExprs like x - (-0) cannot exist here, thus cannot trigger this condition spuriously. I cannot think of any ther example that could cause this misbehaving. So in that sense ConvertedRHSValue > resultIntTy.getMinValue() implies *at this place* that ConvertedRHSValue.isSigned(). | |
668–674 | It feels like this is a completely different problem compared to the simplification of X - (-N) to X + N. | |
clang/test/Analysis/additive-op-on-sym-int-expr.c | ||
1 | You assume that int has 8 bytes and that 1 byte is 8 bits. Pin the target triple. | |
11 | ||
35–36 | You could simply assign the right value directly to i AFAIK. |
This avoid producing very large RHS in case when the symbol is cased to unsigned number, and as consequence makes the value more robust in presence of cast.
Could you please elaborate on this? I understand that you'd like to simplify certain binary operations by merging the RHS and the operand, however, I don't see what are the empirical problems that this patch supposed to fix. E.g. did you encounter a crash without the fix, or was it the mentioned infeasible state ((l - 1000) > 0) that caused a false positive? If that is the case, I believe the huge cast patch is going to solve it. Could you please check if those infeasible cases are solved by the mentioned D103096 (you have to set support-symbolic-integer-casts=true) ?
clang/test/Analysis/additive-op-on-sym-int-expr.c | ||
---|---|---|
18 | And same below for the other infeasible cases. |
Could you please elaborate on this? I understand that you'd like to simplify certain binary operations by merging the RHS and the operand, however, I don't see what are the empirical problems that this patch supposed to fix. E.g. did you encounter a crash without the fix, or was it the mentioned infeasible state ((l - 1000) > 0) that caused a false positive?
Yes, the patch is aiming to fix false positives raised from unfeasible paths, that were guarded by conditions that encountered this problem.
If that is the case, I believe the huge cast patch is going to solve it. Could you please check if those infeasible cases are solved by the mentioned D103096 (you have to set support-symbolic-integer-casts=true) ?
Indeed all false positives that I am fixing with this patch would be also fixed by proper modeling of the cast of the integers.
However this patch:
- does not in any way affect the correctness of the symbol values, or the upcoming patch
- fixes them even without that support-symbolic-integer-casts enabled, and with nearly zero performance impact (two additional comparisons)
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
221 | I folded the RHS.isNegative() into the if for the BinaryOperator::isAssociative(op), as same conversion is performed in final else branch. | |
229–236 | The integer representation does not have negative zeros (the standard and clang assume two's complement). However, this condition does need to check for the signedness of the types. What I mean is that if the RHS is negative, but ConvertedRHSValue the branch will trigger and we will change x - INT_MIN to x + (INT_MAX + 1)U which is ok, as a negation of INT_MIN is representable as an unsigned type of same or lager bit with. | |
229–236 | However, I was not able to reach this point with RHS being signed, and resultTy being unsigned. Any hints how this could be done? | |
668–674 | Without the change, this code would produce X -(-N) cases, that would be detected for signed numbers. This would be handled for signed integers later by the simplification but would lead to double conversion. However, if IntType is is unsigned and first < last is true, then we will produce x + extremely large number, which would not be later simplified to X - (last - first). In other words, without applying this change, the normalization of x -(-N) would provide no benefit. And only this change without the initialial normalization would fail in case of: i = x + (-1); l = i - 10U; |
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
221 | I think what confused me is that a different API is used for doing the conversion.
Anyway, leave it as-is. | |
229–236 | I'm not saying that I can follow this thought process. But the clang/test/Analysis/PR49642.c would trigger an assertion like this: diff --git a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp index 088c33c8e612..7e59309228e1 100644 --- a/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp +++ b/clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp @@ -207,6 +207,16 @@ SVal SimpleSValBuilder::MakeSymIntVal(const SymExpr *LHS, "number of bits as its operands."); llvm::APSInt ConvertedRHSValue = resultIntTy.convert(RHS); + if (RHS.isSigned() && resultTy->isUnsignedIntegerOrEnumerationType()) { + llvm::errs() << "LHS sym:\n"; + LHS->dump(); + llvm::errs() << "RHS integral:\n"; + RHS.dump(); + llvm::errs() << "OP: " << BinaryOperator::getOpcodeStr(op) << "\n"; + llvm::errs() << "result type:\n"; + resultTy->dump(); + llvm_unreachable("how is it possible??"); + } // Check if the negation of the RHS is representable, // i.e., the resultTy is signed, and it is not the lowest // representable negative value. Which can be reduced into this one: // RUN: %clang_analyze_cc1 -Wno-implicit-function-declaration -w -verify %s \ // RUN: -analyzer-checker=core \ // RUN: -analyzer-checker=apiModeling.StdCLibraryFunctions // expected-no-diagnostics typedef int ssize_t; int write(int, const void *, unsigned long); unsigned c; void a() { int b = write(0, 0, c); b != 0; c -= b; b < 1; ++c; // crash simplifySValOnce: derived_$4{conj_$1{int, LC1, S700, #1},c} op(-) APInt(32b, 4294967295u -1s) :: unsigned int } |
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
221 | As a note, the use of different APIs was intentional. The BasicVals one is persisting the value, so it is safe to use ptr to it, as a consequence it is more costfull. So, I am delaying its use until I know I will need to persist the value. | |
229–236 | What I mean, is that performing normalization (op and sign switch) is always correct for the unsigned resultInTy, even if RHS is the lowest representable negative number. The code is already behaving correctly in that case (I have verified your example), as the ConvertedRHSValue > resultIntTy.getMinValue() is always passing in a situation when resultIntTy.isUnsigned() is true (zero was eliminated before), so I left simple check. But, now I see that this is confusing, so I have updated the check to be more explicit and updated the comment. |
I do not have commit rights. Would it be possible to you to commit the changes on my behalf?
This patch triggers a crash with this minimized example.
assertion at L205: "The result operation type must have at least the same number of bits as its operands."
// build/bin/clang -cc1 -triple x86_64-unknown-linux-gnu -analyze -analyzer-checker=core,apiModeling repro.c typedef long ssize_t; ssize_t write(int, const void *, unsigned long); int crash(int x, int fd) { unsigned wres = write(fd, "a", 1); if (wres) {} int t1 = x - wres; if (wres < 0) {} return x + t1; }
The debugger showed these values:
op: BO_Sub resultTy: unsigned int resultIntTy.isUnsigned(): true Sym: reg<int x> RHSValue: APInt(64b, 18446744073709551615u -1s) resultIntTy.getBitWidth(): 32 RHS.getBitWidth(): 64
Please investigate this @tomasz-kaminski-sonarsource
Hi,
I've stumbled on this crash as well with this patch.
So
clang --analyze foo.c
crashes with
clang: ../../clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp:207: clang::ento::SVal (anonymous namespace)::SimpleSValBuilder::MakeSymIntVal(const clang::ento::SymExpr *, BinaryOperator::Opcode, const llvm::APSInt &, clang::QualType): Assertion `resultIntTy.getBitWidth() >= RHS.getBitWidth() && "The result operation type must have at least the same " "number of bits as its operands."' failed. PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script. Stack dump: 0. Program arguments: /repo/uabelho/master-github/llvm/build-all/bin/clang --analyze foo.c 1. <eof> parser at end of file 2. While analyzing stack: #0 Calling h 3. foo.c:9:9: Error evaluating statement 4. foo.c:9:9: Error evaluating statement #0 0x0000000002e4fbf3 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x2e4fbf3) #1 0x0000000002e4d86e llvm::sys::RunSignalHandlers() (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x2e4d86e) #2 0x0000000002e4ef92 llvm::sys::CleanupOnSignal(unsigned long) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x2e4ef92) #3 0x0000000002db4e03 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) CrashRecoveryContext.cpp:0:0 #4 0x0000000002db4fbd CrashRecoverySignalHandler(int) CrashRecoveryContext.cpp:0:0 #5 0x00007fd21e2ce630 __restore_rt sigaction.c:0:0 #6 0x00007fd21ba15387 raise (/lib64/libc.so.6+0x36387) #7 0x00007fd21ba16a78 abort (/lib64/libc.so.6+0x37a78) #8 0x00007fd21ba0e1a6 __assert_fail_base (/lib64/libc.so.6+0x2f1a6) #9 0x00007fd21ba0e252 (/lib64/libc.so.6+0x2f252) #10 0x000000000467e0bb (anonymous namespace)::SimpleSValBuilder::MakeSymIntVal(clang::ento::SymExpr const*, clang::BinaryOperatorKind, llvm::APSInt const&, clang::QualType) SimpleSValBuilder.cpp:0:0 #11 0x000000000467a94f (anonymous namespace)::SimpleSValBuilder::evalBinOpNN(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::NonLoc, clang::ento::NonLoc, clang::QualType) SimpleSValBuilder.cpp:0:0 #12 0x0000000004688f46 clang::ento::SValBuilder::evalBinOp(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::SVal, clang::ento::SVal, clang::QualType) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x4688f46) #13 0x000000000467f28e clang::ento::SymExprVisitor<(anonymous namespace)::SimpleSValBuilder::simplifySValOnce(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SVal)::Simplifier, clang::ento::SVal>::Visit(clang::ento::SymExpr const*) SimpleSValBuilder.cpp:0:0 #14 0x000000000467eccc (anonymous namespace)::SimpleSValBuilder::simplifySValOnce(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SVal) SimpleSValBuilder.cpp:0:0 #15 0x000000000467d814 (anonymous namespace)::SimpleSValBuilder::simplifySVal(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::ento::SVal) SimpleSValBuilder.cpp:0:0 #16 0x0000000004679ec3 (anonymous namespace)::SimpleSValBuilder::evalBinOpNN(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::NonLoc, clang::ento::NonLoc, clang::QualType) SimpleSValBuilder.cpp:0:0 #17 0x0000000004688f46 clang::ento::SValBuilder::evalBinOp(llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>, clang::BinaryOperatorKind, clang::ento::SVal, clang::ento::SVal, clang::QualType) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x4688f46) #18 0x00000000045f509c clang::ento::ExprEngine::VisitBinaryOperator(clang::BinaryOperator const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45f509c) #19 0x00000000045e201a clang::ento::ExprEngine::Visit(clang::Stmt const*, clang::ento::ExplodedNode*, clang::ento::ExplodedNodeSet&) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45e201a) #20 0x00000000045dd27e clang::ento::ExprEngine::ProcessStmt(clang::Stmt const*, clang::ento::ExplodedNode*) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45dd27e) #21 0x00000000045dcf99 clang::ento::ExprEngine::processCFGElement(clang::CFGElement, clang::ento::ExplodedNode*, unsigned int, clang::ento::NodeBuilderContext*) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45dcf99) #22 0x00000000045c5300 clang::ento::CoreEngine::HandlePostStmt(clang::CFGBlock const*, unsigned int, clang::ento::ExplodedNode*) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45c5300) #23 0x00000000045c437a clang::ento::CoreEngine::ExecuteWorkList(clang::LocationContext const*, unsigned int, llvm::IntrusiveRefCntPtr<clang::ento::ProgramState const>) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x45c437a) #24 0x00000000041f10dc (anonymous namespace)::AnalysisConsumer::HandleCode(clang::Decl*, unsigned int, clang::ento::ExprEngine::InliningModes, llvm::DenseSet<clang::Decl const*, llvm::DenseMapInfo<clang::Decl const*, void> >*) AnalysisConsumer.cpp:0:0 #25 0x00000000041d3e6a (anonymous namespace)::AnalysisConsumer::HandleTranslationUnit(clang::ASTContext&) AnalysisConsumer.cpp:0:0 #26 0x00000000046c5f03 clang::ParseAST(clang::Sema&, bool, bool) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x46c5f03) #27 0x00000000038748a6 clang::FrontendAction::Execute() (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x38748a6) #28 0x00000000037e8fc4 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x37e8fc4) #29 0x0000000003932562 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x3932562) #30 0x0000000000a0feac cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0xa0feac) #31 0x0000000000a0dccf ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0 #32 0x00000000036577e2 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) Job.cpp:0:0 #33 0x0000000002db4d21 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x2db4d21) #34 0x00000000036571d8 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x36571d8) #35 0x000000000361abc6 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x361abc6) #36 0x000000000361b11a clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x361b11a) #37 0x0000000003637778 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/repo/uabelho/master-github/llvm/build-all/bin/clang+0x3637778) #38 0x0000000000a0d571 main (/repo/uabelho/master-github/llvm/build-all/bin/clang+0xa0d571) #39 0x00007fd21ba01555 __libc_start_main (/lib64/libc.so.6+0x22555) #40 0x0000000000a0a90c _start (/repo/uabelho/master-github/llvm/build-all/bin/clang+0xa0a90c)
with foo.c being
long a; char c; long(d)(long f, long g) { return g ? f : 0; } void h() { for (;;) { long *e = h; d(-5L, *e) == a; char b = a; c = b + c; } }
Thank you for reverting the change.
I am looking into the crashes. The simplest solution is to not apply normalization in such a situation. But, want to understand better how such values are produced in the first case, and if performing normalization would make sense in such case.
I have removed the assertions and updated the code to handle both extensions of reductions of bitwith from RHS to ResultType.
Expanded test, to cover the above scenarios.
@uabelho Could you please apply this change and check if it resolves your crashes?
I'm gonna do the same on our side.
It's still messed up. Please double-check.
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
117 | else after return is unnecessary. |
Not sure if I'm doing something wrong but I can't apply this patch on current top of tree (a4190037fac0)
When I download the "raw diff" using the Download Raw Diff button on this page, then I try to git apply it still fails. (error 219: new blank line at EOF.)
Anyway, I managed to apply your path by hand, but now some tests fail.
It looks like I got change in clang/test/Analysis/expr-inspection.c while updating the patch. :/
Now rebasing, rebuilding llvm.
Firstly, my apologies for the back and forth. My repo was in some strange state, and I wasn't picking up all the changes that got reverted in my diff (this is why the patch was not applying and the test failed later). Uploading a clean diff.
This time I've checked this on our test set, and it seems to run without any issues. I'm still waiting for the results of analyzing LLVM, but I think safe to say that this should work.
Could you please give this another shoot at this @uabelho?
clang/lib/StaticAnalyzer/Core/SimpleSValBuilder.cpp | ||
---|---|---|
107–122 | Please follow the naming conventions of LLVM, so variables are UpperCase, and the & sign binds to the variable name. Also, taking a parameter by-const-value is discouraged, since it has no visible effect. |
[...]
Could you please give this another shoot at this @uabelho?
Yes, the case that crashed for me works with the updated patch. Haven't done other testing.
Thank you for additional checks. Unfortunatelly, I am not able to perform commits for this and next wee (up to 20th of May), so I wonder if you could commit this one for me.
else after return is unnecessary.