By evaluating both children states, now we are capable of discovering
infeasible parent states. In this patch, assume is implemented in the terms
of assumeDual. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls, next patches addresses that). This patch
fixes a real CRASH.
Fixes https://github.com/llvm/llvm-project/issues/54272
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Although there is a visible slowdown, the new run-times seem promising. My guess is that it is usually not more than 1-2%.
Full report:
The patch looks great. Thanks for the stats.
Beyond that, I feel these tests sooo fragile; Although I don't have anything to improve that, so be it.
clang/test/Analysis/infeasible-crash.c | ||
---|---|---|
4 | Do we really need this? |
- Remove ExprInspection checker from the test
clang/test/Analysis/infeasible-crash.c | ||
---|---|---|
4 | No, thx. Removed. |
infeasible-crash.c fails both on arm and windows. The reasion is the incompatible memmove declaration. I am to fix this ASAP.
Armv7
error: 'warning' diagnostics seen but not expected: File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove' error: 'note' diagnostics seen but not expected: File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned int)' 2 errors generated.
windows x64:
error: 'warning' diagnostics seen but not expected: File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove' error: 'note' diagnostics seen but not expected: File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned long long)' 2 errors generated.
Hopefully this commit fixes the failure:
https://github.com/llvm/llvm-project/commit/21feafaeb85aad2847db44aa2208999b166ba4a9
This commit introduced a serious runtime regression on this code:
#define DEMONSTRATE_HANG typedef unsigned char uint8_t; typedef unsigned short uint16_t; typedef unsigned long uint64_t; void clang_analyzer_numTimesReached(void); int filter_slice_word(int sat_linesize, int sigma, int radius, uint64_t *sat, uint64_t *square_sat, int width, int height, int src_linesize, int dst_linesize, const uint16_t *src, uint16_t *dst, int jobnr, int nb_jobs) { const int starty = height * jobnr / nb_jobs; const int endy = height * (jobnr + 1) / nb_jobs; clang_analyzer_numTimesReached(); // 1 times for (int y = starty; y < endy; y++) { clang_analyzer_numTimesReached(); // 285 times int lower_y = y - radius < 0 ? 0 : y - radius; int higher_y = y + radius + 1 > height ? height : y + radius + 1; int dist_y = higher_y - lower_y; clang_analyzer_numTimesReached(); // 1128 times for (int x = 0; x < width; x++) { clang_analyzer_numTimesReached(); // 560 times int lower_x = x - radius < 0 ? 0 : x - radius; int higher_x = x + radius + 1 > width ? width : x + radius + 1; int count = dist_y * (higher_x - lower_x); #ifdef DEMONSTRATE_HANG uint64_t sum = sat[higher_y * sat_linesize + higher_x] - sat[higher_y * sat_linesize + lower_x] - sat[lower_y * sat_linesize + higher_x] + sat[lower_y * sat_linesize + lower_x]; uint64_t square_sum = square_sat[higher_y * sat_linesize + higher_x] - square_sat[higher_y * sat_linesize + lower_x] - square_sat[lower_y * sat_linesize + higher_x] + square_sat[lower_y * sat_linesize + lower_x]; uint64_t mean = sum / count; uint64_t var = (square_sum - sum * sum / count) / count; dst[y * dst_linesize + x] = (sigma * mean + var * src[y * src_linesize + x]) / (sigma + var); #endif } } return 0; }
/build/release/bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c
Prior to this commit, the analysis of this code took about 2.6 seconds on my machine (release build with shared libs)
After this commit, it takes more than 67 minutes, and most of the time is spent in the constraint solver doing simplification I think.
Be reminded of the ArrayBoundV2 checker which will try to express the symbolic atom of the indexer expression:
0 <= (x + 3) < extent => -3 <= x < extent - 3 (this is more complex in the wild, it's for demonstration now).
So, it will create new and new states during this process, while gathering constraints to make the indexing well-formed.
However, the nested loops and the eager bifurcation over the < and > operators cause a significant path explosion, and on each one of them, we do this complex equality system reorganization, aggregating constraints, thus causing constraint system simplifications down the line.
I'm not sure how to tackle this problem ATM.
Thanks Balazs for the report.
Here is my analysis. Looks like during the recursive simplification, reAssume produces States that had been created by a previous reAssume. Before this change, to stop the recursion it was enough to to check if the OldState equals to the actual State in reAssume. Now, with this change, each assume call evaluates both the true and the false branches, thus it is not necessary that the subsequent reAssume could detect an already "visited" State.
So, the obvious solution would be to have a State cache in the reAssume machinery, though, implementation details are not clear yet.
There is another really important thing. We should not continue with reAssume if the State is posteriorlyOverConstrained.
LLVM_NODISCARD ProgramStateRef reAssume(ProgramStateRef State, const RangeSet *Constraint, SVal TheValue) { + if (State->isPosteriorlyOverconstrained()) + return nullptr; if (!Constraint) return State;
This change in itself reduced the run-time of the analysis to 16 seconds, on my machine. However, the repetition of States should still be addressed. I am going to upload the upper patch for a starter.
This change in itself reduced the run-time of the analysis to 16 seconds, on my machine. However, the repetition of States should still be addressed. I am going to upload the upper patch for a starter.
Sorry, in that 16s, I measured also the rebuild and linkage of the Clang binary. The time is actually way better, 2.8s, which is quite close to the original values we had before this change. So, perhaps it is not even needed to bother with the above mentioned cache mechanism.
time ./bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c test.c:14:3: warning: 1 [debug.ExprInspection] clang_analyzer_numTimesReached(); // 1 times ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test.c:16:5: warning: 253 [debug.ExprInspection] clang_analyzer_numTimesReached(); // 285 times ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test.c:21:5: warning: 805 [debug.ExprInspection] clang_analyzer_numTimesReached(); // 1128 times ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ test.c:24:7: warning: 487 [debug.ExprInspection] clang_analyzer_numTimesReached(); // 560 times ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4 warnings generated. ./bin/clang --analyze -Xclang test.c 2.74s user 0.07s system 99% cpu 2.811 total
As a heads up, because I'm not sure how often folks look at Github Issues. This patch causes a stack overflow on some Objective-C++ code. I have filed https://github.com/llvm/llvm-project/issues/55851. Could you take a look @martong?
Do we really need this?