This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Implement assume in terms of assumeDual
ClosedPublic

Authored by martong on May 2 2022, 4:47 AM.

Details

Summary

By evaluating both children states, now we are capable of discovering
infeasible parent states. In this patch, assume is implemented in the terms
of assumeDual. This might be suboptimal (e.g. where there are adjacent
assume(true) and assume(false) calls, next patches addresses that). This patch
fixes a real CRASH.
Fixes https://github.com/llvm/llvm-project/issues/54272

Diff Detail

Event Timeline

martong created this revision.May 2 2022, 4:47 AM
Herald added a project: Restricted Project. · View Herald Transcript
martong requested review of this revision.May 2 2022, 4:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 2 2022, 4:47 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Although there is a visible slowdown, the new run-times seem promising. My guess is that it is usually not more than 1-2%.

Full report:

steakhal accepted this revision.May 2 2022, 6:15 AM

The patch looks great. Thanks for the stats.

Beyond that, I feel these tests sooo fragile; Although I don't have anything to improve that, so be it.

clang/test/Analysis/infeasible-crash.c
4

Do we really need this?

This revision is now accepted and ready to land.May 2 2022, 6:15 AM
martong updated this revision to Diff 427635.May 6 2022, 7:22 AM
  • Rebase to parent revision
martong updated this revision to Diff 427636.May 6 2022, 7:24 AM
martong marked an inline comment as done.
  • Remove ExprInspection checker from the test
clang/test/Analysis/infeasible-crash.c
4

No, thx. Removed.

This revision was landed with ongoing or failed builds.May 10 2022, 1:17 AM
This revision was automatically updated to reflect the committed changes.

infeasible-crash.c fails both on arm and windows. The reasion is the incompatible memmove declaration. I am to fix this ASAP.

Armv7

error: 'warning' diagnostics seen but not expected: 
  File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove'
error: 'note' diagnostics seen but not expected: 
  File /home/tcwg-buildbot/worker/clang-armv7-quick/llvm/clang/test/Analysis/infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned int)'
2 errors generated.

windows x64:

error: 'warning' diagnostics seen but not expected: 
  File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: incompatible redeclaration of library function 'memmove'
error: 'note' diagnostics seen but not expected: 
  File C:\b\slave\clang-x64-windows-msvc\llvm-project\clang\test\Analysis\infeasible-crash.c Line 9: 'memmove' is a builtin with type 'void *(void *, const void *, unsigned long long)'
2 errors generated.

Yes, it fixed the test case both at clang-armv7-quick and at clang-x64-windows-msvc.

This commit introduced a serious runtime regression on this code:

#define DEMONSTRATE_HANG

typedef unsigned char uint8_t;
typedef unsigned short uint16_t;
typedef unsigned long uint64_t;

void clang_analyzer_numTimesReached(void);

int filter_slice_word(int sat_linesize, int sigma, int radius, uint64_t *sat,
                      uint64_t *square_sat, int width, int height,
                      int src_linesize, int dst_linesize, const uint16_t *src,
                      uint16_t *dst, int jobnr, int nb_jobs) {
  const int starty = height * jobnr / nb_jobs;
  const int endy = height * (jobnr + 1) / nb_jobs;

  clang_analyzer_numTimesReached(); // 1 times
  for (int y = starty; y < endy; y++) {
    clang_analyzer_numTimesReached(); // 285 times

    int lower_y = y - radius < 0 ? 0 : y - radius;
    int higher_y = y + radius + 1 > height ? height : y + radius + 1;
    int dist_y = higher_y - lower_y;
    clang_analyzer_numTimesReached(); // 1128 times

    for (int x = 0; x < width; x++) {
      clang_analyzer_numTimesReached(); // 560 times
      
      int lower_x = x - radius < 0 ? 0 : x - radius;
      int higher_x = x + radius + 1 > width ? width : x + radius + 1;
      int count = dist_y * (higher_x - lower_x);
#ifdef DEMONSTRATE_HANG
      uint64_t sum = sat[higher_y * sat_linesize + higher_x] -
                     sat[higher_y * sat_linesize + lower_x] -
                     sat[lower_y * sat_linesize + higher_x] +
                     sat[lower_y * sat_linesize + lower_x];
      uint64_t square_sum = square_sat[higher_y * sat_linesize + higher_x] -
                            square_sat[higher_y * sat_linesize + lower_x] -
                            square_sat[lower_y * sat_linesize + higher_x] +
                            square_sat[lower_y * sat_linesize + lower_x];
      uint64_t mean = sum / count;
      uint64_t var = (square_sum - sum * sum / count) / count;
      dst[y * dst_linesize + x] =
          (sigma * mean + var * src[y * src_linesize + x]) / (sigma + var);
#endif
    }
  }
  return 0;
}
/build/release/bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c

Prior to this commit, the analysis of this code took about 2.6 seconds on my machine (release build with shared libs)
After this commit, it takes more than 67 minutes, and most of the time is spent in the constraint solver doing simplification I think.

Be reminded of the ArrayBoundV2 checker which will try to express the symbolic atom of the indexer expression:
0 <= (x + 3) < extent => -3 <= x < extent - 3 (this is more complex in the wild, it's for demonstration now).
So, it will create new and new states during this process, while gathering constraints to make the indexing well-formed.

However, the nested loops and the eager bifurcation over the < and > operators cause a significant path explosion, and on each one of them, we do this complex equality system reorganization, aggregating constraints, thus causing constraint system simplifications down the line.
I'm not sure how to tackle this problem ATM.

Thanks Balazs for the report.

Here is my analysis. Looks like during the recursive simplification, reAssume produces States that had been created by a previous reAssume. Before this change, to stop the recursion it was enough to to check if the OldState equals to the actual State in reAssume. Now, with this change, each assume call evaluates both the true and the false branches, thus it is not necessary that the subsequent reAssume could detect an already "visited" State.
So, the obvious solution would be to have a State cache in the reAssume machinery, though, implementation details are not clear yet.

There is another really important thing. We should not continue with reAssume if the State is posteriorlyOverConstrained.

 LLVM_NODISCARD ProgramStateRef reAssume(ProgramStateRef State,
                                         const RangeSet *Constraint,
                                         SVal TheValue) {
+  if (State->isPosteriorlyOverconstrained())
+    return nullptr;
   if (!Constraint)
     return State;

This change in itself reduced the run-time of the analysis to 16 seconds, on my machine. However, the repetition of States should still be addressed. I am going to upload the upper patch for a starter.

This change in itself reduced the run-time of the analysis to 16 seconds, on my machine. However, the repetition of States should still be addressed. I am going to upload the upper patch for a starter.

Sorry, in that 16s, I measured also the rebuild and linkage of the Clang binary. The time is actually way better, 2.8s, which is quite close to the original values we had before this change. So, perhaps it is not even needed to bother with the above mentioned cache mechanism.

 time ./bin/clang --analyze -Xclang -analyzer-checker=core,alpha.security.ArrayBoundV2,debug.ExprInspection test.c
test.c:14:3: warning: 1 [debug.ExprInspection]
  clang_analyzer_numTimesReached(); // 1 times
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:16:5: warning: 253 [debug.ExprInspection]
    clang_analyzer_numTimesReached(); // 285 times
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:21:5: warning: 805 [debug.ExprInspection]
    clang_analyzer_numTimesReached(); // 1128 times
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test.c:24:7: warning: 487 [debug.ExprInspection]
      clang_analyzer_numTimesReached(); // 560 times
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 warnings generated.
./bin/clang --analyze -Xclang  test.c  2.74s user 0.07s system 99% cpu 2.811 total

As a heads up, because I'm not sure how often folks look at Github Issues. This patch causes a stack overflow on some Objective-C++ code. I have filed https://github.com/llvm/llvm-project/issues/55851. Could you take a look @martong?