The first crash reported in the bug report 44338.
Condition !isSat.hasValue() || isNotSat.getValue() here should be !isNotSat.hasValue() || isNotSat.getValue().
getValue here crashed when we used the static analyzer to analyze postgresql-12.0.
Differential D83660
[analyzer] Fix a crash for dereferencing an empty llvm::Optional variable in SMTConstraintManager.h. OikawaKirie on Jul 13 2020, 12:52 AM. Authored by
Details The first crash reported in the bug report 44338.
Diff Detail
Event TimelineComment Actions Looks like a copy-paste error indeed. @OikawaKirie do you accidentally have a test case to reproduce the crash, so that we could add it to our regression test suite? 'Cause traditionally we add a test to every patch. Note that SMT-backed constraint managers aren't an actual supported mode; it's an experimental project that isn't yet supposed to be used anywhere and probably won't be ever officially supported, unless reworked dramatically. Comment Actions @vrnithinkumar @NoQ looks like a good opportunity for SmartPtr checker and llvm::Optional! Comment Actions
@NoQ I am now working with the reporter of this bug to make a simple test case to trigger the crash.
@vsavchenko It seems not difficult to write a checker specifically for llvm::Optional. However, the root reason for this crash is an assertion failure. It seems to be better to write a checker to report assertion failures. Comment Actions I cannot agree more. Although I would appreciate some progress improving Z3 constraint manager. Might in the future I would spend some time on it - we will see. BTW nice catch & fix @OikawaKirie! Comment Actions
@steakhal My boss always asks me about how to improve the performance for SMT solver based constraint solving, on both the engine side and the SMT solver side. If there is anything that our research group can do, you are free to contact us.
Thank you. This crash was discovered by one of our team members when we were testing our CSA based tool. If the patch gets merged in the future, I'd like to use his name and email for the commit. Comment Actions I would use the official LLVM/clang-static-analyzer channel for lightweight discussions. Comment Actions No no no. No no no no no. Assertions don't say "the program crashes if this happens", assertions say "this doesn't happen". We should trust them, not argue against them. They're god-sent for any static analysis and also they themselves constitute a form of dynamic analysis. Like, even if there's assert(false) in the beginning of a function, we shouldn't flag it; it's merely an indication from the user that this function is never called. There are some cases where the assertion indeed doesn't make sense but they're extremely rare. It becomes even more confusing when people occasionally have code to handle a would-be assertion failure gracefully in release builds (i.e., when assertions are disabled). What we should warn about, though, is precondition failures. And the assertion that we've stepped on here is, in fact, a precondition: "if you call this method on an optional, first make sure the optional isn't empty". I.e., the assertion is put inside the method, but in fact it should be on the method boundary, and that's where it makes sense to warn ("method is misused"). But as of now C/C++ doesn't have a way of formally differentiating assertions from preconditions or postconditions; such feature is known as "contracts" and it was recently delayed until C++23. So manually teaching the static analyzer about preconditions of Optional's methods is the right solution until we have contracts. Comment Actions Let's wait for the test to be added, sounds like it's close. (@OikawaKirie, you can use delta debugging tools like creduce or delta to automatically write test cases for crashes) Comment Actions +1 for the test. It's not like we don't trust the fix, it is pretty obvious. However, it would be good to cover it for the future because it might help with tests around this area of code. Comment Actions It seems that this patch is stuck. The closest I could come up with was: void clang_analyzer_eval(int); void clang_analyzer_printState(); void foo(int a, int b) { if (a > 5) return; if (b < 4) return; if (a <= b) return; clang_analyzer_printState(); clang_analyzer_eval(a == 5); // expected-warning{{TRUE}} clang_analyzer_eval(b == 4); // expected-warning{{TRUE}} (void)a; (void)b; } But it still not crash. Any idea? Comment Actions Ping @OikawaKirie . Comment Actions We have tried to trigger the crash with the original project where the crash was encountered. But the problem is we cannot trigger the crash with the project, and we have lost all the previous records about this crash. Besides, both I and the bug reporter himself are now working on our own research right now. We have very limited time working on this patch in recent months. We are now planning to just make a simple regression test case to trigger the crash during our spire time. Sorry for the delay. Comment Actions After reviewing the code of this snippet, I think it would be very difficult to make a regression test case for the crash, as far as what I know about Z3 and SMT solvers. First of all, all calls to Solver->check() will return true for sat, false for unsat, and empty for a timeout. To trigger the crash, we need to construct a group of constraints that are sat. Then, we need to exclude a valid value for a symbolic variable and make the constraints lead to a timeout (rather than an unsat). Simple linear constraints have very few chances to lead to a timeout. I tried to create a group of constraints from https://stackoverflow.com/questions/20536435/z3-what-might-be-the-reason-for-timeout, which are a group of non-linear unsat constraints that can trigger a timeout (under a timeout of 10 seconds). However, I have not successfully made one, as it has too many things to do with mathematics. And my SMT solver colleagues also think it is quite difficult to make one. As far as I am thinking, it is also very tricky to trigger a constraint solver timeout. Since it can be impacted by which version of the constraint solver is used, how much time is set for the timeout, how fast your computer runs, and so on. Chances are that even if I reproduced the crash with a test case, the same test case may not work on your computer. Is it possible to hack the SMT solver creator to make a mock solver for triggering the problem? Comment Actions I suspected this one due to my previous investigation.
I highly agree.
I suspect, no. We don't have a mechanism to do dependency injection into the SMT constant manager implementation. It would be nice to have IMO. I see two options here:
I think the second option is easier and the preferred one. At the same time, that would require a greater consensus about changing the corresponding API since that is potentially used by different out-of-tree projects of LLVM. Comment Actions I'm somewhat bothered that this patch is not landed yet. :| I made a crude mock to trigger the bug using LD_PRELOAD. The test reproduces the issue and passes if you apply the fix from this patch. However, it requires compiling a shared object with the same compiler you used for building clang. I'm using the c++ compiler, for now. Comment Actions Simply replace the c++ compiler with %host_cxx compiler to compile the mock Z3 solver. BTW, I was obstructed by the z3 requirement in the regression test case when I tried to understand your test case. Even though I set the variables LLVM_Z3_INSTALL_DIR and LLVM_ENABLE_Z3_SOLVER during CMake, and the CSA can be correctly compiled with Z3, I still cannot make the test case run during make check-all. Therefore, this case was manually executed with llvm-lit -DUSE_Z3_SOLVER=1. Could you please tell me how to enable Z3 during llvm-lit. Comment Actions I went through the change and it looks good, seems like this is indeed a copy paste error from line 132.
I am a bit unhappy that we cannot run the test automatically, but maybe in the future. This is a good fix, I'd like to see landed!
Comment Actions I wouldn't mind fixing that yes, but right now I'm busy with other things. Comment Actions I am just reporting this phenomenon and I am not sure whether it is caused by my bad configurations. Can we automatically enable all test cases requiring z3 if clang is built with z3? I do not think the patch D83677 really make the problem fixed. Comment Actions Z3 constraint manager is unsupported, thus no test runs those parts. And yea, crashes more often than not. The primary goal of D83677 is to run the analyzer with the dummy range-based constraint manager, but with Z3 bugreport refutation. This scenario is monitored and maintained currently by us, Ericsson. We put effort into fixing crashes relating to this, thus it would be nice to state a test requirement that it // REQUIRES: z3, but only for refutation - which is really fast, has negligible overhead. Comment Actions Indeed it looks like a copy & paste error, I'm surprised no one found it earlier. Regarding the tests, we used to have make check-clang-analysis-z3 (or something similar) that would run only the analyzer's tests, but using Z3 as the constraint solver. It looks like this change broke it: https://reviews.llvm.org/D62445 Comment Actions I add VERBOSE=1 during execution, and command /path/to/llvm-project/build/./bin/llvm-lit -sv --param USE_Z3_SOLVER=0 /path/to/llvm-project/clang/test/Analysis/z3 is used to execute lit for testing. Should the execution requirements be changed to make it run if z3 is enabled? Or just keep it as it is now? If there are no other suggestions for this patch, I'd like to see it landed ASAP. I think it is a far too long period for a fix of a copy & paste error. Comment Actions It might worth investigating this build 'target'.
We should keep current behavior IMO.
I investigated if we could mock the whole Z3 shared object. I think we should mock only the Z3_solver_check function as done in this current patch. I don't see any immediate 'easier' and 'robust way of testing this. PS: Even if this lands, Z3 solver will crash all over the place :D This was the reason why I did not want to 'push' this fix so hard. |
It's not clear why we have to wait for the 5th invocation. Can't we return UNDEF for the first time? That would eliminate the need to call the originalFN.
By the way why do we have call the original, can't we just return immediately with Z3_L_TRUE?