Implement new Z3 constraint manager backend.
Repository: rL LLVM
This is a fairly large patch, but its core is the actual implementation of the Z3 constraint solver backend. In its current form, the backend is implemented as a Clang plugin, but I also have a git commit that has it inside the static analyzer. The non-plugin approach makes testing easier by avoiding the introduction of new %z3 and %z3_cc1 variables in lit.cfg and the modification of all the testcases to load the Z3 constraint manager plugin.
- The testcase Analysis/unsupported-types.c currently fails, because when using the constraint manager as a Clang plugin, somehow a copy of llvm::APFloat is getting compiled into the shared library. Thus, comparisons of &TI.getLongDoubleFormat() against &llvm::APFloat::x87DoubleExtended() or &llvm::APFloat::PPCDoubleDouble() are always failing, which causes a subsequent assertion error. When testing the non-plugin implementation, it works fine. I'm not familiar with the Clang/LLVM build system, so this is probably due to an error on my part.
- The testcase Analysis/reference.cpp currently fails, because the RangeConstraintManager has an internal optimization that assumes references are known to be non-zero (see RangeConstraintManager.cpp:422). This optimization is probably in the wrong place, and should be moved into the analyzer and out of the constraint solver.
- The testcase Analysis/ptr-arith.c currently fails, because handling of ptrdiff_t is somewhat tricky. This constraint manager interface maps ptrdiff_t to a signed bitvector in Z3, and pointers to unsigned bitvectors in Z3. However, since this testcase compares the pointer values both directly and indirectly via ptrdiff_t, if a < b is true using a signed comparison, the same is not necessarily true using an unsigned comparison.
- Because of the divergence in capabilities between this and the existing constraint manager, the follow-up child patches that add support for additional features will probably cause problems for the existing constraint manager. I haven't fully tested that, but there is an issue of how to update the testsuite results without duplicating most of the existing testcases for each constraint manager backend.
- Since this constraint manager uses Z3, the plugin links against the Z3 shared library. I believe this should be fine, since Z3 is under an MIT license, but I'm not really familiar with licensing issues.
- This plugin requires Z3 4.5.0 or later. I'm not sure if this is available in any distribution package repositories, so it may be necessary to build from source.
- There are some performance issues with symbolic state explosion when analyzing medium-to-large codebases. It will probably be necessary to tune the maximum complexity parameter (MaxComp) in SValBuilder.cpp, and to adjust the Z3 constraint solver timeout (currently 15 sec). Longer-term, it is probably worth implementing an SMT query cache for assumeAux and getSymVal.
Do you think you could upload the patch omitting all of the test case changes for now? Maybe include one as an example, but since it seems to be just adding %z3_cc1, we don't need to see all of them right now.
The KLEE project has a useful abstraction layer around multiple bitvector solvers (Boolector, CVC4, STP, and Z3). It's also used by Souper, another LLVM-based project. I would encourage you to consider using that rather than interacting directly with a specific solver.
For what it's worth, here's my attempt at integrating STP.
Amazing work, Dominic. This is what I've wanted to test for a long time. But, personally, I'm not happy with the massive changes in tests.
- I don't think that we need to change the run lines for tests that pass with both managers. These changes are pretty noisy.
- If Z3 is optional, we cannot enforce its usage.
- How will testing be performed without Z3?
I think there should be a way to separate tests between RangeConstraintManager and Z3.
Are we going to drop RangeConstraintManager support? If not, we should care about backward compatibility until that happens.
Do you want me to replace this version of the patch with one that omits the test case changes? The underlying git commit for just the Z3 constraint manager implementation is https://github.com/ddcc/clang/commit/e1414d300882c1459f461424d3e89d1613ecf03c , and https://github.com/ddcc/clang/commit/fb7d558be6492f11a3793f011f364098ddcc9711 is the commit that converts it to a plugin and modifies all the testcases (ignore the commit naming, it was changed by Phabricator).
I looked around for a generic smt-lib interface earlier, but couldn't find much available, and since I wanted floating-point arithmetic support, I ended up just directly using the Z3 C interface (the C++ interface uses exceptions, which causes problems). As far as I know, CVC4 doesn't have built-in floating-point support. But longer term, I'd agree that this backend should be made more generic.
I'm not sure of the easiest way to maintain backward compatibility with RangeConstraintManager. Right now, I've modified lit.cfg to substitute %z3 and %z3_cc1 (for clang or clang -cc1) with the appropriate argument to load the plugin, if the z3 feature is available; otherwise, they are substituted with the empty string. Another approach would be to do this all in llvm-lit without modifying the testcases, but then there'd need to be some sort of path-based matching to determine when the plugin-loading argument should be injected (e.g. only for test/Analysis). However, if this constraint manager were built into clang instead of being a plugin, this issue would be moot.
Another issue is that some testcases will have different results depending on the constraint manager in use, but I don't believe it's possible to change the expected testcase output based on the presence of a feature flag. Unless this changes, there'll need to be (mostly) duplicate testcases for each constraint manager.
Furthermore, this and the child patches will cause the static analyzer to generate more complex constraints at runtime. But, I'm not sure if just having RangeConstraintManager ignore unsupported constraints will be sufficient, from a performance and correctness perspective.
My personal preference would be to directly merge in this constraint manager without using the plugin approach, because it would simplify some of the testing and changes to the build system (e.g. the APFloat linking issue). But I'm not sure if this would be ok?
Late-joining the round of applause for the awesome progression of this project!
Not sure how to deal with the massive test run-line changes at the moment, will take some extra time to think.
For such test cases i'd consider something like this:
// RUN: ... -analyzer-constraints=range -DRANGE ...
// RUN: ... -analyzer-constraints=z3 ...
#ifdef RANGE
  foo(); // expected-warning{{}}
#else
  foo(); // no-warning
#endif
Yep, the analyzer core has a lot of code that says "we denote this value with an incorrect symbol here, because even if we provided the correct symbol it would be too hard for the RangeConstraintManager to understand". The most obvious place is the integral cast code, which leaves the symbol intact *most* of the time (all the time until recently).
I think the right way to address these issues is to introduce an API in the base ConstraintManager interface to ask the solver whether it implements a certain feature or is capable of dealing with a particular kind of symbol, and change our behavior accordingly. We already have one function that was intended to do something like this - ConstraintManager::canReasonAbout(); not sure if it's working or broken.
For such test cases i'd consider something like this:
// RUN: ... -analyzer-constraints=range -DRANGE ...
// RUN: ... -analyzer-constraints=z3 ...
#ifdef RANGE
  foo(); // expected-warning{{}}
#else
  foo(); // no-warning
#endif
Would this be assuming that the test setup has Z3 installed and both constraint managers available? I see that llvm-lit supports a REQUIRES directive, but I don't know if that can be scoped to just one RUN.
Yep, the analyzer core has a lot of code that says "we denote this value with an incorrect symbol here, because even if we provided the correct symbol it would be too hard for the RangeConstraintManager to understand". The most obvious place is the integral cast code, which leaves the symbol intact *most* of the time (all the time until recently).
I think the right way to address these issues is to introduce an API in the base ConstraintManager interface to ask the solver whether it implements a certain feature or is capable of dealing with a particular kind of symbol, and change our behavior accordingly. We already have one function that was intended to do something like this - ConstraintManager::canReasonAbout(); not sure if it's working or broken.
canReasonAbout() works fine, but in its current form it requires parsing each constraint to determine whether it's supported or not. In D26061, I've pushed the implementation from SimpleConstraintManager into RangeConstraintManager itself, since Z3ConstraintManager will also inherit from SimpleConstraintManager but with a different feature set. However, even with these changes, canReasonAbout() is only called from within SimpleConstraintManager; the feedback doesn't go all the way back to e.g. SimpleSValBuilder to dynamically change the constraint complexity at runtime.
Thanks for working on this Dominic!!!
Can you talk a bit about your motivation for working on this project and what your goals are?
Have you compared the performance when using Z3 vs the current builtin solver? I saw that you mention performance issues on large codebases, but could you give some specific numbers of the tradeoffs between performance, code coverage, and the quality of reported bugs. (One very rough idea to use a stronger solver while still keeping the analyzer performant would be to only use it for false positive refutation.)
I looked around for a generic smt-lib interface earlier, but couldn't find much available, and since I wanted floating-point arithmetic support, I ended up just directly using the Z3 C interface (the C++ interface uses exceptions, which causes problems). As far as I know, CVC4 doesn't have built-in floating-point support. But longer term, I'd agree that this backend should be made more generic.
Ok. Though I'd love to see an interface that supports CVC4!
Another issue is that some testcases will have different results depending on the constraint manager in use, but I don't believe it's possible to change the expected testcase output based on the presence of a feature flag. Unless this changes, there'll need to be (mostly) duplicate testcases for each constraint manager.
How different are the results? Is it the case that you get more warnings with Z3 in most cases? Would it be possible to divide the tests into ones that work similarly with both solvers and ones that are only expected to report warnings with Z3? I know this won't work in general, but it might be much cleaner than polluting every test with a ton of #ifdefs.
Furthermore, this and the child patches will cause the static analyzer to generate more complex constraints at runtime. But, I'm not sure if just having RangeConstraintManager ignore unsupported constraints will be sufficient, from a performance and correctness perspective.
This is probably the most important question to answer. Maintaining the performance and correctness of the analyzer mode that is using RangeConstraintManager is very important as this is the mode most users will use at least for a while.
My personal preference would be to directly merge in this constraint manager without using the plugin approach, because it would simplify some of the testing and changes to the build system (e.g. the APFloat linking issue). But I'm not sure if this would be ok?
Most likely that would be possible.
I'd also like to second Ryan's suggestion to look at his patch to add support for STP. It was very close to landing other than some build system integration issues.
Cool work Dominic!
Just a few random comments from the peanut gallery regarding the usage of Z3:
- I would definitely split the Z3Expr into expr/solver/model classes. Several advantages: plugging in new solvers becomes easier, and reference counting can be handled more safely and automatically. Right now your solver+model-related code handles ref-counting "by hand", which can easily lead to bugs.
- I would add an rvalue constructor to the expr class to avoid ref-counting when unneeded.
- Function Z3Expr::fromData() seems to be able to create arbitrarily-sized bit-vectors. I would limit it to e.g. 512 bits or something like that. Z3 is super slow with huge bit-vectors. If you feel fancy (for a following version of this feature), you could encode this information lazily (based on the model).
- Z3Expr::fromInt() -- why take a string as input and not a uint64_t?
- Why call simplify before asserting a constraint? That's usually a pretty slow thing to do (in my experience).
- You should replace Z3_mk_solver() with something fancier. If you don't feel fancy, just use Z3_mk_simple_solver() (simple is short for simplifying btw). If you know the theory of the constraints (e.g., without floats, it would be QF_BV), you should use Z3_mk_solver_for_theory() (or whatever the name was). If you feel really fancy, you could roll your own tactic (in my tools I often have a tactic class to build my own tactics; Z3 even allows you to case split on the theory after simplifications)
- Instead of asserting multiple constraints to the solver, I would and() them and assert just once (plays nicer with some preprocessors of Z3)
- Timeout: SMT solvers sometimes go crazy and take forever. I would advise to add a timeout to the solver to avoid getting caught on a trap. This timeout can be different for the two use cases: check if a path is feasible (lower timeout), or check if you actually have a bug (higher timeout)
- Also, I didn't see anything regarding memory (I might have missed yet). Are you using the array theory? If so, I would probably advise benchmarking it carefully since Z3's support for arrays is not great (even though it has improved a bit in the last year or so).
BTW, adding small amount of constant folding in the expression builder is usually good, since you save time in Z3 preprocessor. Especially if you want to go for smallish timeouts (likely). If you want to be fancy, you can also canonicalize expressions like Z3 likes (e.g., only ule/sle, no slt, sgt, etc).
Convert to plugin, add move/assignment constructor, avoid Z3_simplify(), use Z3_mk_simple_solver(), generate logical and of all constraints in solver
I'd like to note that one of the main challenges with this implementation was needing to perform type reinference and implicit casting within the constraint manager, since the constraints are sometimes implicit (e.g. (int) x instead of x == 0). It also didn't help that the Sema methods for handling type conversion aren't directly usable.
Can you talk a bit about your motivation for working on this project and what your goals are?
The original goal was to add support for symbolic execution on floating-point types, and to analyze some real-world programs to look for interesting bugs. Unfortunately, it turns out that most floating-point bugs in the past tend to originate lower in the compiler toolchain: within the register allocator (e.g. non-determinism involving x87 FPU 80-bit registers), intermediate optimization passes (see Alive-FP), string conversions involving floating-point types (which may be too complex to model easily), or unexpected interactions between various 1st/3rd-party code when modifying the C99 thread-local floating-point environment (from fenv.h). The latter isn't really well modeled by Clang/LLVM, resulting in false positives when real-world programs use floating-point exception handlers to catch underflow/overflow/etc. conditions. There's also the floating-point rounding mode that can be changed at runtime, but again, it's not something that's currently modeled by Clang/LLVM. I have seen a set of patches on the mailing list (by Sergey, I believe) that improve support for this, but to my knowledge they haven't been merged yet.
Have you compared the performance when using Z3 vs the current builtin solver? I saw that you mention performance issues on large codebases, but could you give some specific numbers of the tradeoffs between performance, code coverage, and the quality of reported bugs. (One very rough idea to use a stronger solver while still keeping the analyzer performant would be to only use it for false positive refutation.)
On a virtualized i5-2500k system, I get the following performance numbers on the testsuite, with a Clang/LLVM BUILD_SHARED_LIBS release build and Z3 https://github.com/z3prover/z3/commits/1bfd09e16bd9aaeae8a6b2841a2e8da615fdcda4 (post-4.5.0):
RangeConstraintManager: 13.46 sec
Z3ConstraintManager (plugin): 416.5 sec
Z3ConstraintManager (built-in): 414.6 sec
Z3ConstraintManager (built-in, with below changes): 303.92 sec
Z3ConstraintManager (built-in, with below changes and no simplify): 250.71 sec
In addition to the caveats above, the other more practical problem is that performance is really bad on medium-to-large codebases. I did some testing on gmp, mpfr, libvpx, daala, libsdl, and opus, and for all of these projects, with static analysis enabled (via scan-build, and a release-mode Clang/LLVM), the parallel builds wouldn't finish even after a week on a Xeon E5-2620v3 workstation, and there were a lot of false positives that were hard to dig through. I did end up finding and reporting one bug in mpfr (https://sympa.inria.fr/sympa/arc/mpfr/2017-01/msg00012.html), but it's not floating-point related, and I didn't check whether it could also be found by vanilla Clang/LLVM. With regard to the performance, I believe this was due to multiple factors (correct me if I'm wrong):
- The Z3 solver timeout is set to 15 sec, which is probably too large (for an example that fails, see https://github.com/Z3Prover/z3/issues/834). But when timeouts do occur, they can cause runtime failures in the static analyzer: e.g. the assertion to avoid overconstrained states (in ConstraintManager.h), or underconstrained states (in BugReporterVisitors.cpp).
- The symbolic expression complexity parameter (MaxComp) is set to 10000, which is probably also too large. I'm guessing that overly-complex constraints are being generated, which can't be meaningfully solved and just bog down the analyzer.
- Within the analyzer, states are removed and iterated one-by-one off of the internal worklist, in a single thread in a single process. I'm guessing that this means the analyzer ends up waiting on Z3 most of the time. This is also why I started running analyses in parallel on different projects, because the analysis for each project would occupy just one core.
- The same queries end up being sent to the constraint manager despite no meaningful change in the program state, through either assume() or getSymVal(). But for expensive queries, this means that a lot of time/compute is just wasted. I think this could be improved by implementing an SMT query cache, and potentially by only adding relevant constraints to the solver state. I actually have a draft attempt at a simple query cache implementation (https://github.com/ddcc/clang/commit/e0236ff8f7b9c16dd29e8529420ab14b4309c3e6), but it's very hacky, isn't entirely memory-safe, caused some testsuite failures, and I ran out of time to dig further into it.
How different are the results? Is it the case that you get more warnings with Z3 in most cases? Would it be possible to divide the tests into ones that work similarly with both solvers and ones that are only expected to report warnings with Z3? I know this won't work in general, but it might be much cleaner than polluting every test with a ton of #ifdefs.
In general, there are more warnings that appear because the number of false negatives is reduced. Ignoring changes to the testcases caused by infrastructure-related issues, this entire series of patches modifies around 15 testcases, so there won't be a lot that need the conditional handling.
I'd also like to second Ryan's suggestion to look at his patch to add support for STP. It was very close to landing other than some build system integration issues.
I actually looked through his patch while writing this one, I think the general recursive approach is pretty similar, as well as the plugin-stub implementation.
I would definitely split the Z3Expr into expr/solver/model classes. Several advantages: plugging in new solvers becomes easier, and reference counting can be handled more safely and automatically. Right now your solver+model-related code handles ref-counting "by hand", which can easily lead to bugs.
The base ref-counting implementation for expr/Z3Expr is essentially the same as the one in the Z3 C++ API, except that there is no non-move/copy/named constructor. It would be good to have some abstraction for the solver and model as well, but since they are used in only two or three functions, I originally used manually ref-counting to save time.
I would add an rvalue constructor to the expr class to avoid ref-counting when unneeded.
Done (see new revision).
Function Z3Expr::fromData() seems to be able to create arbitrarily-sized bit-vectors. I would limit it to e.g. 512 bits or something like that. Z3 is super slow with huge bit-vectors. If you feel fancy (for a following version of this feature), you could encode this information lazily (based on the model).
That's true, but in practice I don't think Clang has any integer types larger than 128-bit. The created bitvector size should be equivalent to the underlying type size for the target platform.
Z3Expr::fromInt() -- why take a string as input and not a uint64_t?
To avoid any limitations on the type size of the input value, I ended up using strings. Practically, 128-bit integers may be the only reason not to use uint64_t, although I admit I haven't tested with 128-bit integers.
Why call simplify before asserting a constraint? That's usually a pretty slow thing to do (in my experience).
Done. This has been my first time using Z3; my original thought was to simplify constraints as early as possible, because they are stored in the program state and propagate as symbolic execution continues. But the numbers show (at least for small programs) that this is less efficient.
You should replace Z3_mk_solver() with something fancier. If you don't feel fancy, just use Z3_mk_simple_solver() (simple is short for simplifying btw). If you know the theory of the constraints (e.g., without floats, it would be QF_BV), you should use Z3_mk_solver_for_theory() (or whatever the name was). If you feel really fancy, you could roll your own tactic (in my tools I often have a tactic class to build my own tactics; Z3 even allows you to case split on the theory after simplifications)
Done. My earlier understanding from https://github.com/Z3Prover/z3/issues/546 was to stick with Z3_mk_solver(), since I'm not familiar with tactics.
Instead of asserting multiple constraints to the solver, I would and() them and assert just once (plays nicer with some preprocessors of Z3)
Done.
Timeout: SMT solvers sometimes go crazy and take forever. I would advise to add a timeout to the solver to avoid getting caught on a trap. This timeout can be different for the two use cases: check if a path is feasible (lower timeout), or check if you actually have a bug (higher timeout)
There's currently a 15 sec timeout in the Z3ConstraintManager constructor, but that'd be a useful improvement to implement.
Also, I didn't see anything regarding memory (I might have missed yet). Are you using the array theory? If so, I would probably advise benchmarking it carefully since Z3's support for arrays is not great (even though it has improved a bit in the last year or so).
No, all the memory modeling is handled through Clang's StoreManager (the default implementation is RegionStore). I'm not familiar with how that part works, but there are definitely some false positives there; I recall making a mailing list post about it a few months ago. As future work, it'd be interesting to compare that with Z3's array theory.
BTW, adding small amount of constant folding in the expression builder is usually good, since you save time in Z3 preprocessor. Especially if you want to go for smallish timeouts (likely). If you want to be fancy, you can also canonicalize expressions like Z3 likes (e.g., only ule/sle, no slt, sgt, etc).
That'd definitely be useful. It seems that using Z3_simplify() is expensive, so my code should do the simpler things like constant folding and canonicalization.
This is super-exciting work!
Some high-level notes:
- The running-time numbers you report are very high. At a ~20x slowdown, the benefits from improved solver reasoning will have to be very, very large to justify the performance cost. It is worth thinking about ways to limit the number of solver calls. Ultimately, I suspect @zaks.anna's suggestion of using z3 for refutation is the right way to go -- this would invoke the expensive solver only after the cheaper solver has found a potentially buggy path and try to show that it is infeasible.
- That said, I think there are still significant optimization opportunities. It looks like when performing a query you create a new solver for each set of constraints. My understanding (and perhaps @nlopes can correct me here) is that these days Z3 is quite good at incremental solves so you may be leaving a lot of performance on the table. For example, in getSymVal() you create one solver to query for a satisfying assignment and then later create a completely new one to determine whether it is the only satisfying assignment. Since those two queries share all of their conjuncts but one it might be faster to reuse the first solver and add the new assertion for the second. Similarly, since analyzer exploration is a DFS, you could imagine building up the path condition incrementally. Z3 has solver APIs for pushing and popping assertions. (Again, @nlopes may have a better idea of whether this would pay off in practice.)
- It would be good to measure the increased peak memory usage with the Z3 constraint solver. The analyzer is often used as part of an IDE and so it needs to be able to run in memory constrained environments (such as laptops). Further, since multiple invocations of clang are often run simultaneously, memory is often a more precious resource than CPU time.
- As @a.sidorin noted, there is a significant test and maintenance cost to keeping two constraint managers around. Since the testing matrix would double, anything we can do to limit the need to modify/duplicate tests would be a huge savings. Is it possible to use lit substitution to call the analyzer twice for each run line: once with the range constraint manager and once with z3 (for builds where z3 is requested)? This, combined with automatically adding the #defines that @NoQ suggested would provide a mechanism to share most tests between the constraint managers.
- I don't think it is reasonable to ask all analyzer users or even all clang bots to install Z3, so we'll have to make sure the tests degrade gracefully when Z3 is not available. That said, we could set up dedicated Z3 bots to make sure that there is continuous integration for the feature.
- Right now the CMake uses find_package and builds with Z3 if it is found to be installed on the building system. I think this is too aggressive. It would be better to have the build explicitly opt in to using Z3. Without this, a user could accidentally build a clang that dynamically links to their local Z3 but then fails to launch with a load error when distributed. Similarly, it would be good to be able to explicitly set the location of the Z3 to be used at build time for systems with multiple Z3s installed.
- That said, I think there are still significant optimization opportunities. It looks like when performing a query you create a new solver for each set of constraints. My understanding (and perhaps @nlopes can correct me here) is that these days Z3 is quite good at incremental solves so you may be leaving a lot of performance on the table. For example, in getSymVal() you create one solver to query for a satisfying assignment and then later create a completely new one to determine whether it is the only satisfying assignment. Since those two queries share all of their conjuncts but one it might be faster to reuse the first solver and add the new assertion for the second. Similarly, since analyzer exploration is a DFS, you could imagine building up the path condition incrementally. Z3 has solver APIs for pushing and popping assertions. (Again, @nlopes may have a better idea of whether this would pay off in practice.)
I agree that the performance is the main problem, and that there are still a lot of performance optimizations available. I've separated the Z3 solver and model into separate classes, and reuse the solver now in getSymVal() and checkNull(). I looked at using the push/pop approach briefly when I started writing the solver, but since the static analyzer itself seems to have an internal worklist, I wasn't sure if the state traversal order is deterministic and predictable, so I didn't try to reuse a single solver across all program states. This is the new timing:
Z3ConstraintManager (built-in, previous changes and solver state reuse): 202.37 sec
- It would be good to measure the increased peak memory usage with the Z3 constraint solver. The analyzer is often used as part of an IDE and so it needs to be able to run in memory constrained environments (such as laptops). Further, since multiple invocations of clang are often run simultaneously, memory is often a more precious resource than CPU time.
On the following testcases, the memory usage increase is around 18%. Using GNU time, measuring the max RSS in kbytes:
edges-new.mm (RangeConstraintManager): 45184
edges-new.mm (Z3ConstraintManager): 53568
malloc-plist.c (RangeConstraintManager): 40264
malloc-plist.c (Z3ConstraintManager): 47888
- As @a.sidorin noted, there is a significant test and maintenance cost to keeping two constraint managers around. Since the testing matrix would double, anything we can do to limit the need to modify/duplicate tests would be a huge savings. Is it possible to use lit substitution to call the analyzer twice for each run line: once with the range constraint manager and once with z3 (for builds where z3 is requested)? This, combined with automatically adding the #defines that @NoQ suggested would provide a mechanism to share most tests between the constraint managers.
I think this is reasonable; the only remaining question is how to identify when lit should call the analyzer twice, e.g. match on path against test/Analysis? Also, note that the argument passed to clang (-Xclang -analyzer-constraints=z3 vs. -analyzer-constraints=z3) needs to change depending on whether the actual compiler or just the compiler driver is invoked.
- Right now the CMake uses find_package and builds with Z3 if it is found to be installed on the building system. I think this is too aggressive. It would be better to have the build explicitly opt in to using Z3. Without this, a user could accidentally build a clang that dynamically links to their local Z3 but then fails to launch with a load error when distributed. Similarly, it would be good to be able to explicitly set the the location of the Z3 to be used at build time for systems with multiple Z3s installed.
Done. I believe the location can already be set by overriding Z3_INCLUDE_DIR and/or Z3_LIBRARIES in CMake.
Add CMake CLANG_BUILD_Z3 option, add Z3Model and Z3Solver classes, reuse solver in checkNull() and getSymVal()
The improvement here (200s vs. 250s) makes me think that some form of caching/incrementalization could pay off.
On the following testcases, the memory usage increase is around 18%. Using GNU time, measuring the max RSS in kbytes:
edges-new.mm (RangeConstraintManager): 45184
edges-new.mm (Z3ConstraintManager): 53568
malloc-plist.c (RangeConstraintManager): 40264
malloc-plist.c (Z3ConstraintManager): 47888
That's not as bad as I feared!
- As @a.sidorin noted, there is a significant test and maintenance cost to keeping two constraint managers around. Since the testing matrix would double, anything we can do to limit the need to modify/duplicate tests would be a huge savings. Is it possible to use lit substitution to call the analyzer twice for each run line: once with the range constraint manager and once with z3 (for builds where z3 is requested)? This, combined with automatically adding the #defines that @NoQ suggested would provide a mechanism to share most tests between the constraint managers.
I think this is reasonable; the only remaining question is how to identify when lit should call the analyzer twice, e.g. match on path against test/Analysis? Also, note that the argument passed to clang (-Xclang -analyzer-constraints=z3 vs. -analyzer-constraints=z3) needs to change depending on whether the actual compiler or just the compiler driver is invoked.
We could add a new lit substitution '%clang_cc1_analyze' that expands to '%clang_cc1 -analyze -analyzer-constraints=foo -DANALYZER_HAS_CONSTRAINT_MANAGER_FOO' and change the analyzer tests to use it.
My sense is that most of the tests that legitimately call the driver directly are testing functionality that should be independent of the constraint manager. (Though there are some that call the driver for no good reason -- for these it should be easy to rewrite the RUN line.)
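As a rough illustration of the proposal, here is a stdlib-only Python sketch of how such a substitution table would expand a RUN line. The `apply_substitutions` helper and the `-DANALYZER_HAS_CONSTRAINT_MANAGER_RANGE` define are hypothetical stand-ins (the discussion only fixes the `...FOO` naming scheme), and lit's real implementation applies regex-based substitutions rather than plain string replacement:

```python
# Sketch of how a lit-style substitution table expands a RUN line.
# The substitution pairs mirror the '%clang_cc1_analyze' proposal above;
# apply_substitutions is a hypothetical helper, not part of lit itself.

def apply_substitutions(run_line, substitutions):
    """Apply (pattern, replacement) pairs in order, first match wins per pattern."""
    for pattern, replacement in substitutions:
        run_line = run_line.replace(pattern, replacement)
    return run_line

substitutions = [
    # Longer pattern must come first, or '%clang_cc1' would clobber it.
    ("%clang_cc1_analyze",
     "%clang_cc1 -analyze -analyzer-constraints=range"
     " -DANALYZER_HAS_CONSTRAINT_MANAGER_RANGE"),
    ("%clang_cc1", "clang -cc1"),
]

expanded = apply_substitutions("%clang_cc1_analyze %s", substitutions)
```

Note that ordering matters: `%clang_cc1_analyze` has to be substituted before `%clang_cc1`, which is also why lit orders its own substitution table carefully.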
Regarding incremental solving with Z3 (or with most SMT solvers in general), let me just lower the expectations a bit:
In Z3, when you do push(), there are a few things that change immediately: 1) it uses a different SAT solver (one that supports incremental reasoning), and 2) some preprocessing steps are disabled completely.
The advantage of incremental solving is that it shares the state of the SAT solver between queries. On the other hand, Z3 disables a bunch of nice preprocessors, and also it switches to eager bit-blasting. So whether going incremental pays off depends: I've seen cases where it's faster to create a new solver for each query and cases where incremental is a good thing.
It's in our plans to "fix" Z3 to do better preprocessing in incremental mode, but there's no ETA at the moment to start this work.
Regarding incremental solving with Z3 (or with most SMT solvers in general), let me just lower the expectations a bit: <snip>
Ok, that is good to know. It seems that the performance benefits of incremental solving are unclear, and would be nontrivial to implement (maybe store the states in a DAG, then pop back to the lowest common ancestor from the previous state, and push down to the new state).
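The pop-back-to-ancestor idea can be sketched in a few lines of Python (a hypothetical helper, independent of the Z3 API): given the constraint paths of the previous and new states, the longest common prefix is the lowest common ancestor; everything past it on the old path must be popped, and everything past it on the new path must be pushed.

```python
# Hypothetical sketch of the DAG idea above: represent each analyzer
# state as the list of constraints on the path from the root, and
# compute how many pop()s and which push()es an incremental solver
# would need to move from the previously solved state to the new one.

def transition(prev_path, new_path):
    """Return (num_pops, constraints_to_push) between two state paths."""
    # Length of the longest common prefix = depth of the lowest common ancestor.
    common = 0
    while (common < len(prev_path) and common < len(new_path)
           and prev_path[common] == new_path[common]):
        common += 1
    pops = len(prev_path) - common   # pop back to the LCA
    pushes = new_path[common:]       # push down to the new state
    return pops, pushes

pops, pushes = transition(["x > 0", "y < 5"], ["x > 0", "y > 9", "z == 1"])
```

Whether this wins anything in practice still depends on the incremental-mode tradeoffs described above, since each push() keeps Z3 in its weaker preprocessing configuration.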
The improvement here (200s vs. 250s) makes me think that some form of caching/incrementalization could pay off.
I'm optimistic that implementing an SMT query cache would significantly improve performance (for both constraint managers), but the current version of that patch has some memory-safety problems (since ProgramStateRef is a reference-counting pointer), and I currently don't have the time to fix it up.
We could add a new lit substitution '%clang_cc1_analyze' that expands to '%clang_cc1 -analyze -analyzer-constraints=foo -DANALYZER_HAS_CONSTRAINT_MANAGER_FOO' and change the analyzer tests to use it.
I've updated the current version of this diff to use this approach, though there are still the two remaining testcase failures mentioned earlier (reference.cpp, ptr-arith.c) that I'm not sure how to handle. I also had to remove the argument -fsyntax-only in some testcases, because the argument seems to need to appear before -analyze (which isn't possible using the substitution-based approach), though it doesn't seem to affect the outcome of the testcases. However, I haven't been able to figure out how to get llvm-lit to execute the same statement twice with different substitutions. From lit.cfg, it seems that I can only define additional substitutions, and not execute a test multiple times. I'm not familiar with the test infrastructure; do you have any suggestions?
Let me give just 2 more Z3-related suggestions:
- instead of re-creating the solver, it might be faster to do Z3_solver_reset
- "once in a while" it might be helpful to delete everything (all solvers, asts, context) and call Z3_reset_memory. Z3's small object pool is not very good and keeps growing /ad infinitum/, so cleaning it up once in a while is a good thing (improves cache usage and reduces memory consumption).
BTW, you may want to call Z3_finalize_memory at exit to keep clang valgrind/asan-clean. (Z3 keeps a lot of internal buffers otherwise)
Reuse Z3_solver globally, cleanup Z3 memory on exit, disable Z3 proof generation, move Z3_context to separate Z3Context wrapper class
Thanks, the most recent revision now reuses the solver object, and frees Z3 memory at exit. I'm skipping the reset memory part for now, because my understanding is that a new instance of clang -cc1 will be spun up for each compilation unit anyway, and it'd be tricky to find a suitable point to do that while maintaining consistency.
On the testsuite, the runtime is now down to 185 secs, and a valgrind test is much cleaner.
I added support for a callback field in lit's configuration (see D29684), which is used to execute each testcase once for each supported constraint solver backend at runtime. I believe this resolves all remaining issues, except for the remaining two testcase failures:
- The testcase Analysis/reference.cpp currently fails, because the RangeConstraintManager has an internal optimization that assumes references are known to be non-zero (see RangeConstraintManager.cpp:422). This optimization is probably in the wrong place, and should be moved into the analyzer and out of the constraint solver. Unless anyone objects, I plan to modify this testcase so that this result is fine for the z3 solver backend.
- The testcase Analysis/ptr-arith.c currently fails, because handling of ptrdiff_t is somewhat tricky. This constraint manager interface maps ptrdiff_t to a signed bitvector in Z3, and pointers to unsigned bitvectors in Z3. However, since this testcase compares the pointer values both directly and indirectly via ptrdiff_t, if a < b is true using a signed comparison, the same is not necessarily true using an unsigned comparison. Likewise, unless anyone objects, I plan to whitelist this result for the z3 solver backend.
Thanks for updating the tests to be able to run both the z3 and range-constraint managers! I think this will make it much easier to ensure the z3-backed manager continues to work as expected moving forward. I suggested an alternative approach inline for running the tests under both managers, which you may wish to consider.
This sounds reasonable to me.
- The testcase Analysis/ptr-arith.c currently fails, because handling of ptrdiff_t is somewhat tricky. This constraint manager interface maps ptrdiff_t to a signed bitvector in Z3, and pointers to unsigned bitvectors in Z3. However, since this testcase compares the pointer values both directly and indirectly via ptrdiff_t, if a < b is true using a signed comparison, the same is not necessarily true using an unsigned comparison. Likewise, unless anyone objects, I plan to whitelist this result for the z3 solver backend.
I think in both cases whitelisting is the right thing to do -- just provide a comment in the test explaining why. If you have a sense of the right way to fix the problem, please add a // FIXME: in the test with a short description.
There are some minor comments inline. Also: it is important to make sure the build still works and the tests still pass both when Z3 is not present and when it is present but not enabled.
CMakeLists.txt | ||
---|---|---|
371 ↗ | (On Diff #87519) | I think it would be good to change the name of this setting to have the prefix "CLANG_ANALYZER_...". This will help make it clear in, for example, the ccmake interface, that the option is only relevant to the analyzer. |
lib/StaticAnalyzer/Core/CMakeLists.txt | ||
62 ↗ | (On Diff #87519) | This needs something like: if(CLANG_BUILD_Z3 AND CLANG_HAVE_Z3) target_include_directories(clangStaticAnalyzerCore SYSTEM PRIVATE ${Z3_INCLUDE_DIR} ) endif() so that the analyzer core can find the Z3 headers when they are installed outside of the usual system include paths. |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
17 ↗ | (On Diff #87519) | Is it possible to expose only one Z3-related definition here? Does the C++ need to understand the distinction between CLANG_HAVE_Z3 and CLANG_BUILD_Z3? Could the CMake expose one master build setting that is true when other two are both true? |
21 ↗ | (On Diff #87519) | These 'using' declarations need to be moved outside the #if -- otherwise there is a compile error at the definition of CreateZ3ConstraintManager() when building without Z3. |
test/Analysis/lit.local.cfg | ||
2 ↗ | (On Diff #87519) | Here is a suggestion that would avoid needing to block changes to 'lit' itself. lit allows you to create a different Test subclass in each test directory. You could create a custom subclass just for analyzer tests (we already do this for unit tests): class AnalyzerTest(lit.formats.ShTest, object): def execute(self, test, litConfig): result = self.executeWithAnalyzeSubstitution(test, litConfig, '-analyzer-constraints=range') if result.code == lit.Test.FAIL: return result if test.config.build_z3 == 'ON' and test.config.have_z3 == '1': result = self.executeWithAnalyzeSubstitution(test, litConfig, '-analyzer-constraints=z3 -DANALYZER_CM_Z3') return result def executeWithAnalyzeSubstitution(self, test, litConfig, substitution): saved_substitutions = list(test.config.substitutions) test.config.substitutions.append(('%analyze', substitution)) result = lit.TestRunner.executeShTest(test, litConfig, self.execute_external) test.config.substitutions = saved_substitutions return result config.test_format = AnalyzerTest(config.test_format.execute_external) Saving and mutating the substitutions is kind of janky, but it would avoid blocking this commit on an expanded lit interface. |
Dominic, this (D28952) and D26061 look good to me! Let's get these two committed! We'd like to get to a place where in-tree incremental development can be done on the Z3 constraint manager.
On my machine total Analysis test time increases from 25s to 90s when the Z3 tests are enabled using a static Z3 library. Testing time when Z3 is not enabled does not change measurably.
Here's how I suggest staging these in. Please give the bots time to settle after each commit.
- Apply the refactoring changes from D26061. This is NFC (no functional change).
- Split out the test changes from D28952. This would change the tests to use 'clang_analyze_cc1' and change the lit.cfg to add a normal substitution for it. (But not call the analyzer twice). This is also NFC.
- Add the new constraint solver and the lit/CMake changes for the rest of D28952.
This will separate the stuff that is unlikely to break from the build system changes, which might need to be reverted quickly.
Once this is done, we (I) will add a bot that automatically runs the Z3 tests to make sure we catch regressions that affect the Z3 solver.
With the bot in place we will review D28953, D28954, and D28955 separately.
Does that sound good to you?
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
---|---|---|
60 ↗ | (On Diff #89777) | Sorry for being a bit late to the party, I was wondering whether this timeout can be a source of non-determinism. The constraint solver might get lucky or unlucky based on the load or the underlying hardware to solve some of the constraints. We might end up with different results over different runs, which can be confusing for the users. Is there any other way to set something like a timeout, such as limiting the number of steps inside the solver? |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
---|---|---|
60 ↗ | (On Diff #89777) | I believe Z3 uses various heuristics internally, so the exact execution trace can differ between multiple executions. However, I don't believe that this would be a common issue, since queries on fixed-width bitvectors are decidable. The 15 sec timeout is intended to prevent a single query from stalling the static analyzer; I've only experienced this with complex queries over floating-point types (e.g. D28953 + D28954 + D28955), which can degenerate to bitblasting. I don't have the exact numbers right now, but given that the 200+ tests in the testsuite complete in ~90 secs with this patch, the average time per test is < 0.5 sec, which includes tens if not hundreds of individual SMT queries. As far as I know, this timeout parameter is the only supported configuration parameter. I'd also like to point out that this solver isn't used unless the user both builds with it and specifies it at runtime. |
I'm a little late to the review but hopefully I'll have useful comments. Nice work @ddcc.
I looked around for a generic smt-lib interface earlier, but couldn't find much available, and since I wanted floating-point arithmetic support, I ended up just directly using the Z3 C interface (the C++ interface uses exceptions, which causes problems). As far as I know, CVC4 doesn't have built-in floating-point support. But longer term, I'd agree that this backend should be made more generic.
FYI I've been working on floating point support in KLEE and have extended it to support floating point (note however only the Z3 backend actually supports consuming floating point constraints). I've not yet open sourced what I've done as I'm not entirely happy with the design but if there is interest we could see if we could figure out a way of pulling klee::Expr (and the solver bits) out of KLEE to make a standalone library. Note there is a project called metaSMT that uses template meta programming to give the same interface to multiple solvers. KLEE does support it but I'm not familiar with it.
Also using Z3's C interface is the way to go (compared to Z3's C++ interface). We also use this in KLEE.
KLEE has a few optimization ideas that you could consider implementing that certainly help in the context of symbolic execution.
- Constraint independence. This splits a set of constraints into independent constraints and solves them separately. This can be useful for further processing (e.g. increases the likelihood of a cache hit)
- "CounterExampleCache". The naming is rather unfortunate and confusing (because KLEE works in terms of validity rather than satisfiability). The idea is to cache models for solver queries that are SAT (i.e. maintain a mapping from queries to satisfying assignments, with a sentinel to indicate when a query does not have a satisfying assignment) and, when handling future queries:
- If the new query is in the existing cache: if it has a model, it is SAT; otherwise it is UNSAT.
- If the new query is a superset of an existing query in the cache then: if there was no model for the cached query, then the superset query is also UNSAT. If there is a model for the original query, try substituting it into the query (along with deterministic values for symbolic values in the new query that aren't in the model) to quickly check if the cached model also satisfies the superset. If this fails, then call the real solver.
- If the new query is a subset of an existing query in the cache then: if there is a model for the cached query, then that model will also satisfy the new query. If there is no model for the cached query, nothing can be concluded this way (a subset of an unsatisfiable query may still be satisfiable).
KLEE also does very aggressive constant folding and expression canonicalization to try to get more cache hits and have smaller constraints.
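The subset/superset reasoning described above can be sketched in Python. This is a deliberately simplified, hypothetical `QueryCache`; KLEE's real CounterExampleCache additionally tries substituting a cached model into a superset query before falling back to the solver, which is omitted here:

```python
# Simplified sketch of the subset/superset cache logic described above.
# A query is a set of constraints; the cache maps a frozenset of
# constraints to a model (here just a dict) or None for UNSAT.

class QueryCache:
    def __init__(self):
        self.cache = {}  # frozenset(constraints) -> model dict, or None

    def record(self, constraints, model):
        self.cache[frozenset(constraints)] = model

    def lookup(self, constraints):
        """Return (hit, model) using exact, subset, and superset reasoning."""
        query = frozenset(constraints)
        if query in self.cache:
            return True, self.cache[query]
        for cached, model in self.cache.items():
            # A superset of an UNSAT query is still UNSAT.
            if model is None and query >= cached:
                return True, None
            # A subset of a SAT query is satisfied by the same model.
            if model is not None and query <= cached:
                return True, model
        return False, None

cache = QueryCache()
cache.record({"x > 0", "x < 10"}, {"x": 5})   # SAT, with a model
cache.record({"y > 0", "y < 0"}, None)        # UNSAT
```

As the text notes, the payoff of such a cache grows with canonicalization: the more aggressively constraints are constant-folded and normalized, the more often two queries compare equal or as subsets.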
CMakeLists.txt | ||
---|---|---|
188 ↗ | (On Diff #89777) | @ddcc It is of course up to you but I recently added support for using libz3 directly from CMake, via its own CMake config package. You only get this if Z3 was built with CMake so you might not want this restriction. This feature has only just landed though and might not be sufficient for your needs. If you take a look at Z3's example projects they are now built with this mechanism when building with CMake. If you are interested I'd be more than happy to work with you to get this feature ready for your needs in upstream Z3 so you can use it here. |
include/clang/Config/config.h.cmake | ||
42 ↗ | (On Diff #89777) | Do you really want such a specific name? How about CLANG_ANALYSER_HAS_SMT_SOLVER? |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
70 ↗ | (On Diff #89777) | @ddc |
81 ↗ | (On Diff #89777) | @ddcc We also have a handy dump() method on Z3ASTHandle and Z3SortHandle for debugging. |
113 ↗ | (On Diff #89777) | @ddcc Of course this is fine for now but I think we really need to fix the APFloat API. I use it heavily myself and I've come across problems with it such as this one. |
160 ↗ | (On Diff #89777) | 128 is actually ambiguous. It could also be ppc_fp128. I had the same problem in KLEE and for now I just assumed IEEEQuad like you have done, but at the very least we should leave a comment here noting that this should be fixed somehow. |
438 ↗ | (On Diff #89777) | @ddcc Isn't this repeating some of the logic in fromBinOp()? |
559 ↗ | (On Diff #89777) | @ddcc I'm not convinced this is a good idea. Printing as a decimal string could lead to rounding errors. I think you're better off getting an APInt from the APFloat, constructing a bitvector constant from those bits and then using Z3_mk_fpa_to_fp_bv() to get a Z3Expr of the right sort. That way we are being bit-wise accurate. |
581 ↗ | (On Diff #89777) | @ddcc Again I don't think the use of strings here is a good idea. You can do this in a bit-precise way by getting the bits, making an APInt from that and then making an APFloat from that. |
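The bit-precise alternative can be illustrated without Z3: reinterpret the value's raw bits instead of printing and re-parsing a decimal string. In this Python sketch, `struct` plays the role that `APFloat::bitcastToAPInt()` and `Z3_mk_fpa_to_fp_bv()` would play on the C++/Z3 side:

```python
# Illustration of the bit-precise conversion suggested above: instead of
# printing a float as a decimal string and re-parsing it, move the raw
# bits.  struct here stands in for APFloat::bitcastToAPInt() plus
# Z3_mk_fpa_to_fp_bv() in the actual C++ code.
import struct

def double_to_bits(value):
    """Reinterpret an IEEE-754 binary64 double as a 64-bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]

def bits_to_double(bits):
    """Reinterpret a 64-bit unsigned integer as an IEEE-754 binary64 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

x = 0.1  # not exactly representable in binary64
assert bits_to_double(double_to_bits(x)) == x  # lossless round-trip
```

The round-trip is exact for every bit pattern, including NaN payloads and signed zeros, which is precisely what a decimal-string detour cannot guarantee.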
CMakeLists.txt | ||
---|---|---|
188 ↗ | (On Diff #89777) | Sorry that URL should be https://github.com/Z3Prover/z3/pull/926 |
Use direct bitcasting instead of string conversion for APFloat casting, add reference counting for Z3_sort, eliminate some duplicate code
Thanks for the feedback! My main constraint is that the results from the floating-point analysis weren't very interesting (see #652894), so I'm not actively working on further development.
FYI I've been working on floating point support in KLEE and have extended it to support floating point (note however only the Z3 backend actually supports consuming floating point constraints). I've not yet open sourced what I've done as I'm not entirely happy with the design but if there is interest we could see if we could figure out a way of pulling klee::Expr (and the solver bits) out of KLEE to make a standalone library. Note there is a project called metaSMT that uses template meta programming to give the same interface to multiple solvers. KLEE does support it but I'm not familiar with it.
I agree that it'd be useful to move to a more generic SMT interface, potentially SMT-LIB instead of something specific to a solver, but this is more of a long-term goal at the moment.
KLEE has a few optimization ideas that you could consider implementing that certainly help in the context of symbolic execution...
Likewise, I think adding a constraint cache and implementing performance optimizations would be the next step to getting this enabled by default, and perhaps deprecating the range constraint manager, but the implementation is a little tricky because there's state also being stored in the ProgramState object. Since those are reference counted, there's a little more work involved than just adding an intermediate layer with e.g. an std::map. But again, it's not something I have the time for at the moment.
CMakeLists.txt | ||
---|---|---|
188 ↗ | (On Diff #89777) | I think this is useful, and upstream z3 has been in need of better packaging. But until it's used by default over the current python script and the z3 folks actively maintain it, I'm a little hesitant to depend on it. |
include/clang/Config/config.h.cmake | ||
42 ↗ | (On Diff #89777) | There's been a bit of back and forth over this name in previous revisions, so I'm a little hesitant to change it. In essence, there are two CMake variables, one for whether z3 is present, and another for whether the user requested to build with z3, which are exported to a single C-level definition that is true iff both CMake variables are true. |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
70 ↗ | (On Diff #89777) | I agree that it is an inherent limitation, but since the static analyzer itself is only single-threaded anyway, the entire analyzer will need significant changes before this becomes a problem. |
81 ↗ | (On Diff #89777) | Originally, my understanding was that reference counting wasn't necessary for Z3_sort objects, based on the API documentation, and the lack of Z3_sort_inc_ref()/Z3_sort_dec_ref() functions. But, taking another look at the z3 source code, it seems that Z3_sort is cast directly to Z3_ast, so I think this does need to be fixed. |
113 ↗ | (On Diff #89777) | Yeah, both the APFloat and Sema APIs need some work to be more usable from other areas of Clang. This function wasn't originally present in earlier revisions, but at some point the fltSemantics definition was changed to only return references, and then this helper became necessary. |
160 ↗ | (On Diff #89777) | This is a common pitfall, but isn't a problem for this function in particular, because canReasonAbout() will exclude non-IEEE 754 types, e.g. PPC and 8087 FPU. |
438 ↗ | (On Diff #89777) | The cases for BO_LAnd and BO_LOr are duplicated, but since the two functions handle exclusively floating-point and non floating-point types, the overlap is fairly small. |
559 ↗ | (On Diff #89777) | From the comments inside APFloat::toString(), this should be precise enough to round-trip from float to string and back again with no loss of precision. However, I agree that bitcasting makes more sense, especially given the overhead of string conversion. |
581 ↗ | (On Diff #89777) | See above. |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
---|---|---|
559 ↗ | (On Diff #89777) | The APFloat comments claim this but[[ https://bugs.llvm.org/show_bug.cgi?id=24539 | I've found at least one case where this does not hold ]] in the current implementation. |
cmake/modules/FindZ3.cmake | ||
---|---|---|
3 ↗ | (On Diff #93588) | @ddcc Seeing as you don't want to use the new upstream Z3 CMake package config files I'll try to review this. Upstream Z3 does not come with pkg-config files for the native library so I'm wondering where you expect this to work. Does a Linux distro add their own .pc files? It looks like you only use these paths as hints, so this looks like it'll work even without the pkg-config files. |
5 ↗ | (On Diff #93588) | @ddcc This CMake variable is set but never used. Also based on the name it suggests that they are compiler definitions rather than other compiler flags. Does _CFLAGS_OTHER have those semantics? It's unclear from https://cmake.org/cmake/help/v3.7/module/FindPkgConfig.html#command:pkg_check_modules what they are supposed to be. To consume these flags you could add target_compile_definitions() and target_compile_options() to all users of Z3. A more elegant way of doing this is to create an imported library target (e.g. z3::libz3) and set compile definitions, compile options and include directories on this imported target with INTERFACE visibility so that these usage requirements implicitly propagate to all users of z3::libz3. |
cmake/modules/FindZ3.cmake | ||
---|---|---|
3 ↗ | (On Diff #93588) | See below. |
5 ↗ | (On Diff #93588) | I'm not very familiar with CMake, so I based it off of the FindLibXml2.cmake from the upstream CMake repository. The pkg-config part isn't used currently, but I left it in case z3 does get a proper package. |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
674 ↗ | (On Diff #93588) | The only problem I see now is that if an IEEEquad is fed in, it will generate a 128-bit bitvector, which will be truncated at this point. But the z3 API doesn't support retrieving a rational into anything larger than 64-bit, so perhaps converting it to string is the way to go here. |
619 ↗ | (On Diff #93587) | Since the rest of the code already handles APSInt, I just reused that instead of implementing another method for APInt. The overhead is one additional bool used to store the sign, which doesn't make much difference, since it needs to be specified anyway with APInt::toString*(). |
668 ↗ | (On Diff #93587) | I don't think so. Things change around a bit with D28954, but by the time this method is called, the casting should have already been handled elsewhere. |
lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp | ||
---|---|---|
674 ↗ | (On Diff #93588) | @ddcc I'm heavily against using the string method due to rounding problems it might introduce. The trick to handling types wider than 64 bits is to construct a new Z3_ast that does an Z3_mk_extract() to pull out the bits you want and then call Z3_get_numeral_uint64() on that. In the case of IEEE-754 quad you would need to pull out the lower order bits and then the higher order bits, then put these into an array and build an APInt from that. |
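The two-extract workaround can be sketched numerically with hypothetical helpers: read a 128-bit value out as two 64-bit words (what the two `Z3_mk_extract()`/`Z3_get_numeral_uint64()` calls would return) and reassemble them, as one would when building an `APInt` from an array of 64-bit words:

```python
# Sketch of the wide-numeral workaround described above: a 128-bit
# bitvector value is read out as two 64-bit halves (the results of the
# two Z3_mk_extract()/Z3_get_numeral_uint64() calls) and reassembled
# into a single integer, as one would build an APInt from an array of
# 64-bit words.  split_u128/join_u128 are hypothetical helper names.

def split_u128(value):
    """Split a 128-bit unsigned value into (low 64 bits, high 64 bits)."""
    mask = (1 << 64) - 1
    return value & mask, (value >> 64) & mask

def join_u128(low, high):
    """Reassemble a 128-bit value from its 64-bit halves."""
    return (high << 64) | low

quad_bits = (0x3FFF << 112) | 0x123456789ABCDEF  # an example 128-bit pattern
low, high = split_u128(quad_bits)
assert join_u128(low, high) == quad_bits
```

Unlike the decimal-string path, this stays bit-exact no matter how wide the bitvector is; wider types just need more 64-bit extracts.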
cmake/modules/FindZ3.cmake | ||
---|---|---|
5 ↗ | (On Diff #93588) | I'm not a huge fan of leaving around dead code. I don't have any intention of implementing pkg-config files for Z3 so I think it's unlikely that support will ever happen. So personally I'd rather you removed the pkg-config stuff here. |
Dominic: I don't have a bot set up yet, but let's get this committed. Thanks for all your hard work on this!
How can I make Z3ConstraintManager.cpp work on the command line? Or how to make z3 work?
You'll need a bleeding-edge build of Clang/LLVM, since this isn't available in any stable release yet. First, build or install a recent version of z3; libz3-dev in Debian/Ubuntu may work. Pass -DCLANG_ANALYZER_BUILD_Z3=ON to cmake, when building LLVM. Then, when running clang, pass -Xanalyzer -analyzer-constraints=z3.
Edit: I should mention, that unless at least D28955 and D28953 are also applied, you probably won't see any concrete improvement in precision of the static analyzer, but only a slowdown in performance.
Thank you for helping me build clang with z3. I have already applied the above updates. But now I have another question: when running clang with '-Xanalyzer -analyzer-constraints=z3' there is always a fatal error, and I don't know whether it is right to use a command like 'clang -Xanalyzer -analyzer-constraints=z3 -analyzer-checker=debug.DumpExplodedGraph test.cpp' to get the ExplodedGraph which is analyzed by the z3 constraint manager? Sorry to bother.
You probably want something along the lines of clang --analyze -Xanalyzer -analyzer-constraints=z3 -Xanalyzer -analyzer-checker=debug.ViewExplodedGraph test.c. In particular, there is a difference between using clang as a compiler driver or frontend (see FAQ entry on cc1 at https://clang.llvm.org/docs/FAQ.html). This is also mentioned in the checker development manual: https://clang-analyzer.llvm.org/checker_dev_manual.html .