Chromium's implementation of assertions (CHECK, DCHECK, etc.) are not
annotated with "noreturn", by default. This patch adds a model of the logical
implications of successfully executing one of these assertions.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
clang/lib/Analysis/FlowSensitive/Models/ChromiumCheckModel.cpp | ||
---|---|---|
17–18 | Please don't repeat the comment from the header in the cc file. | |
36 | ||
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
207 | Sorry, could you explain how this works? I think the flow condition should not be implying 'Foo' since we're supposed to ignore this unrelated Check() call. |
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
207 | Good catch -- that was a typo (and the test was failing -- I'd gone back and forth on the formulation and uploaded at the wrong point). |
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
122 | I wonder whether the models should actually be called by the framework at some point. void f() { std::optional<int> o(5); if (o) { // dead code here; } } In an ideal case, an analysis could use the std::optional modeling to realize that the code in the if statement is dead and use this fact to improve its precision. Explicitly request the modeling in the transfer function works OK when we only have a couple things to model. But it might not scale in the future. When we model dozens of standard types and functions we would not want all the analysis clients to invoke all the transfers for all the models individually. |
Thanks for the review!
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
122 | Agreed. It seems similar the problems that motivated DLLs back in the day. there's clearly a lot to be worked out here in terms of how best to support composition. It's probably worth a RFC or somesuch to discuss in more depth. |
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
122 | Having an RFC and some deeper discussions would be great. I also wonder whether modeling arbitrary Stmts is the right approach. The peculiarities of the language should probably be modelled by the framework itself without any extensions. Maybe we only want the modeling of certain function calls to be customizable? |
clang/include/clang/Analysis/FlowSensitive/Models/ChromiumCheckModel.h | ||
---|---|---|
15 | This is unnecessary. | |
clang/lib/Analysis/FlowSensitive/Models/ChromiumCheckModel.cpp | ||
49–55 | ||
51 | Shouldn't this be part of isCheckLikeMethod? | |
52 | ||
57 | ||
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
90 | Perhaps "Incorrect" instead of "Bad" and comment on what makes it incorrect? | |
133 | We're not testing with other standards so remove this? |
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
122 | Good question. It would be really nice if we could draw this line, but I have a bad feeling that it won't be so simple. :) Still, worth looking at our existing models, and new ones that we're developing, to see if we can find a clear "bounding box". What does CSA do in this regard? |
Why this should be maintained and developed by LLVM/Clang developers and not by Chromium?
That's a good question. I think the short answer, that skirts around the issue, is that this is targeted for use in an upcoming clang-tidy check, and we don't have any framework for clients developing their own pluggable models for individual checks. Chromium could find some way to patch their clang-tidy source, I suppose, but we'd rather not encourage that. This framework is new and under development and the benefit of accomodating Chromium (in this very small way) seems worth the feedback from them applying it to their codebase.
That said, we are *also* in discussions with them to change their implement of CHECK, etc. so as to obviate the need for this model at all.
clang/unittests/Analysis/FlowSensitive/ChromiumCheckModelTest.cpp | ||
---|---|---|
122 | In the CSA, there is no clear distinction between modeling and diagnostics, each check can do both. Historically, this turned out to be a bad idea. When we have a check that is doing both modeling and diagnostics, users can end up turning the check off due to some false positives and end up getting worse results from other checks because some critical piece of modeling is also turned off. (E.g., even if there are a couple of false positives from a std::optional check, it might still be beneficial to do the modeling in the background without the diagnostics because it might help discover infeasible paths and that can improve the precision of other checks). Nowadays, we try to make the distinction clear, some checks are modeling only and others are diagnostic only (they might still have their own modeling but they do not affect the global analysis state, i.e., the diagnostic only checks should not be able to affect each other). This distinction is currently a best effort approach and not enforced by any of the APIs. In CSA, the checker APIs are very powerful. E.g., if there is a pointer with an unknown value and we see a dereference, we can continue the analysis with the assumption that the pointer is not-null. These assumptions could be added by the framework itself or just a regular check. Over time, we are trying to move as much modeling to (non-diagnostic) checks as possible to keep the framework lightweight but most of the meat is still in the framework. To model libraries, we are using the evalCall callback: https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer/Checkers/CheckerDocumentation.cpp#L229 Roughly speaking the model looks like this:
The model above assumes that when a check return true it will end up modeling all the aspects of a function (like invalidation). A downside might be that it will not solve the composition, when multiple checks want to model the same function, well, the framework will just pick one of them. Also, modeling types is really challenging. E.g., if a modeling check models the constructor of a type but does not model the destructor, the framework will end up using the default modeling for the dtor. The problem is that, in that case the ctor was not modeled by the framework (but the modeling checker) so some invariants were not established. This can result in false positives or even crashes. Overall, it looks like modeling types is almost an all or nothing endeavor. If a check models at least one method of a type it is really likely that it will need to model at least a bunch more to ensure a good experience. At Microsoft, we have a similar framework to CSA called EspXEngine. In EspXEngine, we have different APIs for modeling and diagnostic so the API enforce that diagnostic checks cannot influence each other. Our model for the modeling checks is very similar to CSA but the idea is that the modeling checks are maintained by the authors of EspXEngine (while diagnostic checks could be written by anyone), so conflicting models are less of a problem. Both EspXEngine and CSA has problems chaining modeling checks. E.g., if CSA has modeling for unique_ptr and optional, optional<unique_ptr<int>> or unique_ptr<optional<int>> will not model every aspects of the inner type out of the box. |
This is unnecessary.