This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h | ||
---|---|---|
51 | There is a special class for this at clang/lib/Analysis/CFGStmtMap.cpp. That also does some special handling of the terminators. I wonder if we need to do something similar here (or just use that class as is?). | |
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp | ||
51 | This comment might be a bit hard to understand without a code example. Admittedly, including an example here could get very chatty, but we could refer to the name of the corresponding test. I wonder whether naming the blocks like The first successor of block 'A' would make this clearer. |
I am looking at the CFG and wonder if this change is what we what to do in the first place:
[B7 (ENTRY)] Succs (1): B6 [B1] 1: 0 2: (void)[B1.1] (CStyleCastExpr, ToVoid, void) Preds (2): B2(Unreachable) B3 Succs (1): B0 [B2 (NORETURN)] 1: ~Fatal() (Temporary object destructor) Preds (1): B3 Succs (1): B0 [B3] 1: [B6.2] ? [B4.3] : [B5.5] 2: int value = b ? foo() : Fatal().bar(); T: (Temp Dtor) [B5.2] Preds (2): B4 B5 Succs (2): B2 B1 [B4] 1: foo 2: [B4.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void)) 3: [B4.2]() Preds (1): B6 Succs (1): B3 [B5] 1: Fatal() (CXXConstructExpr, [B5.2], [B5.3], class Fatal) 2: [B5.1] (BindTemporary) 3: [B5.2] 4: [B5.3].bar 5: [B5.4]() Preds (1): B6 Succs (1): B3 [B6] 1: b 2: [B6.1] (ImplicitCastExpr, LValueToRValue, _Bool) T: [B6.2] ? ... : ... Preds (1): B7 Succs (2): B4 B5 [B0 (EXIT)] Preds (2): B1 B2
To summarize the problem, and let me know if I'm wrong:
- In the CFG the noreturn block works as expected, its only successor is the exit block, this is great
- The source of the problem is that we have a join node, where we do a merge BEFORE reaching the noreturn block, so we end up propagating the states from the noreturn block.
- This code tries to pattern match this particular pattern in the CFG and introduce a small degree of local path-sensitivity.
How resilient is this pattern matching? Will this work if we swap the branches of the ternary operator? Will this work if there are multiple dtors in either/both branches? Will this work if there are dtors coming from the condition? Will this work of I start nesting ternary operators? I think this needs more coverage.
Also, I wonder if concentrating on noreturn is the right approach here. The root cause of this problem has relatively little to do with noreturn, and we could probably come up with a poorly behaving analysis even without having a single noreturn destructor. The root cause is the overly eager merge in the CFG.
Some alternative approaches:
- Duplicate the B3 block to avoid having a join node in the CFG.
- Do a proper pattern matching, match blocks containing the dtor calls with the branches of the ternary, and only propagate the states from the corresponding branches. I.e., do the correct state propagation even when we have no noreturn dtors.
clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h | ||
---|---|---|
51 | The only downside I see is that CFGStmtMap needs a ParentMap, which is not cheap to make, but we don't need. Maybe we need to make it optional in CFGStmtMap? | |
clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h | ||
103 | I'd suggest a more verbose name like CFCtx, since Ctx is most often used in Clang for ASTContext. (Here and elsewhere in the patch.) | |
clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp | ||
2 | Please add a license comment. | |
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp | ||
47 | Use llvm::cantFail to unwrap the Expected? |
How resilient is this pattern matching?
Possibly it is not very robust. However it is very important for a certain style of precondition verification macros used by various Google projects. I can't find an exact open source copy, but they are generally similar to this one, but usually use a ternary instead of an if statement:
https://source.chromium.org/chromium/chromium/src/+/main:third_party/re2/src/util/logging.h
Also similar to this one, but with an object that has a noreturn destructor instead of inline assembly that causes a trap:
https://source.chromium.org/chromium/chromium/src/+/main:base/check.h
If it wasn't for this "curious" coding pattern that is stamped into many function thanks to the macro, we wouldn't be bothering with this pattern matching at this point.
I see. This makes sense as a temporary workaround, but in the long term I'd love to see a more general solution. I'm ok with this as is, but I'd love to see more tests that (possibly) does not provide the expected results along some TODOs and some comments about the specific coding patterns this is targeting.
I also agree that the current approach isn't robust. I think that a proper solution would involve patching the CFG because in some cases it seems to be incorrect. For example, the call to qux is incorrectly deemed to be unreachable in the CFG of the following code:
int foo(); class Fatal { public: ~Fatal() __attribute__((noreturn)); int bar(); int baz(); }; void qux(); void target(bool b1, bool b2) { int value = b1 ? foo() : (b2 ? Fatal().bar() : Fatal().baz()); qux(); }
I'd be happy to look into a solution for this as a follow up. I guess it would have an impact on other users of the CFG too.
clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h | ||
---|---|---|
51 | The main reason for me not to use CFGStmtMap right away is that based on some tests it seems to behave differently compared to what's implemented here. I plan to look into that in detail and see if there's an opportunity to replace the logic here. It shouldn't be too difficult to do that in a follow up as getStmtToBlock is used in a single place. | |
clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h | ||
103 | Replaced Ctx with CFCtx across the patch. | |
clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp | ||
51 | I updated the comment and added more tests. Hopefully, that would be good enough for now and we can replace all of this with a more principled solution shortly. | |
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp | ||
47 | Much better. Thanks! |
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp | ||
---|---|---|
193 | DYM "Noreturn" instead of "Virtual"? |
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp | ||
---|---|---|
193 | Right. Renamed it. |
There is a special class for this at clang/lib/Analysis/CFGStmtMap.cpp. That also does some special handling of the terminators. I wonder if we need to do something similar here (or just use that class as is?).