This is an archive of the discontinued LLVM Phabricator instance.

[clang][dataflow] Add support for noreturn destructor calls
ClosedPublic

Authored by sgatev on Dec 20 2021, 1:58 AM.

Details

Summary

This is part of the implementation of the dataflow analysis framework.
See "[RFC] A dataflow analysis framework for Clang AST" on cfe-dev.

Diff Detail

Event Timeline

sgatev created this revision.Dec 20 2021, 1:58 AM
sgatev requested review of this revision.Dec 20 2021, 1:58 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 20 2021, 1:58 AM
sgatev updated this revision to Diff 395393.Dec 20 2021, 2:14 AM

Remove anonymous namespace.

xazax.hun added inline comments.Dec 20 2021, 9:36 AM
clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
51

There is a special class for this at clang/lib/Analysis/CFGStmtMap.cpp. That also does some special handling of the terminators. I wonder if we need to do something similar here (or just use that class as is?).

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
51

This comment might be a bit hard to understand without a code example. Admittedly, including an example here could get very chatty, but we could refer to the name of the corresponding test.

I wonder whether naming the blocks like The first successor of block 'A' would make this clearer.

I am looking at the CFG and wonder if this change is what we what to do in the first place:

[B7 (ENTRY)]
   Succs (1): B6

 [B1]
   1: 0
   2: (void)[B1.1] (CStyleCastExpr, ToVoid, void)
   Preds (2): B2(Unreachable) B3
   Succs (1): B0

 [B2 (NORETURN)]
   1: ~Fatal() (Temporary object destructor)
   Preds (1): B3
   Succs (1): B0

 [B3]
   1: [B6.2] ? [B4.3] : [B5.5]
   2: int value = b ? foo() : Fatal().bar();
   T: (Temp Dtor) [B5.2]
   Preds (2): B4 B5
   Succs (2): B2 B1

 [B4]
   1: foo
   2: [B4.1] (ImplicitCastExpr, FunctionToPointerDecay, int (*)(void))
   3: [B4.2]()
   Preds (1): B6
   Succs (1): B3

 [B5]
   1: Fatal() (CXXConstructExpr, [B5.2], [B5.3], class Fatal)
   2: [B5.1] (BindTemporary)
   3: [B5.2]
   4: [B5.3].bar
   5: [B5.4]()
   Preds (1): B6
   Succs (1): B3

 [B6]
   1: b
   2: [B6.1] (ImplicitCastExpr, LValueToRValue, _Bool)
   T: [B6.2] ? ... : ...
   Preds (1): B7
   Succs (2): B4 B5

 [B0 (EXIT)]
   Preds (2): B1 B2

To summarize the problem, and let me know if I'm wrong:

  • In the CFG the noreturn block works as expected, its only successor is the exit block, this is great
  • The source of the problem is that we have a join node, where we do a merge BEFORE reaching the noreturn block, so we end up propagating the states from the noreturn block.
  • This code tries to pattern match this particular pattern in the CFG and introduce a small degree of local path-sensitivity.

How resilient is this pattern matching? Will this work if we swap the branches of the ternary operator? Will this work if there are multiple dtors in either/both branches? Will this work if there are dtors coming from the condition? Will this work of I start nesting ternary operators? I think this needs more coverage.

Also, I wonder if concentrating on noreturn is the right approach here. The root cause of this problem has relatively little to do with noreturn, and we could probably come up with a poorly behaving analysis even without having a single noreturn destructor. The root cause is the overly eager merge in the CFG.
Some alternative approaches:

  • Duplicate the B3 block to avoid having a join node in the CFG.
  • Do a proper pattern matching, match blocks containing the dtor calls with the branches of the ternary, and only propagate the states from the corresponding branches. I.e., do the correct state propagation even when we have no noreturn dtors.
gribozavr2 added inline comments.Dec 21 2021, 12:30 PM
clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
51

The only downside I see is that CFGStmtMap needs a ParentMap, which is not cheap to make, but we don't need. Maybe we need to make it optional in CFGStmtMap?

clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h
103

I'd suggest a more verbose name like CFCtx, since Ctx is most often used in Clang for ASTContext. (Here and elsewhere in the patch.)

clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
2

Please add a license comment.

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
47

Use llvm::cantFail to unwrap the Expected?

How resilient is this pattern matching?

Possibly it is not very robust. However it is very important for a certain style of precondition verification macros used by various Google projects. I can't find an exact open source copy, but they are generally similar to this one, but usually use a ternary instead of an if statement:

https://source.chromium.org/chromium/chromium/src/+/main:third_party/re2/src/util/logging.h

Also similar to this one, but with an object that has a noreturn destructor instead of inline assembly that causes a trap:

https://source.chromium.org/chromium/chromium/src/+/main:base/check.h

If it wasn't for this "curious" coding pattern that is stamped into many function thanks to the macro, we wouldn't be bothering with this pattern matching at this point.

How resilient is this pattern matching?

Possibly it is not very robust. However it is very important for a certain style of precondition verification macros used by various Google projects. I can't find an exact open source copy, but they are generally similar to this one, but usually use a ternary instead of an if statement:

https://source.chromium.org/chromium/chromium/src/+/main:third_party/re2/src/util/logging.h

Also similar to this one, but with an object that has a noreturn destructor instead of inline assembly that causes a trap:

https://source.chromium.org/chromium/chromium/src/+/main:base/check.h

If it wasn't for this "curious" coding pattern that is stamped into many function thanks to the macro, we wouldn't be bothering with this pattern matching at this point.

I see. This makes sense as a temporary workaround, but in the long term I'd love to see a more general solution. I'm ok with this as is, but I'd love to see more tests that (possibly) does not provide the expected results along some TODOs and some comments about the specific coding patterns this is targeting.

sgatev updated this revision to Diff 395841.Dec 22 2021, 4:35 AM
sgatev marked 2 inline comments as done.

Address reviewers' comments.

sgatev marked 3 inline comments as done.Dec 22 2021, 4:53 AM

I also agree that the current approach isn't robust. I think that a proper solution would involve patching the CFG because in some cases it seems to be incorrect. For example, the call to qux is incorrectly deemed to be unreachable in the CFG of the following code:

int foo();

class Fatal {
 public:
  ~Fatal() __attribute__((noreturn));
  int bar();
  int baz();
};

void qux();

void target(bool b1, bool b2) {
  int value = b1 ? foo() : (b2 ? Fatal().bar() : Fatal().baz());
  qux();
}

I'd be happy to look into a solution for this as a follow up. I guess it would have an impact on other users of the CFG too.

clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
51

The main reason for me not to use CFGStmtMap right away is that based on some tests it seems to behave differently compared to what's implemented here. I plan to look into that in detail and see if there's an opportunity to replace the logic here. It shouldn't be too difficult to do that in a follow up as getStmtToBlock is used in a single place.

clang/include/clang/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.h
103

Replaced Ctx with CFCtx across the patch.

clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
51

I updated the comment and added more tests. Hopefully, that would be good enough for now and we can replace all of this with a more principled solution shortly.

clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
47

Much better. Thanks!

sgatev retitled this revision from [clang][dataflow] Add support for terminating virtual destructors to [clang][dataflow] Add support for terminating noreturn destructors.Dec 22 2021, 4:57 AM
sgatev updated this revision to Diff 395845.Dec 22 2021, 5:11 AM
sgatev marked 3 inline comments as done.

Change commit message.

sgatev retitled this revision from [clang][dataflow] Add support for terminating noreturn destructors to [clang][dataflow] Add support for noreturn destructor calls.Dec 22 2021, 5:12 AM
xazax.hun accepted this revision.Dec 22 2021, 8:34 AM

Thanks!

This revision is now accepted and ready to land.Dec 22 2021, 8:34 AM
gribozavr2 accepted this revision.Dec 22 2021, 9:22 AM
gribozavr2 added inline comments.Dec 22 2021, 9:28 AM
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
193

DYM "Noreturn" instead of "Virtual"?

sgatev updated this revision to Diff 395979.Dec 22 2021, 11:26 PM
sgatev marked an inline comment as done.

Rename a test.

sgatev marked an inline comment as done.Dec 22 2021, 11:27 PM
sgatev added inline comments.
clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
193

Right. Renamed it.

sgatev marked an inline comment as done.Dec 22 2021, 11:58 PM

Thanks for the reviews!

This revision was landed with ongoing or failed builds.Dec 27 2021, 11:59 PM
This revision was automatically updated to reflect the committed changes.