This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][NFC] Split the main logic of NoStoreFuncVisitor to an abstract NoStateChangeVisitor class
ClosedPublic

Authored by Szelethus on Jul 7 2021, 8:06 AM.

Details

Summary

Preceding discussion on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2021-June/068450.html

NoStoreFuncVisitor is a rather unique visitor. As VisitNode is invoked on most other visitors, they are looking for the point where something changed -- change on a value, some checker-specific GDM trait, a new constraint. NoStoreFuncVisitor, however, looks specifically for functions that *didn't* write to a MemRegion of interesting. Quoting from its comments:

/// Put a diagnostic on return statement of all inlined functions
/// for which  the region of interest \p RegionOfInterest was passed into,
/// but not written inside, and it has caused an undefined read or a null
/// pointer dereference outside.

It so happens that there are a number of other similar properties that are worth checking. For instance, if some memory leaks, it might be interesting why a function didn't take ownership of said memory:

void sink(int *P) {} // no notes

void f() {
  sink(new int(5)); // note: Memory is allocated
                    // Well hold on, sink() was supposed to deal with
                    // that, this must be a false positive...
} // warning: Potential memory leak [cplusplus.NewDeleteLeaks]

In here, the entity of interest isn't a MemRegion, but a symbol. The property that changed here isn't a change of value, but rather liveness and GDM traits managed by MalloChecker.

This patch moves some of the logic of NoStoreFuncVisitor to a new abstract class, NoStateChangeFuncVisitor. This is mostly calculating and caching the stack frames in which the entity of interest wasn't changed.

Descendants of this interface have to define 3 things:

  • What constitutes as a change to an entity (this is done by overriding wasModifiedBeforeCallExit)
  • What the diagnostic message should be (this is done by overriding maybeEmitNoteFor.*)
  • What constitutes as the entity of interest being passed into the function (this is also done by overriding maybeEmitNoteFor.*, to be factored out a bit more in a later patch)

Diff Detail

Event Timeline

Szelethus created this revision.Jul 7 2021, 8:06 AM
Szelethus requested review of this revision.Jul 7 2021, 8:06 AM
Szelethus updated this revision to Diff 357202.Jul 8 2021, 6:29 AM
Szelethus edited the summary of this revision. (Show Details)
  • Don't depend on D105552.
  • Merge with a followup patch; it was a poor splitting point, and the diffs didn't really convey what actually changed. However, it didn't make this patch look any more obfuscated.
    • Split up maybeEmitNote to special cases. Descendants will need to check whether the entity of interest was passed into the function or not, and emit a note if so.
      • NoStateChangeVisitor forces you to override 3 of these functions, one for ObjC 'self', one for C++ 'this', and one for regular parameters.
  • D105552#2864085 gives a bit more insight into the longer term plans:

[...] clients only need to define these 3 [...]:

  • What constitutes as a change to an entity
  • What the diagnostic message should be
  • What constitutes as the entity of interest being passed into the function

In the end, in my eye, it would look like this:

                     <--- NoStateChangeToRegionVisitor <--- NoStoreFuncVisitor
                    /
NoStateChangeVisitor                                   <--- NoDynamicMemoryOwnershipChangeVisitor
                    \                                 /
                     <--- NoStateChangeToSymbolVisitor
                                                      \
                                                       <--- NoStreamOwnershipChangeVisitor
  • NoStateChangeVisitor will ask its clients the above mentioned 3 questions.
  • NoStateChangeToSymbolVisitor and NoStateChangeToRegionVisitor would answer whether the entity of interest was actually passed into the function, and propagate the rest.
  • Visitors on the bottom would take care of what remains.
NoQ accepted this revision.Jul 9 2021, 12:29 PM

This looks perfectly sensible to me.

In the mailing list I seem to have made a mistake about how this works: we don't explicitly scan the AST for potential statements that could cause a state change (eg., assignment operators in case of NoStore) but we only check if the region was explicitly made accessible. This makes things a lot easier (we don't have to build an auxiliary AST-based analysis) and I'm not sure if this other heuristic that I thought we have would actually make things significantly better. I guess it makes sense to keep in mind that we might want to ultimately plug it in but I don't immediately see what'd stop us.

clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
629

Or on the } if there's no return statement.

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
399

While we're at it, can you take a look if the recently introduced CallEvent::parameters() is good enough for this?

532–533

I guess this comment could be copied to the other two virtual methods?

This revision is now accepted and ready to land.Jul 9 2021, 12:29 PM
Szelethus updated this revision to Diff 358876.Jul 15 2021, 2:40 AM
Szelethus marked 3 inline comments as done.
  • Change parameter names to highlight that the node at the end of the function call is a CallExitBegin.
  • Change individual parameter callbacks to the entire call -- if a client knows that they don't want to emit diagnostics for some function, they can early return, instead of early returning for each parameter.
  • Fix according to reviewer comments!

In the mailing list I seem to have made a mistake about how this works: we don't explicitly scan the AST for potential statements that could cause a state change (eg., assignment operators in case of NoStore) but we only check if the region was explicitly made accessible. This makes things a lot easier (we don't have to build an auxiliary AST-based analysis) and I'm not sure if this other heuristic that I thought we have would actually make things significantly better. I guess it makes sense to keep in mind that we might want to ultimately plug it in but I don't immediately see what'd stop us.

For NoStoreFuncVisitor, this seems to be fine I suppose, though I'm not sure how things pan out for memory leaks. As you put it,

We probably shouldn't emit a note every time the function simply accepts the value of interest by pointer, because this doesn't necessarily prove the intent to delete the pointer;

it might be rather difficult to tell this intent. For now, I'm leaning on the side of slightly too much than too little, but I need real world data on my hand to have a good feel for it.

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
399

I suppose yeah, at least the lit tests all passed, though its worth mentioning that this function explicitly calls CallEvent::parameters()...

This revision was landed with ongoing or failed builds.Aug 16 2021, 6:03 AM
This revision was automatically updated to reflect the committed changes.