This is an archive of the discontinued LLVM Phabricator instance.

Differential D60107

[analyzer] NoStoreFuncVisitor: Suppress bug reports with no-store in system headers.
ClosedPublic

Authored by NoQ on Apr 1 2019, 5:35 PM.

Details

Reviewers

dcoughlin
xazax.hun
rnkovacs
mikhail.ramalho
Szelethus
baloghadamsoftware
Charusso
a_sidorin

Commits

rG5c6fc36de897: [analyzer] NoStoreFuncVisitor: Suppress reports with no-store in system headers.
rC357810: [analyzer] NoStoreFuncVisitor: Suppress reports with no-store in system headers.
rL357810: [analyzer] NoStoreFuncVisitor: Suppress reports with no-store in system headers.

Summary

Hmm, i literally patched the same line of code a few days ago in D59901. That's fairly accidental.

NoStoreFuncVisitor is mostly attached to uninitialized value reports and is responsible for adding path notes within (a.k.a. disabling pruning of) inlined calls that could have initialized the memory region but didn't end up doing it. It emits notes that say "Returning without writing to ..." at the respective return sites of such calls.
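
To make the note concrete, here is a minimal sketch of the kind of code the visitor annotates. The names `tryInit` and `caller` are hypothetical and do not come from the patch. On the path where `Succeed` is false, the inlined call returns without writing to its out-parameter, which is roughly where a note along the lines of "Returning without writing to ..." would be attached.

```cpp
// Hypothetical helper (not from the patch): initializes Out only on success.
bool tryInit(int &Out, bool Succeed) {
  if (!Succeed)
    return false; // This path returns without writing to 'Out'.
  Out = 42;
  return true;
}

// A caller with defined behavior: the value is read only after the helper
// reported success.
int caller(bool Succeed) {
  int X; // Deliberately left uninitialized.
  if (!tryInit(X, Succeed))
    return -1;
  return X;
}

// The pattern that would be reported, kept as a comment so the example
// itself has no undefined behavior:
//   int X;
//   tryInit(X, /*Succeed=*/false); // callee returns without writing to X
//   return X;                      // use of an uninitialized value
```

Without the note at the return site of `tryInit`, the inlined call would be pruned from the report and the reader would only see the declaration of `X` followed by its use.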

George decided to suppress the note for calls into system headers. I guess the reason was that it was otherwise too loud. After all, it's an un-pruning effort - it can cause massive bring-ins of inlined calls into reports.

However, when the note is suppressed, the very original issue that caused us to write this visitor bites us again: the report becomes incomprehensible. Here's a specific example i'm looking at:

#include <iostream>

void use(char);

void foo() {
  char A;
  std::cin >> A;
  use(A); // Use of uninitialized variable?!
}

This is a "true" positive: there are a bunch of failure modes in std::cin that may lead to not initializing the variable and the developer has to check for them before using the variable. However,

- You'll never be able to understand that from the report;
- Even if it's true, the user would most likely still not bother fixing it unless it's a security-critical application.
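
For reference, the check that the developer would have to write looks roughly like the sketch below. `readChar` and `firstCharOr` are hypothetical helpers introduced only for illustration; the only library behavior relied upon is that a stream converts to `false` after a failed extraction.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reports whether the extraction succeeded, so that the
// caller never reads Out after a failed read.
bool readChar(std::istream &In, char &Out) {
  return static_cast<bool>(In >> Out);
}

// Usage: fall back to a default when the stream has nothing to give.
char firstCharOr(const std::string &Text, char Fallback) {
  std::istringstream In(Text);
  char A;
  if (!readChar(In, A))
    return Fallback; // A is never read on the failure path.
  return A;
}
```

With this shape the uninitialized read never happens, but nothing in the unannotated report on the original `foo()` would lead the user to it.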

What are our options here?

1. We can model operator>>(). We can either model it as something that always initializes the value to an unknown (and possibly tainted) value, or force a state split with a specific note about the contract of the operator, such as "Value A may remain uninitialized when B is called (ρ=0.56), given C, assuming D and under E conditions" (we'll have to also make sure that these preconditions are compatible with the current state). That's the ideal solution because it gives us the perfect modeling that we want and gives us ultimate flexibility.
2. We can disable inlining of operator>>() and maybe other system header functions that take out-parameters.
3. We can suppress bug reports that would have caused the no-store visitor to emit its notes in system headers.

In this patch i'm implementing solution 3.

Solution 1 is not incompatible with solutions 2 and 3; it can be incrementally added on top of either solution 2 or solution 3 as a more man-hour-expensive incremental improvement (suppress inlining/reporting but also unsuppress by modeling a few known functions explicitly).

Solution 2 is similar to our container inlining heuristic: just don't bother inlining 'cause we'll never understand what's going on anyway. I ended up hating that heuristic and dreaming of carefully removing it and replacing it with visitor-based suppressions such as solution 3. By suppressing inlining, we have all of the downsides of the conservative evaluation: we end up exploring obviously infeasible execution paths because we're losing information about the program. Solution 3 is more targeted: it only suppresses reports of a specific checker for which the problem has actually manifested rather than all checkers, it doesn't cause arbitrary unpredictable coverage skew, and it doesn't destroy any valid information that we managed to obtain during inlining.

The patch is trivial and mostly consists of the inevitable renaming of functions and variables. There's one interesting gotcha though: if the function has no branches whatsoever, disable the suppression. Like, if the function unconditionally fails to initialize anything, the developer probably knows about that. I think we should do more of such un-suppressions. This was inspired by a test case in test/Analysis/new.cpp that otherwise regressed:

int testNoInitializationPlacement() {
  int n;
  new (&n) int; // Doesn't initialize 'n'!

  if (n) { // expected-warning{{Branch condition evaluates to a garbage value}}
    return 0;
  }
  return 1;
}
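
The resulting decision can be summarized by the following analyzer-independent sketch. `Outcome` and `decide` are hypothetical names, not part of the patch; the threshold of 3 CFG blocks (entry, one block of code, exit) is the one used in the diff below.

```cpp
// Hypothetical model of what the visitor does at a no-store return site.
enum class Outcome {
  EmitNote,              // User code: attach the "Returning without writing" note.
  KeepReportWithoutNote, // Branchless system header function: no note, report stays.
  SuppressReport         // System header function with branches: drop the report.
};

constexpr Outcome decide(bool CallIsInSystemHeader, unsigned NumCFGBlocks) {
  if (!CallIsInSystemHeader)
    return Outcome::EmitNote;
  // More than 3 blocks means the function has some branch, i.e. possibly a
  // failure mode that the caller did not check.
  return NumCFGBlocks > 3 ? Outcome::SuppressReport
                          : Outcome::KeepReportWithoutNote;
}
```

Under this model the placement new case above lands in `KeepReportWithoutNote`, so the expected warning survives, while the `std::cin >> A` example lands in `SuppressReport`.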

Diff Detail

Repository: rL LLVM

Event Timeline

NoQ created this revision. Apr 1 2019, 5:35 PM

Herald added a project: Restricted Project. Apr 1 2019, 5:35 PM

Herald added subscribers: cfe-commits, jdoerfert, dkrupp and 3 others.

NoQ edited the summary of this revision. Apr 1 2019, 5:35 PM

(whoops, forgot Alexey)

It is very good to try one improvement in another similar function.

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
364 ↗	(On Diff #193211)	`R` was cool, that is our unspoken naming convention. What about `CallR`/`CallRegion` and possibly `CallSVal`? If I am right usually the `BugReport` object is named `BR` because of our regions.
547 ↗	(On Diff #193211)	Update that comment?
552 ↗	(On Diff #193211)	What about `tryToEmitNote()`? This 'or' condition is very uncommon for a function name. Also there is another `R` which could be `BR`.
567 ↗	(On Diff #193211)	The Phabricator comment is better because it has a cool example. What about merging that into this one?
clang/test/Analysis/no-store-suppression.cpp
3 ↗	(On Diff #193211)	Could you add a link to the diff, or copy in the explanation of why no diagnostic is emitted, for future improvements?

This revision is now accepted and ready to land. Apr 4 2019, 11:32 AM

Address comments. Thanks!

clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
364 ↗	(On Diff #193211)	My unspoken (well, not anymore, i guess) naming convention usually goes like this:
- `B` for `BinaryOperator` (emphasis on binary-ness, `U` for unary so they don't conflict)
- `R` for `BugReport` (emphasis on being a report)
- `BR` for `BugReporter` (for the lack of better name),
- `MR` for `MemRegion` (specific classes of regions go like `VR`, `FR`, `TR`, `SR`, or, well, `BR`)
- `BRC` for `BugReporterContext` (emphasis on being a context)
Also `V` for `SVal` (emphasis on value) because `S` for `Stmt`.
clang/test/Analysis/no-store-suppression.cpp
3 ↗	(On Diff #193211)	Links to diffs are rarely included in the source code because they're usually two clicks away anyway: just do git blame and see the bottom of the commit message.

Closed by commit rL357810: [analyzer] NoStoreFuncVisitor: Suppress reports with no-store in system headers. (authored by NoQ). Apr 5 2019, 1:17 PM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Apr 5 2019, 1:17 PM

Herald added a subscriber: llvm-commits.

Revision Contents

Path	Size
cfe/trunk/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp	71 lines
cfe/trunk/test/Analysis/Inputs/no-store-suppression.h	17 lines
cfe/trunk/test/Analysis/no-store-suppression.cpp	22 lines

Diff 193956

cfe/trunk/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp

[300 lines not shown]	NoStoreFuncVisitor(const SubRegion *R)
         PP(MmrMgr.getContext().getPrintingPolicy()) {}

   void Profile(llvm::FoldingSetNodeID &ID) const override {
     static int Tag = 0;
     ID.AddPointer(&Tag);
     ID.AddPointer(RegionOfInterest);
   }

+  void *getTag() const {
+    static int Tag = 0;
+    return static_cast<void *>(&Tag);
+  }
+
   std::shared_ptr<PathDiagnosticPiece> VisitNode(const ExplodedNode *N,
                                                  BugReporterContext &BR,
-                                                 BugReport &) override {
+                                                 BugReport &R) override {

     const LocationContext *Ctx = N->getLocationContext();
     const StackFrameContext *SCtx = Ctx->getStackFrame();
     ProgramStateRef State = N->getState();
     auto CallExitLoc = N->getLocationAs<CallExitBegin>();

     // No diagnostic if region was modified inside the frame.
     if (!CallExitLoc || isRegionOfInterestModifiedInFrame(N))
       return nullptr;

     CallEventRef<> Call =
         BR.getStateManager().getCallEventManager().getCaller(SCtx, State);

-    if (Call->isInSystemHeader())
-      return nullptr;
-
     // Region of interest corresponds to an IVar, exiting a method
     // which could have written into that IVar, but did not.
     if (const auto *MC = dyn_cast<ObjCMethodCall>(Call)) {
       if (const auto *IvarR = dyn_cast<ObjCIvarRegion>(RegionOfInterest)) {
         const MemRegion *SelfRegion = MC->getReceiverSVal().getAsRegion();
         if (RegionOfInterest->isSubRegionOf(SelfRegion) &&
             potentiallyWritesIntoIvar(Call->getRuntimeDefinition().getDecl(),
                                       IvarR->getDecl()))
-          return notModifiedDiagnostics(N, {}, SelfRegion, "self",
-                                        /*FirstIsReferenceType=*/false, 1);
+          return maybeEmitNode(R, *Call, N, {}, SelfRegion, "self",
+                               /*FirstIsReferenceType=*/false, 1);
       }
     }

     if (const auto *CCall = dyn_cast<CXXConstructorCall>(Call)) {
       const MemRegion *ThisR = CCall->getCXXThisVal().getAsRegion();
       if (RegionOfInterest->isSubRegionOf(ThisR)
           && !CCall->getDecl()->isImplicit())
-        return notModifiedDiagnostics(N, {}, ThisR, "this",
-                                      /*FirstIsReferenceType=*/false, 1);
+        return maybeEmitNode(R, *Call, N, {}, ThisR, "this",
+                             /*FirstIsReferenceType=*/false, 1);

       // Do not generate diagnostics for not modified parameters in
       // constructors.
       return nullptr;
     }

     ArrayRef<ParmVarDecl *> parameters = getCallParameters(Call);
     for (unsigned I = 0; I < Call->getNumArgs() && I < parameters.size(); ++I) {
       const ParmVarDecl *PVD = parameters[I];
-      SVal S = Call->getArgSVal(I);
+      SVal V = Call->getArgSVal(I);
       bool ParamIsReferenceType = PVD->getType()->isReferenceType();
       std::string ParamName = PVD->getNameAsString();

       int IndirectionLevel = 1;
       QualType T = PVD->getType();
-      while (const MemRegion *R = S.getAsRegion()) {
-        if (RegionOfInterest->isSubRegionOf(R) && !isPointerToConst(T))
-          return notModifiedDiagnostics(N, {}, R, ParamName,
-                                        ParamIsReferenceType, IndirectionLevel);
+      while (const MemRegion *MR = V.getAsRegion()) {
+        if (RegionOfInterest->isSubRegionOf(MR) && !isPointerToConst(T))
+          return maybeEmitNode(R, *Call, N, {}, MR, ParamName,
+                               ParamIsReferenceType, IndirectionLevel);

         QualType PT = T->getPointeeType();
         if (PT.isNull() || PT->isVoidType()) break;

         if (const RecordDecl *RD = PT->getAsRecordDecl())
-          if (auto P = findRegionOfInterestInRecord(RD, State, R))
-            return notModifiedDiagnostics(N, *P, RegionOfInterest, ParamName,
-                                          ParamIsReferenceType,
-                                          IndirectionLevel);
+          if (auto P = findRegionOfInterestInRecord(RD, State, MR))
+            return maybeEmitNode(R, *Call, N, *P, RegionOfInterest, ParamName,
+                                 ParamIsReferenceType, IndirectionLevel);

-        S = State->getSVal(R, PT);
+        V = State->getSVal(MR, PT);
         T = PT;
         IndirectionLevel++;
       }
     }

     return nullptr;
   }

[153 lines not shown]	private:
   }

   /// \return whether \p Ty points to a const type, or is a const reference.
   bool isPointerToConst(QualType Ty) {
     return !Ty->getPointeeType().isNull() &&
            Ty->getPointeeType().getCanonicalType().isConstQualified();
   }

-  /// \return Diagnostics piece for region not modified in the current function.
+  /// Consume the information on the no-store stack frame in order to
+  /// either emit a note or suppress the report enirely.
+  /// \return Diagnostics piece for region not modified in the current function,
+  /// if it decides to emit one.
   std::shared_ptr<PathDiagnosticPiece>
-  notModifiedDiagnostics(const ExplodedNode *N, const RegionVector &FieldChain,
-                         const MemRegion *MatchedRegion, StringRef FirstElement,
-                         bool FirstIsReferenceType, unsigned IndirectionLevel) {
+  maybeEmitNode(BugReport &R, const CallEvent &Call, const ExplodedNode *N,
+                const RegionVector &FieldChain, const MemRegion *MatchedRegion,
+                StringRef FirstElement, bool FirstIsReferenceType,
+                unsigned IndirectionLevel) {
+    // Optimistically suppress uninitialized value bugs that result
+    // from system headers having a chance to initialize the value
+    // but failing to do so. It's too unlikely a system header's fault.
+    // It's much more likely a situation in which the function has a failure
+    // mode that the user decided not to check. If we want to hunt such
+    // omitted checks, we should provide an explicit function-specific note
+    // describing the precondition under which the function isn't supposed to
+    // initialize its out-parameter, and additionally check that such
+    // precondition can actually be fulfilled on the current path.
+    if (Call.isInSystemHeader()) {
+      // We make an exception for system header functions that have no branches,
+      // i.e. have exactly 3 CFG blocks: begin, all its code, end. Such
+      // functions unconditionally fail to initialize the variable.
+      // If they call other functions that have more paths within them,
+      // this suppression would still apply when we visit these inner functions.
+      // One common example of a standard function that doesn't ever initialize
+      // its out parameter is operator placement new; it's up to the follow-up
+      // constructor (if any) to initialize the memory.
+      if (N->getStackFrame()->getCFG()->size() > 3)
+        R.markInvalid(getTag(), nullptr);
+      return nullptr;
+    }
+
     PathDiagnosticLocation L =
         PathDiagnosticLocation::create(N->getLocation(), SM);

     SmallString<256> sbuf;
     llvm::raw_svector_ostream os(sbuf);
     os << "Returning without writing to '";

[1,921 lines not shown]

cfe/trunk/test/Analysis/Inputs/no-store-suppression.h

+#pragma clang system_header
+
+namespace std {
+class istream {
+public:
+  bool is_eof();
+  char get_char();
+};
+
+istream &operator>>(istream &is, char &c) {
+  if (is.is_eof())
+    return;
+  c = is.get_char();
+}
+
+extern istream cin;
+};

cfe/trunk/test/Analysis/no-store-suppression.cpp

+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+// expected-no-diagnostics
+
+#include "Inputs/no-store-suppression.h"
+
+using namespace std;
+
+namespace value_uninitialized_after_stream_shift {
+void use(char c);
+
+// Technically, it is absolutely necessary to check the status of cin after
+// read before using the value that just read from it. Practically, we don't
+// really care unless we eventually come up with a special security check
+// for just that purpose. Static Analyzer shouldn't be yelling at every person's
+// third program in their C++ 101.
+void foo() {
+  char c;
+  std::cin >> c;
+  use(c); // no-warning
+}
+} // namespace value_uninitialized_after_stream_shift