This is an archive of the discontinued LLVM Phabricator instance.

Differential D127874

[analyzer] Reimplement UnreachableCodeChecker using worklists
Needs Review · Public

Authored by steakhal on Jun 15 2022, 9:36 AM.

Details

Reviewers

NoQ
martong
balazske
ASDenysPetrov
Szelethus
frederic-tingaud-sonarsource

Summary

This checker suffers from many false positives.
I decided to have a look at them and found multiple areas in which
we could improve.

We should ignore try statements for now, until #55621 is resolved by fixing the CFG generation for try-catch and other exception-related constructs.
If some checker placed a sink node into some block, the successors of the sink block were previously marked unreachable. This is unpleasant from the user's point of view, since the false positives caused by the sink node show up next to a different bug report. Effectively, for each unconditional sink we would (most likely) also get an unreachable code report. Now, we ignore an unreachable node x if the predecessors of x are reachable, but none of the successors of the predecessors of x are reachable. This case is demonstrated in the magic_clamp example later, where the B10 block has a fatal division-by-zero bug, which sinks all execution paths and makes both successor blocks (B8, B9) unreachable.
The sweep-back part cannot be implemented correctly via a DFS visitation. This part is supposed to filter the unreachable nodes to keep only the first unreachable block of each chain in the CFG. This is done to minimize the number of dead statement reports. It needs to be done in a breadth-first manner, using a worklist. I'll explain why later with the magic_clamp example, where I demonstrate step by step how the previous algorithm goes wrong.
Imagine a completely detached CFG block segment, e.g.:

void test10_chained_unreachable(void) {
  goto end;
a:
  goto b;
b:
  goto c;
c:
  goto a;
end:;
}

This produces this CFG:

#6(entry)      #2(goto a;)
 |              |        ^
#5(goto end;)   |         \
 |             #4(goto b;) |
#1(end:)        |          |
 |             #3(goto c;) |
#0(exit)         \________/

Previously, the checker only reported B2 as dead, since that was the
first block it encountered.
This means that the sweep-back part kept only the initial node as
dead and marked the rest (B3, B4) as reachable.
IMO picking that block arbitrarily makes no sense. We should
either mark all or none in this case. Why would 'only' B2 be dead?

The previous algorithm:

1. Walk each visited ExplodedNode, find those that correspond to basic-block entrance events, and mark the entered CFG block as reachable by adding it to the reachables set.
2. Walk all the CFG blocks, starting with the artificial CFG exit node (ID 0). Do a backward DFS to find the start of each unreachable CFG block chain, so that the unreachable code diagnostic is reported only for the very first unreachable block. I refer to this step as the sweep-back part. We do this by inserting the uninteresting CFG block IDs into the reachables set. Note that the algorithm also maintains a visited block set, to account for cycles in the CFG.
3. Iterate over the CFG blocks; if a block is not present in the reachables set, it is a candidate for a diagnostic. We do some trivial checks to filter out common FPs, and emit the report otherwise.
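
The steps above can be sketched on a toy graph. This is only a minimal model and not the checker's code: blocks are plain integers instead of CFGBlocks, and `PredMap`, `sweepBackDFS`, `blocksToReport` and `exampleChain` are hypothetical names. It assumes a dead chain B3 -> B2 -> B1 that hangs off the reachable entry block B4.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy CFG (hypothetical): block ID -> IDs of its predecessor blocks.
using PredMap = std::map<unsigned, std::vector<unsigned>>;
using BlockSet = std::set<unsigned>;

// Step 2, the sweep-back: a recursive backward DFS. A block that has an
// unreachable predecessor is not the start of a dead chain, so it is put
// into Reachable to suppress its report.
void sweepBackDFS(unsigned Block, const PredMap &Preds, BlockSet &Reachable,
                  BlockSet &Visited) {
  assert(Preds.count(Block) && "every block needs a predecessor entry");
  Visited.insert(Block);
  for (unsigned Pred : Preds.at(Block)) {
    if (Reachable.count(Pred))
      continue;
    Reachable.insert(Block);
    if (!Visited.count(Pred))
      sweepBackDFS(Pred, Preds, Reachable, Visited);
  }
}

// Steps 2 and 3 combined: Reachable is the result of step 1. Every block
// that is still missing from Reachable after the sweep is a candidate.
std::vector<unsigned> blocksToReport(const PredMap &Preds, BlockSet Reachable) {
  BlockSet Visited;
  std::vector<unsigned> Report;
  for (const auto &Entry : Preds) {
    unsigned Block = Entry.first;
    if (Reachable.count(Block))
      continue;
    if (!Visited.count(Block))
      sweepBackDFS(Block, Preds, Reachable, Visited);
    if (Reachable.count(Block))
      continue; // Pruned by the sweep-back.
    Report.push_back(Block);
  }
  return Report;
}

// Assumed example: B4 (entry) -> B0 (exit) is the only executed path;
// B4 -> B3 -> B2 -> B1 -> B0 was never taken.
PredMap exampleChain() {
  return {{0, {1, 4}}, {1, {2}}, {2, {3}}, {3, {4}}, {4, {}}};
}
```

In this model, `blocksToReport(exampleChain(), {0, 4})` leaves only B3, the first block of the dead chain, as a candidate; B1 and B2 are swept into the reachables set.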

Why do we need to do the sweep-back part in BFS order?
Here is some code demonstrating this:

int magic_clamp(int x) {
  if (x != 0) return -1;
  int v1 = 100 / x; // Div by zero

  int clamped;
  if (v1 < 0)
    clamped = 0;
  else if (v1 > 100)
    clamped = 100;
  else if (v1 % 2 == 0)
    clamped = 0;
  else if (v1 % 2 == -1)
    clamped = -1;
  else
    clamped = v1;

  return clamped;
}

The resulting CFG looks like this:

         <13>(entry)
          |
         <12>
         /  \      Legend:
       <10> |
       /  \ |       <N> : reachable   block with the id N
      8   9 |        N  : unreachable block with the id N
     / \  | |
    6  7  | |
   / \ |  | |
  4  5 |  | |
 / \ | |  | |
2  3 | |  | |
 \  \| | /  |
  \  |/ /   |
   \ | /    |
    \|/     |
     1    <11>
      \   /
       <0>(exit)

Initially, B13, B12, B10, B11, B0 are reachable, thus the rest are
unreachable.
The sweep-back phase starts from B0 and marks B1, B2, B4, B6
reachable until it reaches the first unreachable block (B8) whose parent
is reachable. After this, it backtracks and checks whether B3 should be
refuted or not. It finds that the single predecessor block of B3 (B4) has
already been marked reachable, thus B3 gets reported as unreachable.
I leave the rest for the reader, but in the end, we end up having B3, B5,
B8, B9 unreachable, thus 4 reports were generated for this example
previously.

If we were doing the sweep-back phase in BFS order, we would have only
the B8 and B9 blocks as unreachable, as we should.
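
The difference between the two visitation orders can be replayed on a toy model of the magic_clamp CFG. This is a sketch under assumptions, not the checker itself: blocks are integers, the predecessor lists are read off the drawing above, the order of B1's predecessors (B2, B3, B5, B7, B9) is assumed, and `sweepBackDFS`, `sweepBackBFS` and `remainingDead` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <vector>

using PredMap = std::map<unsigned, std::vector<unsigned>>;
using BlockSet = std::set<unsigned>;

// Predecessors of each block of magic_clamp (order for B1 is an assumption).
PredMap magicClampPreds() {
  return {{0, {1, 11}}, {1, {2, 3, 5, 7, 9}}, {2, {4}},  {3, {4}},  {4, {6}},
          {5, {6}},     {6, {8}},             {7, {8}},  {8, {10}}, {9, {10}},
          {10, {12}},   {11, {12}},           {12, {13}}, {13, {}}};
}

// Recursive variant: a block with an unreachable predecessor is marked
// reachable before its remaining predecessors are looked at.
void sweepBackDFS(unsigned Block, const PredMap &Preds, BlockSet &Reachable,
                  BlockSet &Visited) {
  Visited.insert(Block);
  for (unsigned Pred : Preds.at(Block)) {
    if (Reachable.count(Pred))
      continue;
    Reachable.insert(Block);
    if (!Visited.count(Pred))
      sweepBackDFS(Pred, Preds, Reachable, Visited);
  }
}

// Worklist variant: a block is marked reachable only if it has predecessors
// and all of them are (still) unreachable when the block is processed.
void sweepBackBFS(unsigned Start, const PredMap &Preds, BlockSet &Reachable,
                  BlockSet &Visited) {
  std::queue<unsigned> Worklist;
  Worklist.push(Start);
  while (!Worklist.empty()) {
    unsigned Current = Worklist.front();
    Worklist.pop();
    if (!Visited.insert(Current).second)
      continue; // Already processed.
    const std::vector<unsigned> &CurPreds = Preds.at(Current);
    for (unsigned Pred : CurPreds)
      Worklist.push(Pred);
    if (Reachable.count(Current))
      continue;
    bool AllPredsDead =
        !CurPreds.empty() &&
        std::all_of(CurPreds.begin(), CurPreds.end(),
                    [&](unsigned Pred) { return Reachable.count(Pred) == 0; });
    if (AllPredsDead)
      Reachable.insert(Current);
  }
}

// Runs one sweep from the exit block B0 and returns the blocks that are
// still considered unreachable afterwards.
BlockSet remainingDead(bool UseBFS) {
  PredMap Preds = magicClampPreds();
  BlockSet Reachable = {0, 10, 11, 12, 13}; // Blocks the engine entered.
  BlockSet Visited;
  if (UseBFS)
    sweepBackBFS(0, Preds, Reachable, Visited);
  else
    sweepBackDFS(0, Preds, Reachable, Visited);
  BlockSet Dead;
  for (const auto &Entry : Preds)
    if (!Reachable.count(Entry.first))
      Dead.insert(Entry.first);
  return Dead;
}
```

With these assumptions, `remainingDead(false)` yields B3, B5, B8 and B9, while `remainingDead(true)` yields only B8 and B9. The DFS result depends on the order in which predecessors are visited, which is the core of the problem.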

However, I'm suppressing these since they only exist because some
checker (in this case DivByZeroChecker) sank all paths reaching B8 and
B9. Consequently, after fixing the div-by-zero bug, they would become
reachable again.
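
The suppression described here can be modeled as follows. This is a hedged sketch that loosely mirrors the idea of the `isSink` and `filterUnreachableBlocksCausedBySinks` helpers in the diff, but on integer block IDs; `SuccMap`, `isSinkBlock`, `suppressSuccessorsOfSinks` and `reachableAfterSinkFilter` are hypothetical names, and the successor lists are read off the magic_clamp drawing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using SuccMap = std::map<unsigned, std::vector<unsigned>>;
using BlockSet = std::set<unsigned>;

// A sink block was entered by the engine, but none of its successors were.
bool isSinkBlock(unsigned Block, const SuccMap &Succs,
                 const BlockSet &Reachable) {
  assert(Succs.count(Block) && "every block needs a successor entry");
  if (!Reachable.count(Block))
    return false;
  for (unsigned Succ : Succs.at(Block))
    if (Reachable.count(Succ))
      return false;
  return true;
}

// Marks the successors of every sink block as 'reachable', so that no
// unreachable code report is emitted for them.
void suppressSuccessorsOfSinks(const SuccMap &Succs, BlockSet &Reachable) {
  BlockSet SuccessorsOfSinks;
  for (const auto &Entry : Succs)
    if (isSinkBlock(Entry.first, Succs, Reachable))
      SuccessorsOfSinks.insert(Entry.second.begin(), Entry.second.end());
  Reachable.insert(SuccessorsOfSinks.begin(), SuccessorsOfSinks.end());
}

// Successors of each block of magic_clamp, read off the drawing.
SuccMap magicClampSuccs() {
  return {{13, {12}}, {12, {10, 11}}, {11, {0}}, {10, {8, 9}}, {9, {1}},
          {8, {6, 7}}, {7, {1}},      {6, {4, 5}}, {5, {1}},   {4, {2, 3}},
          {3, {1}},    {2, {1}},      {1, {0}},    {0, {}}};
}

// Starts from the blocks the engine entered and applies the sink filter.
BlockSet reachableAfterSinkFilter() {
  BlockSet Reachable = {0, 10, 11, 12, 13};
  suppressSuccessorsOfSinks(magicClampSuccs(), Reachable);
  return Reachable;
}
```

In this model B10 is the only sink with successors, so B8 and B9 are added to the reachables set and would no longer be reported. The exit block B0 also satisfies the sink condition, but it has no successors to mark.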

Unfortunately, even with this new implementation, the number of
false positives is still too high to move this out of alpha.
I've checked a couple of those reports, and I could not find any obvious
patterns. There certainly are some, but I'd need to reduce some cases and have
a deeper look, which I haven't done.
IMO even this is a good increment that is worth landing.

My measurements did not show any crashes, or runtime regressions.
Regarding the reports:

In both the baseline, and in this version: 13868
Disappeared: 4560 (32.88%)
New: 190 (1.37%)

So, this implementation would not flood the users with new reports,
'only' remove a couple of annoying ones.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

steakhal created this revision. (Jun 15 2022, 9:36 AM)

Herald added a project: Restricted Project. (Jun 15 2022, 9:36 AM)

Herald added subscribers: manas, dkrupp, donat.nagy and 6 others.

steakhal requested review of this revision. (Jun 15 2022, 9:36 AM)

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B170025: Diff 437211. (Jun 15 2022, 11:20 AM)

If some checker placed a sink node into some block, the successors of the sink block was marked unreachable previously. This leads to an unpleasant situation from the user's point of view since a different bugreport gets correlated with the FPs caused by the sink node of that. This effectively means that for each unconditional sink, we will also have (preferable) an unreachable code report as well.

An example and/or a visualization would be useful for this.

Now, we ignore unreachable nodes if the predecessor is reachable, but none of its successors are.

Isn't there a contradiction here? The node's predecessor must have at least one reachable successor, the node itself; otherwise the node itself would be unreachable.

the sweep-back part cannot be implemented correctly via a DFS visitation

What is the purpose of this sweep-back? You should summarize that here.

... Do a backward DFS to find the start of each unreachable CFG block chain ...

How does it find the start if there is a cycle in the graph? Isn't it rather about finding strongly connected components?

The sweep-back part cannot be implemented correctly via a DFS visitation ... Why do we need to do the sweep-back part in BFS order?

Connected components can be found by both DFS or BFS. Please elaborate why should we prefer BFS? The problem you are trying to solve seems to be an unnecessarily convoluted problem to me. Shouldn't it be enough to simply pick one node (or all of them) from the connected component? It should not matter if we use DFS or BFS for finding the nodes in a connected component.

steakhal edited the summary of this revision. (Show Details)Jun 16 2022, 9:33 AM

In D127874#3588786, @martong wrote:

If some checker placed a sink node into some block, the successors of the sink block was marked unreachable previously. This leads to an unpleasant situation from the user's point of view since a different bugreport gets correlated with the FPs caused by the sink node of that. This effectively means that for each unconditional sink, we will also have (preferable) an unreachable code report as well.

An example and/or a visualization would be useful for this.

It's quite hard. I could add step numbers to mark which block the recursive visitation enters and leaves at each step.

Now, we ignore unreachable nodes if the predecessor is reachable, but none of its successors are.

Isn't there a contradiction here? The node's predecessor must have at least one reachable successor, the node itself; otherwise the node itself would be unreachable.

I tried to shorten the mouthful of a definition. I'm not sure whether it helps more than it hinders this way.

the sweep-back part cannot be implemented correctly via a DFS visitation

What is the purpose of this sweep-back? You should summarize that here.

Yea, done.

... Do a backward DFS to find the start of each unreachable CFG block chain ...

How does it find the start if there is a cycle in the graph? Isn't it rather about finding strongly connected components?

I left cycle detection to the reader. Now, I'm explicitly mentioning the maintenance of the visited block set.

The sweep-back part cannot be implemented correctly via a DFS visitation ...

Why do we need to do the sweep-back part in BFS order?

Due to the recursive visitation order, some basic blocks get inserted into the reachable blocks set before another unreachable block gets visited.
In the magic_clamp example the recursive (DFS) visitation starts from B0, enters B1, recurses further up to B2, then to B4 and B6. This is when the algorithm starts popping frames (backtracking).
It backtracks and backtracks, all the way back to B1, which has another predecessor (B3), hence it recurses on that path as well. But this time, it will find that the only predecessor of B3 is reachable, hence it will leave B3 as unreachable. And this is the problem, because we should not report a dead code bug there. Consequently, we should not extend the reachable blocks set in DFS visitation order. However, if we were using BFS visitation order, we could never end up in this situation.

Connected components can be found by both DFS or BFS. Please elaborate why should we prefer BFS? The problem you are trying to solve seems to be an unnecessarily convoluted problem to me. Shouldn't it be enough to simply pick one node (or all of them) from the connected component? It should not matter if we use DFS or BFS for finding the nodes in a connected component.

I don't think any of the issues mentioned in this patch relates to strongly connected components, thus I don't think I can answer to this question.

I don't think any of the issues mentioned in this patch relates to strongly connected components, thus I don't think I can answer to this question.

In your example above, repeated here:

#6(entry)      #2(goto a;)
 |              |        ^
#5(goto end;)   |         \
 |             #4(goto b;) |
#1(end:)        |          |
 |             #3(goto c;) |
#0(exit)         \________/

[#2, #4, #3] is a strongly connected (and unreachable) component of the CFG, isn't it?

In D127874#3628112, @martong wrote:
I don't think any of the issues mentioned in this patch relates to strongly connected components, thus I don't think I can answer to this question.

In your example above, repeated here:
#6(entry)      #2(goto a;)
 |              |        ^
#5(goto end;)   |         \
 |             #4(goto b;) |
#1(end:)        |          |
 |             #3(goto c;) |
#0(exit)         \________/
[#2, #4, #3] is a strongly connected (and unreachable) component of the CFG, isn't it?

Oops, I've made a mistake. I wanted to write "connected components" without the adverb "strongly". Please ignore the "strongly" in all my comments above.

In D127874#3628112, @martong wrote:
I don't think any of the issues mentioned in this patch relates to strongly connected components, thus I don't think I can answer to this question.

In your example above, repeated here:
#6(entry)      #2(goto a;)
 |              |        ^
#5(goto end;)   |         \
 |             #4(goto b;) |
#1(end:)        |          |
 |             #3(goto c;) |
#0(exit)         \________/
[#2, #4, #3] is a strongly connected (and unreachable) component of the CFG, isn't it?

Right; those three blocks are unreachable in the CFG.

Let me clarify that this (previous) example has nothing to do with the visitation order. For that, yes, either BFS or DFS order would work.
The magic_clamp example is supposed to underpin the rationale behind choosing BFS instead of DFS.
In the summary, you will find a step-by-step playthrough of how the DFS visitation worked previously, and how it falsely left B3 and B5 unreachable due to the order in which their predecessor nodes were visited. Let me know if it helped.

Revision Contents

Path                                                                     Size
clang/include/clang/Analysis/AnalysisDeclContext.h                       1 line
clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h    8 lines
clang/lib/StaticAnalyzer/Checkers/UnreachableCodeChecker.cpp             389 lines
clang/test/Analysis/unreachable-code-exceptions.cpp                      14 lines
clang/test/Analysis/unreachable-code-path.c                              47 lines
clang/test/Analysis/unreachable-code-path.cpp                            23 lines

Diff 437211

clang/include/clang/Analysis/AnalysisDeclContext.h

public:
/// It might return null.		/// It might return null.
const LocationContext *getParent() const { return Parent; }		const LocationContext *getParent() const { return Parent; }

bool isParentOf(const LocationContext *LC) const;		bool isParentOf(const LocationContext *LC) const;

const Decl *getDecl() const { return Ctx->getDecl(); }		const Decl *getDecl() const { return Ctx->getDecl(); }

CFG *getCFG() const { return Ctx->getCFG(); }		CFG *getCFG() const { return Ctx->getCFG(); }
		CFG *getUnoptimizedCFG() const { return Ctx->getUnoptimizedCFG(); }

template <typename T> T *getAnalysis() const { return Ctx->getAnalysis<T>(); }		template <typename T> T *getAnalysis() const { return Ctx->getAnalysis<T>(); }

const ParentMap &getParentMap() const { return Ctx->getParentMap(); }		const ParentMap &getParentMap() const { return Ctx->getParentMap(); }

/// \copydoc AnalysisDeclContext::getSelfDecl()		/// \copydoc AnalysisDeclContext::getSelfDecl()
const ImplicitParamDecl *getSelfDecl() const { return Ctx->getSelfDecl(); }		const ImplicitParamDecl *getSelfDecl() const { return Ctx->getSelfDecl(); }


clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h

public:
using const_eop_iterator = NodeVector::const_iterator;		using const_eop_iterator = NodeVector::const_iterator;
using node_iterator = AllNodesTy::iterator;		using node_iterator = AllNodesTy::iterator;
using const_node_iterator = AllNodesTy::const_iterator;		using const_node_iterator = AllNodesTy::const_iterator;

node_iterator nodes_begin() { return Nodes.begin(); }		node_iterator nodes_begin() { return Nodes.begin(); }

node_iterator nodes_end() { return Nodes.end(); }		node_iterator nodes_end() { return Nodes.end(); }

		llvm::iterator_range<node_iterator> nodes() {
		return llvm::make_range(nodes_begin(), nodes_end());
		}

const_node_iterator nodes_begin() const { return Nodes.begin(); }		const_node_iterator nodes_begin() const { return Nodes.begin(); }

const_node_iterator nodes_end() const { return Nodes.end(); }		const_node_iterator nodes_end() const { return Nodes.end(); }

		llvm::iterator_range<const_node_iterator> nodes() const {
		return llvm::make_range(nodes_begin(), nodes_end());
		}

roots_iterator roots_begin() { return Roots.begin(); }		roots_iterator roots_begin() { return Roots.begin(); }

roots_iterator roots_end() { return Roots.end(); }		roots_iterator roots_end() { return Roots.end(); }

const_roots_iterator roots_begin() const { return Roots.begin(); }		const_roots_iterator roots_begin() const { return Roots.begin(); }

const_roots_iterator roots_end() const { return Roots.end(); }		const_roots_iterator roots_end() const { return Roots.end(); }


clang/lib/StaticAnalyzer/Checkers/UnreachableCodeChecker.cpp

	//==- UnreachableCodeChecker.cpp - Generalized dead code checker -- C++ --==//			//==- UnreachableCodeChecker.cpp - Generalized dead code checker -- C++ --==//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// This file implements a generalized unreachable code checker using a			// This file implements a generalized unreachable code checker using a
	// path-sensitive analysis. We mark any path visited, and then walk the CFG as a			// path-sensitive analysis. We mark any path visited, and then walk the CFG as a
	// post-analysis to determine what was never visited.			// post-analysis to determine what was never visited.
	//			//
	// A similar flow-sensitive only check exists in Analysis/ReachableCode.cpp			// A similar flow-sensitive only check exists in Analysis/ReachableCode.cpp
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
	#include "clang/AST/ParentMap.h"			#include "clang/AST/ParentMap.h"
	#include "clang/Basic/Builtins.h"			#include "clang/Basic/Builtins.h"
	#include "clang/Basic/SourceManager.h"			#include "clang/Basic/SourceManager.h"
				#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
	#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"			#include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
	#include "clang/StaticAnalyzer/Core/Checker.h"			#include "clang/StaticAnalyzer/Core/Checker.h"
	#include "clang/StaticAnalyzer/Core/CheckerManager.h"			#include "clang/StaticAnalyzer/Core/CheckerManager.h"
	#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"			#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
	#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"			#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerHelpers.h"
	#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"			#include "clang/StaticAnalyzer/Core/PathSensitive/ExplodedGraph.h"
	#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"			#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.h"
	#include "llvm/ADT/SmallSet.h"			#include <queue>

	using namespace clang;			using namespace clang;
	using namespace ento;			using namespace ento;
				using namespace llvm;

				using CFGBlocksSet = SmallSet<unsigned, 32>;

				template <typename Range> static auto skippingNulls(Range &&R) {
				auto IsNonNull = [](const CFGBlock *B) -> bool { return B; };
				return llvm::make_filter_range(R, IsNonNull);
				}

	namespace {			namespace {
	class UnreachableCodeChecker : public Checker<check::EndAnalysis> {			class UnreachableCodeChecker : public Checker<check::EndAnalysis> {
	public:			public:
	void checkEndAnalysis(ExplodedGraph &G, BugReporter &B,			void checkEndAnalysis(ExplodedGraph &G, BugReporter &B,
	ExprEngine &Eng) const;			ExprEngine &Eng) const;
	private:
	typedef llvm::SmallSet<unsigned, 32> CFGBlocksSet;

	static inline const Stmt *getUnreachableStmt(const CFGBlock *CB);
	static void FindUnreachableEntryPoints(const CFGBlock *CB,
	CFGBlocksSet &reachable,
	CFGBlocksSet &visited);
	static bool isInvalidPath(const CFGBlock *CB, const ParentMap &PM);
	static inline bool isEmptyCFGBlock(const CFGBlock *CB);
	};			};
	}			} // namespace

	void UnreachableCodeChecker::checkEndAnalysis(ExplodedGraph &G,
	BugReporter &B,
	ExprEngine &Eng) const {
	CFGBlocksSet reachable, visited;

	if (Eng.hasWorkRemaining())
	return;

	const Decl *D = nullptr;
	CFG *C = nullptr;
	const ParentMap *PM = nullptr;
	const LocationContext *LC = nullptr;
	// Iterate over ExplodedGraph
	for (ExplodedGraph::node_iterator I = G.nodes_begin(), E = G.nodes_end();
	I != E; ++I) {
	const ProgramPoint &P = I->getLocation();
	LC = P.getLocationContext();
	if (!LC->inTopFrame())
	continue;

	if (!D)
	D = LC->getAnalysisDeclContext()->getDecl();

	// Save the CFG if we don't have it already
	if (!C)
	C = LC->getAnalysisDeclContext()->getUnoptimizedCFG();
	if (!PM)
	PM = &LC->getParentMap();

	if (Optional<BlockEntrance> BE = P.getAs<BlockEntrance>()) {
	const CFGBlock *CB = BE->getBlock();
	reachable.insert(CB->getBlockID());
	}
	}

	// Bail out if we didn't get the CFG or the ParentMap.			// Finds the entry point(s) for this dead CFGBlock in a BFS order.
	if (!D || !C || !PM)			// Marks reachable all blocks, whose parents are all unreachable.
				static void filterUnreachableEntryPoints(const CFGBlock *CB,
				CFGBlocksSet &Reachable,
				CFGBlocksSet &Visited) {
				if (Visited.contains(CB->getBlockID()))
	return;			return;

	// Don't do anything for template instantiations. Proving that code			auto IsUnreachable = [&Reachable](const CFGBlock *B) -> bool {
	// in a template instantiation is unreachable means proving that it is			return !Reachable.contains(B->getBlockID());
	// unreachable in all instantiations.			};
	if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D))
	if (FD->isTemplateInstantiation())
	return;

	// Find CFGBlocks that were not covered by any node
	for (CFG::const_iterator I = C->begin(), E = C->end(); I != E; ++I) {
	const CFGBlock *CB = *I;
	// Check if the block is unreachable
	if (reachable.count(CB->getBlockID()))
	continue;

	// Check if the block is empty (an artificial block)
	if (isEmptyCFGBlock(CB))
	continue;

	// Find the entry points for this block
	if (!visited.count(CB->getBlockID()))
	FindUnreachableEntryPoints(CB, reachable, visited);

	// This block may have been pruned; check if we still want to report it
	if (reachable.count(CB->getBlockID()))
	continue;

	// Check for false positives
	if (isInvalidPath(CB, *PM))
	continue;

	// It is good practice to always have a "default" label in a "switch", even			std::queue<const CFGBlock *> Worklist;
	// if we should never get there. It can be used to detect errors, for			Worklist.push(CB);
	// instance. Unreachable code directly under a "default" label is therefore
	// likely to be a false positive.
	if (const Stmt *label = CB->getLabel())
	if (label->getStmtClass() == Stmt::DefaultStmtClass)
	continue;

	// Special case for __builtin_unreachable.			while (!Worklist.empty()) {
	// FIXME: This should be extended to include other unreachable markers,			const CFGBlock *Current = Worklist.front();
	// such as llvm_unreachable.			unsigned CurrentID = Current->getBlockID();
	if (!CB->empty()) {			Worklist.pop();
	bool foundUnreachable = false;
	for (CFGBlock::const_iterator ci = CB->begin(), ce = CB->end();
	ci != ce; ++ci) {
	if (Optional<CFGStmt> S = (*ci).getAs<CFGStmt>())
	if (const CallExpr *CE = dyn_cast<CallExpr>(S->getStmt())) {
	if (CE->getBuiltinCallee() == Builtin::BI__builtin_unreachable ||
	CE->isBuiltinAssumeFalse(Eng.getContext())) {
	foundUnreachable = true;
	break;
	}
	}
	}
	if (foundUnreachable)
	continue;
	}

	// We found a block that wasn't covered - find the statement to report			// Skip if already visited.
	SourceRange SR;			if (!Visited.insert(CurrentID).second)
	PathDiagnosticLocation DL;
	SourceLocation SL;
	if (const Stmt *S = getUnreachableStmt(CB)) {
	// In macros, 'do {...} while (0)' is often used. Don't warn about the
	// condition 0 when it is unreachable.
	if (S->getBeginLoc().isMacroID())
	if (const auto *I = dyn_cast<IntegerLiteral>(S))
	if (I->getValue() == 0ULL)
	if (const Stmt *Parent = PM->getParent(S))
	if (isa<DoStmt>(Parent))
	continue;
	SR = S->getSourceRange();
	DL = PathDiagnosticLocation::createBegin(S, B.getSourceManager(), LC);
	SL = DL.asLocation();
	if (SR.isInvalid() || !SL.isValid())
	continue;
	}
	else
	continue;			continue;

	// Check if the SourceLocation is in a system header			auto NonNullPreds = skippingNulls(Current->preds());
	const SourceManager &SM = B.getSourceManager();
	if (SM.isInSystemHeader(SL) || SM.isInExternCSystemHeader(SL))
	continue;

	B.EmitBasicReport(D, this, "Unreachable code", categories::UnusedCode,			// Schedule the unvisited predecessors.
	"This statement is never executed", DL, SR);			for (const CFGBlock *Pred : NonNullPreds)
	}			Worklist.push(Pred);
	}

	// Recursively finds the entry point(s) for this dead CFGBlock.			if (Reachable.contains(CurrentID))
	void UnreachableCodeChecker::FindUnreachableEntryPoints(const CFGBlock *CB,
	CFGBlocksSet &reachable,
	CFGBlocksSet &visited) {
	visited.insert(CB->getBlockID());

	for (CFGBlock::const_pred_iterator I = CB->pred_begin(), E = CB->pred_end();
	I != E; ++I) {
	if (!*I)
	continue;			continue;

	if (!reachable.count((*I)->getBlockID())) {			// If all predecessors are unreachable, consider the current block as
	// If we find an unreachable predecessor, mark this block as reachable so			// reachable.
	// we don't report this block			if (!NonNullPreds.empty() && all_of(NonNullPreds, IsUnreachable))
	reachable.insert(CB->getBlockID());			Reachable.insert(CurrentID);
	if (!visited.count((*I)->getBlockID()))
	// If we haven't previously visited the unreachable predecessor, recurse
	FindUnreachableEntryPoints(*I, reachable, visited);
	}
	}			}
	}			}

	// Find the Stmt* in a CFGBlock for reporting a warning			/// Find the Stmt* in a CFGBlock for reporting a warning.
	const Stmt *UnreachableCodeChecker::getUnreachableStmt(const CFGBlock *CB) {			/// Might return null.
	for (CFGBlock::const_iterator I = CB->begin(), E = CB->end(); I != E; ++I) {			static const Stmt *getUnreachableStmt(const CFGBlock *CB) {
	if (Optional<CFGStmt> S = I->getAs<CFGStmt>()) {			for (const CFGElement &E : *CB) {
				if (auto S = E.getAs<CFGStmt>())
	if (!isa<DeclStmt>(S->getStmt()))			if (!isa<DeclStmt>(S->getStmt()))
	return S->getStmt();			return S->getStmt();
	}			}
	}			return CB->getTerminatorStmt();
	if (const Stmt *S = CB->getTerminatorStmt())
	return S;
	else
	return nullptr;
	}			}

	// Determines if the path to this CFGBlock contained an element that infers this			// Determines if the path to this CFGBlock contained an element that infers this
	// block is a false positive. We assume that FindUnreachableEntryPoints has			// block is a false positive. We assume that findUnreachableEntryPoints has
	// already marked only the entry points to any dead code, so we need only to			// already marked only the entry points to any dead code, so we need only to
	// find the condition that led to this block (the predecessor of this block.)			// find the condition that led to this block (the predecessor of this block.)
	// There will never be more than one predecessor.			// There will never be more than one predecessor.
	bool UnreachableCodeChecker::isInvalidPath(const CFGBlock *CB,			static bool isInvalidPath(const CFGBlock *CB, const ParentMap &PM) {
	const ParentMap &PM) {
	// We only expect a predecessor size of 0 or 1. If it is >1, then an external			// We only expect a predecessor size of 0 or 1. If it is >1, then an external
	// condition has broken our assumption (for example, a sink being placed by			// condition has broken our assumption (for example, a sink being placed by
	// another check). In these cases, we choose not to report.			// another check). In these cases, we choose not to report.
	if (CB->pred_size() > 1)			if (CB->pred_size() > 1)
	return true;			return true;

	// If there are no predecessors, then this block is trivially unreachable			// If there are no predecessors, then this block is trivially unreachable
	if (CB->pred_size() == 0)			if (CB->pred_size() == 0)
	return false;			return false;

	const CFGBlock *pred = *CB->pred_begin();			auto NonNullPreds = skippingNulls(CB->preds());
	if (!pred)			if (NonNullPreds.empty())
	return false;			return false;

	// Get the predecessor block's terminator condition			// Get the predecessor block's terminator condition
	const Stmt *cond = pred->getTerminatorCondition();			const Stmt *cond = (*NonNullPreds.begin())->getTerminatorCondition();

	//assert(cond && "CFGBlock's predecessor has a terminator condition");			// assert(cond && "CFGBlock's predecessor has a terminator condition");
	// The previous assertion is invalid in some cases (eg do/while). Leaving			// The previous assertion is invalid in some cases (eg do/while). Leaving
	// reporting of these situations on at the moment to help triage these cases.			// reporting of these situations on at the moment to help triage these cases.
	if (!cond)			if (!cond)
	return false;			return false;

	// Run each of the checks on the conditions			// Run each of the checks on the conditions
	return containsMacro(cond) || containsEnum(cond) ||			return containsMacro(cond) || containsEnum(cond) ||
	containsStaticLocal(cond) || containsBuiltinOffsetOf(cond) ||			containsStaticLocal(cond) || containsBuiltinOffsetOf(cond) ||
	containsStmt<UnaryExprOrTypeTraitExpr>(cond);			containsStmt<UnaryExprOrTypeTraitExpr>(cond);
	}			}

	// Returns true if the given CFGBlock is empty			// Returns true if the given CFGBlock is empty
	bool UnreachableCodeChecker::isEmptyCFGBlock(const CFGBlock *CB) {			static bool isEmptyCFGBlock(const CFGBlock *CB) {
	return CB->getLabel() == nullptr // No labels			return CB->getLabel() == nullptr // No labels
	&& CB->size() == 0 // No statements			&& CB->size() == 0 // No statements
	&& !CB->getTerminatorStmt(); // No terminator			&& !CB->getTerminatorStmt(); // No terminator
	}			}

				static bool isBuiltinUnreachable(const CFGBlock *CB, const ASTContext &Ctx) {
				for (const CFGElement &E : *CB) {
				if (const auto S = E.getAs<CFGStmt>())
				if (const auto *CE = dyn_cast<CallExpr>(S->getStmt())) {
				if (CE->getBuiltinCallee() == Builtin::BI__builtin_unreachable ||
				CE->isBuiltinAssumeFalse(Ctx)) {
				return true;
				}
				}
				}
				return false;
				}

				static bool isDoWhileMacro(const Stmt *S, const ParentMap &PM) {
				if (S->getBeginLoc().isMacroID())
				if (const auto *I = dyn_cast<IntegerLiteral>(S))
				if (I->getValue() == 0)
				if (const Stmt *Parent = PM.getParent(S))
				if (isa<DoStmt>(Parent))
				return true;
				return false;
				}

				/// TODO: Doc this...
				static bool shouldIgnoreBlock(const CFGBlock *CB, const ParentMap &PM,
				const ASTContext &Ctx) {
				// Check for false positives.
				if (isInvalidPath(CB, PM))
				return true;

				// It is good practice to always have a "default" label in a "switch", even
				// if we should never get there. It can be used to detect errors, for
				// instance. Unreachable code directly under a "default" label is therefore
				// likely to be a false positive.
				if (const Stmt *L = CB->getLabel())
				if (L->getStmtClass() == Stmt::DefaultStmtClass)
				return true;

				// Special case for __builtin_unreachable.
				// FIXME: This should be extended to include other unreachable markers,
				// such as llvm_unreachable.
				if (isBuiltinUnreachable(CB, Ctx))
				return true;

				return false;
				}

				static const StackFrameContext *getTopFrame(const ExplodedGraph &G) {
				for (const ExplodedNode &N : G.nodes())
				if (N.getLocation().getLocationContext()->inTopFrame())
				return cast<StackFrameContext>(N.getLocationContext());
				llvm_unreachable("The top-level frame should always exist.");
				}

				/// The sink block should be reachable, but all the non-self successor blocks
				/// should be unreachable.
				static bool isSink(const CFGBlock *Block, const CFGBlocksSet &Reachables) {
				if (!Reachables.contains(Block->getBlockID()))
				return false;

				for (const CFGBlock *Succ : skippingNulls(Block->succs()))
				if (Reachables.contains(Succ->getBlockID()))
				return false;

				return true;
				}

				static CFGBlocksSet collectReachableBlocks(const ExplodedGraph &G) {
				CFGBlocksSet Reachable;
				for (const ExplodedNode &N : G.nodes())
				if (N.getLocation().getLocationContext()->inTopFrame())
				if (auto BE = N.getLocationAs<BlockEntrance>())
				Reachable.insert(BE->getBlock()->getBlockID());
				return Reachable;
				}

				static void filterUnreachableBlocksCausedBySinks(
				const std::vector<const CFGBlock *> &Blocks, CFGBlocksSet &Reachables) {
				CFGBlocksSet SuccessorsOfSinks;
				for (const CFGBlock *Block : Blocks) {
				assert(Block);
				if (isSink(Block, Reachables))
				for (const CFGBlock *Succ : skippingNulls(Block->succs()))
				SuccessorsOfSinks.insert(Succ->getBlockID());
				}

				// Mark these blocks as 'reachable' to prevent reporting these.
				Reachables.insert(SuccessorsOfSinks.begin(), SuccessorsOfSinks.end());
				}

				void UnreachableCodeChecker::checkEndAnalysis(ExplodedGraph &G, BugReporter &B,
				ExprEngine &Eng) const {
				if (Eng.hasWorkRemaining())
				return;

				const SourceManager &SM = B.getSourceManager();
				const ASTContext &ACtx = Eng.getContext();
				const StackFrameContext *Frame = getTopFrame(G);
				const ParentMap &PM = Frame->getParentMap();
				const CFG *C = Frame->getUnoptimizedCFG();
				assert(C);

				// Don't do anything for template instantiations. Proving that code
				// in a template instantiation is unreachable means proving that it is
				// unreachable in all instantiations.
				if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(Frame->getDecl()))
				if (FD->isTemplateInstantiation())
				return;

				CFGBlocksSet Visited;
				CFGBlocksSet Reachables = collectReachableBlocks(G);
				Visited.insert(C->getEntry().getBlockID());
				Reachables.insert(C->getEntry().getBlockID());
				Reachables.insert(C->getExit().getBlockID());

				filterUnreachableEntryPoints(&C->getExit(), Reachables, Visited);

				std::vector<const CFGBlock *> ConsideredBlocks;
				ConsideredBlocks.reserve(C->size());
				llvm::append_range(ConsideredBlocks, skippingNulls(*C));
				llvm::erase_if(ConsideredBlocks, isEmptyCFGBlock);

				filterUnreachableBlocksCausedBySinks(ConsideredBlocks, Reachables);

				auto ShouldIgnoreBlock = [&](const CFGBlock *B) {
				return shouldIgnoreBlock(B, PM, ACtx);
				};
				auto IsReachable = [&](const CFGBlock *B) {
				return Reachables.contains(B->getBlockID());
				};

				llvm::erase_if(ConsideredBlocks, ShouldIgnoreBlock);
				llvm::erase_if(ConsideredBlocks, IsReachable);

				for (const CFGBlock *UnreachableBlock : ConsideredBlocks) {
				const Stmt *S = getUnreachableStmt(UnreachableBlock);
				if (!S)
				continue;

				if (isDoWhileMacro(S, PM))
				continue;

				// FIXME: Exceptions and try-catch blocks are modeled by a malformed CFG.
				// Let's suppress these for now.
				// For more details see: #55621.
				if (isa<CXXTryStmt>(S))
				continue;

				SourceRange SR = S->getSourceRange();
				PathDiagnosticLocation DL =
				PathDiagnosticLocation::createBegin(S, SM, Frame);
				SourceLocation SL = DL.asLocation();
				if (SR.isInvalid() || !SL.isValid() || SM.isInSystemHeader(SL))
				continue;

				B.EmitBasicReport(Frame->getDecl(), this, "Unreachable code",
				categories::UnusedCode,
				"This statement is never executed", DL, SR);
				}
				}

	void ento::registerUnreachableCodeChecker(CheckerManager &mgr) {			void ento::registerUnreachableCodeChecker(CheckerManager &mgr) {
	mgr.registerChecker<UnreachableCodeChecker>();			mgr.registerChecker<UnreachableCodeChecker>();
	}			}

	bool ento::shouldRegisterUnreachableCodeChecker(const CheckerManager &mgr) {			bool ento::shouldRegisterUnreachableCodeChecker(const CheckerManager &mgr) {
	return true;			return true;
	}			}
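
The sink handling in filterUnreachableBlocksCausedBySinks can be tried out in isolation. What follows is only a minimal sketch on a toy graph, not the checker's code: ToyCFG, BlockSet, suppressSuccessorsOfSinks and unreachableBlocks are hypothetical names made up for illustration, and std::set stands in for CFGBlocksSet. It assumes, as the patch does, that sinks are collected against the unmodified reachable set first and that their successors are merged in afterwards, so the suppression does not cascade past one level.

```cpp
#include <set>
#include <vector>

// Toy stand-in for a CFG: Succs[B] lists the successor block IDs of block B.
// Hypothetical type for illustration only; not part of Clang.
struct ToyCFG {
  std::vector<std::vector<unsigned>> Succs;
};

using BlockSet = std::set<unsigned>;

// A sink block was reached itself, but none of its successors were.
bool isSink(const ToyCFG &G, unsigned Block, const BlockSet &Reachable) {
  if (!Reachable.count(Block))
    return false;
  for (unsigned Succ : G.Succs[Block])
    if (Reachable.count(Succ))
      return false;
  return true;
}

// Mark the direct successors of every sink as reachable so that they are
// not reported. Sinks are detected before the set is modified.
void suppressSuccessorsOfSinks(const ToyCFG &G, BlockSet &Reachable) {
  BlockSet SuccessorsOfSinks;
  for (unsigned B = 0; B < G.Succs.size(); ++B)
    if (isSink(G, B, Reachable))
      SuccessorsOfSinks.insert(G.Succs[B].begin(), G.Succs[B].end());
  Reachable.insert(SuccessorsOfSinks.begin(), SuccessorsOfSinks.end());
}

// Blocks that would still be candidates for an "Unreachable code" report.
BlockSet unreachableBlocks(const ToyCFG &G, const BlockSet &Reachable) {
  BlockSet Result;
  for (unsigned B = 0; B < G.Succs.size(); ++B)
    if (!Reachable.count(B))
      Result.insert(B);
  return Result;
}
```

As a usage example, take a graph with edges 0->1, 1->{2,3}, 2->4, 3->4, 5->4 and the reached set {0, 1, 4}, where block 1 plays the role of the division-by-zero sink and block 4 the exit. After suppressSuccessorsOfSinks, blocks 2 and 3 count as reachable, and only block 5, which has no reached predecessor, remains a candidate for reporting.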

clang/test/Analysis/unreachable-code-exceptions.cpp

This file was added.

				// RUN: %clang_analyze_cc1 -verify %s -fcxx-exceptions -fexceptions \
				// RUN: -analyzer-checker=core \
				// RUN: -analyzer-checker=alpha.deadcode.UnreachableCode

				// expected-no-diagnostics

				int test(int a) {
				try { // no-warning
				a *= 2;
				} catch (int) {
				return -1; // FIXME: We should mark this dead.
				}
				return a;
				}

clang/test/Analysis/unreachable-code-path.c

[...]	void test10(void) {
goto b; // expected-warning {{never executed}}		goto b; // expected-warning {{never executed}}
goto a; // expected-warning {{never executed}}		goto a; // expected-warning {{never executed}}
b:		b:
i = 1; // no-warning		i = 1; // no-warning
a:		a:
i = 2; // no-warning		i = 2; // no-warning
goto f;		goto f;
e:		e:
goto d;		goto d; // expected-warning {{never executed}}
f: ;		f: ;
}		}

		void test10_chained_unreachable(void) {
		goto end;
		a:
		goto b; // expected-warning {{never executed}}
		b:
		goto c; // expected-warning {{never executed}}
		c:
		goto a; // expected-warning {{never executed}}
		end:;
		}

// test11: we can actually end up in the default case, even if it is not		// test11: we can actually end up in the default case, even if it is not
// obvious: there might be something wrong with the given argument.		// obvious: there might be something wrong with the given argument.
enum foobar { FOO, BAR };		enum foobar { FOO, BAR };
extern void error(void);		extern void error(void);
void test11(enum foobar fb) {		void test11(enum foobar fb) {
switch (fb) {		switch (fb) {
case FOO:		case FOO:
break;		break;
[...]
#define MACRO(C) \		#define MACRO(C) \
if (!C) { \		if (!C) { \
static int x; \		static int x; \
writeSomething(&x); \		writeSomething(&x); \
}		}
void macro2(void) {		void macro2(void) {
MACRO(1);		MACRO(1);
}		}

		int if_else_if_chain(int x) {
		if (x != 0)
		return -1;
		int v1 = 100 / x; // expected-warning {{Division by zero}}

		/// We should not warn for dead code, caused by a sink node.
		int clamped; // no-warning
		if (v1 < 0) // no-warning
		clamped = 0; // no-warning
		else if (v1 > 100) // no-warning
		clamped = 100; // no-warning
		else if (v1 % 2 == 0) // no-warning
		clamped = 0; // no-warning
		else if (v1 % 2 == -1) // no-warning
		clamped = -1; // no-warning
		else // no-warning
		clamped = v1; // no-warning
		return clamped;
		}

		int unreachable_standalone_recursive_block(int a) {
		switch (a) {
		back:
		a += 5; // expected-warning{{never executed}}
		a *= 2; // no-warning
		goto back;
		case 2:
		a *= 10;
		case 3:
		a %= 2;
		}
		return a;
		}

clang/test/Analysis/unreachable-code-path.cpp

This file was added.

				// RUN: %clang_analyze_cc1 -verify %s \
				// RUN: -analyzer-checker=core,deadcode,alpha.deadcode

				// expected-no-diagnostics

				struct NonTrivial {
				~NonTrivial();
				};
				struct NonTrivialPair {
				NonTrivial a, b;
				};
				enum Kind { Kind1 };

				// This code creates a null CFGBlock in the unoptimized CFG.
				// Should not crash.
				void NullCFGBlock(enum Kind k) {
				{ // Block start
				NonTrivialPair a;
				}
				switch (k) {
				case Kind1:;
				}
				}