This is an archive of the discontinued LLVM Phabricator instance.

Thank you for your work, Zoltán!
Did you check whether the same warnings may be emitted by other checkers? For example, ArrayBoundChecker may warn if the index is tainted.
I also have some comments below.

lib/StaticAnalyzer/Checkers/DirtyScalarChecker.cpp
48	`CheckerContext &C` to unify naming.
52	The default choice of such a value needs some explanation. It is also good to move a hard-coded value to a named constant, or, maybe a separate checker option.
53	`ProgramStateRef State, const Stmt *S)`. These names are usually used in analyzer and LLVM for ProgramState and Stmt, correspondingly.
55	We pass `V` by non-const reference, but it is not changed inside the function. So, we may use a constant reference here or even pass it by value (because it is small enough).
63	`CallEvent::getOriginExpr()` may return `nullptr` (in case of implicit destructor calls, for example) so we need to check the result before `dyn_cast` or dereference.
83	Variable names should start with a capital letter to follow the LLVM naming convention. We may move `CE->getNumArgs()` out of the loop condition so that it is not re-evaluated on every iteration; the code would look like `for (unsigned int I = 0, E = CE->getNumArgs(); I < E; ++I) {`. We can also C++11-fy this loop: `for (const Expr *Arg : CE->arguments()) checkUnbounded(C, State, Arg);`
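
A minimal standalone sketch of two of the points above: checking for null before the cast, and the range-based loop. `Expr`, `CallExpr` and `visitArgs` here are hypothetical stand-ins, not the Clang classes, and the standard `dynamic_cast` plays the role of `dyn_cast`. Note that the cast result needs its own check, since the origin expression may be non-null and still not be a call expression.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the AST node types; not the real Clang classes.
struct Expr {
  virtual ~Expr() = default;
};
struct CallExpr : Expr {
  std::vector<int> Args;
};

// Returns the number of arguments visited, or 0 if there is nothing to check.
std::size_t visitArgs(const Expr *Origin) {
  if (!Origin) // e.g. an implicit call that has no origin expression
    return 0;
  const auto *CE = dynamic_cast<const CallExpr *>(Origin);
  if (!CE) // an expression, but not a call expression
    return 0;
  std::size_t Visited = 0;
  for (int Arg : CE->Args) { // range-based form of the index loop
    (void)Arg;
    ++Visited;
  }
  assert(Visited == CE->Args.size());
  return Visited;
}
```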
test/Analysis/dirty-scalar-unbounded.cpp
2	It would be good to have tests with the option set to true. Is there any test that makes use of the `RecurseLimit` variable?

Did you check whether the same warnings may be emitted by other checkers? For example,
ArrayBoundChecker may warn if the index is tainted.

I second that. The GenericTaintChecker also reports uses of tainted values. It is not clear that we should add a new separate checker instead of adding the missing cases to the existing checkers.

Thank you!
Anna.

gerazo updated this revision to Diff 82336. Dec 22 2016, 5:42 AM

gerazo edited edge metadata.

Thank you very much for your help. I've added all suggested modifications including tests covering all checker option settings.

So thank you again for the valuable questions.
In this checker, I give warnings for values which are both tainted and were also not checked by the programmer. So unlike GenericTaintChecker, I do implement the boundedness check here for certain, interesting constructs (which is controlled by the critical option). GenericTaintChecker focuses purely on taintedness, almost like a service for other checkers. I've added a new rule to it, improving the taintedness logic, but I felt mixing the bound checking logic into it would make the two ideas inseparable.

I've also checked the other checkers that use tainted values. ArrayBoundCheckerV2 also works with array indexing and modifies its behaviour on finding tainted values. However, the main difference is that ArrayBoundCheckerV2 uses region information to check bounds, which may or may not be present, while this checker can give a warning whenever the constraint manager holds no information proving that any bound checking was performed on the value beforehand (and it potentially works on many other constructs where region information cannot be expected at all). Missing region information for tainted values is typical when reading data from unknown sources, so this dirty approach is better in this regard. ArrayBoundCheckerV2, on the other hand, can give warnings solely based on region information. Yes, it can happen that both give a warning for the same construct. Do you think this is distracting? Should I remove the array indexing checks from this checker (it still gives warnings for pointer arithmetic as well)?
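
To make the "dirty" notion concrete, here is a minimal standalone sketch, assuming the path constraints on a value can be summarised as one inclusive range. `KnownRange` is a hypothetical stand-in for what the constraint manager knows; the real checker asks the analyzer whether the value can still equal the type's extreme values instead. The sketch only covers types narrower than `long long`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical summary of the path constraints on a symbol: the inclusive
// range of values it may still take. Not an analyzer data structure.
struct KnownRange {
  long long Lo;
  long long Hi;
};

// A value counts as unbounded if the constraints still allow the maximum of
// its type or, for signed types, the minimum of its type.
template <typename T> bool isUnbounded(KnownRange R) {
  assert(R.Lo <= R.Hi && "empty range");
  using Lim = std::numeric_limits<T>;
  if (R.Hi >= static_cast<long long>(Lim::max()))
    return true; // no upper-bound check is visible on this path
  if (Lim::is_signed && R.Lo <= static_cast<long long>(Lim::min()))
    return true; // no lower-bound check is visible on this path
  return false;
}
```

Under this model, a guard such as `if (size < 0 || size > 255) return -1;` narrows an `int` to the range [0, 255], so it is no longer reported, while a signed value with only an upper-bound check stays dirty.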

test/Analysis/dirty-scalar-unbounded.cpp
2	Now both settings are covered. For RecurseLimit, I've added a named constant and better explanation. This is only a practical limit to not let the analysis dive too deep into complex loop expressions. The limit currently set should be adequate, so it would not make sense to set it programmatically.
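
The effect of such a depth limit can be sketched on a toy condition tree. `Cond`, `leaf`, `both` and `countCheckedComparisons` are hypothetical names for illustration and are not part of the patch; only the shape of the recursion (stop at zero, descend through logical operators, inspect comparisons) follows the description above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical miniature condition tree; not the Clang AST.
struct Cond {
  enum Kind { Logical, Comparison, Other } K;
  std::unique_ptr<Cond> LHS, RHS; // only set for Logical nodes
};

std::unique_ptr<Cond> leaf() {
  auto P = std::make_unique<Cond>();
  P->K = Cond::Comparison;
  return P;
}

std::unique_ptr<Cond> both(std::unique_ptr<Cond> L, std::unique_ptr<Cond> R) {
  auto P = std::make_unique<Cond>();
  P->K = Cond::Logical;
  P->LHS = std::move(L);
  P->RHS = std::move(R);
  return P;
}

// Counts the comparisons that are reached before the depth budget runs out.
int countCheckedComparisons(const Cond *C, int RecurseLimit = 3) {
  assert(RecurseLimit >= 0);
  if (!C || RecurseLimit == 0)
    return 0;
  if (C->K == Cond::Logical)
    return countCheckedComparisons(C->LHS.get(), RecurseLimit - 1) +
           countCheckedComparisons(C->RHS.get(), RecurseLimit - 1);
  return C->K == Cond::Comparison ? 1 : 0;
}
```

With a limit of 3, a condition like `a < n && b < n` is inspected in full, while the innermost operands of `((a && b) && c) && d` are skipped.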

gerazo marked an inline comment as done. Dec 22 2016, 6:23 AM

Hi, did you have time to check my changes?

Hello Zoltan. Sorry, I'm a bit busy now. Here are my thoughts about the design.

I think we should not add new warnings to GenericTaintChecker. Instead, it is better to move the warnings out of it. After that, it will become just a plugin used by other checkers. Such a split will make it cleaner because we avoid mixing taint propagation logic and checks. It is not an item for this patch, of course. This new checker follows my preferences in this respect.
For me, the taint-related checks for the array index in this checker and in ArrayBoundCheckerV2 look almost the same if the offset is known (with just a single difference). However, in the case of a taint check we can be less conservative, so the approach used in this checker, where a warning is emitted even if the base region or the offset is unknown, looks better to me. I think we should remove the taint check from ArrayBoundV2 and leave it in this new checker. But this causes a situation where an out-of-bounds check is not done in the checker intended for it. A possible solution is to move the array-related logic of DirtyScalar to a separate checker that is enabled when either ArrayBound or DirtyScalar is enabled. Zoltán, could you confirm that your checker emits the same warnings on the ArrayBoundChecker test cases with taint?

Anna, what's your opinion? Did I miss something?

lib/StaticAnalyzer/Checkers/DirtyScalarChecker.cpp
184	Does the second check mean that we exclude boolean and char values? I cannot find any reason to do that for chars.

In this checker, I give warnings for values which are both tainted and were also not checked by the programmer. So unlike GenericTaintChecker, I do implement the boundedness check here for certain, interesting constructs (which is controlled by the critical option). GenericTaintChecker focuses purely on taintedness, almost like a service for other checkers. I've added a new rule to it, improving the taintedness logic, but I felt mixing the bound checking logic into it would make the two ideas inseparable.

I'd like to step back a bit. In my view, the taint implementation should consist of three elements:

1. taint source
2. taint sink
3. cleansing rules

I always considered the taint analysis in the analyzer not fully implemented because #3 was missing. It sounds a lot like non-"dirty" scalars would be the same as values that went through cleansing - they should be considered not tainted anymore. Now, are cleansing rules checker specific or generic? If they are generic, these rules should definitely be part of GenericTaintChecker and every checker using taint would utilize them.

WDYT?

(We can talk about the array bound checking part separately.)

gerazo marked an inline comment as done. Feb 28 2017, 5:50 AM

gerazo added inline comments.

lib/StaticAnalyzer/Checkers/DirtyScalarChecker.cpp
184	Yes, we exclude them. Using lookup tables, especially in cryptography, sometimes involves reading a value from disk and then using this value immediately in a table lookup. This way, a dirty value is used directly in array indexing. Reading a byte and using it on a prepared 256-element table is common. As the value read gets wider, this becomes less performant and hence less common. I found exactly 1 false positive in openssl without this exclusion.
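
A minimal sketch of the byte-indexed table pattern described above. `makeTable` and `substitute` are hypothetical names and the table contents are arbitrary. The sketch uses an unsigned byte; a plain `char` may be signed on some platforms, and that case is not covered here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-entry substitution table; the contents are arbitrary.
std::array<std::uint8_t, 256> makeTable() {
  std::array<std::uint8_t, 256> T{};
  for (unsigned I = 0; I < T.size(); ++I)
    T[I] = static_cast<std::uint8_t>(255 - I);
  return T;
}

// The untrusted byte is used as an index without any explicit bound check.
// This is safe only because every value of an unsigned 8-bit type is a valid
// index into a 256-element table.
std::uint8_t substitute(const std::array<std::uint8_t, 256> &Table,
                        std::uint8_t Untrusted) {
  assert(static_cast<std::size_t>(Untrusted) < Table.size());
  return Table[Untrusted];
}
```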

Hmm... I have been thinking about this issue for a week now...

I've played with the idea of implementing cleansing rules in GenericTaintChecker. It would be elegant, but unfortunately I have come to think that they are not general. Cleansing a string (because it has no terminating null at the end) is very different from cleansing an integral type. A signed value has to be lower-bound checked as well, while an unsigned one only needs an upper-bound check. It also turns out that the type itself cannot give enough information about the cleansing rule needed. A number used for array indexing can be properly bound checked against the region extents, while a plain number can only be checked in the sense that some upper-bound check was done on it at all... All this leads to the need for several kinds of taintedness (string taintedness, array taintedness, general bound check taintedness), because cleansing can only remove one kind of taintedness at a time. That would be the only way for a checker to find out whether the kind of taintedness specific to that checker is still there or was already cleansed by the central logic. For me, this shows that cleansing rules belong to specific checkers and cannot be efficiently generalized, even for a single int type.
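
The idea of several kinds of taintedness, each cleansed separately, can be sketched as a small bit set. All names here (`TaintKind`, `TaintedValue`, `taintInteger`, `cleanse`, `isStillTainted`) are hypothetical and do not correspond to anything in the analyzer; this is only an illustration of the bookkeeping such a design would need.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical taint kinds, one bit each; not part of the analyzer.
enum TaintKind : std::uint8_t {
  TK_None = 0,
  TK_UpperBound = 1 << 0,       // no upper-bound check seen yet
  TK_LowerBound = 1 << 1,       // no lower-bound check seen yet (signed only)
  TK_StringTerminator = 1 << 2, // string may lack a terminating null
};

struct TaintedValue {
  std::uint8_t Kinds;
};

// A freshly tainted integer needs an upper-bound check; a signed one also
// needs a lower-bound check.
TaintedValue taintInteger(bool IsSigned) {
  std::uint8_t K = TK_UpperBound;
  if (IsSigned)
    K |= TK_LowerBound;
  return {K};
}

// Cleansing removes exactly one kind of taintedness at a time.
TaintedValue cleanse(TaintedValue V, TaintKind Kind) {
  assert(Kind != TK_None);
  V.Kinds = static_cast<std::uint8_t>(V.Kinds & ~Kind);
  return V;
}

// A checker asks only about the kind it cares about.
bool isStillTainted(TaintedValue V, TaintKind Kind) {
  return (V.Kinds & Kind) != 0;
}
```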

About ArrayBoundCheckerV2: I agree that no redundant checks should be done system-wide. I would also either extend ArrayBoundCheckerV2 or put the array checking into a separate checker. Currently that checker and the one implemented here do not give the same results. ArrayBoundCheckerV2 knows more about the array but is not willing to give a warning without region information for the array. The easiest way would be to use one checker's code from the other and find out whether the other is active and would already give a warning anyway... but I understand that this is against current architectural policies.

All this leads to the need for several kinds of taintedness (string taintedness, array taintedness, general bound check taintedness), because cleansing can only remove one kind of taintedness at a time. That would be the only way for a checker to find out whether the kind of taintedness specific to that checker is still there or was already cleansed by the central logic. For me, this shows that cleansing rules belong to specific checkers and cannot be efficiently generalized, even for a single int type.

I do not see why we cannot have type-dependent rules in the general taint checker. Checker-specific taint rules could be added to the checkers on top of the generic rules. Though, I do not have an example of this off the top of my head. Specifically, I do not think we can issue a warning every time we are not 100% sure that an index into an array is not in bounds with respect to the region extents. It would trigger for cases when the code compares against a statically unknown value and lead to too many false positives.

Stepping back a bit, what do you consider "dirty" vs "clean"? It seems that you are looking for proof that the values are known to be within the bounds of the min and max int values. What happens if there is a comparison to an unknown symbolic value? Should that be considered clean or tainted? Are there test cases for this?

Stepping back a bit, what do you consider "dirty" vs "clean"? It seems that you are looking for proof that the values are known to be within the bounds of the min and max int values. What happens if there is a comparison to an unknown symbolic value? Should that be considered clean or tainted? Are there test cases for this?

I consider values clean when they were checked by the programmer from both sides. However, my implementation works purely from the constraints in effect (and using min and max is just the broadest constraint I could find). So you are totally right that comparison with unknown symbols is covered neither in the implementation nor in the tests. Can you suggest a universally working method which can handle all cases (e.g. complex expressions on both sides of the operator)? If we could find such an approach, that would be something which could really go into the GenericTaintChecker as an improvement. And I would gladly rewrite this whole thing to fit the more general solution.
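
For reference, the uncovered case might look like the following sketch: the untrusted index is compared against a limit that is itself only known at run time. `readChecked` and its parameters are hypothetical. Whether such a comparison should count as cleansing is exactly the open question above, and the sketch takes no position on it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The index is checked, but only against a bound that is not known
// statically. Returns -1 when the index is rejected.
int readChecked(const std::vector<int> &Data, std::size_t UntrustedIdx,
                std::size_t Limit) {
  assert(Limit <= Data.size() && "caller must pass a limit within the data");
  if (UntrustedIdx >= Limit) // comparison against a symbolic value
    return -1;
  return Data[UntrustedIdx];
}
```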

Before abandoning this patch and rewriting it, I would like to get a thumbs up for my plans: I will reimplement all the functionality included here, but without creating a new checker. The parts which relate to specific checkers will be put into the corresponding checkers (like ArrayBoundCheckerV2). The general ideas on taintedness (cleansing rules and usage warnings on standard types) will be put into GenericTaintChecker. We will see how it goes and whether we end up with a smaller patch. WDYT?

dkrupp added a subscriber: dkrupp. Jun 14 2017, 7:03 AM

This generally sounds good. Definitely do submit these changes in small pieces! See http://llvm.org/docs/DeveloperPolicy.html#incremental-development for rationale.

zaks.anna resigned from this revision. Aug 27 2017, 1:24 AM

gerazo abandoned this revision. Aug 28 2017, 2:31 AM

Revision Contents

Path	Size
include/clang/StaticAnalyzer/Checkers/Checkers.td	4 lines
lib/StaticAnalyzer/Checkers/CMakeLists.txt	1 line
lib/StaticAnalyzer/Checkers/DirtyScalarChecker.cpp	231 lines
lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp	1 line
test/Analysis/dirty-scalar-unbounded.cpp	92 lines

Diff 82336

include/clang/StaticAnalyzer/Checkers/Checkers.td

 def ReturnPointerRangeChecker : Checker<"ReturnPtrRange">,
   HelpText<"Check for an out-of-bound pointer being returned to callers">,
   DescFile<"ReturnPointerRangeChecker.cpp">;

 def MallocOverflowSecurityChecker : Checker<"MallocOverflow">,
   HelpText<"Check for overflows in the arguments to malloc()">,
   DescFile<"MallocOverflowSecurityChecker.cpp">;

+def DirtyScalarChecker : Checker<"DirtyScalar">,
+  HelpText<"Warn on using tainted integers without proper bound check">,
+  DescFile<"DirtyScalarChecker.cpp">;
+
 } // end "alpha.security"

 //===----------------------------------------------------------------------===//
 // Taint checkers.
 //===----------------------------------------------------------------------===//

 let ParentPackage = Taint in {

lib/StaticAnalyzer/Checkers/CMakeLists.txt

 add_clang_library(clangStaticAnalyzerCheckers
 [...]
   ClangCheckers.cpp
   CloneChecker.cpp
   ConversionChecker.cpp
   CXXSelfAssignmentChecker.cpp
   DeadStoresChecker.cpp
   DebugCheckers.cpp
   DereferenceChecker.cpp
   DirectIvarAssignment.cpp
+  DirtyScalarChecker.cpp
   DivZeroChecker.cpp
   DynamicTypePropagation.cpp
   DynamicTypeChecker.cpp
   ExprInspectionChecker.cpp
   FixedAddressChecker.cpp
   GenericTaintChecker.cpp
   GTestChecker.cpp
   IdenticalExprChecker.cpp

lib/StaticAnalyzer/Checkers/DirtyScalarChecker.cpp

This file was added.

//===-- DirtyScalarChecker.cpp ------------------------------------- C++ ---//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Reports the usage of dirty integers in code.
// A dirty value is tainted and wasn't bound checked properly by the programmer.
// By default (criticalOnly == true) reports dirty usage in
// - memcpy, malloc, calloc, strcpy, strncpy, memmove functions
// - array indexing
// - memory allocation with new
// - pointer arithmetic
// otherwise (criticalOnly == false) it also reports usage as
// - function argument
// - loop condition
//
//===----------------------------------------------------------------------===//

#include "ClangSACheckers.h"
#include "clang/AST/ExprCXX.h"
#include "clang/AST/ParentMap.h"
#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"

using namespace clang;
using namespace ento;

namespace {

class DirtyScalarChecker
    : public Checker<check::PreCall, check::PreStmt<ArraySubscriptExpr>,
                     check::PostStmt<CXXNewExpr>,
                     check::PreStmt<BinaryOperator>, check::BranchCondition> {
public:
  // Typical loop conditions worth checking are not deeper than this limit
  static const int LogicalOpCheckDepth = 3;

  DefaultBool IsCriticalOnly;
  mutable std::unique_ptr<BugType> UnboundedBugType;

  void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
  void checkPreStmt(const ArraySubscriptExpr *ASE, CheckerContext &C) const;
  void checkPostStmt(const CXXNewExpr *NE, CheckerContext &C) const;
  void checkPreStmt(const BinaryOperator *BO, CheckerContext &C) const;
  void checkBranchCondition(const Stmt *Cond, CheckerContext &C) const;

private:
  void checkLoopCond(const Stmt *Cond, CheckerContext &C,
                     int RecurseLimit = LogicalOpCheckDepth) const;
  void checkUnbounded(CheckerContext &C, ProgramStateRef State,
                      const Stmt *S) const;
  bool isUnbounded(CheckerContext &C, ProgramStateRef State, SVal V) const;
  void reportBug(CheckerContext &C, ProgramStateRef State, SVal V) const;
};

} // end of anonymous namespace

void DirtyScalarChecker::checkPreCall(const CallEvent &Call,
                                      CheckerContext &C) const {
  const Expr *E = Call.getOriginExpr();
  if (!E)
    return;
  const CallExpr *CE = dyn_cast<CallExpr>(E);
  const FunctionDecl *FDecl = C.getCalleeDecl(CE);
  if (!FDecl || FDecl->getKind() != Decl::Function)
    return;
  StringRef FName = C.getCalleeName(FDecl);
  if (FName.empty())
    return;
  if (IsCriticalOnly) {
    bool AllowedFunc = llvm::StringSwitch<bool>(FName)
                           .Case("memcpy", true)
                           .Case("malloc", true)
                           .Case("calloc", true)
                           .Case("strcpy", true)
                           .Case("strncpy", true)
                           .Case("memmove", true)
                           .Default(false);
    if (!AllowedFunc)
      return;
  }
  ProgramStateRef State = C.getState();
  for (unsigned int I = 0, E = CE->getNumArgs(); I < E; ++I) {
    const Expr *Arg = CE->getArg(I);
    checkUnbounded(C, State, Arg);
  }
}

void DirtyScalarChecker::checkPreStmt(const ArraySubscriptExpr *ASE,
                                      CheckerContext &C) const {
  checkUnbounded(C, C.getState(), ASE->getIdx());
}

void DirtyScalarChecker::checkPostStmt(const CXXNewExpr *NE,
                                       CheckerContext &C) const {
  if (!NE->isArray())
    return;
  checkUnbounded(C, C.getState(), NE->getArraySize());
}

void DirtyScalarChecker::checkPreStmt(const BinaryOperator *BO,
                                      CheckerContext &C) const {
  if (!BO->isAdditiveOp())
    return;
  Expr *LHS = BO->getLHS();
  Expr *RHS = BO->getRHS();
  if (RHS->getType()->isPointerType()) {
    std::swap(LHS, RHS);
  }
  if (!LHS->getType()->isPointerType() || !RHS->getType()->isIntegerType())
    return;
  checkUnbounded(C, C.getState(), RHS);
}

// We want to check the whole loop condition so we catch the direct descendant
// statement of the loop only.
void DirtyScalarChecker::checkBranchCondition(const Stmt *Cond,
                                              CheckerContext &C) const {
  if (IsCriticalOnly)
    return;
  ParentMap &PM = C.getLocationContext()->getParentMap();
  const Stmt *P = PM.getParentIgnoreParenCasts(Cond);
  if (!P || (!isa<ForStmt>(P) && !isa<WhileStmt>(P) && !isa<DoStmt>(P)))
    return;
  checkLoopCond(Cond, C);
}

// The heuristic implemented here tries to get conditions where the
// loop variable will be run in relation to an unbounded and tainted value.
void DirtyScalarChecker::checkLoopCond(const Stmt *Cond, CheckerContext &C,
                                       int RecurseLimit) const {
  if (RecurseLimit == 0)
    return;
  const BinaryOperator *BO = dyn_cast<BinaryOperator>(Cond);
  if (!BO)
    return;
  if (BO->isLogicalOp()) {
    Expr *LHS = BO->getLHS();
    Expr *RHS = BO->getRHS();
    checkLoopCond(LHS, C, RecurseLimit - 1);
    checkLoopCond(RHS, C, RecurseLimit - 1);
    return;
  }
  if (!BO->isComparisonOp())
    return;
  Expr *LHS = BO->getLHS();
  Expr *RHS = BO->getRHS();
  checkUnbounded(C, C.getState(), LHS);
  checkUnbounded(C, C.getState(), RHS);
}

void DirtyScalarChecker::checkUnbounded(CheckerContext &C,
                                        ProgramStateRef State,
                                        const Stmt *S) const {
  SVal Val = C.getSVal(S);
  if (Val.isUndef() || !State->isTainted(Val))
    return;
  if (isUnbounded(C, State, Val))
    reportBug(C, State, Val);
}

// We make here an indirect query on in-place constraints of V.
// If it can be assumed that a value cannot be the highest value of that type
// it surely has an upper bound. The same is true for lower bounds in case of
// signed types.
bool DirtyScalarChecker::isUnbounded(CheckerContext &C, ProgramStateRef State,
                                     SVal V) const {
  const int TooNarrowForBoundCheck = 8;

  SValBuilder &SVB = C.getSValBuilder();
  ASTContext &Ctx = SVB.getContext();
  const SymExpr *SE = V.getAsSymExpr();
  if (!SE)
    return false;
  QualType Ty = SE->getType();
  if (Ty.isNull())
    Ty = Ctx.IntTy;
  if (!Ty->isIntegerType() || Ctx.getIntWidth(Ty) <= TooNarrowForBoundCheck)
    return false;

  BasicValueFactory &BVF = SVB.getBasicValueFactory();
  nonloc::ConcreteInt Max(BVF.getMaxValue(Ty));
  SVal LTCond =
      SVB.evalBinOpNN(State, BO_LT, V.castAs<NonLoc>(), Max, Ctx.IntTy);
  if (LTCond.isUnknownOrUndef())
    return false;
  ProgramStateRef StateNotLess =
      State->assume(LTCond.castAs<DefinedSVal>(), false);
  if (StateNotLess)
    return true;

  if (Ty->isSignedIntegerType()) {
    nonloc::ConcreteInt Min(BVF.getMinValue(Ty));
    SVal GTCond =
        SVB.evalBinOpNN(State, BO_GT, V.castAs<NonLoc>(), Min, Ctx.IntTy);
    if (GTCond.isUnknownOrUndef())
      return false;
    ProgramStateRef StateNotGreater =
        State->assume(GTCond.castAs<DefinedSVal>(), false);
    if (StateNotGreater)
      return true;
  }

  return false;
}

void DirtyScalarChecker::reportBug(CheckerContext &C, ProgramStateRef State,
                                   SVal V) const {
  ExplodedNode *EN = C.generateNonFatalErrorNode(State);
  if (!UnboundedBugType)
    UnboundedBugType.reset(new BugType(this, "Unchecked tainted variable usage",
                                       "Insecure usage"));
  auto BR = llvm::make_unique<BugReport>(
      *UnboundedBugType,
      "Tainted variable is used without proper bound checking", EN);
  BR->markInteresting(C.getLocationContext());
  BR->markInteresting(V);
  C.emitReport(std::move(BR));
}

void ento::registerDirtyScalarChecker(CheckerManager &mgr) {
  DirtyScalarChecker *checker = mgr.registerChecker<DirtyScalarChecker>();
  checker->IsCriticalOnly =
      mgr.getAnalyzerOptions().getBooleanOption("criticalOnly", true, checker);
}

lib/StaticAnalyzer/Checkers/GenericTaintChecker.cpp

   TaintPropagationRule Rule = llvm::StringSwitch<TaintPropagationRule>(Name)
 [...]
     .Case("strrchr", TaintPropagationRule(0, ReturnValueIndex))
     .Case("read", TaintPropagationRule(0, 2, 1, true))
     .Case("pread", TaintPropagationRule(InvalidArgIndex, 1, true))
     .Case("gets", TaintPropagationRule(InvalidArgIndex, 0, true))
     .Case("fgets", TaintPropagationRule(2, 0, true))
     .Case("getline", TaintPropagationRule(2, 0))
     .Case("getdelim", TaintPropagationRule(3, 0))
     .Case("fgetln", TaintPropagationRule(0, ReturnValueIndex))
+    .Case("recv", TaintPropagationRule(InvalidArgIndex, 1, true))
     .Default(TaintPropagationRule());

   if (!Rule.isNull())
     return Rule;

   // Check if it's one of the memory setting/copying functions.
   // This check is specialized but faster then calling isCLibraryFunction.
   unsigned BId = 0;

test/Analysis/dirty-scalar-unbounded.cpp

This file was added.

// RUN: %clang_cc1 -analyze -analyzer-checker=alpha.security.taint,alpha.security.DirtyScalar -verify -analyzer-config alpha.security.DirtyScalar:criticalOnly=true %s
// RUN: %clang_cc1 -analyze -analyzer-checker=alpha.security.taint,alpha.security.DirtyScalar -verify -analyzer-config alpha.security.DirtyScalar:criticalOnly=false -DDIRTYSCALARSTRICT=1 %s

#include "Inputs/system-header-simulator.h"

typedef long ssize_t;

ssize_t recv(int s, void *buf, size_t len, int flags);

void gets_tainted_ival(int val) {
  (void)val;
}

void gets_tainted_uval(unsigned int val) {
  (void)val;
}

int tainted_usage() {
  int size;
  scanf("%d", &size);
  char *buff = new char[size]; // expected-warning{{Tainted variable is used without proper bound checking}}
  for (int i = 0; i < size; ++i) {
#if DIRTYSCALARSTRICT
    // expected-warning@-2{{Tainted variable is used without proper bound checking}}
#endif
    scanf("%d", &buff[i]);
  }
  buff[size - 1] = 0; // expected-warning{{Tainted variable is used without proper bound checking}}
  *(buff + size - 2) = 0; // expected-warning{{Tainted variable is used without proper bound checking}}
  gets_tainted_ival(size);
#if DIRTYSCALARSTRICT
  // expected-warning@-2{{Tainted variable is used without proper bound checking}}
#endif

  return 0;
}

int tainted_usage_checked() {
  int size;
  scanf("%d", &size);
  if (size < 0 || size > 255)
    return -1;
  char *buff = new char[size]; // no warning
  for (int i = 0; i < size; ++i) { // no warning
    scanf("%d", &buff[i]); // no warning
  }
  buff[size - 1] = 0; // no warning
  *(buff + size - 2) = 0; // no warning
  gets_tainted_ival(size); // no warning

  unsigned int idx;
  scanf("%d", &idx);
  if (idx > 255)
    return -1;
  gets_tainted_uval(idx); // no warning

  return 0;
}

int detect_tainted(char const **messages) {
  int sock, index;
  scanf("%d", &sock);
  if (recv(sock, &index, sizeof(index), 0) != sizeof(index)) {
#if DIRTYSCALARSTRICT
    // expected-warning@-2{{Tainted variable is used without proper bound checking}}
#endif
    return -1;
  }
  int index2 = index;
  printf("%s\n", messages[index]); // expected-warning{{Tainted variable is used without proper bound checking}}
  printf("%s\n", messages[index2]); // expected-warning{{Tainted variable is used without proper bound checking}}

  return 0;
}

int skip_sizes_likely_used_for_table_access(char const **messages) {
  int sock;
  char byte;

  scanf("%d", &sock);
  if (recv(sock, &byte, sizeof(byte), 0) != sizeof(byte)) {
#if DIRTYSCALARSTRICT
    // expected-warning@-2{{Tainted variable is used without proper bound checking}}
#endif
    return -1;
  }
  char byte2 = byte;
  printf("%s\n", messages[byte]); // no warning
  printf("%s\n", messages[byte2]); // no warning

  return 0;
}

[analyzer] alpha.security.DirtyScalar Checker (Abandoned, Public)
