Differential D92634
[Analyzer] Diagnose signed integer overflow
Authored by danielmarjamaki on Dec 3 2020, 11:43 PM.

Summary:
Diagnose signed integer overflows; the core.UndefResultChecker will warn. This is a bit unfinished, and I wonder if you like the approach. It only checks addition right now, not underflow.
Event Timeline

Comment:
This would be a serious change, especially if we internally already depend on modulo arithmetic. Furthermore, the readability of the bug reports in their current form is not quite enough.

Comment:
I think it could be better to implement this check with a checker on PreStmt<BinaryOperator> and so on. And IMO, checkers have enough functionality to report these problems. Besides, the return value should be the exact value computed from the two integers, or even Unknown, rather than Undefined, as developers may overflow an integer on purpose.

Comment:
Hi, thanks for the patch! I am terribly sorry that we could not pick this up earlier. You can always ping the community with a simple 'Ping' message. Unfortunately, it often happens that a non-trivial patch stalls in the review queue; a ping might push that a bit.

Comment:
+1
Comment:
I am not sure what you mean. If there is undefined behavior, then the value should be undefined and nothing else, right?

Comment:
Exactly, it is undefined behavior in the C++ standard. However, mainstream compilers like GCC and Clang implement this as the overflowed value, and some programmers use this feature to do tricky things. Therefore I suggest the computed value should be "the exact value computed from the two integers". Or it could be the Unknown SVal rather than the Undefined SVal, as the Undefined SVal is used to represent what is read from an uninitialized variable. But I do not favour the Unknown solution, as it could also trigger other problems in the engine, as steakhal has already mentioned. Or maybe it would no longer be a problem if you implement this in a checker: since this is a non-fatal error, you can just leave the overflowed value as it is and report the problem without terminating the symbolic execution on this path. There is no need to report this problem all the time.
Comment:
Hmm, you mean if some -fwrapv flag is used, right? Yes, I should disable this checking then. For information, I am not very worried about signed integer overflow itself. I am mostly worried about the compiler optimisations: if the compiler determines that there is signed integer overflow in an execution path, it can remove all code in that path. I was inspired by this blog post: https://www.airs.com/blog/archives/120

This function:

    int f(int x) {
        return 0x7ffffff0 < x && x + 32 < 0x7fffffff;
    }

might be optimized to:

    int f(int x) {
        return 0;
    }

Comment:
Yes, you are right. Currently, in the CSA, the symbolic execution engine will return true for this function. The condition 0x7ffffff0 < x will constrain x to [2147483632, 2147483647], then x + 32 will be evaluated to [-2147483632, -2147483617], which makes the check return true. But for the example you mentioned, I prefer implementing this check as a coding-style checker for Clang-Tidy, although the integer overflow checker in the CSA is also necessary. BTW, I cannot optimize function f to return zero directly with GCC-10.2.1 or Clang-10.0.1 under -O3. Should I add any other flags? Or is it version specific?

Comment:
This patch adds a new core checker. I wouldn't be comfortable enabling it by default without a much more careful evaluation (than a single lit test). In particular, I'm worried about false positives due to constraint solving issues and our incorrect cast representation; it sounds like both of these may disproportionately affect this checker. I also still wonder about "intentional" use of signed overflows; it's likely that some developers would prefer to silence the check. At least we should have a rough idea about how loud it is. Not necessarily a new checker, but bug reports from this checker are already lacking a lot of information. In this new check both operands should be tracked with visitors. Also, in other checks there's a bailout path that produces generic diagnostics ("the result...
is undefined" without explaining why); we should probably silence those instead, because it's clear that we can't explain what's going on.
Comment:
Yeah, I can't reproduce that with the latest GCC/Clang either. Try GCC 4.x and GCC 5.x.
Yes, if that was all we wanted to detect. I'd rather also flag more complex examples: whenever the optimizer can see that there is overflow and has an opportunity to destroy the code. Here is a more complex example: https://stackoverflow.com/questions/57650026/optimized-clang-handling-overflow
I agree 1000%. The patch I provided so far was only a proof of concept. Thanks for all the comments; I feel I will remake this.
Yes, I really agree about that. The current warning message is unacceptable and must be clarified. Sorry, I am not making much progress right now; I hope to have some new code to show soon.

Comment:
I disagree on this.
I don't think we should depend on implementation details. I also agree with @NoQ's D92634#2478703 comment.
Comment:
Well, maybe it's better that I stop programming for now, take the code I have, and test it out.

Comment:
I have run the Clang static analyzer on random open source projects. The very first finding that I looked at seems (to me) to be a false positive. :-( My code seems to think that the variable print_count has the value INT_MAX for some reason, and to me that seems impossible. I'll investigate this. Analyzed package:

    libcerror_error.c:426:45: warning: The result of the '+' expression is undefined [core.UndefinedBinaryOperatorResult]
            error_string_size = (size_t) print_count + 1;
Comment:
Typically in such cases, bug visitors should be added/improved until it is clear from the user-facing report why the analyzer thinks so. They'd highlight the important events, prevent path pruning, and potentially suppress reports if the reason is discovered to not be valid.

Comment:
Here is a link to our results on a few more projects. It might be useful for you.
Comment:
Thanks! But in this case I do not think that my code works well. I have reduced the test case:

    #include <stdio.h>

    #define LIBCERROR_MESSAGE_INCREMENT_SIZE 64
    #define LIBCERROR_MESSAGE_MAXIMUM_SIZE 4096

    int get_print_count();

    void foo()
    {
        size_t message_size = 0;
        size_t next_message_size = LIBCERROR_MESSAGE_INCREMENT_SIZE;
        int print_count = 0;

        while (1) {
            if (next_message_size >= LIBCERROR_MESSAGE_MAXIMUM_SIZE) {
                next_message_size = LIBCERROR_MESSAGE_MAXIMUM_SIZE;
            }
            message_size = next_message_size;
            print_count = get_print_count();
            if (print_count <= -1) {
                next_message_size += LIBCERROR_MESSAGE_INCREMENT_SIZE; // <- Assigned value is garbage or undefined
            }
            else if ((size_t) print_count >= message_size) {
                next_message_size = (size_t) (print_count + 1);
                print_count = -1;
            }
            else {
                int error_string_size = (size_t) print_count + 1; // <- The result of the '+' expression is undefined
            }
            if (message_size >= LIBCERROR_MESSAGE_MAXIMUM_SIZE) {
                break;
            }
        }
    }

I guess it's best that I rewrite my logic in a new check. But I have the feeling it could take a couple of days before I have something new.
Inline comment:
Bldr is always equal to *this here; there's no need to obtain it separately.