This is an archive of the discontinued LLVM Phabricator instance.

Differential D80117

[analyzer] Introduce reasoning about symbolic remainder operator
Closed · Public

Authored by vsavchenko on May 18 2020, 4:20 AM.

Details

Reviewers

NoQ
dcoughlin
xazax.hun
ASDenysPetrov

Commits

rG73c120a9895a: [analyzer] Introduce reasoning about symbolic remainder operator

Summary

New logic tries to narrow possible result values of the remainder operation
based on its operands and their ranges. It also tries to be conservative
with negative operands because according to the standard the sign of
the result is implementation-defined.
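
To make the narrowing concrete, here is a minimal sketch of the idea on plain 64-bit integers. It is not the patch itself: `IntRange`, `inferRemainder` and `coversAllRemainders` are hypothetical names, and the sketch assumes that the divisor range is not the single value 0 and that no bound equals `INT64_MIN`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical closed interval [From, To]; not the analyzer's Range class.
struct IntRange {
  std::int64_t From;
  std::int64_t To;
};

// Sketch: over-approximate the values of (L % R) for signed operands.
// Assumes R is not the single value 0 and no bound equals INT64_MIN.
inline IntRange inferRemainder(IntRange L, IntRange R) {
  // The magnitude of the result is strictly below the largest |R|.
  std::int64_t AbsMax = std::max(-R.From, R.To);
  std::int64_t Min = -AbsMax + 1;
  std::int64_t Max = AbsMax - 1;

  if (L.From >= 0 && R.From >= 0) {
    // Non-negative operands: the result cannot exceed the dividend.
    Max = std::min(L.To, Max);
    // If every L is below every R, then L / R == 0 and L % R == L.
    Min = L.To < R.From ? L.From : 0;
  }
  return {Min, Max};
}

// Brute-force check that every concrete remainder lies in the inferred range.
inline bool coversAllRemainders(IntRange L, IntRange R) {
  IntRange Result = inferRemainder(L, R);
  for (std::int64_t A = L.From; A <= L.To; ++A)
    for (std::int64_t B = R.From; B <= R.To; ++B)
      if (B != 0 && (A % B < Result.From || A % B > Result.To))
        return false;
  return true;
}
```

For example, `inferRemainder({4, 4}, {10, 30})` yields `[4, 4]`, and `inferRemainder({40, 40}, {-30, -10})` yields the conservative `[-29, 29]`.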

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

vsavchenko created this revision. May 18 2020, 4:20 AM

Herald added a project: Restricted Project. May 18 2020, 4:20 AM

Herald added subscribers: cfe-commits, martong, Charusso and 8 others.

Harbormaster failed remote builds in B57048: Diff 264582! May 18 2020, 5:20 AM

vsavchenko added a parent revision: D79434: [analyzer] Generalize bitwise AND rules for ranges. May 18 2020, 5:29 AM

Here is a proof in Z3:
https://gist.github.com/SavchenkoValeriy/559ca923b050f2c01e340c1be543b7e0

Rebase

Harbormaster completed remote builds in B57061: Diff 264605. May 18 2020, 8:02 AM

Here is a short summary of the performance testing I conducted across a bunch of open-source projects:

	vim	git	tmux	redis	cmake	pytorch	bitcoin	protobuf
Time (before)	20m56s	18m41s	11m40s	1h15m34s	30m34s	6h35m18s	9m27s	6m03s
Time (after)	22m16s	19m58s	29m52s	1h17m32s	33m03s	9h46m41s	9m33s	6m03s
Delta	+6.4%	+6.9%	+155.8%	+2.6%	+8.1%	+48.4%	+1.1%	+0.1%

Time (before) was measured on a commit before any of my solver changes.

This shows that performance tweaks discussed in various TODOs are indeed required to reduce the hit.
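
For reference, the Delta row can be recomputed from the two time rows. The following is only an illustrative sketch with hypothetical helper names; because the table shows whole seconds, the recomputed figures may differ from the listed ones in the last digit.

```cpp
#include <cassert>
#include <cmath>

// Convert an "XhYmZs" style duration to seconds.
constexpr long toSeconds(long Hours, long Minutes, long Seconds) {
  return Hours * 3600 + Minutes * 60 + Seconds;
}

// Relative change in percent; positive means the "after" run was slower.
constexpr double deltaPercent(long Before, long After) {
  return 100.0 * static_cast<double>(After - Before) /
         static_cast<double>(Before);
}
```

With the vim column, `deltaPercent(toSeconds(0, 20, 56), toSeconds(0, 22, 16))` is about 6.4.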

NoQ added inline comments. May 18 2020, 8:30 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442	`(Origin.From() + 1).isMinSignedValue()` is another sufficient condition(?)
445	You mean zero, right?

vsavchenko marked 2 inline comments as done. May 18 2020, 8:39 AM

vsavchenko added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442	I'm sorry, I don't quite get which cases this check covers. Can you please explain what you have in mind?
445	No, not always. It still can be signed at this point.

NoQ added inline comments. May 18 2020, 8:55 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
445	Ok, so i misunderstood. This function computes range of `abs($x)` aka `|$x|` given the range for `$x`, right?

vsavchenko marked an inline comment as done. May 18 2020, 9:13 AM

vsavchenko added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
445	I guess I should fix my comments (and maybe the name for this function). This function finds the absolute maximum, i.e. the value `C: |$x| <= C`, and returns the range `[-C, C]` for signed `$x`s and `[0, C]` for unsigned `$x`s. So this new range is guaranteed to include the original range.

NoQ added inline comments. May 18 2020, 9:19 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442	Aha, ok, nvm, a different issue then: For range `[INT_MIN + 1, INT_MAX]`, the correct answer should be `[INT_MIN + 1, INT_MAX]` (which is `[-C, C]` for `C = INT_MAX`) rather than `[INT_MIN, INT_MAX]`.

Fix code review remarks.

xazax.hun added inline comments. May 19 2020, 5:38 AM

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
670	I wonder if we actually need this? I vaguely recall that we are doing a lot of simplifications while building symbolic expressions. I would be surprised if this identity is not handled there. (And in that case, this should probably be added there.) Or we might need a comment to explain why we need this simplification in both places.

Harbormaster failed remote builds in B57193: Diff 264863! May 19 2020, 6:28 AM

@vsavchenko
I've made some assumptions.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442–452	I think you should swap the `if` statements. Let me explain. Suppose the input is a uint8 range [42, 242]: you will return [0, 242] in the second `if`. But if the input is a uint8 range [128, 242], you will return [0, 255] in the first `if`, because 128 has the same binary representation as -128 (INT8_MIN), so the condition in the first `if` would be true. What is the great difference between [42, 242] and [128, 242] that they should have different results? Or have you just missed this case? P.S. I think your function's name doesn't fit its body: an absolute value is by definition non-negative, but your output range may contain negative values. You'd better write an explanation above the function and rename it.
464	As for me, the last reason fully covers the previous special cases, so you can omit those and thus simplify the comment.
696	Please extend the comment to explain why we should move the bounds towards zero at all.
721	Is it OK to return this range set when one of the operands (or both) is negative, since this range set can vary with the specific implementation?
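
The uint8 case raised for lines 442–452 can be reproduced in isolation. This is a hedged sketch on plain `std::uint8_t` values, not on `llvm::APSInt`: `hasMinSignedBitPattern` is a hypothetical stand-in that models a pure bit-pattern test, and the two `widen…` functions are invented names for the two orders of checks.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a bit-pattern test: true when only the top bit
// of the 8-bit value is set (0x80), whatever the signedness of the type.
inline bool hasMinSignedBitPattern(std::uint8_t Value) {
  return Value == 0x80;
}

struct U8Range {
  std::uint8_t From;
  std::uint8_t To;
};

// Order of checks criticised in the comment: whole-type test first.
inline U8Range widenWholeTypeFirst(U8Range R) {
  if (hasMinSignedBitPattern(R.From) || R.To == UINT8_MAX)
    return {0, UINT8_MAX};
  return {0, R.To};
}

// Swapped order: for an unsigned type, [0, To] is always sufficient.
inline U8Range widenUnsignedFirst(U8Range R) {
  return {0, R.To};
}
```

Under this model `[42, 242]` widens to `[0, 242]` either way, while `[128, 242]` widens to `[0, 255]` with the first order and to `[0, 242]` with the second.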

Move 0 % x case to SValBuilder

vsavchenko marked 5 inline comments as done. May 27 2020, 9:10 AM

vsavchenko added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442–452	It is a valid point, I will add this test and change this behaviour! The name is confusing indeed; do you have any ideas about what would be more appropriate?
464	I really want to be clear about the first two cases to explain why this works for any sign of `From` and `To`.
670	Yeah, we don't do it in `SValBuilder`, but it is definitely a better place for that particular case. I'll move it.
696	Good point!
721	Yes, it is a conservative range for any operand ranges, because only the sign of the result is implementation-specific.
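
The claim behind the answer for line 721 is that the magnitude of `a % b` is always below the magnitude of `b`, whatever the signs. A small brute-force sketch (hypothetical function name) checks this for a modern compiler; in C++11 and later the quotient truncates toward zero, so the remainder takes the sign of the dividend, but the magnitude bound does not depend on that choice.

```cpp
#include <cassert>
#include <cstdlib>

// Check |A % B| < |B| for every pair in [-Limit, Limit] with B != 0.
inline bool remainderMagnitudeIsBounded(int Limit) {
  for (int A = -Limit; A <= Limit; ++A)
    for (int B = -Limit; B <= Limit; ++B)
      if (B != 0 && std::abs(A % B) >= std::abs(B))
        return false;
  return true;
}
```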

Harbormaster failed remote builds in B58061: Diff 266551! May 27 2020, 9:11 AM

vsavchenko marked an inline comment as done. May 27 2020, 9:11 AM

Fix code review remarks

vsavchenko marked 3 inline comments as done. May 27 2020, 10:02 AM

vsavchenko marked an inline comment as done.

vsavchenko added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
442–452	@NoQ, @ASDenysPetrov what do you think about this name instead: `getSymmetricalRange`?

Harbormaster failed remote builds in B58075: Diff 266586! May 27 2020, 10:50 AM

Performance-wise, I've investigated huge slowdowns on tmux and pytorch.

The pytorch build produces a lot of warnings, which simply flooded my terminal. I guess that in one of the runs the terminal had more trouble displaying all of that than in the other. Here is a table with new times:

	pytorch
Time (before)	2h21m33s
Time (after)	2h19m23s
Delta	-1.5%

As you can see, these numbers are way smaller than the original ones.

tmux is a much smaller project, so I decided to run it 20 times for each case.

The "after" configuration consistently shows slower runtimes, but the overall difference (for median times) is only +3%.

I believe that as of now we can submit these modifications as is and explore performance optimizations later if needed.

Aha, so performance regressions on real code weren't real, that's a relief :)

> I believe that as of now we can submit these modifications as is and explore performance optimizations later if needed.

I still encourage you to explore the tests we have from our previous attempts to simplify expressions recursively without memoization (test/Analysis/hangs.c). I'm asking because these aren't all that artificial: this kind of code was previously reported by a frustrated user as "the analyzer started hanging on my code". Like, please replace a bunch of +es with &/|/% and see if this causes your code to perform exponentially over the size of the program. If so, i'd rather have us hurry up and implement memoization.

The math in this patch looks great, thanks!

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
444–445	I suggest not trying to express signed types and unsigned types in a single formula, the reader will have to unwrap it back into the two cases anyway in order to understand what's going on. The following would imho be easier to read: "If T is signed, return the smallest range `[-x..x]` that covers the original range, or `[min(T), max(T)]` if the aforementioned symmetric range doesn't exist due to the original range covering `min(T)`. If T is unsigned, return the smallest range `[0..x]` that covers the original range".
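
The signed half of that wording can be sketched for an 8-bit type as follows. `I8Range` and `symmetricRange` are hypothetical names used only for illustration; the actual patch works on `llvm::APSInt` values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct I8Range {
  std::int8_t From;
  std::int8_t To;
};

// Hypothetical helper following the quoted wording for a signed 8-bit type.
inline I8Range symmetricRange(I8Range R) {
  // -INT8_MIN is not representable, so no symmetric range exists here:
  // fall back to the whole type.
  if (R.From == INT8_MIN)
    return {INT8_MIN, INT8_MAX};
  // Compute in int so that negation cannot overflow the 8-bit type.
  int AbsMax = std::max(-static_cast<int>(R.From), static_cast<int>(R.To));
  return {static_cast<std::int8_t>(-AbsMax), static_cast<std::int8_t>(AbsMax)};
}
```

With this sketch `[-127, 127]` stays `[-127, 127]` (the `INT_MIN + 1` case discussed earlier), `[-3, 10]` becomes `[-10, 10]`, and `[-128, 5]` falls back to `[-128, 127]`.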

This revision is now accepted and ready to land. May 28 2020, 3:45 AM

vsavchenko marked an inline comment as done. May 28 2020, 5:55 AM

vsavchenko added inline comments.

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
444–445	That is a perfect explanation, thanks!

In D80117#2059567, @NoQ wrote:

> > I believe that as of now we can submit these modifications as is and explore performance optimizations later if needed.
>
> I still encourage you to explore the tests we have from our previous attempts to simplify expressions recursively without memoization (test/Analysis/hangs.c). I'm asking because these aren't all that artificial: this kind of code was previously reported by a frustrated user as "the analyzer started hanging on my code". Like, please replace a bunch of +es with &/|/% and see if this causes your code to perform exponentially over the size of the program. If so, i'd rather have us hurry up and implement memoization.

Ok, looks like my memories on this subject are heavily messed up. The actual problem that made us hang was solved by D47155. This is a dumb bug that would have been avoided if we had memoization but it doesn't require memoization to be avoided and it doesn't look like this code risks repeating that mistake.

Then, our experience with memoization in D47402 wasn't as good as i expected; it turned out that there are other exponential parts of the analysis in such cases that we still couldn't avoid. We should probably still do it (given how difficult it is now to identify these "other parts" that are exponential, i'd rather not add more such parts consciously) but i guess it's not that much of a blocker.
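
The concern about recursion without memoization can be illustrated with a toy expression graph in which every node uses the previous one as both operands. This sketch is unrelated to the analyzer's real data structures: `Expr`, `depthNaive`, `depthMemoized` and `makeSharedChain` are hypothetical names, and the "analysis" is just a depth computation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical expression node: children are indices into the node vector,
// and -1 marks a leaf.
struct Expr {
  int LHS;
  int RHS;
};

// Recursive walk without a cache; Visits counts processed nodes.
inline int depthNaive(const std::vector<Expr> &Nodes, int Id,
                      std::size_t &Visits) {
  ++Visits;
  const Expr &E = Nodes[static_cast<std::size_t>(Id)];
  if (E.LHS < 0)
    return 0;
  return 1 + std::max(depthNaive(Nodes, E.LHS, Visits),
                      depthNaive(Nodes, E.RHS, Visits));
}

// Same walk, but each node is processed once and its result is cached.
inline int depthMemoized(const std::vector<Expr> &Nodes, int Id,
                         std::unordered_map<int, int> &Cache,
                         std::size_t &Visits) {
  auto It = Cache.find(Id);
  if (It != Cache.end())
    return It->second;
  ++Visits;
  const Expr &E = Nodes[static_cast<std::size_t>(Id)];
  int Depth = 0;
  if (E.LHS >= 0)
    Depth = 1 + std::max(depthMemoized(Nodes, E.LHS, Cache, Visits),
                         depthMemoized(Nodes, E.RHS, Cache, Visits));
  Cache[Id] = Depth;
  return Depth;
}

// Build x0, x1 = x0 op x0, x2 = x1 op x1, ...: node I uses node I-1 twice.
inline std::vector<Expr> makeSharedChain(int Length) {
  std::vector<Expr> Nodes{Expr{-1, -1}};
  for (int I = 1; I <= Length; ++I)
    Nodes.push_back(Expr{I - 1, I - 1});
  return Nodes;
}
```

For a chain of length 10 the naive walk processes 2047 nodes, while the memoized walk processes 11.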

Fix code review remarks

Harbormaster failed remote builds in B58239: Diff 266890! May 28 2020, 9:16 AM

Closed by commit rG73c120a9895a: [analyzer] Introduce reasoning about symbolic remainder operator (authored by vsavchenko). May 28 2020, 9:17 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp	92 lines
clang/test/Analysis/PR35418.cpp	28 lines
clang/test/Analysis/constant-folding.c	61 lines
clang/test/Analysis/uninit-bug-first-iteration-init.c	27 lines

Diff 264582

clang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp

(365 lines not shown; context: private:)

   RangeSet VisitBinaryOperator(RangeSet LHS, BinaryOperator::Opcode Op,
                                RangeSet RHS, QualType T) {
     switch (Op) {
     case BO_Or:
       return VisitBinaryOperator<BO_Or>(LHS, RHS, T);
     case BO_And:
       return VisitBinaryOperator<BO_And>(LHS, RHS, T);
+    case BO_Rem:
+      return VisitBinaryOperator<BO_Rem>(LHS, RHS, T);
     default:
       return infer(T);
     }
   }

   //===----------------------------------------------------------------------===//
   //                         Ranges and operators
   //===----------------------------------------------------------------------===//
(50 lines not shown; context: private:)
   template <BinaryOperator::Opcode Op>
   RangeSet VisitBinaryOperator(Range LHS, Range RHS, QualType T) {
     return infer(T);
   }

   template <>
   RangeSet VisitBinaryOperator<BO_Or>(Range LHS, Range RHS, QualType T) {
     APSIntType ResultType = ValueFactory.getAPSIntType(T);
     llvm::APSInt Zero = ResultType.getZeroValue();

     bool IsLHSPositiveOrZero = LHS.From() >= Zero;
     bool IsRHSPositiveOrZero = RHS.From() >= Zero;

     bool IsLHSNegative = LHS.To() < Zero;
     bool IsRHSNegative = RHS.To() < Zero;

     // Check if both ranges have the same sign.
     if ((IsLHSPositiveOrZero && IsRHSPositiveOrZero) ||
         (IsLHSNegative && IsRHSNegative)) {
       // The result is definitely greater or equal than any of the operands.
       const llvm::APSInt &Min = std::max(LHS.From(), RHS.From());

       // We estimate maximal value for positives as the maximal value for the
       // given type. For negatives, we estimate it with -1 (e.g. 0x11111111).
       //
       // TODO: We basically, limit the resulting range from below (in absolute
       //       numbers), but don't do anything with the upper bound.
       //       For positive operands, it can be done as follows: for the upper
       //       bound of LHS and RHS we calculate the most significant bit set.
       //       Let's call it the N-th bit. Then we can estimate the maximal
       //       number to be 2^(N+1)-1, i.e. the number with all the bits up to
       //       the N-th bit set.
       const llvm::APSInt &Max = IsLHSNegative
                                     ? ValueFactory.getValue(--Zero)
                                     : ValueFactory.getMaxValue(ResultType);

       return {RangeFactory, ValueFactory.getValue(Min), Max};
     }

(57 lines not shown; context: if (IsLHSPositiveOrZero || IsRHSPositiveOrZero) {)
       return {RangeFactory, ValueFactory.getValue(Zero),
               ValueFactory.getValue(Max)};
     }

     // Nothing much else to do here.
     return infer(T);
   }

+  Range makeAbsolute(Range Origin, QualType T) {
+    APSIntType RangeType = ValueFactory.getAPSIntType(T);
+    bool CoversTheWholeType =
+        (Origin.From().isMinSignedValue() || Origin.To().isMaxValue());
+
+    if (CoversTheWholeType) {
+      return {ValueFactory.getMinValue(RangeType),
+              ValueFactory.getMaxValue(RangeType)};
+    }
+
+    if (RangeType.isUnsigned()) {
+      return Range(ValueFactory.getMinValue(RangeType), Origin.To());
+    }
+
+    // At this point, we are sure that the type is signed and we can safely
+    // use unary - operator.
+    //
+    // While calculating absolute maximum, we can use the following formula
+    // because of these reasons:
+    //   * If From >= 0 then To >= From and To >= -From.
+    //     AbsMax == To == max(To, -From)
+    //   * If To <= 0 then -From >= -To and -From >= From.
+    //     AbsMax == -From == max(-From, To)
+    //   * Otherwise, From <= 0, To >= 0, and
+    //     AbsMax == max(abs(From), abs(To))
+    llvm::APSInt AbsMax = std::max(-Origin.From(), Origin.To());
+
+    // Intersection is guaranteed to be non-empty.
+    return {ValueFactory.getValue(-AbsMax), ValueFactory.getValue(AbsMax)};
+  }
+
+  template <>
+  RangeSet VisitBinaryOperator<BO_Rem>(Range LHS, Range RHS, QualType T) {
+    llvm::APSInt Zero = ValueFactory.getAPSIntType(T).getZeroValue();
+
+    // Check if LHS is 0. It's a special case when the result is guaranteed
+    // to be 0 no matter what RHS is (we put to the side the case when RHS is
+    // 0 itself).
+    const llvm::APSInt *LHSConstant = LHS.getConcreteValue();
+    if (LHSConstant && *LHSConstant == Zero) {
+      return {RangeFactory, *LHSConstant};
+    }
+
+    Range ConservativeRange = makeAbsolute(RHS, T);
+
+    llvm::APSInt Max = ConservativeRange.To();
+    llvm::APSInt Min = ConservativeRange.From();
+
+    // At this point, our conservative range is closed. The result, however,
+    // couldn't be greater than the RHS' maximal absolute value. Because of
+    // this reason, we turn the range into open (or half-open in case of
+    // unsigned integers).
+    if (Max == Zero) {
+      // It's an undefined behaviour to divide by 0 and it seems like we know
+      // for sure that RHS is 0. Let's say that the resulting range is
+      // simply infeasible for that matter.
+      return RangeFactory.getEmptySet();
+    }
+
+    // Offset the boundaries towards zero.
+    //
+    // If we are dealing with unsigned case, we shouldn't move the lower bound.
+    if (Min.isSigned()) {
+      ++Min;
+    }
+    --Max;
+
+    bool IsLHSPositiveOrZero = LHS.From() >= Zero;
+    bool IsRHSPositiveOrZero = RHS.From() >= Zero;
+
+    // Remainder operator results with negative operands is implementation
+    // defined. Positive cases are much easier to reason about though.
+    if (IsLHSPositiveOrZero && IsRHSPositiveOrZero) {
+      // If maximal value of LHS is less than maximal value of RHS,
+      // the result won't get greater than LHS.To().
+      Max = std::min(LHS.To(), Max);
+      // We want to check if it is a situation similar to the following:
+      //
+      // <------------|---[  LHS  ]--------[  RHS  ]----->
+      // -INF         0                              +INF
+      //
+      // In this situation, we can conclude that (LHS / RHS) == 0 and
+      // (LHS % RHS) == LHS.
+      Min = LHS.To() < RHS.From() ? LHS.From() : Zero;
+    }
+
+    return {RangeFactory, ValueFactory.getValue(Min),
+            ValueFactory.getValue(Max)};
+  }

   /// Return a range set subtracting zero from \p Domain.
   RangeSet assumeNonZero(RangeSet Domain, QualType T) {
     APSIntType IntType = ValueFactory.getAPSIntType(T);
     return Domain.Intersect(ValueFactory, RangeFactory,
                             ++IntType.getZeroValue(), --IntType.getZeroValue());
   }

   // FIXME: Once SValBuilder supports unary minus, we should use SValBuilder to
(26 lines not shown)
 };

 class RangeConstraintManager : public RangedConstraintManager {
 public:
   RangeConstraintManager(SubEngine *SE, SValBuilder &SVB)
       : RangedConstraintManager(SE, SVB) {}

   //===------------------------------------------------------------------===//
   // Implementation for interface from ConstraintManager.
   //===------------------------------------------------------------------===//

   bool haveEqualConstraints(ProgramStateRef S1,
                             ProgramStateRef S2) const override {
     return S1->get<ConstraintRange>() == S2->get<ConstraintRange>();
   }

   bool canReasonAbout(SVal X) const override;
(9 lines not shown; context: public:)
   void printJson(raw_ostream &Out, ProgramStateRef State, const char *NL = "\n",
                  unsigned int Space = 0, bool IsDot = false) const override;

   //===------------------------------------------------------------------===//
   // Implementation for interface from RangedConstraintManager.
   //===------------------------------------------------------------------===//

   ProgramStateRef assumeSymNE(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymEQ(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymLT(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymGT(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymLE(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymGE(ProgramStateRef State, SymbolRef Sym,
                               const llvm::APSInt &V,
                               const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymWithinInclusiveRange(
       ProgramStateRef State, SymbolRef Sym, const llvm::APSInt &From,
       const llvm::APSInt &To, const llvm::APSInt &Adjustment) override;

   ProgramStateRef assumeSymOutsideInclusiveRange(
       ProgramStateRef State, SymbolRef Sym, const llvm::APSInt &From,
       const llvm::APSInt &To, const llvm::APSInt &Adjustment) override;

 private:
   RangeSet::Factory F;

(403 lines not shown)

clang/test/Analysis/PR35418.cpp

This file was added.

+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+// expected-no-diagnostics
+
+void halt() __attribute__((__noreturn__));
+void assert(int b) {
+  if (!b)
+    halt();
+}
+
+void decode(unsigned width) {
+  assert(width > 0);
+
+  int base;
+  bool inited = false;
+
+  int i = 0;
+
+  if (i % width == 0) {
+    base = 512;
+    inited = true;
+  }
+
+  base += 1; // no-warning
+
+  if (base >> 10)
+    assert(false);
+}

clang/test/Analysis/constant-folding.c

(168 lines not shown; context: void testBitwiseRules(unsigned int a, int b, int c) {)
   if (a < 10) {
     clang_analyzer_eval((a | 20) >= 20); // expected-warning{{TRUE}}
   }

   if (a > 10) {
     clang_analyzer_eval((a & 1) <= 1); // expected-warning{{TRUE}}
   }
 }
+
+void testRemainderRules(unsigned int a, unsigned int b, int c, int d) {
+  // Check that we know that remainder of zero divided by any number is still 0.
+  clang_analyzer_eval((0 % c) == 0); // expected-warning{{TRUE}}
+
+  clang_analyzer_eval((10 % a) <= 10); // expected-warning{{TRUE}}
+
+  if (a <= 30 && b <= 50) {
+    clang_analyzer_eval((40 % a) < 30); // expected-warning{{TRUE}}
+    clang_analyzer_eval((a % b) < 50);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((b % a) < 30);  // expected-warning{{TRUE}}
+
+    if (a >= 10) {
+      // Even though it seems like a valid assumption, it is not.
+      // Check that we are not making this mistake.
+      clang_analyzer_eval((a % b) >= 10); // expected-warning{{UNKNOWN}}
+
+      // Check that we can we can infer when remainder is equal
+      // to the dividend.
+      clang_analyzer_eval((4 % a) == 4); // expected-warning{{TRUE}}
+      if (b < 7) {
+        clang_analyzer_eval((b % a) < 7); // expected-warning{{TRUE}}
+      }
+    }
+  }
+
+  // Check that we can reason about signed integers when they are
+  // known to be positive.
+  if (c >= 10 && c <= 30 && d >= 20 && d <= 50) {
+    clang_analyzer_eval((5 % c) == 5);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((c % d) <= 30); // expected-warning{{TRUE}}
+    clang_analyzer_eval((c % d) >= 0);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((d % c) < 30);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((d % c) >= 0);  // expected-warning{{TRUE}}
+  }
+
+  if (c >= -30 && c <= -10 && d >= -20 && d <= 50) {
+    // Test positive LHS with negative RHS.
+    clang_analyzer_eval((40 % c) < 30);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((40 % c) > -30); // expected-warning{{TRUE}}
+
+    // Test negative LHS with possibly negative RHS.
+    clang_analyzer_eval((-10 % d) < 50);  // expected-warning{{TRUE}}
+    clang_analyzer_eval((-20 % d) > -50); // expected-warning{{TRUE}}
+
+    // Check that we don't make wrong assumptions
+    clang_analyzer_eval((-20 % d) > -20); // expected-warning{{UNKNOWN}}
+
+    // Check that we can reason about negative ranges...
+    clang_analyzer_eval((c % d) < 50); // expected-warning{{TRUE}}
+    /// ...both ways
+    clang_analyzer_eval((d % c) < 30); // expected-warning{{TRUE}}
+
+    if (a <= 10) {
+      // Result is unsigned. This means that 'c' is casted to unsigned.
+      // We don't want to reason about ranges changing boundaries with
+      // conversions.
+      clang_analyzer_eval((a % c) < 30); // expected-warning{{UNKNOWN}}
+    }
+  }
+}
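
Two of the `UNKNOWN` expectations in `testRemainderRules` have concrete counterexamples, which the following brute-force sketch looks for. The function names are hypothetical, the divisor 0 is skipped because it is undefined behaviour, and the signed case relies on truncating division as in C99 and C++11 or later.

```cpp
#include <cassert>

// True if (A % B) >= 10 for every 10 <= A <= 30 and 1 <= B <= 50.
inline bool remainderKeepsLowerBound() {
  for (unsigned A = 10; A <= 30; ++A)
    for (unsigned B = 1; B <= 50; ++B)
      if (A % B < 10)
        return false;
  return true;
}

// True if (Dividend % D) > Bound for every non-zero D in [From, To].
inline bool remainderAlwaysAbove(int Dividend, int From, int To, int Bound) {
  for (int D = From; D <= To; ++D)
    if (D != 0 && Dividend % D <= Bound)
      return false;
  return true;
}
```

Here `remainderKeepsLowerBound()` is false (for instance `20 % 10 == 0`), and `remainderAlwaysAbove(-20, -20, 50, -20)` is false because `-20 % 50 == -20`, whereas `remainderAlwaysAbove(-20, -20, 50, -50)` holds.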

clang/test/Analysis/uninit-bug-first-iteration-init.c

This file was added.

+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+
+// rdar://problem/44978988
+// expected-no-diagnostics
+
+int foo();
+
+int gTotal;
+
+double bar(int start, int end) {
+  int i, cnt, processed, size;
+  double result, inc;
+
+  result = 0;
+  processed = start;
+  size = gTotal * 2;
+  cnt = (end - start + 1) * size;
+
+  for (i = 0; i < cnt; i += 2) {
+    if ((i % size) == 0) {
+      inc = foo();
+      processed++;
+    }
+    result += inc * inc; // no-warning
+  }
+  return result;
+}
