This is an archive of the discontinued LLVM Phabricator instance.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
411	Yeah, a must-have for this check to be enabled by default would be to be able to provide a specific warning message for every function. I guess we could include them in the summaries as an extra argument of `ArgConstraint`.
414	Let's test our notes. That'll be especially important when we get to non-concrete values, because the visitor might need to be expanded (or we might need a completely new visitor).

steakhal mentioned this in D73536: [analyzer][taint] Remove taint from symbolic expressions if used in comparisons.Feb 5 2020, 6:02 AM

martong added reviewers: Szelethus, baloghadamsoftware.Feb 6 2020, 2:08 AM

xazax.hun added inline comments.Feb 6 2020, 3:39 PM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
401	Dealing with only concrete ints might be a good start, but we might want to handle symbolic cases in the future, like: if (v > 255) return isalpha(v); I am OK with not adding this in the first version, but adding TODOs and test cases upfront cannot hurt. So basically, I was wondering whether we should query the solver for the result instead of matching the SVal kind, and just return early if we do not want to support a specific kind.

I wouldn't like to see reports emitted by a checker that resides in apiModeling. Could we create a new one? Some checkers, like the IteratorChecker, MallocChecker and CStringChecker implement a variety of user-facing checkers within the same class, that is also an option, if creating a new checker class is too much of a hassle.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
697–699	This is true for the rest of the summaries as well, but shouldn't we retrieve the `unsigned char` size from `ASTContext`?

This revision now requires changes to proceed.Feb 7 2020, 4:40 AM

In D73898#1863710, @Szelethus wrote:

I wouldn't like to see reports emitted by a checker that resides in apiModeling. Could we create a new one? Some checkers, like the IteratorChecker, MallocChecker and CStringChecker implement a variety of user-facing checkers within the same class, that is also an option, if creating a new checker class is too much of a hassle.

Yes, we could split the warning emitting part to a new checker. My concern with that is in that case we would have the argument constraining part in checkPostCall still in this checker, because that is part of the modelling. And actually it makes sense to apply the argument constraints only if we know for sure that they are not violated. The violation then would be checked in the new checker, this seems a bit awkward to me. Because checking the violation of the constraints and applying the constraints seems to be a cohesive action to me. I mean it would not even make sense to turn off the warning checker, because then we'd be applying the constraints blindly.

martong marked an inline comment as done.Feb 7 2020, 8:24 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
697–699	Yes, this is a good idea. I will do this. What really bothers me, however, is that we should handle EOF in a platform-dependent way as well ... and I have absolutely no idea how to do that, given that it is defined by a macro in a platform-specific header file. I am desperately in need of help and ideas about how we could get the value of EOF for the analysed platform.

In D73898#1864066, @martong wrote:

In D73898#1863710, @Szelethus wrote:

I wouldn't like to see reports emitted by a checker that resides in apiModeling. Could we create a new one? Some checkers, like the IteratorChecker, MallocChecker and CStringChecker implement a variety of user-facing checkers within the same class, that is also an option, if creating a new checker class is too much of a hassle.

... And actually it makes sense to apply the argument constraints only if we know for sure that they are not violated. ...

What I mean by that is that we must do over-approximation if the argument is symbolic. I.e. we presume that the constraints do hold otherwise the program would be ill-formed and there is no point to continue the analysis on this path. It is very similar to what we do in case of the DivZero or the NullDeref Checkers: if there is no violation (no warning) and the variable is symbolic then we constrain the value by the condition. E.g. in DivZero::checkPreStmt we have:

// If we get here, then the denom should not be zero. We abandon the implicit
// zero denom case for now.
C.addTransition(stateNotZero);

Strictly speaking, these transitions should be part of the modeling then in this sense (and they should be in PostStmt?). Still they are not separated into a different checker.

What I mean by that is that we must do over-approximation if the argument is symbolic. I.e. we presume that the constraints do hold otherwise the program would be ill-formed and there is no point to continue the analysis on this path.

Sorry, that's actually under-approximation because we elide paths.

Based on our verbal discussion with @Szelethus and @steakhal and based on the mailing archives, I am going to do the following changes:

Add a new checker that is implemented in the StdLibraryFunctionsChecker class.
This new checker if switched on is responsible for emitting the warning. Even if this is turned off, the sink node is generated if the argument violates the given condition.
This means, the new checker has the sole responsibility of emitting the warning, but nothing more.

martong added a parent revision: D74473: [analyzer] StdLibraryFunctionsChecker: Use platform dependent EOF and UCharMax.Feb 12 2020, 2:48 AM

Rebase to master

Harbormaster failed remote builds in B46416: Diff 244430!Feb 13 2020, 8:06 AM

balazske added a subscriber: balazske.Feb 14 2020, 8:04 AM

balazske added inline comments.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
401	This check now works with concrete int values. We have a known value and a list of ranges with known limits, so testing whether the value is in any of the ranges works the same way as testing whether it is outside all of them, and testing for membership in one of the ranges is simpler code. But I think symbolic evaluation with the "eval" and "assume" functions would be more generic here (and simpler code). Then the approach of cutting off the bad ranges is usable (though there is probably another solution).
462	If `evalCall` is used, it could be simpler to test and apply the constraints for arguments and return value in a "single step".

Probably a better solution can be:
For every "case" build a single SVal that contains all argument constraints for that case. It is possible using multiple evalBinOp calls (with <=, >=, logical or) to build such a condition (or repeated calls to other assume functions to cut off outer ranges). If the condition can be satisfied (by assume) add the new state, the condition for return value can be added here too. Repeat this for every different case. If no applicable case is found none of the conditions can be assumed, this means argument constraint error.

Add new Checker that does the report
Refactor with negated RangeValues
Add overload to findFunctionSummary
Add tests for symbolic values
Add test file for bug path

I've done a major refactor with the handling of argument constraints. I am now reusing ValueRange::apply in checkPreCall on "negated" value ranges to check if the constraint is violated.
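The idea of checking a constraint through its "negated" ranges can be illustrated outside the analyzer. The following is a minimal sketch under stated assumptions, not the code from the patch: `Range`, `RangeList`, `complementRanges` and `intersects` are hypothetical names, and a plain closed interval stands in for what the constraint manager knows about an argument.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-ins; the real checker works on ProgramState and SVal.
using Range = std::pair<int64_t, int64_t>; // closed interval [first, second]
using RangeList = std::vector<Range>;      // sorted, non-overlapping

// Complement of Ranges within the closed interval [Min, Max].
RangeList complementRanges(const RangeList &Ranges, int64_t Min, int64_t Max) {
  RangeList Out;
  int64_t Next = Min; // smallest value not yet accounted for
  for (const Range &R : Ranges) {
    assert(R.first <= R.second && R.first >= Next && R.second <= Max);
    if (R.first > Next)
      Out.push_back({Next, R.first - 1});
    if (R.second == Max)
      return Out; // nothing left above the last range
    Next = R.second + 1;
  }
  Out.push_back({Next, Max});
  return Out;
}

// True if the known interval of a value overlaps any range in the list.
bool intersects(Range Known, const RangeList &Ranges) {
  for (const Range &R : Ranges)
    if (Known.first <= R.second && R.first <= Known.second)
      return true;
  return false;
}
```

With the allowed set {[-1, -1], [0, 255]} on a 32-bit int, the complement is everything below -1 and everything above 255. A value known to lie in [256, 1000] overlaps the complement, while a value known to lie in [0, 127] does not.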

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
411	What about warning messages with placeholders? E.g. "Argument constraint of {nth} argument of {fun} is not satisfied. The value is {v}, however it should be in {range}." There will be a bunch of functions whose warning message template would be the same. On the other hand some others could have different warnings, and that justifies the need for specialized warnings. Still, I think the warning message in the summary should be optional, because otherwise it would be really hard to automatically add summaries from other sources (like from cppcheck). No matter how it turns out, this should be handled in a different patch.
414	Ok, I added a separate test file where the tests focus on the bug path.

Remove leftover call from test

Harbormaster completed remote builds in B46921: Diff 245651.Feb 20 2020, 7:26 AM

Harbormaster completed remote builds in B46922: Diff 245652.Feb 20 2020, 8:12 AM

martong added reviewers: gamesh411, balazske.Feb 21 2020, 2:54 AM

steakhal added inline comments.Feb 21 2020, 4:05 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
458	StringRef

martong added a parent revision: D74973: [analyzer] StdLibraryFunctionsChecker refactor w/ inheritance.Feb 21 2020, 9:54 AM

balazske added inline comments.Feb 23 2020, 11:44 PM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
458	Is this `addTransition` needed? It would be OK to call `generateErrorNode` with `State`. Even if not, adding the transition before should not be needed?
711–712	Why is this `{128, UCharMax}` here and at the next entry needed?
714	Is this `ArgConstraint` intentionally added only to `isalnum`?

Use StringRef for Msg
Remove superfluous addTransition

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
458	Yes, you are right it is superfluous, I removed it.

martong added inline comments.Feb 24 2020, 6:12 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
711–712	This is the locale-specific range, [128, 255]. There are characters like `ä` for which we don't know whether they are treated as alphanumeric or not. We can't really tell how a specific libc implementation classifies them. On the other hand, with English letters we can state the classes confidently.
714	Yes, I wanted to create first the infrastructure and then later to add all these constraints to the rest of the summaries with new tests.
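The three-way split described for lines 711–712 can be pictured as follows. This is only an illustration, assuming an ASCII execution character set. `Tri` and `classifyAlnum` are hypothetical names that do not appear in the checker, which expresses the same idea as range lists inside a summary.

```cpp
// Hypothetical three-valued answer for "is this argument alphanumeric?".
enum class Tri { Yes, No, Unknown };

// Assumes ASCII. The [128, 255] band is locale-specific, so no claim is made.
Tri classifyAlnum(int C) {
  if ((C >= '0' && C <= '9') || (C >= 'A' && C <= 'Z') ||
      (C >= 'a' && C <= 'z'))
    return Tri::Yes;
  if (C >= 128 && C <= 255)
    return Tri::Unknown; // e.g. 228, which is 'ä' in Latin-1
  return Tri::No;
}
```

Leaving the upper band as `Unknown` corresponds to a summary case that places no constraint on the return value.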

Harbormaster completed remote builds in B47126: Diff 246193.Feb 24 2020, 6:22 AM

Rebase on top of https://reviews.llvm.org/D74973

Harbormaster completed remote builds in B47140: Diff 246229.Feb 24 2020, 9:44 AM

martong added a child revision: D75063: [analyzer] StdLibraryFunctionsChecker: Add NotNull Arg Constraint.Feb 24 2020, 9:56 AM

martong marked an inline comment as done.Feb 24 2020, 10:00 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
174	This `default` branch is not needed here (actually gives a compiler warning too).

gamesh411 added inline comments.Feb 27 2020, 1:12 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
697–699	If the EOF is not used in the TU analyzed, then there would be no way to find the specific `#define`. Another approach would be to check if the value is defined by an expression that is the EOF define (maybe transitively?).

It may be useful to make a "macro value map" kind of object. Some macros can be added to it as a string, and it is possible to lookup for an Expr if one of the added macros is used there. This can be done by checking the concrete (numeric) value of the Expr and compare to the value of the macro, or by checking if the expression comes from a macro and take this macro name (use string comparison). Such an object can be useful because the functionality is needed at more checkers, for example the ones I am working on (StreamChecker and ErrorReturnChecker too).

something like this:

class MacroUsageDetector {
public:
  void addMacroName(StringRef MName);
  bool isMacroUsed(StringRef MName, Expr *E, ???);
  APSInt getMacroValue(StringRef MName);
};

Or one that handles a single macro?

The high level idea and the implementation of the checker seems great. In general, things that you want to address in later patches should be stated in the code with a TODO. I wrote a couple nits that I don't want to delete, but maybe it'd be better to address them after the dependency patch is agreed upon.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
296 ↗	(On Diff #245652)	How about we add an example as well?
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
10	I suspect this comment is no longer relevant.
165	Maybe `complement` would be a better name? That sounds a lot more like a set operation. Also, this function highlights well that inheritance might not be the best solution here.
193	I think that is a rather poor example to help understand what `list of list of ranges` means :) -- Could you try to find something better?
454–462	While I find your usage of lambdas fascinating, this one seems a bit unnecessary :)
457	That is a `TODO`, rather :^)
clang/test/Analysis/std-c-library-functions-arg-constraints.c
2–8	Hmm, why do we have 2 different test files that essentially do the same? Shouldn't we only have a single one with `analyzer-output=text`?
clang/test/Analysis/std-c-library-functions.c
1–31 ↗	(On Diff #246229)	What a beautiful sight. Thanks.

Is it sure that the signedness in the ranges is handled correctly? The EOF is a negative value but the RangeInt is unsigned type. The tryExpandAsInteger returns int too that is put into an unsigned RangeInt later. Probably it is better to use APSInt for the ranges? (The problem exists already before this change.)

In D73898#1894923, @balazske wrote:

It may be useful to make a "macro value map" kind of object. Some macros can be added to it as a string, and it is possible to lookup for an Expr if one of the added macros is used there. This can be done by checking the concrete (numeric) value of the Expr and compare to the value of the macro, or by checking if the expression comes from a macro and take this macro name (use string comparison). Such an object can be useful because the functionality is needed at more checkers, for example the ones I am working on (StreamChecker and ErrorReturnChecker too).

Please see my previous answer to @gamesh411

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
296 ↗	(On Diff #245652)	You mean like NonNull or other constraints?
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
10	Uh, yes.
165	Well, we check the argument constraint validity by trying to apply its logical negation. In the case of a range inclusion, this is being out of that range. In the case of non-null, this is being null. And so on. The logic of how we check an argument constraint is the same for all the different constraints. And that is the point: in order to support a new kind of constraint, we just have to figure out how to "apply" and "negate" that constraint. In my opinion this is a perfect case for polymorphism.
193	Yeah, that part definitely should be reworded.
697–699	I believe that the given standard C lib implementation (e.g. glibc) must provide a header for the prototypes of these functions where EOF is also defined, transitively, in one of the dependent system headers. Otherwise user code could misuse the value of EOF and thus the program would behave in an undefined manner. C99 clearly states that you should #include <ctype.h> to use isalpha.
clang/test/Analysis/std-c-library-functions-arg-constraints.c
2–8	No, I wanted to have two different test files to test two different things: (1) We do have the constraints applied (here we don't care about the warnings and the path) (2) Check that we have a warning with the proper tracking and notes.
clang/test/Analysis/std-c-library-functions.c
1–31 ↗	(On Diff #246229)	Anytime :D
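The "apply and negate" argument made for line 165 can be sketched with a small class hierarchy. This is a toy model under stated assumptions: `ToyConstraint`, `InRange`, `NotNull` and `isViolated` are hypothetical names, and a concrete `long` stands in for the symbolic value and program state that the real constraint classes operate on.

```cpp
#include <memory>

// Hypothetical interface: each constraint can test a value and negate itself.
struct ToyConstraint {
  virtual ~ToyConstraint() = default;
  virtual bool holds(long V) const = 0;
  virtual std::unique_ptr<ToyConstraint> negate() const = 0;
};

// Value must be inside (or, when negated, outside) [Lo, Hi].
class InRange : public ToyConstraint {
  long Lo, Hi;
  bool Inside;

public:
  InRange(long Lo, long Hi, bool Inside = true)
      : Lo(Lo), Hi(Hi), Inside(Inside) {}
  bool holds(long V) const override { return (V >= Lo && V <= Hi) == Inside; }
  std::unique_ptr<ToyConstraint> negate() const override {
    return std::make_unique<InRange>(Lo, Hi, !Inside);
  }
};

// Pointer-like value must be non-null (or, when negated, null).
class NotNull : public ToyConstraint {
  bool WantNonNull;

public:
  explicit NotNull(bool WantNonNull = true) : WantNonNull(WantNonNull) {}
  bool holds(long V) const override { return (V != 0) == WantNonNull; }
  std::unique_ptr<ToyConstraint> negate() const override {
    return std::make_unique<NotNull>(!WantNonNull);
  }
};

// The check is written once and works for every kind of constraint.
bool isViolated(const ToyConstraint &C, long V) { return C.negate()->holds(V); }
```

Adding a new kind of constraint to this model only requires a new subclass with its own `holds` and `negate`; `isViolated` stays untouched.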

In D73898#1901142, @balazske wrote:

Is it sure that the signedness in the ranges is handled correctly? The EOF is a negative value but the RangeInt is unsigned type. The tryExpandAsInteger returns int too that is put into an unsigned RangeInt later. Probably it is better to use APSInt for the ranges? (The problem exists already before this change.)

That is not a problem, because finally in apply we use an APSInt that is constructed by considering the correct T type, e.g.:

const llvm::APSInt &Min = BVF.getValue(R[I].first, T);

We could consider RangeInt as a buffer that is big enough to hold the representation of the range values. The concrete interpretation of the bits (as T) is done by APSInt.
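The "buffer" view can be shown without APSInt. The sketch below is an assumption-laden stand-in: `RangeInt` mirrors the unsigned storage type mentioned above, while `store` and `interpretAs` are hypothetical helpers that play the role of `BVF.getValue(..., T)`. It relies on two's-complement conversion, which is guaranteed since C++20 and is what mainstream compilers did before that.

```cpp
#include <cstdint>

using RangeInt = uint64_t; // unsigned storage, wide enough for any bound

// Store a possibly negative bound in the unsigned buffer.
constexpr RangeInt store(int64_t V) { return static_cast<RangeInt>(V); }

// Reinterpret the stored bits as the argument's actual type T.
template <typename T> constexpr T interpretAs(RangeInt Bits) {
  return static_cast<T>(Bits);
}
```

A bound such as EOF, assumed here to be -1, is stored as an all-ones bit pattern and reads back as -1 once it is interpreted as `int`.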

Just littering some more inlines, don't mind me :) Lets still wait on the dependency patch before updating.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
296 ↗	(On Diff #245652)	Like Check constraints of arguments of C standard library functions, such as whether the parameter of isalpha is in the range [0, 255] or is EOF.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
96–97	How about `ValueConstraintRef`?
165	We agreed on inheritance in the previous patch, and regarding the name, sure, leave it as-is. :)

Looks great as long as other reviewers are happy, thanks!

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	Maybe we should add an assertion that the same argument isn't specified multiple times.

martong marked 16 inline comments as done.Mar 17 2020, 10:25 AM

martong added inline comments.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
296 ↗	(On Diff #245652)	Ok, done.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
96–97	Yeah, we have `ProgramStateRef` and `SymbolRef`. And both are actually just synonyms to smart pointers. I'd rather not call a pointer as a reference, because that can be confusing when reading the code. E.g. when I see that we return with a `nullptr` from a function that can return with a `...Ref` I start to scratch my head.
193	I added an example with `isalpha`.
453	I think there could be cases where we want to have e.g. a not-null constraint on the 1st argument, but also want to express that the 1st argument's size is described by the 2nd argument. I am planning to implement such constraints in the future. In that case we would have two constraints on the 1st argument and the assert would fire.
454–462	Ok I moved it to be a member function named `ReportBug`.

Herald added a subscriber: DenisDvlp.Mar 17 2020, 10:25 AM

Address review comments

Harbormaster completed remote builds in B49451: Diff 250820.Mar 17 2020, 11:16 AM

tmp -> Tmp

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
174	This default branch is not needed here (actually gives a compiler warning too). I am not sure why I thought it was not needed; we actually do need it. (Perhaps an intermediate version returned in every case.)

Herald added a subscriber: ASDenysPetrov.Mar 17 2020, 12:45 PM

Harbormaster completed remote builds in B49477: Diff 250869.Mar 17 2020, 1:31 PM

LGTM, aside from some checker tagging nightmare. It's a bit easy to mess up, please allow me to have a final look before committing! :)

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
298 ↗	(On Diff #250869)	Just noticed, this checker still lies in the `apiModeling` package. Could we find a more appropriate place?
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
96–97	Sure, I'm sold.
253	By passing `this`, the error message will be tied to the modeling checker, not to the one you just added. `BugType` has a constructor that accepts a string instead, pass `CheckNames[CK_StdCLibraryFunctionArgsChecker]` in there :) Also, how about `BT_InvalidArgument` or something?

This revision now requires changes to proceed.Mar 17 2020, 8:05 PM

balazske added inline comments.Mar 18 2020, 2:27 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
96	Is it better done with `= 0`?
193	The "branches" are the structures that define relations between arguments and return values? This could be included in the description.
284	This should be called `reportBug`.

martong marked 11 inline comments as done.Mar 18 2020, 7:46 AM

martong added inline comments.

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
298 ↗	(On Diff #250869)	Technically speaking this is still api modeling. In midterm we'd like to add support for more libc functions, gnu and posix functions, they are all library functions i.e. they provide some api. Of course in long term, we'd like to experiment by getting some constraints from IR/Attributor, but we are still far from there.
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
96	Not all of the constraint classes must implement this. Right now, e.g. the `ComparisonConstraint` does not implement this, because there is no such summary (yet) that uses a `ComparisonConstraint` as an argument constraint.
193	Not exactly. A branch represents a path in the exploded graph of a function (which is a tree). So, a branch is a series of assumptions. In other words, branches represent split states and additional assumptions on top of the splitting assumption. I added this explanation to the comments.
253	Thanks, good catch, I did not know about that. Please note that using `CheckNames` requires that we change the `BT` member to be lazily initialized, because `CheckNames` is initialized only after the checker itself is created. Thus we cannot initialize `BT` during the checker's construction, because that would be before `CheckNames` is set. So I changed `BT` to be a `unique_ptr`, and it is lazily initialized in `reportBug`.
284	Yeah, can't get used to this strange naming convention that LLVM uses. Fixed it.
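The lazy initialization described for line 253 follows a common pattern, sketched here with toy types. `ToyBugType`, `ToyChecker`, `getBugType` and the checker name string are all hypothetical. The only point carried over from the discussion is that the name becomes available after construction, so the bug type is built on first use.

```cpp
#include <memory>
#include <string>

// Toy stand-in for a bug type that records the name of the owning checker.
struct ToyBugType {
  std::string CheckerName;
  std::string Description;
};

class ToyChecker {
  // mutable: report callbacks are const, yet they may create the bug type.
  mutable std::unique_ptr<ToyBugType> BT_InvalidArg;

public:
  std::string CheckName; // assumed to be filled in after construction

  const ToyBugType &getBugType() const {
    if (!BT_InvalidArg) // first use: CheckName is set by now
      BT_InvalidArg = std::make_unique<ToyBugType>(
          ToyBugType{CheckName, "Invalid argument"});
    return *BT_InvalidArg;
  }
};
```

Repeated calls return the same object, so the bug type is created at most once per checker instance.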

Add comments about what is a branch
Do not use 'this' for BugType
Lazily init BT and BT -> BT_InvalidArg
ReportBug -> reportBug

Harbormaster completed remote builds in B49598: Diff 251083.Mar 18 2020, 8:42 AM

Whoo! The patch looks great and well thought out, the tests look like they cover everything and we also talked about plans for future patches. Excellent!

I left a nit about merging the test files, but I'll leave it up to you to address or ignore it.

clang/test/Analysis/std-c-library-functions-arg-constraints.c
2–8	What if we had different `-verify`s? `clang/test/Analysis/track-conditions.cpp` is a great example.

This revision is now accepted and ready to land.Mar 19 2020, 12:52 PM

Use prefixes for -verify to check different things in the same test file

Thanks for the review guys!

clang/test/Analysis/std-c-library-functions-arg-constraints.c
2–8	Yeah, that's a very good approach, I just changed it like that. :)

Closed by commit rG94061df6e5f2: [analyzer] StdLibraryFunctionsChecker: Add argument constraints (authored by martong).Mar 20 2020, 8:39 AM

This revision was automatically updated to reflect the committed changes.

Harbormaster completed remote builds in B49893: Diff 251648.Mar 20 2020, 9:11 AM

NoQ added inline comments.Mar 25 2020, 12:06 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	Wait, i misunderstood the code. It's even worse than that: you're adding transitions in a loop, so it'll cause state splits for every constraint. Because you do not intend to create multiple branches here, there needs to be exactly one `addTransition` performed every time `checkPreCall` is called. I.e., for now this code is breaking everything whenever there's more than one constraint, regardless of whether it's on the same argument.

martong marked an inline comment as done.Mar 25 2020, 7:33 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	Yeah, that's a very good catch, thanks! I am going to prepare a patch to fix this soon. My idea is to store the `SuccessSt` and apply the next argument constraint on that. Once the loop is finished, I'll call `addTransition()`.
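The fix outlined here, threading one state through all constraints and adding a single transition at the end, can be sketched with toy types. `ToyState`, `Narrowing`, `ToyContext`, `atLeast`, `atMost` and `toyCheckPreCall` are hypothetical. The real code operates on `ProgramStateRef` and `CheckerContext`, and the actual fix is the one posted as D76790.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

struct ToyState { long Lo, Hi; }; // known interval of a single argument

// A constraint narrows the state in place; false means "infeasible".
using Narrowing = std::function<bool(ToyState &)>;

struct ToyContext {
  int Transitions = 0;
  ToyState Last{0, 0};
  void addTransition(const ToyState &S) { ++Transitions; Last = S; }
};

Narrowing atLeast(long N) {
  return [N](ToyState &S) { S.Lo = std::max(S.Lo, N); return S.Lo <= S.Hi; };
}
Narrowing atMost(long N) {
  return [N](ToyState &S) { S.Hi = std::min(S.Hi, N); return S.Lo <= S.Hi; };
}

void toyCheckPreCall(ToyState Initial, const std::vector<Narrowing> &Cs,
                     ToyContext &C) {
  ToyState Success = Initial; // plays the role of SuccessSt
  for (const Narrowing &Apply : Cs)
    if (!Apply(Success))
      return; // violated: a real checker would report and sink here
  C.addTransition(Success); // exactly one transition, after the loop
}
```

However many constraints are listed, the context sees at most one new state, so no accidental state split occurs.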

NoQ added inline comments.Mar 25 2020, 8:15 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	Yup, that's the common thing to do in such cases.

NoQ added inline comments.Mar 25 2020, 9:29 AM

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	While we're at it, could you try to come up with a runtime assertion that'll help us prevent these mistakes? Like, dunno, make `CheckerContext` crash whenever there's more than one branch being added, and then add a method to opt out when it's actually necessary to add more transitions (i.e., the user would say `C.setMaxTransitions(2)` at the beginning of their checker callback whenever they need to make a state split, defaulting to 1). It's a bit tricky because i still want to allow multiple transitions when they allow one branch (i.e., transitions chained together) but i think it'll take a lot of review anxiety from me because it's a very dangerous mistake to make and for now code review is the only way to catch it. So, yay, faster code reviews.
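The proposed guard could look roughly like the toy below. Everything in it is hypothetical, including `setMaxTransitions`, which is only a suggestion in this thread and not an existing `CheckerContext` method. The sketch counts every `addTransition` call and does not attempt the harder part mentioned above, namely telling chained transitions apart from real splits.

```cpp
#include <cstddef>

// Hypothetical context that flags more transitions than the callback declared.
class ToyCheckerContext {
  std::size_t MaxTransitions = 1; // proposed default: no state split
  std::size_t Added = 0;
  bool LimitExceeded = false;

public:
  // Opt in to a state split at the start of a checker callback.
  void setMaxTransitions(std::size_t N) { MaxTransitions = N; }

  void addTransition() {
    if (++Added > MaxTransitions)
      LimitExceeded = true; // an actual implementation would assert instead
  }

  bool limitExceeded() const { return LimitExceeded; }
};
```

A callback that splits the state on purpose would call `setMaxTransitions(2)` first. Any other callback that adds a second transition trips the flag.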

I just created a quick fix for the issue: https://reviews.llvm.org/D76790

martong marked an inline comment as done.Mar 25 2020, 11:23 AM

martong added inline comments.

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp
453	Hmm, I see your point and I agree this would be a valuable sanity check. But if you don't mind, I'd like to address this in a different, stand-alone patch (independently of the quick fix https://reviews.llvm.org/D76790), because it does not seem trivial to me. My first concern is this: if we have `1` as the default value for `maxTransitions`, then we should add an extra `C.setMaxTransitions(N)` in every checker callback that does a state split, is that right?

Szelethus mentioned this in D79358: [analyzer] CERT: STR37-C.May 5 2020, 2:09 AM

Revision Contents

Path	Size
clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp	113 lines
clang/test/Analysis/std-c-library-functions-arg-constraints.c	22 lines

Diff 242087

clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp

//=== StdLibraryFunctionsChecker.cpp - Model standard functions -- C++ --===//		//=== StdLibraryFunctionsChecker.cpp - Model standard functions -- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This checker improves modeling of a few simple library functions.		// This checker improves modeling of a few simple library functions.
// It does not generate warnings.		// It does not generate warnings.
//		//
// This checker provides a specification format - `FunctionSummaryTy' - and		// This checker provides a specification format - `FunctionSummaryTy' - and
// contains descriptions of some library functions in this format. Each		// contains descriptions of some library functions in this format. Each
// specification contains a list of branches for splitting the program state		// specification contains a list of branches for splitting the program state
// upon call, and range constraints on argument and return-value symbols that		// upon call, and range constraints on argument and return-value symbols that
// are satisfied on each branch. This spec can be expanded to include more		// are satisfied on each branch. This spec can be expanded to include more
// items, like external effects of the function.		// items, like external effects of the function.
//		//
Show All 27 Lines
// fwrite isalpha islower read		// fwrite isalpha islower read
// getc isascii isprint write		// getc isascii isprint write
// getchar isblank ispunct		// getchar isblank ispunct
// getdelim iscntrl isspace		// getdelim iscntrl isspace
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"		#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
		#include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
#include "clang/StaticAnalyzer/Core/Checker.h"		#include "clang/StaticAnalyzer/Core/Checker.h"
#include "clang/StaticAnalyzer/Core/CheckerManager.h"		#include "clang/StaticAnalyzer/Core/CheckerManager.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/CallEvent.h"
#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"		#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"

using namespace clang;		using namespace clang;
using namespace clang::ento;		using namespace clang::ento;

namespace {		namespace {
class StdLibraryFunctionsChecker : public Checker<check::PostCall, eval::Call> {		class StdLibraryFunctionsChecker
		: public Checker<check::PreCall, check::PostCall, eval::Call> {
/// Below is a series of typedefs necessary to define function specs.		/// Below is a series of typedefs necessary to define function specs.
/// We avoid nesting types here because each additional qualifier		/// We avoid nesting types here because each additional qualifier
/// would need to be repeated in every function spec.		/// would need to be repeated in every function spec.
struct FunctionSummaryTy;		struct FunctionSummaryTy;

/// Specify how much the analyzer engine should entrust modeling this function		/// Specify how much the analyzer engine should entrust modeling this function
/// to us. If he doesn't, he performs additional invalidations.		/// to us. If he doesn't, he performs additional invalidations.
enum InvalidationKindTy { NoEvalCall, EvalCallAsPure };		enum InvalidationKindTy { NoEvalCall, EvalCallAsPure };
Show All 14 Lines	class StdLibraryFunctionsChecker
/// a non-negative integer, which less than 5 and not equal to 2. For		/// a non-negative integer, which less than 5 and not equal to 2. For
/// `ComparesToArgument', holds information about how exactly to compare to		/// `ComparesToArgument', holds information about how exactly to compare to
/// the argument.		/// the argument.
typedef std::vector<std::pair<RangeIntTy, RangeIntTy>> IntRangeVectorTy;		typedef std::vector<std::pair<RangeIntTy, RangeIntTy>> IntRangeVectorTy;

/// A reference to an argument or return value by its number.		/// A reference to an argument or return value by its number.
/// ArgNo in CallExpr and CallEvent is defined as Unsigned, but		/// ArgNo in CallExpr and CallEvent is defined as Unsigned, but
/// obviously uint32_t should be enough for all practical purposes.		/// obviously uint32_t should be enough for all practical purposes.
typedef uint32_t ArgNoTy;		typedef uint32_t ArgNoTy;
static const ArgNoTy Ret = std::numeric_limits<ArgNoTy>::max();		static const ArgNoTy Ret = std::numeric_limits<ArgNoTy>::max();

/// Incapsulates a single range on a single symbol within a branch.		/// Incapsulates a single range on a single symbol within a branch.
class ValueRange {		class ValueRange {
ArgNoTy ArgNo; // Argument to which we apply the range.		ArgNoTy ArgNo; // Argument to which we apply the range.
ValueRangeKindTy Kind; // Kind of range definition.		ValueRangeKindTy Kind; // Kind of range definition.
IntRangeVectorTy Args; // Polymorphic arguments.		IntRangeVectorTy Args; // Polymorphic arguments.

public:		public:
[33 lines not shown]	applyAsOutOfRange(ProgramStateRef State, const CallEvent &Call,
const FunctionSummaryTy &Summary) const;		const FunctionSummaryTy &Summary) const;
ProgramStateRef		ProgramStateRef
applyAsWithinRange(ProgramStateRef State, const CallEvent &Call,		applyAsWithinRange(ProgramStateRef State, const CallEvent &Call,
const FunctionSummaryTy &Summary) const;		const FunctionSummaryTy &Summary) const;
ProgramStateRef		ProgramStateRef
applyAsComparesToArgument(ProgramStateRef State, const CallEvent &Call,		applyAsComparesToArgument(ProgramStateRef State, const CallEvent &Call,
const FunctionSummaryTy &Summary) const;		const FunctionSummaryTy &Summary) const;

		void checkAsWithinRange(ProgramStateRef State, const CallEvent &Call,
		const FunctionSummaryTy &Summary, const BugType &BT,
		CheckerContext &C) const;

public:		public:
ProgramStateRef apply(ProgramStateRef State, const CallEvent &Call,		ProgramStateRef apply(ProgramStateRef State, const CallEvent &Call,
const FunctionSummaryTy &Summary) const {		const FunctionSummaryTy &Summary) const {
switch (Kind) {		switch (Kind) {
case OutOfRange:		case OutOfRange:
return applyAsOutOfRange(State, Call, Summary);		return applyAsOutOfRange(State, Call, Summary);
case WithinRange:		case WithinRange:
return applyAsWithinRange(State, Call, Summary);		return applyAsWithinRange(State, Call, Summary);
case ComparesToArgument:		case ComparesToArgument:
return applyAsComparesToArgument(State, Call, Summary);		return applyAsComparesToArgument(State, Call, Summary);
}		}
llvm_unreachable("Unknown ValueRange kind!");		llvm_unreachable("Unknown ValueRange kind!");
}		}

		void check(ProgramStateRef State, const CallEvent &Call,
		Szelethus: Maybe `complement` would be a better name? That sounds a lot more like a set operation. Also, this function highlights well that inheritance might not be the best solution here.
		martong (Author): Well, we check the validity of an argument constraint by trying to apply its logical negation. For a range inclusion, the negation is being out of that range. For non-null, it is being null. And so on. The logic for checking an argument constraint is the same for every kind of constraint. And that is the point: to support a new kind of constraint we just have to figure out how to "apply" and "negate" it. In my opinion this is a perfect case for polymorphism.
		Szelethus: We agreed on inheritance in the previous patch, and regarding the name, sure, leave it as-is. :)
		const FunctionSummaryTy &Summary, const BugType &BT,
		CheckerContext &C) const {
		switch (Kind) {
		case OutOfRange:
		llvm_unreachable("Not implemented yet!");
		case WithinRange:
		checkAsWithinRange(State, Call, Summary, BT, C);
		return;
		case ComparesToArgument:
		martong (Author): This `default` branch is not needed here (actually gives a compiler warning too).
		martong (Author), replying to the comment above: I am not sure why I thought it was not needed; we actually do need it. (Perhaps an intermediate version returned in every case.)
		llvm_unreachable("Not implemented yet!");
		}
		llvm_unreachable("Unknown ValueRange kind!");
		}
};		};
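The "apply and negate" idea from the discussion above can be sketched outside the analyzer. The block below is only an illustration under stated assumptions: `ToyState`, `ToyConstraint`, `ToyWithinRange`, `ToyOutOfRange` and `definitelyViolated` are hypothetical names, and the state is a single closed interval rather than a real `ProgramStateRef`. It is not the checker's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>

// Toy stand-in for the analyzer state: the values an argument may still
// take, kept as one closed interval [Lo, Hi]. (Hypothetical type.)
struct ToyState {
  int64_t Lo;
  int64_t Hi;
};

// Hypothetical interface: every constraint can be applied and negated.
class ToyConstraint {
public:
  virtual ~ToyConstraint() = default;
  // Returns the narrowed state, or std::nullopt if no value can satisfy it.
  virtual std::optional<ToyState> apply(const ToyState &S) const = 0;
  virtual std::unique_ptr<ToyConstraint> negate() const = 0;
};

class ToyWithinRange : public ToyConstraint {
  int64_t Min, Max;

public:
  ToyWithinRange(int64_t Min, int64_t Max) : Min(Min), Max(Max) {
    assert(Min <= Max);
  }
  std::optional<ToyState> apply(const ToyState &S) const override {
    ToyState R{std::max(S.Lo, Min), std::min(S.Hi, Max)};
    if (R.Lo > R.Hi)
      return std::nullopt;
    return R;
  }
  std::unique_ptr<ToyConstraint> negate() const override;
};

class ToyOutOfRange : public ToyConstraint {
  int64_t Min, Max;

public:
  ToyOutOfRange(int64_t Min, int64_t Max) : Min(Min), Max(Max) {
    assert(Min <= Max);
  }
  std::optional<ToyState> apply(const ToyState &S) const override {
    if (S.Lo >= Min && S.Hi <= Max)
      return std::nullopt; // every remaining value lies inside the range
    // Over-approximation: one interval cannot express a hole, so the state
    // is returned unchanged whenever some value lies outside the range.
    return S;
  }
  std::unique_ptr<ToyConstraint> negate() const override {
    return std::make_unique<ToyWithinRange>(Min, Max);
  }
};

std::unique_ptr<ToyConstraint> ToyWithinRange::negate() const {
  return std::make_unique<ToyOutOfRange>(Min, Max);
}

// A violation is certain only if the negation is feasible and the
// constraint itself is not.
bool definitelyViolated(const ToyConstraint &C, const ToyState &S) {
  return C.negate()->apply(S).has_value() && !C.apply(S).has_value();
}
```

With a `ToyWithinRange(-1, 255)` constraint, the concrete state `{256, 256}` is reported as a definite violation, while the wide state `{0, 1000}` is not, because some of its values still satisfy the constraint.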

/// The complete list of ranges that defines a single branch.		/// The complete list of ranges that defines a single branch.
typedef std::vector<ValueRange> ValueRangeSet;		typedef std::vector<ValueRange> ValueRangeSet;

using ArgTypesTy = std::vector<QualType>;		using ArgTypesTy = std::vector<QualType>;
using RetTypeTy = QualType;		using RetTypeTy = QualType;
using RangesTy = std::vector<ValueRangeSet>;		using RangesTy = std::vector<ValueRangeSet>;

/// Includes information about function prototype (which is necessary to		/// Includes information about
		/// * function prototype (which is necessary to
/// ensure we're modeling the right function and casting values properly),		/// ensure we're modeling the right function and casting values properly),
/// approach to invalidation, and a list of branches - essentially, a list		/// * approach to invalidation,
/// of list of ranges - essentially, a list of lists of lists of segments.		/// * a list of branches - a list of list of ranges -
		/// i.e. a list of lists of lists of segments,
		Szelethus: I think that is a rather poor example to help understand what `list of list of ranges` means :) Could you try to find something better?
		martong (Author): Yeah, that part definitely should be reworded.
		martong (Author): I added an example with `isalpha`.
		balazske: The "branches" are the structures that define relations between arguments and return values? This could be included in the description.
		martong (Author): Not exactly. A branch represents a path in the exploded graph of a function (which is a tree). So, a branch is a series of assumptions. In other words, branches represent split states and additional assumptions on top of the splitting assumption. I added this explanation to the comments.
		/// * a list of argument constraints, that must be true on every branch.
		/// If these constraints are not satisfied that means a fatal error
		/// usually resulting in undefined behaviour.
struct FunctionSummaryTy {		struct FunctionSummaryTy {
const ArgTypesTy ArgTypes;		const ArgTypesTy ArgTypes;
const RetTypeTy RetType;		const RetTypeTy RetType;
const InvalidationKindTy InvalidationKind;		const InvalidationKindTy InvalidationKind;
RangesTy Ranges;		RangesTy Ranges;
		ValueRangeSet ArgConstraints;

FunctionSummaryTy(ArgTypesTy ArgTypes, RetTypeTy RetType,		FunctionSummaryTy(ArgTypesTy ArgTypes, RetTypeTy RetType,
InvalidationKindTy InvalidationKind)		InvalidationKindTy InvalidationKind)
: ArgTypes(ArgTypes), RetType(RetType),		: ArgTypes(ArgTypes), RetType(RetType),
InvalidationKind(InvalidationKind) {}		InvalidationKind(InvalidationKind) {}

FunctionSummaryTy &Specification(ValueRangeSet VRS) {		FunctionSummaryTy &Specification(ValueRangeSet VRS) {
Ranges.push_back(VRS);		Ranges.push_back(VRS);
return *this;		return *this;
}		}
		FunctionSummaryTy &ArgConstraint(ValueRange VR) {
		ArgConstraints.push_back(VR);
		return *this;
		}

private:		private:
static void assertTypeSuitableForSummary(QualType T) {		static void assertTypeSuitableForSummary(QualType T) {
assert(!T->isVoidType() &&		assert(!T->isVoidType() &&
"We should have had no significant void types in the spec");		"We should have had no significant void types in the spec");
assert(T.isCanonical() &&		assert(T.isCanonical() &&
"We should only have canonical types in the spec");		"We should only have canonical types in the spec");
// FIXME: lift this assert (but not the ones above!)		// FIXME: lift this assert (but not the ones above!)
[20 lines not shown]	class StdLibraryFunctionsChecker
// may have different definitions on different platforms.		// may have different definitions on different platforms.
typedef std::vector<FunctionSummaryTy> FunctionVariantsTy;		typedef std::vector<FunctionSummaryTy> FunctionVariantsTy;

// The map of all functions supported by the checker. It is initialized		// The map of all functions supported by the checker. It is initialized
// lazily, and it doesn't change after initialization.		// lazily, and it doesn't change after initialization.
typedef llvm::StringMap<FunctionVariantsTy> FunctionSummaryMapTy;		typedef llvm::StringMap<FunctionVariantsTy> FunctionSummaryMapTy;
mutable FunctionSummaryMapTy FunctionSummaryMap;		mutable FunctionSummaryMapTy FunctionSummaryMap;

		BugType BT{this, "Unsatisfied argument constraints", categories::LogicError};
		Szelethus: By passing `this`, the error message will be tied to the modeling checker, not to the one you just added. `BugType` has a constructor that accepts a string instead, pass `CheckNames[CK_StdCLibraryFunctionArgsChecker]` in there :) Also, how about `BT_InvalidArgument` or something?
		martong (Author): Thanks, good catch, I did not know about that. Please note that using `CheckNames` requires that the `BT` member be lazily initialized. `CheckNames` is initialized only after the checker itself is created, so we cannot initialize `BT` during the checker's construction, because that would happen before `CheckNames` is set. So, I changed `BT` to be a `unique_ptr` that is lazily initialized in `reportBug`.

// Auxiliary functions to support ArgNoTy within all structures		// Auxiliary functions to support ArgNoTy within all structures
// in a unified manner.		// in a unified manner.
static QualType getArgType(const FunctionSummaryTy &Summary, ArgNoTy ArgNo) {		static QualType getArgType(const FunctionSummaryTy &Summary, ArgNoTy ArgNo) {
return Summary.getArgType(ArgNo);		return Summary.getArgType(ArgNo);
}		}
static QualType getArgType(const CallEvent &Call, ArgNoTy ArgNo) {		static QualType getArgType(const CallEvent &Call, ArgNoTy ArgNo) {
return ArgNo == Ret ? Call.getResultType().getCanonicalType()		return ArgNo == Ret ? Call.getResultType().getCanonicalType()
: Call.getArgExpr(ArgNo)->getType().getCanonicalType();		: Call.getArgExpr(ArgNo)->getType().getCanonicalType();
}		}
static QualType getArgType(const CallExpr *CE, ArgNoTy ArgNo) {		static QualType getArgType(const CallExpr *CE, ArgNoTy ArgNo) {
return ArgNo == Ret ? CE->getType().getCanonicalType()		return ArgNo == Ret ? CE->getType().getCanonicalType()
: CE->getArg(ArgNo)->getType().getCanonicalType();		: CE->getArg(ArgNo)->getType().getCanonicalType();
}		}
static SVal getArgSVal(const CallEvent &Call, ArgNoTy ArgNo) {		static SVal getArgSVal(const CallEvent &Call, ArgNoTy ArgNo) {
return ArgNo == Ret ? Call.getReturnValue() : Call.getArgSVal(ArgNo);		return ArgNo == Ret ? Call.getReturnValue() : Call.getArgSVal(ArgNo);
}		}

public:		public:
		void checkPreCall(const CallEvent &Call, CheckerContext &C) const;
void checkPostCall(const CallEvent &Call, CheckerContext &C) const;		void checkPostCall(const CallEvent &Call, CheckerContext &C) const;
bool evalCall(const CallEvent &Call, CheckerContext &C) const;		bool evalCall(const CallEvent &Call, CheckerContext &C) const;

private:		private:
Optional<FunctionSummaryTy> findFunctionSummary(const FunctionDecl *FD,		Optional<FunctionSummaryTy> findFunctionSummary(const FunctionDecl *FD,
const CallExpr *CE,		const CallExpr *CE,
CheckerContext &C) const;		CheckerContext &C) const;

void initFunctionSummaries(BasicValueFactory &BVF) const;		void initFunctionSummaries(BasicValueFactory &BVF) const;
};		};
} // end of anonymous namespace		} // end of anonymous namespace
		balazske: This should be called `reportBug`.
		martong (Author): Yeah, I can't get used to this strange naming convention that LLVM uses. Fixed it.
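The lazy initialization mentioned in the review comment on `BT` can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: `ToyBugType`, `ToyChecker` and `getOrCreateBugType` are hypothetical names, not the analyzer's `BugType` or checker classes. The point is only that the bug type is built on first use, after the name has been set.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for a bug type that records which checker owns it.
struct ToyBugType {
  std::string CheckerName;
  std::string Description;
};

class ToyChecker {
  // Created on first use, because CheckName is not known at construction.
  mutable std::unique_ptr<ToyBugType> BT;

public:
  // Filled in by (hypothetical) registration code after construction.
  std::string CheckName;

  const ToyBugType &getOrCreateBugType() const {
    if (!BT) {
      assert(!CheckName.empty() && "name must be set before the first report");
      BT = std::make_unique<ToyBugType>(
          ToyBugType{CheckName, "Unsatisfied argument constraints"});
    }
    return *BT;
  }
};
```

Repeated calls return the same object, so a report function can call the accessor every time without re-creating the bug type.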

ProgramStateRef StdLibraryFunctionsChecker::ValueRange::applyAsOutOfRange(		ProgramStateRef StdLibraryFunctionsChecker::ValueRange::applyAsOutOfRange(
ProgramStateRef State, const CallEvent &Call,		ProgramStateRef State, const CallEvent &Call,
const FunctionSummaryTy &Summary) const {		const FunctionSummaryTy &Summary) const {

ProgramStateManager &Mgr = State->getStateManager();		ProgramStateManager &Mgr = State->getStateManager();
SValBuilder &SVB = Mgr.getSValBuilder();		SValBuilder &SVB = Mgr.getSValBuilder();
BasicValueFactory &BVF = SVB.getBasicValueFactory();		BasicValueFactory &BVF = SVB.getBasicValueFactory();
[86 lines not shown]	StdLibraryFunctionsChecker::ValueRange::applyAsComparesToArgument(
// Note: we avoid integral promotion for comparison.		// Note: we avoid integral promotion for comparison.
OtherV = SVB.evalCast(OtherV, T, OtherT);		OtherV = SVB.evalCast(OtherV, T, OtherT);
if (auto CompV = SVB.evalBinOp(State, Op, V, OtherV, CondT)		if (auto CompV = SVB.evalBinOp(State, Op, V, OtherV, CondT)
.getAs<DefinedOrUnknownSVal>())		.getAs<DefinedOrUnknownSVal>())
State = State->assume(*CompV, true);		State = State->assume(*CompV, true);
return State;		return State;
}		}

		void StdLibraryFunctionsChecker::ValueRange::checkAsWithinRange(
		ProgramStateRef State, const CallEvent &Call,
		const FunctionSummaryTy &Summary, const BugType &BT,
		CheckerContext &C) const {

		ProgramStateManager &Mgr = State->getStateManager();
		SValBuilder &SVB = Mgr.getSValBuilder();
		BasicValueFactory &BVF = SVB.getBasicValueFactory();
		QualType T = getArgType(Summary, getArgNo());
		SVal V = getArgSVal(Call, getArgNo());
		switch (V.getSubKind()) {
		default:
		// FIXME Handle other cases.
		return;
		case nonloc::ConcreteIntKind: {
		xazax.hun: Dealing with only concrete ints might be a good start, but we might want to handle symbolic cases in the future, like: `if (v > 255) return isalpha(v);` I am OK with not adding this in the first version, but adding TODOs and test cases upfront cannot hurt. So basically, I was wondering if we should query the solver for the result instead of matching the SVal kind, and just return early if we do not want to support a specific kind.
		balazske: This check currently works with concrete int values. We have a known value and a list of ranges with known limits, so testing whether the value is in any of the ranges works the same way as testing whether it is outside all of them, and testing whether the value is inside one of the ranges is simpler code. But I think symbolic evaluation with the "eval" and "assume" functions would be more generic here (and simpler code). Then the approach of cutting off the bad ranges is usable (though there is probably another solution as well).
		const llvm::APSInt &IntVal = V.castAs<nonloc::ConcreteInt>().getValue();
		const IntRangeVectorTy &R = getRanges();
		const llvm::APSInt &Min = BVF.getValue(R[0].first, T);
		const llvm::APSInt &Max = BVF.getValue(R[0].second, T);
		assert(Min <= Max);

		// Out of range.
		if (IntVal < Min || IntVal > Max) {
		if (ExplodedNode *N = C.generateErrorNode(State)) {
		// FIXME Add detailed diagnostic.
		NoQ: Yeah, a must-have for this check to be enabled by default would be to be able to provide a specific warning message for every function. I guess we could include them in the summaries as an extra argument of `ArgConstraint`.
		martong (Author): What about warning messages with placeholders? E.g. "Argument constraint of {nth} argument of {fun} is not satisfied. The value is {v}, however it should be in {range}." There will be a bunch of functions whose warning message template would be the same. On the other hand, some others could have different warnings, and that justifies the need for specialized warnings. Still, I think the warning message in the summary should be optional, because otherwise it would be really hard to automatically add summaries from other sources (like cppcheck). No matter how it turns out, this should be handled in a different patch.
		std::string Msg = "Function argument constraint is not satisfied";
		auto R = std::make_unique<PathSensitiveBugReport>(BT, Msg, N);
		bugreporter::trackExpressionValue(N, Call.getArgExpr(0), *R);
		NoQ: Let's test our notes. That'll be especially important when we get to non-concrete values, because the visitor might need to be expanded (or we might need a completely new visitor).
		martong (Author): OK, I added a separate test file where the tests focus on the bug path.
		C.emitReport(std::move(R));
		}
		}
		}
		} // end switch
		}
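As posted, `checkAsWithinRange` reads only the first range (`R[0]`), and the review comments mention testing whether a concrete value is inside any of the ranges. A minimal sketch of that multi-range test follows. `ToyRange` and `isWithinAnyRange` are hypothetical names, and plain `int64_t` values stand in for `llvm::APSInt`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for one closed range [first, second].
using ToyRange = std::pair<int64_t, int64_t>;

// Returns true if the concrete value V lies inside at least one range.
bool isWithinAnyRange(int64_t V, const std::vector<ToyRange> &Ranges) {
  for (const ToyRange &R : Ranges) {
    assert(R.first <= R.second && "malformed range");
    if (R.first <= V && V <= R.second)
      return true;
  }
  return false;
}
```

A concrete argument would then count as out of range exactly when `isWithinAnyRange` returns false for the constraint's range list.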

		void StdLibraryFunctionsChecker::checkPreCall(const CallEvent &Call,
		CheckerContext &C) const {
		const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());
		if (!FD)
		return;

		const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
		if (!CE)
		return;

		Optional<FunctionSummaryTy> FoundSummary = findFunctionSummary(FD, CE, C);
		if (!FoundSummary)
		return;

		const FunctionSummaryTy &Summary = *FoundSummary;
		ProgramStateRef State = C.getState();
		for (const auto &VR : Summary.ArgConstraints) {
		VR.check(State, Call, Summary, BT, C);
		}
		}
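The placeholder-based warning message proposed in the review of `checkAsWithinRange` was deferred to a later patch. Purely as an illustration of what such a template might expand to, here is a stand-alone sketch. The helper names `ordinal` and `formatConstraintWarning` are hypothetical, and the wording is taken from the example message in the review comment, not from any committed code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: 1 -> "1st", 2 -> "2nd", 11 -> "11th", and so on.
std::string ordinal(unsigned N) {
  const unsigned LastTwo = N % 100;
  const char *Suffix = "th";
  if (LastTwo < 11 || LastTwo > 13) {
    switch (N % 10) {
    case 1: Suffix = "st"; break;
    case 2: Suffix = "nd"; break;
    case 3: Suffix = "rd"; break;
    default: break;
    }
  }
  return std::to_string(N) + Suffix;
}

// Fills the {nth}, {fun}, {v} and {range} placeholders of the proposed
// message. ArgNumber is 1-based.
std::string formatConstraintWarning(unsigned ArgNumber, const std::string &Fun,
                                    int64_t Value, int64_t Min, int64_t Max) {
  assert(ArgNumber >= 1 && Min <= Max);
  return "Argument constraint of " + ordinal(ArgNumber) + " argument of " +
         Fun + " is not satisfied. The value is " + std::to_string(Value) +
         ", however it should be in [" + std::to_string(Min) + ", " +
         std::to_string(Max) + "].";
}
```

A summary without its own message could fall back to a shared template like this one, which matches the suggestion that per-summary messages stay optional.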

void StdLibraryFunctionsChecker::checkPostCall(const CallEvent &Call,		void StdLibraryFunctionsChecker::checkPostCall(const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());		const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());
if (!FD)		if (!FD)
return;		return;

const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());		const CallExpr *CE = dyn_cast_or_null<CallExpr>(Call.getOriginExpr());
if (!CE)		if (!CE)
return;		return;

Optional<FunctionSummaryTy> FoundSummary = findFunctionSummary(FD, CE, C);		Optional<FunctionSummaryTy> FoundSummary = findFunctionSummary(FD, CE, C);
		NoQ: Maybe we should add an assertion that the same argument isn't specified multiple times.
		martong (Author): I think there could be cases when we want to have e.g. a not-null constraint on the 1st argument, but also want to express that the 1st argument's size is described by the 2nd argument. I am planning to implement such constraints in the future. In that case we would have two constraints on the 1st argument and the assert would fire.
		NoQ: Wait, I misunderstood the code. It's even worse than that: you're adding transitions in a loop, so it'll cause state splits for every constraint. Because you do not intend to create multiple branches here, there needs to be exactly one `addTransition` performed every time `checkPreCall` is called. I.e., for now this code is breaking everything whenever there's more than one constraint, regardless of whether it's on the same argument.
		martong (Author): Yeah, that's a very good catch, thanks! I am going to prepare a patch to fix this soon. My idea is to store the `SuccessSt` and apply the next argument constraint on that. Once the loop is finished, I'll call `addTransition()`.
		NoQ: Yup, that's the common thing to do in such cases.
		NoQ: While we're at it, could you try to come up with a runtime assertion that'll help us prevent these mistakes? Like, dunno, make `CheckerContext` crash whenever there's more than one branch being added, and then add a method to opt out when it's actually necessary to add more transitions (i.e., the user would say `C.setMaxTransitions(2)` at the beginning of their checker callback whenever they need to make a state split, defaulting to 1). It's a bit tricky because I still want to allow multiple transitions when they form one branch (i.e., transitions chained together), but I think it'll take a lot of review anxiety from me, because it's a very dangerous mistake to make and for now code review is the only way to catch it. So, yay, faster code reviews.
		martong (Author): Hmm, I see your point and I agree this would be a valuable sanity check. But if you don't mind, I'd like to address this in a different, stand-alone patch (independently of the quick fix https://reviews.llvm.org/D76790), because it does not seem trivial to me. My first concern is this: if we have `1` as the default value for `maxTransitions`, then we should add an extra `C.setMaxTransitions(N)` in every checker callback that does a state split, is that right?
if (!FoundSummary)		if (!FoundSummary)
return;		return;

// Now apply ranges.		// Now apply ranges.
		Szelethus: That is a `TODO`, rather :^)
const FunctionSummaryTy &Summary = *FoundSummary;		const FunctionSummaryTy &Summary = *FoundSummary;
		balazske: Is this `addTransition` needed? It would be OK to call `generateErrorNode` with `State`. Even if not, adding the transition before should not be needed?
		martong (Author): Yes, you are right, it is superfluous. I removed it.
		steakhal: `StringRef`
ProgramStateRef State = C.getState();		ProgramStateRef State = C.getState();

		// Apply specifications.
for (const auto &VRS: Summary.Ranges) {		for (const auto &VRS: Summary.Ranges) {
		balazske: If `evalCall` is used, it could be simpler to test and apply the constraints for arguments and return value in a "single step".
		Szelethus: While I find your usage of lambdas fascinating, this one seems a bit unnecessary :)
		martong (Author): OK, I moved it to be a member function named `ReportBug`.
ProgramStateRef NewState = State;		ProgramStateRef NewState = State;
for (const auto &VR: VRS) {		for (const auto &VR: VRS) {
NewState = VR.apply(NewState, Call, Summary);		NewState = VR.apply(NewState, Call, Summary);
if (!NewState)		if (!NewState)
break;		break;
}		}

		// Apply argument constraints as well.
		for (const auto &VR : Summary.ArgConstraints)
		if (NewState)
		NewState = VR.apply(NewState, Call, Summary);

if (NewState && NewState != State)		if (NewState && NewState != State)
C.addTransition(NewState);		C.addTransition(NewState);
}		}
}		}
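The fix discussed in the review (carry the successful state through the loop and add a single transition at the end) can be sketched without the analyzer. This is a toy model under stated assumptions: `ChainState`, `ChainContext`, `ChainConstraint`, `atLeast`, `atMost` and `applyAllConstraints` are hypothetical names, and the transition counter only mimics what a `CheckerContext` would record.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

// Toy state: the interval of values an argument may still take.
struct ChainState {
  int Lo;
  int Hi;
};

// A constraint narrows a state or reports that it is infeasible.
using ChainConstraint =
    std::function<std::optional<ChainState>(const ChainState &)>;

// Toy context that counts how many transitions a callback added.
struct ChainContext {
  unsigned NumTransitions = 0;
  std::optional<ChainState> Last;
  void addTransition(const ChainState &S) {
    ++NumTransitions;
    Last = S;
  }
};

ChainConstraint atLeast(int Min) {
  return [Min](const ChainState &S) -> std::optional<ChainState> {
    ChainState R{std::max(S.Lo, Min), S.Hi};
    if (R.Lo > R.Hi)
      return std::nullopt;
    return R;
  };
}

ChainConstraint atMost(int Max) {
  return [Max](const ChainState &S) -> std::optional<ChainState> {
    ChainState R{S.Lo, std::min(S.Hi, Max)};
    if (R.Lo > R.Hi)
      return std::nullopt;
    return R;
  };
}

// Each constraint is applied to the result of the previous one; exactly one
// transition is added, and only after the whole loop succeeded.
void applyAllConstraints(const ChainState &Initial,
                         const std::vector<ChainConstraint> &Constraints,
                         ChainContext &C) {
  assert(C.NumTransitions == 0 && "expects a fresh context per callback");
  std::optional<ChainState> Current = Initial;
  for (const ChainConstraint &Apply : Constraints) {
    Current = Apply(*Current);
    if (!Current)
      return; // infeasible: a real checker would report a bug here instead
  }
  C.addTransition(*Current);
}
```

Calling `addTransition` inside the loop instead would record one transition per constraint, which is the state split the review warns about.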

bool StdLibraryFunctionsChecker::evalCall(const CallEvent &Call,		bool StdLibraryFunctionsChecker::evalCall(const CallEvent &Call,
CheckerContext &C) const {		CheckerContext &C) const {
const auto *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());		const auto *FD = dyn_cast_or_null<FunctionDecl>(Call.getDecl());
[206 lines not shown]	auto GetlineT = [&](RetTypeTy R, RangeIntTy Max) {
return Summary(ArgTypesTy{Irrelevant, Irrelevant, Irrelevant}, RetTypeTy(R),		return Summary(ArgTypesTy{Irrelevant, Irrelevant, Irrelevant}, RetTypeTy(R),
NoEvalCall)		NoEvalCall)
.Specification(		.Specification(
{ReturnValueCondition(WithinRange, {{-1, -1}, {1, Max}})});		{ReturnValueCondition(WithinRange, {{-1, -1}, {1, Max}})});
};		};

FunctionSummaryMap = {		FunctionSummaryMap = {
// The isascii() family of functions.		// The isascii() family of functions.
		// The behavior is undefined if the value of the argument is neither
		// representable as unsigned char nor equal to EOF. See e.g. C99
		// 7.4.1.2 The isalpha function (p: 181-182).
		Szelethus: This is true for the rest of the summaries as well, but shouldn't we retrieve the `unsigned char` size from `ASTContext`?
		martong (Author): Yes, this is a good idea. I will do this. What really bothers me, however, is that we should handle EOF in a platform-dependent way as well, and I have absolutely no idea how to do that, given that it is defined by a macro in a platform-specific header file. I am desperately in need of help and ideas about how we could get the value of EOF for the analyzed platform.
		gamesh411: If EOF is not used in the analyzed TU, then there would be no way to find the specific `#define`. Another approach would be to check whether the value is defined by an expression that is the EOF define (maybe transitively?).
		martong (Author): I believe that the given standard C library implementation (e.g. glibc) must provide a header for the prototypes of these functions where EOF is also defined, transitively, in one of the dependent system headers. Otherwise user code could misuse the value of EOF and the program would behave in an undefined manner. C99 clearly states that you should `#include <ctype.h>` to use `isalpha`.
{		{
"isalnum",		"isalnum",
FunctionVariantsTy{		FunctionVariantsTy{
Summary(ArgTypesTy{IntTy}, RetTypeTy(IntTy), EvalCallAsPure)		Summary(ArgTypesTy{IntTy}, RetTypeTy(IntTy), EvalCallAsPure)
// Boils down to isupper() or islower() or isdigit().		// Boils down to isupper() or islower() or isdigit().
.Specification(		.Specification(
{ArgumentCondition(0U, WithinRange,		{ArgumentCondition(0U, WithinRange,
{{'0', '9'}, {'A', 'Z'}, {'a', 'z'}}),		{{'0', '9'}, {'A', 'Z'}, {'a', 'z'}}),
ReturnValueCondition(OutOfRange, SingleValue(0))})		ReturnValueCondition(OutOfRange, SingleValue(0))})
// The locale-specific range.		// The locale-specific range.
// No post-condition. We are completely unaware of		// No post-condition. We are completely unaware of
// locale-specific return values.		// locale-specific return values.
.Specification(		.Specification(
		balazske: Why is this `{128, UCharMax}` needed here and at the next entry?
		martong (Author): This is the locale-specific range, [128, 255]. There are characters like `ä` for which we don't know whether they are treated as alphanumeric or not. We can't really tell how a specific libc implementation classifies them. On the other hand, with English letters we can state the classes confidently.
{ArgumentCondition(0U, WithinRange, {{128, 255}})})		{ArgumentCondition(0U, WithinRange, {{128, 255}})})
.Specification(		.Specification(
		balazske: Is this `ArgConstraint` intentionally added only to `isalnum`?
		martong (Author): Yes, I wanted to create the infrastructure first and then add all these constraints to the rest of the summaries later, with new tests.
{ArgumentCondition(		{ArgumentCondition(
0U, OutOfRange,		0U, OutOfRange,
{{'0', '9'}, {'A', 'Z'}, {'a', 'z'}, {128, 255}}),		{{'0', '9'}, {'A', 'Z'}, {'a', 'z'}, {128, 255}}),
ReturnValueCondition(WithinRange, SingleValue(0))})},		ReturnValueCondition(WithinRange, SingleValue(0))})
		.ArgConstraint(
		ArgumentCondition(0U, WithinRange, {{-1, 255}}))},
},		},
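To make the branch structure of the `isalnum` summary concrete, the sketch below classifies a concrete argument the way the three `Specification` branches and the `ArgConstraint` describe it. It is only a reading aid under stated assumptions: `IsalnumModel` and `modelIsalnum` are hypothetical names, EOF is passed in as a parameter (defaulting to -1, as in the test file), and the host's `unsigned char` maximum stands in for the value the review suggests taking from `ASTContext`.

```cpp
#include <cassert>
#include <limits>

// Hypothetical outcome of modeling isalnum() on one concrete argument.
enum class IsalnumModel {
  ConstraintViolated, // argument is neither an unsigned char value nor EOF
  ReturnsNonZero,     // first branch: digit or ASCII letter
  ReturnUnknown,      // second branch: locale-specific range, no post-condition
  ReturnsZero         // third branch: everything else that is valid
};

IsalnumModel modelIsalnum(
    int Arg, int EofValue = -1,
    int UCharMax = std::numeric_limits<unsigned char>::max()) {
  assert(EofValue < 0 && UCharMax >= 127);

  // Argument constraint: representable as unsigned char, or equal to EOF.
  const bool FitsUChar = 0 <= Arg && Arg <= UCharMax;
  if (!FitsUChar && Arg != EofValue)
    return IsalnumModel::ConstraintViolated;

  // Branch 1: {'0','9'}, {'A','Z'}, {'a','z'} implies a non-zero result.
  if (('0' <= Arg && Arg <= '9') || ('A' <= Arg && Arg <= 'Z') ||
      ('a' <= Arg && Arg <= 'z'))
    return IsalnumModel::ReturnsNonZero;

  // Branch 2: the locale-specific range [128, UCharMax].
  if (128 <= Arg && Arg <= UCharMax)
    return IsalnumModel::ReturnUnknown;

  // Branch 3: outside all ranges above, so the result is zero.
  return IsalnumModel::ReturnsZero;
}
```

Each valid argument falls into exactly one branch, which is why the summary can split the state per branch without overlap.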
{		{
"isalpha",		"isalpha",
FunctionVariantsTy{		FunctionVariantsTy{
Summary(ArgTypesTy{IntTy}, RetTypeTy(IntTy), EvalCallAsPure)		Summary(ArgTypesTy{IntTy}, RetTypeTy(IntTy), EvalCallAsPure)
.Specification(		.Specification(
{ArgumentCondition(0U, WithinRange,		{ArgumentCondition(0U, WithinRange,
{{'A', 'Z'}, {'a', 'z'}}),		{{'A', 'Z'}, {'a', 'z'}}),
[200 lines not shown]

clang/test/Analysis/std-c-library-functions-arg-constraints.c

This file was added.

				// RUN: %clang_analyze_cc1 %s \
				// RUN: -analyzer-checker=apiModeling.StdCLibraryFunctions \
				// RUN: -analyzer-checker=debug.ExprInspection \
				// RUN: -triple x86_64-unknown-linux-gnu

				void clang_analyzer_eval(int);

				int glob;
				Szelethus: Hmm, why do we have 2 different test files that essentially do the same thing? Shouldn't we have only a single one with `analyzer-output=text`?
				martong (Author): No, I wanted to have two different test files to test two different things: (1) that the constraints are applied (here we don't care about the warnings and the path), and (2) that we get a warning with the proper tracking and notes.
				Szelethus: What if we had different `-verify`s? `clang/test/Analysis/track-conditions.cpp` is a great example.
				martong (Author): Yeah, that's a very good approach, I just changed it like that. :)

				typedef struct FILE FILE;
				#define EOF -1

				int isalnum(int);
				void test_alnum_concrete() {
				int ret = isalnum(256); // expected-warning{{Function argument constraint is not satisfied}}
				(void)ret;
				}
				void test_alnum_symbolic(int x) {
				int ret = isalnum(x);
				(void)ret;
				clang_analyzer_eval(EOF <= x && x <= 255); // expected-warning{{TRUE}}
				}

[analyzer] StdLibraryFunctionsChecker: Add argument constraints (Closed, Public)

Diff 242087
