Index: clang/www/OpenProjects.html
===================================================================
--- clang/www/OpenProjects.html
+++ clang/www/OpenProjects.html
@@ -44,19 +44,19 @@
 Clang is built as a set of libraries, which means that it is possible to implement capabilities similar to other source language tools, improving them in various ways. Three examples are distcc, the distcc, the delta testcase reduction tool, and the "indent" source reformatting tool. distcc can be improved to scale better and be more efficient. Delta could be faster and more efficient at reducing C-family programs if built on the clang preprocessor. The clang-based indent replacement,
-clang-format,
+clang-format,
 could be taught to handle simple structural rules like those in the LLVM coding +href="https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code">the LLVM coding standards.
-See also PR4127.
+See also PR4127.
Index: clang/www/analyzer/annotations.html
===================================================================
--- clang/www/analyzer/annotations.html
+++ clang/www/analyzer/annotations.html
@@ -17,18 +17,18 @@
 Source Annotations
The Clang frontend supports several source-level annotations in the form of -GCC-style +GCC-style attributes and pragmas that can help make using the Clang Static Analyzer more useful. These annotations can both help suppress false positives as well as enhance the analyzer's ability to find bugs.
This page gives a practical overview of such annotations. For more technical specifics regarding Clang-specific annotations please see the Clang's list of language +href="https://clang.llvm.org/docs/LanguageExtensions.html">language extensions. Details of "standard" GCC attributes (that Clang also -supports) can be found in the GCC +supports) can be found in the GCC manual, with the majority of the relevant attributes being in the section on -function +function attributes.
Note that attributes that are labeled Clang-specific are not @@ -68,7 +68,7 @@
The analyzer recognizes the GCC attribute 'nonnull', which indicates that a function expects that a given function parameter is not a null pointer. Specific details of the syntax of using the 'nonnull' attribute can be found in GCC's +href="https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-nonnull-function-attribute">GCC's documentation.
Both the Clang compiler and GCC will flag warnings for simple cases where a @@ -108,7 +108,7 @@ int bar(int*p, int q, int *r) __attribute__((nonnull(1,3))); int foo(int *p, int *q) { - return !p ? bar(q, 2, p) + return !p ? bar(q, 2, p) : bar(p, 2, q); } @@ -138,8 +138,8 @@
One can educate the analyzer (and others who read your code) about methods or functions that deviate from the Cocoa and Core Foundation conventions using the attributes described here. However, you should consider using proper naming -conventions or the objc_method_family +conventions or the objc_method_family attribute, if applicable.
The GCC-style (Clang-specific) attribute 'cf_returns_retained' allows one to annotate an Objective-C method or C function as returning a retained Core -Foundation object that the caller is responsible for releasing. The +Foundation object that the caller is responsible for releasing. The CoreFoundation framework defines a macro CF_RETURNS_RETAINED that is functionally equivalent to the one shown below.
@@ -323,7 +323,7 @@ method may appear to obey the Core Foundation or Cocoa conventions and return a retained Core Foundation object, this attribute can be used to indicate that the object reference returned should not be considered as an -"owning" reference being returned to the caller. The +"owning" reference being returned to the caller. The CoreFoundation framework defines a macro CF_RETURNS_NOT_RETAINED that is functionally equivalent to the one shown below. @@ -353,8 +353,8 @@The 'ns_consumed' attribute can be placed on a specific parameter in either the declaration of a function or an Objective-C method. It indicates to the static analyzer that a release message is implicitly sent to the -parameter upon completion of the call to the given function or method. The -Foundation framework defines a macro NS_RELEASES_ARGUMENT that +parameter upon completion of the call to the given function or method. The +Foundation framework defines a macro NS_RELEASES_ARGUMENT that is functionally equivalent to the NS_CONSUMED macro shown below.
Example
@@ -408,7 +408,7 @@ to the given function or method. The CoreFoundation framework defines a macro CF_RELEASES_ARGUMENT that is functionally equivalent to the CF_CONSUMED macro shown below. - +Operationally this attribute is nearly identical to 'ns_consumed'.
Example
@@ -438,7 +438,7 @@ void test2() { CFDateRef date = CFDateCreate(0, CFAbsoluteTimeGetCurrent()); consume_CFDate(date); // No leak, including under GC! - + } @interface Foo : NSObject @@ -463,7 +463,7 @@ follow the standard Cocoa naming conventions.Example
- +#ifndef __has_feature #define __has_feature(x) 0 // Compatibility with non-clang compilers. @@ -573,8 +573,8 @@ OSObject *f; LIBKERN_RETURNS_NOT_RETAINED OSObject *myFieldGetter(); } - - + + // Note that the annotation only has to be applied to the function declaration. OSObject * MyClass::myFieldGetter() { return f; @@ -633,7 +633,7 @@ void getterViaOutParam(LIBKERN_RETURNS_NOT_RETAINED OSObject **obj)
-In such cases a retained object is written into an out parameter, which the caller has then to release in order to avoid a leak. +In such cases a retained object is written into an out parameter, which the caller has then to release in order to avoid a leak.
These two cases are simple - but in practice a functions returning an out-parameter usually also return a return code, and then an out parameter may or may not be written, which conditionally depends on the exit code, e.g.:
@@ -718,7 +718,7 @@The analyzer knows about several well-known assertion handlers, but can automatically infer if a function should be treated as an assertion handler if it is annotated with the 'noreturn' attribute or the (Clang-specific) -'analyzer_noreturn' attribute. Note that, currently, clang does not support +'analyzer_noreturn' attribute. Note that, currently, clang does not support these attributes on Objective-C methods and C++ methods.
Specific details of the syntax of using the 'noreturn' attribute can be found in GCC's +href="https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-noreturn-function-attribute">GCC's documentation.
 Not only does the analyzer exploit this information when pruning false paths,
Index: clang/www/analyzer/available_checks.html
===================================================================
--- clang/www/analyzer/available_checks.html
+++ clang/www/analyzer/available_checks.html
@@ -29,8 +29,8 @@
The static analyzer engine performs path-sensitive exploration of the program and -relies on a set of checkers to implement the logic for detecting and -constructing specific bug reports. Anyone who is interested in implementing their own -checker, should check out the Building a Checker in 24 Hours talk -(slides +
The static analyzer engine performs path-sensitive exploration of the program and +relies on a set of checkers to implement the logic for detecting and +constructing specific bug reports. Anyone who is interested in implementing their own +checker, should check out the Building a Checker in 24 Hours talk +(slides video) -and refer to this page for additional information on writing a checker. The static analyzer is a -part of the Clang project, so consult Hacking on Clang -and LLVM Programmer's Manual -for developer guidelines and send your questions and proposals to -cfe-dev mailing list. +and refer to this page for additional information on writing a checker. The static analyzer is a +part of the Clang project, so consult Hacking on Clang +and LLVM Programmer's Manual +for developer guidelines and send your questions and proposals to +cfe-dev mailing list.
- ProgramPoint - represents the corresponding location in the program (or the CFG). - ProgramPoint is also used to record additional information on - when/how the state was added. For example, PostPurgeDeadSymbolsKind - kind means that the state is the result of purging dead symbols - the - analyzer's equivalent of garbage collection. + ProgramPoint + represents the corresponding location in the program (or the CFG). + ProgramPoint is also used to record additional information on + when/how the state was added. For example, PostPurgeDeadSymbolsKind + kind means that the state is the result of purging dead symbols - the + analyzer's equivalent of garbage collection.
- ProgramState + ProgramState represents abstract state of the program. It consists of:
- Checkers are not merely passive receivers of the analyzer core changes - they + Checkers are not merely passive receivers of the analyzer core changes - they actively participate in the ProgramState construction through the - GenericDataMap which can be used to store the checker-defined part - of the state. Each time the analyzer engine explores a new statement, it - notifies each checker registered to listen for that statement, giving it an - opportunity to either report a bug or modify the state. (As a rule of thumb, - the checker itself should be stateless.) The checkers are called one after another - in the predefined order; thus, calling all the checkers adds a chain to the + GenericDataMap which can be used to store the checker-defined part + of the state. Each time the analyzer engine explores a new statement, it + notifies each checker registered to listen for that statement, giving it an + opportunity to either report a bug or modify the state. (As a rule of thumb, + the checker itself should be stateless.) The checkers are called one after another + in the predefined order; thus, calling all the checkers adds a chain to the ExplodedGraph.
- +- During symbolic execution, SVal - objects are used to represent the semantic evaluation of expressions. - They can represent things like concrete - integers, symbolic values, or memory locations (which are memory regions). - They are a discriminated union of "values", symbolic and otherwise. - If a value isn't symbolic, usually that means there is no symbolic - information to track. For example, if the value was an integer, such as - 42, it would be a ConcreteInt, - and the checker doesn't usually need to track any state with the concrete - number. In some cases, SVal is not a symbol, but it really should be - a symbolic value. This happens when the analyzer cannot reason about something - (yet). An example is floating point numbers. In such cases, the - SVal will evaluate to UnknownVal. - This represents a case that is outside the realm of the analyzer's reasoning - capabilities. SVals are value objects and their values can be viewed - using the .dump() method. Often they wrap persistent objects such as + During symbolic execution, SVal + objects are used to represent the semantic evaluation of expressions. + They can represent things like concrete + integers, symbolic values, or memory locations (which are memory regions). + They are a discriminated union of "values", symbolic and otherwise. + If a value isn't symbolic, usually that means there is no symbolic + information to track. For example, if the value was an integer, such as + 42, it would be a ConcreteInt, + and the checker doesn't usually need to track any state with the concrete + number. In some cases, SVal is not a symbol, but it really should be + a symbolic value. This happens when the analyzer cannot reason about something + (yet). An example is floating point numbers. In such cases, the + SVal will evaluate to UnknownVal. + This represents a case that is outside the realm of the analyzer's reasoning + capabilities. 
SVals are value objects and their values can be viewed + using the .dump() method. Often they wrap persistent objects such as symbols or regions.
- SymExpr (symbol) - is meant to represent abstract, but named, symbolic value. Symbols represent - an actual (immutable) value. We might not know what its specific value is, but - we can associate constraints with that value as we analyze a path. For - example, we might record that the value of a symbol is greater than + SymExpr (symbol) + is meant to represent abstract, but named, symbolic value. Symbols represent + an actual (immutable) value. We might not know what its specific value is, but + we can associate constraints with that value as we analyze a path. For + example, we might record that the value of a symbol is greater than 0, etc.
- MemRegion is similar to a symbol. - It is used to provide a lexicon of how to describe abstract memory. Regions can - layer on top of other regions, providing a layered approach to representing memory. - For example, a struct object on the stack might be represented by a VarRegion, - but a FieldRegion which is a subregion of the VarRegion could + MemRegion is similar to a symbol. + It is used to provide a lexicon of how to describe abstract memory. Regions can + layer on top of other regions, providing a layered approach to representing memory. + For example, a struct object on the stack might be represented by a VarRegion, + but a FieldRegion which is a subregion of the VarRegion could be used to represent the memory associated with a specific field of that object. - So how do we represent symbolic memory regions? That's what - SymbolicRegion - is for. It is a MemRegion that has an associated symbol. Since the + So how do we represent symbolic memory regions? That's what + SymbolicRegion + is for. It is a MemRegion that has an associated symbol. Since the symbol is unique and has a unique name; that symbol names the region.
- +Let's see how the analyzer processes the expressions in the following example:
@@ -193,60 +193,60 @@
-Let's look at how x*2 gets evaluated. When x is evaluated,
-we first construct an SVal that represents the lvalue of x, in
-this case it is an SVal that references the MemRegion for x.
-Afterwards, when we do the lvalue-to-rvalue conversion, we get a new SVal,
-which references the value currently bound to x. That value is
-symbolic; it's whatever x was bound to at the start of the function.
-Let's call that symbol $0. Similarly, we evaluate the expression for 2,
-and get an SVal that references the concrete number 2. When
-we evaluate x*2, we take the two SVals of the subexpressions,
-and create a new SVal that represents their multiplication (which in
-this case is a new symbolic expression, which we might call $1). When we
-evaluate the assignment to y, we again compute its lvalue (a MemRegion),
-and then bind the SVal for the RHS (which references the symbolic value $1)
+Let's look at how x*2 gets evaluated. When x is evaluated,
+we first construct an SVal that represents the lvalue of x, in
+this case it is an SVal that references the MemRegion for x.
+Afterwards, when we do the lvalue-to-rvalue conversion, we get a new SVal,
+which references the value currently bound to x. That value is
+symbolic; it's whatever x was bound to at the start of the function.
+Let's call that symbol $0. Similarly, we evaluate the expression for 2,
+and get an SVal that references the concrete number 2. When
+we evaluate x*2, we take the two SVals of the subexpressions,
+and create a new SVal that represents their multiplication (which in
+this case is a new symbolic expression, which we might call $1). When we
+evaluate the assignment to y, we again compute its lvalue (a MemRegion),
+and then bind the SVal for the RHS (which references the symbolic value $1)
to the MemRegion in the symbolic store.
-The second line is similar. When we evaluate x again, we do the same
-dance, and create an SVal that references the symbol $0. Note, two SVals
+The second line is similar. When we evaluate x again, we do the same
+dance, and create an SVal that references the symbol $0. Note, two SVals
might reference the same underlying values.
-To summarize, MemRegions are unique names for blocks of memory. Symbols are -unique names for abstract symbolic values. Some MemRegions represents abstract -symbolic chunks of memory, and thus are also based on symbols. SVals are just -references to values, and can reference either MemRegions, Symbols, or concrete +To summarize, MemRegions are unique names for blocks of memory. Symbols are +unique names for abstract symbolic values. Some MemRegions represents abstract +symbolic chunks of memory, and thus are also based on symbols. SVals are just +references to values, and can reference either MemRegions, Symbols, or concrete values (e.g., the number 1).
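As a rough illustration of the walkthrough above, the following C function is a minimal sketch of the kind of code being described. The example the page itself uses is not visible in this diff, so the names `foo`, `x`, `y`, and `z` and the return statement are assumptions; the comments only restate the symbol names ($0, $1) used in the text.

```c
/* Hypothetical function for illustration; not taken from the patched page. */
int foo(int x) {
  /* Evaluating x first yields an SVal for its MemRegion (the lvalue).
     The lvalue-to-rvalue conversion then yields the symbol $0.
     The literal 2 evaluates to a concrete integer.
     x * 2 becomes a new symbolic expression, called $1 in the text,
     and that SVal is bound to the MemRegion of y. */
  int y = x * 2;

  /* x evaluates to $0 again, so z is bound to the same symbol. */
  int z = x;

  /* Added only so the sketch uses both variables and compiles cleanly. */
  return y + z;
}
```

Both `x` reads produce distinct SVal objects that refer to the same underlying symbol, which is the point the summary paragraph makes.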
-All checkers inherit from the +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1Checker.html"> Checker template class; the template parameter(s) describe the type of events that the checker is interested in processing. The various types of events that are available are described in the file +href="https://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html"> CheckerDocumentation.cpp
For each event type requested, a corresponding callback function must be defined in the checker class ( +href="https://clang.llvm.org/doxygen/CheckerDocumentation_8cpp_source.html"> CheckerDocumentation.cpp shows the correct function name and signature for each event type). @@ -335,13 +335,13 @@
These events that will be used for each of these actions are, respectively, PreCall, +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PreCall.html">PreCall, PostCall, +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PostCall.html">PostCall, DeadSymbols, +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1DeadSymbols.html">DeadSymbols, and PointerEscape. +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1check_1_1PointerEscape.html">PointerEscape. The high-level structure of the checker's class is thus:
@@ -376,22 +376,22 @@
When a checker detects a mistake in the analyzed code, it needs a way to report it to the analyzer core so that it can be displayed. The two classes used to construct this report are BugType +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugType.html">BugType and +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1BugReport.html"> BugReport.
@@ -496,39 +496,39 @@ null pointer, on the other hand, should stop analysis, as there is no way for the program to meaningfully continue after such an error. -
If analysis can continue, then the most recent ExplodedNode -generated by the checker can be passed to the BugReport constructor -without additional modification. This ExplodedNode will be the one +
If analysis can continue, then the most recent ExplodedNode +generated by the checker can be passed to the BugReport constructor +without additional modification. This ExplodedNode will be the one returned by the most recent call to CheckerContext::addTransition. +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition. If no transition has been performed during the current callback, the checker should call CheckerContext::addTransition() +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#a264f48d97809707049689c37aa35af78">CheckerContext::addTransition() and use the returned node for bug reporting.
If analysis can not continue, then the current state should be transitioned into a so-called sink node, a node from which no further analysis will be performed. This is done by calling the +href="https://clang.llvm.org/doxygen/classclang_1_1ento_1_1CheckerContext.html#adeea33a5a2bed190210c4a2bb807a6f0"> CheckerContext::generateSink function; this function is the same as the addTransition function, but marks the state as a sink node. Like addTransition, this returns an ExplodedNode with the updated state, which can then be passed to the BugReport constructor.
-After a BugReport is created, it should be passed to the analyzer core -by calling CheckerContext::emitReport. +After a BugReport is created, it should be passed to the analyzer core +by calling CheckerContext::emitReport.
$ bin/llvm-lit -sv ../llvm/tools/clang/test/Analysis @@ -796,9 +796,9 @@
In the contrived example above, the analyzer has detected that the body of -the loop is never entered for the case where length <= 0. In this -particular example, you may know that the loop will always be entered because -the input parameter length will be greater than zero in all calls to this -function. You can teach the analyzer facts about your code as well as document -it by using assertions. By adding assert(length > 0) in the beginning -of the function, you tell the analyzer that your code is never expecting a zero +
In the contrived example above, the analyzer has detected that the body of +the loop is never entered for the case where length <= 0. In this +particular example, you may know that the loop will always be entered because +the input parameter length will be greater than zero in all calls to this +function. You can teach the analyzer facts about your code as well as document +it by using assertions. By adding assert(length > 0) in the beginning +of the function, you tell the analyzer that your code is never expecting a zero or a negative value, so it won't need to test the correctness of those paths.
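The assertion technique described above can be sketched as follows. The original example is not shown in this diff, so the function name `ratio` and its body are assumptions chosen to match the description: a loop that is skipped when `length <= 0`, followed by a division that would be unsafe on that path.

```c
#include <assert.h>

/* Hypothetical function for illustration. Without the assertion, the
   path where length <= 0 skips the loop, leaves x at 0, and divides
   by zero on the return statement. */
int ratio(int length) {
  int x = 0;

  /* Documents the precondition; per the text above, this also tells the
     analyzer that zero and negative values are not expected. */
  assert(length > 0);

  for (int i = 0; i < length; i++)
    x += 1;

  return length / x;
}
```

Note that `assert` compiles to nothing when `NDEBUG` is defined, so the precondition is only enforced at run time in builds that keep assertions enabled.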
@@ -198,15 +198,15 @@There is currently no solid mechanism for suppressing an analyzer warning, although this is currently being investigated. When you encounter an analyzer bug/false positive, check if it's one of the issues discussed above or if the -analyzer annotations can -resolve the issue. Second, please report it to +analyzer annotations can +resolve the issue. Second, please report it to help us improve user experience. As the last resort, consider using __clang_analyzer__ macro described below.
When the static analyzer is using clang to parse source files, it implicitly -defines the preprocessor macro __clang_analyzer__. One can use this +
When the static analyzer is using clang to parse source files, it implicitly +defines the preprocessor macro __clang_analyzer__. One can use this macro to selectively exclude code the analyzer examines. Here is an example:
@@ -215,8 +215,8 @@ #endif-This usage is discouraged because it makes the code dead to the analyzer from -now on. Instead, we prefer that users file bugs against the analyzer when it flags +This usage is discouraged because it makes the code dead to the analyzer from +now on. Instead, we prefer that users file bugs against the analyzer when it flags false positives. @@ -224,4 +224,3 @@
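For reference, a fuller sketch of the `__clang_analyzer__` guard discussed above might look like the following. The function `bounded_copy` is hypothetical and is not part of the patched page; it only shows where such a guard would sit. As the text notes, this usage is discouraged, since the guarded code becomes invisible to the analyzer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copies at most n - 1 bytes from src to dst and
   NUL-terminates the result. Returns the number of bytes copied. */
size_t bounded_copy(char *dst, const char *src, size_t n) {
  size_t i = 0;

  assert(dst != NULL && src != NULL);
  if (n == 0)
    return 0;

#ifndef __clang_analyzer__
  /* Compiled in regular builds, but skipped when the static analyzer
     parses the file, because the analyzer defines __clang_analyzer__. */
  for (; i + 1 < n && src[i] != '\0'; ++i)
    dst[i] = src[i];
#endif

  dst[i] = '\0';
  return i;
}
```

In a regular build, calling `bounded_copy(buf, "hello", 4)` copies three bytes and terminates the buffer; under the analyzer, the loop is not seen at all.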