Index: docs/ClangStaticAnalyzer.rst
===================================================================
--- docs/ClangStaticAnalyzer.rst
+++ docs/ClangStaticAnalyzer.rst
@@ -0,0 +1,19 @@
+=====================
+Clang Static Analyzer
+=====================
+
+The Clang Static Analyzer is a source code analysis tool that finds bugs in C, C++, and Objective-C programs.
+It implements *path-sensitive*, *inter-procedural analysis* based on *symbolic execution* technique.
+
+This is the Static Analyzer documentation page.
+
+See the `Official Tool Page `_.
+
+.. toctree::
+ :caption: Table of Contents
+ :numbered:
+ :maxdepth: 2
+
+ analyzer/checkers
+ analyzer/developer-docs
+
Index: docs/analyzer/DebugChecks.rst
===================================================================
--- docs/analyzer/DebugChecks.rst
+++ docs/analyzer/DebugChecks.rst
@@ -1,287 +0,0 @@
-============
-Debug Checks
-============
-
-.. contents::
- :local:
-
-The analyzer contains a number of checkers which can aid in debugging. Enable
-them by using the "-analyzer-checker=" flag, followed by the name of the
-checker.
-
-
-General Analysis Dumpers
-========================
-
-These checkers are used to dump the results of various infrastructural analyses
-to stderr. Some checkers also have "view" variants, which will display a graph
-using a 'dot' format viewer (such as Graphviz on OS X) instead.
-
-- debug.DumpCallGraph, debug.ViewCallGraph: Show the call graph generated for
- the current translation unit. This is used to determine the order in which to
- analyze functions when inlining is enabled.
-
-- debug.DumpCFG, debug.ViewCFG: Show the CFG generated for each top-level
- function being analyzed.
-
-- debug.DumpDominators: Shows the dominance tree for the CFG of each top-level
- function.
-
-- debug.DumpLiveVars: Show the results of live variable analysis for each
- top-level function being analyzed.
-
-- debug.DumpLiveStmts: Show the results of live statement analysis for each
- top-level function being analyzed.
-
-- debug.ViewExplodedGraph: Show the Exploded Graphs generated for the
- analysis of different functions in the input translation unit. When there
- are several functions analyzed, display one graph per function. Beware
- that these graphs may grow very large, even for small functions.
-
-Path Tracking
-=============
-
-These checkers print information about the path taken by the analyzer engine.
-
-- debug.DumpCalls: Prints out every function or method call encountered during a
- path traversal. This is indented to show the call stack, but does NOT do any
- special handling of branches, meaning different paths could end up
- interleaved.
-
-- debug.DumpTraversal: Prints the name of each branch statement encountered
- during a path traversal ("IfStmt", "WhileStmt", etc). Currently used to check
- whether the analysis engine is doing BFS or DFS.
-
-
-State Checking
-==============
-
-These checkers will print out information about the analyzer state in the form
-of analysis warnings. They are intended for use with the -verify functionality
-in regression tests.
-
-- debug.TaintTest: Prints out the word "tainted" for every expression that
- carries taint. At the time of this writing, taint was only introduced by the
- checks under experimental.security.taint.TaintPropagation; this checker may
- eventually move to the security.taint package.
-
-- debug.ExprInspection: Responds to certain function calls, which are modeled
- after builtins. These function calls should affect the program state other
- than the evaluation of their arguments; to use them, you will need to declare
- them within your test file. The available functions are described below.
-
-(FIXME: debug.ExprInspection should probably be renamed, since it no longer only
-inspects expressions.)
-
-
-ExprInspection checks
----------------------
-
-- ``void clang_analyzer_eval(bool);``
-
- Prints TRUE if the argument is known to have a non-zero value, FALSE if the
- argument is known to have a zero or null value, and UNKNOWN if the argument
- isn't sufficiently constrained on this path. You can use this to test other
- values by using expressions like "x == 5". Note that this functionality is
- currently DISABLED in inlined functions, since different calls to the same
- inlined function could provide different information, making it difficult to
- write proper -verify directives.
-
- In C, the argument can be typed as 'int' or as '_Bool'.
-
- Example usage::
-
- clang_analyzer_eval(x); // expected-warning{{UNKNOWN}}
- if (!x) return;
- clang_analyzer_eval(x); // expected-warning{{TRUE}}
-
-
-- ``void clang_analyzer_checkInlined(bool);``
-
- If a call occurs within an inlined function, prints TRUE or FALSE according to
- the value of its argument. If a call occurs outside an inlined function,
- nothing is printed.
-
- The intended use of this checker is to assert that a function is inlined at
- least once (by passing 'true' and expecting a warning), or to assert that a
- function is never inlined (by passing 'false' and expecting no warning). The
- argument is technically unnecessary but is intended to clarify intent.
-
- You might wonder why we can't print TRUE if a function is ever inlined and
- FALSE if it is not. The problem is that any inlined function could conceivably
- also be analyzed as a top-level function (in which case both TRUE and FALSE
- would be printed), depending on the value of the -analyzer-inlining option.
-
- In C, the argument can be typed as 'int' or as '_Bool'.
-
- Example usage::
-
- int inlined() {
- clang_analyzer_checkInlined(true); // expected-warning{{TRUE}}
- return 42;
- }
-
- void topLevel() {
- clang_analyzer_checkInlined(false); // no-warning (not inlined)
- int value = inlined();
- // This assertion will not be valid if the previous call was not inlined.
- clang_analyzer_eval(value == 42); // expected-warning{{TRUE}}
- }
-
-- ``void clang_analyzer_warnIfReached();``
-
- Generate a warning if this line of code gets reached by the analyzer.
-
- Example usage::
-
- if (true) {
- clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
- }
- else {
- clang_analyzer_warnIfReached(); // no-warning
- }
-
-- ``void clang_analyzer_numTimesReached();``
-
- Same as above, but include the number of times this call expression
- gets reached by the analyzer during the current analysis.
-
- Example usage::
-
- for (int x = 0; x < 3; ++x) {
- clang_analyzer_numTimesReached(); // expected-warning{{3}}
- }
-
-- ``void clang_analyzer_warnOnDeadSymbol(int);``
-
- Subscribe for a delayed warning when the symbol that represents the value of
- the argument is garbage-collected by the analyzer.
-
- When calling 'clang_analyzer_warnOnDeadSymbol(x)', if value of 'x' is a
- symbol, then this symbol is marked by the ExprInspection checker. Then,
- during each garbage collection run, the checker sees if the marked symbol is
- being collected and issues the 'SYMBOL DEAD' warning if it does.
- This way you know where exactly, up to the line of code, the symbol dies.
-
- It is unlikely that you call this function after the symbol is already dead,
- because the very reference to it as the function argument prevents it from
- dying. However, if the argument is not a symbol but a concrete value,
- no warning would be issued.
-
- Example usage::
-
- do {
- int x = generate_some_integer();
- clang_analyzer_warnOnDeadSymbol(x);
- } while(0); // expected-warning{{SYMBOL DEAD}}
-
-
-- ``void clang_analyzer_explain(a single argument of any type);``
-
- This function explains the value of its argument in a human-readable manner
- in the warning message. You can make as many overrides of its prototype
- in the test code as necessary to explain various integral, pointer,
- or even record-type values. To simplify usage in C code (where overloading
- the function declaration is not allowed), you may append an arbitrary suffix
- to the function name, without affecting functionality.
-
- Example usage::
-
- void clang_analyzer_explain(int);
- void clang_analyzer_explain(void *);
-
- // Useful in C code
- void clang_analyzer_explain_int(int);
-
- void foo(int param, void *ptr) {
- clang_analyzer_explain(param); // expected-warning{{argument 'param'}}
- clang_analyzer_explain_int(param); // expected-warning{{argument 'param'}}
- if (!ptr)
- clang_analyzer_explain(ptr); // expected-warning{{memory address '0'}}
- }
-
-- ``void clang_analyzer_dump( /* a single argument of any type */);``
-
- Similar to clang_analyzer_explain, but produces a raw dump of the value,
- same as SVal::dump().
-
- Example usage::
-
- void clang_analyzer_dump(int);
- void foo(int x) {
- clang_analyzer_dump(x); // expected-warning{{reg_$0}}
- }
-
-- ``size_t clang_analyzer_getExtent(void *);``
-
- This function returns the value that represents the extent of a memory region
- pointed to by the argument. This value is often difficult to obtain otherwise,
- because no valid code that produces this value. However, it may be useful
- for testing purposes, to see how well does the analyzer model region extents.
-
- Example usage::
-
- void foo() {
- int x, *y;
- size_t xs = clang_analyzer_getExtent(&x);
- clang_analyzer_explain(xs); // expected-warning{{'4'}}
- size_t ys = clang_analyzer_getExtent(&y);
- clang_analyzer_explain(ys); // expected-warning{{'8'}}
- }
-
-- ``void clang_analyzer_printState();``
-
- Dumps the current ProgramState to the stderr. Quickly lookup the program state
- at any execution point without ViewExplodedGraph or re-compiling the program.
- This is not very useful for writing tests (apart from testing how ProgramState
- gets printed), but useful for debugging tests. Also, this method doesn't
- produce a warning, so it gets printed on the console before all other
- ExprInspection warnings.
-
- Example usage::
-
- void foo() {
- int x = 1;
- clang_analyzer_printState(); // Read the stderr!
- }
-
-- ``void clang_analyzer_hashDump(int);``
-
- The analyzer can generate a hash to identify reports. To debug what information
- is used to calculate this hash it is possible to dump the hashed string as a
- warning of an arbitrary expression using the function above.
-
- Example usage::
-
- void foo() {
- int x = 1;
- clang_analyzer_hashDump(x); // expected-warning{{hashed string for x}}
- }
-
-- ``void clang_analyzer_denote(int, const char *);``
-
- Denotes symbols with strings. A subsequent call to clang_analyzer_express()
- will expresses another symbol in terms of these string. Useful for testing
- relationships between different symbols.
-
- Example usage::
-
- void foo(int x) {
- clang_analyzer_denote(x, "$x");
- clang_analyzer_express(x + 1); // expected-warning{{$x + 1}}
- }
-
-- ``void clang_analyzer_express(int);``
-
- See clang_analyzer_denote().
-
-Statistics
-==========
-
-The debug.Stats checker collects various information about the analysis of each
-function, such as how many blocks were reached and if the analyzer timed out.
-
-There is also an additional -analyzer-stats flag, which enables various
-statistics within the analyzer engine. Note the Stats checker (which produces at
-least one bug report per function) may actually change the values reported by
--analyzer-stats.
Index: docs/analyzer/DesignDiscussions/InitializerLists.rst
===================================================================
--- docs/analyzer/DesignDiscussions/InitializerLists.rst
+++ docs/analyzer/DesignDiscussions/InitializerLists.rst
@@ -1,321 +0,0 @@
-This discussion took place in https://reviews.llvm.org/D35216
-"Escape symbols when creating std::initializer_list".
-
-It touches problems of modelling C++ standard library constructs in general,
-including modelling implementation-defined fields within C++ standard library
-objects, in particular constructing objects into pointers held by such fields,
-and separation of responsibilities between analyzer's core and checkers.
-
-**Artem:**
-
-I've seen a few false positives that appear because we construct
-C++11 std::initializer_list objects with brace initializers, and such
-construction is not properly modeled. For instance, if a new object is
-constructed on the heap only to be put into a brace-initialized STL container,
-the object is reported to be leaked.
-
-Approach (0): This can be trivially fixed by this patch, which causes pointers
-passed into initializer list expressions to immediately escape.
-
-This fix is overly conservative though. So i did a bit of investigation as to
-how model std::initializer_list better.
-
-According to the standard, std::initializer_list is an object that has
-methods begin(), end(), and size(), where begin() returns a pointer to continuous
-array of size() objects of type T, and end() is equal to begin() plus size().
-The standard does hint that it should be possible to implement
-std::initializer_list as a pair of pointers, or as a pointer and a size
-integer, however specific fields that the object would contain are an
-implementation detail.
-
-Ideally, we should be able to model the initializer list's methods precisely.
-Or, at least, it should be possible to explain to the analyzer that the list
-somehow "takes hold" of the values put into it. Initializer lists can also be
-copied, which is a separate story that i'm not trying to address here.
-
-The obvious approach to modeling std::initializer_list in a checker would be to
-construct a SymbolMetadata for the memory region of the initializer list object,
-which would be of type T* and represent begin(), so we'd trivially model begin()
-as a function that returns this symbol. The array pointed to by that symbol
-would be bindLoc()ed to contain the list's contents (probably as a CompoundVal
-to produce less bindings in the store). Extent of this array would represent
-size() and would be equal to the length of the list as written.
-
-So this sounds good, however apparently it does nothing to address our false
-positives: when the list escapes, our RegionStoreManager is not magically
-guessing that the metadata symbol attached to it, together with its contents,
-should also escape. In fact, it's impossible to trigger a pointer escape from
-within the checker.
-
-Approach (1): If only we enabled ProgramState::bindLoc(..., notifyChanges=true)
-to cause pointer escapes (not only region changes) (which sounds like the right
-thing to do anyway) such checker would be able to solve the false positives by
-triggering escapes when binding list elements to the list. However, it'd be as
-conservative as the current patch's solution. Ideally, we do not want escapes to
-happen so early. Instead, we'd prefer them to be delayed until the list itself
-escapes.
-
-So i believe that escaping metadata symbols whenever their base regions escape
-would be the right thing to do. Currently we didn't think about that because we
-had neither pointer-type metadatas nor non-pointer escapes.
-
-Approach (2): We could teach the Store to scan itself for bindings to
-metadata-symbolic-based regions during scanReachableSymbols() whenever a region
-turns out to be reachable. This requires no work on checker side, but it sounds
-performance-heavy.
-
-Approach (3): We could let checkers maintain the set of active metadata symbols
-in the program state (ideally somewhere in the Store, which sounds weird but
-causes the smallest amount of layering violations), so that the core knew what
-to escape. This puts a stress on the checkers, but with a smart data map it
-wouldn't be a problem.
-
-Approach (4): We could allow checkers to trigger pointer escapes in arbitrary
-moments. If we allow doing this within checkPointerEscape callback itself, we
-would be able to express facts like "when this region escapes, that metadata
-symbol attached to it should also escape". This sounds like an ultimate freedom,
-with maximum stress on the checkers - still not too much stress when we have
-smart data maps.
-
-I'm personally liking the approach (2) - it should be possible to avoid
-performance overhead, and clarity seems nice.
-
-**Gabor:**
-
-At this point, I am a bit wondering about two questions.
-
-- When should something belong to a checker and when should something belong
-to the engine? Sometimes we model library aspects in the engine and model
-language constructs in checkers.
-- What is the checker programming model that we are aiming for? Maximum
-freedom or more easy checker development?
-
-I think if we aim for maximum freedom, we do not need to worry about the
-potential stress on checkers, and we can introduce abstractions to mitigate that
-later on.
-If we want to simplify the API, then maybe it makes more sense to move language
-construct modeling to the engine when the checker API is not sufficient instead
-of complicating the API.
-
-Right now I have no preference or objections between the alternatives but there
-are some random thoughts:
-
-- Maybe it would be great to have a guideline how to evolve the analyzer and
-follow it, so it can help us to decide in similar situations
-- I do care about performance in this case. The reason is that we have a
-limited performance budget. And I think we should not expect most of the checker
-writers to add modeling of language constructs. So, in my opinion, it is ok to
-have less nice/more verbose API for language modeling if we can have better
-performance this way, since it only needs to be done once, and is done by the
-framework developers.
-
-**Artem:** These are some great questions, i guess it'd be better to discuss
-them more openly. As a quick dump of my current mood:
-
-- To me it seems obvious that we need to aim for a checker API that is both
-simple and powerful. This can probably by keeping the API as powerful as
-necessary while providing a layer of simple ready-made solutions on top of it.
-Probably a few reusable components for assembling checkers. And this layer
-should ideally be pleasant enough to work with, so that people would prefer to
-extend it when something is lacking, instead of falling back to the complex
-omnipotent API. I'm thinking of AST matchers vs. AST visitors as a roughly
-similar situation: matchers are not omnipotent, but they're so nice.
-
-- Separation between core and checkers is usually quite strange. Once we have
-shared state traits, i generally wouldn't mind having region store or range
-constraint manager as checkers (though it's probably not worth it to transform
-them - just a mood). The main thing to avoid here would be the situation when
-the checker overwrites stuff written by the core because it thinks it has a
-better idea what's going on, so the core should provide a good default behavior.
-
-- Yeah, i totally care about performance as well, and if i try to implement
-approach, i'd make sure it's good.
-
-**Artem:**
-
-> Approach (2): We could teach the Store to scan itself for bindings to
-> metadata-symbolic-based regions during scanReachableSymbols() whenever
-> a region turns out to be reachable. This requires no work on checker side,
-> but it sounds performance-heavy.
-
-Nope, this approach is wrong. Metadata symbols may become out-of-date: when the
-object changes, metadata symbols attached to it aren't changing (because symbols
-simply don't change). The same metadata may have different symbols to denote its
-value in different moments of time, but at most one of them represents the
-actual metadata value. So we'd be escaping more stuff than necessary.
-
-If only we had "ghost fields"
-(https://lists.llvm.org/pipermail/cfe-dev/2016-May/049000.html), it would have
-been much easier, because the ghost field would only contain the actual
-metadata, and the Store would always know about it. This example adds to my
-belief that ghost fields are exactly what we need for most C++ checkers.
-
-**Devin:**
-
-In this case, I would be fine with some sort of
-AbstractStorageMemoryRegion that meant "here is a memory region and somewhere
-reachable from here exists another region of type T". Or even multiple regions
-with different identifiers. This wouldn't specify how the memory is reachable,
-but it would allow for transfer functions to get at those regions and it would
-allow for invalidation.
-
-For std::initializer_list this reachable region would the region for the backing
-array and the transfer functions for begin() and end() yield the beginning and
-end element regions for it.
-
-In my view this differs from ghost variables in that (1) this storage does
-actually exist (it is just a library implementation detail where that storage
-lives) and (2) it is perfectly valid for a pointer into that storage to be
-returned and for another part of the program to read or write from that storage.
-(Well, in this case just read since it is allowed to be read-only memory).
-
-What I'm not OK with is modeling abstract analysis state (for example, the count
-of a NSMutableArray or the typestate of a file handle) as a value stored in some
-ginned up region in the store. This takes an easy problem that the analyzer does
-well at (modeling typestate) and turns it into a hard one that the analyzer is
-bad at (reasoning about the contents of the heap).
-
-I think the key criterion here is: "is the region accessible from outside the
-library". That is, does the library expose the region as a pointer that can be
-read to or written from in the client program? If so, then it makes sense for
-this to be in the store: we are modeling reachable storage as storage. But if
-we're just modeling arbitrary analysis facts that need to be invalidated when a
-pointer escapes then we shouldn't try to gin up storage for them just to get
-invalidation for free.
-
-**Artem:**
-
-> In this case, I would be fine with some sort of AbstractStorageMemoryRegion
-> that meant "here is a memory region and somewhere reachable from here exists
-> another region of type T". Or even multiple regions with different
-> identifiers. This wouldn't specify how the memory is reachable, but it would
-> allow for transfer functions to get at those regions and it would allow for
-> invalidation.
-
-Yeah, this is what we can easily implement now as a
-symbolic-region-based-on-a-metadata-symbol (though we can make a new region
-class for that if we eg. want it typed). The problem is that the relation
-between such storage region and its parent object region is essentially
-immaterial, similarly to the relation between SymbolRegionValue and its parent
-region. Region contents are mutable: today the abstract storage is reachable
-from its parent object, tomorrow it's not, and maybe something else becomes
-reachable, something that isn't even abstract. So the parent region for the
-abstract storage is most of the time at best a "nice to know" thing - we cannot
-rely on it to do any actual work. We'd anyway need to rely on the checker to do
-the job.
-
-> For std::initializer_list this reachable region would the region for the
-> backing array and the transfer functions for begin() and end() yield the
-> beginning and end element regions for it.
-
-So maybe in fact for std::initializer_list it may work fine because you cannot
-change the data after the object is constructed - so this region's contents are
-essentially immutable. For the future, i feel as if it is a dead end.
-
-I'd like to consider another funny example. Suppose we're trying to model
-std::unique_ptr. Consider::
-
- void bar(const std::unique_ptr &x);
-
- void foo(std::unique_ptr &x) {
- int *a = x.get(); // (a, 0, direct): &AbstractStorageRegion
- *a = 1; // (AbstractStorageRegion, 0, direct): 1 S32b
- int *b = new int;
- *b = 2; // (SymRegion{conj_$0}, 0 ,direct): 2 S32b
- x.reset(b); // Checker map: x -> SymRegion{conj_$0}
- bar(x); // 'a' doesn't escape (the pointer was unique), 'b' does.
- clang_analyzer_eval(*a == 1); // Making this true is up to the checker.
- clang_analyzer_eval(*b == 2); // Making this unknown is up to the checker.
- }
-
-The checker doesn't totally need to ensure that *a == 1 passes - even though the
-pointer was unique, it could theoretically have .get()-ed above and the code
-could of course break the uniqueness invariant (though we'd probably want it).
-The checker can say that "even if *a did escape, it was not because it was
-stuffed directly into bar()".
-
-The checker's direct responsibility, however, is to solve the *b == 2 thing
-(which is in fact the problem we're dealing with in this patch - escaping the
-storage region of the object).
-
-So we're talking about one more operation over the program state (scanning
-reachable symbols and regions) that cannot work without checker support.
-
-We can probably add a new callback "checkReachableSymbols" to solve this. This
-is in fact also related to the dead symbols problem (we're scanning for live
-symbols in the store and in the checkers separately, but we need to do so
-simultaneously with a single worklist). Hmm, in fact this sounds like a good
-idea; we can replace checkLiveSymbols with checkReachableSymbols.
-
-Or we could just have ghost member variables, and no checker support required at
-all. For ghost member variables, the relation with their parent region (which
-would be their superregion) is actually useful, the mutability of their contents
-is expressed naturally, and the store automagically sees reachable symbols, live
-symbols, escapes, invalidations, whatever.
-
-> In my view this differs from ghost variables in that (1) this storage does
-> actually exist (it is just a library implementation detail where that storage
-> lives) and (2) it is perfectly valid for a pointer into that storage to be
-> returned and for another part of the program to read or write from that
-> storage. (Well, in this case just read since it is allowed to be read-only
-> memory).
-
-> What I'm not OK with is modeling abstract analysis state (for example, the
-> count of a NSMutableArray or the typestate of a file handle) as a value stored
-> in some ginned up region in the store.This takes an easy problem that the
-> analyzer does well at (modeling typestate) and turns it into a hard one that
-> the analyzer is bad at (reasoning about the contents of the heap).
-
-Yeah, i tend to agree on that. For simple typestates, this is probably an
-overkill, so let's definitely put aside the idea of "ghost symbolic regions"
-that i had earlier.
-
-But, to summarize a bit, in our current case, however, the typestate we're
-looking for is the contents of the heap. And when we try to model such
-typestates (complex in this specific manner, i.e. heap-like) in any checker, we
-have a choice between re-doing this modeling in every such checker (which is
-something analyzer is indeed good at, but at a price of making checkers heavy)
-or instead relying on the Store to do exactly what it's designed to do.
-
-> I think the key criterion here is: "is the region accessible from outside
-> the library". That is, does the library expose the region as a pointer that
-> can be read to or written from in the client program? If so, then it makes
-> sense for this to be in the store: we are modeling reachable storage as
-> storage. But if we're just modeling arbitrary analysis facts that need to be
-> invalidated when a pointer escapes then we shouldn't try to gin up storage
-> for them just to get invalidation for free.
-
-As a metaphor, i'd probably compare it to body farms - the difference between
-ghost member variables and metadata symbols seems to me like the difference
-between body farms and evalCall. Both are nice to have, and body farms are very
-pleasant to work with, even if not omnipotent. I think it's fine for a
-FunctionDecl's body in a body farm to have a local variable, even if such
-variable doesn't actually exist, even if it cannot be seen from outside the
-function call. I'm not seeing immediate practical difference between "it does
-actually exist" and "it doesn't actually exist, just a handy abstraction".
-Similarly, i think it's fine if we have a CXXRecordDecl with
-implementation-defined contents, and try to farm up a member variable as a handy
-abstraction (we don't even need to know its name or offset, only that it's there
-somewhere).
-
-**Artem:**
-
-We've discussed it in person with Devin, and he provided more points to think
-about:
-
-- If the initializer list consists of non-POD data, constructors of list's
-objects need to take the sub-region of the list's region as this-region In the
-current (v2) version of this patch, these objects are constructed elsewhere and
-then trivial-copied into the list's metadata pointer region, which may be
-incorrect. This is our overall problem with C++ constructors, which manifests in
-this case as well. Additionally, objects would need to be constructed in the
-analyzer's core, which would not be able to predict that it needs to take a
-checker-specific region as this-region, which makes it harder, though it might
-be mitigated by sharing the checker state traits.
-
-- Because "ghost variables" are not material to the user, we need to somehow
-make super sure that they don't make it into the diagnostic messages.
-
-So, because this needs further digging into overall C++ support and rises too
-many questions, i'm delaying a better approach to this problem and will fall
-back to the original trivial patch.
Index: docs/analyzer/IPA.txt
===================================================================
--- docs/analyzer/IPA.txt
+++ docs/analyzer/IPA.txt
@@ -1,386 +0,0 @@
-Inlining
-========
-
-There are several options that control which calls the analyzer will consider for
-inlining. The major one is -analyzer-config ipa:
-
- -analyzer-config ipa=none - All inlining is disabled. This is the only mode
- available in LLVM 3.1 and earlier and in Xcode 4.3 and earlier.
-
- -analyzer-config ipa=basic-inlining - Turns on inlining for C functions, C++
- static member functions, and blocks -- essentially, the calls that behave
- like simple C function calls. This is essentially the mode used in
- Xcode 4.4.
-
- -analyzer-config ipa=inlining - Turns on inlining when we can confidently find
- the function/method body corresponding to the call. (C functions, static
- functions, devirtualized C++ methods, Objective-C class methods, Objective-C
- instance methods when ExprEngine is confident about the dynamic type of the
- instance).
-
- -analyzer-config ipa=dynamic - Inline instance methods for which the type is
- determined at runtime and we are not 100% sure that our type info is
- correct. For virtual calls, inline the most plausible definition.
-
- -analyzer-config ipa=dynamic-bifurcate - Same as -analyzer-config ipa=dynamic,
- but the path is split. We inline on one branch and do not inline on the
- other. This mode does not drop the coverage in cases when the parent class
- has code that is only exercised when some of its methods are overridden.
-
-Currently, -analyzer-config ipa=dynamic-bifurcate is the default mode.
-
-While -analyzer-config ipa determines in general how aggressively the analyzer
-will try to inline functions, several additional options control which types of
-functions can inlined, in an all-or-nothing way. These options use the
-analyzer's configuration table, so they are all specified as follows:
-
- -analyzer-config OPTION=VALUE
-
-### c++-inlining ###
-
-This option controls which C++ member functions may be inlined.
-
- -analyzer-config c++-inlining=[none | methods | constructors | destructors]
-
-Each of these modes implies that all the previous member function kinds will be
-inlined as well; it doesn't make sense to inline destructors without inlining
-constructors, for example.
-
-The default c++-inlining mode is 'destructors', meaning that all member
-functions with visible definitions will be considered for inlining. In some
-cases the analyzer may still choose not to inline the function.
-
-Note that under 'constructors', constructors for types with non-trivial
-destructors will not be inlined. Additionally, no C++ member functions will be
-inlined under -analyzer-config ipa=none or -analyzer-config ipa=basic-inlining,
-regardless of the setting of the c++-inlining mode.
-
-### c++-template-inlining ###
-
-This option controls whether C++ templated functions may be inlined.
-
- -analyzer-config c++-template-inlining=[true | false]
-
-Currently, template functions are considered for inlining by default.
-
-The motivation behind this option is that very generic code can be a source
-of false positives, either by considering paths that the caller considers
-impossible (by some unstated precondition), or by inlining some but not all
-of a deep implementation of a function.
-
-### c++-stdlib-inlining ###
-
-This option controls whether functions from the C++ standard library, including
-methods of the container classes in the Standard Template Library, should be
-considered for inlining.
-
- -analyzer-config c++-stdlib-inlining=[true | false]
-
-Currently, C++ standard library functions are considered for inlining by
-default.
-
-The standard library functions and the STL in particular are used ubiquitously
-enough that our tolerance for false positives is even lower here. A false
-positive due to poor modeling of the STL leads to a poor user experience, since
-most users would not be comfortable adding assertions to system headers in order
-to silence analyzer warnings.
-
-### c++-container-inlining ###
-
-This option controls whether constructors and destructors of "container" types
-should be considered for inlining.
-
- -analyzer-config c++-container-inlining=[true | false]
-
-Currently, these constructors and destructors are NOT considered for inlining
-by default.
-
-The current implementation of this setting checks whether a type has a member
-named 'iterator' or a member named 'begin'; these names are idiomatic in C++,
-with the latter specified in the C++11 standard. The analyzer currently does a
-fairly poor job of modeling certain data structure invariants of container-like
-objects. For example, these three expressions should be equivalent:
-
- std::distance(c.begin(), c.end()) == 0
- c.begin() == c.end()
- c.empty())
-
-Many of these issues are avoided if containers always have unknown, symbolic
-state, which is what happens when their constructors are treated as opaque.
-In the future, we may decide specific containers are "safe" to model through
-inlining, or choose to model them directly using checkers instead.
-
-
-Basics of Implementation
------------------------
-
-The low-level mechanism of inlining a function is handled in
-ExprEngine::inlineCall and ExprEngine::processCallExit.
-
-If the conditions are right for inlining, a CallEnter node is created and added
-to the analysis work list. The CallEnter node marks the change to a new
-LocationContext representing the called function, and its state includes the
-contents of the new stack frame. When the CallEnter node is actually processed,
-its single successor will be a edge to the first CFG block in the function.
-
-Exiting an inlined function is a bit more work, fortunately broken up into
-reasonable steps:
-
-1. The CoreEngine realizes we're at the end of an inlined call and generates a
- CallExitBegin node.
-
-2. ExprEngine takes over (in processCallExit) and finds the return value of the
- function, if it has one. This is bound to the expression that triggered the
- call. (In the case of calls without origin expressions, such as destructors,
- this step is skipped.)
-
-3. Dead symbols and bindings are cleaned out from the state, including any local
- bindings.
-
-4. A CallExitEnd node is generated, which marks the transition back to the
- caller's LocationContext.
-
-5. Custom post-call checks are processed and the final nodes are pushed back
- onto the work list, so that evaluation of the caller can continue.
-
-Retry Without Inlining
-----------------------
-
-In some cases, we would like to retry analysis without inlining a particular
-call.
-
-Currently, we use this technique to recover coverage in case we stop
-analyzing a path due to exceeding the maximum block count inside an inlined
-function.
-
-When this situation is detected, we walk up the path to find the first node
-before inlining was started and enqueue it on the WorkList with a special
-ReplayWithoutInlining bit added to it (ExprEngine::replayWithoutInlining). The
-path is then re-analyzed from that point without inlining that particular call.
-
-Deciding When to Inline
------------------------
-
-In general, the analyzer attempts to inline as much as possible, since it
-provides a better summary of what actually happens in the program. There are
-some cases, however, where the analyzer chooses not to inline:
-
-- If there is no definition available for the called function or method. In
- this case, there is no opportunity to inline.
-
-- If the CFG cannot be constructed for a called function, or the liveness
- cannot be computed. These are prerequisites for analyzing a function body,
- with or without inlining.
-
-- If the LocationContext chain for a given ExplodedNode reaches a maximum cutoff
- depth. This prevents unbounded analysis due to infinite recursion, but also
- serves as a useful cutoff for performance reasons.
-
-- If the function is variadic. This is not a hard limitation, but an engineering
- limitation.
-
- Tracked by: Support inlining of variadic functions
-
-- In C++, constructors are not inlined unless the destructor call will be
- processed by the ExprEngine. Thus, if the CFG was built without nodes for
- implicit destructors, or if the destructors for the given object are not
- represented in the CFG, the constructor will not be inlined. (As an exception,
- constructors for objects with trivial constructors can still be inlined.)
- See "C++ Caveats" below.
-
-- In C++, ExprEngine does not inline custom implementations of operator 'new'
- or operator 'delete', nor does it inline the constructors and destructors
- associated with these. See "C++ Caveats" below.
-
-- Calls resulting in "dynamic dispatch" are specially handled. See more below.
-
-- The FunctionSummaries map stores additional information about declarations,
- some of which is collected at runtime based on previous analyses.
- We do not inline functions which were not profitable to inline in a different
- context (for example, if the maximum block count was exceeded; see
- "Retry Without Inlining").
-
-
-Dynamic Calls and Devirtualization
-----------------------------------
-
-"Dynamic" calls are those that are resolved at runtime, such as C++ virtual
-method calls and Objective-C message sends. Due to the path-sensitive nature of
-the analysis, the analyzer may be able to reason about the dynamic type of the
-object whose method is being called and thus "devirtualize" the call.
-
-This path-sensitive devirtualization occurs when the analyzer can determine what
-method would actually be called at runtime. This is possible when the type
-information is constrained enough for a simulated C++/Objective-C object that
-the analyzer can make such a decision.
-
- == DynamicTypeInfo ==
-
-As the analyzer analyzes a path, it may accrue information to refine the
-knowledge about the type of an object. This can then be used to make better
-decisions about the target method of a call.
-
-Such type information is tracked as DynamicTypeInfo. This is path-sensitive
-data that is stored in ProgramState, which defines a mapping from MemRegions to
-an (optional) DynamicTypeInfo.
-
-If no DynamicTypeInfo has been explicitly set for a MemRegion, it will be lazily
-inferred from the region's type or associated symbol. Information from symbolic
-regions is weaker than from true typed regions.
-
- EXAMPLE: A C++ object declared "A obj" is known to have the class 'A', but a
- reference "A &ref" may dynamically be a subclass of 'A'.
-
-The DynamicTypePropagation checker gathers and propagates DynamicTypeInfo,
-updating it as information is observed along a path that can refine that type
-information for a region.
-
- WARNING: Not all of the existing analyzer code has been retrofitted to use
- DynamicTypeInfo, nor is it universally appropriate. In particular,
- DynamicTypeInfo always applies to a region with all casts stripped
- off, but sometimes the information provided by casts can be useful.
-
-
- == RuntimeDefinition ==
-
-The basis of devirtualization is CallEvent's getRuntimeDefinition() method,
-which returns a RuntimeDefinition object. When asked to provide a definition,
-the CallEvents for dynamic calls will use the DynamicTypeInfo in their
-ProgramState to attempt to devirtualize the call. In the case of no dynamic
-dispatch, or perfectly constrained devirtualization, the resulting
-RuntimeDefinition contains a Decl corresponding to the definition of the called
-function, and RuntimeDefinition::mayHaveOtherDefinitions will return FALSE.
-
-In the case of dynamic dispatch where our information is not perfect, CallEvent
-can make a guess, but RuntimeDefinition::mayHaveOtherDefinitions will return
-TRUE. The RuntimeDefinition object will then also include a MemRegion
-corresponding to the object being called (i.e., the "receiver" in Objective-C
-parlance), which ExprEngine uses to decide whether or not the call should be
-inlined.
-
- == Inlining Dynamic Calls ==
-
-The -analyzer-config ipa option has five different modes: none, basic-inlining,
-inlining, dynamic, and dynamic-bifurcate. Under -analyzer-config ipa=dynamic,
-all dynamic calls are inlined, whether we are certain or not that this will
-actually be the definition used at runtime. Under -analyzer-config ipa=inlining,
-only "near-perfect" devirtualized calls are inlined*, and other dynamic calls
-are evaluated conservatively (as if no definition were available).
-
-* Currently, no Objective-C messages are not inlined under
- -analyzer-config ipa=inlining, even if we are reasonably confident of the type
- of the receiver. We plan to enable this once we have tested our heuristics
- more thoroughly.
-
-The last option, -analyzer-config ipa=dynamic-bifurcate, behaves similarly to
-"dynamic", but performs a conservative invalidation in the general virtual case
-in *addition* to inlining. The details of this are discussed below.
-
-As stated above, -analyzer-config ipa=basic-inlining does not inline any C++
-member functions or Objective-C method calls, even if they are non-virtual or
-can be safely devirtualized.
-
-
-Bifurcation
------------
-
-ExprEngine::BifurcateCall implements the -analyzer-config ipa=dynamic-bifurcate
-mode.
-
-When a call is made on an object with imprecise dynamic type information
-(RuntimeDefinition::mayHaveOtherDefinitions() evaluates to TRUE), ExprEngine
-bifurcates the path and marks the object's region (retrieved from the
-RuntimeDefinition object) with a path-sensitive "mode" in the ProgramState.
-
-Currently, there are 2 modes:
-
- DynamicDispatchModeInlined - Models the case where the dynamic type information
- of the receiver (MemoryRegion) is assumed to be perfectly constrained so
- that a given definition of a method is expected to be the code actually
- called. When this mode is set, ExprEngine uses the Decl from
- RuntimeDefinition to inline any dynamically dispatched call sent to this
- receiver because the function definition is considered to be fully resolved.
-
- DynamicDispatchModeConservative - Models the case where the dynamic type
- information is assumed to be incorrect, for example, implies that the method
- definition is overridden in a subclass. In such cases, ExprEngine does not
- inline the methods sent to the receiver (MemoryRegion), even if a candidate
- definition is available. This mode is conservative about simulating the
- effects of a call.
-
-Going forward along the symbolic execution path, ExprEngine consults the mode
-of the receiver's MemRegion to make decisions on whether the calls should be
-inlined or not, which ensures that there is at most one split per region.
-
-At a high level, "bifurcation mode" allows for increased semantic coverage in
-cases where the parent method contains code which is only executed when the
-class is subclassed. The disadvantages of this mode are a (considerable?)
-performance hit and the possibility of false positives on the path where the
-conservative mode is used.
-
-Objective-C Message Heuristics
-------------------------------
-
-ExprEngine relies on a set of heuristics to partition the set of Objective-C
-method calls into those that require bifurcation and those that do not. Below
-are the cases when the DynamicTypeInfo of the object is considered precise
-(cannot be a subclass):
-
- - If the object was created with +alloc or +new and initialized with an -init
- method.
-
- - If the calls are property accesses using dot syntax. This is based on the
- assumption that children rarely override properties, or do so in an
- essentially compatible way.
-
- - If the class interface is declared inside the main source file. In this case
- it is unlikely that it will be subclassed.
-
- - If the method is not declared outside of main source file, either by the
- receiver's class or by any superclasses.
-
-C++ Caveats
---------------------
-
-C++11 [class.cdtor]p4 describes how the vtable of an object is modified as it is
-being constructed or destructed; that is, the type of the object depends on
-which base constructors have been completed. This is tracked using
-DynamicTypeInfo in the DynamicTypePropagation checker.
-
-There are several limitations in the current implementation:
-
-- Temporaries are poorly modeled right now because we're not confident in the
- placement of their destructors in the CFG. We currently won't inline their
- constructors unless the destructor is trivial, and don't process their
- destructors at all, not even to invalidate the region.
-
-- 'new' is poorly modeled due to some nasty CFG/design issues. This is tracked
- in PR12014. 'delete' is not modeled at all.
-
-- Arrays of objects are modeled very poorly right now. ExprEngine currently
- only simulates the first constructor and first destructor. Because of this,
- ExprEngine does not inline any constructors or destructors for arrays.
-
-
-CallEvent
-=========
-
-A CallEvent represents a specific call to a function, method, or other body of
-code. It is path-sensitive, containing both the current state (ProgramStateRef)
-and stack space (LocationContext), and provides uniform access to the argument
-values and return type of a call, no matter how the call is written in the
-source or what sort of code body is being invoked.
-
- NOTE: For those familiar with Cocoa, CallEvent is roughly equivalent to
- NSInvocation.
-
-CallEvent should be used whenever there is logic dealing with function calls
-that does not care how the call occurred.
-
-Examples include checking that arguments satisfy preconditions (such as
-__attribute__((nonnull))), and attempting to inline a call.
-
-CallEvents are reference-counted objects managed by a CallEventManager. While
-there is no inherent issue with persisting them (say, in a ProgramState's GDM),
-they are intended for short-lived use, and can be recreated from CFGElements or
-non-top-level StackFrameContexts fairly easily.
Index: docs/analyzer/RegionStore.txt
===================================================================
--- docs/analyzer/RegionStore.txt
+++ docs/analyzer/RegionStore.txt
@@ -1,171 +0,0 @@
-The analyzer "Store" represents the contents of memory regions. It is an opaque
-functional data structure stored in each ProgramState; the only class that can
-modify the store is its associated StoreManager.
-
-Currently (Feb. 2013), the only StoreManager implementation being used is
-RegionStoreManager. This store records bindings to memory regions using a "base
-region + offset" key. (This allows `*p` and `p[0]` to map to the same location,
-among other benefits.)
-
-Regions are grouped into "clusters", which roughly correspond to "regions with
-the same base region". This allows certain operations to be more efficient,
-such as invalidation.
-
-Regions that do not have a known offset use a special "symbolic" offset. These
-keys store both the original region, and the "concrete offset region" -- the
-last region whose offset is entirely concrete. (For example, in the expression
-`foo.bar[1][i].baz`, the concrete offset region is the array `foo.bar[1]`,
-since that has a known offset from the start of the top-level `foo` struct.)
-
-
-Binding Invalidation
-====================
-
-Supporting both concrete and symbolic offsets makes things a bit tricky. Here's
-an example:
-
- foo[0] = 0;
- foo[1] = 1;
- foo[i] = i;
-
-After the third assignment, nothing can be said about the value of `foo[0]`,
-because `foo[i]` may have overwritten it! Thus, *binding to a region with a
-symbolic offset invalidates the entire concrete offset region.* We know
-`foo[i]` is somewhere within `foo`, so we don't have to invalidate anything
-else, but we do have to be conservative about all other bindings within `foo`.
-
-Continuing the example:
-
- foo[i] = i;
- foo[0] = 0;
-
-After this latest assignment, nothing can be said about the value of `foo[i]`,
-because `foo[0]` may have overwritten it! *Binding to a region R with a
-concrete offset invalidates any symbolic offset bindings whose concrete offset
-region is a super-region **or** sub-region of R.* All we know about `foo[i]` is
-that it is somewhere within `foo`, so changing *anything* within `foo` might
-change `foo[i]`, and changing *all* of `foo` (or its base region) will
-*definitely* change `foo[i]`.
-
-This logic could be improved by using the current constraints on `i`, at the
-cost of speed. The latter case could also be improved by matching region kinds,
-i.e. changing `foo[0].a` is unlikely to affect `foo[i].b`, no matter what `i`
-is.
-
-For more detail, read through RegionStoreManager::removeSubRegionBindings in
-RegionStore.cpp.
-
-
-ObjCIvarRegions
-===============
-
-Objective-C instance variables require a bit of special handling. Like struct
-fields, they are not base regions, and when their parent object region is
-invalidated, all the instance variables must be invalidated as well. However,
-they have no concrete compile-time offsets (in the modern, "non-fragile"
-runtime), and so cannot easily be represented as an offset from the start of
-the object in the analyzer. Moreover, this means that invalidating a single
-instance variable should *not* invalidate the rest of the object, since unlike
-struct fields or array elements there is no way to perform pointer arithmetic
-to access another instance variable.
-
-Consequently, although the base region of an ObjCIvarRegion is the entire
-object, RegionStore offsets are computed from the start of the instance
-variable. Thus it is not valid to assume that all bindings with non-symbolic
-offsets start from the base region!
-
-
-Region Invalidation
-===================
-
-Unlike binding invalidation, region invalidation occurs when the entire
-contents of a region may have changed---say, because it has been passed to a
-function the analyzer can model, like memcpy, or because its address has
-escaped, usually as an argument to an opaque function call. In these cases we
-need to throw away not just all bindings within the region itself, but within
-its entire cluster, since neighboring regions may be accessed via pointer
-arithmetic.
-
-Region invalidation typically does even more than this, however. Because it
-usually represents the complete escape of a region from the analyzer's model,
-its *contents* must also be transitively invalidated. (For example, if a region
-'p' of type 'int **' is invalidated, the contents of '*p' and '**p' may have
-changed as well.) The algorithm that traverses this transitive closure of
-accessible regions is known as ClusterAnalysis, and is also used for finding
-all live bindings in the store (in order to throw away the dead ones). The name
-"ClusterAnalysis" predates the cluster-based organization of bindings, but
-refers to the same concept: during invalidation and liveness analysis, all
-bindings within a cluster must be treated in the same way for a conservative
-model of program behavior.
-
-
-Default Bindings
-================
-
-Most bindings in RegionStore are simple scalar values -- integers and pointers.
-These are known as "Direct" bindings. However, RegionStore supports a second
-type of binding called a "Default" binding. These are used to provide values to
-all the elements of an aggregate type (struct or array) without having to
-explicitly specify a binding for each individual element.
-
-When there is no Direct binding for a particular region, the store manager
-looks at each super-region in turn to see if there is a Default binding. If so,
-this value is used as the value of the original region. The search ends when
-the base region is reached, at which point the RegionStore will pick an
-appropriate default value for the region (usually a symbolic value, but
-sometimes zero, for static data, or "uninitialized", for stack variables).
-
- int manyInts[10];
- manyInts[1] = 42; // Creates a Direct binding for manyInts[1].
- print(manyInts[1]); // Retrieves the Direct binding for manyInts[1];
- print(manyInts[0]); // There is no Direct binding for manyInts[0].
- // Is there a Default binding for the entire array?
- // There is not, but it is a stack variable, so we use
- // "uninitialized" as the default value (and emit a
- // diagnostic!).
-
-NOTE: The fact that bindings are stored as a base region plus an offset limits
-the Default Binding strategy, because in C aggregates can contain other
-aggregates. In the current implementation of RegionStore, there is no way to
-distinguish a Default binding for an entire aggregate from a Default binding
-for the sub-aggregate at offset 0.
-
-
-Lazy Bindings (LazyCompoundVal)
-===============================
-
-RegionStore implements an optimization for copying aggregates (structs and
-arrays) called "lazy bindings", implemented using a special SVal called
-LazyCompoundVal. When the store is asked for the "binding" for an entire
-aggregate (i.e. for an lvalue-to-rvalue conversion), it returns a
-LazyCompoundVal instead. When this value is then stored into a variable, it is
-bound as a Default value. This makes copying arrays and structs much cheaper
-than if they had required memberwise access.
-
-Under the hood, a LazyCompoundVal is implemented as a uniqued pair of (region,
-store), representing "the value of the region during this 'snapshot' of the
-store". This has important implications for any sort of liveness or
-reachability analysis, which must take the bindings in the old store into
-account.
-
-Retrieving a value from a lazy binding happens in the same way as any other
-Default binding: since there is no direct binding, the store manager falls back
-to super-regions to look for an appropriate default binding. LazyCompoundVal
-differs from a normal default binding, however, in that it contains several
-different values, instead of one value that will appear several times. Because
-of this, the store manager has to reconstruct the subregion chain on top of the
-LazyCompoundVal region, and look up *that* region in the previous store.
-
-Here's a concrete example:
-
- CGPoint p;
- p.x = 42; // A Direct binding is made to the FieldRegion 'p.x'.
- CGPoint p2 = p; // A LazyCompoundVal is created for 'p', along with a
- // snapshot of the current store state. This value is then
- // used as a Default binding for the VarRegion 'p2'.
- return p2.x; // The binding for FieldRegion 'p2.x' is requested.
- // There is no Direct binding, so we look for a Default
- // binding to 'p2' and find the LCV.
- // Because it's a LCV, we look at our requested region
- // and see that it's the '.x' field. We ask for the value
- // of 'p.x' within the snapshot, and get back 42.
Index: docs/analyzer/checkers.rst
===================================================================
--- docs/analyzer/checkers.rst
+++ docs/analyzer/checkers.rst
@@ -0,0 +1,2003 @@
+==================
+Available Checkers
+==================
+
+The analyzer performs checks that are categorized into families or "checkers".
+
+The default set of checkers covers a variety of checks targeted at finding security and API usage bugs,
+dead code, and other logic errors. See the :ref:`default-checkers` checkers list below.
+
+In addition to these, the analyzer contains a number of :ref:`alpha-checkers` (aka *alpha* checkers).
+These checkers are under development and are switched off by default. They may crash or emit a higher number of false positives.
+
+The :ref:`debug-checkers` package contains checkers for analyzer developers for debugging purposes.
+
+.. contents:: Table of Contents
+ :depth: 4
+
+
+.. _default-checkers:
+
+Default Checkers
+----------------
+
+.. _core-checkers:
+
+core
+^^^^
+Models core language features and contains general-purpose checkers such as division by zero,
+null pointer dereference, usage of uninitialized values, etc.
+*These checkers must be always switched on as other checker rely on them.*
+
+core.CallAndMessage (C, C++, ObjC)
+""""""""""""""""""""""""""""""""""
+ Check for logical errors for function calls and Objective-C message expressions (e.g., uninitialized arguments, null function pointers).
+
+.. literalinclude:: checkers/callandmessage_example.c
+ :language: objc
+
+core.DivideZero (C, C++, ObjC)
+""""""""""""""""""""""""""""""
+ Check for division by zero.
+
+.. literalinclude:: checkers/dividezero_example.c
+ :language: c
+
+core.NonNullParamChecker (C, C++, ObjC)
+"""""""""""""""""""""""""""""""""""""""
+Check for null pointers passed as arguments to a function whose arguments are references or marked with the 'nonnull' attribute.
+
+.. code-block:: cpp
+
+ int f(int *p) __attribute__((nonnull));
+
+ void test(int *p) {
+ if (!p)
+ f(p); // warn
+ }
+
+core.NullDereference (C, C++, ObjC)
+"""""""""""""""""""""""""""""""""""
+Check for dereferences of null pointers.
+
+.. code-block:: objc
+
+ // C
+ void test(int *p) {
+ if (p)
+ return;
+
+ int x = p[0]; // warn
+ }
+
+ // C
+ void test(int *p) {
+ if (!p)
+ *p = 0; // warn
+ }
+
+ // C++
+ class C {
+ public:
+ int x;
+ };
+
+ void test() {
+ C *pc = 0;
+ int k = pc->x; // warn
+ }
+
+ // Objective-C
+ @interface MyClass {
+ @public
+ int x;
+ }
+ @end
+
+ void test() {
+ MyClass *obj = 0;
+ obj->x = 1; // warn
+ }
+
+core.StackAddressEscape (C)
+"""""""""""""""""""""""""""
+Check that addresses to stack memory do not escape the function.
+
+.. code-block:: c
+
+ char const *p;
+
+ void test() {
+ char const str[] = "string";
+ p = str; // warn
+ }
+
+ void* test() {
+ return __builtin_alloca(12); // warn
+ }
+
+ void test() {
+ static int *x;
+ int y;
+ x = &y; // warn
+ }
+
+
+core.UndefinedBinaryOperatorResult (C)
+""""""""""""""""""""""""""""""""""""""
+Check for undefined results of binary operators.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ int y = x + 1; // warn: left operand is garbage
+ }
+
+core.VLASize (C)
+""""""""""""""""
+Check for declarations of Variable Length Arrays of undefined or zero size.
+
+ Check for declarations of VLA of undefined or zero size.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ int vla1[x]; // warn: garbage as size
+ }
+
+ void test() {
+ int x = 0;
+ int vla2[x]; // warn: zero size
+ }
+
+core.uninitialized.ArraySubscript (C)
+"""""""""""""""""""""""""""""""""""""
+Check for uninitialized values used as array subscripts.
+
+.. code-block:: c
+
+ void test() {
+ int i, a[10];
+ int x = a[i]; // warn: array subscript is undefined
+ }
+
+core.uninitialized.Assign (C)
+"""""""""""""""""""""""""""""
+Check for assigning uninitialized values.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ x |= 1; // warn: left expression is uninitialized
+ }
+
+core.uninitialized.Branch (C)
+"""""""""""""""""""""""""""""
+Check for uninitialized values used as branch conditions.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ if (x) // warn
+ return;
+ }
+
+core.uninitialized.CapturedBlockVariable (C)
+""""""""""""""""""""""""""""""""""""""""""""
+Check for blocks that capture uninitialized values.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ ^{ int y = x; }(); // warn
+ }
+
+core.uninitialized.UndefReturn (C)
+""""""""""""""""""""""""""""""""""
+Check for uninitialized values being returned to the caller.
+
+.. code-block:: c
+
+ int test() {
+ int x;
+ return x; // warn
+ }
+
+.. _cplusplus-checkers:
+
+
+cpluslus
+^^^^^^^^
+
+C++ Checkers.
+
+cplusplus.InnerPointer
+""""""""""""""""""""""
+Check for inner pointers of C++ containers used after re/deallocation.
+
+cplusplus.NewDelete (C++)
+"""""""""""""""""""""""""
+Check for double-free and use-after-free problems. Traces memory managed by new/delete.
+
+.. literalinclude:: checkers/newdelete_example.cpp
+ :language: cpp
+
+cplusplus.NewDeleteLeaks (C++)
+""""""""""""""""""""""""""""""
+Check for memory leaks. Traces memory managed by new/delete.
+
+.. code-block:: cpp
+
+ void test() {
+ int *p = new int;
+ } // warn
+
+
+cplusplus.SelfAssignment (C++)
+""""""""""""""""""""""""""""""
+Checks C++ copy and move assignment operators for self assignment.
+
+.. _deadcode-checkers:
+
+deadcode
+^^^^^^^^
+
+Dead Code Checkers.
+
+deadcode.DeadStores (C)
+"""""""""""""""""""""""
+Check for values stored to variables that are never read afterwards.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ x = 1; // warn
+ }
+
+.. _nullability-checkers:
+
+nullability
+^^^^^^^^^^^
+
+Objective C checkers that warn for null pointer passing and dereferencing errors.
+
+nullability.NullPassedToNonnull (ObjC)
+""""""""""""""""""""""""""""""""""""""
+Warns when a null pointer is passed to a pointer which has a _Nonnull type.
+
+.. code-block:: objc
+
+ if (name != nil)
+ return;
+ // Warning: nil passed to a callee that requires a non-null 1st parameter
+ NSString *greeting = [@"Hello " stringByAppendingString:name];
+
+nullability.NullReturnedFromNonnull (ObjC)
+""""""""""""""""""""""""""""""""""""""""""
+Warns when a null pointer is returned from a function that has _Nonnull return type.
+
+.. code-block:: objc
+
+ - (nonnull id)firstChild {
+ id result = nil;
+ if ([_children count] > 0)
+ result = _children[0];
+
+ // Warning: nil returned from a method that is expected
+ // to return a non-null value
+ return result;
+ }
+
+nullability.NullableDereferenced (ObjC)
+"""""""""""""""""""""""""""""""""""""""
+Warns when a nullable pointer is dereferenced.
+
+.. code-block:: objc
+
+ struct LinkedList {
+ int data;
+ struct LinkedList *next;
+ };
+
+ struct LinkedList * _Nullable getNext(struct LinkedList *l);
+
+ void updateNextData(struct LinkedList *list, int newData) {
+ struct LinkedList *next = getNext(list);
+ // Warning: Nullable pointer is dereferenced
+ next->data = 7;
+ }
+
+nullability.NullablePassedToNonnull (ObjC)
+""""""""""""""""""""""""""""""""""""""""""
+Warns when a nullable pointer is passed to a pointer which has a _Nonnull type.
+
+.. code-block:: objc
+
+ typedef struct Dummy { int val; } Dummy;
+ Dummy *_Nullable returnsNullable();
+ void takesNonnull(Dummy *_Nonnull);
+
+ void test() {
+ Dummy *p = returnsNullable();
+ takesNonnull(p); // warn
+ }
+
+nullability.NullableReturnedFromNonnull (ObjC)
+""""""""""""""""""""""""""""""""""""""""""""""
+Warns when a nullable pointer is returned from a function that has _Nonnull return type.
+
+.. _optin-checkers:
+
+optin
+^^^^^
+
+Checkers for portability, performance or coding style specific rules.
+
+optin.cplusplus.VirtualCall (C++)
+"""""""""""""""""""""""""""""""""
+Check virtual function calls during construction or destruction.
+
+.. code-block:: cpp
+
+ class A {
+ public:
+ A() {
+ f(); // warn
+ }
+ virtual void f();
+ };
+
+ class A {
+ public:
+ ~A() {
+ this->f(); // warn
+ }
+ virtual void f();
+ };
+
+optin.mpi.MPI-Checker (C)
+"""""""""""""""""""""""""
+Checks MPI code.
+
+.. code-block:: c
+
+ void test() {
+ double buf = 0;
+ MPI_Request sendReq1;
+ MPI_Ireduce(MPI_IN_PLACE, &buf, 1, MPI_DOUBLE, MPI_SUM,
+ 0, MPI_COMM_WORLD, &sendReq1);
+ } // warn: request 'sendReq1' has no matching wait.
+
+ void test() {
+ double buf = 0;
+ MPI_Request sendReq;
+ MPI_Isend(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &sendReq);
+ MPI_Irecv(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &sendReq); // warn
+ MPI_Isend(&buf, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, &sendReq); // warn
+ MPI_Wait(&sendReq, MPI_STATUS_IGNORE);
+ }
+
+ void missingNonBlocking() {
+ int rank = 0;
+ MPI_Comm_rank(MPI_COMM_WORLD, &rank);
+ MPI_Request sendReq1[10][10][10];
+ MPI_Wait(&sendReq1[1][7][9], MPI_STATUS_IGNORE); // warn
+ }
+
+optin.osx.cocoa.localizability.EmptyLocalizationContextChecker (ObjC)
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Check that NSLocalizedString macros include a comment for context.
+
+.. code-block:: objc
+
+ - (void)test {
+ NSString *string = NSLocalizedString(@"LocalizedString", nil); // warn
+ NSString *string2 = NSLocalizedString(@"LocalizedString", @" "); // warn
+ NSString *string3 = NSLocalizedStringWithDefaultValue(
+ @"LocalizedString", nil, [[NSBundle alloc] init], nil,@""); // warn
+ }
+
+optin.osx.cocoa.localizability.NonLocalizedStringChecker (ObjC)
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Warns about uses of non-localized NSStrings passed to UI methods expecting localized NSStrings.
+
+.. code-block:: objc
+
+ NSString *alarmText =
+ NSLocalizedString(@"Enabled", @"Indicates alarm is turned on");
+ if (!isEnabled) {
+ alarmText = @"Disabled";
+ }
+ UILabel *alarmStateLabel = [[UILabel alloc] init];
+
+ // Warning: User-facing text should use localized string macro
+ [alarmStateLabel setText:alarmText];
+
+optin.performance.GCDAntipattern
+""""""""""""""""""""""""""""""""
+Check for performance anti-patterns when using Grand Central Dispatch.
+
+optin.performance.Padding
+"""""""""""""""""""""""""
+Check for excessively padded structs.
+
+optin.portability.UnixAPI
+"""""""""""""""""""""""""
+Finds implementation-defined behavior in UNIX/Posix functions.
+
+
+.. _security-checkers:
+
+security
+^^^^^^^^
+
+Security related checkers.
+
+security.FloatLoopCounter (C)
+"""""""""""""""""""""""""""""
+Warn on using a floating point value as a loop counter (CERT: FLP30-C, FLP30-CPP).
+
+.. code-block:: c
+
+ void test() {
+ for (float x = 0.1f; x <= 1.0f; x += 0.1f) {} // warn
+ }
+
+security.insecureAPI.UncheckedReturn (C)
+""""""""""""""""""""""""""""""""""""""""
+Warn on uses of functions whose return values must be always checked.
+
+.. code-block:: c
+
+ void test() {
+ setuid(1); // warn
+ }
+
+security.insecureAPI.bcmp (C)
+"""""""""""""""""""""""""""""
+Warn on uses of the 'bcmp' function.
+
+.. code-block:: c
+
+ void test() {
+ bcmp(ptr0, ptr1, n); // warn
+ }
+
+security.insecureAPI.bcopy (C)
+""""""""""""""""""""""""""""""
+Warn on uses of the 'bcopy' function.
+
+.. code-block:: c
+
+ void test() {
+ bcopy(src, dst, n); // warn
+ }
+
+security.insecureAPI.bzero (C)
+""""""""""""""""""""""""""""""
+Warn on uses of the 'bzero' function.
+
+.. code-block:: c
+
+ void test() {
+ bzero(ptr, n); // warn
+ }
+
+security.insecureAPI.getpw (C)
+""""""""""""""""""""""""""""""
+Warn on uses of the 'getpw' function.
+
+.. code-block:: c
+
+ void test() {
+ char buff[1024];
+ getpw(2, buff); // warn
+ }
+
+security.insecureAPI.gets (C)
+"""""""""""""""""""""""""""""
+Warn on uses of the 'gets' function.
+
+.. code-block:: c
+
+ void test() {
+ char buff[1024];
+ gets(buff); // warn
+ }
+
+security.insecureAPI.mkstemp (C)
+""""""""""""""""""""""""""""""""
+Warn when 'mkstemp' is passed fewer than 6 X's in the format string.
+
+.. code-block:: c
+
+ void test() {
+ mkstemp("XX"); // warn
+ }
+
+security.insecureAPI.mktemp (C)
+"""""""""""""""""""""""""""""""
+Warn on uses of the ``mktemp`` function.
+
+.. code-block:: c
+
+ void test() {
+ char *x = mktemp("/tmp/zxcv"); // warn: insecure, use mkstemp
+ }
+
+security.insecureAPI.rand (C)
+"""""""""""""""""""""""""""""
+Warn on uses of inferior random number generating functions (only if arc4random function is available):
+``drand48, erand48, jrand48, lcong48, lrand48, mrand48, nrand48, random, rand_r``.
+
+.. code-block:: c
+
+ void test() {
+ random(); // warn
+ }
+
+security.insecureAPI.strcpy (C)
+"""""""""""""""""""""""""""""""
+Warn on uses of the ``strcpy`` and ``strcat`` functions.
+
+.. code-block:: c
+
+ void test() {
+ char x[4];
+ char *y = "abcd";
+
+ strcpy(x, y); // warn
+ }
+
+
+security.insecureAPI.vfork (C)
+""""""""""""""""""""""""""""""
+ Warn on uses of the 'vfork' function.
+
+.. code-block:: c
+
+ void test() {
+ vfork(); // warn
+ }
+
+.. _unix-checkers:
+
+unix
+^^^^
+POSIX/Unix checkers.
+
+unix.API (C)
+""""""""""""
+Check calls to various UNIX/Posix functions: ``open, pthread_once, calloc, malloc, realloc, alloca``.
+
+.. literalinclude:: checkers/unix_api_example.c
+ :language: c
+
+unix.Malloc (C)
+"""""""""""""""
+Check for memory leaks, double free, and use-after-free problems. Traces memory managed by malloc()/free().
+
+.. literalinclude:: checkers/unix_malloc_example.c
+ :language: c
+
+unix.MallocSizeof (C)
+"""""""""""""""""""""
+Check for dubious ``malloc`` arguments involving ``sizeof``.
+
+.. code-block:: c
+
+ void test() {
+ long *p = malloc(sizeof(short));
+ // warn: result is converted to 'long *', which is
+ // incompatible with operand type 'short'
+ free(p);
+ }
+
+unix.MismatchedDeallocator (C, C++)
+"""""""""""""""""""""""""""""""""""
+Check for mismatched deallocators.
+
+.. literalinclude:: checkers/mismatched_deallocator_example.cpp
+ :language: c
+
+unix.Vfork (C)
+""""""""""""""
+Check for proper usage of ``vfork``.
+
+.. code-block:: c
+
+ int test(int x) {
+ pid_t pid = vfork(); // warn
+ if (pid != 0)
+ return 0;
+
+ switch (x) {
+ case 0:
+ pid = 1;
+ execl("", "", 0);
+ _exit(1);
+ break;
+ case 1:
+ x = 0; // warn: this assignment is prohibited
+ break;
+ case 2:
+ foo(); // warn: this function call is prohibited
+ break;
+ default:
+ return 0; // warn: return is prohibited
+ }
+
+ while(1);
+ }
+
+unix.cstring.BadSizeArg (C)
+"""""""""""""""""""""""""""
+Check the size argument passed into C string functions for common erroneous patterns. Use ``-Wno-strncat-size`` compiler option to mute other ``strncat``-related compiler warnings.
+
+.. code-block:: c
+
+ void test() {
+ char dest[3];
+ strncat(dest, """""""""""""""""""""""""*", sizeof(dest));
+ // warn: potential buffer overflow
+ }
+
+unix.cstrisng.NullArg (C)
+"""""""""""""""""""""""""
+Check for null pointers being passed as arguments to C string functions:
+``strlen, strnlen, strcpy, strncpy, strcat, strncat, strcmp, strncmp, strcasecmp, strncasecmp``.
+
+.. code-block:: c
+
+ int test() {
+ return strlen(0); // warn
+ }
+
+.. _osx-checkers:
+
+osx
+^^^
+OS X checkers.
+
+osx.API (C)
+"""""""""""
+Check for proper uses of various Apple APIs.
+
+.. code-block:: objc
+
+ void test() {
+ dispatch_once_t pred = 0;
+ dispatch_once(&pred, ^(){}); // warn: dispatch_once uses local
+ }
+
+osx.NumberObjectConversion (C, C++, ObjC)
+"""""""""""""""""""""""""""""""""""""""""
+Check for erroneous conversions of objects representing numbers into numbers.
+
+.. code-block:: objc
+
+ NSNumber *photoCount = [albumDescriptor objectForKey:@"PhotoCount"];
+ // Warning: Comparing a pointer value of type 'NSNumber *'
+ // to a scalar integer value
+ if (photoCount > 0) {
+ [self displayPhotos];
+ }
+
+osx.ObjCProperty (ObjC)
+"""""""""""""""""""""""
+Check for proper uses of Objective-C properties.
+
+.. code-block:: objc
+
+ NSNumber *photoCount = [albumDescriptor objectForKey:@"PhotoCount"];
+ // Warning: Comparing a pointer value of type 'NSNumber *'
+ // to a scalar integer value
+ if (photoCount > 0) {
+ [self displayPhotos];
+ }
+
+
+osx.SecKeychainAPI (C)
+""""""""""""""""""""""
+Check for proper uses of Secure Keychain APIs.
+
+.. literalinclude:: checkers/seckeychainapi_example.m
+ :language: objc
+
+osx.cocoa.AtSync (ObjC)
+"""""""""""""""""""""""
+Check for nil pointers used as mutexes for @synchronized.
+
+.. code-block:: objc
+
+ void test(id x) {
+ if (!x)
+ @synchronized(x) {} // warn: nil value used as mutex
+ }
+
+ void test() {
+ id y;
+ @synchronized(y) {} // warn: uninitialized value used as mutex
+ }
+
+osx.cocoa.AutoreleaseWrite
+""""""""""""""""""""""""""
+Warn about potentially crashing writes to autoreleasing objects from different autoreleasing pools in Objective-C.
+
+osx.cocoa.ClassRelease (ObjC)
+"""""""""""""""""""""""""""""
+Check for sending 'retain', 'release', or 'autorelease' directly to a Class.
+
+.. code-block:: objc
+
+ @interface MyClass : NSObject
+ @end
+
+ void test(void) {
+ [MyClass release]; // warn
+ }
+
+osx.cocoa.Dealloc (ObjC)
+""""""""""""""""""""""""
+Warn about Objective-C classes that lack a correct implementation of -dealloc
+
+.. literalinclude:: checkers/dealloc_example.m
+ :language: objc
+
+osx.cocoa.IncompatibleMethodTypes (ObjC)
+""""""""""""""""""""""""""""""""""""""""
+Warn about Objective-C method signatures with type incompatibilities.
+
+.. code-block:: objc
+
+ @interface MyClass1 : NSObject
+ - (int)foo;
+ @end
+
+ @implementation MyClass1
+ - (int)foo { return 1; }
+ @end
+
+ @interface MyClass2 : MyClass1
+ - (float)foo;
+ @end
+
+ @implementation MyClass2
+ - (float)foo { return 1.0; } // warn
+ @end
+
+osx.cocoa.Loops
+"""""""""""""""
+Improved modeling of loops using Cocoa collection types.
+
+osx.cocoa.MissingSuperCall (ObjC)
+"""""""""""""""""""""""""""""""""
+Warn about Objective-C methods that lack a necessary call to super.
+
+.. code-block:: objc
+
+ @interface Test : UIViewController
+ @end
+ @implementation test
+ - (void)viewDidLoad {} // warn
+ @end
+
+
+osx.cocoa.NSAutoreleasePool (ObjC)
+""""""""""""""""""""""""""""""""""
+Warn for suboptimal uses of NSAutoreleasePool in Objective-C GC mode.
+
+.. code-block:: objc
+
+ void test() {
+ NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
+ [pool release]; // warn
+ }
+
+osx.cocoa.NSError (ObjC)
+""""""""""""""""""""""""
+Check usage of NSError parameters.
+
+.. code-block:: objc
+
+ @interface A : NSObject
+ - (void)foo:(NSError """""""""""""""""""""""")error;
+ @end
+
+ @implementation A
+ - (void)foo:(NSError """""""""""""""""""""""")error {
+ // warn: method accepting NSError"""""""""""""""""""""""" should have a non-void
+ // return value
+ }
+ @end
+
+ @interface A : NSObject
+ - (BOOL)foo:(NSError """""""""""""""""""""""")error;
+ @end
+
+ @implementation A
+ - (BOOL)foo:(NSError """""""""""""""""""""""")error {
+ *error = 0; // warn: potential null dereference
+ return 0;
+ }
+ @end
+
+osx.cocoa.NilArg (ObjC)
+"""""""""""""""""""""""
+Check for prohibited nil arguments to ObjC method calls.
+
+ - caseInsensitiveCompare:
+ - compare:
+ - compare:options:
+ - compare:options:range:
+ - compare:options:range:locale:
+ - componentsSeparatedByCharactersInSet:
+ - initWithFormat:
+
+.. code-block:: objc
+
+ NSComparisonResult test(NSString *s) {
+ NSString *aString = nil;
+ return [s caseInsensitiveCompare:aString];
+ // warn: argument to 'NSString' method
+ // 'caseInsensitiveCompare:' cannot be nil
+ }
+
+
+osx.cocoa.NonNilReturnValue
+"""""""""""""""""""""""""""
+Models the APIs that are guaranteed to return a non-nil value.
+
+osx.cocoa.ObjCGenerics (ObjC)
+"""""""""""""""""""""""""""""
+Check for type errors when using Objective-C generics.
+
+.. code-block:: objc
+
+ NSMutableArray *names = [NSMutableArray array];
+ NSMutableArray *birthDates = names;
+
+ // Warning: Conversion from value of type 'NSDate *'
+ // to incompatible type 'NSString *'
+ [birthDates addObject: [NSDate date]];
+
+osx.cocoa.RetainCount (ObjC)
+""""""""""""""""""""""""""""
+Check for leaks and improper reference count management
+
+.. code-block:: objc
+
+ void test() {
+ NSString *s = [[NSString alloc] init]; // warn
+ }
+
+ CFStringRef test(char *bytes) {
+ return CFStringCreateWithCStringNoCopy(
+ 0, bytes, NSNEXTSTEPStringEncoding, 0); // warn
+ }
+
+
+osx.cocoa.RunLoopAutoreleaseLeak
+""""""""""""""""""""""""""""""""
+Check for leaked memory in autorelease pools that will never be drained.
+
+osx.cocoa.SelfInit (ObjC)
+"""""""""""""""""""""""""
+Check that 'self' is properly initialized inside an initializer method.
+
+.. code-block:: objc
+
+ @interface MyObj : NSObject {
+ id x;
+ }
+ - (id)init;
+ @end
+
+ @implementation MyObj
+ - (id)init {
+ [super init];
+ x = 0; // warn: instance variable used while 'self' is not
+ // initialized
+ return 0;
+ }
+ @end
+
+ @interface MyObj : NSObject
+ - (id)init;
+ @end
+
+ @implementation MyObj
+ - (id)init {
+ [super init];
+ return self; // warn: returning uninitialized 'self'
+ }
+ @end
+
+osx.cocoa.SuperDealloc (ObjC)
+"""""""""""""""""""""""""""""
+Warn about improper use of '[super dealloc]' in Objective-C.
+
+.. code-block:: objc
+
+ @interface SuperDeallocThenReleaseIvarClass : NSObject {
+ NSObject *_ivar;
+ }
+ @end
+
+ @implementation SuperDeallocThenReleaseIvarClass
+ - (void)dealloc {
+ [super dealloc];
+ [_ivar release]; // warn
+ }
+ @end
+
+osx.cocoa.UnusedIvars (ObjC)
+""""""""""""""""""""""""""""
+Warn about private ivars that are never used.
+
+.. code-block:: objc
+
+ @interface MyObj : NSObject {
+ @private
+ id x; // warn
+ }
+ @end
+
+ @implementation MyObj
+ @end
+
+osx.cocoa.VariadicMethodTypes (ObjC)
+""""""""""""""""""""""""""""""""""""
+Check for passing non-Objective-C types to variadic collection
+initialization methods that expect only Objective-C types.
+
+.. code-block:: objc
+
+ void test() {
+ [NSSet setWithObjects:@"Foo", "Bar", nil];
+ // warn: argument should be an ObjC pointer type, not 'char *'
+ }
+
+osx.coreFoundation.CFError (C)
+""""""""""""""""""""""""""""""
+Check usage of CFErrorRef* parameters
+
+.. code-block:: c
+
+ void test(CFErrorRef *error) {
+ // warn: function accepting CFErrorRef* should have a
+ // non-void return
+ }
+
+ int foo(CFErrorRef *error) {
+ *error = 0; // warn: potential null dereference
+ return 0;
+ }
+
+osx.coreFoundation.CFNumber (C)
+"""""""""""""""""""""""""""""""
+Check for proper uses of CFNumber APIs.
+
+.. code-block:: c
+
+ CFNumberRef test(unsigned char x) {
+ return CFNumberCreate(0, kCFNumberSInt16Type, &x);
+ // warn: 8 bit integer is used to initialize a 16 bit integer
+ }
+
+osx.coreFoundation.CFRetainRelease (C)
+""""""""""""""""""""""""""""""""""""""
+Check for null arguments to CFRetain/CFRelease/CFMakeCollectable.
+
+.. code-block:: c
+
+ void test(CFTypeRef p) {
+ if (!p)
+ CFRetain(p); // warn
+ }
+
+ void test(int x, CFTypeRef p) {
+ if (p)
+ return;
+
+ CFRelease(p); // warn
+ }
+
+osx.coreFoundation.containers.OutOfBounds (C)
+"""""""""""""""""""""""""""""""""""""""""""""
+Checks for index out-of-bounds when using 'CFArray' API.
+
+.. code-block:: c
+
+ void test() {
+ CFArrayRef A = CFArrayCreate(0, 0, 0, &kCFTypeArrayCallBacks);
+ CFArrayGetValueAtIndex(A, 0); // warn
+ }
+
+osx.coreFoundation.containers.PointerSizedValues (C)
+""""""""""""""""""""""""""""""""""""""""""""""""""""
+Warns if 'CFArray', 'CFDictionary', 'CFSet' are created with non-pointer-size values.
+
+.. code-block:: c
+
+ void test() {
+ int x[] = { 1 };
+ CFArrayRef A = CFArrayCreate(0, (const void """""""""""""""""""""""")x, 1,
+ &kCFTypeArrayCallBacks); // warn
+ }
+
+
+.. _alpha-checkers:
+
+Experimental Checkers
+---------------------
+
+*These are checkers with known issues or limitations that keep them from being on by default. They are likely to have false positives. Bug reports and especially patches are welcome.*
+
+alpha.clone
+^^^^^^^^^^^
+
+alpha.clone.CloneChecker (C, C++, ObjC)
+"""""""""""""""""""""""""""""""""""""""
+Reports similar pieces of code.
+
+.. code-block:: c
+
+ void log();
+
+ int max(int a, int b) { // warn
+ log();
+ if (a > b)
+ return a;
+ return b;
+ }
+
+ int maxClone(int x, int y) { // similar code here
+ log();
+ if (x > y)
+ return x;
+ return y;
+ }
+
+alpha.core.BoolAssignment (ObjC)
+""""""""""""""""""""""""""""""""
+Warn about assigning non-{0,1} values to boolean variables.
+
+.. code-block:: objc
+
+ void test() {
+ BOOL b = -1; // warn
+ }
+
+alpha.core
+^^^^^^^^^^
+
+alpha.core.CallAndMessageUnInitRefArg (C,C++, ObjC)
+"""""""""""""""""""""""""""""""""""""""""""""""""""
+Check for logical errors for function calls and Objective-C
+message expressions (e.g., uninitialized arguments, null function pointers, and pointer to undefined variables).
+
+.. code-block:: c
+
+ void test(void) {
+ int t;
+ int &p = t;
+ int &s = p;
+ int &q = s;
+ foo(q); // warn
+ }
+
+ void test(void) {
+ int x;
+ foo(&x); // warn
+ }
+
+alpha.core.CastSize (C)
+"""""""""""""""""""""""
+Check when casting a malloc'ed type ``T``, whether the size is a multiple of the size of ``T``.
+
+.. code-block:: c
+
+ void test() {
+ int *x = (int *) malloc(11); // warn
+ }
+
+alpha.core.CastToStruct (C, C++)
+""""""""""""""""""""""""""""""""
+Check for cast from non-struct pointer to struct pointer.
+
+.. code-block:: cpp
+
+ // C
+ struct s {};
+
+ void test(int *p) {
+ struct s *ps = (struct s *) p; // warn
+ }
+
+ // C++
+ class c {};
+
+ void test(int *p) {
+ c *pc = (c *) p; // warn
+ }
+
+alpha.core.Conversion (C, C++, ObjC)
+""""""""""""""""""""""""""""""""""""
+Loss of sign/precision in implicit conversions.
+
+.. code-block:: c
+
+ void test(unsigned U, signed S) {
+ if (S > 10) {
+ if (U < S) {
+ }
+ }
+ if (S < -10) {
+ if (U < S) { // warn (loss of sign)
+ }
+ }
+ }
+
+ void test() {
+ long long A = 1LL << 60;
+ short X = A; // warn (loss of precision)
+ }
+
+alpha.core.DynamicTypeChecker (ObjC)
+""""""""""""""""""""""""""""""""""""
+Check for cases where the dynamic and the static type of an object are unrelated.
+
+
+.. code-block:: objc
+
+ id date = [NSDate date];
+
+ // Warning: Object has a dynamic type 'NSDate *' which is
+ // incompatible with static type 'NSNumber *'"
+ NSNumber *number = date;
+ [number doubleValue];
+
+alpha.core.FixedAddr (C)
+""""""""""""""""""""""""
+Check for assignment of a fixed address to a pointer.
+
+.. code-block:: c
+
+ void test() {
+ int *p;
+ p = (int *) 0x10000; // warn
+ }
+
+alpha.core.IdenticalExpr (C, C++)
+"""""""""""""""""""""""""""""""""
+Warn about unintended use of identical expressions in operators.
+
+.. code-block:: cpp
+
+ // C
+ void test() {
+ int a = 5;
+ int b = a | 4 | a; // warn: identical expr on both sides
+ }
+
+ // C++
+ bool f(void);
+
+ void test(bool b) {
+ int i = 10;
+ if (f()) { // warn: true and false branches are identical
+ do {
+ i--;
+ } while (f());
+ } else {
+ do {
+ i--;
+ } while (f());
+ }
+ }
+
+alpha.core.PointerArithm (C)
+""""""""""""""""""""""""""""
+Check for pointer arithmetic on locations other than array elements.
+
+.. code-block:: c
+
+ void test() {
+ int x;
+ int *p;
+ p = &x + 1; // warn
+ }
+
+alpha.core.PointerSub (C)
+"""""""""""""""""""""""""
+Check for pointer subtractions on two pointers pointing to different memory chunks.
+
+.. code-block:: c
+
+ void test() {
+ int x, y;
+ int d = &y - &x; // warn
+ }
+
+alpha.core.SizeofPtr (C)
+""""""""""""""""""""""""
+Warn about unintended use of ``sizeof()`` on pointer expressions.
+
+.. code-block:: c
+
+ struct s {};
+
+ int test(struct s *p) {
+ return sizeof(p);
+ // warn: sizeof(ptr) can produce an unexpected result
+ }
+
+alpha.core.StackAddressAsyncEscape (C)
+""""""""""""""""""""""""""""""""""""""
+Check that addresses to stack memory do not escape the function that involves dispatch_after or dispatch_async.
+This checker is a part of ``core.StackAddressEscape``, but is temporarily disabled until some false positives are fixed.
+
+.. code-block:: c
+
+ dispatch_block_t test_block_inside_block_async_leak() {
+ int x = 123;
+ void (^inner)(void) = ^void(void) {
+ int y = x;
+ ++y;
+ };
+ void (^outer)(void) = ^void(void) {
+ int z = x;
+ ++z;
+ inner();
+ };
+ return outer; // warn: address of stack-allocated block is captured by a
+ // returned block
+ }
+
+alpha.core.TestAfterDivZero (C)
+"""""""""""""""""""""""""""""""
+Check for division by variable that is later compared against 0.
+Either the comparison is useless or there is division by zero.
+
+.. code-block:: c
+
+ void test(int x) {
+ var = 77 / x;
+ if (x == 0) { } // warn
+ }
+
+alpha.cplusplus
+^^^^^^^^^^^^^^^
+
+alpha.cplusplus.DeleteWithNonVirtualDtor (C++)
+""""""""""""""""""""""""""""""""""""""""""""""
+Reports destructions of polymorphic objects with a non-virtual destructor in their base class.
+
+.. code-block:: cpp
+
+ NonVirtual *create() {
+ NonVirtual *x = new NVDerived(); // note: conversion from derived to base
+ // happened here
+ return x;
+ }
+
+ void sink(NonVirtual *x) {
+ delete x; // warn: destruction of a polymorphic object with no virtual
+ // destructor
+ }
+
+alpha.cplusplus.EnumCastOutOfRange (C++)
+""""""""""""""""""""""""""""""""""""""""
+Check for integer to enumeration casts that could result in undefined values.
+
+.. code-block:: cpp
+
+ enum TestEnum {
+ A = 0
+ };
+
+ void foo() {
+ TestEnum t = static_cast(-1);
+ // warn: the value provided to the cast expression is not in
+ the valid range of values for the enum
+
+alpha.cplusplus.InvalidatedIterator (C++)
+"""""""""""""""""""""""""""""""""""""""""
+Check for use of invalidated iterators.
+
+.. code-block:: cpp
+
+ void bad_copy_assign_operator_list1(std::list &L1,
+ const std::list &L2) {
+ auto i0 = L1.cbegin();
+ L1 = L2;
+ *i0; // warn: invalidated iterator accessed
+ }
+
+
+alpha.cplusplus.IteratorRange (C++)
+"""""""""""""""""""""""""""""""""""
+Check for iterators used outside their valid ranges.
+
+.. code-block:: cpp
+
+ void simple_bad_end(const std::vector &v) {
+ auto i = v.end();
+ *i; // warn: iterator accessed outside of its range
+ }
+
+alpha.cplusplus.MismatchedIterator (C++)
+""""""""""""""""""""""""""""""""""""""""
+Check for use of iterators of different containers where iterators of the same container are expected.
+
+.. code-block:: cpp
+
+ void bad_insert3(std::vector &v1, std::vector &v2) {
+ v2.insert(v1.cbegin(), v2.cbegin(), v2.cend()); // warn: container accessed
+ // using foreign
+ // iterator argument
+ v1.insert(v1.cbegin(), v1.cbegin(), v2.cend()); // warn: iterators of
+ // different containers
+ // used where the same
+ // container is
+ // expected
+ v1.insert(v1.cbegin(), v2.cbegin(), v1.cend()); // warn: iterators of
+ // different containers
+ // used where the same
+ // container is
+ // expected
+ }
+
+alpha.cplusplus.MisusedMovedObject (C++)
+""""""""""""""""""""""""""""""""""""""""
+Method calls on a moved-from object and copying a moved-from object will be reported.
+
+
+.. code-block:: cpp
+
+ struct A {
+ void foo() {}
+ };
+
+ void f() {
+ A a;
+ A b = std::move(a); // note: 'a' became 'moved-from' here
+ a.foo(); // warn: method call on a 'moved-from' object 'a'
+ }
+
+alpha.cplusplus.UninitializedObject (C++)
+"""""""""""""""""""""""""""""""""""""""""
+
+This checker reports uninitialized fields in objects created after a constructor call.
+It doesn't only find direct uninitialized fields, but rather makes a deep inspection
+of the object, analyzing all of it's fields subfields.
+The checker regards inherited fields as direct fields, so one will
+recieve warnings for uninitialized inherited data members as well.
+
+.. code-block:: cpp
+
+ // With Pedantic and CheckPointeeInitialization set to true
+
+ struct A {
+ struct B {
+ int x; // note: uninitialized field 'this->b.x'
+ // note: uninitialized field 'this->bptr->x'
+ int y; // note: uninitialized field 'this->b.y'
+ // note: uninitialized field 'this->bptr->y'
+ };
+ int *iptr; // note: uninitialized pointer 'this->iptr'
+ B b;
+ B *bptr;
+ char *cptr; // note: uninitialized pointee 'this->cptr'
+
+ A (B *bptr, char *cptr) : bptr(bptr), cptr(cptr) {}
+ };
+
+ void f() {
+ A::B b;
+ char c;
+ A a(&b, &c); // warning: 6 uninitialized fields
+ // after the constructor call
+ }
+
+ // With Pedantic set to false and
+ // CheckPointeeInitialization set to true
+ // (every field is uninitialized)
+
+ struct A {
+ struct B {
+ int x;
+ int y;
+ };
+ int *iptr;
+ B b;
+ B *bptr;
+ char *cptr;
+
+ A (B *bptr, char *cptr) : bptr(bptr), cptr(cptr) {}
+ };
+
+ void f() {
+ A::B b;
+ char c;
+ A a(&b, &c); // no warning
+ }
+
+ // With Pedantic set to true and
+ // CheckPointeeInitialization set to false
+ // (pointees are regarded as initialized)
+
+ struct A {
+ struct B {
+ int x; // note: uninitialized field 'this->b.x'
+ int y; // note: uninitialized field 'this->b.y'
+ };
+ int *iptr; // note: uninitialized pointer 'this->iptr'
+ B b;
+ B *bptr;
+ char *cptr;
+
+ A (B *bptr, char *cptr) : bptr(bptr), cptr(cptr) {}
+ };
+
+ void f() {
+ A::B b;
+ char c;
+ A a(&b, &c); // warning: 3 uninitialized fields
+ // after the constructor call
+ }
+
+
+**Options**
+
+This checker has several options which can be set from command line (e.g. ``-analyzer-config alpha.cplusplus.UninitializedObject:Pedantic=true``):
+
+* ``Pedantic`` (boolean). If to false, the checker won't emit warnings for objects that don't have at least one initialized field. Defaults to false.
+
+* ``NotesAsWarnings`` (boolean). If set to true, the checker will emit a warning for each uninitalized field, as opposed to emitting one warning per constructor call, and listing the uninitialized fields that belongs to it in notes. *Defaults to false.*.
+
+* ``CheckPointeeInitialization`` (boolean). If set to false, the checker will not analyze the pointee of pointer/reference fields, and will only check whether the object itself is initialized. *Defaults to false.*.
+
+* ``IgnoreRecordsWithField`` (string). If supplied, the checker will not analyze structures that have a field with a name or type name that matches the given pattern. *Defaults to ""*. Can be set with ``-analyzer-config alpha.cplusplus.UninitializedObject:IgnoreRecordsWithField="[Tt]ag|[Kk]ind"``.
+
+
+alpha.deadcode
+^^^^^^^^^^^^^^
+alpha.deadcode.UnreachableCode (C, C++)
+"""""""""""""""""""""""""""""""""""""""
+Check unreachable code.
+
+.. code-block:: cpp
+
+ // C
+ int test() {
+ int x = 1;
+ while(x);
+ return x; // warn
+ }
+
+ // C++
+ void test() {
+ int a = 2;
+
+ while (a > 1)
+ a--;
+
+ if (a > 1)
+ a++; // warn
+ }
+
+ // Objective-C
+ void test(id x) {
+ return;
+ [x retain]; // warn
+ }
+
+alpha.llvm
+^^^^^^^^^^
+
+alpha.llvm.Conventions
+""""""""""""""""""""""
+
+Check code for LLVM codebase conventions:
+
+* A StringRef should not be bound to a temporary std::string whose lifetime is shorter than the StringRef's.
+* Clang AST nodes should not have fields that can allocate memory.
+
+
+alpha.osx
+^^^^^^^^^
+
+alpha.osx.cocoa.DirectIvarAssignment (ObjC)
+"""""""""""""""""""""""""""""""""""""""""""
+Check for direct assignments to instance variables.
+
+
+.. code-block:: objc
+
+ @interface MyClass : NSObject {}
+ @property (readonly) id A;
+ - (void) foo;
+ @end
+
+ @implementation MyClass
+ - (void) foo {
+ _A = 0; // warn
+ }
+ @end
+
+alpha.osx.cocoa.DirectIvarAssignmentForAnnotatedFunctions (ObjC)
+""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Check for direct assignments to instance variables in
+the methods annotated with ``objc_no_direct_instance_variable_assignment``.
+
+.. code-block:: objc
+
+ @interface MyClass : NSObject {}
+ @property (readonly) id A;
+ - (void) fAnnotated __attribute__((
+ annotate("objc_no_direct_instance_variable_assignment")));
+ - (void) fNotAnnotated;
+ @end
+
+ @implementation MyClass
+ - (void) fAnnotated {
+ _A = 0; // warn
+ }
+ - (void) fNotAnnotated {
+ _A = 0; // no warn
+ }
+ @end
+
+
+alpha.osx.cocoa.InstanceVariableInvalidation (ObjC)
+"""""""""""""""""""""""""""""""""""""""""""""""""""
+Check that the invalidatable instance variables are
+invalidated in the methods annotated with objc_instance_variable_invalidator.
+
+.. code-block:: objc
+
+ @protocol Invalidation
+ - (void) invalidate
+ __attribute__((annotate("objc_instance_variable_invalidator")));
+ @end
+
+ @interface InvalidationImpObj : NSObject
+ @end
+
+ @interface SubclassInvalidationImpObj : InvalidationImpObj {
+ InvalidationImpObj *var;
+ }
+ - (void)invalidate;
+ @end
+
+ @implementation SubclassInvalidationImpObj
+ - (void) invalidate {}
+ @end
+ // warn: var needs to be invalidated or set to nil
+
+alpha.osx.cocoa.MissingInvalidationMethod (ObjC)
+""""""""""""""""""""""""""""""""""""""""""""""""
+Check that the invalidation methods are present in classes that contain invalidatable instance variables.
+
+.. code-block:: objc
+
+ @protocol Invalidation
+ - (void)invalidate
+ __attribute__((annotate("objc_instance_variable_invalidator")));
+ @end
+
+ @interface NeedInvalidation : NSObject
+ @end
+
+ @interface MissingInvalidationMethodDecl : NSObject {
+ NeedInvalidation *Var; // warn
+ }
+ @end
+
+ @implementation MissingInvalidationMethodDecl
+ @end
+
+alpha.osx.cocoa.localizability.PluralMisuseChecker (ObjC)
+"""""""""""""""""""""""""""""""""""""""""""""""""""""""""
+Warns against using one vs. many plural pattern in code when generating localized strings.
+
+.. code-block:: objc
+
+ NSString *reminderText =
+ NSLocalizedString(@"None", @"Indicates no reminders");
+ if (reminderCount == 1) {
+ // Warning: Plural cases are not supported across all languages.
+ // Use a .stringsdict file instead
+ reminderText =
+ NSLocalizedString(@"1 Reminder", @"Indicates single reminder");
+ } else if (reminderCount >= 2) {
+ // Warning: Plural cases are not supported across all languages.
+ // Use a .stringsdict file instead
+ reminderText =
+ [NSString stringWithFormat:
+ NSLocalizedString(@"%@ Reminders", @"Indicates multiple reminders"),
+ reminderCount];
+ }
+
+alpha.security
+^^^^^^^^^^^^^^
+alpha.security.ArrayBound (C)
+"""""""""""""""""""""""""""""
+Warn about buffer overflows (older checker).
+
+.. code-block:: c
+
+ void test() {
+ char *s = "";
+ char c = s[1]; // warn
+ }
+
+ struct seven_words {
+ int c[7];
+ };
+
+ void test() {
+ struct seven_words a, *p;
+ p = &a;
+ p[0] = a;
+ p[1] = a;
+ p[2] = a; // warn
+ }
+
+ // note: requires unix.Malloc or
+ // alpha.unix.MallocWithAnnotations checks enabled.
+ void test() {
+ int *p = malloc(12);
+ p[3] = 4; // warn
+ }
+
+ void test() {
+ char a[2];
+ int *b = (int*)a;
+ b[1] = 3; // warn
+ }
+
+alpha.security.ArrayBoundV2 (C)
+"""""""""""""""""""""""""""""""
+Warn about buffer overflows (newer checker).
+
+.. code-block:: c
+
+ void test() {
+ char *s = "";
+ char c = s[1]; // warn
+ }
+
+ void test() {
+ int buf[100];
+ int *p = buf;
+ p = p + 99;
+ p[1] = 1; // warn
+ }
+
+ // note: compiler has internal check for this.
+ // Use -Wno-array-bounds to suppress compiler warning.
+ void test() {
+ int buf[100][100];
+ buf[0][-1] = 1; // warn
+ }
+
+ // note: requires alpha.security.taint check turned on.
+ void test() {
+ char s[] = "abc";
+ int x = getchar();
+ char c = s[x]; // warn: index is tainted
+ }
+
+alpha.security.MallocOverflow (C)
+"""""""""""""""""""""""""""""""""
+Check for overflows in the arguments to malloc().
+
+.. code-block:: c
+
+ void test(int n) {
+ void *p = malloc(n * sizeof(int)); // warn
+ }
+
+alpha.security.MmapWriteExec (C)
+""""""""""""""""""""""""""""""""
+Warn on mmap() calls that are both writable and executable.
+
+.. code-block:: c
+
+ void test(int n) {
+ void *c = mmap(NULL, 32, PROT_READ | PROT_WRITE | PROT_EXEC,
+ MAP_PRIVATE | MAP_ANON, -1, 0);
+ // warn: Both PROT_WRITE and PROT_EXEC flags are set. This can lead to
+ // exploitable memory regions, which could be overwritten with malicious
+ // code
+ }
+
+alpha.security.ReturnPtrRange (C)
+"""""""""""""""""""""""""""""""""
+Check for an out-of-bound pointer being returned to callers.
+
+.. code-block:: c
+
+ static int A[10];
+
+ int *test() {
+ int *p = A + 10;
+ return p; // warn
+ }
+
+ int test(void) {
+ int x;
+ return x; // warn: undefined or garbage returned
+ }
+
+alpha.security.taint.TaintPropagation (C, C++)
+""""""""""""""""""""""""""""""""""""""""""""""
+Generate taint information used by other checkers.
+A data is tainted when it comes from an unreliable source.
+
+.. code-block:: c
+
+ void test() {
+ char x = getchar(); // 'x' marked as tainted
+ system(&x); // warn: untrusted data is passed to a system call
+ }
+
+ // note: compiler internally checks if the second param to
+ // sprintf is a string literal or not.
+ // Use -Wno-format-security to suppress compiler warning.
+ void test() {
+ char s[10], buf[10];
+ fscanf(stdin, "%s", s); // 's' marked as tainted
+
+ sprintf(buf, s); // warn: untrusted data as a format string
+ }
+
+ void test() {
+ size_t ts;
+ scanf("%zd", &ts); // 'ts' marked as tainted
+ int *p = (int *)malloc(ts * sizeof(int));
+ // warn: untrusted data as buffer size
+ }
+
+alpha.unix
+^^^^^^^^^^^
+
+alpha.unix.BlockInCriticalSection (C)
+"""""""""""""""""""""""""""""""""""""
+Check for calls to blocking functions inside a critical section.
+Applies to: ``lock, unlock, sleep, getc, fgets, read, recv, pthread_mutex_lock,``
+`` pthread_mutex_unlock, mtx_lock, mtx_timedlock, mtx_trylock, mtx_unlock, lock_guard, unique_lock``
+
+.. code-block:: c
+
+ void test() {
+ std::mutex m;
+ m.lock();
+ sleep(3); // warn: a blocking function sleep is called inside a critical
+ // section
+ m.unlock();
+ }
+
+alpha.unix.Chroot (C)
+"""""""""""""""""""""
+Check improper use of chroot.
+
+.. code-block:: c
+
+ void f();
+
+ void test() {
+ chroot("/usr/local");
+ f(); // warn: no call of chdir("/") immediately after chroot
+ }
+
+alpha.unix.PthreadLock (C)
+""""""""""""""""""""""""""
+Simple lock -> unlock checker.
+Applies to: ``pthread_mutex_lock, pthread_rwlock_rdlock, pthread_rwlock_wrlock, lck_mtx_lock, lck_rw_lock_exclusive``
+``lck_rw_lock_shared, pthread_mutex_trylock, pthread_rwlock_tryrdlock, pthread_rwlock_tryrwlock, lck_mtx_try_lock,
+lck_rw_try_lock_exclusive, lck_rw_try_lock_shared, pthread_mutex_unlock, pthread_rwlock_unlock, lck_mtx_unlock, lck_rw_done``.
+
+
+.. code-block:: c
+
+ pthread_mutex_t mtx;
+
+ void test() {
+ pthread_mutex_lock(&mtx);
+ pthread_mutex_lock(&mtx);
+ // warn: this lock has already been acquired
+ }
+
+ lck_mtx_t lck1, lck2;
+
+ void test() {
+ lck_mtx_lock(&lck1);
+ lck_mtx_lock(&lck2);
+ lck_mtx_unlock(&lck1);
+ // warn: this was not the most recently acquired lock
+ }
+
+ lck_mtx_t lck1, lck2;
+
+ void test() {
+ if (lck_mtx_try_lock(&lck1) == 0)
+ return;
+
+ lck_mtx_lock(&lck2);
+ lck_mtx_unlock(&lck1);
+ // warn: this was not the most recently acquired lock
+ }
+
+alpha.unix.SimpleStream (C)
+"""""""""""""""""""""""""""
+Check for misuses of stream APIs. Check for misuses of stream APIs: ``fopen, fclose``
+(demo checker, the subject of the demo (`Slides `_ ,
+`Video `_) by Anna Zaks and Jordan Rose presented at the
+`2012 LLVM Developers' Meeting `_).
+
+.. code-block:: c
+
+ void test() {
+ FILE *F = fopen("myfile.txt", "w");
+ } // warn: opened file is never closed
+
+ void test() {
+ FILE *F = fopen("myfile.txt", "w");
+
+ if (F)
+ fclose(F);
+
+ fclose(F); // warn: closing a previously closed file stream
+ }
+
+alpha.unix.Stream (C)
+"""""""""""""""""""""
+Check stream handling functions: ``fopen, tmpfile, fclose, fread, fwrite, fseek, ftell, rewind, fgetpos,``
+``fsetpos, clearerr, feof, ferror, fileno``.
+
+.. code-block:: c
+
+ void test() {
+ FILE *p = fopen("foo", "r");
+ } // warn: opened file is never closed
+
+ void test() {
+ FILE *p = fopen("foo", "r");
+ fseek(p, 1, SEEK_SET); // warn: stream pointer might be NULL
+ fclose(p);
+ }
+
+ void test() {
+ FILE *p = fopen("foo", "r");
+
+ if (p)
+ fseek(p, 1, 3);
+ // warn: third arg should be SEEK_SET, SEEK_END, or SEEK_CUR
+
+ fclose(p);
+ }
+
+ void test() {
+ FILE *p = fopen("foo", "r");
+ fclose(p);
+ fclose(p); // warn: already closed
+ }
+
+ void test() {
+ FILE *p = tmpfile();
+ ftell(p); // warn: stream pointer might be NULL
+ fclose(p);
+ }
+
+
+alpha.unix.cstring.BufferOverlap (C)
+""""""""""""""""""""""""""""""""""""
+Checks for overlap in two buffer arguments. Applies to: ``memcpy, mempcpy``.
+
+.. code-block:: c
+
+ void test() {
+ int a[4] = {0};
+ memcpy(a + 2, a + 1, 8); // warn
+ }
+
+alpha.unix.cstring.NotNullTerminated (C)
+""""""""""""""""""""""""""""""""""""""""
+Check for arguments which are not null-terminated strings; applies to: ``strlen, strnlen, strcpy, strncpy, strcat, strncat``.
+
+.. code-block:: c
+
+ void test() {
+ int y = strlen((char *)&test); // warn
+ }
+
+alpha.unix.cstring.OutOfBounds (C)
+""""""""""""""""""""""""""""""""""
+Check for out-of-bounds access in string functions; applies to:`` strncopy, strncat``.
+
+
+.. code-block:: c
+
+ void test() {
+ int y = strlen((char *)&test); // warn
+ }
+
+
+Debug Checkers
+---------------
+
+.. _debug-checkers:
+
+
+debug
+^^^^^
+
+Checkers used for debugging the analyzer.
+:doc:`DebugChecks` page contains a detailed description.
+
+debug.AnalysisOrder
+"""""""""""""""""""
+Print callbacks that are called during analysis in order.
+
+debug.ConfigDumper
+""""""""""""""""""
+Dump config table.
+
+debug.DumpCFG Display
+"""""""""""""""""""""
+Control-Flow Graphs.
+
+debug.DumpCallGraph
+"""""""""""""""""""
+Display Call Graph.
+
+debug.DumpCalls
+"""""""""""""""
+Print calls as they are traversed by the engine.
+
+debug.DumpDominators
+""""""""""""""""""""
+Print the dominance tree for a given CFG.
+
+debug.DumpLiveVars
+""""""""""""""""""
+Print results of live variable analysis.
+
+debug.DumpTraversal
+"""""""""""""""""""
+Print branch conditions as they are traversed by the engine.
+
+debug.ExprInspection
+""""""""""""""""""""
+Check the analyzer's understanding of expressions.
+
+debug.Stats
+"""""""""""
+Emit warnings with analyzer statistics.
+
+debug.TaintTest
+"""""""""""""""
+Mark tainted symbols as such.
+
+debug.ViewCFG
+"""""""""""""
+View Control-Flow Graphs using GraphViz.
+
+debug.ViewCallGraph
+"""""""""""""""""""
+View Call Graph using GraphViz.
+
+debug.ViewExplodedGraph
+"""""""""""""""""""""""
+View Exploded Graphs using GraphViz.
+
Index: docs/analyzer/checkers/callandmessage_example.c
===================================================================
--- docs/analyzer/checkers/callandmessage_example.c
+++ docs/analyzer/checkers/callandmessage_example.c
@@ -0,0 +1,66 @@
+//C
+void test() {
+ void (*foo)(void);
+ foo = 0;
+ foo(); // warn: function pointer is null
+ }
+
+ // C++
+ class C {
+ public:
+ void f();
+ };
+
+ void test() {
+ C *pc;
+ pc->f(); // warn: object pointer is uninitialized
+ }
+
+ // C++
+ class C {
+ public:
+ void f();
+ };
+
+ void test() {
+ C *pc = 0;
+ pc->f(); // warn: object pointer is null
+ }
+
+ // Objective-C
+ @interface MyClass : NSObject
+ @property (readwrite,assign) id x;
+ - (long double)longDoubleM;
+ @end
+
+ void test() {
+ MyClass *obj1;
+ long double ld1 = [obj1 longDoubleM];
+ // warn: receiver is uninitialized
+ }
+
+ // Objective-C
+ @interface MyClass : NSObject
+ @property (readwrite,assign) id x;
+ - (long double)longDoubleM;
+ @end
+
+ void test() {
+ MyClass *obj1;
+ id i = obj1.x; // warn: uninitialized object pointer
+ }
+
+ // Objective-C
+ @interface Subscriptable : NSObject
+ - (id)objectAtIndexedSubscript:(unsigned int)index;
+ @end
+
+ @interface MyClass : Subscriptable
+ @property (readwrite,assign) id x;
+ - (long double)longDoubleM;
+ @end
+
+ void test() {
+ MyClass *obj1;
+ id i = obj1[0]; // warn: uninitialized object pointer
+ }
Index: docs/analyzer/checkers/dealloc_example.m
===================================================================
--- docs/analyzer/checkers/dealloc_example.m
+++ docs/analyzer/checkers/dealloc_example.m
@@ -0,0 +1,49 @@
+
+
+@interface MyObject : NSObject {
+ id _myproperty;
+}
+@end
+
+@implementation MyObject // warn: lacks 'dealloc'
+@end
+
+@interface MyObject : NSObject {}
+@property(assign) id myproperty;
+@end
+
+@implementation MyObject // warn: does not send 'dealloc' to super
+- (void)dealloc {
+ self.myproperty = 0;
+}
+@end
+
+@interface MyObject : NSObject {
+ id _myproperty;
+}
+@property(retain) id myproperty;
+@end
+
+@implementation MyObject
+@synthesize myproperty = _myproperty;
+ // warn: var was retained but wasn't released
+- (void)dealloc {
+ [super dealloc];
+}
+@end
+
+@interface MyObject : NSObject {
+ id _myproperty;
+}
+@property(assign) id myproperty;
+@end
+
+@implementation MyObject
+@synthesize myproperty = _myproperty;
+ // warn: var wasn't retained but was released
+- (void)dealloc {
+ [_myproperty release];
+ [super dealloc];
+}
+@end
+
Index: docs/analyzer/checkers/dividezero_example.c
===================================================================
--- docs/analyzer/checkers/dividezero_example.c
+++ docs/analyzer/checkers/dividezero_example.c
@@ -0,0 +1,9 @@
+void test(int z) {
+ if (z == 0)
+ int x = 1 / z; // warn
+}
+
+void test() {
+ int x = 1;
+ int y = x % 0; // warn
+}
Index: docs/analyzer/checkers/mismatched_deallocator_example.cpp
===================================================================
--- docs/analyzer/checkers/mismatched_deallocator_example.cpp
+++ docs/analyzer/checkers/mismatched_deallocator_example.cpp
@@ -0,0 +1,56 @@
+// C, C++
+void test() {
+ int *p = (int *)malloc(sizeof(int));
+ delete p; // warn
+}
+
+// C, C++
+void __attribute((ownership_returns(malloc))) *user_malloc(size_t);
+
+void test() {
+ int *p = (int *)user_malloc(sizeof(int));
+ delete p; // warn
+}
+
+// C, C++
+void test() {
+ int *p = new int;
+ free(p); // warn
+}
+
+// C, C++
+void test() {
+ int *p = new int[1];
+ realloc(p, sizeof(long)); // warn
+}
+
+// C, C++
+template
+struct SimpleSmartPointer {
+ T *ptr;
+
+ explicit SimpleSmartPointer(T *p = 0) : ptr(p) {}
+ ~SimpleSmartPointer() {
+ delete ptr; // warn
+ }
+};
+
+void test() {
+ SimpleSmartPointer a((int *)malloc(4));
+}
+
+// C++
+void test() {
+ int *p = (int *)operator new(0);
+ delete[] p; // warn
+}
+
+// Objective-C, C++
+void test(NSUInteger dataLength) {
+ int *p = new int;
+ NSData *d = [NSData dataWithBytesNoCopy:p
+ length:sizeof(int) freeWhenDone:1];
+ // warn +dataWithBytesNoCopy:length:freeWhenDone: cannot take
+ // ownership of memory allocated by 'new'
+}
+
Index: docs/analyzer/checkers/newdelete_example.cpp
===================================================================
--- docs/analyzer/checkers/newdelete_example.cpp
+++ docs/analyzer/checkers/newdelete_example.cpp
@@ -0,0 +1,41 @@
+void f(int *p);
+
+void testUseMiddleArgAfterDelete(int *p) {
+ delete p;
+ f(p); // warn: use after free
+}
+
+class SomeClass {
+public:
+ void f();
+};
+
+void test() {
+ SomeClass *c = new SomeClass;
+ delete c;
+ c->f(); // warn: use after free
+}
+
+void test() {
+ int *p = (int *)__builtin_alloca(sizeof(int));
+ delete p; // warn: deleting memory allocated by alloca
+}
+
+void test() {
+ int *p = new int;
+ delete p;
+ delete p; // warn: attempt to free released
+}
+
+void test() {
+ int i;
+ delete &i; // warn: delete address of local
+}
+
+void test() {
+ int *p = new int[1];
+ delete[] (++p);
+ // warn: argument to 'delete[]' is offset by 4 bytes
+ // from the start of memory allocated by 'new[]'
+}
+
Index: docs/analyzer/checkers/seckeychainapi_example.m
===================================================================
--- docs/analyzer/checkers/seckeychainapi_example.m
+++ docs/analyzer/checkers/seckeychainapi_example.m
@@ -0,0 +1,64 @@
+
+
+void test() {
+ unsigned int *ptr = 0;
+ UInt32 length;
+
+ SecKeychainItemFreeContent(ptr, &length);
+ // warn: trying to free data which has not been allocated
+}
+
+void test() {
+ unsigned int *ptr = 0;
+ UInt32 *length = 0;
+ void *outData;
+
+ OSStatus st =
+ SecKeychainItemCopyContent(2, ptr, ptr, length, outData);
+ // warn: data is not released
+}
+
+void test() {
+ unsigned int *ptr = 0;
+ UInt32 *length = 0;
+ void *outData;
+
+ OSStatus st =
+ SecKeychainItemCopyContent(2, ptr, ptr, length, &outData);
+
+ SecKeychainItemFreeContent(ptr, outData);
+ // warn: only call free if a non-NULL buffer was returned
+}
+
+void test() {
+ unsigned int *ptr = 0;
+ UInt32 *length = 0;
+ void *outData;
+
+ OSStatus st =
+ SecKeychainItemCopyContent(2, ptr, ptr, length, &outData);
+
+ st = SecKeychainItemCopyContent(2, ptr, ptr, length, &outData);
+ // warn: release data before another call to the allocator
+
+ if (st == noErr)
+ SecKeychainItemFreeContent(ptr, outData);
+}
+
+void test() {
+ SecKeychainItemRef itemRef = 0;
+ SecKeychainAttributeInfo *info = 0;
+ SecItemClass *itemClass = 0;
+ SecKeychainAttributeList *attrList = 0;
+ UInt32 *length = 0;
+ void *outData = 0;
+
+ OSStatus st =
+ SecKeychainItemCopyAttributesAndData(itemRef, info,
+ itemClass, &attrList,
+ length, &outData);
+
+ SecKeychainItemFreeContent(attrList, outData);
+ // warn: deallocator doesn't match the allocator
+}
+
Index: docs/analyzer/checkers/unix_api_example.c
===================================================================
--- docs/analyzer/checkers/unix_api_example.c
+++ docs/analyzer/checkers/unix_api_example.c
@@ -0,0 +1,37 @@
+
+// Currently the check is performed for apple targets only.
+void test(const char *path) {
+ int fd = open(path, O_CREAT);
+ // warn: call to 'open' requires a third argument when the
+ // 'O_CREAT' flag is set
+}
+
+void f();
+
+void test() {
+ pthread_once_t pred = {0x30B1BCBA, {0}};
+ pthread_once(&pred, f);
+ // warn: call to 'pthread_once' uses the local variable
+}
+
+void test() {
+ void *p = malloc(0); // warn: allocation size of 0 bytes
+}
+
+void test() {
+ void *p = calloc(0, 42); // warn: allocation size of 0 bytes
+}
+
+void test() {
+ void *p = malloc(1);
+ p = realloc(p, 0); // warn: allocation size of 0 bytes
+}
+
+void test() {
+ void *p = alloca(0); // warn: allocation size of 0 bytes
+}
+
+void test() {
+ void *p = valloc(0); // warn: allocation size of 0 bytes
+}
+
Index: docs/analyzer/checkers/unix_malloc_example.c
===================================================================
--- docs/analyzer/checkers/unix_malloc_example.c
+++ docs/analyzer/checkers/unix_malloc_example.c
@@ -0,0 +1,30 @@
+
+void test() {
+ int *p = malloc(1);
+ free(p);
+ free(p); // warn: attempt to free released memory
+}
+
+void test() {
+ int *p = malloc(sizeof(int));
+ free(p);
+ *p = 1; // warn: use after free
+}
+
+void test() {
+ int *p = malloc(1);
+ if (p)
+ return; // warn: memory is never released
+}
+
+void test() {
+ int a[] = { 1 };
+ free(a); // warn: argument is not allocated by malloc
+}
+
+void test() {
+ int *p = malloc(sizeof(char));
+ p = p - 1;
+ free(p); // warn: argument to free() is offset by -4 bytes
+}
+
Index: docs/analyzer/developer-docs.rst
===================================================================
--- docs/analyzer/developer-docs.rst
+++ docs/analyzer/developer-docs.rst
@@ -0,0 +1,14 @@
+Developer Docs
+==============
+
+Contents:
+
+.. toctree::
+ :maxdepth: 2
+
+ developer-docs/DebugChecks
+ developer-docs/IPA
+ developer-docs/InitializerLists
+ developer-docs/nullability
+ developer-docs/RegionStore
+
\ No newline at end of file
Index: docs/analyzer/developer-docs/DebugChecks.rst
===================================================================
--- docs/analyzer/developer-docs/DebugChecks.rst
+++ docs/analyzer/developer-docs/DebugChecks.rst
@@ -0,0 +1,287 @@
+============
+Debug Checks
+============
+
+.. contents::
+ :local:
+
+The analyzer contains a number of checkers which can aid in debugging. Enable
+them by using the "-analyzer-checker=" flag, followed by the name of the
+checker.
+
+
+General Analysis Dumpers
+========================
+
+These checkers are used to dump the results of various infrastructural analyses
+to stderr. Some checkers also have "view" variants, which will display a graph
+using a 'dot' format viewer (such as Graphviz on OS X) instead.
+
+- debug.DumpCallGraph, debug.ViewCallGraph: Show the call graph generated for
+ the current translation unit. This is used to determine the order in which to
+ analyze functions when inlining is enabled.
+
+- debug.DumpCFG, debug.ViewCFG: Show the CFG generated for each top-level
+ function being analyzed.
+
+- debug.DumpDominators: Shows the dominance tree for the CFG of each top-level
+ function.
+
+- debug.DumpLiveVars: Show the results of live variable analysis for each
+ top-level function being analyzed.
+
+- debug.DumpLiveStmts: Show the results of live statement analysis for each
+ top-level function being analyzed.
+
+- debug.ViewExplodedGraph: Show the Exploded Graphs generated for the
+ analysis of different functions in the input translation unit. When there
+ are several functions analyzed, display one graph per function. Beware
+ that these graphs may grow very large, even for small functions.
+
+Path Tracking
+=============
+
+These checkers print information about the path taken by the analyzer engine.
+
+- debug.DumpCalls: Prints out every function or method call encountered during a
+ path traversal. This is indented to show the call stack, but does NOT do any
+ special handling of branches, meaning different paths could end up
+ interleaved.
+
+- debug.DumpTraversal: Prints the name of each branch statement encountered
+ during a path traversal ("IfStmt", "WhileStmt", etc). Currently used to check
+ whether the analysis engine is doing BFS or DFS.
+
+
+State Checking
+==============
+
+These checkers will print out information about the analyzer state in the form
+of analysis warnings. They are intended for use with the -verify functionality
+in regression tests.
+
+- debug.TaintTest: Prints out the word "tainted" for every expression that
+ carries taint. At the time of this writing, taint was only introduced by the
+ checks under experimental.security.taint.TaintPropagation; this checker may
+ eventually move to the security.taint package.
+
+- debug.ExprInspection: Responds to certain function calls, which are modeled
+ after builtins. These function calls should affect the program state other
+ than the evaluation of their arguments; to use them, you will need to declare
+ them within your test file. The available functions are described below.
+
+(FIXME: debug.ExprInspection should probably be renamed, since it no longer only
+inspects expressions.)
+
+
+ExprInspection checks
+---------------------
+
+- ``void clang_analyzer_eval(bool);``
+
+ Prints TRUE if the argument is known to have a non-zero value, FALSE if the
+ argument is known to have a zero or null value, and UNKNOWN if the argument
+ isn't sufficiently constrained on this path. You can use this to test other
+ values by using expressions like "x == 5". Note that this functionality is
+ currently DISABLED in inlined functions, since different calls to the same
+ inlined function could provide different information, making it difficult to
+ write proper -verify directives.
+
+ In C, the argument can be typed as 'int' or as '_Bool'.
+
+ Example usage::
+
+ clang_analyzer_eval(x); // expected-warning{{UNKNOWN}}
+ if (!x) return;
+ clang_analyzer_eval(x); // expected-warning{{TRUE}}
+
+
+- ``void clang_analyzer_checkInlined(bool);``
+
+ If a call occurs within an inlined function, prints TRUE or FALSE according to
+ the value of its argument. If a call occurs outside an inlined function,
+ nothing is printed.
+
+ The intended use of this checker is to assert that a function is inlined at
+ least once (by passing 'true' and expecting a warning), or to assert that a
+ function is never inlined (by passing 'false' and expecting no warning). The
+ argument is technically unnecessary but is intended to clarify intent.
+
+ You might wonder why we can't print TRUE if a function is ever inlined and
+ FALSE if it is not. The problem is that any inlined function could conceivably
+ also be analyzed as a top-level function (in which case both TRUE and FALSE
+ would be printed), depending on the value of the -analyzer-inlining option.
+
+ In C, the argument can be typed as 'int' or as '_Bool'.
+
+ Example usage::
+
+ int inlined() {
+ clang_analyzer_checkInlined(true); // expected-warning{{TRUE}}
+ return 42;
+ }
+
+ void topLevel() {
+ clang_analyzer_checkInlined(false); // no-warning (not inlined)
+ int value = inlined();
+ // This assertion will not be valid if the previous call was not inlined.
+ clang_analyzer_eval(value == 42); // expected-warning{{TRUE}}
+ }
+
+- ``void clang_analyzer_warnIfReached();``
+
+ Generate a warning if this line of code gets reached by the analyzer.
+
+ Example usage::
+
+ if (true) {
+ clang_analyzer_warnIfReached(); // expected-warning{{REACHABLE}}
+ }
+ else {
+ clang_analyzer_warnIfReached(); // no-warning
+ }
+
+- ``void clang_analyzer_numTimesReached();``
+
+ Same as above, but include the number of times this call expression
+ gets reached by the analyzer during the current analysis.
+
+ Example usage::
+
+ for (int x = 0; x < 3; ++x) {
+ clang_analyzer_numTimesReached(); // expected-warning{{3}}
+ }
+
+- ``void clang_analyzer_warnOnDeadSymbol(int);``
+
+ Subscribe for a delayed warning when the symbol that represents the value of
+ the argument is garbage-collected by the analyzer.
+
+ When calling 'clang_analyzer_warnOnDeadSymbol(x)', if value of 'x' is a
+ symbol, then this symbol is marked by the ExprInspection checker. Then,
+ during each garbage collection run, the checker sees if the marked symbol is
+ being collected and issues the 'SYMBOL DEAD' warning if it does.
+ This way you know where exactly, up to the line of code, the symbol dies.
+
+ It is unlikely that you call this function after the symbol is already dead,
+ because the very reference to it as the function argument prevents it from
+ dying. However, if the argument is not a symbol but a concrete value,
+ no warning would be issued.
+
+ Example usage::
+
+ do {
+ int x = generate_some_integer();
+ clang_analyzer_warnOnDeadSymbol(x);
+ } while(0); // expected-warning{{SYMBOL DEAD}}
+
+
+- ``void clang_analyzer_explain(a single argument of any type);``
+
+ This function explains the value of its argument in a human-readable manner
+ in the warning message. You can make as many overrides of its prototype
+ in the test code as necessary to explain various integral, pointer,
+ or even record-type values. To simplify usage in C code (where overloading
+ the function declaration is not allowed), you may append an arbitrary suffix
+ to the function name, without affecting functionality.
+
+ Example usage::
+
+ void clang_analyzer_explain(int);
+ void clang_analyzer_explain(void *);
+
+ // Useful in C code
+ void clang_analyzer_explain_int(int);
+
+ void foo(int param, void *ptr) {
+ clang_analyzer_explain(param); // expected-warning{{argument 'param'}}
+ clang_analyzer_explain_int(param); // expected-warning{{argument 'param'}}
+ if (!ptr)
+ clang_analyzer_explain(ptr); // expected-warning{{memory address '0'}}
+ }
+
+- ``void clang_analyzer_dump( /* a single argument of any type */);``
+
+ Similar to clang_analyzer_explain, but produces a raw dump of the value,
+ same as SVal::dump().
+
+ Example usage::
+
+ void clang_analyzer_dump(int);
+ void foo(int x) {
+ clang_analyzer_dump(x); // expected-warning{{reg_$0}}
+ }
+
+- ``size_t clang_analyzer_getExtent(void *);``
+
+ This function returns the value that represents the extent of a memory region
+ pointed to by the argument. This value is often difficult to obtain otherwise,
+ because no valid code that produces this value. However, it may be useful
+ for testing purposes, to see how well does the analyzer model region extents.
+
+ Example usage::
+
+ void foo() {
+ int x, *y;
+ size_t xs = clang_analyzer_getExtent(&x);
+ clang_analyzer_explain(xs); // expected-warning{{'4'}}
+ size_t ys = clang_analyzer_getExtent(&y);
+ clang_analyzer_explain(ys); // expected-warning{{'8'}}
+ }
+
+- ``void clang_analyzer_printState();``
+
+ Dumps the current ProgramState to the stderr. Quickly lookup the program state
+ at any execution point without ViewExplodedGraph or re-compiling the program.
+ This is not very useful for writing tests (apart from testing how ProgramState
+ gets printed), but useful for debugging tests. Also, this method doesn't
+ produce a warning, so it gets printed on the console before all other
+ ExprInspection warnings.
+
+ Example usage::
+
+ void foo() {
+ int x = 1;
+ clang_analyzer_printState(); // Read the stderr!
+ }
+
+- ``void clang_analyzer_hashDump(int);``
+
+ The analyzer can generate a hash to identify reports. To debug what information
+ is used to calculate this hash it is possible to dump the hashed string as a
+ warning of an arbitrary expression using the function above.
+
+ Example usage::
+
+ void foo() {
+ int x = 1;
+ clang_analyzer_hashDump(x); // expected-warning{{hashed string for x}}
+ }
+
+- ``void clang_analyzer_denote(int, const char *);``
+
+ Denotes symbols with strings. A subsequent call to clang_analyzer_express()
+ will expresses another symbol in terms of these string. Useful for testing
+ relationships between different symbols.
+
+ Example usage::
+
+ void foo(int x) {
+ clang_analyzer_denote(x, "$x");
+ clang_analyzer_express(x + 1); // expected-warning{{$x + 1}}
+ }
+
+- ``void clang_analyzer_express(int);``
+
+ See clang_analyzer_denote().
+
+Statistics
+==========
+
+The debug.Stats checker collects various information about the analysis of each
+function, such as how many blocks were reached and if the analyzer timed out.
+
+There is also an additional -analyzer-stats flag, which enables various
+statistics within the analyzer engine. Note the Stats checker (which produces at
+least one bug report per function) may actually change the values reported by
+-analyzer-stats.
Index: docs/analyzer/developer-docs/IPA.rst
===================================================================
--- docs/analyzer/developer-docs/IPA.rst
+++ docs/analyzer/developer-docs/IPA.rst
@@ -0,0 +1,396 @@
+Inlining
+========
+
+There are several options that control which calls the analyzer will consider for
+inlining. The major one is ``-analyzer-config ipa``:
+
+* ``analyzer-config ipa=none`` - All inlining is disabled. This is the only mode
+ available in LLVM 3.1 and earlier and in Xcode 4.3 and earlier.
+
+* ``analyzer-config ipa=basic-inlining`` - Turns on inlining for C functions, C++
+ static member functions, and blocks -- essentially, the calls that behave
+ like simple C function calls. This is essentially the mode used in
+ Xcode 4.4.
+
+* ``analyzer-config ipa=inlining`` - Turns on inlining when we can confidently find
+ the function/method body corresponding to the call. (C functions, static
+ functions, devirtualized C++ methods, Objective-C class methods, Objective-C
+ instance methods when ExprEngine is confident about the dynamic type of the
+ instance).
+
+* ``analyzer-config ipa=dynamic`` - Inline instance methods for which the type is
+ determined at runtime and we are not 100% sure that our type info is
+ correct. For virtual calls, inline the most plausible definition.
+
+* ``analyzer-config ipa=dynamic-bifurcate`` - Same as -analyzer-config ipa=dynamic,
+ but the path is split. We inline on one branch and do not inline on the
+ other. This mode does not drop the coverage in cases when the parent class
+ has code that is only exercised when some of its methods are overridden.
+
+Currently, ``-analyzer-config ipa=dynamic-bifurcate`` is the default mode.
+
+While ``-analyzer-config ipa`` determines in general how aggressively the analyzer
+will try to inline functions, several additional options control which types of
+functions can inlined, in an all-or-nothing way. These options use the
+analyzer's configuration table, so they are all specified as follows:
+
+ ``-analyzer-config OPTION=VALUE``
+
+c++-inlining
+------------
+
+This option controls which C++ member functions may be inlined.
+
+ ``-analyzer-config c++-inlining=[none | methods | constructors | destructors]``
+
+Each of these modes implies that all the previous member function kinds will be
+inlined as well; it doesn't make sense to inline destructors without inlining
+constructors, for example.
+
+The default c++-inlining mode is 'destructors', meaning that all member
+functions with visible definitions will be considered for inlining. In some
+cases the analyzer may still choose not to inline the function.
+
+Note that under 'constructors', constructors for types with non-trivial
+destructors will not be inlined. Additionally, no C++ member functions will be
+inlined under -analyzer-config ipa=none or -analyzer-config ipa=basic-inlining,
+regardless of the setting of the c++-inlining mode.
+
+c++-template-inlining
+^^^^^^^^^^^^^^^^^^^^^
+
+This option controls whether C++ templated functions may be inlined.
+
+ ``-analyzer-config c++-template-inlining=[true | false]``
+
+Currently, template functions are considered for inlining by default.
+
+The motivation behind this option is that very generic code can be a source
+of false positives, either by considering paths that the caller considers
+impossible (by some unstated precondition), or by inlining some but not all
+of a deep implementation of a function.
+
+c++-stdlib-inlining
+^^^^^^^^^^^^^^^^^^^
+
+This option controls whether functions from the C++ standard library, including
+methods of the container classes in the Standard Template Library, should be
+considered for inlining.
+
+ ``-analyzer-config c++-stdlib-inlining=[true | false]``
+
+Currently, C++ standard library functions are considered for inlining by
+default.
+
+The standard library functions and the STL in particular are used ubiquitously
+enough that our tolerance for false positives is even lower here. A false
+positive due to poor modeling of the STL leads to a poor user experience, since
+most users would not be comfortable adding assertions to system headers in order
+to silence analyzer warnings.
+
+c++-container-inlining
+^^^^^^^^^^^^^^^^^^^^^^
+
+This option controls whether constructors and destructors of "container" types
+should be considered for inlining.
+
+ ``-analyzer-config c++-container-inlining=[true | false]``
+
+Currently, these constructors and destructors are NOT considered for inlining
+by default.
+
+The current implementation of this setting checks whether a type has a member
+named 'iterator' or a member named 'begin'; these names are idiomatic in C++,
+with the latter specified in the C++11 standard. The analyzer currently does a
+fairly poor job of modeling certain data structure invariants of container-like
+objects. For example, these three expressions should be equivalent:
+
+
+.. code-block:: cpp
+
+ std::distance(c.begin(), c.end()) == 0
+ c.begin() == c.end()
+ c.empty()
+
+Many of these issues are avoided if containers always have unknown, symbolic
+state, which is what happens when their constructors are treated as opaque.
+In the future, we may decide specific containers are "safe" to model through
+inlining, or choose to model them directly using checkers instead.
+
+
+Basics of Implementation
+------------------------
+
+The low-level mechanism of inlining a function is handled in
+ExprEngine::inlineCall and ExprEngine::processCallExit.
+
+If the conditions are right for inlining, a CallEnter node is created and added
+to the analysis work list. The CallEnter node marks the change to a new
+LocationContext representing the called function, and its state includes the
+contents of the new stack frame. When the CallEnter node is actually processed,
+its single successor will be a edge to the first CFG block in the function.
+
+Exiting an inlined function is a bit more work, fortunately broken up into
+reasonable steps:
+
+1. The CoreEngine realizes we're at the end of an inlined call and generates a
+ CallExitBegin node.
+
+2. ExprEngine takes over (in processCallExit) and finds the return value of the
+ function, if it has one. This is bound to the expression that triggered the
+ call. (In the case of calls without origin expressions, such as destructors,
+ this step is skipped.)
+
+3. Dead symbols and bindings are cleaned out from the state, including any local
+ bindings.
+
+4. A CallExitEnd node is generated, which marks the transition back to the
+ caller's LocationContext.
+
+5. Custom post-call checks are processed and the final nodes are pushed back
+ onto the work list, so that evaluation of the caller can continue.
+
+Retry Without Inlining
+^^^^^^^^^^^^^^^^^^^^^^
+
+In some cases, we would like to retry analysis without inlining a particular
+call.
+
+Currently, we use this technique to recover coverage in case we stop
+analyzing a path due to exceeding the maximum block count inside an inlined
+function.
+
+When this situation is detected, we walk up the path to find the first node
+before inlining was started and enqueue it on the WorkList with a special
+ReplayWithoutInlining bit added to it (ExprEngine::replayWithoutInlining). The
+path is then re-analyzed from that point without inlining that particular call.
+
+Deciding When to Inline
+^^^^^^^^^^^^^^^^^^^^^^^
+
+In general, the analyzer attempts to inline as much as possible, since it
+provides a better summary of what actually happens in the program. There are
+some cases, however, where the analyzer chooses not to inline:
+
+- If there is no definition available for the called function or method. In
+ this case, there is no opportunity to inline.
+
+- If the CFG cannot be constructed for a called function, or the liveness
+ cannot be computed. These are prerequisites for analyzing a function body,
+ with or without inlining.
+
+- If the LocationContext chain for a given ExplodedNode reaches a maximum cutoff
+ depth. This prevents unbounded analysis due to infinite recursion, but also
+ serves as a useful cutoff for performance reasons.
+
+- If the function is variadic. This is not a hard limitation, but an engineering
+ limitation.
+
+ Tracked by: Support inlining of variadic functions
+
+- In C++, constructors are not inlined unless the destructor call will be
+ processed by the ExprEngine. Thus, if the CFG was built without nodes for
+ implicit destructors, or if the destructors for the given object are not
+ represented in the CFG, the constructor will not be inlined. (As an exception,
+ constructors for objects with trivial constructors can still be inlined.)
+ See "C++ Caveats" below.
+
+- In C++, ExprEngine does not inline custom implementations of operator 'new'
+ or operator 'delete', nor does it inline the constructors and destructors
+ associated with these. See "C++ Caveats" below.
+
+- Calls resulting in "dynamic dispatch" are specially handled. See more below.
+
+- The FunctionSummaries map stores additional information about declarations,
+ some of which is collected at runtime based on previous analyses.
+ We do not inline functions which were not profitable to inline in a different
+ context (for example, if the maximum block count was exceeded; see
+ "Retry Without Inlining").
+
+
+Dynamic Calls and Devirtualization
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+"Dynamic" calls are those that are resolved at runtime, such as C++ virtual
+method calls and Objective-C message sends. Due to the path-sensitive nature of
+the analysis, the analyzer may be able to reason about the dynamic type of the
+object whose method is being called and thus "devirtualize" the call.
+
+This path-sensitive devirtualization occurs when the analyzer can determine what
+method would actually be called at runtime. This is possible when the type
+information is constrained enough for a simulated C++/Objective-C object that
+the analyzer can make such a decision.
+
+DynamicTypeInfo
+^^^^^^^^^^^^^^^
+
+As the analyzer analyzes a path, it may accrue information to refine the
+knowledge about the type of an object. This can then be used to make better
+decisions about the target method of a call.
+
+Such type information is tracked as DynamicTypeInfo. This is path-sensitive
+data that is stored in ProgramState, which defines a mapping from MemRegions to
+an (optional) DynamicTypeInfo.
+
+If no DynamicTypeInfo has been explicitly set for a MemRegion, it will be lazily
+inferred from the region's type or associated symbol. Information from symbolic
+regions is weaker than from true typed regions.
+
+ EXAMPLE: A C++ object declared "A obj" is known to have the class 'A', but a
+ reference "A &ref" may dynamically be a subclass of 'A'.
+
+The DynamicTypePropagation checker gathers and propagates DynamicTypeInfo,
+updating it as information is observed along a path that can refine that type
+information for a region.
+
+ WARNING: Not all of the existing analyzer code has been retrofitted to use
+ DynamicTypeInfo, nor is it universally appropriate. In particular,
+ DynamicTypeInfo always applies to a region with all casts stripped
+ off, but sometimes the information provided by casts can be useful.
+
+
+RuntimeDefinition
+^^^^^^^^^^^^^^^^^
+
+The basis of devirtualization is CallEvent's getRuntimeDefinition() method,
+which returns a RuntimeDefinition object. When asked to provide a definition,
+the CallEvents for dynamic calls will use the DynamicTypeInfo in their
+ProgramState to attempt to devirtualize the call. In the case of no dynamic
+dispatch, or perfectly constrained devirtualization, the resulting
+RuntimeDefinition contains a Decl corresponding to the definition of the called
+function, and RuntimeDefinition::mayHaveOtherDefinitions will return FALSE.
+
+In the case of dynamic dispatch where our information is not perfect, CallEvent
+can make a guess, but RuntimeDefinition::mayHaveOtherDefinitions will return
+TRUE. The RuntimeDefinition object will then also include a MemRegion
+corresponding to the object being called (i.e., the "receiver" in Objective-C
+parlance), which ExprEngine uses to decide whether or not the call should be
+inlined.
+
+Inlining Dynamic Calls
+^^^^^^^^^^^^^^^^^^^^^^
+
+The -analyzer-config ipa option has five different modes: none, basic-inlining,
+inlining, dynamic, and dynamic-bifurcate. Under -analyzer-config ipa=dynamic,
+all dynamic calls are inlined, whether we are certain or not that this will
+actually be the definition used at runtime. Under -analyzer-config ipa=inlining,
+only "near-perfect" devirtualized calls are inlined*, and other dynamic calls
+are evaluated conservatively (as if no definition were available).
+
+* Currently, no Objective-C messages are not inlined under
+ -analyzer-config ipa=inlining, even if we are reasonably confident of the type
+ of the receiver. We plan to enable this once we have tested our heuristics
+ more thoroughly.
+
+The last option, -analyzer-config ipa=dynamic-bifurcate, behaves similarly to
+"dynamic", but performs a conservative invalidation in the general virtual case
+in *addition* to inlining. The details of this are discussed below.
+
+As stated above, -analyzer-config ipa=basic-inlining does not inline any C++
+member functions or Objective-C method calls, even if they are non-virtual or
+can be safely devirtualized.
+
+
+Bifurcation
+^^^^^^^^^^^
+
+ExprEngine::BifurcateCall implements the ``-analyzer-config ipa=dynamic-bifurcate``
+mode.
+
+When a call is made on an object with imprecise dynamic type information
+(RuntimeDefinition::mayHaveOtherDefinitions() evaluates to TRUE), ExprEngine
+bifurcates the path and marks the object's region (retrieved from the
+RuntimeDefinition object) with a path-sensitive "mode" in the ProgramState.
+
+Currently, there are 2 modes:
+
+* ``DynamicDispatchModeInlined`` - Models the case where the dynamic type information
+ of the receiver (MemoryRegion) is assumed to be perfectly constrained so
+ that a given definition of a method is expected to be the code actually
+ called. When this mode is set, ExprEngine uses the Decl from
+ RuntimeDefinition to inline any dynamically dispatched call sent to this
+ receiver because the function definition is considered to be fully resolved.
+
+* ``DynamicDispatchModeConservative`` - Models the case where the dynamic type
+ information is assumed to be incorrect, for example, implies that the method
+ definition is overridden in a subclass. In such cases, ExprEngine does not
+ inline the methods sent to the receiver (MemoryRegion), even if a candidate
+ definition is available. This mode is conservative about simulating the
+ effects of a call.
+
+Going forward along the symbolic execution path, ExprEngine consults the mode
+of the receiver's MemRegion to make decisions on whether the calls should be
+inlined or not, which ensures that there is at most one split per region.
+
+At a high level, "bifurcation mode" allows for increased semantic coverage in
+cases where the parent method contains code which is only executed when the
+class is subclassed. The disadvantages of this mode are a (considerable?)
+performance hit and the possibility of false positives on the path where the
+conservative mode is used.
+
+Objective-C Message Heuristics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+ExprEngine relies on a set of heuristics to partition the set of Objective-C
+method calls into those that require bifurcation and those that do not. Below
+are the cases when the DynamicTypeInfo of the object is considered precise
+(cannot be a subclass):
+
+ - If the object was created with +alloc or +new and initialized with an -init
+ method.
+
+ - If the calls are property accesses using dot syntax. This is based on the
+ assumption that children rarely override properties, or do so in an
+ essentially compatible way.
+
+ - If the class interface is declared inside the main source file. In this case
+ it is unlikely that it will be subclassed.
+
+ - If the method is not declared outside of main source file, either by the
+ receiver's class or by any superclasses.
+
+C++ Caveats
+^^^^^^^^^^^
+
+C++11 [class.cdtor]p4 describes how the vtable of an object is modified as it is
+being constructed or destructed; that is, the type of the object depends on
+which base constructors have been completed. This is tracked using
+DynamicTypeInfo in the DynamicTypePropagation checker.
+
+There are several limitations in the current implementation:
+
+* Temporaries are poorly modeled right now because we're not confident in the
+ placement of their destructors in the CFG. We currently won't inline their
+ constructors unless the destructor is trivial, and don't process their
+ destructors at all, not even to invalidate the region.
+
+* 'new' is poorly modeled due to some nasty CFG/design issues. This is tracked
+ in PR12014. 'delete' is not modeled at all.
+
+* Arrays of objects are modeled very poorly right now. ExprEngine currently
+ only simulates the first constructor and first destructor. Because of this,
+ ExprEngine does not inline any constructors or destructors for arrays.
+
+
+CallEvent
+^^^^^^^^^
+
+A CallEvent represents a specific call to a function, method, or other body of
+code. It is path-sensitive, containing both the current state (ProgramStateRef)
+and stack space (LocationContext), and provides uniform access to the argument
+values and return type of a call, no matter how the call is written in the
+source or what sort of code body is being invoked.
+
+ NOTE: For those familiar with Cocoa, CallEvent is roughly equivalent to
+ NSInvocation.
+
+CallEvent should be used whenever there is logic dealing with function calls
+that does not care how the call occurred.
+
+Examples include checking that arguments satisfy preconditions (such as
+__attribute__((nonnull))), and attempting to inline a call.
+
+CallEvents are reference-counted objects managed by a CallEventManager. While
+there is no inherent issue with persisting them (say, in a ProgramState's GDM),
+they are intended for short-lived use, and can be recreated from CFGElements or
+non-top-level StackFrameContexts fairly easily.
Index: docs/analyzer/developer-docs/InitializerLists.rst
===================================================================
--- docs/analyzer/developer-docs/InitializerLists.rst
+++ docs/analyzer/developer-docs/InitializerLists.rst
@@ -0,0 +1,327 @@
+================
+Initializer List
+================
+This discussion took place in https://reviews.llvm.org/D35216
+"Escape symbols when creating std::initializer_list".
+
+It touches problems of modelling C++ standard library constructs in general,
+including modelling implementation-defined fields within C++ standard library
+objects, in particular constructing objects into pointers held by such fields,
+and separation of responsibilities between analyzer's core and checkers.
+
+**Artem:**
+
+I've seen a few false positives that appear because we construct
+C++11 std::initializer_list objects with brace initializers, and such
+construction is not properly modeled. For instance, if a new object is
+constructed on the heap only to be put into a brace-initialized STL container,
+the object is reported to be leaked.
+
+Approach (0): This can be trivially fixed by this patch, which causes pointers
+passed into initializer list expressions to immediately escape.
+
+This fix is overly conservative though. So i did a bit of investigation as to
+how model std::initializer_list better.
+
+According to the standard, ``std::initializer_list`` is an object that has
+methods ``begin(), end(), and size()``, where ``begin()`` returns a pointer to continuous
+array of ``size()`` objects of type T, and end() is equal to begin() plus size().
+The standard does hint that it should be possible to implement
+``std::initializer_list`` as a pair of pointers, or as a pointer and a size
+integer, however specific fields that the object would contain are an
+implementation detail.
+
+Ideally, we should be able to model the initializer list's methods precisely.
+Or, at least, it should be possible to explain to the analyzer that the list
+somehow "takes hold" of the values put into it. Initializer lists can also be
+copied, which is a separate story that i'm not trying to address here.
+
+The obvious approach to modeling ``std::initializer_list`` in a checker would be to
+construct a SymbolMetadata for the memory region of the initializer list object,
+which would be of type ``T*`` and represent ``begin()``, so we'd trivially model ``begin()``
+as a function that returns this symbol. The array pointed to by that symbol
+would be ``bindLoc()``ed to contain the list's contents (probably as a ``CompoundVal``
+to produce less bindings in the store). Extent of this array would represent
+``size()`` and would be equal to the length of the list as written.
+
+So this sounds good, however apparently it does nothing to address our false
+positives: when the list escapes, our ``RegionStoreManager`` is not magically
+guessing that the metadata symbol attached to it, together with its contents,
+should also escape. In fact, it's impossible to trigger a pointer escape from
+within the checker.
+
+Approach (1): If only we enabled ``ProgramState::bindLoc(..., notifyChanges=true)``
+to cause pointer escapes (not only region changes) (which sounds like the right
+thing to do anyway) such checker would be able to solve the false positives by
+triggering escapes when binding list elements to the list. However, it'd be as
+conservative as the current patch's solution. Ideally, we do not want escapes to
+happen so early. Instead, we'd prefer them to be delayed until the list itself
+escapes.
+
+So i believe that escaping metadata symbols whenever their base regions escape
+would be the right thing to do. Currently we didn't think about that because we
+had neither pointer-type metadatas nor non-pointer escapes.
+
+Approach (2): We could teach the Store to scan itself for bindings to
+metadata-symbolic-based regions during scanReachableSymbols() whenever a region
+turns out to be reachable. This requires no work on checker side, but it sounds
+performance-heavy.
+
+Approach (3): We could let checkers maintain the set of active metadata symbols
+in the program state (ideally somewhere in the Store, which sounds weird but
+causes the smallest amount of layering violations), so that the core knew what
+to escape. This puts a stress on the checkers, but with a smart data map it
+wouldn't be a problem.
+
+Approach (4): We could allow checkers to trigger pointer escapes in arbitrary
+moments. If we allow doing this within ``checkPointerEscape`` callback itself, we
+would be able to express facts like "when this region escapes, that metadata
+symbol attached to it should also escape". This sounds like an ultimate freedom,
+with maximum stress on the checkers - still not too much stress when we have
+smart data maps.
+
+I'm personally liking the approach (2) - it should be possible to avoid
+performance overhead, and clarity seems nice.
+
+**Gabor:**
+
+At this point, I am a bit wondering about two questions.
+
+* When should something belong to a checker and when should something belong to the engine?
+ Sometimes we model library aspects in the engine and model language constructs in checkers.
+
+* What is the checker programming model that we are aiming for? Maximum freedom or more easy checker development?
+
+I think if we aim for maximum freedom, we do not need to worry about the
+potential stress on checkers, and we can introduce abstractions to mitigate that
+later on.
+If we want to simplify the API, then maybe it makes more sense to move language
+construct modeling to the engine when the checker API is not sufficient instead
+of complicating the API.
+
+Right now I have no preference or objections between the alternatives but there
+are some random thoughts:
+
+* Maybe it would be great to have a guideline how to evolve the analyzer and
+ follow it, so it can help us to decide in similar situations
+
+* I do care about performance in this case. The reason is that we have a
+ limited performance budget. And I think we should not expect most of the checker
+ writers to add modeling of language constructs. So, in my opinion, it is ok to
+ have less nice/more verbose API for language modeling if we can have better
+ performance this way, since it only needs to be done once, and is done by the
+ framework developers.
+
+**Artem:** These are some great questions, i guess it'd be better to discuss
+them more openly. As a quick dump of my current mood:
+
+* To me it seems obvious that we need to aim for a checker API that is both
+ simple and powerful. This can probably by keeping the API as powerful as
+ necessary while providing a layer of simple ready-made solutions on top of it.
+ Probably a few reusable components for assembling checkers. And this layer
+ should ideally be pleasant enough to work with, so that people would prefer to
+ extend it when something is lacking, instead of falling back to the complex
+ omnipotent API. I'm thinking of AST matchers vs. AST visitors as a roughly
+ similar situation: matchers are not omnipotent, but they're so nice.
+
+* Separation between core and checkers is usually quite strange. Once we have
+ shared state traits, i generally wouldn't mind having region store or range
+ constraint manager as checkers (though it's probably not worth it to transform
+ them - just a mood). The main thing to avoid here would be the situation when
+ the checker overwrites stuff written by the core because it thinks it has a
+ better idea what's going on, so the core should provide a good default behavior.
+
+* Yeah, i totally care about performance as well, and if i try to implement
+ approach, i'd make sure it's good.
+
+**Artem:**
+
+> Approach (2): We could teach the Store to scan itself for bindings to
+> metadata-symbolic-based regions during scanReachableSymbols() whenever
+> a region turns out to be reachable. This requires no work on checker side,
+> but it sounds performance-heavy.
+
+Nope, this approach is wrong. Metadata symbols may become out-of-date: when the
+object changes, metadata symbols attached to it aren't changing (because symbols
+simply don't change). The same metadata may have different symbols to denote its
+value in different moments of time, but at most one of them represents the
+actual metadata value. So we'd be escaping more stuff than necessary.
+
+If only we had "ghost fields"
+(https://lists.llvm.org/pipermail/cfe-dev/2016-May/049000.html), it would have
+been much easier, because the ghost field would only contain the actual
+metadata, and the Store would always know about it. This example adds to my
+belief that ghost fields are exactly what we need for most C++ checkers.
+
+**Devin:**
+
+In this case, I would be fine with some sort of
+AbstractStorageMemoryRegion that meant "here is a memory region and somewhere
+reachable from here exists another region of type T". Or even multiple regions
+with different identifiers. This wouldn't specify how the memory is reachable,
+but it would allow for transfer functions to get at those regions and it would
+allow for invalidation.
+
+For ``std::initializer_list`` this reachable region would the region for the backing
+array and the transfer functions for begin() and end() yield the beginning and
+end element regions for it.
+
+In my view this differs from ghost variables in that (1) this storage does
+actually exist (it is just a library implementation detail where that storage
+lives) and (2) it is perfectly valid for a pointer into that storage to be
+returned and for another part of the program to read or write from that storage.
+(Well, in this case just read since it is allowed to be read-only memory).
+
+What I'm not OK with is modeling abstract analysis state (for example, the count
+of a NSMutableArray or the typestate of a file handle) as a value stored in some
+ginned up region in the store. This takes an easy problem that the analyzer does
+well at (modeling typestate) and turns it into a hard one that the analyzer is
+bad at (reasoning about the contents of the heap).
+
+I think the key criterion here is: "is the region accessible from outside the
+library". That is, does the library expose the region as a pointer that can be
+read to or written from in the client program? If so, then it makes sense for
+this to be in the store: we are modeling reachable storage as storage. But if
+we're just modeling arbitrary analysis facts that need to be invalidated when a
+pointer escapes then we shouldn't try to gin up storage for them just to get
+invalidation for free.
+
+**Artem:**
+
+> In this case, I would be fine with some sort of ``AbstractStorageMemoryRegion``
+> that meant "here is a memory region and somewhere reachable from here exists
+> another region of type T". Or even multiple regions with different
+> identifiers. This wouldn't specify how the memory is reachable, but it would
+> allow for transfer functions to get at those regions and it would allow for
+> invalidation.
+
+Yeah, this is what we can easily implement now as a
+symbolic-region-based-on-a-metadata-symbol (though we can make a new region
+class for that if we eg. want it typed). The problem is that the relation
+between such storage region and its parent object region is essentially
+immaterial, similarly to the relation between ``SymbolRegionValue`` and its parent
+region. Region contents are mutable: today the abstract storage is reachable
+from its parent object, tomorrow it's not, and maybe something else becomes
+reachable, something that isn't even abstract. So the parent region for the
+abstract storage is most of the time at best a "nice to know" thing - we cannot
+rely on it to do any actual work. We'd anyway need to rely on the checker to do
+the job.
+
+> For std::initializer_list this reachable region would the region for the
+> backing array and the transfer functions for begin() and end() yield the
+> beginning and end element regions for it.
+
+So maybe in fact for std::initializer_list it may work fine because you cannot
+change the data after the object is constructed - so this region's contents are
+essentially immutable. For the future, i feel as if it is a dead end.
+
+I'd like to consider another funny example. Suppose we're trying to model
+
+.. code-block:: cpp
+
+ std::unique_ptr. Consider::
+
+ void bar(const std::unique_ptr &x);
+
+ void foo(std::unique_ptr &x) {
+ int *a = x.get(); // (a, 0, direct): &AbstractStorageRegion
+ *a = 1; // (AbstractStorageRegion, 0, direct): 1 S32b
+ int *b = new int;
+ *b = 2; // (SymRegion{conj_$0}, 0 ,direct): 2 S32b
+ x.reset(b); // Checker map: x -> SymRegion{conj_$0}
+ bar(x); // 'a' doesn't escape (the pointer was unique), 'b' does.
+ clang_analyzer_eval(*a == 1); // Making this true is up to the checker.
+ clang_analyzer_eval(*b == 2); // Making this unknown is up to the checker.
+ }
+
+The checker doesn't totally need to ensure that ``*a == 1`` passes - even though the
+pointer was unique, it could theoretically have ``.get()``-ed above and the code
+could of course break the uniqueness invariant (though we'd probably want it).
+The checker can say that "even if ``*a`` did escape, it was not because it was
+stuffed directly into bar()".
+
+The checker's direct responsibility, however, is to solve the ``*b == 2`` thing
+(which is in fact the problem we're dealing with in this patch - escaping the
+storage region of the object).
+
+So we're talking about one more operation over the program state (scanning
+reachable symbols and regions) that cannot work without checker support.
+
+We can probably add a new callback "checkReachableSymbols" to solve this. This
+is in fact also related to the dead symbols problem (we're scanning for live
+symbols in the store and in the checkers separately, but we need to do so
+simultaneously with a single worklist). Hmm, in fact this sounds like a good
+idea; we can replace checkLiveSymbols with checkReachableSymbols.
+
+Or we could just have ghost member variables, and no checker support required at
+all. For ghost member variables, the relation with their parent region (which
+would be their superregion) is actually useful, the mutability of their contents
+is expressed naturally, and the store automagically sees reachable symbols, live
+symbols, escapes, invalidations, whatever.
+
+> In my view this differs from ghost variables in that (1) this storage does
+> actually exist (it is just a library implementation detail where that storage
+> lives) and (2) it is perfectly valid for a pointer into that storage to be
+> returned and for another part of the program to read or write from that
+> storage. (Well, in this case just read since it is allowed to be read-only
+> memory).
+
+> What I'm not OK with is modeling abstract analysis state (for example, the
+> count of a NSMutableArray or the typestate of a file handle) as a value stored
+> in some ginned up region in the store.This takes an easy problem that the
+> analyzer does well at (modeling typestate) and turns it into a hard one that
+> the analyzer is bad at (reasoning about the contents of the heap).
+
+Yeah, i tend to agree on that. For simple typestates, this is probably an
+overkill, so let's definitely put aside the idea of "ghost symbolic regions"
+that i had earlier.
+
+But, to summarize a bit, in our current case, however, the typestate we're
+looking for is the contents of the heap. And when we try to model such
+typestates (complex in this specific manner, i.e. heap-like) in any checker, we
+have a choice between re-doing this modeling in every such checker (which is
+something analyzer is indeed good at, but at a price of making checkers heavy)
+or instead relying on the Store to do exactly what it's designed to do.
+
+> I think the key criterion here is: "is the region accessible from outside
+> the library". That is, does the library expose the region as a pointer that
+> can be read to or written from in the client program? If so, then it makes
+> sense for this to be in the store: we are modeling reachable storage as
+> storage. But if we're just modeling arbitrary analysis facts that need to be
+> invalidated when a pointer escapes then we shouldn't try to gin up storage
+> for them just to get invalidation for free.
+
+As a metaphor, i'd probably compare it to body farms - the difference between
+ghost member variables and metadata symbols seems to me like the difference
+between body farms and evalCall. Both are nice to have, and body farms are very
+pleasant to work with, even if not omnipotent. I think it's fine for a
+FunctionDecl's body in a body farm to have a local variable, even if such
+variable doesn't actually exist, even if it cannot be seen from outside the
+function call. I'm not seeing immediate practical difference between "it does
+actually exist" and "it doesn't actually exist, just a handy abstraction".
+Similarly, i think it's fine if we have a ``CXXRecordDecl`` with
+implementation-defined contents, and try to farm up a member variable as a handy
+abstraction (we don't even need to know its name or offset, only that it's there
+somewhere).
+
+**Artem:**
+
+We've discussed it in person with Devin, and he provided more points to think
+about:
+
+* If the initializer list consists of non-POD data, constructors of list's
+ objects need to take the sub-region of the list's region as this-region In the
+ current (v2) version of this patch, these objects are constructed elsewhere and
+ then trivial-copied into the list's metadata pointer region, which may be
+ incorrect. This is our overall problem with C++ constructors, which manifests in
+ this case as well. Additionally, objects would need to be constructed in the
+ analyzer's core, which would not be able to predict that it needs to take a
+ checker-specific region as this-region, which makes it harder, though it might
+ be mitigated by sharing the checker state traits.
+
+* Because "ghost variables" are not material to the user, we need to somehow
+ make super sure that they don't make it into the diagnostic messages.
+
+So, because this needs further digging into overall C++ support and rises too
+many questions, i'm delaying a better approach to this problem and will fall
+back to the original trivial patch.
Index: docs/analyzer/developer-docs/RegionStore.rst
===================================================================
--- docs/analyzer/developer-docs/RegionStore.rst
+++ docs/analyzer/developer-docs/RegionStore.rst
@@ -0,0 +1,183 @@
+============
+Region Store
+============
+The analyzer "Store" represents the contents of memory regions. It is an opaque
+functional data structure stored in each ``ProgramState``; the only class that
+can modify the store is its associated StoreManager.
+
+Currently (Feb. 2013), the only StoreManager implementation being used is
+``RegionStoreManager``. This store records bindings to memory regions using a
+"base region + offset" key. (This allows ``*p`` and ``p[0]`` to map to the same
+location, among other benefits.)
+
+Regions are grouped into "clusters", which roughly correspond to "regions with
+the same base region". This allows certain operations to be more efficient,
+such as invalidation.
+
+Regions that do not have a known offset use a special "symbolic" offset. These
+keys store both the original region, and the "concrete offset region" -- the
+last region whose offset is entirely concrete. (For example, in the expression
+``foo.bar[1][i].baz``, the concrete offset region is the array ``foo.bar[1]``,
+since that has a known offset from the start of the top-level ``foo`` struct.)
+
+
+Binding Invalidation
+--------------------
+
+Supporting both concrete and symbolic offsets makes things a bit tricky. Here's
+an example:
+
+.. code-block:: cpp
+
+ foo[0] = 0;
+ foo[1] = 1;
+ foo[i] = i;
+
+After the third assignment, nothing can be said about the value of ``foo[0]``,
+because ``foo[i]`` may have overwritten it! Thus, *binding to a region with a
+symbolic offset invalidates the entire concrete offset region.* We know
+``foo[i]`` is somewhere within ``foo``, so we don't have to invalidate
+anything else, but we do have to be conservative about all other bindings within
+``foo``.
+
+Continuing the example:
+
+.. code-block:: cpp
+
+ foo[i] = i;
+ foo[0] = 0;
+
+After this latest assignment, nothing can be said about the value of ``foo[i]``,
+because ``foo[0]`` may have overwritten it! *Binding to a region R with a
+concrete offset invalidates any symbolic offset bindings whose concrete offset
+region is a super-region **or** sub-region of R.* All we know about ``foo[i]``
+is that it is somewhere within ``foo``, so changing *anything* within ``foo``
+might change ``foo[i]``, and changing *all* of ``foo`` (or its base region) will
+*definitely* change ``foo[i]``.
+
+This logic could be improved by using the current constraints on ``i``, at the
+cost of speed. The latter case could also be improved by matching region kinds,
+i.e. changing ``foo[0].a`` is unlikely to affect ``foo[i].b``, no matter what
+``i`` is.
+
+For more detail, read through ``RegionStoreManager::removeSubRegionBindings`` in
+RegionStore.cpp.
+
+
+ObjCIvarRegions
+---------------
+
+Objective-C instance variables require a bit of special handling. Like struct
+fields, they are not base regions, and when their parent object region is
+invalidated, all the instance variables must be invalidated as well. However,
+they have no concrete compile-time offsets (in the modern, "non-fragile"
+runtime), and so cannot easily be represented as an offset from the start of
+the object in the analyzer. Moreover, this means that invalidating a single
+instance variable should *not* invalidate the rest of the object, since unlike
+struct fields or array elements there is no way to perform pointer arithmetic
+to access another instance variable.
+
+Consequently, although the base region of an ObjCIvarRegion is the entire
+object, RegionStore offsets are computed from the start of the instance
+variable. Thus it is not valid to assume that all bindings with non-symbolic
+offsets start from the base region!
+
+
+Region Invalidation
+-------------------
+
+Unlike binding invalidation, region invalidation occurs when the entire
+contents of a region may have changed---say, because it has been passed to a
+function the analyzer can model, like memcpy, or because its address has
+escaped, usually as an argument to an opaque function call. In these cases we
+need to throw away not just all bindings within the region itself, but within
+its entire cluster, since neighboring regions may be accessed via pointer
+arithmetic.
+
+Region invalidation typically does even more than this, however. Because it
+usually represents the complete escape of a region from the analyzer's model,
+its *contents* must also be transitively invalidated. (For example, if a region
+``p`` of type ``int **`` is invalidated, the contents of ``*p`` and ``**p`` may
+have changed as well.) The algorithm that traverses this transitive closure of
+accessible regions is known as ClusterAnalysis, and is also used for finding
+all live bindings in the store (in order to throw away the dead ones). The name
+"ClusterAnalysis" predates the cluster-based organization of bindings, but
+refers to the same concept: during invalidation and liveness analysis, all
+bindings within a cluster must be treated in the same way for a conservative
+model of program behavior.
+
+
+Default Bindings
+----------------
+
+Most bindings in RegionStore are simple scalar values -- integers and pointers.
+These are known as "Direct" bindings. However, RegionStore supports a second
+type of binding called a "Default" binding. These are used to provide values to
+all the elements of an aggregate type (struct or array) without having to
+explicitly specify a binding for each individual element.
+
+When there is no Direct binding for a particular region, the store manager
+looks at each super-region in turn to see if there is a Default binding. If so,
+this value is used as the value of the original region. The search ends when
+the base region is reached, at which point the RegionStore will pick an
+appropriate default value for the region (usually a symbolic value, but
+sometimes zero, for static data, or "uninitialized", for stack variables).
+
+.. code-block:: cpp
+
+ int manyInts[10];
+ manyInts[1] = 42; // Creates a Direct binding for manyInts[1].
+ print(manyInts[1]); // Retrieves the Direct binding for manyInts[1];
+ print(manyInts[0]); // There is no Direct binding for manyInts[0].
+ // Is there a Default binding for the entire array?
+ // There is not, but it is a stack variable, so we use
+ // "uninitialized" as the default value (and emit a
+ // diagnostic!).
+
+NOTE: The fact that bindings are stored as a base region plus an offset limits
+the Default Binding strategy, because in C aggregates can contain other
+aggregates. In the current implementation of RegionStore, there is no way to
+distinguish a Default binding for an entire aggregate from a Default binding
+for the sub-aggregate at offset 0.
+
+
+Lazy Bindings (LazyCompoundVal)
+-------------------------------
+
+RegionStore implements an optimization for copying aggregates (structs and
+arrays) called "lazy bindings", implemented using a special SVal called
+LazyCompoundVal. When the store is asked for the "binding" for an entire
+aggregate (i.e. for an lvalue-to-rvalue conversion), it returns a
+LazyCompoundVal instead. When this value is then stored into a variable, it is
+bound as a Default value. This makes copying arrays and structs much cheaper
+than if they had required memberwise access.
+
+Under the hood, a LazyCompoundVal is implemented as a uniqued pair of (region,
+store), representing "the value of the region during this 'snapshot' of the
+store". This has important implications for any sort of liveness or
+reachability analysis, which must take the bindings in the old store into
+account.
+
+Retrieving a value from a lazy binding happens in the same way as any other
+Default binding: since there is no direct binding, the store manager falls back
+to super-regions to look for an appropriate default binding. LazyCompoundVal
+differs from a normal default binding, however, in that it contains several
+different values, instead of one value that will appear several times. Because
+of this, the store manager has to reconstruct the subregion chain on top of the
+LazyCompoundVal region, and look up *that* region in the previous store.
+
+Here's a concrete example:
+
+.. code-block:: cpp
+
+ CGPoint p;
+ p.x = 42; // A Direct binding is made to the FieldRegion 'p.x'.
+ CGPoint p2 = p; // A LazyCompoundVal is created for 'p', along with a
+ // snapshot of the current store state. This value is then
+ // used as a Default binding for the VarRegion 'p2'.
+ return p2.x; // The binding for FieldRegion 'p2.x' is requested.
+ // There is no Direct binding, so we look for a Default
+ // binding to 'p2' and find the LCV.
+ // Because it's a LCV, we look at our requested region
+ // and see that it's the '.x' field. We ask for the value
+ // of 'p.x' within the snapshot, and get back 42.
Index: docs/analyzer/developer-docs/nullability.rst
===================================================================
--- docs/analyzer/developer-docs/nullability.rst
+++ docs/analyzer/developer-docs/nullability.rst
@@ -0,0 +1,107 @@
+==================
+Nullability Checks
+==================
+
+This document is a high level description of the nullablility checks.
+These checks intended to use the annotations that is described in this
+RFC: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-March/041798.html.
+
+Let's consider the following 2 categories:
+
+**1) nullable**
+
+If a pointer ``p`` has a nullable annotation and no explicit null check or assert, we should warn in the following cases:
+
+* ``p`` gets implicitly converted into nonnull pointer, for example, we are passing it to a function that takes a nonnull parameter.
+* ``p`` gets dereferenced
+
+Taking a branch on nullable pointers are the same like taking branch on null unspecified pointers.
+
+Explicit cast from nullable to nonnul:
+
+.. code-block:: cpp
+
+ __nullable id foo;
+ id bar = foo;
+ takesNonNull((_nonnull) bar); // should not warn here (backward compatibility hack)
+ anotherTakesNonNull(bar); // would be great to warn here, but not necessary(*)
+
+Because bar corresponds to the same symbol all the time it is not easy to implement the checker that way the cast only suppress the first call but not the second. For this reason in the first implementation after a contradictory cast happens, I will treat bar as nullable unspecified, this way all of the warnings will be suppressed. Treating the symbol as nullable unspecified also has an advantage that in case the takesNonNull function body is being inlined, the will be no warning, when the symbol is dereferenced. In case I have time after the initial version I might spend additional time to try to find a more sophisticated solution, in which we would produce the second warning (*).
+
+**2) nonnull**
+
+* Dereferencing a nonnull, or sending message to it is ok.
+* Converting nonnull to nullable is Ok.
+* When there is an explicit cast from nonnull to nullable I will trust the cast (it is probable there for a reason, because this cast does not suppress any warnings or errors).
+* But what should we do about null checks?:
+
+.. code-block:: cpp
+
+ __nonnull id takesNonnull(__nonnull id x) {
+ if (x == nil) {
+ // Defensive backward compatible code:
+ ....
+ return nil; // Should the analyzer cover this piece of code? Should we require the cast (__nonnull)nil?
+ }
+ ....
+ }
+
+There are these directions:
+
+* We can either take the branch; this way the branch is analyzed
+* Should we not warn about any nullability issues in that branch? Probably not, it is ok to break the nullability postconditions when the nullability preconditions are violated.
+* We can assume that these pointers are not null and we lose coverage with the analyzer. (This can be implemented either in constraint solver or in the checker itself.)
+
+Other Issues to keep in mind/take care of:
+
+* Messaging:
+
+ * Sending a message to a nullable pointer
+
+ * Even though the method might return a nonnull pointer, when it was sent to a nullable pointer the return type will be nullable.
+ * The result is nullable unless the receiver is known to be non null.
+
+ * Sending a message to a unspecified or nonnull pointer
+
+ * If the pointer is not assumed to be nil, we should be optimistic and use the nullability implied by the method.
+
+ * This will not happen automatically, since the AST will have null unspecified in this case.
+
+Inlining
+--------
+
+A symbol may need to be treated differently inside an inlined body. For example, consider these conversions from nonnull to nullable in presence of inlining:
+
+.. code-block:: cpp
+
+ id obj = getNonnull();
+ takesNullable(obj);
+ takesNonnull(obj);
+
+ void takesNullable(nullable id obj) {
+ obj->ivar // we should assume obj is nullable and warn here
+ }
+
+With no special treatment, when the takesNullable is inlined the analyzer will not warn when the obj symbol is dereferenced. One solution for this is to reanalyze takesNullable as a top level function to get possible violations. The alternative method, deducing nullability information from the arguments after inlining is not robust enough (for example there might be more parameters with different nullability, but in the given path the two parameters might end up being the same symbol or there can be nested functions that take different view of the nullability of the same symbol). So the symbol will remain nonnull to avoid false positives but the functions that takes nullable parameters will be analyzed separately as well without inlining.
+
+Annotations on multi level pointers
+-----------------------------------
+
+Tracking multiple levels of annotations for pointers pointing to pointers would make the checker more complicated, because this way a vector of nullability qualifiers would be needed to be tracked for each symbol. This is not a big caveat, since once the top level pointer is dereferenced, the symvol for the inner pointer will have the nullability information. The lack of multi level annotation tracking only observable, when multiple levels of pointers are passed to a function which has a parameter with multiple levels of annotations. So for now the checker support the top level nullability qualifiers only.:
+
+.. code-block:: cpp
+
+ int * __nonnull * __nullable p;
+ int ** q = p;
+ takesStarNullableStarNullable(q);
+
+Implementation notes
+--------------------
+
+What to track?
+
+* The checker would track memory regions, and to each relevant region a qualifier information would be attached which is either nullable, nonnull or null unspecified (or contradicted to suppress warnings for a specific region).
+* On a branch, where a nullable pointer is known to be non null, the checker treat it as a same way as a pointer annotated as nonnull.
+* When there is an explicit cast from a null unspecified to either nonnull or nullable I will trust the cast.
+* Unannotated pointers are treated the same way as pointers annotated with nullability unspecified qualifier, unless the region is wrapped in ASSUME_NONNULL macros.
+* We might want to implement a callback for entry points to top level functions, where the pointer nullability assumptions would be made.
Index: docs/analyzer/index.rst
===================================================================
--- docs/analyzer/index.rst
+++ docs/analyzer/index.rst
@@ -1,23 +0,0 @@
-.. Clang Static Analyzer documentation master file, created by
- sphinx-quickstart on Wed Jan 2 15:54:28 2013.
- You can adapt this file completely to your liking, but it should at least
- contain the root `toctree` directive.
-
-Welcome to Clang Static Analyzer's documentation!
-=================================================
-
-Contents:
-
-.. toctree::
- :maxdepth: 2
-
- DebugChecks
-
-
-Indices and tables
-==================
-
-* :ref:`genindex`
-* :ref:`modindex`
-* :ref:`search`
-
Index: docs/analyzer/nullability.rst
===================================================================
--- docs/analyzer/nullability.rst
+++ docs/analyzer/nullability.rst
@@ -1,92 +0,0 @@
-============
-Nullability Checks
-============
-
-This document is a high level description of the nullablility checks.
-These checks intended to use the annotations that is described in this
-RFC: http://lists.cs.uiuc.edu/pipermail/cfe-dev/2015-March/041798.html.
-
-Let's consider the following 2 categories:
-
-1) nullable
-============
-
-If a pointer 'p' has a nullable annotation and no explicit null check or assert, we should warn in the following cases:
-- 'p' gets implicitly converted into nonnull pointer, for example, we are passing it to a function that takes a nonnull parameter.
-- 'p' gets dereferenced
-
-Taking a branch on nullable pointers are the same like taking branch on null unspecified pointers.
-
-Explicit cast from nullable to nonnul::
-
- __nullable id foo;
- id bar = foo;
- takesNonNull((_nonnull) bar); <— should not warn here (backward compatibility hack)
- anotherTakesNonNull(bar); <— would be great to warn here, but not necessary(*)
-
-Because bar corresponds to the same symbol all the time it is not easy to implement the checker that way the cast only suppress the first call but not the second. For this reason in the first implementation after a contradictory cast happens, I will treat bar as nullable unspecified, this way all of the warnings will be suppressed. Treating the symbol as nullable unspecified also has an advantage that in case the takesNonNull function body is being inlined, the will be no warning, when the symbol is dereferenced. In case I have time after the initial version I might spend additional time to try to find a more sophisticated solution, in which we would produce the second warning (*).
-
-2) nonnull
-============
-
-- Dereferencing a nonnull, or sending message to it is ok.
-- Converting nonnull to nullable is Ok.
-- When there is an explicit cast from nonnull to nullable I will trust the cast (it is probable there for a reason, because this cast does not suppress any warnings or errors).
-- But what should we do about null checks?::
-
- __nonnull id takesNonnull(__nonnull id x) {
- if (x == nil) {
- // Defensive backward compatible code:
- ....
- return nil; <- Should the analyzer cover this piece of code? Should we require the cast (__nonnull)nil?
- }
- ....
- }
-
-There are these directions:
-- We can either take the branch; this way the branch is analyzed
- - Should we not warn about any nullability issues in that branch? Probably not, it is ok to break the nullability postconditions when the nullability preconditions are violated.
-- We can assume that these pointers are not null and we lose coverage with the analyzer. (This can be implemented either in constraint solver or in the checker itself.)
-
-Other Issues to keep in mind/take care of:
-Messaging:
-- Sending a message to a nullable pointer
- - Even though the method might return a nonnull pointer, when it was sent to a nullable pointer the return type will be nullable.
- - The result is nullable unless the receiver is known to be non null.
-- Sending a message to a unspecified or nonnull pointer
- - If the pointer is not assumed to be nil, we should be optimistic and use the nullability implied by the method.
- - This will not happen automatically, since the AST will have null unspecified in this case.
-
-Inlining
-============
-
-A symbol may need to be treated differently inside an inlined body. For example, consider these conversions from nonnull to nullable in presence of inlining::
-
- id obj = getNonnull();
- takesNullable(obj);
- takesNonnull(obj);
-
- void takesNullable(nullable id obj) {
- obj->ivar // we should assume obj is nullable and warn here
- }
-
-With no special treatment, when the takesNullable is inlined the analyzer will not warn when the obj symbol is dereferenced. One solution for this is to reanalyze takesNullable as a top level function to get possible violations. The alternative method, deducing nullability information from the arguments after inlining is not robust enough (for example there might be more parameters with different nullability, but in the given path the two parameters might end up being the same symbol or there can be nested functions that take different view of the nullability of the same symbol). So the symbol will remain nonnull to avoid false positives but the functions that takes nullable parameters will be analyzed separately as well without inlining.
-
-Annotations on multi level pointers
-============
-
-Tracking multiple levels of annotations for pointers pointing to pointers would make the checker more complicated, because this way a vector of nullability qualifiers would be needed to be tracked for each symbol. This is not a big caveat, since once the top level pointer is dereferenced, the symvol for the inner pointer will have the nullability information. The lack of multi level annotation tracking only observable, when multiple levels of pointers are passed to a function which has a parameter with multiple levels of annotations. So for now the checker support the top level nullability qualifiers only.::
-
- int * __nonnull * __nullable p;
- int ** q = p;
- takesStarNullableStarNullable(q);
-
-Implementation notes
-============
-
-What to track?
-- The checker would track memory regions, and to each relevant region a qualifier information would be attached which is either nullable, nonnull or null unspecified (or contradicted to suppress warnings for a specific region).
-- On a branch, where a nullable pointer is known to be non null, the checker treat it as a same way as a pointer annotated as nonnull.
-- When there is an explicit cast from a null unspecified to either nonnull or nullable I will trust the cast.
-- Unannotated pointers are treated the same way as pointers annotated with nullability unspecified qualifier, unless the region is wrapped in ASSUME_NONNULL macros.
-- We might want to implement a callback for entry points to top level functions, where the pointer nullability assumptions would be made.
Index: docs/conf.py
===================================================================
--- docs/conf.py
+++ docs/conf.py
@@ -66,7 +66,7 @@
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
-exclude_patterns = ['_build', 'analyzer']
+exclude_patterns = ['_build']
# The reST default role (used for this markup: `text`) to use for all documents.
#default_role = None
Index: docs/index.rst
===================================================================
--- docs/index.rst
+++ docs/index.rst
@@ -23,6 +23,7 @@
AttributeReference
DiagnosticsReference
CrossCompilation
+ ClangStaticAnalyzer
ThreadSafetyAnalysis
AddressSanitizer
ThreadSanitizer