Index: www/analyzer/checker_dev_manual.html
===================================================================
--- www/analyzer/checker_dev_manual.html
+++ www/analyzer/checker_dev_manual.html
@@ -675,6 +675,111 @@
(gdb) p C.getPredecessor()->getCodeDecl().getBody()->dump()
+
Making Your Checker Better
+
+- User facing documentation is important for adoption! Make sure the checker list is updated
+ at the homepage of the analyzer. Also ensure the description is clear to
+ non-analyzer-developers in Checkers.td.
+- Warning and note messages should be clear and easy to understand, even if a bit long.
+
+ - Messages should start with a capital letter (unlike Clang warnings!) and should not
+ end with ..
+ - Articles are usually omitted, eg. Dereference of a null pointer ->
+ Dereference of null pointer.
+ - Introduce BugReporterVisitors to emit additional notes that explain the warning
+ to the user better. There are some existing visitors that might be useful for your check,
+ e.g. trackNullOrUndefValue. For example, SimpleStreamChecker should highlight
+ the event of opening the file when reporting a file descriptor leak.
+
+- If the check tracks anything in the program state, it needs to implement the
+ checkDeadSymbolscallback to clean the state up.
+- The check should conservatively assume that the program is correct when a tracked symbol
+ is passed to a function that is unknown to the analyzer.
+ checkPointerEscape callback could help you handle that case.
+- Use safe and convenient APIs!
+
+ - Always use CheckerContext::generateErrorNode and
+ CheckerContext::generateNonFatalErrorNode for emitting bug reports.
+ Most importantly, never emit report against CheckerContext::getPredecessor.
+ - Prefer checkPreCall and checkPostCall to
+ checkPreStmt<CallExpr> and checkPostStmt<CallExpr>.
+ - Use CallDescription to detect hardcoded API calls in the program.
+ - Simplify C.getState()->getSVal(E, C.getLocationContext()) to C.getSVal(E).
+
+- Common sources of crashes:
+
+ - CallEvent::getOriginExpr is nullable - for example, it returns null for an
+ automatic destructor of a variable. The same applies to some values generated while the
+ call was modeled, eg. SymbolConjured::getStmt is nullable.
+ - CallEvent::getDecl is nullable - for example, it returns null for a
+ call of symbolic function pointer.
+ - addTransition, generateSink, generateNonFatalErrorNode,
+ generateErrorNode are nullable because you can transition to a node that you have already visited.
+ - Methods of CallExpr/FunctionDecl/CallEvent that
+ return arguments crash when the argument is out-of-bounds. If you checked the function name,
+ it doesn't mean that the function has the expected number of arguments!
+ Which is why you should use CallDescription.
+ - Nullability of different entities within different kinds of symbols and regions is usually
+ documented via assertions in their constructors.
+ - NamedDecl::getName will fail if the name of the declaration is not a single token,
+ e.g. for destructors. You could use NamedDecl::getNameAsString for those cases.
+ Note that this method is much slower and should be used sparringly, e.g. only when generating reports
+ but not during analysis.
+ - Is -analyzer-checker=core included in all test RUN: lines? It was never supported
+ to run the analyzer with the core checks disabled. It might cause unexpected behavior and
+ crashes. You should do all your testing with the core checks enabled.
+
+
+Patterns that you should most likely avoid even if they're not technically wrong:
+
+ - BugReporterVisitor should most likely not match the AST of the current program point
+ to decide when to emit a note. It is much easier to determine that by observing changes in
+ the program state.
+ - In State->getSVal(Region), if Region is not known to be a TypedValueRegion
+ and the optional type argument is not specified, the checker may accidentally try to dereference a
+ void pointer.
+ - Checker logic should not depend on whether a certain value is a Loc or NonLoc.
+ It should be immediately obvious whether the SVal is a Loc or a
+ NonLoc depending on the AST that is being checked. Checking whether a value
+ is Loc or Unknown/Undefined or whether the value is
+ NonLoc or Unknown/Undefined is totally fine.
+ - New symbols should not be constructed in the checker via direct calls to SymbolManager,
+ unless they are of SymbolMetadata class tagged by the checker,
+ or they represent newly created values such as the return value in evalCall.
+ For modeling arithmetic/bitwise/comparison operations, SValBuilder should be used.
+ - Custom ProgramPointTags should not be created within the checker. There is usually
+ no good reason for a checker to chain multiple nodes together, because checkers aren't worklists.
+
+Checkers are encouraged to actively participate in the analysis by sharing
+ their knowledge about the program state with the rest of the analyzer,
+ but they should not be disrupting the analysis unnecessarily:
+
+ - If a checker splits program state, this must be based on knowledge that
+ the newly appearing branches are definitely possible and worth exploring
+ from the user's perspective. Otherwise the state split should be delayed
+ until there's an indication that one of the paths is taken, or one of the
+ paths needs to be dropped entirely. For example, it is fine to eagerly split
+ paths while modeling isalpha(x) as long as x is constrained accordingly on
+ each path. At the same time, it is not a good idea to split paths over the
+ return value of printf() while modeling the call because nobody ever checks
+ for errors in printf; at best, it'd just double the remaining analysis time.
+
+ - Caution is advised when using CheckerContext::generateNonFatalErrorNode
+ because it generates an independent transition, much like addTransition.
+ It is easy to accidentally split paths while using it. Ideally, try to
+ structure the code so that it was obvious that every addTransition or
+ generateNonFatalErrorNode (or sequence of such if the split is intended) is
+ immediately followed by return from the checker callback.
+ - Multiple implementations of evalCall in different checkers should not conflict.
+ - When implementing evalAssume, the checker should always return a non-null state
+ for either the true assumption or the false assumption (or both).
+ - Checkers shall not mutate values of expressions, i.e. use the ProgramState::BindExpr API,
+ unless they are fully responsible for computing the value.
+ Under no circumstances should they change non-Unknown values of expressions.
+ Currently the only valid use case for this API in checkers is to model the return value in the evalCall callback.
+ If expression values are incorrect, ExprEngine needs to be fixed instead.
+
+
Here are some additional resources that are useful when working on the Clang