diff --git a/clang/docs/analyzer/checkers.rst b/clang/docs/analyzer/checkers.rst
--- a/clang/docs/analyzer/checkers.rst
+++ b/clang/docs/analyzer/checkers.rst
@@ -66,7 +66,7 @@
 core.NullDereference (C, C++, ObjC)
 """""""""""""""""""""""""""""""""""
-Check for dereferences of null pointers.
+Check for dereferences of null pointers.
 
 This checker specifically does not report null pointer dereferences for x86 and x86-64 targets when the
@@ -75,7 +75,7 @@
 `__ for reference.
 
-The ``SuppressAddressSpaces`` option suppresses
+The ``SuppressAddressSpaces`` option suppresses
 warnings for null dereferences of all pointers with address spaces. You can
 disable this behavior with the option
 ``-analyzer-config core.NullDereference:SuppressAddressSpaces=false``.
@@ -2366,17 +2366,119 @@
 alpha.security.taint.TaintPropagation (C, C++)
 """"""""""""""""""""""""""""""""""""""""""""""
 
-Taint analysis identifies untrusted sources of information (taint sources), rules as to how the untrusted data flows along the execution path (propagation rules), and points of execution where the use of tainted data is risky (taints sinks).
+Taint analysis identifies potential security vulnerabilities where an
+attacker can inject malicious data into the program to execute an attack
+(privilege escalation, command injection, SQL injection, etc.).
+
+The malicious data is injected at a taint source (e.g. a ``getenv()`` call),
+propagated through function calls, and finally used as an argument of a
+sensitive operation, also called a taint sink (e.g. a ``system()`` call).
+
+One can defend against this type of vulnerability by always checking and
+sanitizing the potentially malicious, untrusted user input.
+
+The goal of the checker is to discover these potential taint source-sink pairs
+and the propagation call chain, and to show them to the user.
+
 The most notable examples of taint sources are:
 
  - network originating data
+ - files or standard input
  - environment variables
  - database originating data
 
-``GenericTaintChecker`` is the main implementation checker for this rule, and it generates taint information used by other checkers.
+Let us examine a practical example of a Command Injection attack.
+
+.. code-block:: C
 
-.. code-block:: c
+ // Command Injection Vulnerability Example
+ int main(int argc, char** argv) {
+   char cmd[2048] = "/bin/cat ";
+   char filename[1024];
+   printf("Filename:");
+   scanf (" %1023[^\n]", filename); // The attacker can inject a shell escape here
+   strcat(cmd, filename);
+   system(cmd); // Warning: Untrusted data is passed to a system call
+ }
+
+The program prints the content of any user-specified file.
+Unfortunately, the attacker can execute arbitrary commands
+with shell escapes. For example, with the following input the ``ls`` command is
+also executed after the contents of ``/etc/shadow`` are printed:
+``Input: /etc/shadow ; ls /``
+
+The analysis implemented in this checker points out this problem.
+
+One can protect against such an attack by, for example, checking whether the
+provided input refers to a valid file.
+
+.. code-block:: C
+
+ // No vulnerability anymore, but we still get the warning.
+ int main(int argc, char** argv) {
+   char cmd[2048] = "/bin/cat ";
+   char filename[1024];
+   printf("Filename:");
+   scanf (" %1023[^\n]", filename); // The attacker can inject a shell escape here
+
+   if (access(filename, F_OK)) { // sanitizing user input
+     printf("File does not exist\n");
+     return -1;
+   }
+   // filename is safe after this point
+   strcat(cmd, filename);
+   system(cmd); // Superfluous warning: Untrusted data is passed to a system call
+ }
+
+Unfortunately, the checker cannot automatically discover
+that the programmer has performed data sanitization, so it still emits the warning.
+
+One can get rid of this superfluous warning by informing the analyzer about
+such data sanitization, by adding the following lines.
+
+.. code-block:: C
+
+ // Marking sanitized variables safe.
+ // No vulnerability anymore, no warning.
+
+ // User-defined csa_sanitize function which compiles to a no-op
+ #ifdef __clang_analyzer__
+   void csa_sanitize(const void *);
+ #else
+   #define csa_sanitize(P) ((void)(P))
+ #endif
+
+ int main(int argc, char** argv) {
+   char cmd[2048] = "/bin/cat ";
+   char filename[1024];
+   printf("Filename:");
+   scanf (" %1023[^\n]", filename);
+   if (access(filename, F_OK)) { // sanitizing user input
+     printf("File does not exist\n");
+     return -1;
+   }
+   csa_sanitize(filename); // Tell the analyzer that filename is safe to use after this point
+   strcat(cmd, filename);
+   system(cmd); // No warning
+ }
+
+To let the analyzer know that a variable is sanitized and safe to use, one needs
+to declare ``csa_sanitize`` as a filter function under ``Filters`` in a YAML
+configuration file.
+
+.. code-block:: YAML
+
+ Filters:
+   - Name: csa_sanitize
+     Args: [0]
+
+Then calling ``csa_sanitize(X)`` tells the analyzer that ``X`` is safe to use
+after this point, because its contents are verified. It is the responsibility of
+the programmer to ensure that this verification was indeed correct.
+Please note that the ``csa_sanitize`` function is defined so that it is a no-op
+in normal production builds, when the Clang Static Analyzer is not executed.
+
+Further examples of injection vulnerabilities this checker can find:
+
+.. code-block:: c
 
  void test() {
    char x = getchar(); // 'x' marked as tainted
    system(&x); // warn: untrusted data is passed to a system call
@@ -2389,35 +2491,50 @@
    char s[10], buf[10];
    fscanf(stdin, "%s", s); // 's' marked as tainted
-   sprintf(buf, s); // warn: untrusted data as a format string
+   sprintf(buf, s); // warn: untrusted data used as a format string
  }
 
  void test() {
    size_t ts;
    scanf("%zd", &ts); // 'ts' marked as tainted
    int *p = (int *)malloc(ts * sizeof(int));
-     // warn: untrusted data as buffer size
+     // warn: untrusted data used as buffer size
+ }
 
-There are built-in sources, propagations and sinks defined in code inside ``GenericTaintChecker``.
-These operations are handled even if no external taint configuration is provided.
+There are built-in sources, propagation rules, and sinks, which are used even if no external taint configuration is provided.
 
-Default sources defined by ``GenericTaintChecker``:
+Default sources:
  ``_IO_getc``, ``fdopen``, ``fopen``, ``freopen``, ``get_current_dir_name``, ``getch``, ``getchar``, ``getchar_unlocked``, ``getwd``, ``getcwd``, ``getgroups``, ``gethostname``, ``getlogin``, ``getlogin_r``, ``getnameinfo``, ``gets``, ``gets_s``, ``getseuserbyname``, ``readlink``, ``readlinkat``, ``scanf``, ``scanf_s``, ``socket``, ``wgetch``
 
-Default propagations defined by ``GenericTaintChecker``:
+Default propagation rules:
  ``atoi``, ``atol``, ``atoll``, ``basename``, ``dirname``, ``fgetc``, ``fgetln``, ``fgets``, ``fnmatch``, ``fread``, ``fscanf``, ``fscanf_s``, ``index``, ``inflate``, ``isalnum``, ``isalpha``, ``isascii``, ``isblank``, ``iscntrl``, ``isdigit``, ``isgraph``, ``islower``, ``isprint``, ``ispunct``, ``isspace``, ``isupper``, ``isxdigit``, ``memchr``, ``memrchr``, ``sscanf``, ``getc``, ``getc_unlocked``, ``getdelim``, ``getline``, ``getw``, ``memcmp``, ``memcpy``, ``memmem``, ``memmove``, ``mbtowc``, ``pread``, ``qsort``, ``qsort_r``, ``rawmemchr``, ``read``, ``recv``, ``recvfrom``, ``rindex``, ``strcasestr``, ``strchr``, ``strchrnul``,
  ``strcasecmp``, ``strcmp``, ``strcspn``, ``strlen``, ``strncasecmp``, ``strncmp``, ``strndup``, ``strndupa``, ``strnlen``, ``strpbrk``, ``strrchr``, ``strsep``, ``strspn``, ``strstr``, ``strtol``, ``strtoll``, ``strtoul``, ``strtoull``, ``tolower``, ``toupper``, ``ttyname``, ``ttyname_r``, ``wctomb``, ``wcwidth``
 
-Default sinks defined in ``GenericTaintChecker``:
+Default sinks:
  ``printf``, ``setproctitle``, ``system``, ``popen``, ``execl``, ``execle``, ``execlp``, ``execv``, ``execvp``, ``execvP``, ``execve``, ``dlopen``, ``memcpy``, ``memmove``, ``strncpy``, ``strndup``, ``malloc``, ``calloc``, ``alloca``, ``memccpy``, ``realloc``, ``bcopy``
 
-The user can configure taint sources, sinks, and propagation rules by providing a configuration file via checker option ``alpha.security.taint.TaintPropagation:Config``.
-
-External taint configuration is in `YAML `_ format. The taint-related options defined in the config file extend but do not override the built-in sources, rules, sinks.
+Users can configure their own taint sources, sinks, and propagation rules by providing a configuration file via the checker option ``alpha.security.taint.TaintPropagation:Config``.
+The configuration file is in `YAML `_ format. The taint-related options defined in the config file extend but do not override the built-in sources, rules, and sinks.
 The format of the external taint configuration file is not stable, and could change without any notice even in a non-backward compatible way.
 
 For a more detailed description of configuration options, please see the :doc:`user-docs/TaintAnalysisConfiguration`. For an example see :ref:`clangsa-taint-configuration-example`.
 
+**Configuration**
+
+* ``Config``: Specifies the name of the YAML configuration file. Users can define their own taint sources and sinks in it.
+
+**Related Guidelines**
+
+* `CWE Data Neutralization Issues `_
+* `SEI CERT STR02-C. Sanitize data passed to complex subsystems `_
+* `SEI CERT ENV33-C. Do not call system() `_
+* `SEI CERT ENV03-C. Sanitize the environment when invoking external programs `_
+
+**Limitations**
+
+* The taintedness property is not propagated through function calls that are unknown (or too complex) to the analyzer, unless there is a specific
+  propagation rule built into the checker or given in the YAML configuration file. This causes potential true positive findings to be lost.
+
 alpha.unix
 ^^^^^^^^^^^
@@ -2767,7 +2884,7 @@
 Check for uninitialized reads from common memory copy/manipulation functions such as: ``memcpy, mempcpy, memmove, memcmp, strcmp, strncmp, strcpy, strlen, strsep`` and many more.
 
-.. code-block:: c
+.. code-block:: c
 
  void test() {
    char src[10];
@@ -2776,12 +2893,12 @@
  }
 
 Limitations:
-
+
   - Due to limitations of the memory modeling in the analyzer, one can likely observe a lot of false-positive reports like this:
 
     .. code-block:: c
-
+
       void false_positive() {
         int src[] = {1, 2, 3, 4};
         int dst[5] = {0};
@@ -2790,9 +2907,9 @@
         // that since the analyzer could not see a direct initialization of the
         // very last byte of the source buffer.
       }
-
+
   More details at the corresponding `GitHub issue `_.
-
+
 .. _alpha-nondeterminism-PointerIteration:
 
 alpha.nondeterminism.PointerIteration (C++)