Index: tools/scan-build-py/README.md =================================================================== --- /dev/null +++ tools/scan-build-py/README.md @@ -0,0 +1,120 @@ +scan-build +========== + +A package designed to wrap a build so that all calls to gcc/clang are +intercepted and logged into a [compilation database][1] and/or piped to +the Clang static analyzer. It includes the intercept-build tool, which logs +the build, as well as the scan-build tool, which logs the build and runs +the Clang static analyzer on it. + +Portability +----------- + +Should work on UNIX operating systems. + +- It has been tested on FreeBSD, GNU/Linux and OS X. +- Prepared to work on Windows, but help is needed to get it working. + + +Prerequisites +------------- + +1. **python** interpreter (version 2.7, 3.2, 3.3, 3.4, 3.5). + + +How to use +---------- + +To run the Clang static analyzer against a project: + + $ scan-build + +To generate a compilation database file: + + $ intercept-build + +To run the Clang static analyzer against a project that already has a +compilation database: + + $ analyze-build + +Use `--help` to learn more about the commands. + + +Limitations +----------- + +Generally speaking, the `intercept-build` and `analyze-build` tools together +do the same job as `scan-build`. So you can expect the same output +from this line as from a plain `scan-build` run: + + $ intercept-build && analyze-build + +The major difference is how and when the analyzer is run. The `scan-build` +tool has three distinct modes for running the analyzer: + +1. Use compiler wrappers to make actions. + The compiler wrappers run the real compiler and the analyzer. + This is the default behaviour; it can be enforced with the `--override-compiler` + flag. + +2. Use a special library to intercept compiler calls during the build process. + The analyzer runs against each module after the build has finished. + Use the `--intercept-first` flag to get this mode. + +3. 
Use compiler wrappers to intercept compiler calls during the build process. + The analyzer runs against each module after the build has finished. + Use the `--intercept-first` and `--override-compiler` flags together to get + this mode. + +Modes 1 and 3 use compiler wrappers, which work only if the build +process respects the `CC` and `CXX` environment variables. (Some build +processes accept these variables only as command line parameters. In that case +you need to pass the compiler wrappers manually, e.g. `intercept-build +--override-compiler make CC=intercept-cc CXX=intercept-c++ all` where the +original build command would have been just `make all`.) + +Mode 1 runs the analyzer right after the real compilation. So, if the build +process removes intermediate modules (generated sources), the analyzer +output is still kept. + +Modes 2 and 3 generate the compilation database first, and filter out those +modules which no longer exist. So they are suitable for incremental analysis during +development. + +Mode 2 is available only on FreeBSD and Linux, where library preload +is available from the dynamic loader. It is not supported on OS X (unless the System +Integrity Protection feature is turned off). + +The `intercept-build` command uses only modes 2 and 3 to generate the +compilation database. `analyze-build` only runs the analyzer against the +captured compiler calls. + + +Known problems +-------------- + +Because it uses the `LD_PRELOAD` or `DYLD_INSERT_LIBRARIES` environment variables, +it does not append to them, but overrides them. So builds which use these +variables might not work. (I don't know any build tool which does that, but +please let me know if you do.) + + +Problem reports +--------------- + +If you find a bug in this documentation or elsewhere in the program or would +like to propose an improvement, please use the project's [issue tracker][3]. +Please describe the bug and where you found it. 
If you have a suggestion +how to fix it, include that as well. Patches are also welcome. + + +License +------- + +The project is licensed under University of Illinois/NCSA Open Source License. +See LICENSE.TXT for details. + + [1]: http://clang.llvm.org/docs/JSONCompilationDatabase.html + [2]: https://pypi.python.org/pypi/scan-build + [3]: https://llvm.org/bugs/enter_bug.cgi?product=clang Index: tools/scan-build-py/bin/analyze-build =================================================================== --- /dev/null +++ tools/scan-build-py/bin/analyze-build @@ -0,0 +1,17 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. + +import multiprocessing +multiprocessing.freeze_support() + +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.analyze import analyze_build_main +sys.exit(analyze_build_main(this_dir, False)) Index: tools/scan-build-py/bin/analyze-c++ =================================================================== --- /dev/null +++ tools/scan-build-py/bin/analyze-c++ @@ -0,0 +1,14 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. 
+ +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.analyze import analyze_build_wrapper +sys.exit(analyze_build_wrapper(True)) Index: tools/scan-build-py/bin/analyze-cc =================================================================== --- /dev/null +++ tools/scan-build-py/bin/analyze-cc @@ -0,0 +1,14 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. + +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.analyze import analyze_build_wrapper +sys.exit(analyze_build_wrapper(False)) Index: tools/scan-build-py/bin/intercept-build =================================================================== --- /dev/null +++ tools/scan-build-py/bin/intercept-build @@ -0,0 +1,17 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. + +import multiprocessing +multiprocessing.freeze_support() + +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.intercept import intercept_build_main +sys.exit(intercept_build_main(this_dir)) Index: tools/scan-build-py/bin/intercept-c++ =================================================================== --- /dev/null +++ tools/scan-build-py/bin/intercept-c++ @@ -0,0 +1,14 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. 
+ +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.intercept import intercept_build_wrapper +sys.exit(intercept_build_wrapper(True)) Index: tools/scan-build-py/bin/intercept-cc =================================================================== --- /dev/null +++ tools/scan-build-py/bin/intercept-cc @@ -0,0 +1,14 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. + +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.intercept import intercept_build_wrapper +sys.exit(intercept_build_wrapper(False)) Index: tools/scan-build-py/bin/scan-build =================================================================== --- /dev/null +++ tools/scan-build-py/bin/scan-build @@ -0,0 +1,17 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. + +import multiprocessing +multiprocessing.freeze_support() + +import sys +import os.path +this_dir = os.path.dirname(os.path.realpath(__file__)) +sys.path.append(os.path.dirname(this_dir)) + +from libscanbuild.analyze import analyze_build_main +sys.exit(analyze_build_main(this_dir, True)) Index: tools/scan-build-py/libear/__init__.py =================================================================== --- /dev/null +++ tools/scan-build-py/libear/__init__.py @@ -0,0 +1,260 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" This module compiles the intercept library. 
""" + +import sys +import os +import os.path +import re +import tempfile +import shutil +import contextlib +import logging + +__all__ = ['build_libear'] + + +def build_libear(compiler, dst_dir): + """ Returns the full path to the 'libear' library. """ + + try: + src_dir = os.path.dirname(os.path.realpath(__file__)) + toolset = make_toolset(src_dir) + toolset.set_compiler(compiler) + toolset.set_language_standard('c99') + toolset.add_definitions(['-D_GNU_SOURCE']) + + configure = do_configure(toolset) + configure.check_function_exists('execve', 'HAVE_EXECVE') + configure.check_function_exists('execv', 'HAVE_EXECV') + configure.check_function_exists('execvpe', 'HAVE_EXECVPE') + configure.check_function_exists('execvp', 'HAVE_EXECVP') + configure.check_function_exists('execvP', 'HAVE_EXECVP2') + configure.check_function_exists('exect', 'HAVE_EXECT') + configure.check_function_exists('execl', 'HAVE_EXECL') + configure.check_function_exists('execlp', 'HAVE_EXECLP') + configure.check_function_exists('execle', 'HAVE_EXECLE') + configure.check_function_exists('posix_spawn', 'HAVE_POSIX_SPAWN') + configure.check_function_exists('posix_spawnp', 'HAVE_POSIX_SPAWNP') + configure.check_symbol_exists('_NSGetEnviron', 'crt_externs.h', + 'HAVE_NSGETENVIRON') + configure.write_by_template( + os.path.join(src_dir, 'config.h.in'), + os.path.join(dst_dir, 'config.h')) + + target = create_shared_library('ear', toolset) + target.add_include(dst_dir) + target.add_sources('ear.c') + target.link_against(toolset.dl_libraries()) + target.link_against(['pthread']) + target.build_release(dst_dir) + + return os.path.join(dst_dir, target.name) + + except Exception: + logging.info("Could not build interception library.", exc_info=True) + return None + + +def execute(cmd, *args, **kwargs): + """ Make subprocess execution silent. 
""" + + import subprocess + kwargs.update({'stdout': subprocess.PIPE, 'stderr': subprocess.STDOUT}) + return subprocess.check_call(cmd, *args, **kwargs) + + +@contextlib.contextmanager +def TemporaryDirectory(**kwargs): + name = tempfile.mkdtemp(**kwargs) + try: + yield name + finally: + shutil.rmtree(name) + + +class Toolset(object): + """ Abstract class to represent different toolset. """ + + def __init__(self, src_dir): + self.src_dir = src_dir + self.compiler = None + self.c_flags = [] + + def set_compiler(self, compiler): + """ part of public interface """ + self.compiler = compiler + + def set_language_standard(self, standard): + """ part of public interface """ + self.c_flags.append('-std=' + standard) + + def add_definitions(self, defines): + """ part of public interface """ + self.c_flags.extend(defines) + + def dl_libraries(self): + raise NotImplementedError() + + def shared_library_name(self, name): + raise NotImplementedError() + + def shared_library_c_flags(self, release): + extra = ['-DNDEBUG', '-O3'] if release else [] + return extra + ['-fPIC'] + self.c_flags + + def shared_library_ld_flags(self, release, name): + raise NotImplementedError() + + +class DarwinToolset(Toolset): + def __init__(self, src_dir): + Toolset.__init__(self, src_dir) + + def dl_libraries(self): + return [] + + def shared_library_name(self, name): + return 'lib' + name + '.dylib' + + def shared_library_ld_flags(self, release, name): + extra = ['-dead_strip'] if release else [] + return extra + ['-dynamiclib', '-install_name', '@rpath/' + name] + + +class UnixToolset(Toolset): + def __init__(self, src_dir): + Toolset.__init__(self, src_dir) + + def dl_libraries(self): + return [] + + def shared_library_name(self, name): + return 'lib' + name + '.so' + + def shared_library_ld_flags(self, release, name): + extra = [] if release else [] + return extra + ['-shared', '-Wl,-soname,' + name] + + +class LinuxToolset(UnixToolset): + def __init__(self, src_dir): + 
UnixToolset.__init__(self, src_dir) + + def dl_libraries(self): + return ['dl'] + + +def make_toolset(src_dir): + platform = sys.platform + if platform in {'win32', 'cygwin'}: + raise RuntimeError('not implemented on this platform') + elif platform == 'darwin': + return DarwinToolset(src_dir) + elif platform in {'linux', 'linux2'}: + return LinuxToolset(src_dir) + else: + return UnixToolset(src_dir) + + +class Configure(object): + def __init__(self, toolset): + self.ctx = toolset + self.results = {'APPLE': sys.platform == 'darwin'} + + def _try_to_compile_and_link(self, source): + try: + with TemporaryDirectory() as work_dir: + src_file = 'check.c' + with open(os.path.join(work_dir, src_file), 'w') as handle: + handle.write(source) + + execute([self.ctx.compiler, src_file] + self.ctx.c_flags, + cwd=work_dir) + return True + except Exception: + return False + + def check_function_exists(self, function, name): + template = "int FUNCTION(); int main() { return FUNCTION(); }" + source = template.replace("FUNCTION", function) + + logging.debug('Checking function %s', function) + found = self._try_to_compile_and_link(source) + logging.debug('Checking function %s -- %s', function, + 'found' if found else 'not found') + self.results.update({name: found}) + + def check_symbol_exists(self, symbol, include, name): + template = """#include <INCLUDE> + int main() { return ((int*)(&SYMBOL))[0]; }""" + source = template.replace('INCLUDE', include).replace("SYMBOL", symbol) + + logging.debug('Checking symbol %s', symbol) + found = self._try_to_compile_and_link(source) + logging.debug('Checking symbol %s -- %s', symbol, + 'found' if found else 'not found') + self.results.update({name: found}) + + def write_by_template(self, template, output): + def transform(line, definitions): + + pattern = re.compile(r'^#cmakedefine\s+(\S+)') + m = pattern.match(line) + if m: + key = m.group(1) + if key not in definitions or not definitions[key]: + return '/* #undef {} */\n'.format(key) + else: + return 
'#define {}\n'.format(key) + return line + + with open(template, 'r') as src_handle: + logging.debug('Writing config to %s', output) + with open(output, 'w') as dst_handle: + for line in src_handle: + dst_handle.write(transform(line, self.results)) + + +def do_configure(toolset): + return Configure(toolset) + + +class SharedLibrary(object): + def __init__(self, name, toolset): + self.name = toolset.shared_library_name(name) + self.ctx = toolset + self.inc = [] + self.src = [] + self.lib = [] + + def add_include(self, directory): + self.inc.extend(['-I', directory]) + + def add_sources(self, source): + self.src.append(source) + + def link_against(self, libraries): + self.lib.extend(['-l' + lib for lib in libraries]) + + def build_release(self, directory): + for src in self.src: + logging.debug('Compiling %s', src) + execute( + [self.ctx.compiler, '-c', os.path.join(self.ctx.src_dir, src), + '-o', src + '.o'] + self.inc + + self.ctx.shared_library_c_flags(True), + cwd=directory) + logging.debug('Linking %s', self.name) + execute( + [self.ctx.compiler] + [src + '.o' for src in self.src] + + ['-o', self.name] + self.lib + + self.ctx.shared_library_ld_flags(True, self.name), + cwd=directory) + + +def create_shared_library(name, toolset): + return SharedLibrary(name, toolset) Index: tools/scan-build-py/libear/config.h.in =================================================================== --- /dev/null +++ tools/scan-build-py/libear/config.h.in @@ -0,0 +1,23 @@ +/* -*- coding: utf-8 -*- +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. 
+*/ + +#pragma once + +#cmakedefine HAVE_EXECVE +#cmakedefine HAVE_EXECV +#cmakedefine HAVE_EXECVPE +#cmakedefine HAVE_EXECVP +#cmakedefine HAVE_EXECVP2 +#cmakedefine HAVE_EXECT +#cmakedefine HAVE_EXECL +#cmakedefine HAVE_EXECLP +#cmakedefine HAVE_EXECLE +#cmakedefine HAVE_POSIX_SPAWN +#cmakedefine HAVE_POSIX_SPAWNP +#cmakedefine HAVE_NSGETENVIRON + +#cmakedefine APPLE Index: tools/scan-build-py/libear/ear.c =================================================================== --- /dev/null +++ tools/scan-build-py/libear/ear.c @@ -0,0 +1,605 @@ +/* -*- coding: utf-8 -*- +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +*/ + +/** + * This file implements a shared library. This library can be pre-loaded by + * the dynamic linker of the Operating System (OS). It implements a few functions + * related to process creation. By pre-loading this library, the executed process + * uses these functions instead of those from the standard library. + * + * The idea here is to inject some logic before calling the real methods. The logic is + * to dump the call into a file. To call the real method, this library does + * the job of the dynamic linker. + * + * The only input for the log writing is the destination directory. + * This is passed as an environment variable. 
+ */ + +#include "config.h" + +#include <stddef.h> +#include <stdarg.h> +#include <stdlib.h> +#include <stdio.h> +#include <string.h> +#include <unistd.h> +#include <dlfcn.h> +#include <pthread.h> + +#if defined HAVE_POSIX_SPAWN || defined HAVE_POSIX_SPAWNP +#include <spawn.h> +#endif + +#if defined HAVE_NSGETENVIRON +# include <crt_externs.h> +#else +extern char **environ; +#endif + +#define ENV_OUTPUT "INTERCEPT_BUILD_TARGET_DIR" +#ifdef APPLE +# define ENV_FLAT "DYLD_FORCE_FLAT_NAMESPACE" +# define ENV_PRELOAD "DYLD_INSERT_LIBRARIES" +# define ENV_SIZE 3 +#else +# define ENV_PRELOAD "LD_PRELOAD" +# define ENV_SIZE 2 +#endif + +#define DLSYM(TYPE_, VAR_, SYMBOL_) \ + union { \ + void *from; \ + TYPE_ to; \ + } cast; \ + if (0 == (cast.from = dlsym(RTLD_NEXT, SYMBOL_))) { \ + perror("bear: dlsym"); \ + exit(EXIT_FAILURE); \ + } \ + TYPE_ const VAR_ = cast.to; + + +typedef char const * bear_env_t[ENV_SIZE]; + +static int bear_capture_env_t(bear_env_t *env); +static int bear_reset_env_t(bear_env_t *env); +static void bear_release_env_t(bear_env_t *env); +static char const **bear_update_environment(char *const envp[], bear_env_t *env); +static char const **bear_update_environ(char const **in, char const *key, char const *value); +static char **bear_get_environment(); +static void bear_report_call(char const *fun, char const *const argv[]); +static char const **bear_strings_build(char const *arg, va_list *ap); +static char const **bear_strings_copy(char const **const in); +static char const **bear_strings_append(char const **in, char const *e); +static size_t bear_strings_length(char const *const *in); +static void bear_strings_release(char const **); + + +static bear_env_t env_names = + { ENV_OUTPUT + , ENV_PRELOAD +#ifdef ENV_FLAT + , ENV_FLAT +#endif + }; + +static bear_env_t initial_env = + { 0 + , 0 +#ifdef ENV_FLAT + , 0 +#endif + }; + +static int initialized = 0; +static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; + +static void on_load(void) __attribute__((constructor)); +static void on_unload(void) __attribute__((destructor)); + + +#ifdef HAVE_EXECVE +static int 
call_execve(const char *path, char *const argv[], + char *const envp[]); +#endif +#ifdef HAVE_EXECVP +static int call_execvp(const char *file, char *const argv[]); +#endif +#ifdef HAVE_EXECVPE +static int call_execvpe(const char *file, char *const argv[], + char *const envp[]); +#endif +#ifdef HAVE_EXECVP2 +static int call_execvP(const char *file, const char *search_path, + char *const argv[]); +#endif +#ifdef HAVE_EXECT +static int call_exect(const char *path, char *const argv[], + char *const envp[]); +#endif +#ifdef HAVE_POSIX_SPAWN +static int call_posix_spawn(pid_t *restrict pid, const char *restrict path, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], + char *const envp[restrict]); +#endif +#ifdef HAVE_POSIX_SPAWNP +static int call_posix_spawnp(pid_t *restrict pid, const char *restrict file, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], + char *const envp[restrict]); +#endif + + +/* Initialization method to capture the relevant environment variables. + */ + +static void on_load(void) { + pthread_mutex_lock(&mutex); + if (!initialized) + initialized = bear_capture_env_t(&initial_env); + pthread_mutex_unlock(&mutex); +} + +static void on_unload(void) { + pthread_mutex_lock(&mutex); + bear_release_env_t(&initial_env); + initialized = 0; + pthread_mutex_unlock(&mutex); +} + + +/* These are the methods we are trying to hijack. 
+ */ + +#ifdef HAVE_EXECVE +int execve(const char *path, char *const argv[], char *const envp[]) { + bear_report_call(__func__, (char const *const *)argv); + return call_execve(path, argv, envp); +} +#endif + +#ifdef HAVE_EXECV +#ifndef HAVE_EXECVE +#error can not implement execv without execve +#endif +int execv(const char *path, char *const argv[]) { + bear_report_call(__func__, (char const *const *)argv); + char * const * envp = bear_get_environment(); + return call_execve(path, argv, envp); +} +#endif + +#ifdef HAVE_EXECVPE +int execvpe(const char *file, char *const argv[], char *const envp[]) { + bear_report_call(__func__, (char const *const *)argv); + return call_execvpe(file, argv, envp); +} +#endif + +#ifdef HAVE_EXECVP +int execvp(const char *file, char *const argv[]) { + bear_report_call(__func__, (char const *const *)argv); + return call_execvp(file, argv); +} +#endif + +#ifdef HAVE_EXECVP2 +int execvP(const char *file, const char *search_path, char *const argv[]) { + bear_report_call(__func__, (char const *const *)argv); + return call_execvP(file, search_path, argv); +} +#endif + +#ifdef HAVE_EXECT +int exect(const char *path, char *const argv[], char *const envp[]) { + bear_report_call(__func__, (char const *const *)argv); + return call_exect(path, argv, envp); +} +#endif + +#ifdef HAVE_EXECL +# ifndef HAVE_EXECVE +# error can not implement execl without execve +# endif +int execl(const char *path, const char *arg, ...) { + va_list args; + va_start(args, arg); + char const **argv = bear_strings_build(arg, &args); + va_end(args); + + bear_report_call(__func__, (char const *const *)argv); + char * const * envp = bear_get_environment(); + int const result = call_execve(path, (char *const *)argv, envp); + + bear_strings_release(argv); + return result; +} +#endif + +#ifdef HAVE_EXECLP +# ifndef HAVE_EXECVP +# error can not implement execlp without execvp +# endif +int execlp(const char *file, const char *arg, ...) 
{ + va_list args; + va_start(args, arg); + char const **argv = bear_strings_build(arg, &args); + va_end(args); + + bear_report_call(__func__, (char const *const *)argv); + int const result = call_execvp(file, (char *const *)argv); + + bear_strings_release(argv); + return result; +} +#endif + +#ifdef HAVE_EXECLE +# ifndef HAVE_EXECVE +# error can not implement execle without execve +# endif +// int execle(const char *path, const char *arg, ..., char * const envp[]); +int execle(const char *path, const char *arg, ...) { + va_list args; + va_start(args, arg); + char const **argv = bear_strings_build(arg, &args); + char const **envp = va_arg(args, char const **); + va_end(args); + + bear_report_call(__func__, (char const *const *)argv); + int const result = + call_execve(path, (char *const *)argv, (char *const *)envp); + + bear_strings_release(argv); + return result; +} +#endif + +#ifdef HAVE_POSIX_SPAWN +int posix_spawn(pid_t *restrict pid, const char *restrict path, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], char *const envp[restrict]) { + bear_report_call(__func__, (char const *const *)argv); + return call_posix_spawn(pid, path, file_actions, attrp, argv, envp); +} +#endif + +#ifdef HAVE_POSIX_SPAWNP +int posix_spawnp(pid_t *restrict pid, const char *restrict file, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], char *const envp[restrict]) { + bear_report_call(__func__, (char const *const *)argv); + return call_posix_spawnp(pid, file, file_actions, attrp, argv, envp); +} +#endif + +/* These are the methods which forward the call to the standard implementation. 
+ */ + +#ifdef HAVE_EXECVE +static int call_execve(const char *path, char *const argv[], + char *const envp[]) { + typedef int (*func)(const char *, char *const *, char *const *); + + DLSYM(func, fp, "execve"); + + char const **const menvp = bear_update_environment(envp, &initial_env); + int const result = (*fp)(path, argv, (char *const *)menvp); + bear_strings_release(menvp); + return result; +} +#endif + +#ifdef HAVE_EXECVPE +static int call_execvpe(const char *file, char *const argv[], + char *const envp[]) { + typedef int (*func)(const char *, char *const *, char *const *); + + DLSYM(func, fp, "execvpe"); + + char const **const menvp = bear_update_environment(envp, &initial_env); + int const result = (*fp)(file, argv, (char *const *)menvp); + bear_strings_release(menvp); + return result; +} +#endif + +#ifdef HAVE_EXECVP +static int call_execvp(const char *file, char *const argv[]) { + typedef int (*func)(const char *file, char *const argv[]); + + DLSYM(func, fp, "execvp"); + + bear_env_t current_env; + bear_capture_env_t(&current_env); + bear_reset_env_t(&initial_env); + int const result = (*fp)(file, argv); + bear_reset_env_t(&current_env); + bear_release_env_t(&current_env); + + return result; +} +#endif + +#ifdef HAVE_EXECVP2 +static int call_execvP(const char *file, const char *search_path, + char *const argv[]) { + typedef int (*func)(const char *, const char *, char *const *); + + DLSYM(func, fp, "execvP"); + + bear_env_t current_env; + bear_capture_env_t(&current_env); + bear_reset_env_t(&initial_env); + int const result = (*fp)(file, search_path, argv); + bear_reset_env_t(&current_env); + bear_release_env_t(&current_env); + + return result; +} +#endif + +#ifdef HAVE_EXECT +static int call_exect(const char *path, char *const argv[], + char *const envp[]) { + typedef int (*func)(const char *, char *const *, char *const *); + + DLSYM(func, fp, "exect"); + + char const **const menvp = bear_update_environment(envp, &initial_env); + int const result = (*fp)(path, argv, (char *const *)menvp); + 
bear_strings_release(menvp); + return result; +} +#endif + +#ifdef HAVE_POSIX_SPAWN +static int call_posix_spawn(pid_t *restrict pid, const char *restrict path, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], + char *const envp[restrict]) { + typedef int (*func)(pid_t *restrict, const char *restrict, + const posix_spawn_file_actions_t *, + const posix_spawnattr_t *restrict, + char *const *restrict, char *const *restrict); + + DLSYM(func, fp, "posix_spawn"); + + char const **const menvp = bear_update_environment(envp, &initial_env); + int const result = + (*fp)(pid, path, file_actions, attrp, argv, (char *const *restrict)menvp); + bear_strings_release(menvp); + return result; +} +#endif + +#ifdef HAVE_POSIX_SPAWNP +static int call_posix_spawnp(pid_t *restrict pid, const char *restrict file, + const posix_spawn_file_actions_t *file_actions, + const posix_spawnattr_t *restrict attrp, + char *const argv[restrict], + char *const envp[restrict]) { + typedef int (*func)(pid_t *restrict, const char *restrict, + const posix_spawn_file_actions_t *, + const posix_spawnattr_t *restrict, + char *const *restrict, char *const *restrict); + + DLSYM(func, fp, "posix_spawnp"); + + char const **const menvp = bear_update_environment(envp, &initial_env); + int const result = + (*fp)(pid, file, file_actions, attrp, argv, (char *const *restrict)menvp); + bear_strings_release(menvp); + return result; +} +#endif + +/* this method is to write log about the process creation. 
*/ + +static void bear_report_call(char const *fun, char const *const argv[]) { + static int const GS = 0x1d; + static int const RS = 0x1e; + static int const US = 0x1f; + + if (!initialized) + return; + + pthread_mutex_lock(&mutex); + const char *cwd = getcwd(NULL, 0); + if (0 == cwd) { + perror("bear: getcwd"); + exit(EXIT_FAILURE); + } + char const * const out_dir = initial_env[0]; + size_t const path_max_length = strlen(out_dir) + 32; + char filename[path_max_length]; + if (-1 == snprintf(filename, path_max_length, "%s/%d.cmd", out_dir, getpid())) { + perror("bear: snprintf"); + exit(EXIT_FAILURE); + } + FILE * fd = fopen(filename, "a+"); + if (0 == fd) { + perror("bear: fopen"); + exit(EXIT_FAILURE); + } + fprintf(fd, "%d%c", getpid(), RS); + fprintf(fd, "%d%c", getppid(), RS); + fprintf(fd, "%s%c", fun, RS); + fprintf(fd, "%s%c", cwd, RS); + size_t const argc = bear_strings_length(argv); + for (size_t it = 0; it < argc; ++it) { + fprintf(fd, "%s%c", argv[it], US); + } + fprintf(fd, "%c", GS); + if (fclose(fd)) { + perror("bear: fclose"); + exit(EXIT_FAILURE); + } + free((void *)cwd); + pthread_mutex_unlock(&mutex); +} + +/* updating the environment ensures that child processes will inherit the desired + * behaviour */ + +static int bear_capture_env_t(bear_env_t *env) { + int status = 1; + for (size_t it = 0; it < ENV_SIZE; ++it) { + char const * const env_value = getenv(env_names[it]); + char const * const env_copy = (env_value) ? strdup(env_value) : env_value; + (*env)[it] = env_copy; + status &= (env_copy) ? 
1 : 0; + } + return status; +} + +static int bear_reset_env_t(bear_env_t *env) { + int status = 1; + for (size_t it = 0; it < ENV_SIZE; ++it) { + if ((*env)[it]) { + setenv(env_names[it], (*env)[it], 1); + } else { + unsetenv(env_names[it]); + } + } + return status; +} + +static void bear_release_env_t(bear_env_t *env) { + for (size_t it = 0; it < ENV_SIZE; ++it) { + free((void *)(*env)[it]); + (*env)[it] = 0; + } +} + +static char const **bear_update_environment(char *const envp[], bear_env_t *env) { + char const **result = bear_strings_copy((char const **)envp); + for (size_t it = 0; it < ENV_SIZE && (*env)[it]; ++it) + result = bear_update_environ(result, env_names[it], (*env)[it]); + return result; +} + +static char const **bear_update_environ(char const *envs[], char const *key, char const * const value) { + // find the key if it's there + size_t const key_length = strlen(key); + char const **it = envs; + for (; (it) && (*it); ++it) { + if ((0 == strncmp(*it, key, key_length)) && + (strlen(*it) > key_length) && ('=' == (*it)[key_length])) + break; + } + // allocate an environment entry + size_t const value_length = strlen(value); + size_t const env_length = key_length + value_length + 2; + char *env = malloc(env_length); + if (0 == env) { + perror("bear: malloc [in env_update]"); + exit(EXIT_FAILURE); + } + if (-1 == snprintf(env, env_length, "%s=%s", key, value)) { + perror("bear: snprintf"); + exit(EXIT_FAILURE); + } + // replace or append the environment entry + if (it && *it) { + free((void *)*it); + *it = env; + return envs; + } + return bear_strings_append(envs, env); +} + +static char **bear_get_environment() { +#if defined HAVE_NSGETENVIRON + return *_NSGetEnviron(); +#else + return environ; +#endif +} + +/* Utility methods to deal with string arrays. Environment and process arguments + * are both represented as string arrays. 
*/ + +static char const **bear_strings_build(char const *const arg, va_list *args) { + char const **result = 0; + size_t size = 0; + for (char const *it = arg; it; it = va_arg(*args, char const *)) { + result = realloc(result, (size + 1) * sizeof(char const *)); + if (0 == result) { + perror("bear: realloc"); + exit(EXIT_FAILURE); + } + char const *copy = strdup(it); + if (0 == copy) { + perror("bear: strdup"); + exit(EXIT_FAILURE); + } + result[size++] = copy; + } + result = realloc(result, (size + 1) * sizeof(char const *)); + if (0 == result) { + perror("bear: realloc"); + exit(EXIT_FAILURE); + } + result[size++] = 0; + + return result; +} + +static char const **bear_strings_copy(char const **const in) { + size_t const size = bear_strings_length(in); + + char const **const result = malloc((size + 1) * sizeof(char const *)); + if (0 == result) { + perror("bear: malloc"); + exit(EXIT_FAILURE); + } + + char const **out_it = result; + for (char const *const *in_it = in; (in_it) && (*in_it); + ++in_it, ++out_it) { + *out_it = strdup(*in_it); + if (0 == *out_it) { + perror("bear: strdup"); + exit(EXIT_FAILURE); + } + } + *out_it = 0; + return result; +} + +static char const **bear_strings_append(char const **const in, + char const *const e) { + size_t size = bear_strings_length(in); + char const **result = realloc(in, (size + 2) * sizeof(char const *)); + if (0 == result) { + perror("bear: realloc"); + exit(EXIT_FAILURE); + } + result[size++] = e; + result[size++] = 0; + return result; +} + +static size_t bear_strings_length(char const *const *const in) { + size_t result = 0; + for (char const *const *it = in; (it) && (*it); ++it) + ++result; + return result; +} + +static void bear_strings_release(char const **in) { + for (char const *const *it = in; (it) && (*it); ++it) { + free((void *)*it); + } + free((void *)in); +} Index: tools/scan-build-py/libscanbuild/__init__.py =================================================================== --- /dev/null +++ 
tools/scan-build-py/libscanbuild/__init__.py @@ -0,0 +1,82 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" +This module is responsible for running the Clang static analyzer against any +build and generating reports. +""" + + +def duplicate_check(method): + """ Predicate to detect duplicated entries. + + A unique hash method can be used to detect duplicates. Entries are + represented as dictionaries, which have no default hash method. + This implementation uses a set datatype to store the unique hash values. + + This method returns a method which can detect the duplicate values. """ + + def predicate(entry): + entry_hash = predicate.unique(entry) + if entry_hash not in predicate.state: + predicate.state.add(entry_hash) + return False + return True + + predicate.unique = method + predicate.state = set() + return predicate + + +def tempdir(): + """ Return the default temporary directory. """ + + from os import getenv + return getenv('TMPDIR', getenv('TEMP', getenv('TMP', '/tmp'))) + + +def initialize_logging(verbose_level): + """ Output content controlled by the verbosity level. """ + + import sys + import os.path + import logging + level = logging.WARNING - min(logging.WARNING, (10 * verbose_level)) + + if verbose_level <= 3: + fmt_string = '{0}: %(levelname)s: %(message)s' + else: + fmt_string = '{0}: %(levelname)s: %(funcName)s: %(message)s' + + program = os.path.basename(sys.argv[0]) + logging.basicConfig(format=fmt_string.format(program), level=level) + + +def command_entry_point(function): + """ Decorator for command entry points. 
""" + + import functools + import logging + + @functools.wraps(function) + def wrapper(*args, **kwargs): + + exit_code = 127 + try: + exit_code = function(*args, **kwargs) + except KeyboardInterrupt: + logging.warning('Keyboard interupt') + except Exception: + logging.exception('Internal error.') + if logging.getLogger().isEnabledFor(logging.DEBUG): + logging.error("Please report this bug and attach the output " + "to the bug report") + else: + logging.error("Please run this command again and turn on " + "verbose mode (add '-vvv' as argument).") + finally: + return exit_code + + return wrapper Index: tools/scan-build-py/libscanbuild/analyze.py =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/analyze.py @@ -0,0 +1,502 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" This module implements the 'scan-build' command API. + +To run the static analyzer against a build is done in multiple steps: + + -- Intercept: capture the compilation command during the build, + -- Analyze: run the analyzer against the captured commands, + -- Report: create a cover report from the analyzer outputs. 
""" + +import sys +import re +import os +import os.path +import json +import argparse +import logging +import subprocess +import multiprocessing +from libscanbuild import initialize_logging, tempdir, command_entry_point +from libscanbuild.runner import run +from libscanbuild.intercept import capture +from libscanbuild.report import report_directory, document +from libscanbuild.clang import get_checkers +from libscanbuild.runner import action_check +from libscanbuild.command import classify_parameters, classify_source + +__all__ = ['analyze_build_main', 'analyze_build_wrapper'] + +COMPILER_WRAPPER_CC = 'analyze-cc' +COMPILER_WRAPPER_CXX = 'analyze-c++' + + +@command_entry_point +def analyze_build_main(bin_dir, from_build_command): + """ Entry point for 'analyze-build' and 'scan-build'. """ + + parser = create_parser(from_build_command) + args = parser.parse_args() + validate(parser, args, from_build_command) + + # setup logging + initialize_logging(args.verbose) + logging.debug('Parsed arguments: %s', args) + + with report_directory(args.output, args.keep_empty) as target_dir: + if not from_build_command: + # run analyzer only and generate cover report + run_analyzer(args, target_dir) + number_of_bugs = document(args, target_dir, True) + return number_of_bugs if args.status_bugs else 0 + elif args.intercept_first: + # run build command and capture compiler executions + exit_code = capture(args, bin_dir) + # next step to run the analyzer against the captured commands + if need_analyzer(args.build): + run_analyzer(args, target_dir) + # cover report generation and bug counting + number_of_bugs = document(args, target_dir, True) + # remove the compilation database when it was not requested + if os.path.exists(args.cdb): + os.unlink(args.cdb) + # set exit status as it was requested + return number_of_bugs if args.status_bugs else exit_code + else: + return exit_code + else: + # run the build command with compiler wrappers which + # execute the analyzer too. 
(interposition) + environment = setup_environment(args, target_dir, bin_dir) + logging.debug('run build in environment: %s', environment) + exit_code = subprocess.call(args.build, env=environment) + logging.debug('build finished with exit code: %d', exit_code) + # cover report generation and bug counting + number_of_bugs = document(args, target_dir, False) + # set exit status as it was requested + return number_of_bugs if args.status_bugs else exit_code + + +def need_analyzer(args): + """ Check the intent of the build command. + + When the static analyzer is run against the project's configure step, + it should be silent; there is no need to run the analyzer or generate + a report. + + Running `scan-build` against the configure step might be necessary + when compiler wrappers are used. That's the moment when the build setup + checks the compiler and captures its location for the build process. """ + + return len(args) and not re.search('configure|autogen', args[0]) + + +def run_analyzer(args, output_dir): + """ Runs the analyzer against the given compilation database. """ + + def exclude(filename): + """ Return true when any excluded directory prefixes the filename. 
""" + return any(re.match(r'^' + directory, filename) + for directory in args.excludes) + + consts = { + 'clang': args.clang, + 'output_dir': output_dir, + 'output_format': args.output_format, + 'output_failures': args.output_failures, + 'direct_args': analyzer_params(args) + } + + logging.debug('run analyzer against compilation database') + with open(args.cdb, 'r') as handle: + generator = (dict(cmd, **consts) + for cmd in json.load(handle) if not exclude(cmd['file'])) + # when verbose output requested execute sequentially + pool = multiprocessing.Pool(1 if args.verbose > 2 else None) + for current in pool.imap_unordered(run, generator): + if current is not None: + # display error message from the static analyzer + for line in current['error_output']: + logging.info(line.rstrip()) + pool.close() + pool.join() + + +def setup_environment(args, destination, bin_dir): + """ Set up environment for build command to interpose compiler wrapper. """ + + environment = dict(os.environ) + environment.update({ + 'CC': os.path.join(bin_dir, COMPILER_WRAPPER_CC), + 'CXX': os.path.join(bin_dir, COMPILER_WRAPPER_CXX), + 'ANALYZE_BUILD_CC': args.cc, + 'ANALYZE_BUILD_CXX': args.cxx, + 'ANALYZE_BUILD_CLANG': args.clang if need_analyzer(args.build) else '', + 'ANALYZE_BUILD_VERBOSE': 'DEBUG' if args.verbose > 2 else 'WARNING', + 'ANALYZE_BUILD_REPORT_DIR': destination, + 'ANALYZE_BUILD_REPORT_FORMAT': args.output_format, + 'ANALYZE_BUILD_REPORT_FAILURES': 'yes' if args.output_failures else '', + 'ANALYZE_BUILD_PARAMETERS': ' '.join(analyzer_params(args)) + }) + return environment + + +def analyze_build_wrapper(cplusplus): + """ Entry point for `analyze-cc` and `analyze-c++` compiler wrappers. 
""" + + # initialize wrapper logging + logging.basicConfig(format='analyze: %(levelname)s: %(message)s', + level=os.getenv('ANALYZE_BUILD_VERBOSE', 'INFO')) + # execute with real compiler + compiler = os.getenv('ANALYZE_BUILD_CXX', 'c++') if cplusplus \ + else os.getenv('ANALYZE_BUILD_CC', 'cc') + compilation = [compiler] + sys.argv[1:] + logging.info('execute compiler: %s', compilation) + result = subprocess.call(compilation) + # exit when it fails, ... + if result or not os.getenv('ANALYZE_BUILD_CLANG'): + return result + # ... and run the analyzer if all went well. + try: + # collect the needed parameters from environment, crash when missing + consts = { + 'clang': os.getenv('ANALYZE_BUILD_CLANG'), + 'output_dir': os.getenv('ANALYZE_BUILD_REPORT_DIR'), + 'output_format': os.getenv('ANALYZE_BUILD_REPORT_FORMAT'), + 'output_failures': os.getenv('ANALYZE_BUILD_REPORT_FAILURES'), + 'direct_args': os.getenv('ANALYZE_BUILD_PARAMETERS', + '').split(' '), + 'directory': os.getcwd(), + } + # get relevant parameters from command line arguments + args = classify_parameters(sys.argv) + filenames = args.pop('files', []) + for filename in (name for name in filenames if classify_source(name)): + parameters = dict(args, file=filename, **consts) + logging.debug('analyzer parameters %s', parameters) + current = action_check(parameters) + # display error message from the static analyzer + if current is not None: + for line in current['error_output']: + logging.info(line.rstrip()) + except Exception: + logging.exception("run analyzer inside compiler wrapper failed.") + return 0 + + +def analyzer_params(args): + """ A group of command line arguments can mapped to command + line arguments of the analyzer. This method generates those. """ + + def prefix_with(constant, pieces): + """ From a sequence create another sequence where every second element + is from the original sequence and the odd elements are the prefix. 
+ + eg.: prefix_with(0, [1,2,3]) creates [0, 1, 0, 2, 0, 3] """ + + return [elem for piece in pieces for elem in [constant, piece]] + + result = [] + + if args.store_model: + result.append('-analyzer-store={0}'.format(args.store_model)) + if args.constraints_model: + result.append( + '-analyzer-constraints={0}'.format(args.constraints_model)) + if args.internal_stats: + result.append('-analyzer-stats') + if args.analyze_headers: + result.append('-analyzer-opt-analyze-headers') + if args.stats: + result.append('-analyzer-checker=debug.Stats') + if args.maxloop: + result.extend(['-analyzer-max-loop', str(args.maxloop)]) + if args.output_format: + result.append('-analyzer-output={0}'.format(args.output_format)) + if args.analyzer_config: + result.append(args.analyzer_config) + if args.verbose >= 4: + result.append('-analyzer-display-progress') + if args.plugins: + result.extend(prefix_with('-load', args.plugins)) + if args.enable_checker: + checkers = ','.join(args.enable_checker) + result.extend(['-analyzer-checker', checkers]) + if args.disable_checker: + checkers = ','.join(args.disable_checker) + result.extend(['-analyzer-disable-checker', checkers]) + if os.getenv('UBIVIZ'): + result.append('-analyzer-viz-egraph-ubigraph') + + return prefix_with('-Xclang', result) + + +def print_active_checkers(checkers): + """ Print active checkers to stdout. """ + + for name in sorted(name for name, (_, active) in checkers.items() + if active): + print(name) + + +def print_checkers(checkers): + """ Print verbose checker help to stdout. 
""" + + print('') + print('available checkers:') + print('') + for name in sorted(checkers.keys()): + description, active = checkers[name] + prefix = '+' if active else ' ' + if len(name) > 30: + print(' {0} {1}'.format(prefix, name)) + print(' ' * 35 + description) + else: + print(' {0} {1: <30} {2}'.format(prefix, name, description)) + print('') + print('NOTE: "+" indicates that an analysis is enabled by default.') + print('') + + +def validate(parser, args, from_build_command): + """ Validation done by the parser itself, but semantic check still + needs to be done. This method is doing that. """ + + if args.help_checkers_verbose: + print_checkers(get_checkers(args.clang, args.plugins)) + parser.exit() + elif args.help_checkers: + print_active_checkers(get_checkers(args.clang, args.plugins)) + parser.exit() + + if from_build_command and not args.build: + parser.error('missing build command') + + +def create_parser(from_build_command): + """ Command line argument parser factory method. """ + + parser = argparse.ArgumentParser( + formatter_class=argparse.ArgumentDefaultsHelpFormatter) + + parser.add_argument( + '--verbose', '-v', + action='count', + default=0, + help="""Enable verbose output from '%(prog)s'. A second and third + flag increases verbosity.""") + parser.add_argument( + '--override-compiler', + action='store_true', + help="""Always resort to the compiler wrapper even when better + interposition methods are available.""") + parser.add_argument( + '--intercept-first', + action='store_true', + help="""Run the build commands only, build a compilation database, + then run the static analyzer afterwards. + Generally speaking it has better coverage on build commands. + With '--override-compiler' it use compiler wrapper, but does + not run the analyzer till the build is finished. 
""") + parser.add_argument( + '--cdb', + metavar='', + default="compile_commands.json", + help="""The JSON compilation database.""") + + parser.add_argument( + '--output', '-o', + metavar='', + default=tempdir(), + help="""Specifies the output directory for analyzer reports. + Subdirectory will be created if default directory is targeted. + """) + parser.add_argument( + '--status-bugs', + action='store_true', + help="""By default, the exit status of '%(prog)s' is the same as the + executed build command. Specifying this option causes the exit + status of '%(prog)s' to be non zero if it found potential bugs + and zero otherwise.""") + parser.add_argument( + '--html-title', + metavar='', + help="""Specify the title used on generated HTML pages. + If not specified, a default title will be used.""") + parser.add_argument( + '--analyze-headers', + action='store_true', + help="""Also analyze functions in #included files. By default, such + functions are skipped unless they are called by functions + within the main source file.""") + format_group = parser.add_mutually_exclusive_group() + format_group.add_argument( + '--plist', '-plist', + dest='output_format', + const='plist', + default='html', + action='store_const', + help="""This option outputs the results as a set of .plist files.""") + format_group.add_argument( + '--plist-html', '-plist-html', + dest='output_format', + const='plist-html', + default='html', + action='store_const', + help="""This option outputs the results as a set of .html and .plist + files.""") + # TODO: implement '-view ' + + advanced = parser.add_argument_group('advanced options') + advanced.add_argument( + '--keep-empty', + action='store_true', + help="""Don't remove the build results directory even if no issues + were reported.""") + advanced.add_argument( + '--no-failure-reports', '-no-failure-reports', + dest='output_failures', + action='store_false', + help="""Do not create a 'failures' subdirectory that includes analyzer + crash reports and 
preprocessed source files.""") + advanced.add_argument( + '--stats', '-stats', + action='store_true', + help="""Generates visitation statistics for the project being analyzed. + """) + advanced.add_argument( + '--internal-stats', + action='store_true', + help="""Generate internal analyzer statistics.""") + advanced.add_argument( + '--maxloop', '-maxloop', + metavar='<loop count>', + type=int, + help="""Specify the number of times a block can be visited before + giving up. Increase for more comprehensive coverage at a cost + of speed.""") + advanced.add_argument( + '--store', '-store', + metavar='<model>', + dest='store_model', + choices=['region', 'basic'], + help="""Specify the store model used by the analyzer. + 'region' specifies a field-sensitive store model. + 'basic' is far less precise but can analyze code more + quickly. 'basic' was the default store model for + checker-0.221 and earlier.""") + advanced.add_argument( + '--constraints', '-constraints', + metavar='<model>', + dest='constraints_model', + choices=['range', 'basic'], + help="""Specify the constraint engine used by the analyzer. Specifying + 'basic' uses a simpler, less powerful constraint model used by + checker-0.160 and earlier.""") + advanced.add_argument( + '--use-analyzer', + metavar='<path>', + dest='clang', + default='clang', + help="""'%(prog)s' uses the 'clang' executable relative to itself for + static analysis. One can override this behavior with this + option by using the 'clang' packaged with Xcode (on OS X) or + from the PATH.""") + advanced.add_argument( + '--use-cc', + metavar='<path>', + dest='cc', + default='cc', + help="""'%(prog)s' analyzes a project by interposing a "fake + compiler", which executes a real compiler for compilation and + does other tasks (running the static analyzer or just recording + the compiler invocation). Because of this interposing, '%(prog)s' + does not know what compiler your project normally uses. 
+ Instead, it simply overrides the CC environment variable, and + guesses your default compiler. + + If you need '%(prog)s' to use a specific compiler for + *compilation* then you can use this option to specify a path + to that compiler.""") + advanced.add_argument( + '--use-c++', + metavar='<path>', + dest='cxx', + default='c++', + help="""This is the same as "--use-cc" but for C++ code.""") + advanced.add_argument( + '--analyzer-config', '-analyzer-config', + metavar='<options>', + help="""Provide options to pass through to the analyzer's + -analyzer-config flag. Several options are separated with + comma: 'key1=val1,key2=val2' + + Available options: + stable-report-filename=true or false (default) + + Switch the page naming to: + report-<filename>-<function/method name>-<id>.html + instead of report-XXXXXX.html""") + advanced.add_argument( + '--exclude', + metavar='<directory>', + dest='excludes', + action='append', + default=[], + help="""Do not run static analyzer against files found in this + directory. (You can specify this option multiple times.) + This can be useful when the project contains 3rd party + libraries. The directory path shall be an absolute path, like + the file names in the compilation database.""") + + plugins = parser.add_argument_group('checker options') + plugins.add_argument( + '--load-plugin', '-load-plugin', + metavar='<plugin library>', + dest='plugins', + action='append', + help="""Load external checkers using the clang plugin interface.""") + plugins.add_argument( + '--enable-checker', '-enable-checker', + metavar='<checker name>', + action=AppendCommaSeparated, + help="""Enable specific checker.""") + plugins.add_argument( + '--disable-checker', '-disable-checker', + metavar='<checker name>', + action=AppendCommaSeparated, + help="""Disable specific checker.""") + plugins.add_argument( + '--help-checkers', + action='store_true', + help="""A default group of checkers is run unless explicitly disabled. 
+ Exactly which checkers constitute the default group is a + function of the operating system in use. These can be printed + with this flag.""") + plugins.add_argument( + '--help-checkers-verbose', + action='store_true', + help="""Print all available checkers and mark the enabled ones.""") + + if from_build_command: + parser.add_argument( + dest='build', + nargs=argparse.REMAINDER, + help="""Command to run.""") + + return parser + + +class AppendCommaSeparated(argparse.Action): + """ argparse Action class to support multiple comma separated lists. """ + + def __call__(self, __parser, namespace, values, __option_string): + # getattr(obj, attr, default) does not really returns default but none + if getattr(namespace, self.dest, None) is None: + setattr(namespace, self.dest, []) + # once it's fixed we can use as expected + actual = getattr(namespace, self.dest) + actual.extend(values.split(',')) + setattr(namespace, self.dest, actual) Index: tools/scan-build-py/libscanbuild/clang.py =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/clang.py @@ -0,0 +1,156 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" This module is responsible for the Clang executable. + +Since Clang command line interface is so rich, but this project is using only +a subset of that, it makes sense to create a function specific wrapper. """ + +import re +import subprocess +import logging +from libscanbuild.shell import decode + +__all__ = ['get_version', 'get_arguments', 'get_checkers'] + + +def get_version(cmd): + """ Returns the compiler version as string. """ + + lines = subprocess.check_output([cmd, '-v'], stderr=subprocess.STDOUT) + return lines.decode('ascii').splitlines()[0] + + +def get_arguments(command, cwd): + """ Capture Clang invocation. 
+ + This method returns the front-end invocation that would be executed as + a result of the given driver invocation. """ + + def lastline(stream): + last = None + for line in stream: + last = line + if last is None: + raise Exception("output not found") + return last + + cmd = command[:] + cmd.insert(1, '-###') + logging.debug('exec command in %s: %s', cwd, ' '.join(cmd)) + child = subprocess.Popen(cmd, + cwd=cwd, + universal_newlines=True, + stdout=subprocess.PIPE, + stderr=subprocess.STDOUT) + line = lastline(child.stdout) + child.stdout.close() + child.wait() + if child.returncode == 0: + if re.search(r'clang(.*): error:', line): + raise Exception(line) + return decode(line) + else: + raise Exception(line) + + +def get_active_checkers(clang, plugins): + """ To get the default plugins we execute Clang to print how this + compilation would be called. + + For input file we specify stdin and pass only language information. """ + + def checkers(language): + """ Returns a list of active checkers for the given language. """ + + load = [elem + for plugin in plugins + for elem in ['-Xclang', '-load', '-Xclang', plugin]] + cmd = [clang, '--analyze'] + load + ['-x', language, '-'] + pattern = re.compile(r'^-analyzer-checker=(.*)$') + return [pattern.match(arg).group(1) + for arg in get_arguments(cmd, '.') if pattern.match(arg)] + + result = set() + for language in ['c', 'c++', 'objective-c', 'objective-c++']: + result.update(checkers(language)) + return result + + +def get_checkers(clang, plugins): + """ Get all the available checkers from default and from the plugins. + + clang -- the compiler we are using + plugins -- list of plugins which was requested by the user + + This method returns a dictionary of all available checkers and status. + + {<plugin name>: (<plugin description>, <is active by default>)} """ + + plugins = plugins if plugins else [] + + def parse_checkers(stream): + """ Parse clang -analyzer-checker-help output. 
+ + Below the line 'CHECKERS:' come the name and description pairs. + Most of them are on one line, but some plugins with long names have + the name and the description on separate lines. + + The plugin name is always prefixed with two space characters. The + name contains no whitespace. The description follows after a newline + (if the name is too long) or after further space characters. + The description ends with a newline character. """ + + # find checkers header + for line in stream: + if re.match(r'^CHECKERS:', line): + break + # find entries + state = None + for line in stream: + if state and not re.match(r'^\s\s\S', line): + yield (state, line.strip()) + state = None + elif re.match(r'^\s\s\S+$', line.rstrip()): + state = line.strip() + else: + pattern = re.compile(r'^\s\s(?P<key>\S*)\s*(?P<value>.*)') + match = pattern.match(line.rstrip()) + if match: + current = match.groupdict() + yield (current['key'], current['value']) + + def is_active(actives, entry): + """ Returns true if the plugin name matches the active plugin names. + + actives -- set of active plugin names (or prefixes). + entry -- the current plugin name to judge. + + The active plugin names are specific plugin names or prefixes of + some names. For example, the prefix 'unix' shall match + 'unix.API', 'unix.Malloc' and 'unix.MallocSizeof'. 
""" + + return any(re.match(r'^' + a + r'(\.|$)', entry) for a in actives) + + actives = get_active_checkers(clang, plugins) + + load = [elem for plugin in plugins for elem in ['-load', plugin]] + cmd = [clang, '-cc1'] + load + ['-analyzer-checker-help'] + + logging.debug('exec command: %s', ' '.join(cmd)) + child = subprocess.Popen(cmd, + universal_newlines=True, + stdout=subprocess.PIPE, + stderr=subprocess.STDOUT) + checkers = { + k: (v, is_active(actives, k)) + for k, v in parse_checkers(child.stdout) + } + child.stdout.close() + child.wait() + if child.returncode == 0 and len(checkers): + return checkers + else: + raise Exception('Could not query Clang for available checkers.') Index: tools/scan-build-py/libscanbuild/command.py =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/command.py @@ -0,0 +1,133 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" This module is responsible for to parse a compiler invocation. """ + +import re +import os + +__all__ = ['Action', 'classify_parameters', 'classify_source'] + + +class Action(object): + """ Enumeration class for compiler action. """ + + Link, Compile, Ignored = range(3) + + +def classify_parameters(command): + """ Parses the command line arguments of the given invocation. """ + + # result value of this method. + # some value are preset, some will be set only when found. + result = { + 'action': Action.Link, + 'files': [], + 'output': None, + 'compile_options': [], + 'c++': is_cplusplus_compiler(command[0]) + # archs_seen + # language + } + + # data structure to ignore compiler parameters. + # key: parameter name, value: number of parameters to ignore afterwards. 
+ ignored = { + '-g': 0, + '-fsyntax-only': 0, + '-save-temps': 0, + '-install_name': 1, + '-exported_symbols_list': 1, + '-current_version': 1, + '-compatibility_version': 1, + '-init': 1, + '-e': 1, + '-seg1addr': 1, + '-bundle_loader': 1, + '-multiply_defined': 1, + '-sectorder': 3, + '--param': 1, + '--serialize-diagnostics': 1 + } + + args = iter(command[1:]) + for arg in args: + # compiler action parameters are the most important ones... + if arg in {'-E', '-S', '-cc1', '-M', '-MM', '-###'}: + result.update({'action': Action.Ignored}) + elif arg == '-c': + result.update({'action': max(result['action'], Action.Compile)}) + # arch flags are taken... + elif arg == '-arch': + archs = result.get('archs_seen', []) + result.update({'archs_seen': archs + [next(args)]}) + # explicit language option taken... + elif arg == '-x': + result.update({'language': next(args)}) + # output flag taken... + elif arg == '-o': + result.update({'output': next(args)}) + # warning disable options are taken... + elif re.match(r'^-Wno-', arg): + result['compile_options'].append(arg) + # warning options are ignored... + elif re.match(r'^-[mW].+', arg): + pass + # some preprocessor parameters are ignored... + elif arg in {'-MD', '-MMD', '-MG', '-MP'}: + pass + elif arg in {'-MF', '-MT', '-MQ'}: + next(args) + # linker options are ignored... + elif arg in {'-static', '-shared', '-s', '-rdynamic'} or \ + re.match(r'^-[lL].+', arg): + pass + elif arg in {'-l', '-L', '-u', '-z', '-T', '-Xlinker'}: + next(args) + # some other options are ignored... + elif arg in ignored.keys(): + for _ in range(ignored[arg]): + next(args) + # parameters which looks source file are taken... + elif re.match(r'^[^-].+', arg) and classify_source(arg): + result['files'].append(arg) + # and consider everything else as compile option. + else: + result['compile_options'].append(arg) + + return result + + +def classify_source(filename, cplusplus=False): + """ Return the language from file name extension. 
""" + + mapping = { + '.c': 'c++' if cplusplus else 'c', + '.i': 'c++-cpp-output' if cplusplus else 'c-cpp-output', + '.ii': 'c++-cpp-output', + '.m': 'objective-c', + '.mi': 'objective-c-cpp-output', + '.mm': 'objective-c++', + '.mii': 'objective-c++-cpp-output', + '.C': 'c++', + '.cc': 'c++', + '.CC': 'c++', + '.cp': 'c++', + '.cpp': 'c++', + '.cxx': 'c++', + '.c++': 'c++', + '.C++': 'c++', + '.txx': 'c++' + } + + __, extension = os.path.splitext(os.path.basename(filename)) + return mapping.get(extension) + + +def is_cplusplus_compiler(name): + """ Returns true when the compiler name refer to a C++ compiler. """ + + match = re.match(r'^([^/]*/)*(\w*-)*(\w+\+\+)(-(\d+(\.\d+){0,3}))?$', name) + return False if match is None else True Index: tools/scan-build-py/libscanbuild/intercept.py =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/intercept.py @@ -0,0 +1,359 @@ +# -*- coding: utf-8 -*- +# The LLVM Compiler Infrastructure +# +# This file is distributed under the University of Illinois Open Source +# License. See LICENSE.TXT for details. +""" This module is responsible to capture the compiler invocation of any +build process. The result of that should be a compilation database. + +This implementation is using the LD_PRELOAD or DYLD_INSERT_LIBRARIES +mechanisms provided by the dynamic linker. The related library is implemented +in C language and can be found under 'libear' directory. + +The 'libear' library is capturing all child process creation and logging the +relevant information about it into separate files in a specified directory. +The parameter of this process is the output directory name, where the report +files shall be placed. This parameter is passed as an environment variable. + +The module also implements compiler wrappers to intercept the compiler calls. 
+
+The module implements the build command execution and the post-processing of
+the output files, which will be condensed into a compilation database. """
+
+import sys
+import os
+import os.path
+import re
+import itertools
+import json
+import glob
+import argparse
+import logging
+import subprocess
+from libear import build_libear, TemporaryDirectory
+from libscanbuild import duplicate_check, tempdir, initialize_logging
+from libscanbuild import command_entry_point
+from libscanbuild.command import Action, classify_parameters
+from libscanbuild.shell import encode, decode
+
+__all__ = ['capture', 'intercept_build_main', 'intercept_build_wrapper']
+
+GS = chr(0x1d)
+RS = chr(0x1e)
+US = chr(0x1f)
+
+COMPILER_WRAPPER_CC = 'intercept-cc'
+COMPILER_WRAPPER_CXX = 'intercept-c++'
+
+
+@command_entry_point
+def intercept_build_main(bin_dir):
+    """ Entry point for 'intercept-build' command. """
+
+    parser = create_parser()
+    args = parser.parse_args()
+
+    initialize_logging(args.verbose)
+    logging.debug('Parsed arguments: %s', args)
+
+    if not args.build:
+        parser.print_help()
+        return 0
+
+    return capture(args, bin_dir)
+
+
+def capture(args, bin_dir):
+    """ The entry point of build command interception. """
+
+    def post_processing(commands):
+        """ To make a compilation database, it needs to filter out commands
+        which are not compiler calls, find the source file name from the
+        arguments, and do shell escaping on the command.
+
+        To support incremental builds, it is desired to read elements from
+        an existing compilation database from a previous run. These elements
+        shall be merged with the new elements. """
+
+        # create entries from the current run
+        current = itertools.chain.from_iterable(
+            # creates a sequence of entry generators from an exec,
+            # but filter out non compiler calls before.
+            (format_entry(x) for x in commands if is_compiler_call(x)))
+        # read entries from previous run
+        if 'append' in args and args.append and os.path.exists(args.cdb):
+            with open(args.cdb) as handle:
+                previous = iter(json.load(handle))
+        else:
+            previous = iter([])
+        # filter out duplicate entries from both
+        duplicate = duplicate_check(entry_hash)
+        return (entry for entry in itertools.chain(previous, current)
+                if os.path.exists(entry['file']) and not duplicate(entry))
+
+    with TemporaryDirectory(prefix='intercept-', dir=tempdir()) as tmp_dir:
+        # run the build command
+        environment = setup_environment(args, tmp_dir, bin_dir)
+        logging.debug('run build in environment: %s', environment)
+        exit_code = subprocess.call(args.build, env=environment)
+        logging.info('build finished with exit code: %d', exit_code)
+        # read the intercepted exec calls
+        commands = itertools.chain.from_iterable(
+            parse_exec_trace(os.path.join(tmp_dir, filename))
+            for filename in sorted(glob.iglob(os.path.join(tmp_dir, '*.cmd'))))
+        # do post processing only if that was requested
+        if 'raw_entries' not in args or not args.raw_entries:
+            entries = post_processing(commands)
+        else:
+            entries = commands
+        # dump the compilation database
+        with open(args.cdb, 'w+') as handle:
+            json.dump(list(entries), handle, sort_keys=True, indent=4)
+        return exit_code
+
+
+def setup_environment(args, destination, bin_dir):
+    """ Sets up the environment for the build command.
+
+    It sets the required environment variables for the build command.
+    The exec calls will be logged by the 'libear' preloaded library or by the
+    'wrapper' programs.
+    """
+
+    c_compiler = args.cc if 'cc' in args else 'cc'
+    cxx_compiler = args.cxx if 'cxx' in args else 'c++'
+
+    libear_path = None if args.override_compiler or is_preload_disabled(
+        sys.platform) else build_libear(c_compiler, destination)
+
+    environment = dict(os.environ)
+    environment.update({'INTERCEPT_BUILD_TARGET_DIR': destination})
+
+    if not libear_path:
+        logging.debug('intercept will use compiler wrappers')
+        environment.update({
+            'CC': os.path.join(bin_dir, COMPILER_WRAPPER_CC),
+            'CXX': os.path.join(bin_dir, COMPILER_WRAPPER_CXX),
+            'INTERCEPT_BUILD_CC': c_compiler,
+            'INTERCEPT_BUILD_CXX': cxx_compiler,
+            'INTERCEPT_BUILD_VERBOSE': 'DEBUG' if args.verbose > 2 else 'INFO'
+        })
+    elif sys.platform == 'darwin':
+        logging.debug('intercept will preload libear on OSX')
+        environment.update({
+            'DYLD_INSERT_LIBRARIES': libear_path,
+            'DYLD_FORCE_FLAT_NAMESPACE': '1'
+        })
+    else:
+        logging.debug('intercept will preload libear on UNIX')
+        environment.update({'LD_PRELOAD': libear_path})
+
+    return environment
+
+
+def intercept_build_wrapper(cplusplus):
+    """ Entry point for `intercept-cc` and `intercept-c++` compiler wrappers.
+
+    It generates an execution report in the target directory and executes
+    the wrapped compilation with the real compiler. The parameters for
+    report and execution come from environment variables.
+
+    Those parameters which can't have meaningful values for the 'libear'
+    library are faked.
+    """
+
+    # initialize wrapper logging
+    logging.basicConfig(format='intercept: %(levelname)s: %(message)s',
+                        level=os.getenv('INTERCEPT_BUILD_VERBOSE', 'INFO'))
+    # write report
+    try:
+        target_dir = os.getenv('INTERCEPT_BUILD_TARGET_DIR')
+        if not target_dir:
+            raise UserWarning('exec report target directory not found')
+        pid = str(os.getpid())
+        target_file = os.path.join(target_dir, pid + '.cmd')
+        logging.debug('writing exec report to: %s', target_file)
+        with open(target_file, 'ab') as handler:
+            working_dir = os.getcwd()
+            command = US.join(sys.argv) + US
+            content = RS.join([pid, pid, 'wrapper', working_dir, command]) + GS
+            handler.write(content.encode('utf-8'))
+    except IOError:
+        logging.exception('writing exec report failed')
+    except UserWarning as warning:
+        logging.warning(warning)
+    # execute with real compiler
+    compiler = os.getenv('INTERCEPT_BUILD_CXX', 'c++') if cplusplus \
+        else os.getenv('INTERCEPT_BUILD_CC', 'cc')
+    compilation = [compiler] + sys.argv[1:]
+    logging.debug('execute compiler: %s', compilation)
+    return subprocess.call(compilation)
+
+
+def parse_exec_trace(filename):
+    """ Parse the file generated by the 'libear' preloaded library.
+
+    Given filename points to a file which contains the basic report
+    generated by the interception library or wrapper command. A single
+    report file _might_ contain multiple process creation info. """
+
+    logging.debug('parse exec trace file: %s', filename)
+    with open(filename, 'r') as handler:
+        content = handler.read()
+        for group in filter(bool, content.split(GS)):
+            records = group.split(RS)
+            yield {
+                'pid': records[0],
+                'ppid': records[1],
+                'function': records[2],
+                'directory': records[3],
+                'command': records[4].split(US)[:-1]
+            }
+
+
+def format_entry(entry):
+    """ Generate the desired fields for compilation database entries. """
+
+    def abspath(cwd, name):
+        """ Create normalized absolute path from input filename.
+        """
+        fullname = name if os.path.isabs(name) else os.path.join(cwd, name)
+        return os.path.normpath(fullname)
+
+    logging.debug('format this command: %s', entry['command'])
+    atoms = classify_parameters(entry['command'])
+    if atoms['action'] <= Action.Compile:
+        for source in atoms['files']:
+            compiler = 'c++' if atoms['c++'] else 'cc'
+            flags = list(atoms['compile_options'])  # copy: '+=' mutates
+            flags += ['-o', atoms['output']] if atoms['output'] else []
+            flags += ['-x', atoms['language']] if 'language' in atoms else []
+            flags += [elem
+                      for arch in atoms.get('archs_seen', [])
+                      for elem in ['-arch', arch]]
+            command = [compiler, '-c'] + flags + [source]
+            logging.debug('formatted as: %s', command)
+            yield {
+                'directory': entry['directory'],
+                'command': encode(command),
+                'file': abspath(entry['directory'], source)
+            }
+
+
+def is_compiler_call(entry):
+    """ A predicate to decide whether the entry is a compiler call. """
+
+    patterns = [
+        re.compile(r'^([^/]*/)*intercept-c(c|\+\+)$'),
+        re.compile(r'^([^/]*/)*c(c|\+\+)$'),
+        re.compile(r'^([^/]*/)*([^-]*-)*[mg](cc|\+\+)(-\d+(\.\d+){0,2})?$'),
+        re.compile(r'^([^/]*/)*([^-]*-)*clang(\+\+)?(-\d+(\.\d+){0,2})?$'),
+        re.compile(r'^([^/]*/)*llvm-g(cc|\+\+)$'),
+    ]
+    executable = entry['command'][0]
+    return any((pattern.match(executable) for pattern in patterns))
+
+
+def is_preload_disabled(platform):
+    """ Library-based interposition will fail silently if SIP is enabled,
+    so this should be detected. You can detect whether SIP is enabled on
+    Darwin by checking whether (1) there is a binary called 'csrutil' in
+    the path and, if so, (2) whether the output of executing 'csrutil status'
+    contains 'System Integrity Protection status: enabled'.
+
+    Same problem on Linux when SELinux is enabled. The status query program is
+    'sestatus', whose output contains 'SELinux status: enabled' when enabled.
+    """
+
+    if platform == 'darwin':
+        pattern = re.compile(r'System Integrity Protection status:\s+enabled')
+        command = ['csrutil', 'status']
+    elif platform in {'linux', 'linux2'}:
+        pattern = re.compile(r'SELinux status:\s+enabled')
+        command = ['sestatus']
+    else:
+        return False
+
+    try:
+        lines = subprocess.check_output(command).decode('utf-8')
+        return any((pattern.match(line) for line in lines.splitlines()))
+    except Exception:  # status tool missing or failed: assume not disabled
+        return False
+
+
+def entry_hash(entry):
+    """ Implement unique hash method for compilation database entries. """
+
+    # For faster lookup in set filename is reversed
+    filename = entry['file'][::-1]
+    # For faster lookup in set directory is reversed
+    directory = entry['directory'][::-1]
+    # On OS X the 'cc' and 'c++' compilers are wrappers for
+    # 'clang' therefore both calls would be logged. To avoid
+    # this the hash does not contain the first word of the
+    # command.
+    command = ' '.join(decode(entry['command'])[1:])
+
+    return '<>'.join([filename, directory, command])
+
+
+def create_parser():
+    """ Command line argument parser factory method. """
+
+    parser = argparse.ArgumentParser(
+        formatter_class=argparse.ArgumentDefaultsHelpFormatter)
+
+    parser.add_argument(
+        '--verbose', '-v',
+        action='count',
+        default=0,
+        help="""Enable verbose output from '%(prog)s'. A second and third
+                flag increases verbosity.""")
+    parser.add_argument(
+        '--cdb',
+        metavar='<file>',
+        default="compile_commands.json",
+        help="""The JSON compilation database.""")
+    group = parser.add_mutually_exclusive_group()
+    group.add_argument(
+        '--append',
+        action='store_true',
+        help="""Append new entries to existing compilation database.""")
+    group.add_argument(
+        '--disable-filter', '-n',
+        dest='raw_entries',
+        action='store_true',
+        help="""Intercepted child process creation calls (exec calls) are all
+                logged to the output. The output is not a compilation database.
+                This flag is for debug purposes.""")
+
+    advanced = parser.add_argument_group('advanced options')
+    advanced.add_argument(
+        '--override-compiler',
+        action='store_true',
+        help="""Always resort to the compiler wrapper even when better
+                intercept methods are available.""")
+    advanced.add_argument(
+        '--use-cc',
+        metavar='<path>',
+        dest='cc',
+        default='cc',
+        help="""'%(prog)s' works by interposing a compiler wrapper, which
+                executes a real compiler for compilation and does other
+                tasks (such as recording the compiler invocation). Because of
+                this interposing, '%(prog)s' does not know what compiler your
+                project normally uses. Instead, it simply overrides the CC
+                environment variable, and guesses your default compiler.
+
+                If you need '%(prog)s' to use a specific compiler for
+                *compilation* then you can use this option to specify a path
+                to that compiler.""")
+    advanced.add_argument(
+        '--use-c++',
+        metavar='<path>',
+        dest='cxx',
+        default='c++',
+        help="""This is the same as "--use-cc" but for C++ code.""")
+
+    parser.add_argument(
+        dest='build',
+        nargs=argparse.REMAINDER,
+        help="""Command to run.""")
+
+    return parser
Index: tools/scan-build-py/libscanbuild/report.py
===================================================================
--- /dev/null
+++ tools/scan-build-py/libscanbuild/report.py
@@ -0,0 +1,530 @@
+# -*- coding: utf-8 -*-
+# The LLVM Compiler Infrastructure
+#
+# This file is distributed under the University of Illinois Open Source
+# License. See LICENSE.TXT for details.
+""" This module is responsible for generating 'index.html' for the report.
+
+The input for this step is the output directory, where individual reports
+can be found. It parses those reports and generates 'index.html'.
+"""
+
+import re
+import os
+import os.path
+import sys
+import shutil
+import time
+import tempfile
+import itertools
+import plistlib
+import glob
+import json
+import logging
+import contextlib
+from libscanbuild import duplicate_check
+from libscanbuild.clang import get_version
+
+__all__ = ['report_directory', 'document']
+
+
+@contextlib.contextmanager
+def report_directory(hint, keep):
+    """ Responsible for the report directory.
+
+    hint -- could specify the parent directory of the output directory.
+    keep -- a boolean value to keep or delete the empty report directory. """
+
+    stamp = time.strftime('scan-build-%Y-%m-%d-%H%M%S-', time.localtime())
+    name = tempfile.mkdtemp(prefix=stamp, dir=hint)
+
+    logging.info('Report directory created: %s', name)
+
+    try:
+        yield name
+    finally:
+        if os.listdir(name):
+            msg = "Run 'scan-view %s' to examine bug reports."
+            keep = True
+        else:
+            if keep:
+                msg = "Report directory '%s' contains no report, but kept."
+            else:
+                msg = "Removing directory '%s' because it contains no report."
+        logging.warning(msg, name)
+
+        if not keep:
+            os.rmdir(name)
+
+
+def document(args, output_dir, use_cdb):
+    """ Generates cover report and returns the number of bugs/crashes.
""" + + html_reports_available = args.output_format in {'html', 'plist-html'} + + logging.debug('count crashes and bugs') + crash_count = sum(1 for _ in read_crashes(output_dir)) + bug_counter = create_counters() + for bug in read_bugs(output_dir, html_reports_available): + bug_counter(bug) + result = crash_count + bug_counter.total + + if html_reports_available and result: + logging.debug('generate index.html file') + # common prefix for source files to have sort filenames + prefix = commonprefix_from(args.cdb) if use_cdb else os.getcwd() + # assemble the cover from multiple fragments + try: + fragments = [] + if bug_counter.total: + fragments.append(bug_summary(output_dir, bug_counter)) + fragments.append(bug_report(output_dir, prefix)) + if crash_count: + fragments.append(crash_report(output_dir, prefix)) + assemble_cover(output_dir, prefix, args, fragments) + # copy additinal files to the report + copy_resource_files(output_dir) + if use_cdb: + shutil.copy(args.cdb, output_dir) + finally: + for fragment in fragments: + os.remove(fragment) + return result + + +def assemble_cover(output_dir, prefix, args, fragments): + """ Put together the fragments into a final report. """ + + import getpass + import socket + import datetime + + if args.html_title is None: + args.html_title = os.path.basename(prefix) + ' - analyzer results' + + with open(os.path.join(output_dir, 'index.html'), 'w') as handle: + indent = 0 + handle.write(reindent(""" + |<!DOCTYPE html> + |<html> + | <head> + | <title>{html_title} + | + | + | + | """, indent).format(html_title=args.html_title)) + handle.write(comment('SUMMARYENDHEAD')) + handle.write(reindent(""" + | + |

{html_title}

+ | + | + | + | + | + | + |
User:{user_name}@{host_name}
Working Directory:{current_dir}
Command Line:{cmd_args}
Clang Version:{clang_version}
Date:{date}
""", indent).format(html_title=args.html_title, + user_name=getpass.getuser(), + host_name=socket.gethostname(), + current_dir=prefix, + cmd_args=' '.join(sys.argv), + clang_version=get_version(args.clang), + date=datetime.datetime.today( + ).strftime('%c'))) + for fragment in fragments: + # copy the content of fragments + with open(fragment, 'r') as input_handle: + shutil.copyfileobj(input_handle, handle) + handle.write(reindent(""" + | + |""", indent)) + + +def bug_summary(output_dir, bug_counter): + """ Bug summary is a HTML table to give a better overview of the bugs. """ + + name = os.path.join(output_dir, 'summary.html.fragment') + with open(name, 'w') as handle: + indent = 4 + handle.write(reindent(""" + |

Bug Summary

+ | + | + | + | + | + | + | + | + | """, indent)) + handle.write(reindent(""" + | + | + | + | + | """, indent).format(bug_counter.total)) + for category, types in bug_counter.categories.items(): + handle.write(reindent(""" + | + | + | """, indent).format(category)) + for bug_type in types.values(): + handle.write(reindent(""" + | + | + | + | + | """, indent).format(**bug_type)) + handle.write(reindent(""" + | + |
Bug TypeQuantityDisplay?
All Bugs{0} + |
+ | + |
+ |
{0}
{bug_type}{bug_count} + |
+ | + |
+ |
""", indent)) + handle.write(comment('SUMMARYBUGEND')) + return name + + +def bug_report(output_dir, prefix): + """ Creates a fragment from the analyzer reports. """ + + pretty = prettify_bug(prefix, output_dir) + bugs = (pretty(bug) for bug in read_bugs(output_dir, True)) + + name = os.path.join(output_dir, 'bugs.html.fragment') + with open(name, 'w') as handle: + indent = 4 + handle.write(reindent(""" + |

Reports

+ | + | + | + | + | + | + | + | + | + | + | + | + | """, indent)) + handle.write(comment('REPORTBUGCOL')) + for current in bugs: + handle.write(reindent(""" + | + | + | + | + | + | + | + | + | """, indent).format(**current)) + handle.write(comment('REPORTBUG', {'id': current['report_file']})) + handle.write(reindent(""" + | + |
Bug Group + | Bug Type + |  ▾ + | FileFunction/MethodLinePath Length
{bug_category}{bug_type}{bug_file}{bug_function}{bug_line}{bug_path_length}View Report
""", indent)) + handle.write(comment('REPORTBUGEND')) + return name + + +def crash_report(output_dir, prefix): + """ Creates a fragment from the compiler crashes. """ + + pretty = prettify_crash(prefix, output_dir) + crashes = (pretty(crash) for crash in read_crashes(output_dir)) + + name = os.path.join(output_dir, 'crashes.html.fragment') + with open(name, 'w') as handle: + indent = 4 + handle.write(reindent(""" + |

Analyzer Failures

+ |

The analyzer had problems processing the following files:

+ | + | + | + | + | + | + | + | + | + | """, indent)) + for current in crashes: + handle.write(reindent(""" + | + | + | + | + | + | """, indent).format(**current)) + handle.write(comment('REPORTPROBLEM', current)) + handle.write(reindent(""" + | + |
ProblemSource FilePreprocessed FileSTDERR Output
{problem}{source}preprocessor outputanalyzer std err
""", indent)) + handle.write(comment('REPORTCRASHES')) + return name + + +def read_crashes(output_dir): + """ Generate a unique sequence of crashes from given output directory. """ + + return (parse_crash(filename) + for filename in glob.iglob(os.path.join(output_dir, 'failures', + '*.info.txt'))) + + +def read_bugs(output_dir, html): + """ Generate a unique sequence of bugs from given output directory. + + Duplicates can be in a project if the same module was compiled multiple + times with different compiler options. These would be better to show in + the final report (cover) only once. """ + + parser = parse_bug_html if html else parse_bug_plist + pattern = '*.html' if html else '*.plist' + + duplicate = duplicate_check( + lambda bug: '{bug_line}.{bug_path_length}:{bug_file}'.format(**bug)) + + bugs = itertools.chain.from_iterable( + # parser creates a bug generator not the bug itself + parser(filename) + for filename in glob.iglob(os.path.join(output_dir, pattern))) + + return (bug for bug in bugs if not duplicate(bug)) + + +def parse_bug_plist(filename): + """ Returns the generator of bugs from a single .plist file. """ + + content = plistlib.readPlist(filename) + files = content.get('files') + for bug in content.get('diagnostics', []): + if len(files) <= int(bug['location']['file']): + logging.warning('Parsing bug from "%s" failed', filename) + continue + + yield { + 'result': filename, + 'bug_type': bug['type'], + 'bug_category': bug['category'], + 'bug_line': int(bug['location']['line']), + 'bug_path_length': int(bug['location']['col']), + 'bug_file': files[int(bug['location']['file'])] + } + + +def parse_bug_html(filename): + """ Parse out the bug information from HTML output. 
+    """
+
+    patterns = [re.compile(r'<!-- BUGTYPE (?P<bug_type>.*) -->$'),
+                re.compile(r'<!-- BUGFILE (?P<bug_file>.*) -->$'),
+                re.compile(r'<!-- BUGPATHLENGTH (?P<bug_path_length>.*) -->$'),
+                re.compile(r'<!-- BUGLINE (?P<bug_line>.*) -->$'),
+                re.compile(r'<!-- BUGCATEGORY (?P<bug_category>.*) -->$'),
+                re.compile(r'<!-- BUGDESC (?P<bug_description>.*) -->$'),
+                re.compile(r'<!-- FUNCTIONNAME (?P<bug_function>.*) -->$')]
+    endsign = re.compile(r'<!-- BUGMETAEND -->')
+
+    bug = {
+        'report_file': filename,
+        'bug_function': 'n/a',  # compatibility with < clang-3.5
+        'bug_category': 'Other',
+        'bug_line': 0,
+        'bug_path_length': 1
+    }
+
+    with open(filename) as handler:
+        for line in handler.readlines():
+            # do not read the file further
+            if endsign.match(line):
+                break
+            # search for the right lines
+            for regex in patterns:
+                match = regex.match(line.strip())
+                if match:
+                    bug.update(match.groupdict())
+                    break
+
+    encode_value(bug, 'bug_line', int)
+    encode_value(bug, 'bug_path_length', int)
+
+    yield bug
+
+
+def parse_crash(filename):
+    """ Parse out the crash information from the report file. """
+
+    match = re.match(r'(.*)\.info\.txt', filename)
+    name = match.group(1) if match else None
+    with open(filename) as handler:
+        lines = handler.readlines()
+        return {
+            'source': lines[0].rstrip(),
+            'problem': lines[1].rstrip(),
+            'file': name,
+            'info': name + '.info.txt',
+            'stderr': name + '.stderr.txt'
+        }
+
+
+def category_type_name(bug):
+    """ Create a new bug attribute from bug by category and type.
+
+    The result will be used as CSS class selector in the final report. """
+
+    def smash(key):
+        """ Make value ready to be HTML attribute value. """
+
+        return bug.get(key, '').lower().replace(' ', '_').replace("'", '')
+
+    return escape('bt_' + smash('bug_category') + '_' + smash('bug_type'))
+
+
+def create_counters():
+    """ Create counters for bug statistics.
+
+    Two entries are maintained: 'total' is an integer, represents the
+    number of bugs. The 'categories' is a two level categorisation of bug
+    counters. The first level is 'bug category' the second is 'bug type'.
+    Each entry in this classification is a dictionary of 'bug_type',
+    'bug_type_class' and 'bug_count'.
+    """
+
+    def predicate(bug):
+        bug_category = bug['bug_category']
+        bug_type = bug['bug_type']
+        current_category = predicate.categories.get(bug_category, dict())
+        current_type = current_category.get(bug_type, {
+            'bug_type': bug_type,
+            'bug_type_class': category_type_name(bug),
+            'bug_count': 0
+        })
+        current_type.update({'bug_count': current_type['bug_count'] + 1})
+        current_category.update({bug_type: current_type})
+        predicate.categories.update({bug_category: current_category})
+        predicate.total += 1
+
+    predicate.total = 0
+    predicate.categories = dict()
+    return predicate
+
+
+def prettify_bug(prefix, output_dir):
+    def predicate(bug):
+        """ Make these values safe to embed into HTML. """
+
+        bug['bug_type_class'] = category_type_name(bug)
+
+        encode_value(bug, 'bug_file', lambda x: escape(chop(prefix, x)))
+        encode_value(bug, 'bug_category', escape)
+        encode_value(bug, 'bug_type', escape)
+        encode_value(bug, 'report_file', lambda x: escape(chop(output_dir, x)))
+        return bug
+
+    return predicate
+
+
+def prettify_crash(prefix, output_dir):
+    def predicate(crash):
+        """ Make these values safe to embed into HTML. """
+
+        encode_value(crash, 'source', lambda x: escape(chop(prefix, x)))
+        encode_value(crash, 'problem', escape)
+        encode_value(crash, 'file', lambda x: escape(chop(output_dir, x)))
+        encode_value(crash, 'info', lambda x: escape(chop(output_dir, x)))
+        encode_value(crash, 'stderr', lambda x: escape(chop(output_dir, x)))
+        return crash
+
+    return predicate
+
+
+def copy_resource_files(output_dir):
+    """ Copy the javascript and css files to the report directory. """
+
+    this_dir = os.path.dirname(os.path.realpath(__file__))
+    for resource in os.listdir(os.path.join(this_dir, 'resources')):
+        shutil.copy(os.path.join(this_dir, 'resources', resource), output_dir)
+
+
+def encode_value(container, key, encode):
+    """ Run 'encode' on 'container[key]' value and update it.
+    """
+
+    if key in container:
+        value = encode(container[key])
+        container.update({key: value})
+
+
+def chop(prefix, filename):
+    """ Create 'filename' from '/prefix/filename' """
+
+    return filename if not len(prefix) else os.path.relpath(filename, prefix)
+
+
+def escape(text):
+    """ Paranoid HTML escape method. (Python version independent) """
+
+    escape_table = {
+        '&': '&amp;',
+        '"': '&quot;',
+        "'": '&apos;',
+        '>': '&gt;',
+        '<': '&lt;'
+    }
+    return ''.join(escape_table.get(c, c) for c in text)
+
+
+def reindent(text, indent):
+    """ Utility function to format html output and keep indentation. """
+
+    result = ''
+    for line in text.splitlines():
+        if len(line.strip()):
+            result += ' ' * indent + line.split('|')[1] + os.linesep
+    return result
+
+
+def comment(name, opts=dict()):
+    """ Utility function to format meta information as comment. """
+
+    attributes = ''
+    for key, value in opts.items():
+        attributes += ' {0}="{1}"'.format(key, value)
+
+    return '<!-- {0}{1} -->{2}'.format(name, attributes, os.linesep)
+
+
+def commonprefix_from(filename):
+    """ Create file prefix from the entries of a compilation database. """
+
+    with open(filename, 'r') as handle:
+        return commonprefix(item['file'] for item in json.load(handle))
+
+
+def commonprefix(files):
+    """ Fixed version of os.path.commonprefix. Return the longest path prefix
+    that is a prefix of all paths in filenames.
""" + + result = None + for current in files: + if result is not None: + result = os.path.commonprefix([result, current]) + else: + result = current + + if result is None: + return '' + elif not os.path.isdir(result): + return os.path.dirname(result) + else: + return os.path.abspath(result) Index: tools/scan-build-py/libscanbuild/resources/scanview.css =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/resources/scanview.css @@ -0,0 +1,62 @@ +body { color:#000000; background-color:#ffffff } +body { font-family: Helvetica, sans-serif; font-size:9pt } +h1 { font-size: 14pt; } +h2 { font-size: 12pt; } +table { font-size:9pt } +table { border-spacing: 0px; border: 1px solid black } +th, table thead { + background-color:#eee; color:#666666; + font-weight: bold; cursor: default; + text-align:center; + font-weight: bold; font-family: Verdana; + white-space:nowrap; +} +.W { font-size:0px } +th, td { padding:5px; padding-left:8px; text-align:left } +td.SUMM_DESC { padding-left:12px } +td.DESC { white-space:pre } +td.Q { text-align:right } +td { text-align:left } +tbody.scrollContent { overflow:auto } + +table.form_group { + background-color: #ccc; + border: 1px solid #333; + padding: 2px; +} + +table.form_inner_group { + background-color: #ccc; + border: 1px solid #333; + padding: 0px; +} + +table.form { + background-color: #999; + border: 1px solid #333; + padding: 2px; +} + +td.form_label { + text-align: right; + vertical-align: top; +} +/* For one line entires */ +td.form_clabel { + text-align: right; + vertical-align: center; +} +td.form_value { + text-align: left; + vertical-align: top; +} +td.form_submit { + text-align: right; + vertical-align: top; +} + +h1.SubmitFail { + color: #f00; +} +h1.SubmitOk { +} Index: tools/scan-build-py/libscanbuild/resources/selectable.js =================================================================== --- /dev/null +++ 
tools/scan-build-py/libscanbuild/resources/selectable.js @@ -0,0 +1,47 @@ +function SetDisplay(RowClass, DisplayVal) +{ + var Rows = document.getElementsByTagName("tr"); + for ( var i = 0 ; i < Rows.length; ++i ) { + if (Rows[i].className == RowClass) { + Rows[i].style.display = DisplayVal; + } + } +} + +function CopyCheckedStateToCheckButtons(SummaryCheckButton) { + var Inputs = document.getElementsByTagName("input"); + for ( var i = 0 ; i < Inputs.length; ++i ) { + if (Inputs[i].type == "checkbox") { + if(Inputs[i] != SummaryCheckButton) { + Inputs[i].checked = SummaryCheckButton.checked; + Inputs[i].onclick(); + } + } + } +} + +function returnObjById( id ) { + if (document.getElementById) + var returnVar = document.getElementById(id); + else if (document.all) + var returnVar = document.all[id]; + else if (document.layers) + var returnVar = document.layers[id]; + return returnVar; +} + +var NumUnchecked = 0; + +function ToggleDisplay(CheckButton, ClassName) { + if (CheckButton.checked) { + SetDisplay(ClassName, ""); + if (--NumUnchecked == 0) { + returnObjById("AllBugsCheck").checked = true; + } + } + else { + SetDisplay(ClassName, "none"); + NumUnchecked++; + returnObjById("AllBugsCheck").checked = false; + } +} Index: tools/scan-build-py/libscanbuild/resources/sorttable.js =================================================================== --- /dev/null +++ tools/scan-build-py/libscanbuild/resources/sorttable.js @@ -0,0 +1,492 @@ +/* + SortTable + version 2 + 7th April 2007 + Stuart Langridge, http://www.kryogenix.org/code/browser/sorttable/ + + Instructions: + Download this file + Add to your HTML + Add class="sortable" to any table you'd like to make sortable + Click on the headers to sort + + Thanks to many, many people for contributions and suggestions. + Licenced as X11: http://www.kryogenix.org/code/browser/licence.html + This basically means: do what you want with it. 
+*/ + + +var stIsIE = /*@cc_on!@*/false; + +sorttable = { + init: function() { + // quit if this function has already been called + if (arguments.callee.done) return; + // flag this function so we don't do the same thing twice + arguments.callee.done = true; + // kill the timer + if (_timer) clearInterval(_timer); + + if (!document.createElement || !document.getElementsByTagName) return; + + sorttable.DATE_RE = /^(\d\d?)[\/\.-](\d\d?)[\/\.-]((\d\d)?\d\d)$/; + + forEach(document.getElementsByTagName('table'), function(table) { + if (table.className.search(/\bsortable\b/) != -1) { + sorttable.makeSortable(table); + } + }); + + }, + + makeSortable: function(table) { + if (table.getElementsByTagName('thead').length == 0) { + // table doesn't have a tHead. Since it should have, create one and + // put the first table row in it. + the = document.createElement('thead'); + the.appendChild(table.rows[0]); + table.insertBefore(the,table.firstChild); + } + // Safari doesn't support table.tHead, sigh + if (table.tHead == null) table.tHead = table.getElementsByTagName('thead')[0]; + + if (table.tHead.rows.length != 1) return; // can't cope with two header rows + + // Sorttable v1 put rows with a class of "sortbottom" at the bottom (as + // "total" rows, for example). This is B&R, since what you're supposed + // to do is put them in a tfoot. So, if there are sortbottom rows, + // for backward compatibility, move them to tfoot (creating it if needed). 
+    sortbottomrows = [];
+    for (var i=0; i5' : ' ▴';
+            this.appendChild(sortrevind);
+            return;
+          }
+          if (this.className.search(/\bsorttable_sorted_reverse\b/) != -1) {
+            // if we're already sorted by this column in reverse, just
+            // re-reverse the table, which is quicker
+            sorttable.reverse(this.sorttable_tbody);
+            this.className = this.className.replace('sorttable_sorted_reverse',
+                                                    'sorttable_sorted');
+            this.removeChild(document.getElementById('sorttable_sortrevind'));
+            sortfwdind = document.createElement('span');
+            sortfwdind.id = "sorttable_sortfwdind";
+            sortfwdind.innerHTML = stIsIE ? ' 6' : ' ▾';
+            this.appendChild(sortfwdind);
+            return;
+          }
+
+          // remove sorttable_sorted classes
+          theadrow = this.parentNode;
+          forEach(theadrow.childNodes, function(cell) {
+            if (cell.nodeType == 1) { // an element
+              cell.className = cell.className.replace('sorttable_sorted_reverse','');
+              cell.className = cell.className.replace('sorttable_sorted','');
+            }
+          });
+          sortfwdind = document.getElementById('sorttable_sortfwdind');
+          if (sortfwdind) { sortfwdind.parentNode.removeChild(sortfwdind); }
+          sortrevind = document.getElementById('sorttable_sortrevind');
+          if (sortrevind) { sortrevind.parentNode.removeChild(sortrevind); }
+
+          this.className += ' sorttable_sorted';
+          sortfwdind = document.createElement('span');
+          sortfwdind.id = "sorttable_sortfwdind";
+          sortfwdind.innerHTML = stIsIE ? ' 6' : ' ▾';
+          this.appendChild(sortfwdind);
+
+          // build an array to sort. This is a Schwartzian transform thing,
+          // i.e., we "decorate" each row with the actual sort key,
+          // sort based on the sort keys, and then put the rows back in order
+          // which is a lot faster because you only do getInnerText once per row
+          row_array = [];
+          col = this.sorttable_columnindex;
+          rows = this.sorttable_tbody.rows;
+          for (var j=0; j 12) {
+            // definitely dd/mm
+            return sorttable.sort_ddmm;
+          } else if (second > 12) {
+            return sorttable.sort_mmdd;
+          } else {
+            // looks like a date, but we can't tell which, so assume
+            // that it's dd/mm (English imperialism!) and keep looking
+            sortfn = sorttable.sort_ddmm;
+          }
+        }
+      }
+    }
+    return sortfn;
+  },
+
+  getInnerText: function(node) {
+    // gets the text we want to use for sorting for a cell.
+    // strips leading and trailing whitespace.
+    // this is *not* a generic getInnerText function; it's special to sorttable.
+    // for example, you can override the cell text with a customkey attribute.
+    // it also gets .value for fields.
+
+    hasInputs = (typeof node.getElementsByTagName == 'function') &&
+                 node.getElementsByTagName('input').length;
+
+    if (node.getAttribute("sorttable_customkey") != null) {
+      return node.getAttribute("sorttable_customkey");
+    }
+    else if (typeof node.textContent != 'undefined' && !hasInputs) {
+      return node.textContent.replace(/^\s+|\s+$/g, '');
+    }
+    else if (typeof node.innerText != 'undefined' && !hasInputs) {
+      return node.innerText.replace(/^\s+|\s+$/g, '');
+    }
+    else if (typeof node.text != 'undefined' && !hasInputs) {
+      return node.text.replace(/^\s+|\s+$/g, '');
+    }
+    else {
+      switch (node.nodeType) {
+        case 3:
+          if (node.nodeName.toLowerCase() == 'input') {
+            return node.value.replace(/^\s+|\s+$/g, '');
+          }
+        case 4:
+          return node.nodeValue.replace(/^\s+|\s+$/g, '');
+          break;
+        case 1:
+        case 11:
+          var innerText = '';
+          for (var i = 0; i < node.childNodes.length; i++) {
+            innerText += sorttable.getInnerText(node.childNodes[i]);
+          }
+          return innerText.replace(/^\s+|\s+$/g, '');
+          break;
+        default:
+          return '';
+      }
+    }
+  },
+
+  reverse: function(tbody) {
+    // reverse the rows in a tbody
+    newrows = [];
+    for (var i=0; i=0; i--) {
+       tbody.appendChild(newrows[i]);
+    }
+    delete newrows;
+  },
+
+  /* sort functions
+     each sort function takes two parameters, a and b
+     you are comparing a[0] and b[0] */
+  sort_numeric: function(a,b) {
+    aa = parseFloat(a[0].replace(/[^0-9.-]/g,''));
+    if (isNaN(aa)) aa = 0;
+    bb = parseFloat(b[0].replace(/[^0-9.-]/g,''));
+    if (isNaN(bb)) bb = 0;
+    return aa-bb;
+  },
+  sort_alpha: function(a,b) {
+    if (a[0]==b[0]) return 0;
+    if (a[0] 0 ) {
+                var q = list[i]; list[i] = list[i+1]; list[i+1] = q;
+                swap = true;
+            }
+        } // for
+        t--;
+
+        if (!swap) break;
+
+        for(var i = t; i > b; --i) {
+            if ( comp_func(list[i], list[i-1]) < 0 ) {
+                var q = list[i]; list[i] = list[i-1]; list[i-1] = q;
+                swap = true;
+            }
+        } // for
+        b++;
+
+    } // while(swap)
+  }
+}
+
+/* ******************************************************************
+   Supporting functions: bundled here to avoid depending on a library
+   ****************************************************************** */
+
+// Dean Edwards/Matthias Miller/John Resig
+
+/* for Mozilla/Opera9 */
+if (document.addEventListener) {
+    document.addEventListener("DOMContentLoaded", sorttable.init, false);
+}
+
+/* for Internet Explorer */
+/*@cc_on @*/
+/*@if (@_win32)
+    document.write("