diff --git a/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst b/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst --- a/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst +++ b/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst @@ -3,14 +3,33 @@ ===================================== Normally, static analysis works in the boundary of one translation unit (TU). -However, with additional steps and configuration we can enable the analysis to inline the definition of a function from another TU. +However, with additional steps and configuration we can enable the analysis to inline the definition of a function from +another TU. .. contents:: :local: -Manual CTU Analysis -------------------- +Overview +________ +CTU analysis can be used in a variety of ways. The importing of external TU definitions can work with pre-dumped PCH +files or by generating the necessary AST structure on demand, during the analysis of the main TU. Driving the static +analysis can also be implemented in multiple ways. The most direct way is to specify the necessary commandline options +of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This +process can be automated by other tools, like `CodeChecker `_ and scan-build-py +(the former is preferred). + +PCH-based analysis +__________________ +The analysis needs the PCH dumps of all the translation units used in the project. +These can be generated by the Clang Frontend itself, and must be arranged in a specific way in the filesystem. +The index, which maps symbols' USR names to the PCH dumps containing them, must also be generated with the +`clang-extdef-mapping` tool. This tool uses a :doc:`compilation database <../../JSONCompilationDatabase>` to +determine the compilation flags used. +The analysis invocation must be provided with the directory which contains the dumps and the mapping files. 
+ +Manual CTU Analysis +################### Let's consider these source files in our minimal example: .. code-block:: cpp @@ -47,7 +66,8 @@ ] We'd like to analyze `main.cpp` and discover the division by zero bug. -In order to be able to inline the definition of `foo` from `foo.cpp` first we have to generate the `AST` (or `PCH`) file of `foo.cpp`: +In order to be able to inline the definition of `foo` from `foo.cpp` first we have to generate the `AST` (or `PCH`) file +of `foo.cpp`: .. code-block:: bash @@ -58,7 +78,8 @@ compile_commands.json foo.cpp.ast foo.cpp main.cpp $ -The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the source files: +The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the +source files: .. code-block:: bash @@ -85,47 +106,34 @@ $ pwd /path/to/your/project - $ clang++ --analyze -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true -Xclang -analyzer-config -Xclang ctu-dir=. -Xclang -analyzer-output=plist-multi-file main.cpp + $ clang++ --analyze \ + -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \ + -Xclang -analyzer-config -Xclang ctu-dir=. \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing=false \ + -Xclang -analyzer-output=plist-multi-file \ + main.cpp main.cpp:5:12: warning: Division by zero return 3 / foo(); ~~^~~~~~~ 1 warning generated. $ # The plist file with the result is generated. - $ ls + $ ls -F compile_commands.json externalDefMap.txt foo.ast foo.cpp foo.cpp.ast main.cpp main.plist $ -This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use `CodeChecker` or `scan-build-py`. +This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use +`CodeChecker` or `scan-build-py`. 
Automated CTU Analysis with CodeChecker ---------------------------------------- +####################################### The `CodeChecker `_ project fully supports automated CTU analysis with Clang. Once we have set up the `PATH` environment variable and we activated the python `venv` then it is all it takes: .. code-block:: bash $ CodeChecker analyze --ctu compile_commands.json -o reports - [INFO 2019-07-16 17:21] - Pre-analysis started. - [INFO 2019-07-16 17:21] - Collecting data for ctu analysis. - [INFO 2019-07-16 17:21] - [1/2] foo.cpp - [INFO 2019-07-16 17:21] - [2/2] main.cpp - [INFO 2019-07-16 17:21] - Pre-analysis finished. - [INFO 2019-07-16 17:21] - Starting static analysis ... - [INFO 2019-07-16 17:21] - [1/2] clangsa analyzed foo.cpp successfully. - [INFO 2019-07-16 17:21] - [2/2] clangsa analyzed main.cpp successfully. - [INFO 2019-07-16 17:21] - ----==== Summary ====---- - [INFO 2019-07-16 17:21] - Successfully analyzed - [INFO 2019-07-16 17:21] - clangsa: 2 - [INFO 2019-07-16 17:21] - Total analyzed compilation commands: 2 - [INFO 2019-07-16 17:21] - ----=================---- - [INFO 2019-07-16 17:21] - Analysis finished. - [INFO 2019-07-16 17:21] - To view results in the terminal use the "CodeChecker parse" command. - [INFO 2019-07-16 17:21] - To store results use the "CodeChecker store" command. - [INFO 2019-07-16 17:21] - See --help and the user guide for further options about parsing and storing the reports. - [INFO 2019-07-16 17:21] - ----=================---- - [INFO 2019-07-16 17:21] - Analysis length: 0.659618854523 sec. 
- $ ls - compile_commands.json foo.cpp foo.cpp.ast main.cpp reports + $ ls -F + compile_commands.json foo.cpp foo.cpp.ast main.cpp reports/ $ tree reports reports ├── compile_cmd.json @@ -174,9 +182,9 @@ $ firefox html_out/index.html Automated CTU Analysis with scan-build-py (don't do it) -------------------------------------------------------- -We actively develop CTU with CodeChecker as a "runner" script, `scan-build-py` is not actively developed for CTU. -`scan-build-py` has various errors and issues, expect it to work with the very basic projects only. +############################################################# +We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU. +`scan-build-py` has various errors and issues; expect it to work only with very basic projects. Example usage of scan-build-py: @@ -191,3 +199,154 @@ Opening in existing browser session. ^C $ + +On-demand analysis +__________________ +The analysis produces the necessary AST structure of external TUs during analysis. This requires the +compilation database in order to determine the exact compiler invocation used for each TU. +The index, which maps function USR names to the source files containing them, must also be generated with the +`clang-extdef-mapping` tool. The mapping of external definitions implicitly uses a +:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used. +Preferably, the same compilation database should be used both when generating the external definitions and +during analysis. The analysis invocation must be provided with the directory which contains the mapping +files, and the compilation database which is used to determine compiler flags. + + +Manual CTU Analysis +################### + +Let's consider these source files in our minimal example: + +.. code-block:: cpp + + // main.cpp + int foo(); + + int main() { + return 3 / foo(); + } + +.. 
code-block:: cpp + + // foo.cpp + int foo() { + return 0; + } + +And a compilation database: + +.. code-block:: bash + + [ + { + "directory": "/path/to/your/project", + "command": "clang++ -c foo.cpp -o foo.o", + "file": "foo.cpp" + }, + { + "directory": "/path/to/your/project", + "command": "clang++ -c main.cpp -o main.o", + "file": "main.cpp" + } + ] + +We'd like to analyze `main.cpp` and discover the division by zero bug. +As we are using On-demand mode, we only need to create a CTU index file which holds the `USR` name and location of +external definitions in the source files: + +.. code-block:: bash + + $ clang-extdef-mapping -p . foo.cpp + c:@F@foo# /path/to/your/project/foo.cpp + $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt + +Now everything is available for the CTU analysis. +We have to feed Clang with CTU specific extra arguments: + +.. code-block:: bash + + $ pwd + /path/to/your/project + $ clang++ --analyze \ + -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \ + -Xclang -analyzer-config -Xclang ctu-dir=. \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing=true \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing-database=compile_commands.json \ + -Xclang -analyzer-output=plist-multi-file \ + main.cpp + main.cpp:5:12: warning: Division by zero + return 3 / foo(); + ~~^~~~~~~ + 1 warning generated. + $ # The plist file with the result is generated. + $ ls -F + compile_commands.json externalDefMap.txt foo.cpp main.cpp main.plist + $ + +This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use +`CodeChecker` or `scan-build-py`. + +Automated CTU Analysis with CodeChecker +####################################### +The `CodeChecker `_ project fully supports automated CTU analysis with Clang. +Once we have set up the `PATH` environment variable and we activated the python `venv` then it is all it takes: + +.. 
code-block:: bash + + $ CodeChecker analyze --ctu --ctu-on-demand compile_commands.json -o reports + $ ls -F + compile_commands.json foo.cpp main.cpp reports/ + $ tree reports + reports + ├── compile_cmd.json + ├── compiler_info.json + ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist + ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist + ├── metadata.json + └── unique_compile_commands.json + + 0 directories, 6 files + $ + +The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools. +E.g. one may use `CodeChecker parse` to view the results on the command line: + +.. code-block:: bash + + $ CodeChecker parse reports + [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero] + return 3 / foo(); + ^ + + Found 1 defect(s) in main.cpp + + + ----==== Summary ====---- + ----------------------- + Filename | Report count + ----------------------- + main.cpp | 1 + ----------------------- + ----------------------- + Severity | Report count + ----------------------- + HIGH | 1 + ----------------------- + ----=================---- + Total number of reports: 1 + ----=================---- + +Or we can use `CodeChecker parse -e html` to export the results into HTML format: + +.. code-block:: bash + + $ CodeChecker parse -e html -o html_out reports + $ firefox html_out/index.html + +Automated CTU Analysis with scan-build-py (don't do it) +####################################################### +We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU. +`scan-build-py` has various errors and issues; expect it to work only with very basic projects. + +Currently, on-demand analysis is not supported with `scan-build-py`. 
+ diff --git a/clang/include/clang/CrossTU/CrossTranslationUnit.h b/clang/include/clang/CrossTU/CrossTranslationUnit.h --- a/clang/include/clang/CrossTU/CrossTranslationUnit.h +++ b/clang/include/clang/CrossTU/CrossTranslationUnit.h @@ -33,6 +33,10 @@ class NamedDecl; class TranslationUnitDecl; +namespace tooling { +class JSONCompilationDatabase; +} + namespace cross_tu { enum class index_error_code { @@ -42,12 +46,14 @@ multiple_definitions, missing_definition, failed_import, + failed_to_load_compilation_database, failed_to_get_external_ast, failed_to_generate_usr, triple_mismatch, lang_mismatch, lang_dialect_mismatch, - load_threshold_reached + load_threshold_reached, + ambiguous_compile_commands_database }; class IndexError : public llvm::ErrorInfo { @@ -78,7 +84,8 @@ }; /// This function parses an index file that determines which -/// translation unit contains which definition. +/// translation unit contains which definition. The IndexPath is not prefixed +/// with CTUDir, so an absolute path is expected for consistent results. /// /// The index file format is the following: /// each line consists of an USR and a filepath separated by a space. @@ -86,7 +93,7 @@ /// \return Returns a map where the USR is the key and the filepath is the value /// or an error. llvm::Expected> -parseCrossTUIndex(StringRef IndexPath, StringRef CrossTUDir); +parseCrossTUIndex(StringRef IndexPath); std::string createCrossTUIndexString(const llvm::StringMap &Index); @@ -209,14 +216,47 @@ /// imported the FileID. ImportedFileIDMap ImportedFileIDs; - /// Functor for loading ASTUnits from AST-dump files. - class ASTFileLoader { + using LoadResultTy = llvm::Expected>; + + class ASTLoader { + public: + /// Load the ASTUnit by an identifier. Subclasses should determine what this + /// would be. The function is used with a string read from the CTU index, + /// and the method used for loading determines the semantic meaning of + /// Identifier. 
+ virtual LoadResultTy load(StringRef Identifier) = 0; + virtual ~ASTLoader() = default; + }; + + /// Implementation for loading ASTUnits from AST-dump files. + class ASTFileLoader : public ASTLoader { + public: + explicit ASTFileLoader(CompilerInstance &CI, StringRef CTUDir); + + /// ASTFileLoader uses the path of the dump file as Identifier. + LoadResultTy load(StringRef Identifier) override; + + private: + CompilerInstance &CI; + StringRef CTUDir; + }; + + /// Implementation for loading ASTUnits by parsing them on-demand. + class ASTOnDemandLoader : public ASTLoader { public: - ASTFileLoader(const CompilerInstance &CI); - std::unique_ptr operator()(StringRef ASTFilePath); + ASTOnDemandLoader(StringRef OnDemandParsingDatabase); + + /// ASTOnDemandLoader uses the path of the source file to be parsed as + /// Identifier. + LoadResultTy load(StringRef Identifier) override; + + llvm::Error lazyInitCompileCommands(); private: - const CompilerInstance &CI; + StringRef OnDemandParsingDatabase; + /// In case of on-demand parsing, the compilation database is parsed and + /// stored. + std::unique_ptr CompileCommands; }; /// Maintain number of AST loads and check for reaching the load limit. @@ -242,7 +282,7 @@ /// are the concerns of ASTUnitStorage class. class ASTUnitStorage { public: - ASTUnitStorage(const CompilerInstance &CI); + ASTUnitStorage(CompilerInstance &CI); /// Loads an ASTUnit for a function. /// /// \param FunctionName USR name of the function. @@ -287,18 +327,16 @@ using IndexMapTy = BaseMapTy; IndexMapTy NameFileMap; - ASTFileLoader FileAccessor; + std::unique_ptr Loader; - /// Limit the number of loaded ASTs. Used to limit the memory usage of the - /// CrossTranslationUnitContext. - /// The ASTUnitStorage has the knowledge about if the AST to load is - /// actually loaded or returned from cache. This information is needed to - /// maintain the counter. + /// Limit the number of loaded ASTs. 
It is used to limit the memory usage + /// of the CrossTranslationUnitContext. The ASTUnitStorage has the + /// information whether the AST to load is actually loaded or returned from + /// cache. This information is needed to maintain the counter. ASTLoadGuard LoadGuard; }; ASTUnitStorage ASTStorage; - }; } // namespace cross_tu diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def --- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def +++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def @@ -381,6 +381,21 @@ "the name of the file containing the CTU index of definitions.", "externalDefMap.txt") +ANALYZER_OPTION(bool, CTUOnDemandParsing, "ctu-on-demand-parsing", + "Whether to parse function definitions from external TUs in " + "an on-demand manner during analysis. When using on-demand " + "parsing there is no need for pre-dumping ASTs. External " + "definition mapping is still needed, and a valid compilation " + "database with compile commands for the external TUs is also " + "necessary. 
Disabled by default.", + false) + +ANALYZER_OPTION(StringRef, CTUOnDemandParsingDatabase, + "ctu-on-demand-parsing-database", + "The path to the compilation database used for on-demand " + "parsing of ASTs during CTU analysis.", + "compile_commands.json") + ANALYZER_OPTION( StringRef, ModelPath, "model-path", "The analyzer can inline an alternative implementation written in C at the " diff --git a/clang/lib/CrossTU/CMakeLists.txt b/clang/lib/CrossTU/CMakeLists.txt --- a/clang/lib/CrossTU/CMakeLists.txt +++ b/clang/lib/CrossTU/CMakeLists.txt @@ -10,4 +10,6 @@ clangBasic clangFrontend clangIndex + clangTooling + clangSerialization ) diff --git a/clang/lib/CrossTU/CrossTranslationUnit.cpp b/clang/lib/CrossTU/CrossTranslationUnit.cpp --- a/clang/lib/CrossTU/CrossTranslationUnit.cpp +++ b/clang/lib/CrossTU/CrossTranslationUnit.cpp @@ -18,12 +18,16 @@ #include "clang/Frontend/CompilerInstance.h" #include "clang/Frontend/TextDiagnosticPrinter.h" #include "clang/Index/USRGeneration.h" -#include "llvm/ADT/Triple.h" +#include "clang/Tooling/JSONCompilationDatabase.h" +#include "clang/Tooling/Tooling.h" +#include "llvm/ADT/Optional.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/Triple.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/ManagedStatic.h" #include "llvm/Support/Path.h" #include "llvm/Support/raw_ostream.h" +#include #include #include @@ -100,6 +104,8 @@ return "Failed to import the definition."; case index_error_code::failed_to_get_external_ast: return "Failed to load external AST source."; + case index_error_code::failed_to_load_compilation_database: + return "Failed to load compilation database."; case index_error_code::failed_to_generate_usr: return "Failed to generate USR."; case index_error_code::triple_mismatch: @@ -110,6 +116,9 @@ return "Language dialect mismatch"; case index_error_code::load_threshold_reached: return "Load threshold reached"; + case index_error_code::ambiguous_compile_commands_database: + return "Compile commands 
database contains multiple references to the " + "same source file."; } llvm_unreachable("Unrecognized index_error_code."); } @@ -129,7 +138,7 @@ } llvm::Expected> -parseCrossTUIndex(StringRef IndexPath, StringRef CrossTUDir) { +parseCrossTUIndex(StringRef IndexPath) { std::ifstream ExternalMapFile{std::string(IndexPath)}; if (!ExternalMapFile) return llvm::make_error(index_error_code::missing_index_file, @@ -147,9 +156,7 @@ return llvm::make_error( index_error_code::multiple_definitions, IndexPath.str(), LineNo); StringRef FileName = LineRef.substr(Pos + 1); - SmallString<256> FilePath = CrossTUDir; - llvm::sys::path::append(FilePath, FileName); - Result[LookupName] = std::string(FilePath); + Result[LookupName] = FileName.str(); } else return llvm::make_error( index_error_code::invalid_index_format, IndexPath.str(), LineNo); @@ -341,30 +348,46 @@ } } -CrossTranslationUnitContext::ASTFileLoader::ASTFileLoader( - const CompilerInstance &CI) - : CI(CI) {} +CrossTranslationUnitContext::ASTFileLoader::ASTFileLoader(CompilerInstance &CI, + StringRef CTUDir) + : CI(CI), CTUDir(CTUDir) {} -std::unique_ptr -CrossTranslationUnitContext::ASTFileLoader::operator()(StringRef ASTFilePath) { +CrossTranslationUnitContext::LoadResultTy +CrossTranslationUnitContext::ASTFileLoader::load(StringRef Identifier) { // Load AST from ast-dump. 
- IntrusiveRefCntPtr DiagOpts = new DiagnosticOptions(); - TextDiagnosticPrinter *DiagClient = - new TextDiagnosticPrinter(llvm::errs(), &*DiagOpts); - IntrusiveRefCntPtr DiagID(new DiagnosticIDs()); - IntrusiveRefCntPtr Diags( - new DiagnosticsEngine(DiagID, &*DiagOpts, DiagClient)); - - return ASTUnit::LoadFromASTFile( - std::string(ASTFilePath), CI.getPCHContainerOperations()->getRawReader(), - ASTUnit::LoadEverything, Diags, CI.getFileSystemOpts()); + + auto LoadFromFile = [this](StringRef Path) { + IntrusiveRefCntPtr DiagOpts = new DiagnosticOptions(); + TextDiagnosticPrinter *DiagClient = + new TextDiagnosticPrinter(llvm::errs(), &*DiagOpts); + IntrusiveRefCntPtr DiagID(new DiagnosticIDs()); + IntrusiveRefCntPtr Diags( + new DiagnosticsEngine(DiagID, &*DiagOpts, DiagClient)); + return ASTUnit::LoadFromASTFile( + std::string(Path.str()), CI.getPCHContainerOperations()->getRawReader(), + ASTUnit::LoadEverything, Diags, CI.getFileSystemOpts()); + }; + + if (llvm::sys::path::is_absolute(Identifier)) + return LoadFromFile(Identifier); + + llvm::SmallString<256> PrefixedPath = CTUDir; + llvm::sys::path::append(PrefixedPath, Identifier); + + return LoadFromFile(PrefixedPath); } CrossTranslationUnitContext::ASTUnitStorage::ASTUnitStorage( - const CompilerInstance &CI) - : FileAccessor(CI), LoadGuard(const_cast(CI) - .getAnalyzerOpts() - ->CTUImportThreshold) {} + CompilerInstance &CI) + : LoadGuard(CI.getAnalyzerOpts()->CTUImportThreshold) { + + AnalyzerOptionsRef Opts = CI.getAnalyzerOpts(); + if (Opts->CTUOnDemandParsing) + Loader = + std::make_unique(Opts->CTUOnDemandParsingDatabase); + else + Loader = std::make_unique(CI, Opts->CTUDir); +} llvm::Expected CrossTranslationUnitContext::ASTUnitStorage::getASTUnitForFile( @@ -380,8 +403,12 @@ index_error_code::load_threshold_reached); } - // Load the ASTUnit from the pre-dumped AST file specified by ASTFileName. 
- std::unique_ptr LoadedUnit = FileAccessor(FileName); + auto LoadAttempt = Loader->load(FileName); + + if (!LoadAttempt) + return LoadAttempt.takeError(); + + std::unique_ptr LoadedUnit = std::move(LoadAttempt.get()); // Need the raw pointer and the unique_ptr as well. ASTUnit *Unit = LoadedUnit.get(); @@ -461,7 +488,7 @@ else llvm::sys::path::append(IndexFile, IndexName); - if (auto IndexMapping = parseCrossTUIndex(IndexFile, CrossTUDir)) { + if (auto IndexMapping = parseCrossTUIndex(IndexFile)) { // Initialize member map. NameFileMap = *IndexMapping; return llvm::Error::success(); @@ -471,6 +498,10 @@ }; } +CrossTranslationUnitContext::ASTOnDemandLoader::ASTOnDemandLoader( + StringRef OnDemandParsingDatabase) + : OnDemandParsingDatabase(OnDemandParsingDatabase) {} + llvm::Expected CrossTranslationUnitContext::loadExternalAST( StringRef LookupName, StringRef CrossTUDir, StringRef IndexName, bool DisplayCTUProgress) { @@ -494,6 +525,117 @@ return Unit; } +/// Load the AST from a source-file, which is supposed to be located inside the +/// compilation database \p CompileCommands. The compilation database +/// can contain the path of the file under the key "file" as an absolute path, +/// or as a relative path. When emitting diagnostics, plist files may contain +/// references to a location in a TU, that is different from the main TU. In +/// such cases, the file path emitted by the DiagnosticEngine is based on how +/// the exact invocation is assembled inside the ClangTool, which performs the +/// building of the ASTs. In order to ensure absolute paths inside the +/// diagnostics, we use the ArgumentsAdjuster API of ClangTool to make sure that +/// the invocation inside ClangTool is always made with an absolute path. \p +/// Identifier is assumed to be the lookup-name of the file, which comes from +/// the Index. The Index is built by the \p clang-extdef-mapping tool, which is +/// supposed to generate absolute paths. 
+/// +/// We must have absolute paths inside the plist, because otherwise we would +/// not be able to parse the bug, because we could not find the files with +/// relative paths. The directory of one entry in the compilation db may be +/// different from the directory where the plist is interpreted. +/// +/// Note that the ClangTool is instantiated with a lookup-vector, which +/// contains a single entry: the supposedly absolute path of the source file. +/// So, the ArgumentsAdjuster will only be used on the single corresponding +/// invocation. This guarantees that even if two files match in name, but +/// differ in location, only the correct one's invocation will be handled. This +/// is due to the fact that the lookup is done correctly inside the +/// OnDemandParsingDatabase, so it works for already absolute paths given under +/// the "file" entry of the compilation database, but also if a relative path is +/// given. In such a case, the lookup uses the "directory" entry as well to +/// identify the correct file. +CrossTranslationUnitContext::LoadResultTy +CrossTranslationUnitContext::ASTOnDemandLoader::load(StringRef Identifier) { + + if (auto InitError = lazyInitCompileCommands()) + return std::move(InitError); + + using namespace tooling; + + SmallVector Files; + Files.push_back(std::string(Identifier)); + ClangTool Tool(*CompileCommands, Files); + + /// Lambda filter designed to find the source file argument inside an + /// invocation used to build the ASTs, and replace it with its absolute path + /// equivalent. + auto SourcePathNormalizer = [Identifier](const CommandLineArguments &Args, + StringRef FileName) { + /// Match the argument to the absolute path by checking whether it is a + /// postfix. + auto IsPostfixOfLookup = [Identifier](const std::string &Arg) { + return Identifier.rfind(Arg) != llvm::StringRef::npos; + }; + + /// Commandline arguments are modified, and the API dictates the return of + /// a new instance, so copy the original. 
+ CommandLineArguments Result{Args}; + + /// Search for the source file argument. Start from the end as a heuristic, + /// as most invocations tend to contain the source file argument in their + /// latter half. Only the first match is replaced. + auto SourceFilePath = + std::find_if(Result.rbegin(), Result.rend(), IsPostfixOfLookup); + + /// If the source file argument could not be found, return the original + /// CommandLineArguments instance. + if (SourceFilePath == Result.rend()) + return Result; + + /// Overwrite the argument with the \p Identifier, as it is assumed to + /// be the absolute path of the file. + *SourceFilePath = Identifier.str(); + + return Result; + }; + + Tool.appendArgumentsAdjuster(std::move(SourcePathNormalizer)); + + std::vector> ASTs; + Tool.buildASTs(ASTs); + + /// There is an assumption that the compilation database does not contain + /// multiple entries for the same source file. + if (ASTs.size() > 1) + return llvm::make_error( + index_error_code::ambiguous_compile_commands_database); + + /// Ideally there is exactly one entry in the compilation database that + /// matches the source file. + if (ASTs.size() != 1) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + ASTs[0]->enableSourceFileDiagnostics(); + return std::move(ASTs[0]); +} + +llvm::Error +CrossTranslationUnitContext::ASTOnDemandLoader::lazyInitCompileCommands() { + // Lazily initialize the compilation database. + + if (CompileCommands) + return llvm::Error::success(); + + std::string LoadError; + CompileCommands = tooling::JSONCompilationDatabase::loadFromFile( + OnDemandParsingDatabase, LoadError, + tooling::JSONCommandLineSyntax::AutoDetect); + return CompileCommands ? 
llvm::Error::success() + : llvm::make_error( + index_error_code::failed_to_load_compilation_database); +} + template llvm::Expected CrossTranslationUnitContext::importDefinitionImpl(const T *D, ASTUnit *Unit) { diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -511,6 +511,12 @@ Diags->Report(diag::err_analyzer_config_invalid_input) << "ctu-dir" << "a filename"; + if (AnOpts.CTUOnDemandParsing && + !llvm::sys::fs::exists(AnOpts.CTUOnDemandParsingDatabase)) + Diags->Report(diag::err_analyzer_config_invalid_input) + << "ctu-on-demand-parsing-database" + << "a filename"; + if (!AnOpts.ModelPath.empty() && !llvm::sys::fs::is_directory(AnOpts.ModelPath)) Diags->Report(diag::err_analyzer_config_invalid_input) << "model-path" diff --git a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp --- a/clang/lib/StaticAnalyzer/Core/CallEvent.cpp +++ b/clang/lib/StaticAnalyzer/Core/CallEvent.cpp @@ -573,6 +573,7 @@ cross_tu::CrossTranslationUnitContext &CTUCtx = *Engine.getCrossTranslationUnitContext(); + llvm::Expected CTUDeclOrError = CTUCtx.getCrossTUDefinition(FD, Opts.CTUDir, Opts.CTUIndexName, Opts.DisplayCTUProgress); diff --git a/clang/test/Analysis/Inputs/ctu-other.c b/clang/test/Analysis/Inputs/ctu-other.c --- a/clang/test/Analysis/Inputs/ctu-other.c +++ b/clang/test/Analysis/Inputs/ctu-other.c @@ -31,9 +31,11 @@ } // Test that asm import does not fail. +// TODO: Support the GNU extension asm keyword as well. 
+// Example using the GNU extension: asm("mov $42, %0" : "=r"(res)); int inlineAsm() { int res; - asm("mov $42, %0" + __asm__("mov $42, %0" : "=r"(res)); return res; } diff --git a/clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.txt b/clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.ast-dump.txt rename from clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.txt rename to clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.ast-dump.txt diff --git a/clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.txt b/clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt rename from clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.txt rename to clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c --- a/clang/test/Analysis/analyzer-config.c +++ b/clang/test/Analysis/analyzer-config.c @@ -33,6 +33,8 @@ // CHECK-NEXT: ctu-dir = "" // CHECK-NEXT: ctu-import-threshold = 100 // CHECK-NEXT: ctu-index-name = externalDefMap.txt +// CHECK-NEXT: ctu-on-demand-parsing = false +// CHECK-NEXT: ctu-on-demand-parsing-database = compile_commands.json // CHECK-NEXT: deadcode.DeadStores:ShowFixIts = false // CHECK-NEXT: deadcode.DeadStores:WarnForDeadNestedAssignments = true // CHECK-NEXT: debug.AnalysisOrder:* = false @@ -106,4 +108,4 @@ // CHECK-NEXT: unroll-loops = false // CHECK-NEXT: widen-loops = false // CHECK-NEXT: [stats] -// CHECK-NEXT: num-entries = 103 +// CHECK-NEXT: num-entries = 105 diff --git a/clang/test/Analysis/ctu-different-triples.cpp b/clang/test/Analysis/ctu-different-triples.cpp --- a/clang/test/Analysis/ctu-different-triples.cpp +++ b/clang/test/Analysis/ctu-different-triples.cpp @@ -2,7 +2,7 @@ // RUN: mkdir -p %t/ctudir // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp -// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt 
%t/ctudir/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt // RUN: %clang_analyze_cc1 -std=c++14 -triple powerpc64-montavista-linux-gnu \ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ diff --git a/clang/test/Analysis/ctu-main.c b/clang/test/Analysis/ctu-main.c --- a/clang/test/Analysis/ctu-main.c +++ b/clang/test/Analysis/ctu-main.c @@ -2,7 +2,7 @@ // RUN: mkdir -p %t/ctudir2 // RUN: %clang_cc1 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir2/ctu-other.c.ast %S/Inputs/ctu-other.c -// RUN: cp %S/Inputs/ctu-other.c.externalDefMap.txt %t/ctudir2/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.c.externalDefMap.ast-dump.txt %t/ctudir2/externalDefMap.txt // RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fsyntax-only -std=c89 -analyze \ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ @@ -50,6 +50,10 @@ void testImplicit() { int res = identImplicit(6); // external implicit functions are not inlined clang_analyzer_eval(res == 6); // expected-warning{{TRUE}} + // Call something with uninitialized from the same function in which the implicit was called. + // This is necessary to reproduce a special bug in NoStoreFuncVisitor. 
+ int uninitialized; + h(uninitialized); // expected-warning{{1st function call argument is an uninitialized value}} } // Tests the import of functions that have a struct parameter diff --git a/clang/test/Analysis/ctu-main.cpp b/clang/test/Analysis/ctu-main.cpp --- a/clang/test/Analysis/ctu-main.cpp +++ b/clang/test/Analysis/ctu-main.cpp @@ -4,7 +4,7 @@ // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir/ctu-chain.cpp.ast %S/Inputs/ctu-chain.cpp -// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt %t/ctudir/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt // RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ diff --git a/clang/test/Analysis/ctu-on-demand-parsing-ambigous-compilation-database.c b/clang/test/Analysis/ctu-on-demand-parsing-ambigous-compilation-database.c new file mode 100644 --- /dev/null +++ b/clang/test/Analysis/ctu-on-demand-parsing-ambigous-compilation-database.c @@ -0,0 +1,23 @@ +// RUN: rm -rf %t +// RUN: mkdir -p %t +// RUN: cp "%s" "%t/ctu-on-demand-parsing-ambiguous-compilation-database.c" +// RUN: cp "%S/Inputs/ctu-other.c" "%t/ctu-other.c" +// Path substitutions on Windows platform could contain backslashes. These are escaped in the json file. +// Note there is a duplicate entry for 'ctu-other.c'. 
+// RUN: echo '[{"directory":"%t","command":"gcc -c -std=c89 -Wno-visibility ctu-other.c","file":"ctu-other.c"},{"directory":"%t","command":"gcc -c -std=c89 -Wno-visibility ctu-other.c","file":"ctu-other.c"}]' | sed -e 's/\\/\\\\/g' > %t/compile_commands.json
+// RUN: cd "%t" && %clang_extdef_map ctu-other.c > externalDefMap.txt
+// The exit code of the analysis is 1 if the import error occurs
+// RUN: cd "%t" && not %clang_cc1 -triple x86_64-pc-linux-gnu -fsyntax-only -std=c89 -analyze \
+// RUN: -analyzer-checker=core,debug.ExprInspection \
+// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
+// RUN: -analyzer-config ctu-dir=. \
+// RUN: -analyzer-config ctu-on-demand-parsing=true \
+// RUN: ctu-on-demand-parsing-ambiguous-compilation-database.c 2>&1 | FileCheck %t/ctu-on-demand-parsing-ambiguous-compilation-database.c
+
+// CHECK: {{.*}}multiple definitions are found for the same key in index
+
+// 'int f(int)' is defined in ctu-other.c
+int f(int);
+void testAmbiguousImport() {
+  f(0);
+}
diff --git a/clang/test/Analysis/ctu-on-demand-parsing.c b/clang/test/Analysis/ctu-on-demand-parsing.c
new file mode 100644
--- /dev/null
+++ b/clang/test/Analysis/ctu-on-demand-parsing.c
@@ -0,0 +1,72 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t
+// RUN: cp "%s" "%t/ctu-on-demand-parsing.c"
+// RUN: cp "%S/Inputs/ctu-other.c" "%t/ctu-other.c"
+// Path substitutions on Windows platform could contain backslashes. These are escaped in the json file.
+// RUN: echo '[{"directory":"%t","command":"gcc -c -std=c89 -Wno-visibility ctu-other.c","file":"ctu-other.c"}]' | sed -e 's/\\/\\\\/g' > %t/compile_commands.json
+// RUN: cd "%t" && %clang_extdef_map ctu-other.c > externalDefMap.txt
+// RUN: cd "%t" && %clang_cc1 -triple x86_64-pc-linux-gnu -fsyntax-only -std=c89 -analyze \
+// RUN: -analyzer-checker=core,debug.ExprInspection \
+// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
+// RUN: -analyzer-config ctu-dir=. \
+// RUN: -analyzer-config ctu-on-demand-parsing=true \
+// RUN: -verify ctu-on-demand-parsing.c
+
+void clang_analyzer_eval(int);
+
+// Test typedef and global variable in function.
+typedef struct {
+  int a;
+  int b;
+} FooBar;
+extern FooBar fb;
+int f(int);
+void testGlobalVariable() {
+  clang_analyzer_eval(f(5) == 1); // expected-warning{{TRUE}}
+}
+
+// Test enums.
+int enumCheck(void);
+enum A { x, y, z };
+void testEnum() {
+  clang_analyzer_eval(x == 0); // expected-warning{{TRUE}}
+  clang_analyzer_eval(enumCheck() == 42); // expected-warning{{TRUE}}
+}
+
+// Test that asm import does not fail.
+int inlineAsm();
+int testInlineAsm() { return inlineAsm(); }
+
+// Test reporting error in a macro.
+struct S;
+int g(struct S *);
+void testMacro(void) {
+  g(0);
+  // expected-warning@ctu-other.c:29 {{Access to field 'a' results in a dereference of a null pointer (loaded from variable 'ctx')}}
+}
+
+// The external function prototype is incomplete.
+// warning:implicit functions are prohibited by c99
+void testImplicit() {
+  int res = identImplicit(6); // external implicit functions are not inlined
+  clang_analyzer_eval(res == 6); // expected-warning{{TRUE}}
+  // Call something with uninitialized from the same function in which the
+  // implicit was called. This is necessary to reproduce a special bug in
+  // NoStoreFuncVisitor.
+  int uninitialized;
+  h(uninitialized); // expected-warning{{1st function call argument is an uninitialized value}}
+}
+
+// Tests the import of functions that have a struct parameter
+// defined in its prototype.
+struct DataType {
+  int a;
+  int b;
+};
+int structInProto(struct DataType *d);
+void testStructDefInArgument() {
+  struct DataType d;
+  d.a = 1;
+  d.b = 0;
+  clang_analyzer_eval(structInProto(&d) == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}}
+}
diff --git a/clang/test/Analysis/ctu-on-demand-parsing.cpp b/clang/test/Analysis/ctu-on-demand-parsing.cpp
new file mode 100644
--- /dev/null
+++ b/clang/test/Analysis/ctu-on-demand-parsing.cpp
@@ -0,0 +1,102 @@
+// RUN: rm -rf %t
+// RUN: mkdir -p %t/ctudir
+// RUN: cp %s %t/ctu-on-demand-parsing.cpp
+// RUN: cp %S/ctu-hdr.h %t/ctu-hdr.h
+// RUN: cp %S/Inputs/ctu-chain.cpp %t/ctudir/ctu-chain.cpp
+// RUN: cp %S/Inputs/ctu-other.cpp %t/ctudir/ctu-other.cpp
+// Path substitutions on Windows platform could contain backslashes. These are escaped in the json file.
+// RUN: echo '[{"directory":"%t/ctudir","command":"clang++ -c ctu-chain.cpp","file":"ctu-chain.cpp"},{"directory":"%t/ctudir","command":"clang++ -c ctu-other.cpp","file":"ctu-other.cpp"}]' | sed -e 's/\\/\\\\/g' > %t/compile_commands.json
+// RUN: cd "%t/ctudir" && %clang_extdef_map ctu-chain.cpp ctu-other.cpp > externalDefMap.txt
+// RUN: cd "%t" && %clang_analyze_cc1 -triple x86_64-pc-linux-gnu \
+// RUN: -analyzer-checker=core,debug.ExprInspection \
+// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
+// RUN: -analyzer-config ctu-dir=ctudir \
+// RUN: -analyzer-config ctu-on-demand-parsing=true \
+// RUN: -verify ctu-on-demand-parsing.cpp
+// RUN: cd "%t" && %clang_analyze_cc1 -triple x86_64-pc-linux-gnu \
+// RUN: -analyzer-checker=core,debug.ExprInspection \
+// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
+// RUN: -analyzer-config ctu-dir="%t/ctudir" \
+// RUN: -analyzer-config ctu-on-demand-parsing=true \
+// RUN: -analyzer-config display-ctu-progress=true 2>&1 ctu-on-demand-parsing.cpp | FileCheck %t/ctu-on-demand-parsing.cpp
+
+// CHECK: CTU loaded AST file: {{.*}}ctu-other.cpp
+// CHECK: CTU loaded AST file: {{.*}}ctu-chain.cpp
+
+#include "ctu-hdr.h"
+
+void clang_analyzer_eval(int);
+
+int f(int);
+int g(int);
+int h(int);
+
+int callback_to_main(int x) { return x + 1; }
+
+namespace myns {
+int fns(int x);
+
+namespace embed_ns {
+int fens(int x);
+}
+
+class embed_cls {
+public:
+  int fecl(int x);
+};
+} // namespace myns
+
+class mycls {
+public:
+  int fcl(int x);
+  virtual int fvcl(int x);
+  static int fscl(int x);
+
+  class embed_cls2 {
+  public:
+    int fecl2(int x);
+  };
+};
+
+class derived : public mycls {
+public:
+  virtual int fvcl(int x) override;
+};
+
+namespace chns {
+int chf1(int x);
+}
+
+int fun_using_anon_struct(int);
+int other_macro_diag(int);
+
+void test_virtual_functions(mycls *obj) {
+  // The dynamic type is known.
+  clang_analyzer_eval(mycls().fvcl(1) == 8); // expected-warning{{TRUE}}
+  clang_analyzer_eval(derived().fvcl(1) == 9); // expected-warning{{TRUE}}
+  // We cannot decide about the dynamic type.
+  clang_analyzer_eval(obj->fvcl(1) == 8); // expected-warning{{FALSE}} expected-warning{{TRUE}}
+  clang_analyzer_eval(obj->fvcl(1) == 9); // expected-warning{{FALSE}} expected-warning{{TRUE}}
+}
+
+int main() {
+  clang_analyzer_eval(f(3) == 2); // expected-warning{{TRUE}}
+  clang_analyzer_eval(f(4) == 3); // expected-warning{{TRUE}}
+  clang_analyzer_eval(f(5) == 3); // expected-warning{{FALSE}}
+  clang_analyzer_eval(g(4) == 6); // expected-warning{{TRUE}}
+  clang_analyzer_eval(h(2) == 8); // expected-warning{{TRUE}}
+
+  clang_analyzer_eval(myns::fns(2) == 9); // expected-warning{{TRUE}}
+  clang_analyzer_eval(myns::embed_ns::fens(2) == -1); // expected-warning{{TRUE}}
+  clang_analyzer_eval(mycls().fcl(1) == 6); // expected-warning{{TRUE}}
+  clang_analyzer_eval(mycls::fscl(1) == 7); // expected-warning{{TRUE}}
+  clang_analyzer_eval(myns::embed_cls().fecl(1) == -6); // expected-warning{{TRUE}}
+  clang_analyzer_eval(mycls::embed_cls2().fecl2(0) == -11); // expected-warning{{TRUE}}
+
+  clang_analyzer_eval(chns::chf1(4) == 12); // expected-warning{{TRUE}}
+  clang_analyzer_eval(fun_using_anon_struct(8) == 8); // expected-warning{{TRUE}}
+
+  clang_analyzer_eval(other_macro_diag(1) == 1); // expected-warning{{TRUE}}
+  // expected-warning@ctudir/ctu-other.cpp:93{{REACHABLE}}
+  MACRODIAG(); // expected-warning{{REACHABLE}}
+}
diff --git a/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp b/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp
--- a/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp
+++ b/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp
@@ -5,7 +5,7 @@
 // RUN: mkdir -p %t/ctudir
 // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \
 // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp
-// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt %t/ctudir/externalDefMap.txt
+// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt
 // RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-unknown-linux-gnu \
 // RUN: -analyzer-checker=core,debug.ExprInspection \
 // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \
diff --git a/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp b/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp
--- a/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp
+++ b/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp
@@ -7,10 +7,11 @@
 //===----------------------------------------------------------------------===//
 
 #include "clang/CrossTU/CrossTranslationUnit.h"
-#include "clang/Frontend/CompilerInstance.h"
 #include "clang/AST/ASTConsumer.h"
+#include "clang/Frontend/CompilerInstance.h"
 #include "clang/Frontend/FrontendAction.h"
 #include "clang/Tooling/Tooling.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
 #include "llvm/Support/ToolOutputFile.h"
@@ -162,7 +163,7 @@
   IndexFile.os().flush();
   EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName));
   llvm::Expected<llvm::StringMap<std::string>> IndexOrErr =
-      parseCrossTUIndex(IndexFileName, "");
+      parseCrossTUIndex(IndexFileName);
   EXPECT_TRUE((bool)IndexOrErr);
   llvm::StringMap<std::string> ParsedIndex = IndexOrErr.get();
   for (const auto &E : Index) {
@@ -173,25 +174,5 @@
     EXPECT_TRUE(Index.count(E.getKey()));
 }
 
-TEST(CrossTranslationUnit, CTUDirIsHandledCorrectly) {
-  llvm::StringMap<std::string> Index;
-  Index["a"] = "/b/c/d";
-  std::string IndexText = createCrossTUIndexString(Index);
-
-  int IndexFD;
-  llvm::SmallString<256> IndexFileName;
-  ASSERT_FALSE(llvm::sys::fs::createTemporaryFile("index", "txt", IndexFD,
-                                                  IndexFileName));
-  llvm::ToolOutputFile IndexFile(IndexFileName, IndexFD);
-  IndexFile.os() << IndexText;
-  IndexFile.os().flush();
-  EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName));
-  llvm::Expected<llvm::StringMap<std::string>> IndexOrErr =
-      parseCrossTUIndex(IndexFileName, "/ctudir");
-  EXPECT_TRUE((bool)IndexOrErr);
-  llvm::StringMap<std::string> ParsedIndex = IndexOrErr.get();
-  EXPECT_EQ(ParsedIndex["a"], "/ctudir/b/c/d");
-}
-
 } // end namespace cross_tu
 } // end namespace clang