diff --git a/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst b/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst --- a/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst +++ b/clang/docs/analyzer/user-docs/CrossTranslationUnit.rst @@ -3,14 +3,33 @@ ===================================== Normally, static analysis works in the boundary of one translation unit (TU). -However, with additional steps and configuration we can enable the analysis to inline the definition of a function from another TU. +However, with additional steps and configuration we can enable the analysis to inline the definition of a function from +another TU. .. contents:: :local: -Manual CTU Analysis -------------------- +Overview +________ +CTU analysis can be used in a variety of ways. The importing of external TU definitions can work with pre-dumped PCH +files or by generating the necessary AST structure on-demand, during the analysis of the main TU. Driving the static +analysis can also be implemented in multiple ways. The most direct way is to specify the necessary command-line options +of the Clang frontend manually (and generate the prerequisite dependencies of the specific import method by hand). This +process can be automated by other tools, like `CodeChecker `_ and scan-build-py +(the former is preferred). + +PCH-based analysis +__________________ +The analysis needs the PCH dumps of all the translation units used in the project. +These can be generated by the Clang Frontend itself, and must be arranged in a specific way in the filesystem. +The index, which maps symbols' USR names to PCH dumps containing them, must also be generated by the +`clang-extdef-mapping` tool. This tool uses a :doc:`compilation database <../../JSONCompilationDatabase>` to +determine the compilation flags used. +The analysis invocation must be provided with the directory which contains the dumps and the mapping files. 
+ +Manual CTU Analysis +################### Let's consider these source files in our minimal example: .. code-block:: cpp @@ -47,7 +66,8 @@ ] We'd like to analyze `main.cpp` and discover the division by zero bug. -In order to be able to inline the definition of `foo` from `foo.cpp` first we have to generate the `AST` (or `PCH`) file of `foo.cpp`: +In order to be able to inline the definition of `foo` from `foo.cpp` first we have to generate the `AST` (or `PCH`) file +of `foo.cpp`: .. code-block:: bash @@ -58,7 +78,8 @@ compile_commands.json foo.cpp.ast foo.cpp main.cpp $ -The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the source files: +The next step is to create a CTU index file which holds the `USR` name and location of external definitions in the +source files: .. code-block:: bash @@ -85,47 +106,34 @@ $ pwd /path/to/your/project - $ clang++ --analyze -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true -Xclang -analyzer-config -Xclang ctu-dir=. -Xclang -analyzer-output=plist-multi-file main.cpp + $ clang++ --analyze \ + -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \ + -Xclang -analyzer-config -Xclang ctu-dir=. \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing=false \ + -Xclang -analyzer-output=plist-multi-file \ + main.cpp main.cpp:5:12: warning: Division by zero return 3 / foo(); ~~^~~~~~~ 1 warning generated. $ # The plist file with the result is generated. - $ ls + $ ls -F compile_commands.json externalDefMap.txt foo.ast foo.cpp foo.cpp.ast main.cpp main.plist $ -This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use `CodeChecker` or `scan-build-py`. +This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use +`CodeChecker` or `scan-build-py`. 
Automated CTU Analysis with CodeChecker ---------------------------------------- +####################################### The `CodeChecker `_ project fully supports automated CTU analysis with Clang. Once we have set up the `PATH` environment variable and we activated the python `venv` then it is all it takes: .. code-block:: bash $ CodeChecker analyze --ctu compile_commands.json -o reports - [INFO 2019-07-16 17:21] - Pre-analysis started. - [INFO 2019-07-16 17:21] - Collecting data for ctu analysis. - [INFO 2019-07-16 17:21] - [1/2] foo.cpp - [INFO 2019-07-16 17:21] - [2/2] main.cpp - [INFO 2019-07-16 17:21] - Pre-analysis finished. - [INFO 2019-07-16 17:21] - Starting static analysis ... - [INFO 2019-07-16 17:21] - [1/2] clangsa analyzed foo.cpp successfully. - [INFO 2019-07-16 17:21] - [2/2] clangsa analyzed main.cpp successfully. - [INFO 2019-07-16 17:21] - ----==== Summary ====---- - [INFO 2019-07-16 17:21] - Successfully analyzed - [INFO 2019-07-16 17:21] - clangsa: 2 - [INFO 2019-07-16 17:21] - Total analyzed compilation commands: 2 - [INFO 2019-07-16 17:21] - ----=================---- - [INFO 2019-07-16 17:21] - Analysis finished. - [INFO 2019-07-16 17:21] - To view results in the terminal use the "CodeChecker parse" command. - [INFO 2019-07-16 17:21] - To store results use the "CodeChecker store" command. - [INFO 2019-07-16 17:21] - See --help and the user guide for further options about parsing and storing the reports. - [INFO 2019-07-16 17:21] - ----=================---- - [INFO 2019-07-16 17:21] - Analysis length: 0.659618854523 sec. 
- $ ls - compile_commands.json foo.cpp foo.cpp.ast main.cpp reports + $ ls -F + compile_commands.json foo.cpp foo.cpp.ast main.cpp reports/ $ tree reports reports ├── compile_cmd.json @@ -174,9 +182,9 @@ $ firefox html_out/index.html Automated CTU Analysis with scan-build-py (don't do it) -------------------------------------------------------- -We actively develop CTU with CodeChecker as a "runner" script, `scan-build-py` is not actively developed for CTU. -`scan-build-py` has various errors and issues, expect it to work with the very basic projects only. +############################################################# +We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU. +`scan-build-py` has various errors and issues; expect it to work with only very basic projects. Example usage of scan-build-py: @@ -191,3 +199,154 @@ Opening in existing browser session. ^C $ + +On-demand analysis +__________________ +The analysis produces the necessary AST structure of external TUs during analysis. This requires the +compilation database in order to determine the exact compiler invocation used for each TU. +The index, which maps function USR names to source files containing them, must also be generated by the +`clang-extdef-mapping` tool. The mapping of external definitions implicitly uses a +:doc:`compilation database <../../JSONCompilationDatabase>` to determine the compilation flags used. +Preferably, the same compilation database should be used when generating the external definitions and +during analysis. The analysis invocation must be provided with the directory which contains the mapping +files, and the compilation database which is used to determine compiler flags. + + +Manual CTU Analysis +################### + +Let's consider these source files in our minimal example: + +.. code-block:: cpp + + // main.cpp + int foo(); + + int main() { + return 3 / foo(); + } + +.. 
code-block:: cpp + + // foo.cpp + int foo() { + return 0; + } + +And a compilation database: + +.. code-block:: bash + + [ + { + "directory": "/path/to/your/project", + "command": "clang++ -c foo.cpp -o foo.o", + "file": "foo.cpp" + }, + { + "directory": "/path/to/your/project", + "command": "clang++ -c main.cpp -o main.o", + "file": "main.cpp" + } + ] + +We'd like to analyze `main.cpp` and discover the division by zero bug. +As we are using on-demand mode, we only need to create a CTU index file which holds the `USR` name and location of +external definitions in the source files: + +.. code-block:: bash + + $ clang-extdef-mapping -p . foo.cpp + c:@F@foo# /path/to/your/project/foo.cpp + $ clang-extdef-mapping -p . foo.cpp > externalDefMap.txt + +Now everything is available for the CTU analysis. +We have to feed Clang with CTU-specific extra arguments: + +.. code-block:: bash + + $ pwd + /path/to/your/project + $ clang++ --analyze \ + -Xclang -analyzer-config -Xclang experimental-enable-naive-ctu-analysis=true \ + -Xclang -analyzer-config -Xclang ctu-dir=. \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing=true \ + -Xclang -analyzer-config -Xclang ctu-on-demand-parsing-database=compile_commands.json \ + -Xclang -analyzer-output=plist-multi-file \ + main.cpp + main.cpp:5:12: warning: Division by zero + return 3 / foo(); + ~~^~~~~~~ + 1 warning generated. + $ # The plist file with the result is generated. + $ ls -F + compile_commands.json externalDefMap.txt foo.cpp main.cpp main.plist + $ + +This manual procedure is error-prone and not scalable, therefore to analyze real projects it is recommended to use +`CodeChecker` (on-demand analysis is currently not supported with `scan-build-py`). + +Automated CTU Analysis with CodeChecker +####################################### +The `CodeChecker `_ project fully supports automated CTU analysis with Clang. +Once we have set up the `PATH` environment variable and activated the python `venv`, this is all it takes: + +.. 
code-block:: bash + + $ CodeChecker analyze --ctu --ctu-on-demand compile_commands.json -o reports + $ ls -F + compile_commands.json foo.cpp main.cpp reports/ + $ tree reports + reports + ├── compile_cmd.json + ├── compiler_info.json + ├── foo.cpp_53f6fbf7ab7ec9931301524b551959e2.plist + ├── main.cpp_23db3d8df52ff0812e6e5a03071c8337.plist + ├── metadata.json + └── unique_compile_commands.json + + 0 directories, 6 files + $ + +The `plist` files contain the results of the analysis, which may be viewed with the regular analysis tools. +E.g. one may use `CodeChecker parse` to view the results on the command line: + +.. code-block:: bash + + $ CodeChecker parse reports + [HIGH] /home/egbomrt/ctu_mini_raw_project/main.cpp:5:12: Division by zero [core.DivideZero] + return 3 / foo(); + ^ + + Found 1 defect(s) in main.cpp + + + ----==== Summary ====---- + ----------------------- + Filename | Report count + ----------------------- + main.cpp | 1 + ----------------------- + ----------------------- + Severity | Report count + ----------------------- + HIGH | 1 + ----------------------- + ----=================---- + Total number of reports: 1 + ----=================---- + +Or we can use `CodeChecker parse -e html` to export the results into HTML format: + +.. code-block:: bash + + $ CodeChecker parse -e html -o html_out reports + $ firefox html_out/index.html + +Automated CTU Analysis with scan-build-py (don't do it) +####################################################### +We actively develop CTU with CodeChecker as the driver for this feature; `scan-build-py` is not actively developed for CTU. +`scan-build-py` has various errors and issues; expect it to work with only very basic projects. + +Currently, on-demand analysis is not supported with `scan-build-py`. 
+ diff --git a/clang/include/clang/CrossTU/CrossTranslationUnit.h b/clang/include/clang/CrossTU/CrossTranslationUnit.h --- a/clang/include/clang/CrossTU/CrossTranslationUnit.h +++ b/clang/include/clang/CrossTU/CrossTranslationUnit.h @@ -33,6 +33,10 @@ class NamedDecl; class TranslationUnitDecl; +namespace tooling { +class JSONCompilationDatabase; +} + namespace cross_tu { enum class index_error_code { @@ -42,12 +46,14 @@ multiple_definitions, missing_definition, failed_import, + failed_to_load_compilation_database, failed_to_get_external_ast, failed_to_generate_usr, triple_mismatch, lang_mismatch, lang_dialect_mismatch, - load_threshold_reached + load_threshold_reached, + ambiguous_invocation_list }; class IndexError : public llvm::ErrorInfo { @@ -78,7 +84,8 @@ }; /// This function parses an index file that determines which -/// translation unit contains which definition. +/// translation unit contains which definition. The IndexPath is not prefixed +/// with CTUDir, so an absolute path is expected for consistent results. /// /// The index file format is the following: /// each line consists of an USR and a filepath separated by a space. @@ -86,7 +93,7 @@ /// \return Returns a map where the USR is the key and the filepath is the value /// or an error. llvm::Expected> -parseCrossTUIndex(StringRef IndexPath, StringRef CrossTUDir); +parseCrossTUIndex(StringRef IndexPath); std::string createCrossTUIndexString(const llvm::StringMap &Index); @@ -209,14 +216,54 @@ /// imported the FileID. ImportedFileIDMap ImportedFileIDs; - /// Functor for loading ASTUnits from AST-dump files. - class ASTFileLoader { + using LoadResultTy = llvm::Expected>; + + class ASTLoader { + public: + /// Load the ASTUnit by an identifier. Subclasses should determine what this + /// would be. The function is used with a string read from the CTU index, + /// and the method used for loading determines the semantic meaning of + /// Identifier. 
+ virtual LoadResultTy load(StringRef Identifier) = 0; + virtual ~ASTLoader() = default; + }; + + /// Implementation for loading ASTUnits from AST-dump files. + class ASTFileLoader : public ASTLoader { + public: + ASTFileLoader(CompilerInstance &CI, StringRef CTUDir); + + /// ASTFileLoader uses the path of the dump file as Identifier. + LoadResultTy load(StringRef Identifier) override; + + private: + CompilerInstance &CI; + StringRef CTUDir; + }; + + /// Implementation for loading ASTUnits by parsing them on-demand. + class ASTOnDemandLoader : public ASTLoader { public: - ASTFileLoader(const CompilerInstance &CI); - std::unique_ptr operator()(StringRef ASTFilePath); + ASTOnDemandLoader(CompilerInstance &CI, StringRef InvocationListFilePath); + + /// ASTOnDemandLoader uses the path of the source file to be parsed as + /// Identifier. + LoadResultTy load(StringRef Identifier) override; + + llvm::Error lazyInitCompileCommands(); private: - const CompilerInstance &CI; + CompilerInstance &CI; + /// The path to the file containing the invocation list, which is in YAML + /// format, and contains a mapping from source files to compiler invocations + /// that produce the AST used for analysis. + StringRef InvocationListFilePath; + + using InvocationListTy = + llvm::StringMap>; + /// In case of on-demand parsing, the invocations for parsing the source + /// files are stored. + llvm::Optional InvocationList; }; /// Maintain number of AST loads and check for reaching the load limit. @@ -242,7 +289,7 @@ /// are the concerns of ASTUnitStorage class. class ASTUnitStorage { public: - ASTUnitStorage(const CompilerInstance &CI); + ASTUnitStorage(CompilerInstance &CI); /// Loads an ASTUnit for a function. /// /// \param FunctionName USR name of the function. @@ -287,18 +334,16 @@ using IndexMapTy = BaseMapTy; IndexMapTy NameFileMap; - ASTFileLoader FileAccessor; + std::unique_ptr Loader; - /// Limit the number of loaded ASTs. 
Used to limit the memory usage of the - /// CrossTranslationUnitContext. - /// The ASTUnitStorage has the knowledge about if the AST to load is - /// actually loaded or returned from cache. This information is needed to - /// maintain the counter. + /// Limit the number of loaded ASTs. It is used to limit the memory usage + /// of the CrossTranslationUnitContext. The ASTUnitStorage has the + /// information whether the AST to load is actually loaded or returned from + /// cache. This information is needed to maintain the counter. ASTLoadGuard LoadGuard; }; ASTUnitStorage ASTStorage; - }; } // namespace cross_tu diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def --- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def +++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def @@ -381,6 +381,22 @@ "the name of the file containing the CTU index of definitions.", "externalDefMap.txt") +ANALYZER_OPTION(bool, CTUOnDemandParsing, "ctu-on-demand-parsing", + "Whether to parse function definitions from external TUs in " + "an on-demand manner during analysis. When using on-demand " + "parsing there is no need for pre-dumping ASTs. External " + "definition mapping is still needed, and a valid compilation " + "database with compile commands for the external TUs is also " + "necessary. Disabled by default.", + false) + +ANALYZER_OPTION( + StringRef, CTUInvocationList, "ctu-invocation-list", + "The path to the YAML format file containing a mapping from source file " + "paths to command-line invocations represented as a list of arguments. 
" + "This invocation is used produce the source-file's AST.", + "invocations.yaml") + ANALYZER_OPTION( StringRef, ModelPath, "model-path", "The analyzer can inline an alternative implementation written in C at the " diff --git a/clang/lib/CrossTU/CrossTranslationUnit.cpp b/clang/lib/CrossTU/CrossTranslationUnit.cpp --- a/clang/lib/CrossTU/CrossTranslationUnit.cpp +++ b/clang/lib/CrossTU/CrossTranslationUnit.cpp @@ -14,17 +14,24 @@ #include "clang/AST/Decl.h" #include "clang/Basic/TargetInfo.h" #include "clang/CrossTU/CrossTUDiagnostic.h" +#include "clang/Driver/Driver.h" +#include "clang/Driver/Options.h" #include "clang/Frontend/ASTUnit.h" #include "clang/Frontend/CompilerInstance.h" #include "clang/Frontend/TextDiagnosticPrinter.h" #include "clang/Index/USRGeneration.h" -#include "llvm/ADT/Triple.h" +#include "llvm/ADT/Optional.h" #include "llvm/ADT/Statistic.h" +#include "llvm/ADT/Triple.h" +#include "llvm/Option/ArgList.h" #include "llvm/Support/ErrorHandling.h" #include "llvm/Support/ManagedStatic.h" #include "llvm/Support/Path.h" +#include "llvm/Support/YAMLParser.h" #include "llvm/Support/raw_ostream.h" +#include #include +#include #include namespace clang { @@ -100,6 +107,8 @@ return "Failed to import the definition."; case index_error_code::failed_to_get_external_ast: return "Failed to load external AST source."; + case index_error_code::failed_to_load_compilation_database: + return "Failed to load compilation database."; case index_error_code::failed_to_generate_usr: return "Failed to generate USR."; case index_error_code::triple_mismatch: @@ -110,6 +119,9 @@ return "Language dialect mismatch"; case index_error_code::load_threshold_reached: return "Load threshold reached"; + case index_error_code::ambiguous_invocation_list: + return "Invocation list contains multiple references to the same source" + " file."; } llvm_unreachable("Unrecognized index_error_code."); } @@ -129,7 +141,7 @@ } llvm::Expected> -parseCrossTUIndex(StringRef IndexPath, StringRef 
CrossTUDir) { +parseCrossTUIndex(StringRef IndexPath) { std::ifstream ExternalMapFile{std::string(IndexPath)}; if (!ExternalMapFile) return llvm::make_error(index_error_code::missing_index_file, @@ -139,21 +151,26 @@ std::string Line; unsigned LineNo = 1; while (std::getline(ExternalMapFile, Line)) { - const size_t Pos = Line.find(" "); - if (Pos > 0 && Pos != std::string::npos) { - StringRef LineRef{Line}; - StringRef LookupName = LineRef.substr(0, Pos); - if (Result.count(LookupName)) + StringRef LineRef{Line}; + const size_t Delimiter = LineRef.find(" "); + if (Delimiter > 0 && Delimiter != std::string::npos) { + StringRef LookupName = LineRef.substr(0, Delimiter); + + // Store paths with native-style directory separator. + SmallVector FilePath; + llvm::Twine{LineRef.substr(Delimiter + 1)}.toVector(FilePath); + llvm::sys::path::native(FilePath); + + bool InsertionOccured; + std::tie(std::ignore, InsertionOccured) = + Result.try_emplace(LookupName, FilePath.begin(), FilePath.end()); + if (!InsertionOccured) return llvm::make_error( index_error_code::multiple_definitions, IndexPath.str(), LineNo); - StringRef FileName = LineRef.substr(Pos + 1); - SmallString<256> FilePath = CrossTUDir; - llvm::sys::path::append(FilePath, FileName); - Result[LookupName] = std::string(FilePath); } else return llvm::make_error( index_error_code::invalid_index_format, IndexPath.str(), LineNo); - LineNo++; + ++LineNo; } return Result; } @@ -341,30 +358,44 @@ } } -CrossTranslationUnitContext::ASTFileLoader::ASTFileLoader( - const CompilerInstance &CI) - : CI(CI) {} +CrossTranslationUnitContext::ASTFileLoader::ASTFileLoader(CompilerInstance &CI, + StringRef CTUDir) + : CI(CI), CTUDir(CTUDir) {} -std::unique_ptr -CrossTranslationUnitContext::ASTFileLoader::operator()(StringRef ASTFilePath) { +CrossTranslationUnitContext::LoadResultTy +CrossTranslationUnitContext::ASTFileLoader::load(StringRef Identifier) { // Load AST from ast-dump. 
+ + llvm::SmallString<256> Path; + + if (llvm::sys::path::is_absolute(Identifier)) { + Path = Identifier; + } else { + Path = CTUDir; + llvm::sys::path::append(Path, Identifier); + } + IntrusiveRefCntPtr DiagOpts = new DiagnosticOptions(); TextDiagnosticPrinter *DiagClient = new TextDiagnosticPrinter(llvm::errs(), &*DiagOpts); IntrusiveRefCntPtr DiagID(new DiagnosticIDs()); IntrusiveRefCntPtr Diags( new DiagnosticsEngine(DiagID, &*DiagOpts, DiagClient)); - return ASTUnit::LoadFromASTFile( - std::string(ASTFilePath), CI.getPCHContainerOperations()->getRawReader(), + std::string(Path.str()), CI.getPCHContainerOperations()->getRawReader(), ASTUnit::LoadEverything, Diags, CI.getFileSystemOpts()); } CrossTranslationUnitContext::ASTUnitStorage::ASTUnitStorage( - const CompilerInstance &CI) - : FileAccessor(CI), LoadGuard(const_cast(CI) - .getAnalyzerOpts() - ->CTUImportThreshold) {} + CompilerInstance &CI) + : LoadGuard(CI.getAnalyzerOpts()->CTUImportThreshold) { + + AnalyzerOptionsRef Opts = CI.getAnalyzerOpts(); + if (Opts->CTUOnDemandParsing) + Loader = std::make_unique(CI, Opts->CTUInvocationList); + else + Loader = std::make_unique(CI, Opts->CTUDir); +} llvm::Expected CrossTranslationUnitContext::ASTUnitStorage::getASTUnitForFile( @@ -380,8 +411,12 @@ index_error_code::load_threshold_reached); } - // Load the ASTUnit from the pre-dumped AST file specified by ASTFileName. - std::unique_ptr LoadedUnit = FileAccessor(FileName); + auto LoadAttempt = Loader->load(FileName); + + if (!LoadAttempt) + return LoadAttempt.takeError(); + + std::unique_ptr LoadedUnit = std::move(LoadAttempt.get()); // Need the raw pointer and the unique_ptr as well. ASTUnit *Unit = LoadedUnit.get(); @@ -461,7 +496,7 @@ else llvm::sys::path::append(IndexFile, IndexName); - if (auto IndexMapping = parseCrossTUIndex(IndexFile, CrossTUDir)) { + if (auto IndexMapping = parseCrossTUIndex(IndexFile)) { // Initialize member map. 
NameFileMap = *IndexMapping; return llvm::Error::success(); @@ -471,6 +506,10 @@ }; } +CrossTranslationUnitContext::ASTOnDemandLoader::ASTOnDemandLoader( + CompilerInstance &CI, StringRef InvocationListFilePath) + : CI(CI), InvocationListFilePath(InvocationListFilePath) {} + llvm::Expected CrossTranslationUnitContext::loadExternalAST( StringRef LookupName, StringRef CrossTUDir, StringRef IndexName, bool DisplayCTUProgress) { @@ -494,6 +533,148 @@ return Unit; } +/// Load the AST from a source-file, which is supposed to be located inside the +/// compilation database \p InvocationList. The compilation database +/// can contain the path of the file under the key "file" as an absolute path, +/// or as a relative path. When emitting diagnostics, plist files may contain +/// references to a location in a TU, that is different from the main TU. In +/// such cases, the file path emitted by the DiagnosticEngine is based on how +/// the exact invocation is assembled inside the ClangTool, which performs the +/// building of the ASTs. In order to ensure absolute paths inside the +/// diagnostics, we use the ArgumentsAdjuster API of ClangTool to make sure that +/// the invocation inside ClangTool is always made with an absolute path. \p +/// Identifier is assumed to be the lookup-name of the file, which comes from +/// the Index. The Index is built by the \p clang-extdef-mapping tool, which is +/// supposed to generate absolute paths. +/// +/// We must have absolute paths inside the plist, because otherwise we would +/// not be able to parse the bug, because we could not find the files with +/// relative paths. The directory of one entry in the compilation db may be +/// different from the directory where the plist is interpreted. +/// +/// Note that as the ClangTool is instantiated with a lookup-vector, which +/// contains a single entry; the supposedly absolute path of the source file. 
+/// So, the ArgumentAdjuster will only be used on the single corresponding +/// invocation. This guarantees that even if two files match in name, but +/// differ in location, only the correct one's invocation will be handled. This +/// is due to the fact that the lookup is done correctly inside the +/// InvocationListFilePath, so it works for already absolute paths given under +/// the "file" entry of the compilation database, but also if a relative path is +/// given. In such a case, the lookup uses the "directory" entry as well to +/// identify the correct file. +CrossTranslationUnitContext::LoadResultTy +CrossTranslationUnitContext::ASTOnDemandLoader::load(StringRef Identifier) { + + if (auto InitError = lazyInitCompileCommands()) + return std::move(InitError); + + assert(InvocationList); + + const SmallVector &InvocationCommand = + (*InvocationList)[Identifier]; + if (InvocationCommand.empty()) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + IntrusiveRefCntPtr DiagOpts{&CI.getDiagnosticOpts()}; + auto *DiagClient = new ForwardingDiagnosticConsumer{CI.getDiagnosticClient()}; + IntrusiveRefCntPtr DiagID{ + CI.getDiagnostics().getDiagnosticIDs()}; + IntrusiveRefCntPtr Diags( + new DiagnosticsEngine{DiagID, &*DiagOpts, DiagClient}); + + SmallVector CommandLineArgs(InvocationCommand.size()); + std::transform(InvocationCommand.begin(), InvocationCommand.end(), + CommandLineArgs.begin(), + [](auto &&CmdPart) { return CmdPart.c_str(); }); + + return std::unique_ptr(ASTUnit::LoadFromCommandLine( + CommandLineArgs.begin(), (CommandLineArgs.end()), + CI.getPCHContainerOperations(), Diags, + CI.getHeaderSearchOpts().ResourceDir)); +} + +llvm::Error +CrossTranslationUnitContext::ASTOnDemandLoader::lazyInitCompileCommands() { + /// Lazily initialize the invocation list member used for on-demand parsing. 
+ if (InvocationList) + return llvm::Error::success(); + + InvocationList = InvocationListTy{}; + + auto FileContent = llvm::MemoryBuffer::getFile(InvocationListFilePath); + if (!FileContent) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + std::unique_ptr ContentBuffer = std::move(*FileContent); + assert(ContentBuffer && "If no error was produced after loading, the pointer " + "should not be nullptr."); + + /// LLVM YAML parser is used to extract information from invocation list file. + llvm::SourceMgr SM; + llvm::yaml::Stream InvocationFiles(*ContentBuffer, SM); + + /// Only the first document is processed. + llvm::yaml::document_iterator FirstInvocationFile = InvocationFiles.begin(); + + /// There has to be at least one document available. + if (FirstInvocationFile == InvocationFiles.end()) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + llvm::yaml::Node *DocumentRoot = FirstInvocationFile->getRoot(); + if (!DocumentRoot) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + /// According to the format specified the document must be a mapping, where + /// the keys are paths to source files, and values are sequences of invocation + /// parts. + auto *Mappings = dyn_cast(DocumentRoot); + if (!Mappings) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + for (auto &NextMapping : *Mappings) { + /// The keys should be strings, which represent a source-file path. + auto *Key = dyn_cast(NextMapping.getKey()); + if (!Key) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + /// The values should be sequences of strings, each representing a part of + /// the invocation. 
+ auto *Args = dyn_cast(NextMapping.getValue()); + if (!Args) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + + SmallVector ValueStorage; + StringRef SourcePath = Key->getValue(ValueStorage); + + // Store paths with native-style directory separator. + SmallVector NativeSourcePath; + llvm::Twine{SourcePath}.toVector(NativeSourcePath); + llvm::sys::path::native(NativeSourcePath); + + StringRef InvocationKey{NativeSourcePath.begin(), NativeSourcePath.size()}; + + for (auto &Arg : *Args) { + auto *CmdString = dyn_cast(&Arg); + if (!CmdString) + return llvm::make_error( + index_error_code::failed_to_get_external_ast); + /// Every conversion starts with an empty working storage, as it is not + /// clear if this is a requirement of the YAML parser. + ValueStorage.clear(); + (*InvocationList)[InvocationKey].emplace_back( + CmdString->getValue(ValueStorage)); + } + } + + return llvm::Error::success(); +} + template llvm::Expected CrossTranslationUnitContext::importDefinitionImpl(const T *D, ASTUnit *Unit) { diff --git a/clang/lib/Frontend/CompilerInvocation.cpp b/clang/lib/Frontend/CompilerInvocation.cpp --- a/clang/lib/Frontend/CompilerInvocation.cpp +++ b/clang/lib/Frontend/CompilerInvocation.cpp @@ -511,6 +511,12 @@ Diags->Report(diag::err_analyzer_config_invalid_input) << "ctu-dir" << "a filename"; + if (AnOpts.CTUOnDemandParsing && + !llvm::sys::fs::exists(AnOpts.CTUInvocationList)) + Diags->Report(diag::err_analyzer_config_invalid_input) + << "ctu-invocation-list" + << "a filename"; + if (!AnOpts.ModelPath.empty() && !llvm::sys::fs::is_directory(AnOpts.ModelPath)) Diags->Report(diag::err_analyzer_config_invalid_input) << "model-path" diff --git a/clang/test/Analysis/Inputs/ctu-other.c b/clang/test/Analysis/Inputs/ctu-other.c --- a/clang/test/Analysis/Inputs/ctu-other.c +++ b/clang/test/Analysis/Inputs/ctu-other.c @@ -31,10 +31,12 @@ } // Test that asm import does not fail. +// TODO: Support the GNU extension asm keyword as well. 
+// Example using the GNU extension: asm("mov $42, %0" : "=r"(res)); int inlineAsm() { int res; - asm("mov $42, %0" - : "=r"(res)); + __asm__("mov $42, %0" + : "=r"(res)); return res; } diff --git a/clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.txt b/clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.ast-dump.txt rename from clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.txt rename to clang/test/Analysis/Inputs/ctu-other.c.externalDefMap.ast-dump.txt diff --git a/clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.txt b/clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt rename from clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.txt rename to clang/test/Analysis/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c --- a/clang/test/Analysis/analyzer-config.c +++ b/clang/test/Analysis/analyzer-config.c @@ -43,6 +43,8 @@ // CHECK-NEXT: ctu-dir = "" // CHECK-NEXT: ctu-import-threshold = 100 // CHECK-NEXT: ctu-index-name = externalDefMap.txt +// CHECK-NEXT: ctu-invocation-list = invocations.yaml +// CHECK-NEXT: ctu-on-demand-parsing = false // CHECK-NEXT: deadcode.DeadStores:ShowFixIts = false // CHECK-NEXT: deadcode.DeadStores:WarnForDeadNestedAssignments = true // CHECK-NEXT: debug.AnalysisOrder:* = false diff --git a/clang/test/Analysis/ctu-different-triples.cpp b/clang/test/Analysis/ctu-different-triples.cpp --- a/clang/test/Analysis/ctu-different-triples.cpp +++ b/clang/test/Analysis/ctu-different-triples.cpp @@ -2,7 +2,7 @@ // RUN: mkdir -p %t/ctudir // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp -// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt %t/ctudir/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt // RUN: %clang_analyze_cc1 -std=c++14 -triple powerpc64-montavista-linux-gnu \ 
// RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ diff --git a/clang/test/Analysis/ctu-main.c b/clang/test/Analysis/ctu-main.c --- a/clang/test/Analysis/ctu-main.c +++ b/clang/test/Analysis/ctu-main.c @@ -2,7 +2,7 @@ // RUN: mkdir -p %t/ctudir2 // RUN: %clang_cc1 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir2/ctu-other.c.ast %S/Inputs/ctu-other.c -// RUN: cp %S/Inputs/ctu-other.c.externalDefMap.txt %t/ctudir2/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.c.externalDefMap.ast-dump.txt %t/ctudir2/externalDefMap.txt // RUN: %clang_cc1 -triple x86_64-pc-linux-gnu -fsyntax-only -std=c89 -analyze \ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ @@ -50,6 +50,10 @@ void testImplicit() { int res = identImplicit(6); // external implicit functions are not inlined clang_analyzer_eval(res == 6); // expected-warning{{TRUE}} + // Call something with uninitialized from the same function in which the implicit was called. + // This is necessary to reproduce a special bug in NoStoreFuncVisitor. 
+ int uninitialized; + h(uninitialized); // expected-warning{{1st function call argument is an uninitialized value}} } // Tests the import of functions that have a struct parameter diff --git a/clang/test/Analysis/ctu-main.cpp b/clang/test/Analysis/ctu-main.cpp --- a/clang/test/Analysis/ctu-main.cpp +++ b/clang/test/Analysis/ctu-main.cpp @@ -4,7 +4,7 @@ // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir/ctu-chain.cpp.ast %S/Inputs/ctu-chain.cpp -// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt %t/ctudir/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt // RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ diff --git a/clang/test/Analysis/ctu-on-demand-parsing.c b/clang/test/Analysis/ctu-on-demand-parsing.c new file mode 100644 --- /dev/null +++ b/clang/test/Analysis/ctu-on-demand-parsing.c @@ -0,0 +1,76 @@ +// RUN: rm -rf %t +// RUN: mkdir -p %t +// RUN: cp "%s" "%t/ctu-on-demand-parsing.c" +// RUN: cp "%S/Inputs/ctu-other.c" "%t/ctu-other.c" +// Path substitutions on Windows platform could contain backslashes. These are escaped in the json file. +// RUN: echo '[{"directory":"%t","command":"gcc -std=c89 -Wno-visibility ctu-other.c","file":"ctu-other.c"}]' | sed -e 's/\\/\\\\/g' > %t/compile_commands.json +// RUN: echo '"%t/ctu-other.c": ["gcc", "-std=c89", "-Wno-visibility", "ctu-other.c"]' | sed -e 's/\\/\\\\/g' > %t/invocations.yaml +// RUN: cd "%t" && %clang_extdef_map "%t/ctu-other.c" > externalDefMap.txt +// RUN: cd "%t" && %clang_cc1 -fsyntax-only -std=c89 -analyze \ +// RUN: -analyzer-checker=core,debug.ExprInspection \ +// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ +// RUN: -analyzer-config ctu-dir=. 
\ +// RUN: -analyzer-config ctu-on-demand-parsing=true \ +// RUN: -analyzer-config ctu-invocation-list=invocations.yaml \ +// RUN: -verify ctu-on-demand-parsing.c + +void clang_analyzer_eval(int); + +// Test typedef and global variable in function. +typedef struct { + int a; + int b; +} FooBar; +extern FooBar fb; +int f(int); +void testGlobalVariable() { + clang_analyzer_eval(f(5) == 1); // expected-warning{{TRUE}} +} + +// Test enums. +int enumCheck(void); +enum A { x, + y, + z }; +void testEnum() { + clang_analyzer_eval(x == 0); // expected-warning{{TRUE}} + clang_analyzer_eval(enumCheck() == 42); // expected-warning{{TRUE}} +} + +// Test that asm import does not fail. +int inlineAsm(); +int testInlineAsm() { return inlineAsm(); } + +// Test reporting error in a macro. +struct S; +int g(struct S *); +void testMacro(void) { + g(0); + // expected-warning@ctu-other.c:29 {{Access to field 'a' results in a dereference of a null pointer (loaded from variable 'ctx')}} +} + +// The external function prototype is incomplete. +// warning:implicit functions are prohibited by c99 +void testImplicit() { + int res = identImplicit(6); // external implicit functions are not inlined + clang_analyzer_eval(res == 6); // expected-warning{{TRUE}} + // Call something with uninitialized from the same function in which the + // implicit was called. This is necessary to reproduce a special bug in + // NoStoreFuncVisitor. + int uninitialized; + h(uninitialized); // expected-warning{{1st function call argument is an uninitialized value}} +} + +// Tests the import of functions that have a struct parameter +// defined in its prototype. 
+struct DataType { + int a; + int b; +}; +int structInProto(struct DataType *d); +void testStructDefInArgument() { + struct DataType d; + d.a = 1; + d.b = 0; + clang_analyzer_eval(structInProto(&d) == 0); // expected-warning{{TRUE}} expected-warning{{FALSE}} +} diff --git a/clang/test/Analysis/ctu-on-demand-parsing.cpp b/clang/test/Analysis/ctu-on-demand-parsing.cpp new file mode 100644 --- /dev/null +++ b/clang/test/Analysis/ctu-on-demand-parsing.cpp @@ -0,0 +1,105 @@ +// RUN: rm -rf %t +// RUN: mkdir -p %t/Inputs +// RUN: cp %s %t/ctu-on-demand-parsing.cpp +// RUN: cp %S/ctu-hdr.h %t/ctu-hdr.h +// RUN: cp %S/Inputs/ctu-chain.cpp %t/Inputs/ctu-chain.cpp +// RUN: cp %S/Inputs/ctu-other.cpp %t/Inputs/ctu-other.cpp +// Path substitutions on Windows platform could contain backslashes. These are escaped in the json file. +// RUN: echo '[{"directory":"%t/Inputs","command":"clang++ ctu-chain.cpp","file":"ctu-chain.cpp"},{"directory":"%t/Inputs","command":"clang++ ctu-other.cpp","file":"ctu-other.cpp"}]' | sed -e 's/\\/\\\\/g' > %t/compile_commands.json +// RUN: echo '{"%t/Inputs/ctu-chain.cpp": ["g++", "%t/Inputs/ctu-chain.cpp"], "%t/Inputs/ctu-other.cpp": ["g++", "%t/Inputs/ctu-other.cpp"]}' | sed -e 's/\\/\\\\/g' > %t/invocations.yaml +// RUN: cd "%t" && %clang_extdef_map Inputs/ctu-chain.cpp Inputs/ctu-other.cpp > externalDefMap.txt +// RUN: cd "%t" && %clang_analyze_cc1 \ +// RUN: -analyzer-checker=core,debug.ExprInspection \ +// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ +// RUN: -analyzer-config ctu-dir=. \ +// RUN: -analyzer-config ctu-on-demand-parsing=true \ +// RUN: -analyzer-config ctu-invocation-list=invocations.yaml \ +// RUN: -verify ctu-on-demand-parsing.cpp +// RUN: cd "%t" && %clang_analyze_cc1 \ +// RUN: -analyzer-checker=core,debug.ExprInspection \ +// RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ +// RUN: -analyzer-config ctu-dir=. 
\ +// RUN: -analyzer-config ctu-on-demand-parsing=true \ +// RUN: -analyzer-config ctu-invocation-list=invocations.yaml \ +// RUN: -analyzer-config display-ctu-progress=true ctu-on-demand-parsing.cpp 2>&1 | FileCheck %t/ctu-on-demand-parsing.cpp + +// CHECK: CTU loaded AST file: {{.*}}ctu-other.cpp +// CHECK: CTU loaded AST file: {{.*}}ctu-chain.cpp + +#include "ctu-hdr.h" + +void clang_analyzer_eval(int); + +int f(int); +int g(int); +int h(int); + +int callback_to_main(int x) { return x + 1; } + +namespace myns { +int fns(int x); + +namespace embed_ns { +int fens(int x); +} + +class embed_cls { +public: + int fecl(int x); +}; +} // namespace myns + +class mycls { +public: + int fcl(int x); + virtual int fvcl(int x); + static int fscl(int x); + + class embed_cls2 { + public: + int fecl2(int x); + }; +}; + +class derived : public mycls { +public: + virtual int fvcl(int x) override; +}; + +namespace chns { +int chf1(int x); +} + +int fun_using_anon_struct(int); +int other_macro_diag(int); + +void test_virtual_functions(mycls *obj) { + // The dynamic type is known. + clang_analyzer_eval(mycls().fvcl(1) == 8); // expected-warning{{TRUE}} + clang_analyzer_eval(derived().fvcl(1) == 9); // expected-warning{{TRUE}} + // We cannot decide about the dynamic type. 
+ clang_analyzer_eval(obj->fvcl(1) == 8); // expected-warning{{FALSE}} expected-warning{{TRUE}} + clang_analyzer_eval(obj->fvcl(1) == 9); // expected-warning{{FALSE}} expected-warning{{TRUE}} +} + +int main() { + clang_analyzer_eval(f(3) == 2); // expected-warning{{TRUE}} + clang_analyzer_eval(f(4) == 3); // expected-warning{{TRUE}} + clang_analyzer_eval(f(5) == 3); // expected-warning{{FALSE}} + clang_analyzer_eval(g(4) == 6); // expected-warning{{TRUE}} + clang_analyzer_eval(h(2) == 8); // expected-warning{{TRUE}} + + clang_analyzer_eval(myns::fns(2) == 9); // expected-warning{{TRUE}} + clang_analyzer_eval(myns::embed_ns::fens(2) == -1); // expected-warning{{TRUE}} + clang_analyzer_eval(mycls().fcl(1) == 6); // expected-warning{{TRUE}} + clang_analyzer_eval(mycls::fscl(1) == 7); // expected-warning{{TRUE}} + clang_analyzer_eval(myns::embed_cls().fecl(1) == -6); // expected-warning{{TRUE}} + clang_analyzer_eval(mycls::embed_cls2().fecl2(0) == -11); // expected-warning{{TRUE}} + + clang_analyzer_eval(chns::chf1(4) == 12); // expected-warning{{TRUE}} + clang_analyzer_eval(fun_using_anon_struct(8) == 8); // expected-warning{{TRUE}} + + clang_analyzer_eval(other_macro_diag(1) == 1); // expected-warning{{TRUE}} + // expected-warning@Inputs/ctu-other.cpp:93{{REACHABLE}} + MACRODIAG(); // expected-warning{{REACHABLE}} +} diff --git a/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp b/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp --- a/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp +++ b/clang/test/Analysis/ctu-unknown-parts-in-triples.cpp @@ -5,7 +5,7 @@ // RUN: mkdir -p %t/ctudir // RUN: %clang_cc1 -std=c++14 -triple x86_64-pc-linux-gnu \ // RUN: -emit-pch -o %t/ctudir/ctu-other.cpp.ast %S/Inputs/ctu-other.cpp -// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.txt %t/ctudir/externalDefMap.txt +// RUN: cp %S/Inputs/ctu-other.cpp.externalDefMap.ast-dump.txt %t/ctudir/externalDefMap.txt // RUN: %clang_analyze_cc1 -std=c++14 -triple x86_64-unknown-linux-gnu 
\ // RUN: -analyzer-checker=core,debug.ExprInspection \ // RUN: -analyzer-config experimental-enable-naive-ctu-analysis=true \ diff --git a/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp b/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp --- a/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp +++ b/clang/unittests/CrossTU/CrossTranslationUnitTest.cpp @@ -7,10 +7,11 @@ //===----------------------------------------------------------------------===// #include "clang/CrossTU/CrossTranslationUnit.h" -#include "clang/Frontend/CompilerInstance.h" #include "clang/AST/ASTConsumer.h" +#include "clang/Frontend/CompilerInstance.h" #include "clang/Frontend/FrontendAction.h" #include "clang/Tooling/Tooling.h" +#include "llvm/ADT/Optional.h" #include "llvm/Support/FileSystem.h" #include "llvm/Support/Path.h" #include "llvm/Support/ToolOutputFile.h" @@ -162,7 +163,7 @@ IndexFile.os().flush(); EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName)); llvm::Expected> IndexOrErr = - parseCrossTUIndex(IndexFileName, ""); + parseCrossTUIndex(IndexFileName); EXPECT_TRUE((bool)IndexOrErr); llvm::StringMap ParsedIndex = IndexOrErr.get(); for (const auto &E : Index) { @@ -173,25 +174,5 @@ EXPECT_TRUE(Index.count(E.getKey())); } -TEST(CrossTranslationUnit, CTUDirIsHandledCorrectly) { - llvm::StringMap Index; - Index["a"] = "/b/c/d"; - std::string IndexText = createCrossTUIndexString(Index); - - int IndexFD; - llvm::SmallString<256> IndexFileName; - ASSERT_FALSE(llvm::sys::fs::createTemporaryFile("index", "txt", IndexFD, - IndexFileName)); - llvm::ToolOutputFile IndexFile(IndexFileName, IndexFD); - IndexFile.os() << IndexText; - IndexFile.os().flush(); - EXPECT_TRUE(llvm::sys::fs::exists(IndexFileName)); - llvm::Expected> IndexOrErr = - parseCrossTUIndex(IndexFileName, "/ctudir"); - EXPECT_TRUE((bool)IndexOrErr); - llvm::StringMap ParsedIndex = IndexOrErr.get(); - EXPECT_EQ(ParsedIndex["a"], "/ctudir/b/c/d"); -} - } // end namespace cross_tu } // end namespace clang
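Reviewer note: the new on-demand tests write `invocations.yaml` by hand with `echo`, mapping each source file path to the argument list used to compile it. For a real project that mapping would probably be derived from the compilation database. The following is a minimal sketch of that conversion, not part of this patch. The helper names `invocation_list` and `dump_invocation_list` are hypothetical, and the sketch assumes the analyzer accepts the same JSON-style flow mapping that `ctu-on-demand-parsing.cpp` writes.

```python
import json
import os
import shlex


def invocation_list(compilation_db):
    """Map each absolute source path to its compiler argument list.

    `compilation_db` is the parsed content of compile_commands.json:
    a list of entries with "directory", "file" and either "command"
    (one shell-quoted string) or "arguments" (a list of strings).
    """
    invocations = {}
    for entry in compilation_db:
        # Keys are absolute paths, as in the tests of this patch.
        source = os.path.normpath(
            os.path.join(entry["directory"], entry["file"]))
        if "arguments" in entry:
            argv = list(entry["arguments"])
        else:
            argv = shlex.split(entry["command"])
        # Assumption: arguments are kept verbatim, so relative paths in
        # them stay relative to the entry's "directory".
        invocations[source] = argv
    return invocations


def dump_invocation_list(invocations):
    # Assumption: a JSON object is used as a YAML flow mapping, which is
    # the shape the C++ test in this patch writes by hand.
    return json.dumps(invocations, indent=2)


# Example entry modeled on the one in ctu-on-demand-parsing.c; the
# directory "/tmp/proj" is a placeholder.
db = [
    {
        "directory": "/tmp/proj",
        "command": "gcc -std=c89 -Wno-visibility ctu-other.c",
        "file": "ctu-other.c",
    },
]
result = invocation_list(db)
text = dump_invocation_list(result)
print(text)
```

The output could be saved as `invocations.yaml` inside the directory passed as `ctu-dir` and selected with `-analyzer-config ctu-invocation-list=invocations.yaml`, mirroring the RUN lines above. Whether relative paths inside the argument lists resolve correctly depends on the working directory of the analyzer invocation, which the tests handle by running `cd "%t"` first.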