This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
lib/Tooling/
-
Tooling/
-
JSONCompilationDatabase.cpp
-
test/Tooling/
-
Tooling/
-
auto-detect-from-source.cpp

Differential D51729

[Tooling] JSONCompilationDatabasePlugin infers compile commands for missing files
ClosedPublic

Authored by sammccall on Sep 6 2018, 5:35 AM.

Download Raw Diff

Details

Reviewers

bkramer

Commits

rG9d3530bd4eba: [Tooling] JSONCompilationDatabasePlugin infers compile commands for missing…
rL342228: [Tooling] JSONCompilationDatabasePlugin infers compile commands for missing…
rC342228: [Tooling] JSONCompilationDatabasePlugin infers compile commands for missing…

Summary

See the existing InterpolatingCompilationDatabase for details on how this works.
We've been using this in clangd for a while, the heuristics seem to work well.

Diff Detail

Repository: rC Clang

Event Timeline

sammccall created this revision.Sep 6 2018, 5:35 AM

Herald added subscribers: cfe-commits, kadircet, ioeric, ilya-biryukov. · View Herald TranscriptSep 6 2018, 5:35 AM

I like it

This revision is now accepted and ready to land.Sep 12 2018, 10:49 AM

Closed by commit rC342228: [Tooling] JSONCompilationDatabasePlugin infers compile commands for missing… (authored by sammccall). · Explain WhySep 14 2018, 5:25 AM

This revision was automatically updated to reflect the committed changes.

Unless my bisect is mistaken, this broke Clang Python binding test:
https://github.com/llvm-mirror/clang/blob/master/bindings/python/tests/cindex/test_cdb.py#L34

The test apparently assumes that compilation database will not return anything for a file that does not exist. Could you please suggest how the test should be adjusted for this change?

Sorry about that. I wasn't familiar with the python bindings.
Your bisect is correct, we changed the behavior that test is testing: now
an approximate match will be returned.

I guess either:

just remove that test (depending on how we feel about smoke-testing vs

exhaustively testing wrapperl like this)

change it to verify exactly what the fuzzy-matched result is, e.g.
- add -DFEATURE=0 to the first compile command for "Project.cpp" in the

test CDB so it's easy to recognize

make the lookup key "/home/john.doe/MyProject/Project.h" so it's

obvious what the match should be

verify that the lookup result has "-DFEATURE=0" and "Project.h" in

its args
It seems to me that test_lookup_succeed should also verify what gets
returned (otherwise it's not really testing anything at this point).

I can try to put together a patch, though I don't currently have a setup to
test it - let me know if this would be useful!

Before this patch, missing compilation database entries resulted in "Skipping .... Compile command not found." which is assumed by the tests in this clang-query patch: https://reviews.llvm.org/D54109

After this patch, a command is inferred, but this is not always desired.
Use case: find all function calls with a certain name using the compilation database produced by CMake.
Example command:

clang-query -p=/tmp/wsbuild -c 'set output print' -c 'm callExpr(callee(functionDecl(hasName("uat_new"))))' $(grep -rl uat_new epan/dissectors/)

Expected result for some template files which do not exist in the compilation database:

Skipping .../epan/dissectors/asn1/x509af/packet-x509af-template.c. Compile command not found.

Actual result (clang-query tries to parse the contents which will fail since it is not an actual C source file):

.../epan/dissectors/asn1/x509af/packet-x509af-template.c:18:10: fatal error: 'packet-ber.h' file not found

I'm not entirely sure what to do here. The old behavior works great in cases where a complete database is available (produced by CMake). The new behavior might work better for clangd (?), but it breaks a use case (see above).

In D51729#1287421, @Lekensteyn wrote:
Before this patch, missing compilation database entries resulted in "Skipping .... Compile command not found." which is assumed by the tests in this clang-query patch: https://reviews.llvm.org/D54109

After this patch, a command is inferred, but this is not always desired.
Use case: find all function calls with a certain name using the compilation database produced by CMake.
Example command:
clang-query -p=/tmp/wsbuild -c 'set output print' -c 'm callExpr(callee(functionDecl(hasName("uat_new"))))' $(grep -rl uat_new epan/dissectors/)
Expected result for some template files which do not exist in the compilation database:
Skipping .../epan/dissectors/asn1/x509af/packet-x509af-template.c. Compile command not found.
Actual result (clang-query tries to parse the contents which will fail since it is not an actual C source file):
.../epan/dissectors/asn1/x509af/packet-x509af-template.c:18:10: fatal error: 'packet-ber.h' file not found
I'm not entirely sure what to do here. The old behavior works great in cases where a complete database is available (produced by CMake). The new behavior might work better for clangd (?), but it breaks a use case (see above).

For clangd, but also clang-tidy and clang-query when the user *does* want to use it on files not represented in the CDB. (e.g. stale or headers)
There's indeed a tension here, because the CDB discovery needs to have a default configuration.

That said, in this case the behavior looks appropriate to me: you've explicitly specified files on the command line, ignoring them and returning with status 0 seems surprising.
For the case of "search over all TUs in CDB", the CDB does offer the ability to list TUs and iterate over compile commands, and ClangTool lets you run in this mode. We've discussed in the past adding a filename filter to AllTUsExecutor, which would be useful for this purpose and others. @ioeric

In D51729#1288360, @sammccall wrote:

I'm not entirely sure what to do here. The old behavior works great in cases where a complete database is available (produced by CMake). The new behavior might work better for clangd (?), but it breaks a use case (see above).

For clangd, but also clang-tidy and clang-query when the user *does* want to use it on files not represented in the CDB. (e.g. stale or headers)
There's indeed a tension here, because the CDB discovery needs to have a default configuration.

Support for querying header files seems valuable and a good argument to keep this change.

That said, in this case the behavior looks appropriate to me: you've explicitly specified files on the command line, ignoring them and returning with status 0 seems surprising.

The test explicitly lists them, but in practice the find or grep -rl provides the file names which might contain uninteresting matches that are not present in the CDB. As a compromise, it could return with status 1 but proceed querying files that were successfully parsed.

For the case of "search over all TUs in CDB", the CDB does offer the ability to list TUs and iterate over compile commands, and ClangTool lets you run in this mode. We've discussed in the past adding a filename filter to AllTUsExecutor, which would be useful for this purpose and others. @ioeric

AllTUsToolExecutor looks useful to enable concurrency, but a filename filter would not help in my case. The reason I do a "cheap" grep first before using clang-query is because building the AST is slow and consumes a lot of memory (after ~1k .c files, 12G was in use).

Querying the CDB and filtering out entries is currently possible, but quite verbose:

grep -le 'search term' $(jq -r '.[].file' compile_commands.json) | xargs clang-query -c 'm ...'

Thank you for the feedback, my concerns with this CDB patch has been resolved.

Revision Contents

Path

Size

lib/

Tooling/

JSONCompilationDatabase.cpp

5 lines

test/

Tooling/

auto-detect-from-source.cpp

10 lines

Diff 165472

lib/Tooling/JSONCompilationDatabase.cpp

Show First 20 Lines • Show All 151 Lines • ▼ Show 20 Lines	if (Syntax == JSONCommandLineSyntax::Windows) {
std::vector<std::string> Result(T.begin(), T.end());		std::vector<std::string> Result(T.begin(), T.end());
return Result;		return Result;
}		}
assert(Syntax == JSONCommandLineSyntax::Gnu);		assert(Syntax == JSONCommandLineSyntax::Gnu);
CommandLineArgumentParser parser(EscapedCommandLine);		CommandLineArgumentParser parser(EscapedCommandLine);
return parser.parse();		return parser.parse();
}		}

		// This plugin locates a nearby compile_command.json file, and also infers
		// compile commands for files not present in the database.
class JSONCompilationDatabasePlugin : public CompilationDatabasePlugin {		class JSONCompilationDatabasePlugin : public CompilationDatabasePlugin {
std::unique_ptr<CompilationDatabase>		std::unique_ptr<CompilationDatabase>
loadFromDirectory(StringRef Directory, std::string &ErrorMessage) override {		loadFromDirectory(StringRef Directory, std::string &ErrorMessage) override {
SmallString<1024> JSONDatabasePath(Directory);		SmallString<1024> JSONDatabasePath(Directory);
llvm::sys::path::append(JSONDatabasePath, "compile_commands.json");		llvm::sys::path::append(JSONDatabasePath, "compile_commands.json");
return JSONCompilationDatabase::loadFromFile(		auto Base = JSONCompilationDatabase::loadFromFile(
JSONDatabasePath, ErrorMessage, JSONCommandLineSyntax::AutoDetect);		JSONDatabasePath, ErrorMessage, JSONCommandLineSyntax::AutoDetect);
		return Base ? inferMissingCompileCommands(std::move(Base)) : nullptr;
}		}
};		};

} // namespace		} // namespace

// Register the JSONCompilationDatabasePlugin with the		// Register the JSONCompilationDatabasePlugin with the
// CompilationDatabasePluginRegistry using this statically initialized variable.		// CompilationDatabasePluginRegistry using this statically initialized variable.
static CompilationDatabasePluginRegistry::Add<JSONCompilationDatabasePlugin>		static CompilationDatabasePluginRegistry::Add<JSONCompilationDatabasePlugin>
▲ Show 20 Lines • Show All 207 Lines • Show Last 20 Lines

test/Tooling/auto-detect-from-source.cpp

	// RUN: rm -rf %t			// RUN: rm -rf %t
	// RUN: mkdir %t			// RUN: mkdir %t
	// RUN: echo "[{\"directory\":\".\",\"command\":\"clang++ -c %/t/test.cpp\",\"file\":\"%/t/test.cpp\"}]" \| sed -e 's/\\/\\\\/g' > %t/compile_commands.json			// RUN: echo "[{\"directory\":\".\",\"command\":\"clang++ -DSECRET=XYZZY -c %/t/test.cpp\",\"file\":\"%/t/test.cpp\"}]" \| sed -e 's/\\/\\\\/g' > %t/compile_commands.json
	// RUN: cp "%s" "%t/test.cpp"			// RUN: cp "%s" "%t/test.cpp"
	// RUN: not clang-check "%t/test.cpp" 2>&1 \| FileCheck %s			// RUN: not clang-check "%t/test.cpp" 2>&1 \| FileCheck %s

	// CHECK: C++ requires			// CHECK: XYZZY
	invalid;			SECRET;

				// Copy to a different file, and rely on the command being inferred.
				// RUN: cp "%s" "%t/other.cpp"
				// RUN: not clang-check "%t/other.cpp" 2>&1 \| FileCheck %s