Add support for .rsp files.
Details
Diff Detail
Event Timeline
Thanks for taking a look into this, the rsp files issue has came up before in the past but there wasn't enough investment to implement it.
Haven't checked the implementation in detail yet, I believe the layering should be different;
This is a common problem for all of the clang-related tools, as they all share the same "compilation database" abstraction layer, therefore it would be better to implement this at that layer so that other tools (e.g, clang-tidy) can also benefit from this.
You can find the related code in clang/include/clang/Tooling/CompilationDatabase.h and clang/lib/Tooling/CompilationDatabase.cpp.
Also compilation databases has been historically neglecting Virtual File System abstractions, it is hard to change it now. But would be great if you could try to keep that in mind while performing reads.
So would you mind making such changes ?
Thanks, layering seems better now. But as Sam mentioned in the issue it would be better to have something like expandRSPFiles similar to inferMissingCompileCommands.
As the problem is not in the JSONCompilationDatabase itself, it can also occur in FixedCompilationDatabase or any other models clang might've in the future.
The CompilationDatabase returned by expandRSPFiles could wrap any other CDB and upon querying of commands, it can expand any RSP files returned by the underlying CDB.
This will also ensure minimum changes to the existing code, reducing any chances of breakage. As for unittests, please see clang/unittests/Tooling/CompilationDatabaseTest.cpp
and a small nit, when modifying existing files, please only run formatting on the lines you touched to preserve history.
Sorry wrote comments but forgot to hit submit :(
clang/include/clang/Tooling/CompilationDatabase.h | ||
---|---|---|
229 ↗ | (On Diff #230060) | instead of exposing this functionality and handling them in Fixed and JSON compilation databases separately, I was rather talking about creating a new type of compilation database. as inferTargetAndDriverMode or inferMissingCompileCommands functions does. which will take an Inner/Base compilation database, and in its getCompileCommands method will try to expand any rsp files. does that make sense? Also please implement it in a new file. (but still put the declaration into this header) |
Expand response files before inferMissingCompileCommands and inferTargetAndDriverMode
clang/include/clang/Tooling/CompilationDatabase.h | ||
---|---|---|
223 ↗ | (On Diff #230756) | nit: s/response files/rsp(response) files/ |
230 ↗ | (On Diff #230756) | it looks like none of the callsites are setting this parameter can we get rid of this one? |
clang/lib/Tooling/CompilationDatabase.cpp | ||
402 ↗ | (On Diff #230756) | is it really common for rsp files to show up in fixed compilation databases? since compile_flags.txt itself is also a file doesn't make much sense to refer to another one. as it can also contain text of arbitrary length. |
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
167 ↗ | (On Diff #230756) | so it looks like we already have ExpandResponseFiles exposed in llvm/include/llvm/Support/CommandLine.h could you make use of it in here instead of re-implementing it again? I see that it has a different signature, but we can do the conversion back and forth in here, going from std::string to char* is cheap anyways, coming back is expensive though, and we can limit that to iff we have seen an argument starting with an @. So this would be something like: llvm::SmallVector<const char*, 20> Argvs; Argvs.reserve(Cmd.CommandLine.size()); bool SeenRSPFile = false; for(auto &Argv : Cmd.CommandLine) { Argvs.push_back(Argv.c_str()); SeenRSPFile |= Argv.front() == '@'; } if(!SeenRSPFile) continue; // call ExpandResponseFiles with Argvs, convert back to vector<std::string> and assign to Cmd.CommandLine |
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
167 ↗ | (On Diff #230756) | I think it's not easy to reuse ExpandResponseFiles without changing code because it uses llvm::sys::fs::current_path to resolve relative paths. In fact, most of the code here is copied from CommandLine |
clang/lib/Tooling/CompilationDatabase.cpp | ||
---|---|---|
402 ↗ | (On Diff #230756) | Yes, rsp files usually only appear on the command line. |
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
167 ↗ | (On Diff #230756) | But we can reuse static bool ExpandResponseFile(StringRef FName, StringSaver &Saver, TokenizerCallback Tokenizer, SmallVectorImpl<const char *> &NewArgv, bool MarkEOLs, bool RelativeNames) if we expose it in CommandLine.h |
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
167 ↗ | (On Diff #230756) | static bool ExpandResponseFile(StringRef FName, StringSaver &Saver, TokenizerCallback Tokenizer, SmallVectorImpl<const char *> &NewArgv, bool MarkEOLs, bool RelativeNames) also seems to use llvm::sys::fs::current_path to resolve relative paths. |
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
167 ↗ | (On Diff #230756) |
Ah, I missed this, thanks for pointing it out.
I believe in such a scenario, the correct course of action would be to update ExpandResponseFiles in CommandLine.h to accept a FileSystem &FS, instead of duplicating the logic. bool ExpandResponseFiles(StringSaver &Saver, TokenizerCallback Tokenizer, llvm::vfs::FileSystem &FS, SmallVectorImpl<const char *> &Argv, bool MarkEOLs = false, bool RelativeNames = false); Then you can pass llvm::vfs::getRealFileSystem() to these function in existing call sites. Also to update the implementation of ExpandResponseFiles you would need to make use of FileSystem::getCurrentWorkingDirectory() instead of llvm::sys::fs::current_path and FileSystem::getBufferForFile() instead of llvm::MemoryBuffer::getFile the rest shouldn't need any changes hopefully, except plumbing VFS to some helpers. Please send the signature change and call site updates in a separate patch though. After that patch you can change signature for std::unique_ptr<CompilationDatabase> expandResponseFiles(std::unique_ptr<CompilationDatabase> Base) to accept a IntrusiveRefCntPtr<FileSystem> VFS, which will be getRealFileSystem in case of JSONCompilationDatabase. But this will enable you to pass a InMemoryFileSystem in the unittest below. |
Build result: fail - 60344 tests passed, 2 failed and 732 were skipped.
failed: Clang.CodeGen/catch-implicit-integer-sign-changes-incdec.c failed: Clang.CodeGen/catch-implicit-signed-integer-truncations-incdec.c
Log files: console-log.txt, CMakeCache.txt
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
28 ↗ | (On Diff #231235) | nit: FS(std::move(FS)) |
49 ↗ | (On Diff #231235) | no need to restore/save, ExpandResponseFilesDatabase should have a FS that's not shared with others. it is iffy anyway, because if initial working directory is faulty, we'll mutate it to the working directory of the last compile command. |
61 ↗ | (On Diff #231235) | maybe make this check at the beginning of the for loop? could you also leave a fixme saying that we should rather propagate the current directory into ExpandResponseFiles as well in addition to FS and someone can take a look at it later on. |
clang/lib/Tooling/JSONCompilationDatabase.cpp | ||
172 ↗ | (On Diff #231235) | let's change this one to llvm::vfs::createPhysicalFileSystem to make sure we are passing a non-shared FS. Unfortunately, getRealFileSystem is shared with the whole process. |
clang/unittests/Tooling/CompilationDatabaseTest.cpp | ||
910 ↗ | (On Diff #231235) | nit: instead of clang-format off maybe provide command as a raw string literal? |
912 ↗ | (On Diff #231235) | this test is checking the functionality of ExpandFileCommands helper, which is already tested in llvm/unittests/Support/CommandLineTest.cpp it should be enough to have two simple tests:
|
mostly LG, a few last nits.
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp | ||
---|---|---|
51 ↗ | (On Diff #231362) | nit: drop and someone can take a look at it later on. |
clang/unittests/Tooling/CompilationDatabaseTest.cpp | ||
865 ↗ | (On Diff #231362) | make this part of the constructor instead of having a SetUp |
866 ↗ | (On Diff #231362) | move this to the test |
870 ↗ | (On Diff #231362) | nit: I would rather keep ASSERT_TRUE in here |
877 ↗ | (On Diff #231362) | nit: no need for braces |
Thanks a lot for working on this patch, LGTM!
again please make sure the changes are clang-formatted.
Do you have commit access?
I'm sure the code has been formatted.
Do you have commit access?
I don't have commit access, please commit it for me.