This is an archive of the discontinued LLVM Phabricator instance.

[clang][Tooling] Add support for .rsp files in compile_commands.json
ClosedPublic

Authored by lh123 on Nov 13 2019, 11:01 PM.

Diff Detail

Event Timeline

lh123 created this revision.Nov 13 2019, 11:01 PM
lh123 edited the summary of this revision. (Show Details)Nov 13 2019, 11:06 PM
lh123 added a comment.Nov 14 2019, 1:44 AM
This comment was removed by lh123.
lh123 updated this revision to Diff 229758.Nov 18 2019, 12:44 AM

update diff

lh123 updated this revision to Diff 229759.Nov 18 2019, 12:48 AM

format patch

Thanks for taking a look into this, the rsp files issue has came up before in the past but there wasn't enough investment to implement it.

Haven't checked the implementation in detail yet, I believe the layering should be different;

This is a common problem for all of the clang-related tools, as they all share the same "compilation database" abstraction layer, therefore it would be better to implement this at that layer so that other tools (e.g, clang-tidy) can also benefit from this.
You can find the related code in clang/include/clang/Tooling/CompilationDatabase.h and clang/lib/Tooling/CompilationDatabase.cpp.

Also compilation databases has been historically neglecting Virtual File System abstractions, it is hard to change it now. But would be great if you could try to keep that in mind while performing reads.

So would you mind making such changes ?

lh123 added a comment.EditedNov 18 2019, 4:06 AM

Thanks for taking a look into this, the rsp files issue has came up before in the past but there wasn't enough investment to implement it.

Haven't checked the implementation in detail yet, I believe the layering should be different;

This is a common problem for all of the clang-related tools, as they all share the same "compilation database" abstraction layer, therefore it would be better to implement this at that layer so that other tools (e.g, clang-tidy) can also benefit from this.
You can find the related code in clang/include/clang/Tooling/CompilationDatabase.h and clang/lib/Tooling/CompilationDatabase.cpp.

Also compilation databases has been historically neglecting Virtual File System abstractions, it is hard to change it now. But would be great if you could try to keep that in mind while performing reads.

So would you mind making such changes ?

Ok, I will look into this.

lh123 updated this revision to Diff 229807.Nov 18 2019, 4:29 AM

Move the implementation to JSONCompilationDatabase.

lh123 edited the summary of this revision. (Show Details)Nov 18 2019, 4:29 AM
lh123 updated this revision to Diff 229812.Nov 18 2019, 5:10 AM

Respect JSONCommandLineSyntax

Thanks, layering seems better now. But as Sam mentioned in the issue it would be better to have something like expandRSPFiles similar to inferMissingCompileCommands.
As the problem is not in the JSONCompilationDatabase itself, it can also occur in FixedCompilationDatabase or any other models clang might've in the future.

The CompilationDatabase returned by expandRSPFiles could wrap any other CDB and upon querying of commands, it can expand any RSP files returned by the underlying CDB.

This will also ensure minimum changes to the existing code, reducing any chances of breakage. As for unittests, please see clang/unittests/Tooling/CompilationDatabaseTest.cpp

and a small nit, when modifying existing files, please only run formatting on the lines you touched to preserve history.

lh123 updated this revision to Diff 230031.Nov 19 2019, 4:05 AM
lh123 edited the summary of this revision. (Show Details)

Address comment

lh123 updated this revision to Diff 230033.EditedNov 19 2019, 4:17 AM

fix typo in function document.

lh123 updated this revision to Diff 230060.Nov 19 2019, 6:31 AM

fixes some bug and add more test.

Sorry wrote comments but forgot to hit submit :(

clang/include/clang/Tooling/CompilationDatabase.h
229 ↗(On Diff #230060)

instead of exposing this functionality and handling them in Fixed and JSON compilation databases separately, I was rather talking about creating a new type of compilation database.

as inferTargetAndDriverMode or inferMissingCompileCommands functions does. which will take an Inner/Base compilation database, and in its getCompileCommands method will try to expand any rsp files.

does that make sense?

Also please implement it in a new file. (but still put the declaration into this header)

lh123 updated this revision to Diff 230752.Nov 23 2019, 12:01 AM

Address comment

lh123 marked an inline comment as done.Nov 23 2019, 12:04 AM
lh123 updated this revision to Diff 230756.Nov 23 2019, 2:37 AM

Expand response files before inferMissingCompileCommands and inferTargetAndDriverMode

lh123 edited the summary of this revision. (Show Details)Nov 26 2019, 2:53 AM
kadircet added inline comments.Nov 26 2019, 3:38 AM
clang/include/clang/Tooling/CompilationDatabase.h
223 ↗(On Diff #230756)

nit: s/response files/rsp(response) files/

230 ↗(On Diff #230756)

it looks like none of the callsites are setting this parameter can we get rid of this one?

clang/lib/Tooling/CompilationDatabase.cpp
402 ↗(On Diff #230756)

is it really common for rsp files to show up in fixed compilation databases? since compile_flags.txt itself is also a file doesn't make much sense to refer to another one. as it can also contain text of arbitrary length.

clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
167 ↗(On Diff #230756)

so it looks like we already have ExpandResponseFiles exposed in llvm/include/llvm/Support/CommandLine.h could you make use of it in here instead of re-implementing it again?

I see that it has a different signature, but we can do the conversion back and forth in here, going from std::string to char* is cheap anyways, coming back is expensive though, and we can limit that to iff we have seen an argument starting with an @. So this would be something like:

llvm::SmallVector<const char*, 20> Argvs;
Argvs.reserve(Cmd.CommandLine.size());
bool SeenRSPFile = false;
for(auto &Argv : Cmd.CommandLine) {
  Argvs.push_back(Argv.c_str());
  SeenRSPFile |= Argv.front() == '@';
}
if(!SeenRSPFile)
  continue;

// call ExpandResponseFiles with Argvs, convert back to vector<std::string> and assign to Cmd.CommandLine
lh123 marked an inline comment as done.Nov 26 2019, 4:34 AM
lh123 added inline comments.
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
167 ↗(On Diff #230756)

I think it's not easy to reuse ExpandResponseFiles without changing code because it uses llvm::sys::fs::current_path to resolve relative paths.

In fact, most of the code here is copied from CommandLine

lh123 marked an inline comment as done.Nov 26 2019, 4:44 AM
lh123 added inline comments.
clang/lib/Tooling/CompilationDatabase.cpp
402 ↗(On Diff #230756)

Yes, rsp files usually only appear on the command line.

lh123 updated this revision to Diff 231051.Nov 26 2019, 4:57 AM

address comments

lh123 marked 3 inline comments as done.Nov 26 2019, 4:58 AM
lh123 marked an inline comment as done.Nov 26 2019, 5:03 AM
lh123 added inline comments.
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
167 ↗(On Diff #230756)

But we can reuse static bool ExpandResponseFile(StringRef FName, StringSaver &Saver, TokenizerCallback Tokenizer, SmallVectorImpl<const char *> &NewArgv, bool MarkEOLs, bool RelativeNames) if we expose it in CommandLine.h

lh123 marked an inline comment as done.Nov 26 2019, 5:13 AM
lh123 added inline comments.
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
167 ↗(On Diff #230756)

static bool ExpandResponseFile(StringRef FName, StringSaver &Saver, TokenizerCallback Tokenizer, SmallVectorImpl<const char *> &NewArgv, bool MarkEOLs, bool RelativeNames) also seems to use llvm::sys::fs::current_path to resolve relative paths.

kadircet added inline comments.Nov 26 2019, 6:40 AM
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
167 ↗(On Diff #230756)

I think it's not easy to reuse ExpandResponseFiles without changing code because it uses llvm::sys::fs::current_path to resolve relative paths.

Ah, I missed this, thanks for pointing it out.

In fact, most of the code here is copied from CommandLine

I believe in such a scenario, the correct course of action would be to update ExpandResponseFiles in CommandLine.h to accept a FileSystem &FS, instead of duplicating the logic.
Namely changing the signature to:

bool ExpandResponseFiles(StringSaver &Saver, TokenizerCallback Tokenizer,
                         llvm::vfs::FileSystem &FS, SmallVectorImpl<const char *> &Argv,
                         bool MarkEOLs = false, bool RelativeNames = false);

Then you can pass llvm::vfs::getRealFileSystem() to these function in existing call sites.
And create a FS with working directory set to Cmd.Directory via FileSystem::setCurrentWorkingDirectory in this specific call site you want to introduce.

Also to update the implementation of ExpandResponseFiles you would need to make use of FileSystem::getCurrentWorkingDirectory() instead of llvm::sys::fs::current_path and FileSystem::getBufferForFile() instead of llvm::MemoryBuffer::getFile the rest shouldn't need any changes hopefully, except plumbing VFS to some helpers.

Please send the signature change and call site updates in a separate patch though.

After that patch you can change signature for std::unique_ptr<CompilationDatabase> expandResponseFiles(std::unique_ptr<CompilationDatabase> Base) to accept a IntrusiveRefCntPtr<FileSystem> VFS, which will be getRealFileSystem in case of JSONCompilationDatabase. But this will enable you to pass a InMemoryFileSystem in the unittest below.

lh123 updated this revision to Diff 231217.Nov 27 2019, 4:46 AM
lh123 set the repository for this revision to rG LLVM Github Monorepo.

rebase to D70769

lh123 marked 2 inline comments as done.Nov 27 2019, 4:47 AM

Build result: fail - 60344 tests passed, 2 failed and 732 were skipped.

failed: Clang.CodeGen/catch-implicit-integer-sign-changes-incdec.c
failed: Clang.CodeGen/catch-implicit-signed-integer-truncations-incdec.c

Log files: console-log.txt, CMakeCache.txt

Build result: FAILURE -
Log files: console-log.txt, CMakeCache.txt

kadircet added inline comments.Nov 27 2019, 9:24 AM
clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
28 ↗(On Diff #231235)

nit: FS(std::move(FS))

49 ↗(On Diff #231235)

no need to restore/save, ExpandResponseFilesDatabase should have a FS that's not shared with others.

it is iffy anyway, because if initial working directory is faulty, we'll mutate it to the working directory of the last compile command.
or even if it wasn't faulty, someone could delete the directory before we can restore, or someone can change it while we are in the middle of ExpandResponseFile calls.

61 ↗(On Diff #231235)

maybe make this check at the beginning of the for loop?

could you also leave a fixme saying that we should rather propagate the current directory into ExpandResponseFiles as well in addition to FS and someone can take a look at it later on.
That would be more principled than relying on ExpandResponseFilesDatabase having its own non-shared copy of FS.

clang/lib/Tooling/JSONCompilationDatabase.cpp
302

let's change this one to llvm::vfs::createPhysicalFileSystem to make sure we are passing a non-shared FS. Unfortunately, getRealFileSystem is shared with the whole process.

clang/unittests/Tooling/CompilationDatabaseTest.cpp
910 ↗(On Diff #231235)

nit: instead of clang-format off maybe provide command as a raw string literal?

912 ↗(On Diff #231235)

this test is checking the functionality of ExpandFileCommands helper, which is already tested in llvm/unittests/Support/CommandLineTest.cpp

it should be enough to have two simple tests:

  • with a compile command that refers to a simple rsp file, and making sure it was expanded.
  • with a regular compile command, making sure it stays the same.
lh123 updated this revision to Diff 231357.Nov 27 2019, 11:51 PM
lh123 marked 6 inline comments as done.

address comments.

Build result: FAILURE -
Log files: console-log.txt, CMakeCache.txt

lh123 updated this revision to Diff 231362.Nov 28 2019, 12:56 AM

fix bug.

mostly LG, a few last nits.

clang/lib/Tooling/ExpandResponseFilesCompilationDatabase.cpp
51 ↗(On Diff #231362)

nit: drop and someone can take a look at it later on.

clang/unittests/Tooling/CompilationDatabaseTest.cpp
865 ↗(On Diff #231362)

make this part of the constructor instead of having a SetUp

866 ↗(On Diff #231362)

move this to the test

870 ↗(On Diff #231362)

nit: I would rather keep ASSERT_TRUE in here

877 ↗(On Diff #231362)

nit: no need for braces

lh123 updated this revision to Diff 231393.Nov 28 2019, 2:58 AM
lh123 marked 5 inline comments as done.

address comments.

kadircet accepted this revision.Nov 29 2019, 1:27 AM

Thanks a lot for working on this patch, LGTM!

again please make sure the changes are clang-formatted.

Do you have commit access?

This revision is now accepted and ready to land.Nov 29 2019, 1:27 AM
kadircet retitled this revision from [clangd] Add support for .rsp files in compile_commands.json to [clang][Tooling] Add support for .rsp files in compile_commands.json.Nov 29 2019, 1:28 AM
lh123 added a comment.Nov 29 2019, 2:57 AM

Thanks a lot for working on this patch, LGTM!

again please make sure the changes are clang-formatted.

I'm sure the code has been formatted.

Do you have commit access?

I don't have commit access, please commit it for me.

Via Web