This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Expand response files before loading compilation database
AbandonedPublic

Authored by vladimir.plyashkun on Jun 21 2017, 3:39 AM.

Details

Summary

Due to command line length restrictions, arguments can be passed through response files.
Before trying to load compilation database from command line, response files should be expanded first.

Diff Detail

Repository
rL LLVM

Event Timeline

alexfh requested changes to this revision.Jun 30 2017, 2:52 AM

Please re-upload the patch with full context (see http://llvm.org/docs/Phabricator.html).

lib/Tooling/CommonOptionsParser.cpp
119

Add a space after // and a trailing period.

unittests/Tooling/CommonOptionsParserTest.cpp
73

Please fix the No newline at end of file.

This revision now requires changes to proceed.Jun 30 2017, 2:52 AM
vladimir.plyashkun edited edge metadata.
vladimir.plyashkun added a subscriber: alexfh.
  • moved test-case from separate file to existing one
  • fixed No newline at end of file problem and put space inside comment
alexfh requested changes to this revision.Jul 5 2017, 7:52 AM
alexfh added inline comments.
lib/Tooling/CommonOptionsParser.cpp
120

Add a trailing period, please.

unittests/Tooling/CompilationDatabaseTest.cpp
638–652 ↗(On Diff #105153)

Can we do all this in the virtual file system (include/clang/Basic/VirtualFileSystem.h) to avoid creating files from the unit test?

This revision now requires changes to proceed.Jul 5 2017, 7:52 AM
unittests/Tooling/CompilationDatabaseTest.cpp
638–652 ↗(On Diff #105153)

Unfortunately, i don't see any options how to do it with VFS.
All other tests related with response files expansion don't use VFS for some reasons, e.g. unittests/Support/CommandLineTest.cpp

vladimir.plyashkun edited edge metadata.
  • trailing period
  • moved test-cases to separate file

To discuss:
This patch is required for correct integration Clang-Tidy with CLion IDE.
By this moment, we unable to use CompilationDatabase.json from CLion side which is widely used in Clang-Tidy and in other common tools.
Anyway, there are possibility to pass compiler options directly through command line.
But we cannot pass compiler options directly through command-line, due to command-line-length restrictions.
So, we decided to pass compiler options through @ResponseFile mechanism.
But there are some problems with this approach.
Before this patch, /clang/lib/Tooling/CommonOptionsParser.cpp worked in this way:

  1. Load compilation database from command-line
  2. Expand response files
  3. Parse all other command line arguments

I think it's strange behavior?
Why we try to load compilation database first?
Maybe it's better to expand response files and only after that try to load compilation database and parse another options?
It's hard to refactor cl::ParseCommandLineOptions and cl::ExpandResponseFiles functions, due to high number of usages, so i copied some block directly into /clang/lib/Tooling/CommonOptionsParser.cpp
I don't think that is a best solution, but i don't have another one.

unittests/Tooling/CommonOptionsParserTest.cpp
41

I moved test-cases to separate file.
But as i mentioned before, i don't see any options how to use VFS here.
All other test-cases related with response files expansion also don't use VFS for some reasons, e.g. unittests/Support/CommandLineTest.cpp

klimek edited edge metadata.Jul 14 2017, 1:27 AM

To discuss:
This patch is required for correct integration Clang-Tidy with CLion IDE.
By this moment, we unable to use CompilationDatabase.json from CLion side which is widely used in Clang-Tidy and in other common tools.
Anyway, there are possibility to pass compiler options directly through command line.
But we cannot pass compiler options directly through command-line, due to command-line-length restrictions.
So, we decided to pass compiler options through @ResponseFile mechanism.
But there are some problems with this approach.
Before this patch, /clang/lib/Tooling/CommonOptionsParser.cpp worked in this way:

  1. Load compilation database from command-line
  2. Expand response files
  3. Parse all other command line arguments

I think it's strange behavior?
Why we try to load compilation database first?
Maybe it's better to expand response files and only after that try to load compilation database and parse another options?
It's hard to refactor cl::ParseCommandLineOptions and cl::ExpandResponseFiles functions, due to high number of usages, so i copied some block directly into /clang/lib/Tooling/CommonOptionsParser.cpp
I don't think that is a best solution, but i don't have another one.

I don't fully understand the problem yet, especially what you mean with "expand response files first".
Generally, the fixed-compilation-db mechanism is there so you can pass a single command line that will be used by all translation units - the response files in that command line would be expanded each time we run a tool on some TU. That seems in line with the design.
Can you post an example of what you're trying to do and what error message you get?

Thanks @klimek for the response!
Let me give an example.
Suppose we want to check main.cpp by clang-tidy.
As i said before, i cannot use CompilationDatabase.json (due to some technical reasons),
but there is way to pass all options though command line and call external process of clang-tidy in this way:
clang-tidy -checks=* main.cpp -export-fixes=... -- -std=c++11 -Ipath/to/include -Ipath/to/include2 -DMACRO ....
All compiler related options appear in the command line after -- symbol.
But due to command line length restrictions (especially on Windows OS) it's better to use @ResponseFile mechanism.
To achieve it, i've created arguments.rsp file with this content:
-checks=* main.cpp -export-fixes=... -- -std=c++11 -Ipath/to/include -Ipath/to/include2 -DMACRO ....
and tried to call clang-tidy in this way:
clang-tidy @arguments.rsp
But all compiler options are ignored by this call!
If you check how CommonOptionsParser.cpp parses command line arguments, you will see, that first of all, it tries to load fixed compilation database (by finding -- symbol in the command line).
Only after that it calls cl::ParseCommandLineOptions which loads content of arguments.rsp file and parses all the rest arguments.
So, my question was why not to expand response files before loading fixed compilation db from command line?
Or i'm doing something wrong?

Even if i'll change content of arguments.rsp to
-std=c++11 -Ipath/to/include -Ipath/to/include2 -DMACRO ....
and will try to call clang-tidy process in this way:
clang-tidy -checks=* main.cpp -export-fixes=... -- @arguments.rsp
it also has no effect, because all compiler options will be ignored (i thinks it's because that stripPositionalArgs() function deletes @arguments.rsp parameter as unused input).

Even if i'll change content of arguments.rsp to
-std=c++11 -Ipath/to/include -Ipath/to/include2 -DMACRO ....
and will try to call clang-tidy process in this way:
clang-tidy -checks=* main.cpp -export-fixes=... -- @arguments.rsp
it also has no effect, because all compiler options will be ignored (i thinks it's because that stripPositionalArgs() function deletes @arguments.rsp parameter as unused input).

Ah, ok: this is definitely the way it should work (adding the response file after the --). I think this is the use case we should fix - probably by fixing stripPositionalArgs?

To discuss:
...
By this moment, we unable to use CompilationDatabase.json from CLion side which is widely used in Clang-Tidy and in other common tools.

It would be interesting to learn more about the reasons why you can't use JSON compilation database. In case you don't want to clutter the project's directory, you can place the compile_commands.json file elsewhere (in a temporary directory, for example) and point clang tools to this directory using the -p command line flag.

Anyway, there are possibility to pass compiler options directly through command line.
But we cannot pass compiler options directly through command-line, due to command-line-length restrictions.
So, we decided to pass compiler options through @ResponseFile mechanism.
But there are some problems with this approach.
Before this patch, /clang/lib/Tooling/CommonOptionsParser.cpp worked in this way:

  1. Load compilation database from command-line
  2. Expand response files
  3. Parse all other command line arguments

I think it's strange behavior?

I don't think anyone has tried using response files for clang tools before. So the behavior here is almost certainly not intended. The other question is what would be a good way to handle response files in clang tools and whether we want it at all. IIUC, the only use case for response files is to workaround the limits on the size of a command line. If so, clang tools have an alternative solution - JSON compilation database. Are there any concerns using the alternative?

Even if i'll change content of arguments.rsp to
-std=c++11 -Ipath/to/include -Ipath/to/include2 -DMACRO ....
and will try to call clang-tidy process in this way:
clang-tidy -checks=* main.cpp -export-fixes=... -- @arguments.rsp
it also has no effect, because all compiler options will be ignored (i thinks it's because that stripPositionalArgs() function deletes @arguments.rsp parameter as unused input).

Ah, ok: this is definitely the way it should work (adding the response file after the --). I think this is the use case we should fix - probably by fixing stripPositionalArgs?

At which point is stripPositionalArgs called? Also (in case we want to actually support this use case) should we only allow response files when using the fixed compilation database or for any compilation database (I highly doubt about the latter)?

Are there any concerns using the alternative?

I can't say that it's a big problems, but i think that:
CompilationDatabase.json is more CMake specific format.
It can be generated automatically by CMake, while other build systems may not do it.
So we need to generate it on the fly (by tool or by hand), which also can lead to hidden problems due to different formats, different escaping rules, etc.
I think it's always good to have one unified format.

Are there any concerns using the alternative?

I can't say that it's a big problems, but i think that:
CompilationDatabase.json is more CMake specific format.
It can be generated automatically by CMake, while other build systems may not do it.
So we need to generate it on the fly (by tool or by hand), which also can lead to hidden problems due to different formats, different escaping rules, etc.
I think it's always good to have one unified format.

The compilation database is most certainly not cmake specific. We designed it in clang, and then implemented it in cmake as an example, because that's the most widely used C++ build system we were aware of (aside from autotools, which we didn't want to touch :). There are ways to create it from other build systems (ninja has support, and there are tools that provide generic support to intercept compiles).
If you want one unified format, the compilation database is it.
If you want special integration with your IDE, btw, I'd suggest you just implement your own compilation database (in C++) and make sure tools you support link to it.

If you want one unified format, the compilation database is it.

Is the clang-tidy ... -- <args> meant to be more or less a drop-in replacement for clang <args> (arguments-wise)?
If yes, this expansion of response files here is an another step in this direction.

Our only concerns are the internal ones (instead of using our well-tested response file implementation, we have to do something similar but different with JSON-based compilation database format).
Obviously, it is possible to use it, and if this patch won't be accepted, we would.
But then, again, I see this change as a positive by itself and don't really understand why the compiler driver should expand response arguments and the fixed compilation database shouldn't.

As IDE developers we don't have full control on the format of compiler options.
They come from users in free form which compiler can understand.
From the implementation view it would be more transparent and efficient to transfer them in the original form to Clang-Tidy (instead of generating intermediate files).

If you want one unified format, the compilation database is it.

Is the clang-tidy ... -- <args> meant to be more or less a drop-in replacement for clang <args> (arguments-wise)?
If yes, this expansion of response files here is an another step in this direction.

Yes. I'm still confused why in this case
clang-tidy @file --
would be expected to expand the response file? Am I missing something?

Our only concerns are the internal ones (instead of using our well-tested response file implementation, we have to do something similar but different with JSON-based compilation database format).
Obviously, it is possible to use it, and if this patch won't be accepted, we would.

I don't understand this. Can you elaborate?

But then, again, I see this change as a positive by itself and don't really understand why the compiler driver should expand response arguments and the fixed compilation database shouldn't.

As IDE developers we don't have full control on the format of compiler options.
They come from users in free form which compiler can understand.
From the implementation view it would be more transparent and efficient to transfer them in the original form to Clang-Tidy (instead of generating intermediate files).

Why are intermediate files (or your own implementation of a CompilationDatabase) not in the "original form". What *is* the original form?

Thanks for the response @klimek !
Sorry for possible confusions.

Yes. I'm still confused why in this case clang-tidy @file -- would be expected to expand the response file? Am I missing something?

I think that we discussed use-case when @response files can appear in the compiler options (right after -- symbol).

I don't understand this. Can you elaborate?

As IDE developers we can't control how users pass options to compiler.
For example, user can set compiler options directly in the CMake, like this:

set(CMAKE_CXX_FLAGS "@response.rsp")

and our goal is to pass this options to Clang-Tidy as is.
It doesn't matter how will we pass them to Clang-Tidy through compile_command.json:

{
  "directory" : "",
   "command": "clang++ ... @response.rsp",
   "file": "..."
}

or through command line (we choosed this method):

clang-tidy ... -- @response.rsp

If compiler options will contain response files, all contents will be discarded, because Clang based tools don't expand response files.
By this fact we can loose important compiler information (header paths, definitions, etc.) which will lead us to inaccurate analysis result.

asl added a subscriber: asl.Jul 20 2017, 10:15 AM

To discuss:
...
By this moment, we unable to use CompilationDatabase.json from CLion side which is widely used in Clang-Tidy and in other common tools.

It would be interesting to learn more about the reasons why you can't use JSON compilation database. In case you don't want to clutter the project's directory, you can place the compile_commands.json file elsewhere (in a temporary directory, for example) and point clang tools to this directory using the -p command line flag.

Many build systems normally generate response files on-fly in some circumstances (e.g. if command line is longer than some platform-imposed limit). So IMO response files should be a perfect citizen here.

Many build systems normally generate response files on-fly in some circumstances (e.g. if command line is longer than some platform-imposed limit). So IMO response files should be a perfect citizen here.

friendly ping

Many build systems normally generate response files on-fly in some circumstances (e.g. if command line is longer than some platform-imposed limit). So IMO response files should be a perfect citizen here.

friendly ping

Sorry for the delay. I lost track of this revision somehow.

Have you found an alternative solution or are you still interested in implementing this functionality?

vladimir.plyashkun added a comment.EditedAug 16 2017, 5:46 AM

@alexfh
Thanks for the response!
Yes, we re-implemented logic and now use JSON compilation database to pass compiler options to Clang-Tidy.
Anyway, i think in general it's useful to support @response_files, see my comment: https://reviews.llvm.org/D34440#813411

@alexfh
Thanks for the response!
Yes, we re-implemented logic and now use JSON compilation database to pass compiler options to Clang-Tidy.
Anyway, i think in general it's useful to support @response_files, see my comment: https://reviews.llvm.org/D34440#813411

If you see value in supporting response files and are willing to invest more time in the implementation, I'll expand a bit on https://reviews.llvm.org/D34440#809529. Since the primary use case for @response_files is to work around the command line length limit, and, as Anton pointed out, they can be generated by build systems on the fly (both only relevant on Windows, I guess), they may be valuable for passing compiler arguments (they can grow really large) and less so for other arguments of Clang tools. So it seems like the right place to expand response files is where the compiler arguments are retrieved from the compilation database. However, a bit of extra attention may be needed to properly handle the use case of Clang tooling as a library with in-memory files system (where we probably don't want to read random response files that compiler arguments in the compilation database refer to).

chh added a subscriber: chh.Apr 30 2018, 3:22 PM

FYI, Android NDK has another use case in https://github.com/android-ndk/ndk/issues/680.
It would be nice to have clang-tidy recognize the response file.