This is an archive of the discontinued LLVM Phabricator instance.

Teach Clang how to use response files when calling other tools
ClosedPublic

Authored by rafaelauler on Aug 13 2014, 9:13 PM.

Details

Summary

This patch addresses PR15171 and teaches Clang how to call other tools with response files, when the command line exceeds system limits. This is a problem for Windows systems, whose maximum command-line length is 32kb.

I introduce the concept of "response file support" for each Tool object. A given Tool may have full support for response files (e.g. MSVC's link.exe) or only support file names inside response files, but no flags (e.g. Apple's ld64, as commented in PR15171), or no support at all (the default case). Therefore, if you implement a toolchain in the clang driver and you want clang to be able to use response files in your tools, you must override a method (getReponseFileSupport()) to tell so.

I designed it to support different kinds of tools and internationalisation needs:

  • VS response files ( UTF-16 )
  • GNU tools ( uses system's current code page, windows' legacy intl. support, with escaped backslashes. On unix, fallback to UTF-8 )
  • Clang itself ( UTF-16 on windows, UTF-8 on unix )
  • ld64 response files ( only a limited file list, UTF-8 on unix )

With this design, I was able to test input file names with spaces and international characters for Windows. When the linker input is large enough, it creates a response file with the correct encoding. On a Mac, to test ld64, I temporarily changed Clang's behavior to always use response files regardless of the command size limit (avoiding using huge command line inputs). I tested clang with the LLVM test suite (compiling benchmarks) and it did fine.

Note: It depends on http://reviews.llvm.org/D4896 landing in the LLVM tree.

Diff Detail

Repository
rL LLVM

Event Timeline

rafaelauler retitled this revision from to Teach Clang how to use response files when calling other tools.
rafaelauler updated this object.
rafaelauler edited the test plan for this revision. (Show Details)
rafaelauler added reviewers: rafael, asl.
rafaelauler added a subscriber: Unknown Object (MLST).
rnk added a subscriber: rnk.Aug 14 2014, 5:00 PM

In this new patch, I now use writeFileWithEncoding by passing an enum member (conforming to my new patch in D4896). I also updated my changes in Tool, separating ResponseFileSupport in ResponseFileSupport, which will only tell whether the tool has full support for response files, file lists or no support, and ResponseFileEncoding, which will tell which encoding we should use when writing this response file.

I believe this makes the whole encoding issue much more clear and explicit. I also adopted Rafael's suggestion and added a comment explictly stating which tools need to use the current code page in their response file encoding.

Rebase & update the patch according to rnk's and rafael's suggestions on D4896.

silvas added a subscriber: silvas.Aug 25 2014, 11:06 AM

Lots of naked const char * type stuff here (that includes unique_ptr<char[]>). Better to use StringRef, SmallVectorImpl<char>&, and raw_ostream& where possible, maybe with judicious use of BumpPtrAllocator, memcpy, and strcpy.

There are a lot of places where you have to separate loops: one for computing a buffer size, and another for actually doing work. This is a maintenance problem (will the length computation stay in sync with the code?) and also causes two traversals over the memory. By directly doing push_back's onto a vector or writing into a raw_ostream, you avoid doing two traversals.

lib/Driver/Job.cpp
122 ↗(On Diff #12873)

This would be a lot nicer with ArrayRef<const char *>; that way, you won't lose the length.

169 ↗(On Diff #12873)

Check your naming here and in other places.

lib/Driver/Tools.h
546–550 ↗(On Diff #12873)

Ew... let's not have a bunch of random #ifdef's all over the code outside libSupport.

Hi Sean,

Thanks a lot for your careful review. I have answered your questions below, and I have few concerns. Please let me know if I am missing something:

This code is on the critical path for commands with very large lengths and this is a non-negligible time. For UNIX, for example, this would be triggered for commands spanning more than 2MB of arguments and Clang spends a good amount of time to parse even trivial arguments, because it is a huge number of arguments.

Another important point is that, currently, the Command class stores arguments as a list of char *.

This is the reason for this design with lots of naked char *. It avoids creating StringRefs, that would measure the length of char* in their constructor and cause more overhead. It also avoids using high-level classes such as streams and bump allocators because, since this is likely to be called only when the number of arguments is very large, would cause them to repeatedly perform heap allocations until they finally settle to support the final size of the string. Either that or we would need to work with very large slabs, wasting memory. In this design, the length is calculated once and a single allocation is made, similar to what is done in Execute() in Support/Windows/Program.inc.

Therefore, by calculating the size of the final command line beforehand, even though it looks like we are losing by iterating over all args twice (one for length calculation and other for parsing), we win by eliminating inefficiencies in the allocation mechanism in runtime or in memory consumption.

Regards,
Rafael

lib/Driver/Job.cpp
122 ↗(On Diff #12873)

That's a good point. However, I am afraid that I would still need to recalculate the length by traversing each argument, calculating its length and assessing whether it needs quoting or not.

169 ↗(On Diff #12873)

Thanks for pointing out, I fixed the naming. Will submit the patch soon.

lib/Driver/Tools.h
546–550 ↗(On Diff #12873)

I agree that this is inadequate. My previous patch version delegated this reasoning to libSupport. However, it was more difficult to understand and I changed it for clarity. Now, it is obvious that the response file encoding for this tool depends on whether it is running on Windows or not. In my previous libSupport version, I had enum members that meant "this is UTF on UNIX but UTF16 on Windows", and so on. I think this is not a good design because if you need to support a new tool, you may potentially need to add a new enum member in libSupport to encode your tuple <MyUNIXEncoding, MyWindowsEncoding>.

Since the encoding depends on the tool and on the operating system, I would appreciate if you have any suggestion on how to code this. Meanwhile, I will think in something.

Updating patch wrt to API changes in D4896. Fixing naming.

Removed the ifdefs from several instances of getResponseFileEncoding() by using a new datatype called EncodingStrategy, introduced in the llvm side of the patch at D4896. This datatype stores the desired encoding on both UNIX and Windows systems, letting the writeFileWithEncoding() function in Support to select the appropriate encoding to use.

Conforming with my last update on D4896, change the encodings to be exclusive for Windows platforms, making it clear that we always use UTF8 on Unix systems. Convert ResponseFile from const char * to StringRef.

rafael added inline comments.Aug 27 2014, 10:00 AM
include/clang/Driver/Job.h
87 ↗(On Diff #12960)

printArgsWithRespFile

100 ↗(On Diff #12960)

The function should be lowercase, the variable Uppercase.

lib/Driver/Job.cpp
32 ↗(On Diff #12960)

I wonder if we should say that the job needs a response file based only on llvm::sys::argumentsFitWithinSystemLimits and then emit an error if Creator.getResponseFilesSupport() says it is not supported.

279 ↗(On Diff #12960)

The code in PrintArgsWithRespFile looks fairly duplicated with what just follows. Couldn't this just patch Arguments and use the existing code path?

Hi Rafael,

I have answered your comments below, thanks. I will upload a patch fixing your concerns shortly after this message.

Sean, regarding the change to llvm::sys::argumentsFitWithinSystemLimits() to avoid code duplication, I am afraid it is not necessary after my new patch. I will send it now.

Regards,
Rafael

include/clang/Driver/Job.h
87 ↗(On Diff #12960)

changed

100 ↗(On Diff #12960)

changed

lib/Driver/Job.cpp
32 ↗(On Diff #12960)

The problem is that llvm::sys::argumentsFitWithinSystemLimits() is not exact, see:

bool llvm::sys::argumentsFitWithinSystemLimits(ArrayRef<const char*> Args) {

static long ArgMax = sysconf(_SC_ARG_MAX);

// System says no practical limit.
if (ArgMax == -1)
  return true;

// Conservatively account for space required by environment variables.
ArgMax /= 2;

...
Thus, sometimes it may say that you need a response file but, when building the argv vector, you discover that it actually fits in system limits. Therefore, I think that our best action is to silently try to pass the arguments without a response file because it may actually work.

279 ↗(On Diff #12960)

Done.

This update refactors the new code in Job.cpp to use streams, no longer building and managing its own char buffers. Thanks to Sean’s suggestion, the code now is much more simple. However, I wanted to build a stream that could directly write to the response file with the correct encoding, but raw_fd_ostream lacks the capability to write files with different encodings. Therefore, I added this capability to raw_fd_ostream with a small modification and introduced a new constructor variant that creates buffers that, when flushed to a file, is written in a different encoding. If you think it is inadequate to have such a feature in raw_fd_ostream, I can work on creating my own stream class to do so.

I also refactored part of the write function in raw_fd_ostream out of this class, to avoid duplicating code in this implementation. Then, I changed by writeFileWithEncoding() function to write in a given file descriptor, which is assumed to be opened, rather than opening the file by my own and then closing it. This enabled me to make raw_fd_ostream work with different encodings with little extra code.

silvas added inline comments.Aug 29 2014, 4:06 PM
lib/Driver/Job.cpp
105 ↗(On Diff #13116)

Tiny nit: Use nullptr instead of 0

133–139 ↗(On Diff #13116)

In the future, consider using std::find_if instead of writing this function http://en.cppreference.com/w/cpp/algorithm/find

I'm not sure a linear search here is acceptable though, see my comment below.

144–150 ↗(On Diff #13116)

This seems like an unusual thing to have a helper for. The real problem seems to be that this is part of a larger block of code which is duplicated. Just pulling out this part doesn't help things: get rid of this helper function and deduplicate a larger scope.

163 ↗(On Diff #13116)

Returning a SmallVector by value is undesirable. in this case, I think NRVO will negate part of the issue (large sizeof), but you still have the problem that this hardcodes what SmallVector size the caller is using. I would recommend switching this to taking a non-const SmallVectorImpl<const char *> &Out instead of returning a SmallVector.

172 ↗(On Diff #13116)

isInInputList seems to be doing linear work. If Arguments is mostly Inputs, then this will go quadratic. Is that a potential problem?

185–186 ↗(On Diff #13116)

Use ArrayRef here.

187–190 ↗(On Diff #13116)

Simpler:

for (auto Arg : Inputs)
  OS << Arg << '\n';

Also, for clarity, consider just saying const char * instead of auto.

205 ↗(On Diff #13116)

I don't think this is correct. ResponseFile.data() isn't guaranteed to be 0 terminated (and you don't want to embed that hidden assumption). Maybe use .append(begin, end)?

216–232 ↗(On Diff #13116)

This loop scares me. Could you clean this up in a separate patch? Or at least add comment here for now.

235 ↗(On Diff #13116)

This call seems duplicated, along with a lot of stuff in this if. Pull this into a helper?

lib/Driver/Tools.h
309 ↗(On Diff #13116)

Tiny nit: Don't use virtual with override.

531–537 ↗(On Diff #13116)

This seems highly repetitive. Is there a way to reduce it?

Also, I'm not sure that it makes sense to have a method returning WindowsEncodingMethod for a netbsd tool (same goes for most tools that have no reason to run on windows).

Sean,

Thanks for the fast reply. I answered some of your comments below. The remaining comments were fixed in the next patch (will upload it now).

Best regards,
Rafael Auler

lib/Driver/Job.cpp
205 ↗(On Diff #13116)

Great observation. I'll change ResponseFile back to a const char * because, otherwise, I would need to copy it to a null-terminated string storage, since I may need to directly store it in the argv vector (which only contains null-terminated strings).

216–232 ↗(On Diff #13116)

Sure, will work on it in a separate patch.

lib/Driver/Tools.h
531–537 ↗(On Diff #13116)

WindowsEncodingMethod, when used for UNIX tools, covers the case where these tools run in MinGW environments (in a cross-compilation setup).

I addressed Sean's comments regarding my last patch update:

  • Job.cpp: Removed the quadratic time complexity from buildArgv (for file lists)
  • Job.cpp: Removed/Refactored duplicated code
  • Job.cpp/Tools.h: Implemented other minor changes
  • Tools.h: Refactored/Created a superclass for all GNU tools that have the same behaviour when it comes to response files, avoiding repetitive member function implementations across many classes.
silvas added a comment.Sep 2 2014, 1:58 PM

This is looking a lot better.

In particular, I feel like the testing needs a bit more work; I'm not seeing the connection between the added tests and the functional changes in this patch.

I'd like espindola to give a second opinion on the code, and he can probably suggest improvements to the testing.

include/clang/Driver/Job.h
103 ↗(On Diff #13150)

Is there ever any situation where NeedsResponseFile == true but ResponseFile != nullptr? In general, what is the meaning of the 4 different possibilities for

(ResponseFile {=,!}= nullptr) x (NeedsResponseFile {=,!}= true)

rafael edited edge metadata.Sep 2 2014, 2:46 PM

My main comment is on the llvm side. We really can't have something as core as raw_fd_ostream handling char conversion. Some options are

  • build a buffer and write it at once.
  • have another streamer implementation. Hopefully implement it in a way that avoids reasoning about partial multibyte chars. A simple option might be to build the entire buffer in memory inside the streamer and convert on close (and error on sync?). Even for large response files, I expect the work being done by the sub process to always dwarf the cost of creating it, so we probably don't need to get fancy.

I have one more nit on the code itself and one on the tests.

include/clang/Driver/Tool.h
81 ↗(On Diff #13150)

All the overrides of these 3 methods always return a symbol constant, no? Maybe it would be better to make them non-virtual:

ResponseFileSupport getResponseFilesSupport() const {

return RespFileSupport;

}

and set RespFileSupport in the constructor of the base classes. We do things like that in, for example, MCAsmInfo.

test/Driver/response-file.c
17 ↗(On Diff #13150)

Please don't introduce an use of awk :-)

Can you use clang -E to produce a large file on the fly? Is 2MB really too big for current linux systems?

Hi Sean and Rafael,

Thanks for your feedback. I answered your comments below. I will also upload a patch now, both on the LLVM and on the Clang side, addressing Rafael's concerns regarding the streams and his virtual removal suggestion.

Regarding the testing, I had some difficulty in figuring out good tests for the response files mechanism. Personally, to ensure that the mechanism works fine, I change needsResponseFile to always return true and use clang in several regular scenarios to check that the response files work.

Without this hack, to test it, I need to truly generate a very big response file, which involves exercising the compilation of a large project (especially to test -filelist in ld64, which cannot be tested with our -DTEST -DTEST... approach). I guess this kind of test is more appropriate to the test suite (compiling big projects).

One thing I can think of now is to create a new flag "-force-response-file" for testing purposes. Is this reasonable?

Best regards,
Rafael Auler

include/clang/Driver/Job.h
103 ↗(On Diff #13150)

In particular, we use this to protect against a user of this class that does not call setResponseFile() even if we return true in needsResponseFile(). If the user did not configure a response file, ResponseFile will be a nullptr and we need to disable resp file usage.
On the other hand, if the user *did* call setResponseFile() but we don't need it (NeedsResponseFile = false), we should also disable response file usage.

test/Driver/response-file.c
17 ↗(On Diff #13150)

I'm no fan of awk, but I had a small problem with this test. Previously, I used a shell "for" loop to generate a large response file on the fly, but this made this test unavailable to windows machines without bash.

To make this work on Windows, I looked at what LLVM required from its Windows users to correctly run tests, and I saw that it requires that the user installs the gnuwin32 package with many tools, including awk. Awk was the only tool that I could think of where I could implement a loop to synthesise this large response file on the fly, making this test work both on unix and on windows machines.

Sorry but I'm not sure if I understood what you said about using clang -E to produce a large file on the fly. If you are asking whether clang -E uses response files, the idea here -- and it is actually based on a post that you wrote a longe time ago, giving ideas to test response files :-) -- is to pass many -D params to the driver, which, in turn, will be forced to use a response file when calling "clang -cc1 -E".

Yes, 2MB is enough to trigger response files in current linux systems. However, I can make the file larger, but the time of this test will also rise (a lot, depending on the size and if clang was compiled with -O0). The bottleneck is in Clang's parameter processing logic, which is stressed in this test.

rafaelauler edited edge metadata.

Hi,

This is the updated patch implementing:

  • using a raw_string_ostream to buffer the response file contents before calling writeFileWithEncoding() to write all the contents at once.
  • removing virtual from Tool's member functions that deal with response files, following Rafael's suggestion.

Best regards,
Rafael Auler

Without awk this is fine by me :-)

The idea of adding -force-response-file as a testing option would work, but I am not sure whether it is worth it or not.

Please just wait for Sean to give his OK too.

test/Driver/response-file.c
17 ↗(On Diff #13191)

My idea for using clang -E to generate the test is something like

clang -E response-gen.c | grep -v \# > %t2

with response-gen.c being

#define M -DTEST
#define M2 M M
#define M4 M2 M2
#define M8 M4 M4
#define M16 M8 M8
#define M32 M16 M16
#define M64 M32 M32
#define M128 M64 M64
#define M256 M128 M128
M256

with as many MXXX as you need.

rafaelauler updated this revision to Diff 13239.Sep 3 2014, 9:51 PM

Hi Rafael,

Thanks for the suggestion on removing awk from the lit test, I integrated it into this new version of the patch.

LGTM.

Sean, what do you think?

silvas added a comment.Sep 4 2014, 2:05 PM

I'm still uneasy about NeedsResponseFile (see my comment below).

Also a few style things.

include/clang/Driver/Job.h
101–104 ↗(On Diff #13239)

I still don't think this makes sense. You previously replied that this is done to "protect the user", but I think that two combinations are programming errors: We should assert if:
ResponseFile == nullptr and NeedsResponseFile == true
ResponseFile != nullptr and NeedsResponseFile == false
(if this is not the case, please provide an example where this could happen and clang would behave correctly)

This leaves only two valid options:
ResponseFile == nullptr and NeedsResponseFile == false
ResponseFile != nullptr and NeedsResponseFile == true

So NeedsResponseFile is redundant.

It seems like right now you are using NeedsResponseFile just as a flag that roughly means "remember to do some other work at some faraway place in the code". I really don't like having such a random coupling across disparate parts of the code. Can you centralize the logic of checking !llvm::sys::argumentsFitWithinSystemLimits(_Arguments) && Creator.getResponseFilesSupport() != Tool::RF_None with the logic of setting ResponseFile?

lib/Driver/Job.cpp
120–122 ↗(On Diff #13239)

Can we simplify this by just always quoting?

124–129 ↗(On Diff #13239)

Please make this a for loop so that it is easy to see what is being looped over. In particular, so that the Arg++ is easily visible.

for (; *Arg != '\0'; Arg++)

lib/Driver/Tools.cpp
5854–5865 ↗(On Diff #13239)

Simpler:

for (const auto &II : Inputs) {
  if (!II.isFilename())
    break;
  InputFileList.push_back(II.getFilename());
}
rafaelauler updated this revision to Diff 13422.Sep 8 2014, 2:53 PM

Hi Sean,

I agree that having a flag that triggers action in a far away code is bad, and, as such, I was studying the code to see if I could do a larger refactoring to put all response-file-related stuff in a single place, freeing us from any far away code that is important to response files. However, this turned out to be tougher than I think. Because we need both information from Tool objects and from the Driver (to determine whether the command will actually run, in which case we build a resopnse file), I couldn't centralize everything, but I did centralize the small logic that you requested. Unfortunately, we still have some far away code that is important (in Tools.cpp when we call setInputFiles(), crucial to enable support for file lists in ld64), but I believe that this is the best we can do. Therefore, I ended up with only minor modifications regarding the last patch, although I did address all your concerns.

After some assessment and because the Driver is the top level class that calls the other objects (Compilation, Job etc) to run a command, I put the small logic that determines whether to use response files in a Driver private member function. The reason for this is because we don't need to allocate a response file if we are not running the commands (and there are some code paths that instantiate Driver, Compilation and Job objects just to fetch the arguments, but will never run the commands). Therefore, in the Driver, we are able to accurately determine when the command will actually run, triggering the code that determines where response files are needed just before running the command. Also, the Driver seemed a natural place to put such control logic. On the other hand, I had to remove the constness of the Driver::ExecuteJobs -- notice that this function will now provision resources for response files (their name) and, therefore, can't be const. This is a small change that does not impact any code in the tree -- in fact, no user expected ExecuteJobs to be const.

I also addressed your other (minor comments). Yet, I took a look at the loop you mentioned that it would be important to be cleaned up (in Command::Print) and couldn't determine a good way to refactor this. I saw the logic there, and it seems that it is very particular of what we want when receiving crash reports (skip some arguments and quote others). When you mentioned that this needs to be cleaned up, did you mean to remove such idiosyncrasies and just send all arguments to the crash report, or just a smaller refactoring?

Thanks for putting in time to see this,
Rafael Auler

silvas accepted this revision.Sep 8 2014, 3:37 PM
silvas added a reviewer: silvas.

LGTM.

As for Command::Print, I was just hoping for an easy way to see what it is doing, but I guess from what you are telling me it is not so easy. It's not a huge deal.

This revision is now accepted and ready to land.Sep 8 2014, 3:37 PM

If it is okay for Rafael too, could you please commit this patch? I don't
have commit access.

Best regards,
Rafael Auler

rnk closed this revision.Sep 15 2014, 10:55 AM
rnk updated this revision to Diff 13721.

Closed by commit rL217792 (authored by @rnk).

rnk added a comment.Sep 15 2014, 10:56 AM

I saw this hadn't landed in a week, so I went ahead and did it in r217792.