This is an archive of the discontinued LLVM Phabricator instance.

[LLD] [Driver] Look for -m in response files as well
ClosedPublic

Authored by mstorsjo on Jun 7 2019, 12:59 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

mstorsjo created this revision.Jun 7 2019, 12:59 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2019, 12:59 PM
rnk added a comment.Jun 7 2019, 2:27 PM

For links that use response files (i.e. most of them) this effectively leaks a second copy of the entire response file. Every argument needs to be put in the string saver so it can be null terminated. Response files tend to be large enough that I'd prefer it if we didn't do that.

One way we could address this is to try to calculate the flavor twice:

  • First, use the unexpanded arguments to guess the flavor
  • Use the guessed flavor to expand response files: Windows tokenization for lld-link, GNU for everyone else
  • Re-parse the flavor using the expanded arguments, and call the appropriate entry point with those arguments
  • Remove existing rsp expansion code

This is kind of what clang does:
https://github.com/llvm/llvm-project/blob/master/clang/tools/driver/driver.cpp#L334

In D63024#1534763, @rnk wrote:

For links that use response files (i.e. most of them) this effectively leaks a second copy of the entire response file. Every argument needs to be put in the string saver so it can be null terminated. Response files tend to be large enough that I'd prefer it if we didn't do that.

One way we could address this is to try to calculate the flavor twice:

  • First, use the unexpanded arguments to guess the flavor
  • Use the guessed flavor to expand response files: Windows tokenization for lld-link, GNU for everyone else
  • Re-parse the flavor using the expanded arguments, and call the appropriate entry point with those arguments
  • Remove existing rsp expansion code

This sounds sensible in general. But there's one gotcha that I'm a bit unsure about how to handle best; the COFF and ELF linkers support the option rsp-quoting (-rsp-quoting val, --rsp-quoting val, -rsp-quoting=val or --rsp-quoting=val for the ELF one, --rsp-quoting=val for COFF). So after getting the unexpanded arguments and guessing flavor based on that, we'd need to interpret the arguments to handle the rsp-quoting option. For the flavor option, we have only one spelling and it's handled without the tablegen options. Should I try to do some similar handwritten parsing of rsp-quoting for the ELF syntax (which is a supserset of the variants accepted for COFF) and use that expanding response files at that stage?

For the -flavor option, we remove it from the args we pass on to the individual linkers - should we do that for rsp-quoting as well , or keep it, along with the rsp-quoting options in the individual linkers for printing in the help listings (even if they aren't actually used at that level)?

ruiu added a comment.Jun 10 2019, 12:08 AM

Do you have to support --rsp-quoting for the use case that you want to support? I believe if you are using MinGW, you are always using Unix-style quoting in response files, so if the following two conditions are met, you can conclude that a user is trying to use mingw driver, can't you?

  1. Response files are correctly expanded using the Unix-style quoting scheme
  2. There's a -m foo where foo implies mingw

Do you have to support --rsp-quoting for the use case that you want to support?

Not for my use case, no. But Reid wanted to move all expansion of response files out from the individual linkers here, to avoid reading (and allocating memory for) the (potentially large) response file twice. And in that case, we must handle --rsp-quoting perfectly here.

I believe if you are using MinGW, you are always using Unix-style quoting in response files

At least the default in ELF/DriverUtils.cpp is to use windows quoting style, if the linker binary runs on windows, FWIW.

ruiu added a comment.Jun 10 2019, 2:20 AM

Do you have to support --rsp-quoting for the use case that you want to support?

Not for my use case, no. But Reid wanted to move all expansion of response files out from the individual linkers here, to avoid reading (and allocating memory for) the (potentially large) response file twice. And in that case, we must handle --rsp-quoting perfectly here.

Each flavor's link() function is a public interface, and each function needs to do what the command would do when the same argument strings are given. That means we need to expand response files when the linker is invoked by lld::{coff,elf,wasm,..}::link().

I believe if you are using MinGW, you are always using Unix-style quoting in response files

At least the default in ELF/DriverUtils.cpp is to use windows quoting style, if the linker binary runs on windows, FWIW.

IIRC lld works that way because that's what ninja does, but I may be wrong.

Do you have to support --rsp-quoting for the use case that you want to support?

Not for my use case, no. But Reid wanted to move all expansion of response files out from the individual linkers here, to avoid reading (and allocating memory for) the (potentially large) response file twice. And in that case, we must handle --rsp-quoting perfectly here.

Each flavor's link() function is a public interface, and each function needs to do what the command would do when the same argument strings are given. That means we need to expand response files when the linker is invoked by lld::{coff,elf,wasm,..}::link().

Ah, right. So we at least must keep the handling there as well.

What's your opinion on Reid's performance/memory concern? Do you agree we should expand here and pass the expanded args to the individual linkers, to avoid double expansion? And how should we handle rsp-quoting then?

Or do you suggest to ignore the performance concern and expand once here just for peeking at the -m flag, and let the individual linkers expand it again, properly - like this patch?

ruiu added a comment.Jun 10 2019, 2:38 AM

I'm not particularly worried about the performance impact of doubly-expanding response files. If it is proved to be too slow, we need to do something, but until then I wouldn't optimize it.

But at least we can expand response files only for GNU flavor. For COFF and WASM linkers, we don't have to expand them at this stage.

I'm not particularly worried about the performance impact of doubly-expanding response files. If it is proved to be too slow, we need to do something, but until then I wouldn't optimize it.

But at least we can expand response files only for GNU flavor. For COFF and WASM linkers, we don't have to expand them at this stage.

Ok, that makes sense. Then we can't handle the -flavor option in a response file, but there's no need for that either.

I'll update the patch that way. And I think I'll default the quoting style to the host platform native (like the ELF linker); I think that'll cause fewer issues down the line. (Currently the MinGW driver doesn't handle response files at all, but they fall through to the COFF linker.)

mstorsjo updated this revision to Diff 203797.Jun 10 2019, 4:14 AM
mstorsjo retitled this revision from [LLD] [Driver] Look for -flavor/-m in response files as well to [LLD] [Driver] Look for -m in response files as well.

Localized the extra expansion to the ELF case, use the host platform native quoting style.

ruiu added inline comments.Jun 10 2019, 4:48 AM
tools/lld/lld.cpp
141–143 ↗(On Diff #203797)

If you factor this piece of code to a new function, we can discard SmallVector, saving a small amount of memory.

145 ↗(On Diff #203797)

Is it intentional that you call mingw::link with ExpandedArgs?

mstorsjo marked 2 inline comments as done.Jun 10 2019, 4:55 AM
mstorsjo added inline comments.
tools/lld/lld.cpp
141–143 ↗(On Diff #203797)

Could do that, but...

145 ↗(On Diff #203797)

The MinGW driver does no expansion of its own right now; I guess it would be better to do that there instead of passing the expanded args here - for consistency?

ruiu added inline comments.Jun 10 2019, 4:59 AM
tools/lld/lld.cpp
145 ↗(On Diff #203797)

Yeah I think so. lld::mingw::link's contract should be the same as other link() functions.

mstorsjo updated this revision to Diff 203805.Jun 10 2019, 5:07 AM

Expand only in isPETarget, expand in the mingw driver as well.

mstorsjo updated this revision to Diff 203807.Jun 10 2019, 5:14 AM

Avoid expanding unless no -m option was found.

ruiu accepted this revision.Jun 10 2019, 5:18 AM

LGTM

tools/lld/lld.cpp
89 ↗(On Diff #203807)

I'd avoid function overloading if I can simply give a different name. At first this looked like a recursive function call.

This revision is now accepted and ready to land.Jun 10 2019, 5:18 AM
rnk accepted this revision.Jun 10 2019, 12:53 PM

lgtm

tools/lld/lld.cpp
82 ↗(On Diff #203807)

Typo on expand

This revision was automatically updated to reflect the committed changes.