See the parent revision https://reviews.llvm.org/D54487 for differences between this implementation and the equivalent GCC option.
I realize that you're probably striving for option compatibility with gcc, but continuing to name it -frecord-gcc-switches when it actually records Clang switches seems weird to me. It almost sounds like something that would dump gcc equivalents of all Clang options, or maybe let you know which Clang options you've used that match gcc options. Either way, by the name -- if you aren't familiar with the gcc option -- it doesn't read like it records Clang options.
Would it be that bad to name it -frecord-clang-switches? Or just -frecord-switches?
I agree, and this was my original plan, but then I noticed that Clang already implements -grecord-gcc-switches and so I decided to mirror the naming for the -f variant as well.
If anything I think dropping the -gcc- altogether would make the most sense. I don't understand why GCC includes it in the first place.
Well, having something in the name that make it clear that this is about command line switches and not switch statements seems sensible. It's a little unfortunate to give a useful feature a vendor-specific name, but I think the right approach is probably to accept that for compatibility and then also accept a more neutral and descriptive name.
The code seems fine.
For (1) I think the answer should be that we emit .LLVM.command.line (or whatever name we land on) for every object-file format with the concept of names for sections.
For (2), I have only implemented this for ELF so far in LLVM. I don't know how reasonable that is, and if it isn't I can look at adding it to other common object-file formats that LLVM supports.
For (3), my current proposal for the format is the same as the .comment section for idents: null-surrounded strings. Interior spaces are escaped, in the same manner as for the -g variant. There may still be more thought to put into the format, as the GCC version null-terminates each individual option; the reason I avoided this is that during object linking the options of many command-lines simply get mixed together, which seems less than useful.
As an example, for ELF clang -frecord-command-line -S "foo .c" produces the ASM:
.section .LLVM.command.line,"MS",@progbits,1 .zero 1 .ascii "/usr/local/bin/clang-8 -frecord-command-line -S foo\\ .c" .zero 1
And for multiple command-lines in a single object (e.g. linking three objects with recorded command-lines) this would be:
.section .LLVM.command.line,"MS",@progbits,1 .zero 1 .ascii "/usr/local/bin/clang-8 -frecord-command-line -c foo\\ .c" .zero 1 .ascii "/usr/local/bin/clang-8 -frecord-command-line -some -unique -options -c bar.c" .zero 1 .ascii "/usr/local/bin/clang-8 -frecord-command-line -something -else -c baz.c" .zero 1
I will try to capture more of this in the documentation.
Looking a bit further, it seems some other object-file formats have conventions for naming (e.g. Mach-O sections are _snake_case) and so .LLVM.command.line may not end up being what we want universally.
Additionally it seems things like .ident are not supported by all object-file formats in LLVM; using Mach-O as an example again, it uses the default of "no .ident support", and I don't see the ident in the .S or .o
Okay. I don't particularly need you to implement this on additional targets, but yes, please capture all this layout information in the documentation and make sure that the driver enforces any meaningful restrictions about which targets support it.
How about something like "After linking, the section may contain multiple command lines, which will be individually terminated by null bytes. Separate arguments within a command line are combined with spaces; spaces and backslashes within an argument are escaped with backslashes."
And please write out "(ELF Only)" as something more like "This option is only currently supported on ELF targets."
So we're using the same command line option as GCC to produce something in the same section as GCC but formatting that section incompatibly with GCC? That combination of choices does not seem like a good idea.
From the GCC man page -frecord-gcc-switches "is only implemented on some targets and the exact format of the recording is target and binary file format dependent, but it usually takes the form of a section containing ASCII text"
In the LLVM portion of this review it was decided that the implications of this is that well-behaved consumers should not make assumptions about the format. They should also not make assumptions about the name of the section, but we are effectively backwards-compatible in that we are a section with ASCII text, so tools which just e.g. dump the section for uses will continue to work with our implementation.
@scott.linder Hi Scott, you may have an opion on https://gcc.gnu.org/ml/gcc-patches/2020-03/msg00230.html . I also started a thread on cfe-dev: http://lists.llvm.org/pipermail/cfe-dev/2020-March/064795.html