This patch extends support of the option -frecord-command-line to XCOFF. XCOFF doesn’t have custom sections like ELF, so the command line data is emitted to a .info section instead. A C_INFO symbol is generated with the .info section to preserve the command line data past the link step. Multiple command lines are separated by newlines and null bytes. The command line data can be retrieved on AIX with command what file_name.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Sorry for the delay in review, I am not too familiar with XCOFF so I was hoping someone else would take a look first.
If my understanding of the COFF docs I could find is correct then this LGTM save for some nits/suggestions
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
970–1037 | I sketched out some of the suggestions I had, although the bigger changes are just because the extra .info for the padded byte bugged me. If you aren't concerned with it I'm also happy with what you have | |
980 | I couldn't quickly find a reference for the alignment requirement, but it seems like there is an additional requirement that the length must also be non-zero, not just even? If so, can you update the comment? I would also rather explicitly use alignTo and max to express this (see suggestion), but if we don't expect the optimizer to clean it up I'm fine with the more terse version. | |
980 | Can you factor this out at function scope? It gets repeated below | |
984 | Redundant cast | |
988 | It seems odd to use a lambda when this is only used once. Why not just compute the last word directly? | |
1033 | Can you factor this out as Separator at function scope? | |
llvm/test/CodeGen/PowerPC/aix-command-line-metadata.ll | ||
21 | Having the padded byte force a new .info directive doesn't seem ideal. I suggested something to avoid it, but I suppose it also doesn't really harm anything. |
Thanks for the review @scott.linder. I applied the changes you requested with some differences.
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
980 | The length can be zero. See second paragraph of the comment section: https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format#XCOFF__a1pyfk2b4joyc__title__1 | |
1021 | I changed the assert that you requested because it would always fail: assert(Length - Index == MetadataPaddingSize); |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
985 | Handled the 0 length case here. |
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
---|---|---|
2337–2338 | I would add a comment explaining that for XCOFF, the command line metadata was emitted earlier. | |
llvm/lib/MC/MCAsmStreamer.cpp | ||
980 | There's no requirement to pad the .info section. When you generate assembly language, the .info pseudo-op can only generate words of data, so the if the data length is not 0(mod 4), the last word will have to be padded with low-order 0s. This means that the length of the .info section will be a multiple of 4. The length, on the other hand, should be exact. At link time, only "length" bytes will be copied to the output file. If you emit object code directly, you will not need to emit any padding bytes. | |
986 | Can this be moved up right after the computation of MetadataSize? Can't you just emit "0,"? Why call format_hex()? It seems odd to have a literal comma here instead of using Separator. | |
993 | You should use MetadataSize and Separator here. In fact, you could emit the length before checking for 0, expecially because a length of 0 is a rare case (and impossible for this patch). | |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
2973 | I would use a newline here. The AIX what command looks for @(#) and echos subsequent bytes until it sees a double quote, a backslash, a > symbol, newline, or null byte. The @(#) is not echoed, nor is the terminating character. The what command prints a newline after it finds a terminating character. This means that if the command line contains any of the special characters, the line will be truncated. Exception: If the @(#) is followed by "opt " or " opt ", the terminating characters are only a newline or null byte. This allows any of the other special characters to be part of the command line. It doesn't really matter if you use a newline or a null byte, but the legacy XL compiler uses a newline. The "opt" keyword should appear if the command line can contain a double quote, a > or a backslash. The legacy compiler also uses other keywords besides "opt", including "version" and "cfg". The what command doesn't do anything special with these keywords. | |
llvm/test/CodeGen/PowerPC/aix-command-line-metadata.ll | ||
17 | 0x0000002e: The actual length should be emitted. |
Thanks for the review @stephenpeckham, I updated the patch with the requested changes.
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
---|---|---|
2337–2338 | Minor nit: Typo. | |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
2973 | As mentioned offline, newline on its own has potential of ambiguity because it can appear in command line options (null bytes cannot). If there is a preference for newline to be present, then having a null byte after could help. Note that @(#)opt can appear on the command line too. Using what will have limitations (but we should leave the possibility open for other tools/methods to work). |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
2973 | Thanks; I confirm that my comment has been addressed. I have no further comments at this time. Feel free to commit if another reviewer approves the patch. |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
971 |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
980 | Does the comment still need updating then? It could capture the fact that the "lowest common denominator" is the assembly syntax as it works in terms of words, and so may requiring padding in the final word. We can be clear that we apply the same restriction to the object case judiciously, as it make the output identical and avoids more code paths. We could also note that the linker can use the length to optimize the final linked binary. These all clear things up to a fresh reader, so I think they are worthwhile things to include in comments. | |
992 | With the MetadataSize == 0 case handled the std::max should be redundant now, right? Also the literal 4 can be WordSize | |
1017 | Nit: for pointer casts I would prefer an explicit reinterpret_cast |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
980 | I updated the comment. Please let me know if it's sufficient |
Thank you for the patch and the changes! This LGTM, and AFAICT all of the comments have been addressed.
I would add a comment explaining that for XCOFF, the command line metadata was emitted earlier.