When a desired symbol name contains invalid character that the system assembler could not process, we need to emit .rename directive in assembly path in order for that desired symbol name to appear in the symbol table.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
---|---|---|
79 | Minor nit: Trailing whitespace at the end of line (multiple instances of OBJ-EMPTY exhibit this). |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
820 | Previously we avoided emitting this directive for some case where it wasn't strictly required (i.e. function descriptors). Why the change in behaviour? (at first glance seems somewhat orthogonal to this patch) |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
820 | If we don't emit this directive for function descriptor, then we would not trigger the rename for that .lglobl function descriptor when needed. While not having rename for the HIDE_EXT function descriptor is not a huge problem, we would have inconsistency in the object file generation path (object file generation will use the original name when available.). |
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
820 | And I think previously we choose to skip emitting the .lglobl for function descriptor because we have it as a label, and we would get a unnecessary symbol in the symbol table because of that. But now we have function descriptor as csect, so emitting .lglobl for it is redundant but do not have real effect. So I think it might be just easier to emit it and keep everything consistent. |
llvm/include/llvm/MC/MCSymbolXCOFF.h | ||
---|---|---|
64 | even if there is some Overhead, using hasRename() can let user know what the getSymbolTableName() do ? | |
llvm/lib/MC/MCAsmStreamer.cpp | ||
873 | I do not think print out rename directive here is a good idead, as the function name :emitCommonSymbol suggestion , it just print the .comm directive, and it is common function for all the target(not only for XCOFF). I think it maybe better to put the code in the function emitGlobalVariable() before we call the OutStreamer->emitCommonSymbol(GVSym, Size, Align); | |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
1613 | I think it maybe better to call emitSymbolAttribute(GVSym, MCSA_LGlobal) and then return directly, Reason:
into this function.
|
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
873 | As I mentioned in the other comment, emitGlobalVariable() gets called on both assembly path and object file generation path as well. So we could not directly call emitXCOFFRenameDirective() there because we only wants to do it on the assembly generation path. | |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
1613 | We only want to emitXCOFFRenameDirective when it's emitting assembly though. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
1613 | If we change the function void MCStreamer::emitXCOFFRenameDirective(const MCSymbol *Name, StringRef Rename) { llvm_unreachable("emitXCOFFRenameDirecitve is only supported on " "XCOFF targets"); } to void MCStreamer::emitXCOFFRenameDirective(const MCSymbol *Name, StringRef Rename) { } and object generation path will do nothing for the emitXCOFFRenameDirective(), it just some overhead for object generation path but the logic will be more clear. |
llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp | ||
---|---|---|
1613 | Object generation path will take the one in MCXCOFFStreamer.h, which currently doing report_fatal_error. |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
313 | Although this encoding generates runs of underscores, it does make it less intrusive in terms of being able to use c++filt on the result. Note that collisions increase with non-Latin scripts (which could be reduced if we append the hex values of all the underscores, including the original ones, to the end of the string). namespace ImplVerTags { typedef struct 一世 {} A; typedef struct 二世 {} B; } template <typename> void *implStaticDataUnsafe = nullptr; template void *implStaticDataUnsafe<ImplVerTags::A>; template void *implStaticDataUnsafe<ImplVerTags::B>; |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
313 | Just want to make sure I understand your suggestion: |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
313 | If we choose to get the results like _Renamed..6021f__o, in what way I could trigger a collision? Do we need to have a collision handling mechanism then? |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
313 | Hmm, I suggested appending but adding to the front does seem to make it easier to eyeball the split (in addition to making the more relevant part show up first for C identifiers in non-Latin scripts). I like it. Also, yes, I meant that f_$ would become (with the prefix) _Renamed..5f24f__. We shouldn't end up with collisions in that case. The transformation is completely reversible by identifying the number of underscores to determine the prefix length and then replacing each underscore with the encoded value. | |
346 | Moot point if we're going with the encoding change, but the collision handling mechanism needs to check that the output name is not also going to collide: void f_$o() { } void f$_o() { } void f_$o1() { } | |
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
3 | Can't help but notice that none of the renamed entities are referenced via the TOC. For example: int f$o(); int (*bar())() { return f$o; } |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
346 | Thanks. Now I like the not-collidable renaming mechanism much more when I know I avoided a while loop to check if the ouput name is going to collide. | |
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
3 | Thanks for pointing it out. Indeed I missed handling TOC entry renaming in this patch. |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
313 | Thanks for the clarification. Changed the renaming encoding using the idea in our discussion. |
LGTM ,but it need other person to approve it before landing.
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
---|---|---|
162 | minor: I do not think we need the line . |
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
---|---|---|
162 | I would prefer to keep that line to show the whole picture of a common symbol. |
llvm/lib/MC/MCAsmInfoXCOFF.cpp | ||
---|---|---|
49 | Please call isAlnum. | |
llvm/lib/MC/MCAsmStreamer.cpp | ||
803 | See other comment about missing space before left parenthesis. | |
848 | To escape double quote (") characters, the character should be doubled. Unlike with names that start with digits, Clang does manage to handle quotes in names. | |
871 | Minor nit: The space is missing before the first opening parenthesis. | |
873 | We're performing the same cast/type check multiple times here... Suggestion: MCSymbolXCOFF *XSym = dyn_cast<MCSymbolXCOFF>(Symbol); if (XSym && XSym->hasRename()) { | |
llvm/lib/MC/MCContext.cpp | ||
317 | Don't try to access the first character without locally ensuring that the string is non-empty. For example, ICC produces an "internal error" (https://godbolt.org/z/zLiJqb). It seems Clang avoids generating such by diagnosing empty string literals used as __asm__ names. | |
321 | Replace the comma at the end of the line with a semicolon. | |
llvm/lib/MC/MCStreamer.cpp | ||
1076 | s/emitXCOFFRenameDirecitve/emitXCOFFRenameDirective/; | |
llvm/lib/MC/MCSymbolXCOFF.cpp | ||
22 | s/representatation/representation/; | |
37 | s/representatation/representation/; | |
llvm/lib/Target/PowerPC/MCTargetDesc/PPCMCTargetDesc.cpp | ||
123–133 | if (MCSymbolXCOFF *XSym = dyn_cast<MCSymbolXCOFF>(S)) { | |
llvm/test/CodeGen/PowerPC/aix-xcoff-symbol-rename.ll | ||
2 | s/symbol contains/symbols containing/; | |
3 | s/XCOFF platform/an XCOFF platform/; | |
4 | s/32 bit/32-bit/; | |
5 | s/test/tests/; |
llvm/lib/MC/MCContext.cpp | ||
---|---|---|
317 | Thanks for the example and also trying it out with Clang. |
LGTM with minor nit; thanks.
llvm/lib/MC/MCAsmStreamer.cpp | ||
---|---|---|
851 | Add "a" before "double quote". |
I would suggest not to capitalize "Internal" if there is no clear special definition of that word that this is referencing.
Suggestion:
The name used internally in the assembly for references to the symbol.