This patch adds support for simple single qualified names that includes internal mangled names and normal symbol names.
Details
- Reviewers
- None
- Commits
- rGe63c799a767b: [Demangle] Add support for D simple single qualified names
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This patch still looks pretty long - perhaps it could be split up. Could support a single identifier first, then multiple identifiers (qualifiers) and then the LName things? (could be one at a time, but might be easy enough to check them all together there if they're really regularly structured)
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
385–386 | Might be easier to read with a StringSwitch (could make a StringRef from the Mnagled pointer and the length - maybe pass as StringRef here, rather than as the decomposed components) rather than a switch over length + strncmps? | |
391–392 | Remove else-after-return(s) | |
423–429 | This appears to be unstested |
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
385–386 | I guess you are missing the point of this switch. Maybe some comments can be added to cover this behaviour but the point here is: the length passed into parseLName is the length of the identifier name but not of the mangled name evaluated inside strncmp. This is specifically needed, because, for other special names, e.g. initializers (__initZ), a Z is suffixed indicating it is a special symbol. In order to verify that, extra length to accommodate the 'Z' char is needed. The StringSwitch is not applicable here because of that, otherwise would be a good fit. I see optimizations that can be done, however, which is:
| |
423–429 | Thanks, this actually needs to be on another patch, since this is technically part of the function mangling. It also includes this and ~this (they are not being tested here, but on https://reviews.llvm.org/D111423 although they should be all tested on https://reviews.llvm.org/D110576) I'm going to update the path when tests file get rearranged (due to the discussion on https://reviews.llvm.org/D111414) |
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
385–386 | Well, I realized that the first option is not doable here and the second one, from what I tested on clang 13, the optimizer can take what I suggested easily. https://godbolt.org/z/5dorfsWq9 |
I changed the patch to avoid else-after-return, use // comment style, remove untested code, use ++i over i++ when possible, and use the OutputBuffer.
@dblaikie Can you rereview now?
Still seems a bit long in one patch/hard to verify that all the cases are covered by tests, etc. Could you break this up further - maybe implementing only the most basic, single 'parseIdentifier' case, then the iteration, then the anonymous symbols, then the lnames?
I have to disagree here a bit, but I'm going to do it anyway since I really want to go forward with this and don't lose time discussing this nit. I think parsing LName special cases could be indeed in another patch, but either anonymous symbols and identifier iteration are really little things. About anonymous symbols, we are talking about 5 lines of code that are trivial:
// Skip over anonymous symbols. if (*Mangled == '0') { do ++Mangled; while (*Mangled == '0'); continue; }
And the iteration with an if for adding the dot is a simple do while dependent on isSymbolName which currently just checks if the next char is a digit or not before parsing another node, which is like 10 lines of code.
I'm going to also separate the fake parent symbols, which is not yet in the spec. See here for context https://github.com/dlang/dlang.org/pull/3124 .
https://reviews.llvm.org/D114305
https://reviews.llvm.org/D114307
https://reviews.llvm.org/D114308
https://reviews.llvm.org/D114309
I added four more patches, corresponding to the requested split content.
This appears to be unstested