This patch adds support for type back referencing, allowing demangling of compressed mangled symbols with repetitive types. Signed-off-by: Luís Ferreira <contact@lsferreira.net>
Details
- Reviewers
dblaikie - Commits
- rGb21ea1c2701d: [Demangle] Add support for D types back referencing
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/unittests/Demangle/DLangDemangleTest.cpp | ||
---|---|---|
19–20 | Might be worth some comments explaining what each test is intended to cover. Using similar phrasing to the comments in the implementation source would help cross reference (eg: is this last one meant to test the "LastBackref" bailout, or the "parseType" failure case? (or something else) | |
20 | /might/ be easier to follow if each case is tested in a separate mangled name? (helps to see which things are varying, which things are the same, so the intention of each test is clearer?) Hmm, yeah, I'm not sure I follow what this is testing. It's got several backrefs, perhaps where one would suffice? (similarly with the previous ABC.ABC.ABC.a test, for what it's worth - not quite following that either) & they look like they'd all parse through parseQualified... 8demangle, 4ABCi, Qf, Qh, 1a, Qh - looks all like that would be resolved by parseIdentifier (calling the base case of parseLName for the numerically prefixed ones, and the parseSymbolBackref for the Q ones) called in parseQualified, |
I added comments to the test cases and fixed C like comments, ++I instead of I++ and other requested changes on previous patches. Can you please re-review it?
llvm/unittests/Demangle/DLangDemangleTest.cpp | ||
---|---|---|
20 | Well, I added an overcomplicated test. I simplified it to use only one back reference, since referencing backreferences are not in the specification. The idea behind this backreference concept is to calculate the position offset to the current position but backwards, by converting the difference of a char ASCII value with 'a' or 'A' to the desired shifted position and relocating the pointer with that value, so that Ret points to that reference. For the current test case example _D8demangle4ABCi1aQd Qd is a backreference for position 4, counting from d char, so decoding it points to i. Since i is a valid type sequence, it succeeded. I added an invalid test case with a comment explaining it. LastBackRef is currently being used to check if the LastBackRef is the same as the currently decoded backreference, and therefore detecting a recursive backreference. This needs to be checked since a type can technically point to a letter. This logic can be later extended when pointing to non-recursive backreferences is a thing (I'm currently discussing this and doing some test cases to implement this in the official spec and the reference compiler) otherwise, checking for the 'Q' letter would be sufficient. |
Looks good, thanks!
llvm/unittests/Demangle/DLangDemangleTest.cpp | ||
---|---|---|
20 | Might be handy to include some of the explanation of the _D8demangle4ABCi1aQd from this discussion in a comment in the test |
I also added another test to include invalid back reference position with _D8demangle3fooQXXXx.
llvm/unittests/Demangle/DLangDemangleTest.cpp | ||
---|---|---|
20 | Added the suggested explanation on both patches. Updated the previous one as well. |
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
312 | Out of curiosity, do you have the full/actual Coverity error? Is it detecting based on the "Save" prefix that the variable is intended to save/restore? I'm guessing on this codepath we're probably on an error path anyway, and it doesn't matter if it's restored? But @ljmf00 could you confirm that that's the case and/or that this path (a LastBackref value, and a bail-out through this early return) is tested/behaving as desired? |
This is the coverity dump (if you have access its at https://scan.coverity.com/projects/llvm):
292const char *Demangler::parseTypeBackref(const char *Mangled) { 293 // A type back reference always points to a letter. 294 // TypeBackRef: 295 // Q NumberBackRef 296 // ^ 297 const char *Backref; 298 299 // If we appear to be moving backwards through the mangle string, then 300 // bail as this may be a recursive back reference. 1. Condition Mangled - this->Str >= this->LastBackref, taking false branch. 301 if (Mangled - Str >= LastBackref) 302 return nullptr; 303 2. save: Saving non-local this->LastBackref in local SaveRefPos. 304 int SaveRefPos = LastBackref; 3. modify: Modifying non-local this->LastBackref. 305 LastBackref = Mangled - Str; 306 307 // Get position of the back reference. 308 Mangled = decodeBackref(Mangled, Backref); 309 310 // Can't decode back reference. 4. Condition Backref == NULL, taking true branch. 311 if (Backref == nullptr) CID 1469001 (#1 of 1): Failure to restore non-local value (MISSING_RESTORE)5. end_of_scope: Value of non-local this->LastBackref that was saved in SaveRefPos is not restored as it was along other paths. 312 return nullptr; 313 314 // TODO: Add support for function type back references. 315 Backref = parseType(Backref); 316 A1. restore_example: The original value of non-local this->LastBackref was restored here. 317 LastBackref = SaveRefPos; 318 319 if (Backref == nullptr) 320 return nullptr; 321 322 return Mangled; 323} 324
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
312 | I don't see the need to restore such variable, as this will abort the demangler anyway. parse functions doesn't usually restore the demangler context, only decode functions, if needed. Can you clarify why this triggered your attention to this? I also requested access to coverity, is that possible? |
llvm/lib/Demangle/DLangDemangle.cpp | ||
---|---|---|
312 | I'm working my way back in time reviewing warnings - this was first detected at 2022/1/14 |
@ljmf00 coverity is warning that it looks like we fail to restore LastBackref if we early-return here - is that correct?