This is an archive of the discontinued LLVM Phabricator instance.

Add a Microsoft Demangler library and utility.
ClosedPublic

Authored by zturner on Jul 19 2018, 9:15 AM.

Details

Summary

This patch picks up the work from D34667 and makes it (hopefully) suitable for an initial commit. Changes from the initial revision:

  1. More symbols can demangle correctly now than before.
  2. Variable and function naming convention converted to LLVM conventions.
  3. Removal of all usage of STL containers. This seems to be important due to its use in libcxxabi and potential future use in compiler-rt. I know it was a sticking point last time. I'm still making use of some type traits, but only because the itanium demangler is already doing so.
  4. Added an llvm-undname utility.
  5. Tests converted to using lit style tests and making use of llvm-undname
  6. Removal of custom String classes (LLVMDemangle now has a StringView class which is shared by the itanium demangler, so I use that instead).
  7. Undecorated names now try to intelligently add spaces between pointers and function parameters to make the output easier to read.
  8. Removal of all error messages. Previous code would try to return a message indicating specifically why an error occurred. This made use of std::string, which is now removed (see point number 3). Obviously it's possible to deal with dynamically allocated strings without C++, but I think for the purposes of an initial commit, this isn't necessary and just gets in the way of understanding the core logic.

There are still some limitations. There are certain things which I already know don't undecorate correctly. My testing strategy until now has been to go look at the comprehensive set of mangling tests we have in clang/test/CodeGenCXX/mangle-* and convert 100% of them to demangling tests. Currently I have all of Rui's initial tests passing, plus about 95% of the file mangle-ms.cpp. For the ones that still fail, I chose to left them in the test file with FIXME label rather than omitting it, because otherwise it would just get lost in mangle-ms.cpp and we would forget about it.

In subsequent patches I plan to go through and try to get more things working, adding more of the tests from clang/test/CodeGenCXX/mangle-*.cpp along the way until we have 100%

Diff Detail

Repository
rL LLVM

Event Timeline

zturner created this revision.Jul 19 2018, 9:15 AM
llvm/lib/Demangle/MicrosoftDemangle.cpp
31 ↗(On Diff #156285)

I wrote a version of this in ItaniumDemangle.cpp called BumpPointerAllocator, maybe you should move that to a header too and use it instead.

majnemer added inline comments.Jul 19 2018, 9:30 AM
llvm/lib/Demangle/MicrosoftDemangle.cpp
79–89 ↗(On Diff #156285)

Doesn't seem to be in the LLVM style.

158 ↗(On Diff #156285)

FFar?

274 ↗(On Diff #156285)

std::array?

578 ↗(On Diff #156285)

Can't this blow up if there isn't a character left?

zturner added inline comments.Jul 19 2018, 9:36 AM
llvm/lib/Demangle/MicrosoftDemangle.cpp
274 ↗(On Diff #156285)

This is one of those cases where I wasn't sure how much of STL I could use, so I decided against it.

amccarth added inline comments.
llvm/lib/Demangle/MicrosoftDemangle.cpp
212 ↗(On Diff #156285)

typo: "instances"

219 ↗(On Diff #156285)

Str()? I don't see that member.

230 ↗(On Diff #156285)

typo: "descent"

599 ↗(On Diff #156285)

Should there be an else clause or must _ always be followed by L?

708 ↗(On Diff #156285)

Should this set Error to true?

945 ↗(On Diff #156285)

Set Error to true?

llvm/test/Demangle/ms-basic.test
9 ↗(On Diff #156285)

Should most of these CHECKs be CHECK-NEXT?

Without CHECK-NEXT, there's a risk that the output from the test failure could be misleading about which test actually failed.

zturner updated this revision to Diff 156344.Jul 19 2018, 12:55 PM

Fixed a couple of issues pointed out by @majnemer. Also I kind of re-wrote the core logic to use virtual functions and Node subclasses. While the previous implementation worked, it was a bit hard to maintain and extend because it was not always obvious what set of fields of Type would be valid at any given moment. I think having separate implementations for separate type makes the code substantially cleaner and easier to understand.

Wanted to upload this sooner rather than later so people don't spenda lot of time reviewing the other implementation.

majnemer accepted this revision.Jul 19 2018, 8:19 PM

I'm worried about some extreme template stuff (function templates are not considered for back-referencing but other templates and normal functions are OK) but you are gonna work thru the rest of mangle-ms-* so you'll get there ;)

llvm/lib/Demangle/MicrosoftDemangle.cpp
83 ↗(On Diff #156344)

isAlnum

494–495 ↗(On Diff #156344)

?

706–707 ↗(On Diff #156344)

Brace this

731 ↗(On Diff #156344)

Clang uses an int64 here. I think this should do the same.
A good test might be:
??$f@$0?IAAAAAAAAAAAAAAA@@@YAXXZ
Another good one:
??$g@$0HPPPPPPPPPPPPPPP@@@YAXXZ

1242–1299 ↗(On Diff #156344)

Both switches don't cover all cases. Maybe add assert(false) ?

This revision is now accepted and ready to land.Jul 19 2018, 8:19 PM
This revision was automatically updated to reflect the committed changes.