This is an archive of the discontinued LLVM Phabricator instance.

Add data structure to form equivalence classes of mangled names.
ClosedPublic

Authored by rsmith on Aug 17 2018, 6:58 PM.

Details

Summary

Given a set of equivalent name fragments, this mechanism determines whether two
mangled names are equivalent. The intent is to use this for fuzzy matching of
profile data against the program after certain refactorings are performed.

Diff Detail

Repository
rL LLVM

Event Timeline

rsmith created this revision.Aug 17 2018, 6:58 PM
kristina added a subscriber: kristina.EditedAug 19 2018, 8:41 AM

It looks like a lot to maintain given that the IA64 mangler is not capable of mangling certain constructs without short circuiting to some default string as well as needing additions for more exotic and non-standard things like SEH filter/etc. name mangling for IA64-style ABIs (should support for that that ever come about, it's one of the projects I worked on for an embedded application).

Although I'm guessing this doesn't intend to cover such cases and can also be handy for additional common things like a cleaner way of rewriting std::__1::basic_string... (aka std::string) to its reserved shorthand mangled form? (slightly off the subject but it always bothered me that ABI tags eliminated the shorthand forms, even have a mangler fork that rewrites that, given that in my case I'm not using stable C++ ABI anyway).

Otherwise LGTM.

It looks like a lot to maintain given that the IA64 mangler is not capable of mangling certain constructs without short circuiting to some default string

I'm not sure what you're referring to. (The Itanium demangler in LLVM is essentially feature-complete.) Can you clarify?

Although I'm guessing this doesn't intend to cover such cases and can also be handy for additional common things like a cleaner way of rewriting std::__1::basic_string... (aka std::string) to its reserved shorthand mangled form?

Yes, supporting remapping between libc++ manglings and libstdc++ manglings is one of the primary things this is aimed at.

It looks like a lot to maintain given that the IA64 mangler is not capable of mangling certain constructs without short circuiting to some default string

I'm not sure what you're referring to. (The Itanium demangler in LLVM is essentially feature-complete.) Can you clarify?

Last time I looked through it which was probably over two months ago, there were some constructs it could not mangle and would just emit a fixed string in place of proper mangled name, this was back during LLVM 7.0.0 development cycle. Maybe all those cases are covered now in which case I apologize for the confusion.

Yes, supporting remapping between libc++ manglings and libstdc++ manglings is one of the primary things this is aimed at.

Ah, excellent, I'm assuming there are plans to make some sort of option for people using either v2 ABI (Fucshia authors) or unstable ABI to allow using libstdc++ reserved names in cases of single-ABI systems where the prefix isn't required and short form of std::string mangling can likely be exported? (not against the ABI prefix, just nice to be able to make use of those reserved mangled forms in situations where there aren't going to be versioning conflicts, though I'm not sure if that's something that needs to be addressed in libc++ or as something in the mangler)

So yeah overall, LGTM, sorry for the confusion over the former point.

erik.pilkington accepted this revision.Aug 22 2018, 12:31 AM

LGTM too, pretty neat stuff!

lib/Support/ItaniumManglingCanonicalizer.cpp
83–85 ↗(On Diff #161367)

Can you add a testcase to show that all the ForwardTemplateReference special casing works? It looks like it does, but I think it would be nice to add. i.e. call the canonicalizing demangler with _ZN1ScvT_I1GEEv or something.

92 ↗(On Diff #161367)

Shouldn't this technically be alignof(Node)?

This revision is now accepted and ready to land.Aug 22 2018, 12:31 AM
rsmith marked an inline comment as done.Aug 24 2018, 3:15 PM
rsmith added inline comments.
lib/Support/ItaniumManglingCanonicalizer.cpp
92 ↗(On Diff #161367)

alignof(Node) isn't necessarily going to be enough in the medium term; we want at least enough alignment for all the derived classes of Node. alignof(Node) is enough for now, but only because Node has a vptr; if we remove the vptr, which we probably should, then we might not have sufficient alignment. Alignment for a Node* member is the most that any of the derived classes needs, so that's what I asked for here.

getOrCreateNode has a static_assert to ensure that we catch any problems introduced by a Node subclass having a higher alignment requirement than this.

This revision was automatically updated to reflect the committed changes.
a.elovikov added inline comments.
llvm/trunk/unittests/Support/ItaniumManglingCanonicalizerTest.cpp
19

Hi Richard,

We compile LLVM with the oldest supportable GCC version (4.8.5) and in that configuration this unittest fails with Seg. Fault.
For some reason GCC can't handle the StringRef inside these std::initializer_lists (i.e. changing StringRef to const char * in lines 19, 20 and 24 "fixes" the issue for me). It seems the issue was fixed somewhere in 4.9 timeline (4.9.4 works without any changes in the test).

But 4.8 is declared as supported in https://llvm.org/docs/GettingStarted.html#host-c-toolchain-both-compiler-and-standard-library so we need to do something with it. What would be your suggestion for that?

a.elovikov added inline comments.Aug 27 2018, 7:31 AM
llvm/trunk/unittests/Support/ItaniumManglingCanonicalizerTest.cpp
19

Seems to be fixed by Chandler's https://reviews.llvm.org/rL340702.