This is an archive of the discontinued LLVM Phabricator instance.

[Demangle] Add itaniumFindTypesInMangledName()
ClosedPublic

Authored by erik.pilkington on Aug 10 2018, 4:18 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

sgraenitz created this revision.Aug 10 2018, 4:18 PM

The respective LLDB patch is here: https://reviews.llvm.org/D50587

erik.pilkington commandeered this revision.Aug 10 2018, 4:44 PM
erik.pilkington edited reviewers, added: sgraenitz; removed: erik.pilkington.

I'll commandeer this so I can upload a patch with a test!

Add a test, libcxxabi changes.

labath added a subscriber: labath.Aug 11 2018, 12:32 AM

I've recently ran into another case where I've needed to modify a mangled name.

In short, the problem was that clang does not bother emitting complete object constructors (C1) when they are linkonce_odr, this causes a problem for us in expression evaluation, since it's then never possible to construct these objects in the expression. After my attempts to get clang (D46685) to emit these C1 symbols always resulted in code size regressions, we started looking at ways to fix this elsewhere. One of the possibilities was to substitute the C1 for the C2 version when we know they are identical. However, for this to work, I'd need a) to know whether a symbol is C1 or C2; b) modify the symbol from C1 to C2.

Doing things this way, would require adding more hooks to the demangler, which does not seem particularly nice. OTOH, exposing the AST would allow other interesting uses too. (For example, sometimes the mangled name actually contains more information than debug info -- debug info will describe a template function as foo<int>(int), but it will not tell you whether the int argument is actually the template parameter of the function, or if it's just a static type. This currently causes problems for us when trying to round-trip these functions into clang AST. If we could understand the structure of the mangled name (which preserves this kind of information), we could do a better job at supporting these functions).

I've put up my work so far at https://reviews.llvm.org/D50599. It is still very WIP, but I believe it's enough to show the general direction of where I'm heading with this. Unfortunately, I don't think I'll have the time in the next 2-3 weeks to complete that patch, but I wanted to show a possible alternative to the approach taken here (and see if it's even worth trying to complete that patch).

I'll commandeer this so I can upload a patch with a test!
Add a test, libcxxabi changes.

Sure, thanks for adding this.

Doing things this way, would require adding more hooks to the demangler, which does not seem particularly nice. OTOH, exposing the AST would allow other interesting uses too. ...

Yes, I am totally with you. Doing this on an exposed AST adds lots of potential, but it also seems like a bigger functional change to me, which requires proper planning and feature discussions. Thus, I am in favour of doing things step by step here. I'd first mimic the old LLDB behaviour with the new demangler and remove FastDemangle from LLDB. Afterwards we can plan and discuss at which places and for which extended functionality we can make use of an exposed AST.

In the meantime, I think llvm::itaniumFindTypesInMangledName() even adds value on its own as it's a simple way to visit parameter types in a mangled name.

What do you think?

I've put up my work so far at https://reviews.llvm.org/D50599. It is still very WIP, but I believe it's enough to show the general direction of where I'm heading with this.

Thanks for sharing this! I will have a closer look asap.

Doing things this way, would require adding more hooks to the demangler, which does not seem particularly nice. OTOH, exposing the AST would allow other interesting uses too. ...

Yes, I am totally with you. Doing this on an exposed AST adds lots of potential, but it also seems like a bigger functional change to me, which requires proper planning and feature discussions. Thus, I am in favour of doing things step by step here. I'd first mimic the old LLDB behaviour with the new demangler and remove FastDemangle from LLDB. Afterwards we can plan and discuss at which places and for which extended functionality we can make use of an exposed AST.

In the meantime, I think llvm::itaniumFindTypesInMangledName() even adds value on its own as it's a simple way to visit parameter types in a mangled name.

What do you think?

I think that's up to Erik mostly. Doing things this way definitely makes LLDB's life easier, but OTOH, I am generally trying to avoid cleaning up LLDB by pushing the mess into LLVM. This is not a particularly large mess, but it still a very odd specialized api that will only ever be used by a single customer.

I agree that the AST thing is a much bigger change, and originally I was planning to send an email to llvm-dev first (and probably I will at some point) after I clean things up, but your patches kinda forced my hand (I thought it was important for you to know that this is not a one-off thing, but there are other use cases for transforming mangled names). FWIW, I am fine with this going in and then being backed out once the more general framework is in, or just poking another hole for constructors if exposing the AST does not seem a good idea. That's particularly true as I cannot make much progress on that patch this week (though if either of you wants to pick that up, I wouldn't be offended :)).

sgraenitz accepted this revision.Aug 13 2018, 8:36 AM

Function for quite specialised use, but the implementation is simple and straightforward. LGTM

For my use case in LLDB this is perfect and we would like to use the functionality at least for a time. Pavel is right that exposing the AST (or implementing a more general visitor pattern etc.) would be very useful for many things in LLDB and probably other places around LLVM too, but I only know little about the details yet. For a good design we should take proper time. If we do that and if it all works out, then in the end this function may become obsolete again, right. My proposal for this case would be to keep FindTypesInMangledNameTest and reimplement itaniumFindTypesInMangledName locally in the test file with whatever tools we have by that time. I am happy to make the effort.

@erik.pilkington What do you think?

llvm/unittests/Demangle/FindTypesInMangledNameTest.cpp
33 ↗(On Diff #160206)

Well f(int&*, void) is maybe not the most practical example, but the mangled name alone should be valid :)

This revision is now accepted and ready to land.Aug 13 2018, 8:36 AM

I think that's up to Erik mostly. Doing things this way definitely makes LLDB's life easier, but OTOH, I am generally trying to avoid cleaning up LLDB by pushing the mess into LLVM. This is not a particularly large mess, but it still a very odd specialized api that will only ever be used by a single customer.

That's true, and we should probably look for a cleaner way of doing this. At the moment though, I think that adding this (small) hack so we can remove the big hack that is FastDemangle is a really good tradeoff.

I agree that the AST thing is a much bigger change, and originally I was planning to send an email to llvm-dev first (and probably I will at some point) after I clean things up, but your patches kinda forced my hand (I thought it was important for you to know that this is not a one-off thing, but there are other use cases for transforming mangled names). FWIW, I am fine with this going in and then being backed out once the more general framework is in, or just poking another hole for constructors if exposing the AST does not seem a good idea. That's particularly true as I cannot make much progress on that patch this week (though if either of you wants to pick that up, I wouldn't be offended :)).

Okay, look forward to seeing what you come up with!

For my use case in LLDB this is perfect and we would like to use the functionality at least for a time. Pavel is right that exposing the AST (or implementing a more general visitor pattern etc.) would be very useful for many things in LLDB and probably other places around LLVM too, but I only know little about the details yet. For a good design we should take proper time. If we do that and if it all works out, then in the end this function may become obsolete again, right. My proposal for this case would be to keep FindTypesInMangledNameTest and reimplement itaniumFindTypesInMangledName locally in the test file with whatever tools we have by that time. I am happy to make the effort.

@erik.pilkington What do you think?

Sounds good! I'll commit this so you can move forward with the dependent patch.

llvm/unittests/Demangle/FindTypesInMangledNameTest.cpp
33 ↗(On Diff #160206)

Oh, my mistake! I'll fix that in the commit.

This revision was automatically updated to reflect the committed changes.