Page MenuHomePhabricator

New C++ function name parsing logic
ClosedPublic

Authored by eugene on Mar 28 2017, 8:36 PM.

Details

Summary

Current implementation of CPlusPlusLanguage::MethodName::Parse() doesn't get anywhere close to covering full extent of possible function declarations.
It causes incorrect behavior in avoid-stepping and sometimes messes printing of thread backtrace.

This change implements more methodical parsing logic based on clang lexer and simple recursive parser.

Examples:
void std::vector<Class, std::allocator<Class> >::_M_emplace_back_aux<Class const&>(Class const&)
void (*&std::_Any_data::_M_access<void (*)()>())()

Diff Detail

Repository
rL LLVM

Event Timeline

eugene created this revision.Mar 28 2017, 8:36 PM
eugene created this object with visibility "eugene (Eugene Zemtsov)".
eugene created this object with edit policy "eugene (Eugene Zemtsov)".
eugene updated this revision to Diff 93438.Mar 29 2017, 7:01 PM
eugene added reviewers: labath, zturner, jingham.
eugene changed the visibility from "eugene (Eugene Zemtsov)" to "Public (No Login Required)".
eugene changed the edit policy from "eugene (Eugene Zemtsov)" to "All Users".
eugene added a subscriber: lldb-commits.

Improve template method parsing accuracy.

labath edited edge metadata.Mar 30 2017, 9:38 AM

I cant help but feel that this could have been done in a simpler way, but then again, some of the cases you have dug up are quite tricky.
I think we should do some performance measurements to see whether this needs more optimising or it's fine as is.

I propose the following benchmark:
bin/lldb bin/clang

  • make sure clang is statically linked

breakpoint set --name non_exisiting_function

Clang needs to be statically linked so we can access all its symbols without running​ it (which would skew the benchmark) -- this is the reason we cannot use lldb itself as most of its symbols are in liblldb.

Setting the breakpoint on a nonexistent function avoids us timing the breakpoint setting machinery, while still getting every symbol in the executable parsed.

If the performance is ok i am quite happy with this, apart from some stylistic nits.

source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
94 ↗(On Diff #93438)

How about the following api:

if (auto function​ = CPlusPlusNameParser::ParseAsFunctionDefinition(m_full.GetStringRef())) {
  ...
160 ↗(On Diff #93438)

Same here

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp
62 ↗(On Diff #93438)

Wouldn't it be better to change the type of m_next_token_index to size_t?

188 ↗(On Diff #93438)

Is this really the case for the types we are interested in? I would have hoped that this would get simplified to `std::enable_if<true, bool> before it reaches us?

572 ↗(On Diff #93438)

Could this be written as:
m_text.take_front(end_pos).drop_front(start_pos)?

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.h
33 ↗(On Diff #93438)

I think we dont put m_ for fields of dumb structs that are meant to be accessed directly.

94 ↗(On Diff #93438)

Please make the type move-only. Otherwise you will have a fun time debugging accidental copies. (You already have one, although it is benign now)

unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
106 ↗(On Diff #93438)

Would defining operator<< for std::ostream and StringRef ( local tomthis test) enable you to get rid of these?
I've ran into this before but was never annoyed enough to actually do it.. :/

eugene updated this revision to Diff 93572.Mar 30 2017, 6:04 PM

Addressing code-review comments.
Most notable change: MethodName::Parse() tries simple version of name parser, before invoking full power of CPlusPlusNameParser. It really helps with the perf.

eugene marked 4 inline comments as done.Mar 30 2017, 6:42 PM

I think we should do some performance measurements to see whether this needs more optimising or it's fine as is.

I propose the following benchmark:

bin/lldb bin/clang

make sure clang is statically linked
breakpoint set --name non_exisiting_function

Setting the breakpoint on a nonexistent function avoids us timing the breakpoint setting machinery, while still getting every symbol in the executable parsed.

I did some micro-benchmarking and on average new parser is ~3 time slower than the old one. (new parser - ~200k string/s, old parser - ~700k string/s)
clang::Lexer appears to be the slowest part of it.

On the clang breakpoint benchmark you proposed, it is hard to notice much of a difference. clang binary has about 500k functions.
With new parser it takes about 11s to try to set a breakpoint, on the old one it's about 10s. (release version of LLDB, debug static version of clang)

I decided to use a hybrid approach, when we use old parsing code for simple cases and call new parser only when it fails.
80% of clang functions are simple enough that we don't really need the new parser, so it helped to bring clang breakpoint test back to 10s.
I think it's reasonable to assume that similar distribution is true for most programs, and most of their functions can be parsed with the old code.

I don't think we can really improve performance of a new parser without giving up on clang::Lexer, and I'm reluctant to do it.
So I propose to keep hybrid approach and call a new parser only for complicated cases.

source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
94 ↗(On Diff #93438)

If you don't mind I'll leave it as it is.

I understand that it's very tempting to have two simple functions ParseAsFunctionDefinition and ParseAsFullName instead of a class, but I can imagine calling second one if first one fails, and in this case it'll be good that parser doesn't need to tokenize string all over again.

160 ↗(On Diff #93438)

see above

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp
62 ↗(On Diff #93438)

Unsigned types are dangerous. I prefer to avoid them wherever I can.

As far as I remember many members of C++ standard commit publicly regretted the fact that .size() returns an unsigned type.

188 ↗(On Diff #93438)

Yes. I was very surprised as well. But apparently compiler sometimes is being lazy and doesn't simplify everything :(

Running:
~$ nm -C --defined-only clang | grep std::enable_if

Gives a few symbols like this:
std::enable_if<(10u)<(64), bool>::type llvm::isUInt<10u>(unsigned long)

572 ↗(On Diff #93438)

Thanks! I still have a lot to learn about llvm classes.

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.h
94 ↗(On Diff #93438)

Good catch! Thanks!

unittests/Language/CPlusPlus/CPlusPlusLanguageTest.cpp
106 ↗(On Diff #93438)

I tried to to it, but it didn't work for me after a few tries.
I decided to just convert it to std::string, it solves the same problem.

I did some micro-benchmarking and on average new parser is ~3 time slower than the old one. (new parser - ~200k string/s, old parser - ~700k string/s)
clang::Lexer appears to be the slowest part of it.

On the clang breakpoint benchmark you proposed, it is hard to notice much of a difference. clang binary has about 500k functions.
With new parser it takes about 11s to try to set a breakpoint, on the old one it's about 10s. (release version of LLDB, debug static version of clang)

I decided to use a hybrid approach, when we use old parsing code for simple cases and call new parser only when it fails.
80% of clang functions are simple enough that we don't really need the new parser, so it helped to bring clang breakpoint test back to 10s.
I think it's reasonable to assume that similar distribution is true for most programs, and most of their functions can be parsed with the old code.

I don't think we can really improve performance of a new parser without giving up on clang::Lexer, and I'm reluctant to do it.
So I propose to keep hybrid approach and call a new parser only for complicated cases.

I like that idea. Let's go with that.

source/Plugins/Language/CPlusPlus/CPlusPlusLanguage.cpp
94 ↗(On Diff #93438)

Ok, that makes sense -- I didn't expect that would actually work.
I suppose one can always write CPlusPlusNameParser(foo).ParseAsFunctionDefinition() when he wants to make it a one-liner and doesn't care about tokenization reuse.

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp
18 ↗(On Diff #93572)

Are these necessary? You seem to prefix every occurence of Optional and None anyway...

257 ↗(On Diff #93572)

You don't need to go ConstString here. If you wanted to avoid strlen computation, just make this constexpr StringLiteral.

62 ↗(On Diff #93438)

Well... they are most dangerous when you try to combine them with signed types, which is exactly what you are doing now... :) It's also contrary to how things are done in other parts of the code base and goes against the principle of having as few nonsensical values for your variables as possible. So, I still think this (and all other variables you use for token indexes) should be size_t.

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.h
94 ↗(On Diff #93438)

Please also delete the assignment operator.. Having copy constructor deleted and assignment operator working is confusing,

One note on benchmarking: A did some performance profiling on LLDB using a similar approach to what Pavel suggested and if I remember correctly only ~10% of the time was spent on C++ name parsing (~15% was C++ demangling, ~50% was debug_info parsing, rest of them was fairly well distributed). Because of this I think some targeted micro benchmark will be much more useful to measure the performance of this code then an end-to-end test as an e2e test would have low signal to noise ratio.

eugene updated this revision to Diff 93693.Mar 31 2017, 12:48 PM
eugene marked 3 inline comments as done.

Addressing review commnets

eugene updated this revision to Diff 93694.Mar 31 2017, 12:52 PM
eugene marked an inline comment as done.
eugene marked 2 inline comments as done.Mar 31 2017, 12:57 PM
eugene added a subscriber: labath.

Because of this I think some targeted micro benchmark will be much more useful to measure the performance of this code then an end-to-end test as an e2e test would have low signal to noise ratio.

I did some micro-benchmarking and on average new parser is ~3 time slower than the old one. (new parser - ~200k string/s, old parser - ~700k string/s)
clang::Lexer appears to be the slowest part of it.
I mitigate this performance loss, by calling simplified parsing code for simple cases and calling new parser only when the old one fails.

source/Plugins/Language/CPlusPlus/CPlusPlusNameParser.cpp
18 ↗(On Diff #93572)

Well, I used None. Now I use Optional as well.

labath accepted this revision.Mar 31 2017, 1:58 PM

Thank you

Because of this I think some targeted micro benchmark will be much more useful to measure the performance of this code then an end-to-end test as an e2e test would have low signal to noise ratio.

I did some micro-benchmarking and on average new parser is ~3 time slower than the old one. (new parser - ~200k string/s, old parser - ~700k string/s)
clang::Lexer appears to be the slowest part of it.
I mitigate this performance loss, by calling simplified parsing code for simple cases and calling new parser only when the old one fails.

It was pretty clear that the new parser will be slower than the old one, even if I couldn't tell whether it would be 2x or 20x. That's why I wanted a macro benchmark to see whether that matters on the grand scale of things. If you say that 10% of time is name parsing, then we definitely don't want to make that 30%, which means the decision to use two parsers was correct.

This revision is now accepted and ready to land.Mar 31 2017, 1:58 PM
This revision was automatically updated to reflect the committed changes.
jingham edited edge metadata.Apr 6 2017, 4:06 PM

I'm sorry, I don't have time actually review the code here for correctness... But can you make sure that this also rejects a two or three field selector, not just "selector:" but "selector:otherField:"? That seems sufficiently different that you might get the one : but not the two : form right. You could test 3 & more colons, but at that point it's probably overkill.

This code is making debugging of large C++ apps so slow it is unusable...

This is part of the problem, but not the entire thing.. We had a mangled name:
_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiEEEESI_S7_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_EEEEESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESt6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_

It demangles to something that is 72MB in size!!! So yes, this code would take a while to parse all of that. So we need a way to deal with large C++ strings more effectively without having slow performance.

Greg, this name is amazing. My c++filt refuses to demangle it. We can probably give up on parsing C++ names if they're longer than 1000 characters or something.

The funny thing is this is only 981 characters long... Not sure what the right cutoff would be...

This code is processing demangled names. Since you say (I could not get my demangler to process it either) the symbol demangles to a multi-megabyte name, we can probably make the cutoff even longer then 1000 bytes.

OTOH, if we abort demangling of such names in the first place, then this code will not even get used.

This code is processing demangled names. Since you say (I could not get my demangler to process it either) the symbol demangles to a multi-megabyte name, we can probably make the cutoff even longer then 1000 bytes.

Problem is we don't have a cutoff when using the libc++ demangler. There is no such features. c++filt just calls into the system (abi::__cxa_demangle(mangled_name, NULL, NULL, NULL)) demangler. We seem to now try our fast demangler, and then fall back to "llvm::itaniumDemangle(...)".

We would switch over to using llvm::itaniumDemangle() all the time and then we could modify this call to have a max length. I believe the code inside llvm::itaniumDemangle is currently an exact local copy of abi::cxa_demangle(), so it would involve some maintenance as they try to keep in sync with abi::cxa_demangle() if we modify it...

OTOH, if we abort demangling of such names in the first place, then this code will not even get used.

The main question is how do we know what we should demangle and what we shouldn't _before_ we try to demangle it. No easy answer to that.