Page MenuHomePhabricator

Expression evaluation for overloaded C functions (redux)

Authored by ldrumm on Nov 29 2016, 12:16 PM.



This is a redux of Ewan's patch , refactored to properly substitute primitive types using a hook in the itanium demangler, and updated after the previous patch went stale

The new SubsPrimitiveParmItanium function takes a symbol name and replacement primitive type parameter as before but parses it using the FastDemangler, which has been modified to be able to notify clients of parse events (primitive types at this point).

Additionally, we now use a set of ConstStrings instead of a vector so that we try and resolve the same invalid candidate multiple times.

I'd particularly welcome feedback on whether there is a better way of doing the hook from the demangler. In the present implementation, the time tradeoff is a single nullptr check in the hot path, which was the least invasive change of a few I prototyped.

Diff Detail


Event Timeline

ldrumm updated this revision to Diff 79616.Nov 29 2016, 12:16 PM
ldrumm retitled this revision from to Expression evaluation for overloaded C functions (redux).
ldrumm updated this object.
ldrumm added reviewers: spyffe, clayborg.
ldrumm added a subscriber: Restricted Project.
clayborg requested changes to this revision.Nov 29 2016, 2:45 PM
clayborg edited edge metadata.
clayborg added inline comments.
449 ↗(On Diff #79616)

Don't you mean search instead of mangled above?

450–452 ↗(On Diff #79616)

Use llvm::StringRef instead of "const char *" and then use the llvm::StringRef functions instead of libc functions for string searching.

456–458 ↗(On Diff #79616)

Remove since using llvm::StringRef will have lengths already.

460 ↗(On Diff #79616)

remove llvm::StringRef temp as no longer needed since "mangled" will be a llvm::StringRef already.

464 ↗(On Diff #79616)

Why not use a std::string?

465–466 ↗(On Diff #79616)

Remove both lines above if you use a std::string. Also, why zero it first and then copy the string over it?

471 ↗(On Diff #79616)

Does this only get called with parameters? Or does it get called for everything?

476 ↗(On Diff #79616)

Use starts_with StringRef function.

You also need to verify that there is a word boundary at the end of this. If you are searching for "F", and replacing it with "E" and you get "Foo, Bar) const" in s then this will match you will change it to "Eoo, Bar) const" right? So you need to check s[search.size] and make sure it is a non identifier character. Unless this function only gets called with identifier boundaries. From my quick look at the FastDemangle changes, it seems that this will get called with the entire remaining part of the string.

481–486 ↗(On Diff #79616)

If you use a std::string just call .erase() and .insert()

This revision now requires changes to proceed.Nov 29 2016, 2:45 PM
alexshap added inline comments.Nov 29 2016, 3:18 PM
22 ↗(On Diff #79616)

some thoughts:
(also i assume i might be mistaken / missing smth)
in ./source/Core/Mangled.cpp at least 3 demanglers are used:
FastDemangle, itaniumDemangle, abi::__cxa_demangle .
Adding primitive_type_hook to FastDemangle makes the interface
more complicated & inconsistent (although yes, it's not particularly consistent right now).

17 ↗(On Diff #79616)

does this patch work when the functions come from a shared library ?
(i mean if we have which contains the implementations of get_arg_type(float), get_arg_type(int), and then in lldb we want to call "p get_arg_type(0.12f)")

ldrumm updated this revision to Diff 81753.Dec 16 2016, 7:06 AM
ldrumm edited edge metadata.
ldrumm marked 6 inline comments as done.

updated to use StringRef-based parser rather than char pointers

22 ↗(On Diff #79616)

You're right there are a bunch of manglers in use, and this is suboptimal. There is recent traction in LLVM to switch to using LLDB's FastDemangler once it supports everything; in terms of making the interface messy - you're again spot-on. None of the 4-or-so manglers visible to lldb have a symmetric API to support things like demangle -> remangle, so this is the least intrusive way I could come up with that didn't involve adding a new demangler to the list. I did think of several other possible approaches, including using the llvm demangler (which has no real public API apart from llvm::itaniumDemangle where the internal datastructure (struct Db) is not exposed, and modifying the llvm demangler to fix a problem in LLDB is going to be a much harder sell.

17 ↗(On Diff #79616)

Yes. The only thing that matters to the current implementation is that we know the mangled symbol name.

449 ↗(On Diff #79616)

I did. That was a silly mistake.

465–466 ↗(On Diff #79616)

The reason I set it to zero first was that I didn't know how many replacements would be made or how many times swap_parms_hook would be called, so this case guarantees it's always NUL-terminated. I've switched to std::string as you suggested, which conveniently sidesteps the issue :)

471 ↗(On Diff #79616)

I'm not sure I follow; s should always be a valid pointer if that's what you mean?

476 ↗(On Diff #79616)

This function only gets called when the demangler is trying to resolve a builtin type, so it only get called in a context where it's encoding a type, not an identifier. I could be wrong, but as far as I can tell, from reading the spec for itanium demangling (, and the code for the FastDemangler, this should be safe in the case you mentioned.

clayborg accepted this revision.Dec 16 2016, 10:54 AM
clayborg edited edge metadata.

Looks good.

This revision is now accepted and ready to land.Dec 16 2016, 10:54 AM
This revision was automatically updated to reflect the committed changes.