Page MenuHomePhabricator

[llvm-readobj][ELF]Add demangling support

Authored by jhenderson on Jan 16 2019, 9:13 AM.



This change adds demangling support to the ELF side of llvm-readobj, under the switch --demangle/-C. It relies on a refactor proposed in D56721.

The following places are demangled: symbol table dumps (static and dynamic), relocation dumps (static and dynamic), addrsig dumps, call graph profile dumps, and group sections.

Although GNU readelf doesn't support demangling, it is still a useful feature to have, and brings it on a par with llvm-objdump's capabilities.

This fixes

Diff Detail


Event Timeline

jhenderson created this revision.Jan 16 2019, 9:13 AM
jhenderson edited the summary of this revision. (Show Details)Jan 16 2019, 9:16 AM

Add documentation update.

(The documentation is already missing many options, but I might as well avoid making it worse!)

rupprecht accepted this revision.Jan 16 2019, 2:16 PM
rupprecht added inline comments.
26 ↗(On Diff #182071)

This is a really annoying nit, but could you indent the expected side so it reads more easily, vs being indented based on how long the check prefix is? e.g. for this block:

# LLVMCOMMON:       xRelocations [
# LLVMCOMMON:       x  Section {{.*}} {
# LLVMDEMANGLE-NEXT:x    {{ }}foo(char){{ }}
# LLVMMANGLED-NEXT: x    {{ }}_Z3fooc{{ }}
# LLVMCOMMON:       x]

(x's for clarity, showing the margin)

This revision is now accepted and ready to land.Jan 16 2019, 2:16 PM
grimar added inline comments.Jan 16 2019, 11:53 PM
14 ↗(On Diff #182071)

I already saw such way (using 'diff' tool) just recently somewhere in LLVM mails,
but I am not sure what is the benefit of doing that? I mean isn't it easier to do in the usual way:

# RUN: llvm-readobj --symbols --relocations --dyn-symbols --dyn-relocations \
# RUN:              --elf-section-groups --elf-cg-profile --addrsig         \
# RUN:              --demangle | FileCheck %s --check-prefixes=LLVMCOMMON,LLVMDEMANGLE
# RUN: llvm-readobj --symbols --relocations --dyn-symbols --dyn-relocations \
# RUN:              --elf-section-groups --elf-cg-profile --addrsig         \
# RUN:              -C --demangle | FileCheck %s --check-prefixes=LLVMCOMMON,LLVMDEMANGLE

That would avoid creating 2 files and calling the diff. And since calling tools and creating files affect on a
speed of the test execution, I think it is good to try to reduce it.

I am not sure it is really worth to compare the whole output between the option and its alias.
I just think it is not common and a bit excessive.

67 ↗(On Diff #182071)

I would suggest to remove LLVMCOMMON and just use LLVMDEMANGLE/LLVMMANGLED, something like:

# LLVMDEMANGLE:           Addrsig [
# LLVMDEMANGLE-NEXT:   Sym: foo(char){{ }}
# LLVMDEMANGLE-NEXT:   Sym: blah(float){{ }}

# LLVMMANGLED:           Addrsig [
# LLVMMANGLED-NEXT:   Sym: foo(char){{ }}
# LLVMMANGLED-NEXT:   Sym: blah(float){{ }}

(I think introducing 3 tags for a FileCheck call is unnecessary overcomplication here, it should look and read simpler if you have only 2).

206 ↗(On Diff #182071)

Does not seem you need .llvm_addrsig or Could you reduce the yaml to keep only important parts?

218 ↗(On Diff #182071)

The same, seems you do not need it.

191 ↗(On Diff #182071)

I noticed a small inconsistency in llvm-objdump about printing/not-printing the aliased with -help.
Seems they should only be printed witch -help-hidden, but atm two of them are always printed.
I was going to suggest a patch for this.

Seems you should not use cl::NotHidden here?

jhenderson marked 7 inline comments as done.Jan 17 2019, 4:37 AM
jhenderson added inline comments.
14 ↗(On Diff #182071)

That's actually four tool calls (llvm-readobj x 2 + FileCheck x 2). This approach is the same number of calls (llvm-readobj x 2 + FileCheck + diff). I don't know, but I wouldn't be surprised if diff is as fast as the second FileCheck run, although I see your point about creating the extra files (although I doubt it makes more than a small fraction of a second difference).

I personally have a marginal preference for this approach overall, as I think it's a little easier to read, and has the potential to catch odd bugs, but I don't feel strongly about it, and would change it to progress the review. However, a more important issue is consistency. Due to the bug in llvm-readelf mentioned in the test below, the GNU output has to use a single file across multiple runs (or to run FileCheck twice). As that already pipes to a file, keeping the LLVM ones piping to a file helps keep things consistent.

26 ↗(On Diff #182071)

Sure. I'm the king of nit-picking, so feel free to pick at them however minor they may seem!

67 ↗(On Diff #182071)

Whilst for the Addrsig bit, it's not too bad, I and a colleague found it much less readable for things like the relocation dumps. It also is less maintainable if we ever change the format slightly, as it requires updating two different places. Finally, it is harder to see where the actual differences are if written as you've suggested.

I think with the fix suggested by @rupprecht, and one small tweak (adding a '-' between LLVM/GNU and COMMON etc.) it is a little clearer and easier to distinguish the prefixes.

206 ↗(On Diff #182071)

They are needed to test the demangling of symbols in those two dumps. If they aren't in the yaml, they won't be tested (and they go through a separate call path to the other symbol names, so it's not like we're duplicating coverage).

218 ↗(On Diff #182071)

llvm-readobj looks up dynamic data from the PT_DYNAMIC segment, not the .dynamic section, and uses that to dump dynamic relocations (who have a separate method for printing to static relocations, so need testing separately). The address information is dependent on the PT_LOAD the sections are in, hence we need both segments.

191 ↗(On Diff #182071)

I agree it's inconsistent, and I'd like to improve that, but I'd actually lean towards having short options in the main help text, not just help-hidden. If you see a usage of "-C" (or whatever) in a build script, and it's not visible in the help text, it could be confusing for users unfamiliar with the tool. Putting it under -help-hidden doesn't help, because this is not a normal (outside LLVM) way of getting short alias information.

jhenderson marked an inline comment as done.

Improve readability of test with improved indentation and renaming LLVM/GNU* to LLVM-/GNU-*

grimar accepted this revision.Jan 17 2019, 5:16 AM

Thanks for the answers, James. LGTM.

14 ↗(On Diff #182071)

(although I doubt it makes more than a small fraction of a second difference).

Yes, but check-llvm already might take minutes to finish :) More tests - more slowdown.

I am OK with it as is.

16 ↗(On Diff #182071)

nit: full stop is missing.

206 ↗(On Diff #182071)

Oops, indeed I did not notice you are using them. Sorry.

191 ↗(On Diff #182071)

That sounds good. I was not sure what the way will be preferred (I supposed hiding aliases was intentional), but for me, it would be more convenient to see them under -help (that was why I found this initially, I did not see the lines I would expect to see, but saw the other lines at the same time).

Add missing full stops. Thanks @grimar and @rupprecht. I'll commit this once my slow Linux machine has finished checking another commit.

Rebase on the latest version of D56721.

jhenderson marked an inline comment as done.Jan 17 2019, 7:37 AM
jhenderson added inline comments.
800 ↗(On Diff #182266)

This line didn't compile with my GCC build. I had to change it to:

return opts::Demangle ? demangle(Name) : Name.str();
This revision was automatically updated to reflect the committed changes.