This is an archive of the discontinued LLVM Phabricator instance.

Improve disassembly of branch targets in non-relocatable objects
ClosedPublic

Authored by jhenderson on Apr 21 2020, 2:52 AM.

Details

Summary

llvm-objdump currently tries to pick symbols for branch targets based on the nearest section. However, if there are multiple sections with the same address, it only looks in one of them for the symbols before falling back to using absolute symbols. We should at least look in all sections with the same address for the candidate symbol (but see also https://bugs.llvm.org/show_bug.cgi?id=45627). This change does that, meaning that in some cases we now print a branch target when we didn't before. When there are multiple competing sections, we prefer symbols in the non-empty one over those in empty sections, as the former are more likely to be the "correct" one.

At some point, we should try to fix https://bugs.llvm.org/show_bug.cgi?id=45627 properly, but that might have performance implications with only minor gains, so for now, I'm just improving the situation.

Diff Detail

Event Timeline

jhenderson created this revision.Apr 21 2020, 2:52 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: rupprecht. · View Herald Transcript
grimar added inline comments.Apr 21 2020, 3:55 AM
llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
63

SHN_ABS?

Also a suggestion: what about making it:

Index: [[INDEX]]

And then stop using # RUN: llvm-objcopy -N absolute %t5 %t6 to remove it,
but also add a test for something else, like SHN_COMMON or SHN_LOPROC etc?

77

Could you expand "too high"?

97

##

llvm/tools/llvm-objdump/llvm-objdump.cpp
1257

Perhaps I'd rewrite this body slightly.
Often sort comparators check one field at once, it is easier to read and mantain them:

if (LHS.first != RHS.first)
  return LHS.first < RHS.first;
return LHS.second.getSize() < RHS.second.getSize();
1560

Perhaps just It? (consistent with the code above + I guess might look nicer after clang format).

1569

Missing a space after if. Perhaps 'if (TargetSym)' would be more consistent with the code in this file.

jhenderson marked 7 inline comments as done.

Addressed review comments:

  • clang-formatted
  • improved comments and variable names
  • slight tweak to lambda
  • used SHN_ABS and provided it via yaml2obj to allow removal of llvm-objcopy run in test.
llvm/tools/llvm-objdump/llvm-objdump.cpp
1569

Thanks. Somehow I forgot to clang-format.

grimar accepted this revision.Apr 21 2020, 4:53 AM

LGTM. Lets see what other reviewers think.

This revision is now accepted and ready to land.Apr 21 2020, 4:53 AM

LGTM with one minor suggestion :)

llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
24
# RUN: yaml2obj ... -o %5 ...
#                      ^ Should be '%6', right?
Higuoxing accepted this revision.Apr 21 2020, 7:37 AM
jhenderson marked an inline comment as done.Apr 22 2020, 3:38 AM
jhenderson added inline comments.
llvm/test/tools/llvm-objdump/X86/disassemble-same-section-addr.test
24

Oops, good spot. That actually is a test bug. The %t6 was left lying around on my machine due to the previous runs using llvm-objcopy. When I fixed %t5 to %t6 here, the test started failing as apparently llvm-objdump treats all symbols without a section (including those with a section index of SHN_LOPROC) as absolute symbols.

I think therefore the only way to do this test is to revert back to my original version using llvm-objcopy (or create an entirely new input).

jhenderson marked an inline comment as done.

Reinstate llvm-objcopy usage to check behaviour of no symbol being found. Also add case for SHN_LOPROC being counted as an absolute symbol.

jhenderson requested review of this revision.Apr 22 2020, 3:40 AM

@grimar or @Higuoxing, could I get another review please?

grimar accepted this revision.Apr 22 2020, 3:47 AM

LGTM

This revision is now accepted and ready to land.Apr 22 2020, 3:47 AM
This revision was automatically updated to reflect the committed changes.