This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Support disassembling by symbol name
ClosedPublic

Authored by rafauler on Mar 7 2018, 12:11 PM.

Details

Summary

Add a new option -df to llvm-objdump that takes function names
as arguments and instructs the disassembler to only dump those function
contents. Code originally written by Bill Nell.

Diff Detail

Event Timeline

rafauler created this revision.Mar 7 2018, 12:11 PM

This is static because the function DisassembleObject potentially runs multiple times. So the static is to avoid re-creating the same set for each object, and it's also why it checks if it is empty.

On 3/7/18, 5:02 PM, "Rafael Avila de Espindola" <rafael.espindola@gmail.com> wrote:

Rafael Auler via Phabricator via llvm-commits
<llvm-commits@lists.llvm.org> writes:


> +    static StringSet<> DisasmFuncsSet;

Why static?

> +    if (DisasmFuncsSet.empty() && !DisassembleFunctions.empty())

DisasmFuncsSet has just been created, so it is empty.

Cheers,
Rafael

Rafael Auler via Phabricator via llvm-commits
<llvm-commits@lists.llvm.org> writes:

+ static StringSet<> DisasmFuncsSet;

Why static?

+ if (DisasmFuncsSet.empty() && !DisassembleFunctions.empty())

DisasmFuncsSet has just been created, so it is empty.

Cheers,
Rafael

JDevlieghere added inline comments.Mar 8 2018, 9:16 AM
tools/llvm-objdump/llvm-objdump.cpp
1420

Not a big fan of having a static var here. How about extracting this into a function:

static StringSet<> GetDisasmFuncsSet() {
	static StringSet<> DisasmFuncsSet;
  if (DisasmFuncsSet.empty() && !DisassembleFunctions.empty())
    DisasmFuncsSet.insert(DisassembleFunctions.begin(),
                          DisassembleFunctions.end());
  return DisasmFuncsSet;
}

Rafael Auler <rafaelauler@fb.com> writes:

This is static because the function DisassembleObject potentially runs multiple times. So the static is to avoid re-creating the same set for each object, and it's also why it checks if it is empty.

OK, that is a bit unusual. Please move the set initialization just
before

llvm::for_each(InputFilenames, DumpInput);

So that it can be done only once by more conventional means. I am OK
with the set being global.

Cheers,
Rafael

rafauler updated this revision to Diff 137631.Mar 8 2018, 12:00 PM

Implement suggestions

Rafael Auler via Phabricator <reviews@reviews.llvm.org> writes:

+ if (!DisassembleFunctions.empty())
+ DisasmFuncsSet.insert(DisassembleFunctions.begin(),
+ DisassembleFunctions.end());

I don't think you need the if:

DisasmFuncsSet.insert(DisassembleFunctions.begin(),
                      DisassembleFunctions.end());

Will do nothing if DisassembleFunctions is empty.

LGTM with that.

Cheers,
Rafael

rafauler updated this revision to Diff 137799.Mar 9 2018, 10:53 AM

Removing the if

LGTM

Rafael Auler via Phabricator <reviews@reviews.llvm.org> writes:

rafauler updated this revision to Diff 137799.
rafauler added a comment.

Removing the if

Repository:

rL LLVM

https://reviews.llvm.org/D44224

Files:

test/tools/llvm-objdump/X86/disasm-specific-funcs.test
tools/llvm-objdump/llvm-objdump.cpp

Index: tools/llvm-objdump/llvm-objdump.cpp

  • tools/llvm-objdump/llvm-objdump.cpp

+++ tools/llvm-objdump/llvm-objdump.cpp
@@ -20,6 +20,7 @@
#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"
+#include "llvm/ADT/StringSet.h"
#include "llvm/ADT/Triple.h"
#include "llvm/CodeGen/FaultMaps.h"
#include "llvm/DebugInfo/DWARF/DWARFContext.h"
@@ -85,6 +86,12 @@
DisassembleAlld("D", cl::desc("Alias for --disassemble-all"),

cl::aliasopt(DisassembleAll));

+static cl::list<std::string>
+DisassembleFunctions("df",
+ cl::CommaSeparated,
+ cl::desc("List of functions to disassemble"));
+static StringSet<> DisasmFuncsSet;
+
cl::opt<bool>
llvm::Relocations("r", cl::desc("Display the relocation entries in the file"));

@@ -1388,23 +1395,15 @@

  DataRefImpl DR = Section.getRawDataRefImpl();
  SegmentName = MachO->getSectionFinalSegmentName(DR);
}
  • StringRef name;
  • error(Section.getName(name)); -
  • if ((SectionAddr <= StopAddress) &&
  • (SectionAddr + SectSize) >= StartAddress) {
  • outs() << "Disassembly of section ";
  • if (!SegmentName.empty())
  • outs() << SegmentName << ",";
  • outs() << name << ':';
  • }

+ StringRef SectionName;
+ error(Section.getName(SectionName));

// If the section has no symbol at the start, just insert a dummy one.
if (Symbols.empty() || std::get<0>(Symbols[0]) != 0) {
  • Symbols.insert(Symbols.begin(),
  • std::make_tuple(SectionAddr, name, Section.isText()
  • ? ELF::STT_FUNC
  • : ELF::STT_OBJECT));

+ Symbols.insert(
+ Symbols.begin(),
+ std::make_tuple(SectionAddr, SectionName,
+ Section.isText() ? ELF::STT_FUNC : ELF::STT_OBJECT));

}
 
SmallString<40> Comments;

@@ -1417,6 +1416,7 @@

uint64_t Size;
uint64_t Index;

+ bool PrintedSection = false;

std::vector<RelocationRef>::const_iterator rel_cur = Rels.begin();
std::vector<RelocationRef>::const_iterator rel_end = Rels.end();

@@ -1441,6 +1441,19 @@

  continue;
}

+ /// Skip if user requested specific symbols and this is not in the list
+ if (!DisasmFuncsSet.empty() &&
+ !DisasmFuncsSet.count(std::get<1>(Symbols[si])))
+ continue;
+
+ if (!PrintedSection) {
+ PrintedSection = true;
+ outs() << "Disassembly of section ";
+ if (!SegmentName.empty())
+ outs() << SegmentName << ",";
+ outs() << SectionName << ':';
+ }
+

// Stop disassembly at the stop address specified
if (End + SectionAddr > StopAddress)
  End = StopAddress - SectionAddr;

@@ -2203,6 +2216,9 @@

  return 2;
}

+ DisasmFuncsSet.insert(DisassembleFunctions.begin(),
+ DisassembleFunctions.end());
+

llvm::for_each(InputFilenames, DumpInput);
 
return EXIT_SUCCESS;

Index: test/tools/llvm-objdump/X86/disasm-specific-funcs.test

  • /dev/null

+++ test/tools/llvm-objdump/X86/disasm-specific-funcs.test
@@ -0,0 +1,20 @@
+ RUN: yaml2obj -o %t.out %p/Inputs/simple-executable-x86_64.yaml
+
RUN: llvm-objdump -d %t.out -df=main | FileCheck %s
+
+ CHECK: Disassembly of section .anothertext:
+
CHECK-NEXT: main:
+ CHECK-NEXT: 10: 55 pushq %rbp
+
CHECK-NEXT: 11: 48 89 e5 movq %rsp, %rbp
+ CHECK-NEXT: 14: 48 83 ec 20 subq $32, %rsp
+
CHECK-NEXT: 18: 48 8d 04 25 a8 00 00 00 leaq 168, %rax
+ CHECK-NEXT: 20: c7 45 fc 00 00 00 00 movl $0, -4(%rbp)
+
CHECK-NEXT: 27: 48 89 45 f0 movq %rax, -16(%rbp)
+ CHECK-NEXT: 2b: 48 8b 45 f0 movq -16(%rbp), %rax
+
CHECK-NEXT: 2f: 8b 08 movl (%rax), %ecx
+ CHECK-NEXT: 31: 89 4d ec movl %ecx, -20(%rbp)
+
CHECK-NEXT: 34: e8 c7 ff ff ff callq -57
+ CHECK-NEXT: 39: 8b 4d ec movl -20(%rbp), %ecx
+
CHECK-NEXT: 3c: 01 c1 addl %eax, %ecx
+ CHECK-NEXT: 3e: 89 c8 movl %ecx, %eax
+
CHECK-NEXT: 40: 48 83 c4 20 addq $32, %rsp
+// CHECK-NEXT: 44: 5d popq %rbp

This revision was not accepted when it landed; it landed in state Needs Review.Mar 9 2018, 11:18 AM
This revision was automatically updated to reflect the committed changes.

Thank you Rafael and Jonas for reviewing