This is an archive of the discontinued LLVM Phabricator instance.

Implement `target modules dump headers`
ClosedPublic

Authored by amccarth on Mar 24 2016, 3:18 PM.

Details

Summary

I set out to add functionality to dump headers from a binary, only to discover that the functionality already existed but wasn't (as far as I can tell) hooked up to the command line. (It's accessible through the script bridge by enumerating modules and dumping them.) This patch adds a command, `target modules dump headers`, to do this interactively.

I'm not wedded to the particulars of the name, so if you think it should be called something else, I'm open to suggestions.

Diff Detail

Repository
rL LLVM

Event Timeline

amccarth updated this revision to Diff 51610.Mar 24 2016, 3:18 PM
amccarth retitled this revision from to Implement `target modules dump headers`.
amccarth updated this object.
amccarth added a reviewer: clayborg.
amccarth added a subscriber: lldb-commits.
clayborg edited edge metadata.Mar 24 2016, 3:22 PM

What does some sample output of this look like? I can't remember what module->Dump(...) does.

I'm using it with PE files (Windows), so I see

A table of where the debug info exists
MSDOS Header
COFF Header
PECOFF header
a table of section headers (.rdata, .bss, etc.).

The section headers table is redundant with target modules dump sections, but I don't see another way to get the rest of it. I'm particularly interested in the COFF and PECOFF headers.

Literally:

Dumping headers for 1 module(s).
Headers for 'd:\src\fizzbuzz\a.exe':
    Module d:\src\fizzbuzz\a.exe
04189230:       ObjectFilePECOFF, file = 'd:\src\fizzbuzz\a.exe', arch = i386
      SectID     Type             File Address                             File Off.  File Size  Flags      Section Name
      ---------- ---------------- ---------------------------------------  ---------- ---------- ---------- ----------------------------
      0x00000001 data             [0x0000000000401000-0x00000000004085cc)  0x00000400 0x00007600 0x40000040 a.exe..rdata
      0x00000002 zero-fill        [0x0000000000409000-0x000000000040b244)  0x00000000 0x00000000 0xc0000080 a.exe..bss
      0x00000003 data             [0x000000000040c000-0x000000000040da8c)  0x00007a00 0x00001c00 0xc0000040 a.exe..data
      0x00000004 code             [0x000000000040e000-0x000000000042c7b8)  0x00009600 0x0001e800 0x60000020 a.exe..text
      0x00000005 data             [0x000000000042d000-0x000000000042d93c)  0x00027e00 0x00000a00 0x40000040 a.exe..xdata
      0x00000006 data             [0x000000000042e000-0x000000000042e887)  0x00028800 0x00000a00 0x40000040 a.exe..idata
      0x00000007 dwarf-abbrev     [0x000000000042f000-0x000000000042f6c1)  0x00029200 0x00000800 0x42000040 a.exe..debug_abbrev
      0x00000008 dwarf-info       [0x0000000000430000-0x000000000043ac12)  0x00029a00 0x0000ae00 0x42000040 a.exe..debug_info
      0x00000009 dwarf-line       [0x000000000043b000-0x000000000043ec3a)  0x00034800 0x00003e00 0x42000040 a.exe..debug_line
      0x0000000a dwarf-loc        [0x000000000043f000-0x000000000043f08e)  0x00038600 0x00000200 0x42000040 a.exe..debug_loc
      0x0000000b dwarf-pubnames   [0x0000000000440000-0x0000000000442ca3)  0x00038800 0x00002e00 0x42000040 a.exe..debug_pubnames
      0x0000000c dwarf-pubtypes   [0x0000000000443000-0x0000000000443b59)  0x0003b600 0x00000c00 0x42000040 a.exe..debug_pubtypes
      0x0000000d dwarf-ranges     [0x0000000000444000-0x0000000000444888)  0x0003c200 0x00000a00 0x42000040 a.exe..debug_ranges
      0x0000000e dwarf-str        [0x0000000000445000-0x0000000000450a55)  0x0003cc00 0x0000bc00 0x42000040 a.exe..debug_str
      0x0000000f regular          [0x0000000000451000-0x000000000045336c)  0x00048800 0x00002400 0x42000040 a.exe..reloc
MSDOS Header
  e_magic    = 0x5a4d
  e_cblp     = 0x0000
  e_cp       = 0x0000
  e_crlc     = 0x0000
  e_cparhdr  = 0x0000
  e_minalloc = 0x0000
  e_maxalloc = 0x0000
  e_ss       = 0x0000
  e_sp       = 0x0000
  e_csum     = 0x0000
  e_ip       = 0x0000
  e_cs       = 0x0000
  e_lfarlc   = 0x0040
  e_ovno     = 0x0000
  e_res[4]   = { 0x0000, 0x0000, 0x0000, 0x0000 }
  e_oemid    = 0x0000
  e_oeminfo  = 0x0000
  e_res2[10] = { 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, 0x0000 }
  e_lfanew   = 0x00000040
COFF Header
  machine = 0x014c
  nsects  = 0x000f
  modtime = 0x00000000
  symoff  = 0x0004ac00
  nsyms   = 0x000015f2
  hdrsize = 0x00e0
Optional COFF Header
  magic                   = 0x010b
  major_linker_version    = 0x00
  minor_linker_version    = 0x00
  code_size               = 0x0001e800
  data_size               = 0x0002c000
  bss_size                = 0x00000000
  entry                   = 0x000187b3
  code_offset             = 0x0000e000
  data_offset             = 0x00000000
  image_base              = 0x0000000000400000
  sect_alignment          = 0x00001000
  file_alignment          = 0x00000200
  major_os_system_version = 0x0006
  minor_os_system_version = 0x0000
  major_image_version     = 0x0000
  minor_image_version     = 0x0000
  major_subsystem_version = 0x0006
  minor_subsystem_version = 0x0000
  reserved1               = 0x00000000
  image_size              = 0x00053400
  header_size             = 0x00000400
  checksum                = 0x00000000
  subsystem               = 0x0003
  dll_flags               = 0x8140
  stack_reserve_size      = 0x0000000000100000
  stack_commit_size       = 0x0000000000001000
  heap_reserve_size       = 0x0000000000100000
  heap_commit_size        = 0x0000000000001000
  loader_flags            = 0x00000000
  num_data_dir_entries    = 0x00000010
  data_dirs[ 0] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 1] vmaddr = 0x0002e000, vmsize = 0x00000028
  data_dirs[ 2] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 3] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 4] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 5] vmaddr = 0x00051000, vmsize = 0x0000236c
  data_dirs[ 6] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 7] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 8] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[ 9] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[10] vmaddr = 0x00007ca0, vmsize = 0x00000040
  data_dirs[11] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[12] vmaddr = 0x0002e16c, vmsize = 0x00000144
  data_dirs[13] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[14] vmaddr = 0x00000000, vmsize = 0x00000000
  data_dirs[15] vmaddr = 0x00000000, vmsize = 0x00000000

Section Headers
IDX  name             vm addr    vm size    file off   file size  reloc off  line off   nreloc nline  flags
==== ---------------- ---------- ---------- ---------- ---------- ---------- ---------- ------ ------ ----------
[ 0] .rdata           0x00001000 0x000075cc 0x00000400 0x00007600 0x00000000 0x00000000 0x0000 0x0000 0x40000040
[ 1] .bss             0x00009000 0x00002244 0x00000000 0x00000000 0x00000000 0x00000000 0x0000 0x0000 0xc0000080
[ 2] .data            0x0000c000 0x00001a8c 0x00007a00 0x00001c00 0x00000000 0x00000000 0x0000 0x0000 0xc0000040
[ 3] .text            0x0000e000 0x0001e7b8 0x00009600 0x0001e800 0x00000000 0x00000000 0x0000 0x0000 0x60000020
[ 4] .xdata           0x0002d000 0x0000093c 0x00027e00 0x00000a00 0x00000000 0x00000000 0x0000 0x0000 0x40000040
[ 5] .idata           0x0002e000 0x00000887 0x00028800 0x00000a00 0x00000000 0x00000000 0x0000 0x0000 0x40000040
[ 6] .debug_abbrev    0x0002f000 0x000006c1 0x00029200 0x00000800 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[ 7] .debug_info      0x00030000 0x0000ac12 0x00029a00 0x0000ae00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[ 8] .debug_line      0x0003b000 0x00003c3a 0x00034800 0x00003e00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[ 9] .debug_loc       0x0003f000 0x0000008e 0x00038600 0x00000200 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[10] .debug_pubnames  0x00040000 0x00002ca3 0x00038800 0x00002e00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[11] .debug_pubtypes  0x00043000 0x00000b59 0x0003b600 0x00000c00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[12] .debug_ranges    0x00044000 0x00000888 0x0003c200 0x00000a00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[13] .debug_str       0x00045000 0x0000ba55 0x0003cc00 0x0000bc00 0x00000000 0x00000000 0x0000 0x0000 0x42000040
[14] .reloc           0x00051000 0x0000236c 0x00048800 0x00002400 0x00000000 0x00000000 0x0000 0x0000 0x42000040

04161368:     SymbolVendor (d:\src\fizzbuzz\a.exe)
(lldb) target modules dump sections
Dumping sections for 1 modules.
Sections for 'd:\src\fizzbuzz\a.exe' (i686):
  SectID     Type             File Address                             File Off.  File Size  Flags      Section Name
  ---------- ---------------- ---------------------------------------  ---------- ---------- ---------- ----------------------------

  0x00000001 data             [0x0000000000401000-0x00000000004085cc)  0x00000400 0x00007600 0x40000040 a.exe..rdata
  0x00000002 zero-fill        [0x0000000000409000-0x000000000040b244)  0x00000000 0x00000000 0xc0000080 a.exe..bss
  0x00000003 data             [0x000000000040c000-0x000000000040da8c)  0x00007a00 0x00001c00 0xc0000040 a.exe..data
  0x00000004 code             [0x000000000040e000-0x000000000042c7b8)  0x00009600 0x0001e800 0x60000020 a.exe..text
  0x00000005 data             [0x000000000042d000-0x000000000042d93c)  0x00027e00 0x00000a00 0x40000040 a.exe..xdata
  0x00000006 data             [0x000000000042e000-0x000000000042e887)  0x00028800 0x00000a00 0x40000040 a.exe..idata
  0x00000007 dwarf-abbrev     [0x000000000042f000-0x000000000042f6c1)  0x00029200 0x00000800 0x42000040 a.exe..debug_abbrev
  0x00000008 dwarf-info       [0x0000000000430000-0x000000000043ac12)  0x00029a00 0x0000ae00 0x42000040 a.exe..debug_info
  0x00000009 dwarf-line       [0x000000000043b000-0x000000000043ec3a)  0x00034800 0x00003e00 0x42000040 a.exe..debug_line
  0x0000000a dwarf-loc        [0x000000000043f000-0x000000000043f08e)  0x00038600 0x00000200 0x42000040 a.exe..debug_loc
  0x0000000b dwarf-pubnames   [0x0000000000440000-0x0000000000442ca3)  0x00038800 0x00002e00 0x42000040 a.exe..debug_pubnames
  0x0000000c dwarf-pubtypes   [0x0000000000443000-0x0000000000443b59)  0x0003b600 0x00000c00 0x42000040 a.exe..debug_pubtypes
  0x0000000d dwarf-ranges     [0x0000000000444000-0x0000000000444888)  0x0003c200 0x00000a00 0x42000040 a.exe..debug_ranges
  0x0000000e dwarf-str        [0x0000000000445000-0x0000000000450a55)  0x0003cc00 0x0000bc00 0x42000040 a.exe..debug_str
  0x0000000f regular          [0x0000000000451000-0x000000000045336c)  0x00048800 0x00002400 0x42000040 a.exe..reloc
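
The overlap between the two tables can be checked by hand: in the output above, each "File Address" range in the sections dump equals image_base plus the section's "vm addr", extended by its "vm size". Here is a minimal sketch of that arithmetic; the struct and function names are made up for illustration and are not part of LLDB.

```cpp
#include <cstdint>

// Half-open address range [begin, end), matching the "[a-b)" notation
// used in the sections dump. Hypothetical type, for illustration only.
struct AddressRange {
  uint64_t begin;
  uint64_t end;
};

// Computes the range shown by `target modules dump sections` from the
// image_base in the optional header and a section header's vm addr/vm size.
AddressRange FileAddressRange(uint64_t image_base, uint32_t vm_addr,
                              uint32_t vm_size) {
  const uint64_t begin = image_base + vm_addr;
  return {begin, begin + vm_size};
}
```

For .rdata above, image_base 0x400000, vm addr 0x00001000 and vm size 0x000075cc give [0x401000-0x4085cc), which matches the first row of the sections dump.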
clayborg requested changes to this revision.Apr 1 2016, 9:50 AM
clayborg edited edge metadata.

I would limit this scope to dumping ObjectFiles since that seems to be what you want. So this command should probably change from:

(lldb) target modules dump headers

to:

(lldb) target modules dump objfile

Then CommandObjectTargetModulesDump would just get the ObjectFile from any modules in the arguments and call ObjectFile::Dump(...) instead of module->Dump() since module->Dump() dumps the module itself, and the object file and the symbol file. We already have a "target modules dump symfile", so "target modules dump objfile" makes more sense here.

This revision now requires changes to proceed.Apr 1 2016, 9:50 AM

FWIW I believe we do actually want many of the PE headers, although I have to say I don't like the format of the output. It seems like we could break this up into smaller chunks like section headers, pe headers, binary headers, debug headers, and then allow the user to specify some combination of flags (or all) to display different levels of detail. But that could come as a followup. And also translate some of the hex codes to textual enum values like IMAGE_MACHINE_386 so it's easier to understand the output.

In the future we might even want a way to add operating-system-specific stuff to the dump. Windows binaries, for example, can have embedded resource files that show the file version and things like that, and it's nice to get a unified view that shows some of this OS-specific stuff interspersed in a nicely formatted manner. A lot of this type of information is useful for crash reporting and bucketing.
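
To make the enum-translation idea concrete, here is a rough sketch of decoding the COFF machine field. The function names are hypothetical and nothing like this is in the patch. The constant names are the ones the PE/COFF specification uses (IMAGE_FILE_MACHINE_I386 is what the comment above abbreviates), and only a few values are listed.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Maps a COFF machine value to its PE/COFF specification name.
// Hypothetical helper; only a handful of machine types are covered.
std::string MachineToString(uint16_t machine) {
  switch (machine) {
  case 0x014c: return "IMAGE_FILE_MACHINE_I386";
  case 0x8664: return "IMAGE_FILE_MACHINE_AMD64";
  case 0x01c0: return "IMAGE_FILE_MACHINE_ARM";
  case 0xaa64: return "IMAGE_FILE_MACHINE_ARM64";
  default:     return "unknown";
  }
}

// Formats the field as the raw hex value followed by the decoded name,
// so the existing output stays recognizable.
std::string FormatMachineField(uint16_t machine) {
  char hex[16];
  std::snprintf(hex, sizeof(hex), "0x%04x", static_cast<unsigned>(machine));
  return std::string("machine = ") + hex + " (" + MachineToString(machine) + ")";
}
```

With the sample binary above, the line would read "machine = 0x014c (IMAGE_FILE_MACHINE_I386)" instead of the bare hex value.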

I agree. To go a step further, we could have ObjectFile::Dump() take an extra "Args &args" parameter so that each object file can have its own dumping args. That would allow the PECOFF dumping to have options similar to those of the program that actually dumps things on the Windows platform, Mach-O could have options for dumping load commands and many other Mach-specific things, and ELF could have options to dump the program and section headers, the raw ELF symbol table, and much more. Then the new "objfile" command could chop up any remaining args and pass them down into ObjectFile::Dump().
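
As a rough illustration of that pass-through, here is a self-contained sketch. It uses toy stand-ins, not LLDB's real ObjectFile or Args classes. The flag names and the "no args means dump everything" rule are assumptions made up for the example.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for one object file plugin. Each plugin would interpret
// its own dump arguments; these flag names are invented for illustration.
struct ToyObjectFile {
  std::string Dump(const std::vector<std::string> &args) const {
    // Assumption: with no args, every part is dumped.
    auto wants = [&args](const std::string &flag) {
      return args.empty() ||
             std::find(args.begin(), args.end(), flag) != args.end();
    };
    std::string out;
    if (wants("--coff-header")) out += "COFF Header\n";
    if (wants("--section-headers")) out += "Section Headers\n";
    return out;
  }
};

// Splits the leftover command text on whitespace.
std::vector<std::string> SplitArgs(const std::string &raw) {
  std::istringstream in(raw);
  std::vector<std::string> args;
  for (std::string word; in >> word;) args.push_back(word);
  return args;
}

// The command layer does not interpret the args; it only forwards them.
std::string DumpObjFileCommand(const ToyObjectFile &objfile,
                               const std::string &remaining_args) {
  return objfile.Dump(SplitArgs(remaining_args));
}
```

The point of the design is that the command itself stays format-agnostic: only the plugin knows which options make sense for its file format.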

amccarth updated this revision to Diff 52588.Apr 4 2016, 11:41 AM
amccarth edited edge metadata.

OK, this now dumps headers for the ObjectFiles of the modules rather than the modules themselves, using target modules dump objfile as suggested.

My only concern here is that the SB API has a module dump, but no object file dump, and this essentially adds the reverse for the command line.

I agree with the additional proposals (e.g., adding an arguments parameter, decoding some of the fields, grabbing other info like version resources), but I'd like to do those as subsequent changes.

clayborg requested changes to this revision.Apr 4 2016, 1:13 PM
clayborg edited edge metadata.

Close, we just need to remove the "Headers for" string. See inlined comment.

source/Commands/CommandObjectTarget.cpp
1568 ↗(On Diff #52588)

We should probably leave this for the objfile->Dump(...) to print. "Headers for '%s'" doesn't make sense here. So this print should probably be removed, or just the filename should be printed out without the "Headers for" part.

This revision now requires changes to proceed.Apr 4 2016, 1:13 PM
amccarth updated this revision to Diff 52618.Apr 4 2016, 1:28 PM
amccarth edited edge metadata.

Removed the "headers for" string, but kept the name of the file, as not all of the ObjectFile::Dump implementations (e.g., ELF) print the file name.

clayborg requested changes to this revision.Apr 4 2016, 1:37 PM
clayborg edited edge metadata.

See inlined comments.

source/Commands/CommandObjectTarget.cpp
1568 ↗(On Diff #52618)

Actually looking at the ObjectFilePECOFF and ObjectFileMachO dump functions, they both do something like:

lldb_private::Mutex::Locker locker(module_sp->GetMutex());
s->Printf("%p: ", static_cast<void*>(this));
s->Indent();
if (m_header.magic == MH_MAGIC_64 || m_header.magic == MH_CIGAM_64)
    s->PutCString("ObjectFileMachO64");
else
    s->PutCString("ObjectFileMachO32");

ArchSpec header_arch;
GetArchitecture(header_arch);

*s << ", file = '" << m_file << "', arch = " << header_arch.GetArchitectureName() << "\n";

We should update ObjectFileELF to do the same kind of thing and remove the filename from here.

1569 ↗(On Diff #52618)

No need to indent more if we aren't printing anything.

1571 ↗(On Diff #52618)

No need to indent less if we aren't printing anything.

This revision now requires changes to proceed.Apr 4 2016, 1:37 PM
amccarth added inline comments.Apr 4 2016, 1:53 PM
source/Commands/CommandObjectTarget.cpp
1568 ↗(On Diff #52618)

Can do. Is locking the module mutex actually necessary to dump the object file?

1569 ↗(On Diff #52618)

I thought IndentMore changes the state of the stream so that the dump on the next line would be indented relative to the file name we just printed. Is that not how it works?

Perhaps you meant that, since this will no longer print the file name, there's no point in indenting.
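
For reference, the stateful-indent behavior being asked about can be sketched with a toy stream. This is not LLDB's Stream class. The method names mirror the ones mentioned in this thread, and the default step of two spaces is an assumption.

```cpp
#include <cstddef>
#include <string>

// Toy stream that keeps an indentation level as state. Hypothetical
// class for illustration; it only imitates the idea under discussion.
class ToyStream {
public:
  // Raises the level used by later Indent() calls.
  void IndentMore(std::size_t amount = 2) { m_indent += amount; }
  // Lowers the level, never going below zero.
  void IndentLess(std::size_t amount = 2) {
    m_indent = m_indent >= amount ? m_indent - amount : 0;
  }
  // Emits spaces for the current level.
  void Indent() { m_out.append(m_indent, ' '); }
  void PutCString(const std::string &text) { m_out += text; }
  const std::string &str() const { return m_out; }

private:
  std::size_t m_indent = 0;
  std::string m_out;
};
```

In this model, IndentMore() only pays off if something was printed at the outer level first; with the file name line gone, the IndentMore()/IndentLess() pair around the dump has nothing to be relative to.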

amccarth updated this revision to Diff 52624.Apr 4 2016, 2:15 PM
amccarth edited edge metadata.

Removed the per-file header from the common code and made ObjectFileELF::Dump print its own header like the others.

clayborg accepted this revision.Apr 4 2016, 2:18 PM
clayborg edited edge metadata.

Looks good. Thanks for making all the changes.

This revision is now accepted and ready to land.Apr 4 2016, 2:18 PM
This revision was automatically updated to reflect the committed changes.