This is an archive of the discontinued LLVM Phabricator instance.

lldb incorrectly parses DWARF information from Go binary
AcceptedPublic

Authored by sam37 on Oct 1 2014, 6:05 PM.

Details

Summary

lldb has a problem parsing executables built with the Go compiler. For example here's the error message when I set a breakpoint:
(lldb) breakpoint set --name main.main
warning: .debug_arange set has bad header at 0x00000000: length=0x03011101, version=0x1308, cu_offset=0x1201110b, addr_size=1, seg_size=16
Breakpoint 1: where = sleep`main.main, address = 0x0000000000002000
(lldb)
It turns out that lldb is using 0 for the offset in the file of the .debug_aranges section from the beginning of the DWARF section. So lldb is trying to interpret the dwarf header as the .debug_aranges header.
I've traced it down to the SymbolFileDWARF::GetCachedSectionData function in SymbolFileDWARF.cpp, line 749:

data.SetData(m_dwarf_data, section_sp->GetOffset (), section_sp->GetFileSize());

This call uses m_dwarf_data which is a memory mapped section of the Dwarf data. It expects GetOffset() to be the number of bytes in the file from the beginning of the DWARF segment to the beginning of the .debug_aranges section. But that's not what Section::GetOffset returns. It returns the number of bytes from the beginning of the DWARF section in memory to the .debug_aranges section.

The Go compiler sets all the Dwarf virtual memory addresses to zero, as shown by a piece of the output from otool -l below. So the offset of the virtual memory address of the .debug_aranges section from the beginning of the DWARF segment is 0. That's why lldb was trying to use the DWARF header for the .debug_aranges header.

otool -l output:
Load command 4
      cmd LC_SEGMENT_64
  cmdsize 632
  segname __DWARF
   vmaddr 0x0000000000000000
   vmsize 0x0000000000000000
  fileoff 1699840
 filesize 535094
  maxprot 0x00000000
 initprot 0x00000000
   nsects 7
    flags 0x0
Section
  sectname __debug_abbrev
   segname __DWARF
      addr 0x0000000000000000
      size 0x00000000000000d3
    offset 1699840
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
Section
  sectname __debug_line
   segname __DWARF
      addr 0x0000000000000000
      size 0x000000000000d6f1
    offset 1700051
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0
 
<snip __debug_frame, __debug_info, __debug_pubnames and __debug_pubtypes sections>

Section
  sectname __debug_aranges
   segname __DWARF
      addr 0x0000000000000000
      size 0x0000000000000030
    offset 2234886
     align 2^0 (1)
    reloff 0
    nreloc 0
     flags 0x00000000
 reserved1 0
 reserved2 0

My suggested patch explicitly calculates the difference between the byte offset of the parent and the child section in the file.

// See if we memory mapped the DWARF segment?
if (m_dwarf_data.GetByteSize())
{
   // Get the offset of this section from the beginning of the file.
   lldb::offset_t offset = section_sp->GetFileOffset();
   SectionSP parent_sp (section_sp->GetParent());
   // If this section has a parent then we need the offset in the file from
   // the beginning of the parent section.
   if (parent_sp)
   {
       offset = offset - parent_sp->GetFileOffset();
   }
   data.SetData(m_dwarf_data, offset, section_sp->GetFileSize());
}
else
{

A possible criticism is that the code doesn't check to make sure the child's GetFileOffset() is larger than the parent's GetFileOffset(). Since offset is an unsigned number, if the parent address is greater than the child address then offset would be a very large number. SetData will clip the size of the data if it is out of range of what's available.
Not checking at this level is consistent with other parts of lldb. For example line 1619 of ObjectFileMachO.cpp has this entry:
sect64.addr - segment_sp->GetFileAddress(),
This is subtracting the in-memory address of the parent from the in-memory address of the section. No checking is done to make sure the parent's address is < child's address.

I also suggest adding the following comment to declaration of Section::GetOffset to make it clear the method is the offset in memory, not in the file:

// The offset in memory of this section from its parent. If
// there is no parent then this method returns the offset from
// the beginning of the program in memory.
 lldb::addr_t
 GetOffset () const;

The next question is whether GetOffset is being used incorrectly anyplace else in the code. I changed the declaration and definition of GetOffset in Section.h and Section.cpp to include an int argument so the compiler would flag other uses.

lldb::addr_t GetOffset (int noop) const;

Besides the use in SymbolFileDWARF.cpp this change broke the build in the following 2 methods:

Section::GetLoadBaseAddress
Section::ResolveContainedAddress

Based on their names, these methods want to use GetOffset to refer to the location of a section in memory, not in a file. That's the correct use of GetOffset.

I've run dotest.py on both the original and changed versions. Both report failures and unexpected successes, but they don't seem to be due to my change. For example

<with-change>/test/2014-09-28-08_56_00/Failure-x86_64-clang-TestCallWithTimeout.ExprCommandWithTimeoutsTestCase.test_with_dwarf.log
  File "/Users/sm10879/repos/lldb_3/test/expression_command/timeout/TestCallWithTimeout.py", line 76, in call_function
    self.assertTrue (return_value == lldb.eReturnStatusFailed)
<without-change>/test/2014-09-28-09_22_46/Failure-x86_64-clang-TestCallWithTimeout.ExprCommandWithTimeoutsTestCase.test_with_dwarf.log
  File "/Users/sm10879/repos/lldb_3/test/expression_command/timeout/TestCallWithTimeout.py", line 69, in call_function
    self.assertTrue (value.GetError().Success() == False)

Unfortunately this fix alone isn't enough to allow lldb to be used with Go code. The line numbers seem to be off by one in the DWARF information produced by the Go compiler. I'll follow up with the Go team on that.

I'm using Mac OS X 10.9.5.

Many thanks to Gary Clayton for taking the time to help get me started with using lldb on Go programs at WWDC.

Diff Detail

Event Timeline

sam37 updated this revision to Diff 14302.Oct 1 2014, 6:05 PM
sam37 retitled this revision from to lldb incorrectly parses DWARF information from Go binary.
sam37 updated this object.
sam37 edited the test plan for this revision. (Show Details)
sam37 added a subscriber: Unknown Object (MLST).
sam37 updated this object.Oct 1 2014, 6:11 PM
sam37 updated this object.
sam37 updated this object.Oct 1 2014, 6:13 PM
mg11 added a subscriber: mg11.Oct 1 2014, 10:49 PM

Looks good to me.

tfiala added a subscriber: tfiala.

Adding Jason as a proxy for Greg.

tfiala added a subscriber: clayborg.Oct 2 2014, 8:47 AM
jasonmolenda edited edge metadata.Oct 2 2014, 2:15 PM

made two small comments. Agree with adding clarifying comment about the behavior of Section::GetOffset to the header file. Not sure I understand why we're changing off of the GetOffset method in SymbolFileDWARF -- it seems like we can use the GetOffset method like before and if this Section has a parent Section, add that parent Section's offset in.

include/lldb/Core/Section.h
205

Section.h uses the terminology "absolute file virtual address of this section if m_parent == NULL. offset from parent file virtual address if m_parent != NULL". I'd use similar wording here, maybe

The absolute file virtual address of this section if it has no parent Section. If there is a parent Section, this is the offset from the parent Section's file virtual address."

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
750

Wouldn't it make sense to continue to use the Section::GetOffset() method here? I'm not clear on why we're changing to GetFileOffset().

756

If we use Section::GetOffset() above, then this would be

offset = offset + parent_sp-GetOffset();

right?

clayborg requested changes to this revision.Oct 6 2014, 3:04 PM
clayborg added a reviewer: clayborg.

The GO compiler should fix its mach-o sections to have unique addresses. I would really rather not parse an invalid MachO file and make work arounds for the incorrect file mach-o file.

This revision now requires changes to proceed.Oct 6 2014, 3:04 PM
sam37 added a comment.Oct 7 2014, 6:31 PM

To answer clayborg's comment:
"The GO compiler should fix its mach-o sections to have unique addresses. I would really rather not parse an invalid MachO file and make work arounds for the incorrect file mach-o file."

This is only the second time I've looked at the format of a Mach-O file in detail, so I may have made a mistake in my argument below. Please let me know.

This part of the SymbolFileDWARF code is using a memory mapped object to reference the DWARF information. A memory mapped object duplicates the layout of of the information on the disk. So the memory offset information should be the memory offset of the data on the disk, not where it will be loaded in virtual memory. This is a bug in lldb.

The Go compiler has the correct information for the offsets of the data on the disk.

As for the Go compiler setting the runtime load address of the section to 0, is that really an error? The DWARF information is not loaded into memory as part of normal program execution, right? Presumably that's why lldb memory maps the file. So is it really incorrect to say it's runtime load address in virtual memory is 0?

include/lldb/Core/Section.h
205

Thanks for the review!
The problem is that there are two types of offsets. From Address.h:
"There are currently two types of addresses
/ for a section:
/ @li file addresses
/ @li load addresses
/
/ File addresses represent the virtual addresses that are in the "on
/ disk" object files...
/ Load addresses represent the virtual addresses where each section
/ ends up getting loaded at runtime."
Given this distinction, I find that the phrase "absolute file virtual address" confusing.
I do agree that my comment could be improved. How about
GetOffset returns the difference between where the section is loaded in virtual memory at runtime and where the section's parent is loaded at runtime.

source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
756

As I noted above, GetOffset is for the offset in virtual memory where the code will be loaded. However this section of the code is using a memory mapped object. That means the memory corresponds directly to where the section information is located on the disk. For that you want the calculation I suggested.
It is really confusing.

clayborg accepted this revision.Oct 8 2014, 10:55 AM
clayborg edited edge metadata.

I would suggest still giving the DWARF segment a valid vm address range and giving each section within that a valid addr + size that falls within the DWARF segment's vm range.

After looking at your fix, the patch will work. Feel free to rename the Section::GetOffset() to Section::GetMemoryOffset() to make things clearer. We already have Section::GetFileOffset(), so it would be nice to specify "Memory" in GetMemoryOffset() to make sure things are clear from a Section API perspective.

Other than that, the patch does look like it will work just fine.

This revision is now accepted and ready to land.Oct 8 2014, 10:55 AM
sam37 added a comment.Oct 9 2014, 10:32 AM

I'm doing a final check and working on a revision incorporating the suggestions. Please don't commit this patch yet.

Whatever patch you end up doing, please test:

(lldb) expr -g (int)printf("Hello world\n")

To make sure we can still read the DWARF for JIT'ed code and debug it.

sam37 updated this revision to Diff 14770.EditedOct 11 2014, 7:00 AM
sam37 edited edge metadata.

Changed name of Section::GetOffset to Section::GetMemoryOffset
Added comment for Section::GetFileOffset to make it clear that the offset is always from the beginning of the file, not the parent section.
Changed comment for Section::GetMemoryOffset so it's clear that the offset in memory is from the parent section if there is a parent section.
The code in SymbolFileDWARF::GetCachedSectionData was not changed from my earlier revision.
Gary asked me to check the JIT processing. Running lldb on a C program produced the same result both with and without my patch:
(lldb) expr -g -- (int)printf("Hello world\n")
error: Execution was halted at the first instruction of the expression function because "debug" was requested.
Use "thread return -x" to return to the state before expression evaluation.