The implementation of Address::GetAddressClass() is based on the file address. Thus it will give an incorrect result if there is more than one section for a particular file address. For example (see attached log), there are two sections (.text and .debug_ranges) for the file address 0xbcf0 (there are two symbols in the symtab). Hence Address::GetAddressClass() will return "eAddressClassDebug" instead of "eAddressClassCode". This will cause breakpoint failures and an incorrect disassembly view.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Two issues we need to resolve:
- I don't think AddressClass should have Absolute as a value. Absolute values in symbol tables are just numbers, not necessarily addresses.
- This change won't work for ARM
include/lldb/lldb-enumerations.h | ||
---|---|---|
829 ↗ | (On Diff #107847) | I would suggest removing "Address" from the end of the enum name. It is already in an enum that starts with "eAddressClass". I also question why any address class should be absolute. Absolute symbols are usually not addresses, but just values. |
source/Core/Address.cpp | ||
993–1003 ↗ | (On Diff #107847) | This won't work correctly for ARM binaries. ".text" can be filled with ARM, Thumb and Data, and there is a CPU map that can help unwind this. The above code will just say "eAddressClassCode" for all of ".text". So this won't work. My guess is the right fix here is to check if the address has a valid section before calling the code below, and to remove all the code above: if (!GetSection()) return eAddressClassInvalid; GetFileAddress will just return the m_offset if the section isn't valid. One could argue that Address::GetFileAddress() should only return the file address if the section is valid though; perhaps that should be the change we make here. |
In your log we see:
[10] .text             PROGBITS      000000000000bb80 0000bb80 0000000000054380 0000000000000000 AX    0    0     16
[30] .debug_aranges    MIPS_DWARF    0000000000000000 0006c24c 0000000000000560 0000000000000000       0    0     1
.debug_aranges doesn't have a valid address: it is set to 0x0. Not sure how the ELF plug-in is marking these sections, but I am guessing that that is the problem here.
In the above log output, .text has a valid address, but .debug_aranges doesn't. We might need to cross-correlate with the program headers to tell if something gets loaded (PT_LOAD) for a given file address. What does the output of:
(lldb) image dump sections
look like?
Looking at an ELF file with DWARF, I see:
(lldb) image dump sections
Dumping sections for 1 modules.
Sections for '/Volumes/android/aosp/out/target/product/generic/symbols/system/lib/libart.so' (arm):
SectID     Type             File Address                            Perm File Off.  File Size  Flags      Section Name
---------- ---------------- --------------------------------------- ---- ---------- ---------- ---------- ----------------------------
0x00000001 regular                                                  ---  0x00000000 0x00000000 0x00000000 libart.so.
0x00000002 regular          [0x0000000000000154-0x0000000000000167) r--  0x00000154 0x00000013 0x00000002 libart.so..interp
0x00000003 elf-dynamic-symbols [0x0000000000000168-0x0000000000010d18) r--  0x00000168 0x00010bb0 0x00000002 libart.so..dynsym
0x00000004 regular          [0x0000000000010d18-0x000000000005228f) r--  0x00010d18 0x00041577 0x00000002 libart.so..dynstr
0x00000005 regular          [0x0000000000052290-0x000000000005a590) r--  0x00052290 0x00008300 0x00000002 libart.so..hash
0x00000006 elf-relocation-entries [0x000000000005a590-0x0000000000067870) r--  0x0005a590 0x0000d2e0 0x00000002 libart.so..rel.dyn
0x00000007 elf-relocation-entries [0x0000000000067870-0x0000000000068398) r--  0x00067870 0x00000b28 0x00000002 libart.so..rel.plt
0x00000008 regular          [0x0000000000068398-0x0000000000069468) r-x  0x00068398 0x000010d0 0x00000006 libart.so..plt
0x00000009 code             [0x0000000000069470-0x00000000002a94f8) r-x  0x00069470 0x00240088 0x00000006 libart.so..text
0x0000000a ARM.exidx        [0x00000000002a94f8-0x00000000002b12d0) r--  0x002a94f8 0x00007dd8 0x00000082 libart.so..ARM.exidx
0x0000000b ARM.extab        [0x00000000002b12d0-0x00000000002b1e44) r--  0x002b12d0 0x00000b74 0x00000002 libart.so..ARM.extab
0x0000000c regular          [0x00000000002b1e48-0x00000000002e39e4) r--  0x002b1e48 0x00031b9c 0x00000002 libart.so..rodata
0x0000000d eh-frame         [0x00000000002e39e4-0x00000000002e5fa8) r--  0x002e39e4 0x000025c4 0x00000002 libart.so..eh_frame
0x0000000e regular          [0x00000000002e5fa8-0x00000000002e63d4) r--  0x002e5fa8 0x0000042c 0x00000002 libart.so..eh_frame_hdr
0x0000000f regular          [0x00000000002e7e38-0x00000000002eaa24) rw-  0x002e6e38 0x00002bec 0x00000003 libart.so..data.rel.ro.local
0x00000010 regular          [0x00000000002eaa24-0x00000000002eaa28) rw-  0x002e9a24 0x00000004 0x00000003 libart.so..fini_array
0x00000011 regular          [0x00000000002eaa28-0x00000000002ee370) rw-  0x002e9a28 0x00003948 0x00000003 libart.so..data.rel.ro
0x00000012 regular          [0x00000000002ee370-0x00000000002ee3c0) rw-  0x002ed370 0x00000050 0x00000003 libart.so..init_array
0x00000013 elf-dynamic-link-info [0x00000000002ee3c0-0x00000000002ee4f0) rw-  0x002ed3c0 0x00000130 0x00000003 libart.so..dynamic
0x00000014 regular          [0x00000000002ee4f4-0x00000000002ef000) rw-  0x002ed4f4 0x00000b0c 0x00000003 libart.so..got
0x00000015 data             [0x00000000002ef000-0x00000000002ef8c4) rw-  0x002ee000 0x000008c4 0x00000003 libart.so..data
0x00000016 zero-fill        [0x00000000002ef8c8-0x00000000002f11fc) rw-  0x002ee8c8 0x00000000 0x00000003 libart.so..bss
0x00000017 regular                                                  ---  0x002ee8c4 0x00000010 0x00000030 libart.so..comment
0x00000018 dwarf-line                                               ---  0x002ee8d4 0x002c10eb 0x00000000 libart.so..debug_line
0x00000019 dwarf-info                                               ---  0x005af9bf 0x054eb22a 0x00000000 libart.so..debug_info
0x0000001a dwarf-abbrev                                             ---  0x05a9abe9 0x000f5e5b 0x00000000 libart.so..debug_abbrev
0x0000001b dwarf-aranges                                            ---  0x05b90a48 0x00011960 0x00000000 libart.so..debug_aranges
0x0000001c dwarf-loc                                                ---  0x05ba23a8 0x00a0d623 0x00000000 libart.so..debug_loc
0x0000001d dwarf-ranges                                             ---  0x065af9cb 0x0029c7b0 0x00000000 libart.so..debug_ranges
0x0000001e dwarf-macro                                              ---  0x0684c17b 0x000ada15 0x00000000 libart.so..debug_macro
0x0000001f dwarf-str                                                ---  0x068f9b90 0x004a45e9 0x00000030 libart.so..debug_str
0x00000020 dwarf-frame                                              ---  0x06d9e17c 0x0003fc6c 0x00000000 libart.so..debug_frame
0x00000021 regular                                                  ---  0x06dddde8 0x0000001c 0x00000000 libart.so..note.gnu.gold-version
0x00000022 regular                                                  ---  0x06ddde04 0x00000038 0x00000000 libart.so..ARM.attributes
0x00000023 elf-symbol-table                                         ---  0x06ddde3c 0x00056c30 0x00000000 libart.so..symtab
0x00000024 regular                                                  ---  0x06e34a6c 0x0006c9e6 0x00000000 libart.so..strtab
0x00000025 regular                                                  ---  0x06ea1452 0x0000017b 0x00000000 libart.so..shstrtab
Note how all DWARF sections have no file address. They are known to not be valid addresses. Are you seeing something different with your ELF file? Or do you have two ELF files? One without symbols and one with? The output of "image dump sections" should look like above where no DWARF sections have anything valid in the "File Address" column.
The output of image dump sections shows all DWARF sections with no file address.
The issue was when I tried to set a breakpoint on address 0xbcf0. Symtab::FindSymbolContainingFileAddress returns $debug_ranges627 instead of the main symbol. Hence setting the breakpoint fails.
2: 000000000000bcc4 1664 FUNC GLOBAL DEFAULT 10 main
49686: 000000000000bcf0 0 NOTYPE LOCAL DEFAULT 40 $debug_ranges627
So the issue is with ObjectFileELF when it makes its symbol table. It is taking this symbol:
49686: 000000000000bcf0 0 NOTYPE LOCAL DEFAULT 40 $debug_ranges627
And saying it is a code symbol. This symbol has NOTYPE on it, not FUNC like the main symbol. Fix ObjectFileELF to give an appropriate lldb::SymbolType value for it. It shouldn't be lldb::SymbolType::eSymbolTypeCode. So set all NOTYPE to lldb::SymbolType::eSymbolTypeInvalid or add a new enum that makes sense.
For $debug_ranges627 we are getting the correct symbol type, i.e. lldb::SymbolType::eSymbolTypeInvalid. The function Symbol::FindSymbolAtFileAddress(lldb::addr_t file_addr) searches for the symbol at file_addr. It returns the symbol $debug_ranges627 with type lldb::SymbolType::eSymbolTypeInvalid. The symbol we actually want for address 0xbcf0 is main, but RangeDataVector::FindEntryThatContains(B addr) returns $debug_ranges627.
000000000000bcc4 1664 FUNC GLOBAL DEFAULT 10 main
Should we add FindSymbolContainingFileAddressAndType(), which would return the correct symbol based on its type?
ObjectFile::GetAddressClass(addr_t file_addr) {
  Symtab *symtab = GetSymtab();
  if (symtab) {
    Symbol *symbol = symtab->FindSymbolContainingFileAddress(file_addr);
    // symtab->FindSymbolContainingFileAddressAndType(file_addr, eSymbolTypeCode)
    if (symbol) {
      ...
      ...
      ...
    }
    ...
    ...
    ...
  }
  ...
  ...
  ...
}
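A minimal, self-contained sketch of what a type-filtered lookup could look like. The types and names here (Sym, SymType, FindContainingOfType, ExampleSymtab) are hypothetical stand-ins, not LLDB's Symtab API; the two entries mirror the symbols from the log. Treating a zero-sized symbol as covering only its own start address is an assumption of this sketch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; these are not LLDB types.
enum class SymType { Invalid, Code };

struct Sym {
  std::string name;
  uint64_t file_addr;
  uint64_t size;
  SymType type;
};

// True if addr lies inside the symbol's range. A zero-sized symbol is
// treated as covering only its start address (an assumption of this sketch).
bool Contains(const Sym &s, uint64_t addr) {
  if (s.size == 0)
    return addr == s.file_addr;
  return addr >= s.file_addr && addr - s.file_addr < s.size;
}

// Hypothetical analogue of the proposed
// FindSymbolContainingFileAddressAndType(): only symbols of the requested
// type are considered. Returns nullptr when nothing matches.
const Sym *FindContainingOfType(const std::vector<Sym> &symtab, uint64_t addr,
                                SymType type) {
  for (const Sym &s : symtab)
    if (s.type == type && Contains(s, addr))
      return &s;
  return nullptr;
}

// The two symbols from the log that compete for file address 0xbcf0.
const std::vector<Sym> &ExampleSymtab() {
  static const std::vector<Sym> table = {
      {"$debug_ranges627", 0xbcf0, 0, SymType::Invalid},
      {"main", 0xbcc4, 1664, SymType::Code},
  };
  return table;
}
```

With this table, asking for a code symbol at 0xbcf0 skips $debug_ranges627 and finds main, since 0xbcf0 lies within main's 1664 bytes starting at 0xbcc4.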
I was out on a week-long vacation, sorry for the delay. The main question is: should symbols with NOTYPE actually be included in any address lookups? I would vote no: how can one address resolve to more than one section? That doesn't make sense to me. One address should resolve to one and only one section. Sounds like the NOTYPE symbols should just say they don't have a section. Thoughts?
I think a better solution is to not give the symbol's address a valid section in the first place. This would be done in ObjectFileELF.cpp. It doesn't seem like the section is valid, because $debug_ranges627 shouldn't have an address that matches something in .text. It just doesn't make sense.
My proposed fix would be in ObjectFileELF.cpp in ObjectFileELF::ParseSymbols(). The current code looks like:
Elf64_Half section_idx = symbol.st_shndx;
switch (section_idx) {
case SHN_ABS:
  symbol_type = eSymbolTypeAbsolute;
  break;
case SHN_UNDEF:
  symbol_type = eSymbolTypeUndefined;
  break;
default:
  symbol_section_sp = section_list->GetSectionAtIndex(section_idx);
  break;
}
It could be fixed with:
Elf64_Half section_idx = symbol.st_shndx;
switch (section_idx) {
case SHN_ABS:
  symbol_type = eSymbolTypeAbsolute;
  break;
case SHN_UNDEF:
  symbol_type = eSymbolTypeUndefined;
  break;
default:
  if (symbol.getType() != STT_NOTYPE)
    symbol_section_sp = section_list->GetSectionAtIndex(section_idx);
  break;
}
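The effect of that change can be tried out in isolation. Below is a rough standalone model of the switch, not the ObjectFileELF code itself: Resolved, SymKind and ResolveSection are hypothetical names, and the section is represented by its index rather than a SectionSP. The numeric constants are the standard ELF values.

```cpp
#include <cstdint>

// Standard ELF constants, renamed to avoid clashing with <elf.h>.
constexpr uint16_t kShnUndef = 0;
constexpr uint16_t kShnAbs = 0xfff1;
constexpr uint8_t kSttNotype = 0;

// Hypothetical stand-ins for the symbol type and section result.
enum class SymKind { Unset, Absolute, Undefined };

struct Resolved {
  SymKind kind;
  int section_index; // -1 means "no section assigned"
};

// Models the proposed switch: a symbol only gets a section when it is not
// SHN_ABS, not SHN_UNDEF, and its type is something other than STT_NOTYPE.
Resolved ResolveSection(uint16_t st_shndx, uint8_t st_info) {
  const uint8_t type = st_info & 0x0f; // low nibble of st_info is the type
  Resolved r{SymKind::Unset, -1};
  switch (st_shndx) {
  case kShnAbs:
    r.kind = SymKind::Absolute;
    break;
  case kShnUndef:
    r.kind = SymKind::Undefined;
    break;
  default:
    if (type != kSttNotype)
      r.section_index = st_shndx;
    break;
  }
  return r;
}
```

For the two symbols in the log, main (FUNC GLOBAL, st_info 0x12, index 10) keeps section 10, while $debug_ranges627 (NOTYPE LOCAL, st_info 0x00, index 40) ends up with no section.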
It would be a good idea to figure out what this $debug_ranges627 symbol actually is, so we can determine how to deal with it correctly. It seems like a stray linker symbol that wasn't stripped?
The $debug_rangesN symbols are added by the linker. These symbols are not used anywhere. We can skip symbols of this kind (where the symbol type is STT_NOTYPE and its section is a debug section).
So I looked at the instances of STT_NOTYPE in a few shared libraries on my computer and they do seem to have valid addresses in them.
there are two sections (.text and .debug_ranges) for the file address 0xbcf0.
I don't see that from the log. I cleaned up the output a bit:
Section Headers:
[Nr] Name              Type          Address          Offset   Size             EntSize          Flags Link Info  Align
[ 0]                   NULL          0000000000000000 00000000 0000000000000000 0000000000000000       0    0     0
[ 1] .MIPS.abiflags    MIPS_ABIFLAGS 00000000000002a8 000002a8 0000000000000018 0000000000000018 A     0    0     8
[ 2] .MIPS.options     MIPS_OPTIONS  00000000000002c0 000002c0 0000000000000370 0000000000000001 Ao    0    0     8
[ 3] .dynamic          DYNAMIC       0000000000000630 00000630 0000000000000270 0000000000000010 A     6    0     8
[ 4] .hash             HASH          00000000000008a0 000008a0 000000000000064c 0000000000000004 A     5    0     8
[ 5] .dynsym           DYNSYM        0000000000000ef0 00000ef0 0000000000001320 0000000000000018 A     6    2     8
[ 6] .dynstr           STRTAB        0000000000002210 00002210 0000000000002601 0000000000000000 A     0    0     1
[ 7] .gnu.version      VERSYM        0000000000004812 00004812 0000000000000198 0000000000000002 A     5    0     2
[ 8] .gnu.version_r    VERNEED       00000000000049b0 000049b0 0000000000000040 0000000000000000 A     6    2     8
[ 9] .rel.dyn          REL           00000000000049f0 000049f0 0000000000001ae0 0000000000000010 A     5    0     8
[10] .text             PROGBITS      000000000000bb80 0000bb80 0000000000054380 0000000000000000 AX    0    0     16
[11] .MIPS.stubs       PROGBITS      000000000005ff00 0005ff00 0000000000000220 0000000000000000 AX    0    0     8
[12] .rodata           PROGBITS      0000000000060120 00060120 0000000000003170 0000000000000000 A     0    0     16
[13] .interp           PROGBITS      0000000000063290 00063290 0000000000000015 0000000000000000 A     0    0     1
[14] .eh_frame_hdr     PROGBITS      00000000000632a8 000632a8 000000000000079c 0000000000000000 A     0    0     4
[15] .note.android.ide NOTE          0000000000063a44 00063a44 0000000000000098 0000000000000000 A     0    0     4
[16] .eh_frame         PROGBITS      0000000000074140 00064140 00000000000031a0 0000000000000000 WA    0    0     16
[17] .gcc_except_table PROGBITS      00000000000772e0 000672e0 0000000000000c78 0000000000000000 WA    0    0     4
[18] .preinit_array    PREINIT_ARRAY 0000000000077f58 00067f58 0000000000000010 0000000000000000 WA    0    0     8
[19] .init_array       INIT_ARRAY    0000000000077f68 00067f68 0000000000000010 0000000000000000 WA    0    0     8
[20] .fini_array       FINI_ARRAY    0000000000077f78 00067f78 0000000000000010 0000000000000000 WA    0    0     8
[21] .ctors            PROGBITS      0000000000077f88 00067f88 0000000000000008 0000000000000000 WA    0    0     8
[22] .dtors            PROGBITS      0000000000077f90 00067f90 0000000000000008 0000000000000000 WA    0    0     8
[23] .data.rel.ro      PROGBITS      0000000000077fa0 00067fa0 0000000000001060 0000000000000000 WA    0    0     16
[24] .data             PROGBITS      0000000000079000 00069000 0000000000000040 0000000000000000 WA    0    0     16
[25] .rld_map          PROGBITS      0000000000079040 00069040 0000000000000008 0000000000000000 WA    0    0     8
[26] .got              PROGBITS      0000000000079050 00069050 00000000000006b8 0000000000000008 WAp   0    0     16
[27] .bss              NOBITS        0000000000079710 00069708 0000000000000520 0000000000000000 WA    0    0     16
[28] .comment          PROGBITS      0000000000000000 00069708 0000000000000064 0000000000000001 MS    0    0     1
[29] .pdr              PROGBITS      0000000000000000 0006976c 0000000000002ae0 0000000000000000       0    0     4
[30] .debug_aranges    MIPS_DWARF    0000000000000000 0006c24c 0000000000000560 0000000000000000       0    0     1
[31] .debug_pubnames   MIPS_DWARF    0000000000000000 0006c7ac 000000000002b3f7 0000000000000000       0    0     1
[32] .debug_info       MIPS_DWARF    0000000000000000 00097ba3 00000000000972dc 0000000000000000       0    0     1
[33] .debug_abbrev     MIPS_DWARF    0000000000000000 0012ee7f 0000000000003c5f 0000000000000000       0    0     1
[34] .debug_line       MIPS_DWARF    0000000000000000 00132ade 0000000000035d51 0000000000000000       0    0     1
[35] .debug_frame      MIPS_DWARF    0000000000000000 00168830 0000000000002688 0000000000000000       0    0     8
[36] .debug_str        MIPS_DWARF    0000000000000000 0016aeb8 000000000005a6e3 0000000000000001 MS    0    0     1
[37] .debug_loc        MIPS_DWARF    0000000000000000 001c559b 000000000006e1e5 0000000000000000       0    0     1
[38] .debug_macinfo    MIPS_DWARF    0000000000000000 00233780 0000000000000010 0000000000000000       0    0     1
[39] .debug_pubtypes   MIPS_DWARF    0000000000000000 00233790 0000000000010eaf 0000000000000000       0    0     1
[40] .debug_ranges     MIPS_DWARF    0000000000000000 0024463f 000000000007b8f0 0000000000000000       0    0     1
[41] .gnu.attributes   LOOS+ffffff5  0000000000000000 002bff2f 0000000000000010 0000000000000000       0    0     1
[42] .shstrtab         STRTAB        0000000000000000 002bff3f 00000000000001ea 0000000000000000       0    0     1
[43] .symtab           SYMTAB        0000000000000000 002c0130 0000000000147f60 0000000000000018       44   55368 8
[44] .strtab           STRTAB        0000000000000000 00408090 0000000000091037 0000000000000000       0    0     1
Looking only at .text and .debug_ranges:
[Nr] Name              Type          Address          Offset   Size             EntSize          Flags Link Info  Align
[10] .text             PROGBITS      000000000000bb80 0000bb80 0000000000054380 0000000000000000 AX    0    0     16
[40] .debug_ranges     MIPS_DWARF    0000000000000000 0024463f 000000000007b8f0 0000000000000000       0    0     1
These don't overlap. .debug_ranges doesn't really have any valid addresses. ".debug_ranges" has an address of zero, but that doesn't mean it has a real "file address". We consider a file address to be a valid address that will eventually map into a process when it is loaded. Sections need to have the ability to say "I am never going to be loaded into memory in a process". Then each ObjectFile subclass, when it creates its sections, would need to set this bit correctly. For ObjectFileELF, this would mean we need to check the sh_flags in a section for the SHF_ALLOC bit. This bit, from the ELF spec, is documented as:
SHF_ALLOC: The section occupies memory during process execution. Some control sections do not reside in the memory image of an object file; this attribute is off for those sections.
The "A" character in the flags column above shows the SHF_ALLOC value for each section. We can see that many sections toward the end do not get loaded and thus should never be included when looking up a file address.
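As a small illustration of reading that flag, here is a sketch that filters a handful of the sections from the log by SHF_ALLOC. SectionHeader, OccupiesMemory, LoadedSectionNames and ExampleHeaders are hypothetical names for this example; the flag bit values are the standard ones from the ELF specification, and the flag letters in the comments are the ones printed in the log.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// sh_flags bit values from the ELF specification.
constexpr uint64_t kShfWrite = 0x1;
constexpr uint64_t kShfAlloc = 0x2;
constexpr uint64_t kShfExecInstr = 0x4;
constexpr uint64_t kShfMerge = 0x10;
constexpr uint64_t kShfStrings = 0x20;

// Hypothetical, trimmed-down section header: just a name and its flags.
struct SectionHeader {
  std::string name;
  uint64_t sh_flags;
};

// A section occupies memory during execution only if SHF_ALLOC is set.
bool OccupiesMemory(const SectionHeader &h) {
  return (h.sh_flags & kShfAlloc) != 0;
}

// Collects the names of the sections that would be loaded into a process.
std::vector<std::string>
LoadedSectionNames(const std::vector<SectionHeader> &headers) {
  std::vector<std::string> names;
  for (const SectionHeader &h : headers)
    if (OccupiesMemory(h))
      names.push_back(h.name);
  return names;
}

// Four sections taken from the log above.
std::vector<SectionHeader> ExampleHeaders() {
  return {
      {".text", kShfAlloc | kShfExecInstr},  // "AX"
      {".bss", kShfWrite | kShfAlloc},       // "WA"
      {".comment", kShfMerge | kShfStrings}, // "MS"
      {".debug_ranges", 0},                  // no flags
  };
}
```

Only .text and .bss survive the filter; .comment and .debug_ranges are never part of the memory image and so should not take part in file-address lookups.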
One easy way to say that a section has no file address is to set the Section's file address to LLDB_INVALID_ADDRESS for any ELF section that has sh_flags with SHF_ALLOC not set. So one fix would be to fix ObjectFileELF::CreateSections().
Existing code today is:
SectionSP section_sp(new Section(
    GetModule(),         // Module to which this section belongs.
    this,                // ObjectFile to which this section belongs and should read
                         // section data from.
    SectionIndex(I),     // Section ID.
    name,                // Section name.
    sect_type,           // Section type.
    header.sh_addr,      // VM address.
    vm_size,             // VM size in bytes of this section.
    header.sh_offset,    // Offset of this section in the file.
    file_size,           // Size of the section as found in the file.
    log2align,           // Alignment of the section
    header.sh_flags,     // Flags for this section.
    target_bytes_size)); // Number of host bytes per target byte
And probably should be:
const addr_t sect_file_addr =
    (header.sh_flags & SHF_ALLOC) ? header.sh_addr : LLDB_INVALID_ADDRESS;
SectionSP section_sp(new Section(
    GetModule(),         // Module to which this section belongs.
    this,                // ObjectFile to which this section belongs and should read
                         // section data from.
    SectionIndex(I),     // Section ID.
    name,                // Section name.
    sect_type,           // Section type.
    sect_file_addr,      // VM address.
    vm_size,             // VM size in bytes of this section.
    header.sh_offset,    // Offset of this section in the file.
    file_size,           // Size of the section as found in the file.
    log2align,           // Alignment of the section
    header.sh_flags,     // Flags for this section.
    target_bytes_size)); // Number of host bytes per target byte
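To see why the sentinel matters for the 0xbcf0 case, here is a standalone sketch. kInvalidAddress stands in for LLDB_INVALID_ADDRESS, and SectionFileAddress and ContainsFileAddress are hypothetical helpers, not LLDB functions. Whether the real Section lookup code treats an invalid file address this way is an assumption that would need checking.

```cpp
#include <cstdint>

constexpr uint64_t kShfAlloc = 0x2;              // SHF_ALLOC from the ELF spec
constexpr uint64_t kInvalidAddress = UINT64_MAX; // stand-in for LLDB_INVALID_ADDRESS

// Mirrors the proposed sect_file_addr expression: sections that are never
// loaded get no file address at all.
uint64_t SectionFileAddress(uint64_t sh_addr, uint64_t sh_flags) {
  return (sh_flags & kShfAlloc) ? sh_addr : kInvalidAddress;
}

// Hypothetical containment check that refuses to match a section without a
// valid file address.
bool ContainsFileAddress(uint64_t sect_file_addr, uint64_t size,
                         uint64_t addr) {
  if (sect_file_addr == kInvalidAddress)
    return false;
  return addr >= sect_file_addr && addr - sect_file_addr < size;
}
```

With the values from the log, .debug_ranges (sh_addr 0, size 0x7b8f0, no flags) would span [0x0, 0x7b8f0) if its zero address were taken literally, which covers 0xbcf0. With the sentinel it no longer matches, while .text (sh_addr 0xbb80, size 0x54380, flags AX) still does.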
Maybe back out your current change and try this out?