Page MenuHomePhabricator

[lld-macho] Emit STABS symbols for debugging, and drop debug sections

Authored by int3 on Oct 12 2020, 11:28 AM.



Debug sections contain a large amount of data. In order not to bloat the size
of the final binary, we remove them and instead emit STABS symbols for
dsymutil and the debugger to locate their contents in the object files.

With this diff, dsymutil is able to locate the debug info. However, we need
a few more features before lldb is able to work well with our binaries --
e.g. having LC_DYSYMTAB accurately reflect the number of local symbols,
emitting LC_UUID, and more. Those will be handled in follow-up diffs.

Note also that the STABS we emit differ slightly from what ld64 does. First, we
emit the path to the source file as one N_SO symbol instead of two. (ld64
emits one N_SO for the dirname and one of the basename.) Second, we do not
emit N_BNSYM and N_ENSYM STABS to mark the start and end of functions,
because the N_FUN STABS already serve that purpose. @clayborg recommended
these changes based on his knowledge of what the debugging tools look for.

Additionally, this current implementation doesn't accurately reflect the size
of function symbols. It uses the size of their containing sectioins as a proxy,
but that is only accurate if .subsections_with_symbols is set, and if there
isn't an N_ALT_ENTRY in that particular subsection. I think we have two
options to solve this:

  1. We can split up subsections by symbol even if .subsections_with_symbols is not set, but include constraints to ensure those subsections retain their order in the final output. This is ld64's approach.
  2. We could just add a size field to our Symbol class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now.

Diff Detail

Event Timeline

int3 created this revision.Oct 12 2020, 11:28 AM
Herald added a project: Restricted Project. · View Herald Transcript
int3 requested review of this revision.Oct 12 2020, 11:28 AM
int3 updated this revision to Diff 297745.Oct 12 2020, 7:08 PM

don't run test on windows; fixing the paths there is more work than I'd like to do for now

smeenai added a subscriber: smeenai.Mon, Nov 9, 6:41 PM

We could just add a size field to our Symbol class. This seems simpler, and I'm more inclined toward it, but I'm not sure if there are use cases that it doesn't handle well. As such I'm punting on the decision for now.

Yeah, I like this more too, but punting on it until later is fine.

As far as I can see, we only need a very small amount of information from the DWARF in the object file (the source file information). Do you know if the DWARF parsing is done lazily (so we only pay the cost for the parts of the DWARF we care about), or if it'll attempt to parse the entirety of the object file's DWARF (which would be wasteful)?


I guess we can optimize this later to not duplicate strings in the string table? (Since the symbol table entry will also add the symbol name.)


I think this is valid. How does ld64 make this determination though?

clayborg requested changes to this revision.Mon, Nov 9, 11:07 PM
clayborg added inline comments.

stab.value should be set to the modification date of the .o file here. If this is a path to a .o file, then the mod time of the .o file, and if this is a "foo.a(bar.o)", it needs to be the modification time of the .o file as mentioned in the BSD archive file data structures. This is important because sometimes .a file have multiple .o files with the same name and the only way to tell them apart is the mod time.


"stab.sect" needs to be set to be the 1 based mach-o section index that contains the "stab.value" address within the mach-o file. Any symbol that you emit that has an address in "stab.sect" has its "stab.sect" set to the 1 based section index. So you take all of the sections from all LC_SEGMENT or LC_SEGMENT_64 load commands and make a list of them and the first one will have index 1. From my python tool I have something that dumps the sections and shows the index:

$ a.out --sections
=========== ===== ------------------ ------------------ ---------- ---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------------------
0x000000b0: [  1] 0x0000000100000ec0 0x00000000000000c1 0x00000ec0 0x00000004 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__text
0x00000100: [  2] 0x0000000100000f82 0x0000000000000006 0x00000f82 0x00000001 0x00000000 0x00000000 0x80000408 0x00000000 0x00000006 0x00000000 __TEXT.__stubs
0x00000150: [  3] 0x0000000100000f88 0x000000000000001a 0x00000f88 0x00000002 0x00000000 0x00000000 0x80000400 0x00000000 0x00000000 0x00000000 __TEXT.__stub_helper
0x000001a0: [  4] 0x0000000100000fa2 0x000000000000000d 0x00000fa2 0x00000000 0x00000000 0x00000000 0x00000002 0x00000000 0x00000000 0x00000000 __TEXT.__cstring
0x000001f0: [  5] 0x0000000100000fb0 0x0000000000000048 0x00000fb0 0x00000002 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __TEXT.__unwind_info
0x00000288: [  6] 0x0000000100001000 0x0000000000000008 0x00001000 0x00000003 0x00000000 0x00000000 0x00000006 0x00000001 0x00000000 0x00000000 __DATA_CONST.__got
0x00000320: [  7] 0x0000000100002000 0x0000000000000008 0x00002000 0x00000003 0x00000000 0x00000000 0x00000007 0x00000002 0x00000000 0x00000000 __DATA.__la_symbol_ptr
0x00000370: [  8] 0x0000000100002008 0x0000000000000008 0x00002008 0x00000003 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 __DATA.__data

LLDB uses this section index to grab the right section when it parses the symbol.


Yes, this should be uniqued if it isn't happening already. Large C++ compiles will really trigger this and cause huge symbol table bloat.


Yes, we need to emit N_GSYM symbols for globals, and N_STSYM symbols for static variables. Those need to be linked in the debug info.

N_STSYM always has the address set to a valid address. The stab entry should have the "value" set to the address, and "sect" set to the right section index just like the N_FUN entry.

N_GSYM may or may not have the address filed in. If the address of the global can be calculated, the "value" should be set to the address, and "sect" set to the right section index just like the N_FUN entry. If the address can't be calculated, N_GSYM can be zero and we will need to find the global by matching up with the non STAB symbol whose name must match exactly (and that symbol will have "value" and "sect" set correctly).


Not necessarily. Read only globals can be in __text IIRC on ARM and possibly ARM64 or even other architectures.


Is this truly unreachable code?


Is this list sorted by file? We don't want to be emitting multiple N_SO + N_OSO entries for the same source file. dsymutil will be a lot less efficient if we have a few entries for "/tmp/foo.cpp" and "/tmp/foo.o" and then a few function, then some other source + object file, and then "/tmp/foo.cpp" and "/tmp/foo.o" again.


So in mach-o files, and preparing for lld emitting a LC_DYSYMTAB load command, all local symbols must be in a contiguous block of symbols. In fact all locals, external, and undefined symbols must be in a contiguous block. That is required because if "ilocalsym" represents the first local symbol table index of all of the local symbols and "nlocalsym" is the out. Same for external with "iextdefsym" and "nextdefsym", and for undefined symbols with "iundefsym" and "nundefsym".

$ otool -lv a.out
Load command 7
            cmd LC_DYSYMTAB
        cmdsize 80
      ilocalsym 0
      nlocalsym 17
     iextdefsym 17
     nextdefsym 4
      iundefsym 21
      nundefsym 2
This revision now requires changes to proceed.Mon, Nov 9, 11:07 PM
int3 marked 3 inline comments as done.Tue, Nov 10, 4:49 PM

@smeenai Dwarf parsing appears to be on-demand, though I don't know if it's parsing just the subset of things we end up using. Specifically, compile_units() calls parseNormalUnits, which parses all units of type DW_SECT_INFO and DW_SECT_EXT_TYPES. I'll add a TODO to investigate more later.

@clayborg thanks for all the insightful comments! I'll work the modtime and n_sect follow-up diffs tomorrow so that they can be reviewed as a coherent whole.


yeah I was going to do it in a follow-up. Will add a TODO here. Didn't think about the archive case though, thanks for the tip!


Ah, I was following ld64's code here which also hard-codes a sect value of 1 for N_FUN stabs, but I didn't understand the significance of the value. Thanks for the explanation! I think it works out in ld64's case because ld64 puts all functions in __text in the final binary. Functions that are in non-text sections in the input object files all get coalesced into __text. LLD has yet to implement that behavior, but we plan on doing so.

I think I'll add support for this in the same diff where I'll add support for N_GSYM. That way we can write a test for values of n_sect other than 1.


ld64 doesn't appear to do it, but it does seem like a worthwhile optimization. I'll add a TODO inside StringTableSection::addString.


ld64 checks if their atoms have a type of typeCode. That type is determined by their sectionType() method, which uses a combination of input section flags and section names to determine it. I *think* any __text input section will be of typeCode, but I'll have to investigate further. I think we can punt on this.


The only file types are object files, bitcode files, archives, and dylibs. We extract ObjFiles from ArchiveFiles & create them from BitcodeFiles before processing their InputSections, so we shouldn't see either of those here. And we shouldn't be emitting debug info for functions that our output binary references from dylibs (I think... correct me if I'm wrong).


yeah sorting did occur to me but I thought I'd implement it later. It's probably quite straightforward though, so I can do it in this diff

int3 marked 2 inline comments as done.Wed, Nov 11, 8:14 PM
int3 added inline comments.
int3 updated this revision to Diff 308547.Mon, Nov 30, 10:35 PM
int3 marked 3 inline comments as done.


int3 added a comment.Mon, Nov 30, 10:35 PM

Sorry this took a while... got a bit sidetracked


Addressed in D92366: [lld-macho] Flesh out STABS implementation. Though I'm not sure how to test the case where we have a defined global whose address can't be calculated. How does that situation arise?


After thinking about it a bit more, we should really refactor things such that this cast isn't necessary. But it's a bit hairy, so not in this diff

clayborg requested changes to this revision.Mon, Nov 30, 11:19 PM

So marking as needing changes because the size of the N_FUN entries must be correct. dsymutil creates an address map when remapping symbols with ranges and if the size of the N_FUN stabs entry is too large, it will cause major problems when linking the DWARF. We should also find a rock solid way to identify N_FUN entries if possible instead of relying on the section name being "__text".


Sounds good!


Sounds good.


This can't be wrong and must be correct. If it is wrong, dsymutil will link the wrong address ranges for each function and really bad DWARF will result.


This can be done by just declaring a global variable and not assigning it a value:

$ cat main.c
int global;

int main(int argc, const char **argv) {
  return 0;
$ clang -c -g main.c
$ clang -g main.o
$ dsymutil -s a.out 
Symbol table for: 'a.out' (x86_64)
Index    n_strx   n_type             n_sect n_desc n_value
======== -------- ------------------ ------ ------ ----------------
[     0] 00000035 64 (N_SO         ) 00     0000   0000000000000000 '/tmp/'
[     1] 0000003b 64 (N_SO         ) 00     0000   0000000000000000 'main.c'
[     2] 00000042 66 (N_OSO        ) 03     0001   000000005fc5ec13 '/private/tmp/main.o'
[     3] 00000001 2e (N_BNSYM      ) 01     0000   0000000100003fa0
[     4] 00000056 24 (N_FUN        ) 01     0000   0000000100003fa0 '_main'
[     5] 00000001 24 (N_FUN        ) 00     0000   0000000000000016
[     6] 00000001 4e (N_ENSYM      ) 01     0000   0000000000000016
[     7] 0000005c 20 (N_GSYM       ) 00     0000   0000000000000000 '_global'
[     8] 00000001 64 (N_SO         ) 01     0000   0000000000000000
[     9] 00000002 0f (     SECT EXT) 01     0010   0000000100000000 '__mh_execute_header'
[    10] 00000016 0f (     SECT EXT) 03     0000   0000000100004000 '_global'
[    11] 0000001e 0f (     SECT EXT) 01     0000   0000000100003fa0 '_main'
[    12] 00000024 01 (     UNDF EXT) 00     0100   0000000000000000 'dyld_stub_binder'

Note how symbol 7 has no valid address even though the actual non debug symbol does symbol 10. If lld can always emit the valid address for the STAB entry, that is quite ok. Just noting that ld64 doesn't always know it for some reason (or what ever linker the clang driver is evoking these days...).


I think we really need to get this right. Any section can contain functions. Most of the time, functions are in __text, but not always.



This revision now requires changes to proceed.Mon, Nov 30, 11:19 PM
int3 marked 9 inline comments as done.Tue, Dec 1, 2:16 PM
int3 added inline comments.

I'd like to punt on this for now. The goal for now is not to get the STABS implementation to 100% correctness, but to create sort-of-working binaries for local testing. As mentioned in the commit message, I'm on the fence about the best way to store symbol size in LLD's current architecture, and I think it will be clearer once more of the linker has been implemented.


ah... yeah LLD is able to emit the address for BSS symbols. I've added a test for that particular case in the other diff

int3 requested review of this revision.Tue, Dec 1, 2:22 PM
int3 marked 2 inline comments as done.
clayborg accepted this revision.Tue, Dec 1, 2:38 PM
clayborg added inline comments.

understood, just wanted you to be aware that debugging these binaries or using a dSYM file generated from these binaries will have issues until this is fixed.

This revision is now accepted and ready to land.Tue, Dec 1, 2:38 PM
This revision was landed with ongoing or failed builds.Tue, Dec 1, 3:05 PM
This revision was automatically updated to reflect the committed changes.