In addition to the traditional nlist records in Mach-O binaries, the dynamic linker dyld has a trie structure that it uses to record symbols with external linkage. dyld uses this structure when resolving symbols at runtime, so the nlist records can be completely stripped from a binary without detriment. Until now, lldb has only read the special class of reexport symbols out of the dyld trie.
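For readers unfamiliar with the export trie, here is a minimal sketch of walking one. It is not lldb's ParseTrieEntries; the names `ExportedSymbol`, `ReadULEB128`, and `WalkExportTrie` are hypothetical, and the node layout it assumes (ULEB128 terminal size, then flags and address, then a one-byte child count followed by NUL-terminated edge strings with ULEB128 child offsets) is the commonly documented dyld export-trie encoding. Reexport entries are skipped here because they carry an ordinal and an import name instead of an address.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record for one exported symbol; not an lldb type.
struct ExportedSymbol {
  std::string name;
  uint64_t flags;
  uint64_t address;
};

// Assumed value of EXPORT_SYMBOL_FLAGS_REEXPORT from <mach-o/loader.h>.
static const uint64_t kReexportFlag = 0x08;

// Decode one ULEB128 value at `offset`, advancing `offset` past it.
static uint64_t ReadULEB128(const std::vector<uint8_t> &data, size_t &offset) {
  uint64_t result = 0;
  unsigned shift = 0;
  while (offset < data.size()) {
    uint8_t byte = data[offset++];
    if (shift < 64)
      result |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0)
      break;
  }
  return result;
}

// Recursively visit the node at `offset` (relative to the trie start),
// appending every non-reexport terminal entry to `out`.
static void WalkExportTrie(const std::vector<uint8_t> &data, size_t offset,
                           const std::string &prefix,
                           std::vector<ExportedSymbol> &out,
                           unsigned depth = 0) {
  if (offset >= data.size() || depth > 128) // guard against malformed input
    return;
  uint64_t terminal_size = ReadULEB128(data, offset);
  size_t children_offset = offset + static_cast<size_t>(terminal_size);
  if (terminal_size != 0) {
    uint64_t flags = ReadULEB128(data, offset);
    if ((flags & kReexportFlag) == 0) {
      uint64_t address = ReadULEB128(data, offset);
      out.push_back({prefix, flags, address});
    }
  }
  if (children_offset >= data.size())
    return;
  offset = children_offset;
  uint8_t child_count = data[offset++];
  for (uint8_t i = 0; i < child_count; ++i) {
    // Each edge is a NUL-terminated string fragment of the symbol name.
    std::string edge;
    while (offset < data.size() && data[offset] != 0)
      edge.push_back(static_cast<char>(data[offset++]));
    ++offset; // skip the NUL terminator
    uint64_t child_offset = ReadULEB128(data, offset);
    WalkExportTrie(data, static_cast<size_t>(child_offset), prefix + edge, out,
                   depth + 1);
  }
}
```

Calling `WalkExportTrie(bytes, 0, "", symbols)` on the raw trie bytes fills `symbols` with one entry per exported name. The depth limit and bounds checks are defensive choices for this sketch, not a description of what lldb does.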
This patch has four parts:
- Remove the checks that treat an empty string table / nlist table as meaning there is no symbol table.
- Change ParseTrieEntries to recognize the externally-visible symbols and add them to a second array of TrieEntries.
- After populating the symbol table from the nlist records, look for any matching symbol addresses that we read from the trie, and mark them as already seen so we don't add duplicate symbols to the table.
- Add the trie entries that hadn't already been seen, and mark any function starts with those addresses as already-added.
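The de-duplication described in the last two steps can be sketched roughly as follows. This is a simplified illustration, not the code in ObjectFileMachO::ParseSymtab: `SimpleSymbol` and `MergeSymbols` are hypothetical names, matching is done purely on address, and the handling of function starts (adding an unnamed placeholder symbol for any start address that has no symbol yet) is an assumption made to keep the example self-contained.

```cpp
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-in for a symbol table entry.
struct SimpleSymbol {
  std::string name;
  uint64_t address;
};

// Build a symbol table from nlist symbols first, then add only those trie
// exports whose address has not been seen, then add placeholders for any
// function start addresses still unaccounted for.
std::vector<SimpleSymbol>
MergeSymbols(const std::vector<SimpleSymbol> &nlist_symbols,
             const std::vector<SimpleSymbol> &trie_exports,
             const std::vector<uint64_t> &function_starts) {
  std::vector<SimpleSymbol> table = nlist_symbols;
  std::unordered_set<uint64_t> seen_addresses;

  // Every nlist symbol marks its address as already seen.
  for (const SimpleSymbol &sym : nlist_symbols)
    seen_addresses.insert(sym.address);

  // insert().second is false when the address was already present, so a
  // trie export is added only if no earlier symbol claimed that address.
  for (const SimpleSymbol &exported : trie_exports) {
    if (seen_addresses.insert(exported.address).second)
      table.push_back(exported);
  }

  // Function starts covered by an nlist or trie symbol are skipped; the
  // rest get an unnamed placeholder (an assumption of this sketch).
  for (uint64_t start : function_starts) {
    if (seen_addresses.insert(start).second)
      table.push_back({std::string(), start});
  }
  return table;
}
```

With a fully stripped binary the nlist input is empty, so every trie export is added; with an unstripped binary the trie exports all match nlist addresses and nothing is duplicated.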
There is a test case that has a variety of text and data symbols with different linkage visibility, and a test case that checks that we don't have duplicate symbol table entries and that we can still find the externally visible symbols after stripping the binary.
The patch at this point is pretty straightforward. It's easy to make mistakes in ObjectFileMachO::ParseSymtab, and in the process of writing this I think I made all of them. But I'm open to any feedback about how things might be done more clearly.
Adrian, I wasn't sure how well I'm conforming to best practices in the testsuite Makefile, where I'm compiling my main.cpp as a dylib, then making a stripped copy. This works, but if you have a chance to look at it and provide feedback, I would appreciate it.
The only bits that are used in this field are:
So why not just set the highest bit and avoid clobbering all of the other flags?