Page MenuHomePhabricator

[lld-macho] create __TEXT,__unwind_info from __LD,__compact_unwind

Authored by gkm on Aug 28 2020, 12:43 PM.



Digest the input __LD,__compact_unwind and produce the output __TEXT,__unwind_info. This is the initial commit with the major functionality.

Successor commits will add handling for ...

  • __TEXT,__eh_frame
  • personalities & LSDA
  • -r pass-through

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
gkm updated this revision to Diff 289338.Sep 1 2020, 6:14 PM

Improvements & cleanups to

  • accept input from multiple files or stdin
  • fail if any expected categories of input are absent
int3 added inline comments.Sep 1 2020, 8:21 PM
51–66 ↗(On Diff #288677)

I would much prefer if we could have a method that explicitly computes the unwind info... but first let me see if I understand the constraints here.

From what I understand, the size of the unwind info depends on the values of CompactUnwindEntry64::functionAddress, or more precisely, their relative offsets. So we need to assign addresses to our functions before computing the unwind info size. After addresses are assigned, the cuSection->writeTo() call below will update the functionAddress entries to point to the final addresses, after which we can compute the needed size. Does all that sound right?

From what I can tell, ld64 calculates unwind info size based on "tentative addresses", which I think have the right relative offsets, though their final values have yet to be fixed. I wonder if we could do something similar with our current design...

int3 added inline comments.Sep 1 2020, 8:24 PM
51–66 ↗(On Diff #288677)

Also -- do you know if it's possible for unwind info to point to functions that reside in sections other than __text?

gkm added inline comments.Sep 2 2020, 12:05 AM
51–66 ↗(On Diff #288677)

Yes, that's substantially correct, and I will elaborate:

UnwindInfoSection::getSize() must process the compact unwind entries and partition them into 4 KiB second-level pages in order to determine the final size of the __TEXT,__unwind_info. There are three factors that influence layout & therefore size:

  1. A maximum of 1021 entries can fit into a 4 KiB page
  2. The difference between address of first and last functions in a second-level page must be less than 2^24. The page will be cut short and remain under-filled when a candidate entry's 24-bit relative offset field would overflow.
  3. We do a linear scan and fold adjacent entries that have the same encoding, thus laying-out fewer entries in the output than are present in the input.

As currently implemented, I use mutable data members to cache the necessary and substantial work done in UnwindInfoSection::getSize() const so that UnwindInfoSection::writeTo() const need not redo it.

I believe it is not strictly necessary to relocate the compact unwind entries in getSize(). I can probably work with the raw input sections for page partitioning and entry folding, though the code will be trickier, and I won't escape the need to relocate later in writeTo() for sake of LSDA and personality addresses.

I chose to use data members as cache, to relocate early, and to use mutable to work-around constness of getSize() and writeTo() because that results in code that does things in the proper sequence, and in the most straightforward manner for sake of clarity and efficiency.

51–66 ↗(On Diff #288677)

With assembler language, anything is possible. I assume one can place compiled functions in alternate sections via attributes or pragmas. Those sections might be discontiguous with __text, and therefore require lay-out in to distinct second-level pages. I don't know if ld64 supports non-__text compact unwind. I will need to conjure some test cases to see how it behaves. If ld64 doesn't handle non-__text, then we won't either. If it does, then I will add that later.

gkm updated this revision to Diff 289583.Sep 2 2020, 3:18 PM


  • Remove redundant unwind_info_section_header_lsda_entry, and use existing unwind_info_section_header_lsda_index_entry instead.
  • Don't cache UnwindInfoSection::isNeeded() result since it is only called once.


  • Permute the register save set in order to generate a larger variety of encodings.
  • Align generated functions at larger boundary so it works for testing ld64 also.


  • Expand the object-file encoding regexp to capture optional personality+lsda
gkm edited the summary of this revision. (Show Details)Sep 2 2020, 10:01 PM
int3 added inline comments.Sep 3 2020, 2:57 PM
51–66 ↗(On Diff #288677)

I believe it is not strictly necessary to relocate the compact unwind entries in getSize(). I can probably work with the raw input sections for page partitioning and entry folding

How about doing most of what's currently in getSize() in a finalizeContents() method instead, much like what is currently being done for the the __LINKEDIT sections? My main concern is -- aside from a mutating method being marked as const -- is that compact unwind's getSize() has an implicit dependency on the addresses of another section. Other synthetic sections' finalizeContent methods have such a dependency, too, so this would fit the mold.

Of course, unlike the __LINKEDIT sections, the address assignments of later sections do depend on compact unwind's size, so we'll have to call it in the middle of assignAddresses. Such special-casing of CW would be a bit more verbose than what we currently have, but I think it's a good thing to make the uniqueness of compact unwind's requirements explicit.

gkm updated this revision to Diff 290105.Sep 5 2020, 12:19 PM

Do the work in finalize() member rather than getSize() const. Drop mutable from data members. Ahhh! So much nicer!

gkm updated this revision to Diff 290114.Sep 5 2020, 5:32 PM
  • use %python prefix in test file
  • minor cleanups
int3 added a comment.Sep 8 2020, 10:56 AM

Having things in finalize() indeed looks a lot cleaner :D


this include doesn't seem necessary

91–92 ↗(On Diff #290114)

is the Twine() call here necessary? I would expect the concatenated StringRefs to already be a Twine

107 ↗(On Diff #290114)

ditto, avoid <map>

120 ↗(On Diff #290114)

codebase convention is to avoid curly braces for one-liner blocks

127 ↗(On Diff #290114)


147 ↗(On Diff #290114)


150 ↗(On Diff #290114)


155 ↗(On Diff #290114)


156–158 ↗(On Diff #290114)

nit 1: the type here is pretty verbose, a type alias would be nice (auto would be acceptable here too I think)

nit 2: it0 and itN aren't the most descriptive names... something like intervalStart and intervalMax would be nice

167–168 ↗(On Diff #290114)

why not pass functionAddressMax in as the 3rd parameter to lower_bound?

edit: oh, because the return type isn't the container element type. would be good to have a comment :)

193–196 ↗(On Diff #290114)

can be rm'ed

211 ↗(On Diff #290114)

would be good to have a comment to the effect that there are a bunch of integer arrays immediately after the section header (I had to pause and think about what &uip[1] meant)

257 ↗(On Diff #290114)

what does a kind of 3 indicate?

49 ↗(On Diff #290114)

try to avoid <map> if possible:

also the key here should be compact_unwind_encoding_t to be consistent :)

50 ↗(On Diff #290114)

nit: I'd prefer structs over pairs for readable field names


I think it might be worthwhile to duplicate some of the code in InputSection::writeTo to specialize it for the compact unwind case. The current writeTo code assumes that both the source and target sections referenced by a relocation have their final output addresses determined. But this is not true for the compact unwind section -- it doesn't have a valid output address. It just happens to work because there are no pcrel relocations involved. We should really error out if we see a pcrel relocation while relocating __compact_unwind.

Creating a separate compact unwind relocating method would free us from having to create Output{Segments, Sections} that we discard later. I'm hoping we can retain the invariant where we add OutputSections to OutputSegments only after we have determined that they are needed.


My understanding of the need for is that some test cases need to be very large in order to exercise the edge cases in the implementation. In addition to providing that functionality, this script also serves as a fuzzer by generating random test cases. Would it be possible for the unit test to instead enumerate a small number of cases that cover all important code paths, in order that we run fewer test cases?

int3 added inline comments.Sep 8 2020, 5:15 PM
90–92 ↗(On Diff #290114)

Is this check really necessary? What kind of errors are we defending against? I would rather we not loop over the relocations unless necessary (for performance)

95–96 ↗(On Diff #290114)

CU relocations can be section-based. Just checked a simple program:

~/tmp: llvm-readobj --relocations --expand-relocs bar.o

File: bar.o
Format: Mach-O 64-bit x86-64
Arch: x86_64
AddressSize: 64bit
Relocations [
  Section __compact_unwind {
    Relocation {
      Offset: 0x0
      PCRel: 0
      Length: 3
      Type: X86_64_RELOC_UNSIGNED (0)
      Section: __text (1)
~/tmp: cat bar.cpp
int foo() {
  return 123;
smeenai added inline comments.Sep 8 2020, 5:17 PM
90–92 ↗(On Diff #290114)

Yup. I don't have the full context here yet, but is relevant.

gkm marked 18 inline comments as done.Sep 9 2020, 5:55 PM
gkm added inline comments.
90–92 ↗(On Diff #290114)

This is an assertion--a sanity check on my own assumptions. Once I tested more widely, I will have no further use for it. A TODO comment to convey my intention is appropriate.

91–92 ↗(On Diff #290114)

Regarding Twine(): I meant to only pass the initial string constant so that later + operands would be chained to it. I did not mean to pass entire concat expression. I used explicit Twine() following Saleem's example in an earlier diff. I'm happy to drop it. I see that diag functions already coerce string constants to Twine anyway.

95–96 ↗(On Diff #290114)

Yes. I expect relocs to all reference sections whose address is assigned, i.e., __TEXT,__text. The assertion checks if this is ever NOT the case.

167–168 ↗(On Diff #290114)

I added a comment explaining the choice.

50 ↗(On Diff #290114)

commonEncodings works best as a vector of pairs because it is filled from a DenseMap iterator which returns precisely the pairs we desire.


I still owe you an answer for this one ...


It is already a single test case, made deterministic by passing --seed' so that check-lld-macho` runs are consistent. The multiplicity that might be reduced is the number of functions generated for that one test case. However, since I need to test boundary conditions around breaking 1021-entry pages, my options are limited. If there is a way to prefer an installed llvm-mc over the locally built one, that would solve it.

gkm marked 5 inline comments as done.Sep 9 2020, 5:58 PM
gkm added inline comments.
90–92 ↗(On Diff #290114)

I am only looping over a subset of relocs, numbered 1+ per function. The "+" represents the subset of functions that have personality+lsda.

gkm updated this revision to Diff 290858.Sep 9 2020, 6:23 PM
  • follow review feedback
gkm updated this revision to Diff 291330.Sep 11 2020, 1:52 PM
  • finalize LD,compact_unwind before relocating
gkm updated this revision to Diff 291439.Sep 12 2020, 7:07 PM
  • Folding adjacent CU entries requires a vector with monotonically increasing functionAddress, and to get that we must first apply std::sort().
  • Abandon passing pageBreak 2nd compare arg via lambda-capture, because it is a fragile technique. It is only by an accident of the implementation that it works on std::lower_bound. It does not work for std::upper_bound, std::binary_search, or std::equal_range.
gkm added inline comments.Sep 17 2020, 2:56 PM

I believe this latest rev works well and isn't so ugly. I retain the normal flow for gathering input sections and attaching them to MergedOutputSection for __LD, but I exclude it from outputSegments (without -r) so further downstream processing doesn't happen. I do insert __LD into nameToOutputSegment[] so that the synthetic UnwindInfoSection::finalize() can find it for processing. That seems minimally invasive and doesn't need a subclass.

All evidence I have seen shows that reloc types in __LD,__compact_unwind are always absolute and never PC-relative. The only output address that must be determined is lld's VMA. OutputSection::writeTo() accepts a pointer to a buffer, which is normally mapped to the linker's output file. However, when relocating __compact_unwind, the buffer is the internal std::vector<CompactUnwindEntry64>::data() rather than the output file.

gkm updated this revision to Diff 292632.Sep 17 2020, 2:59 PM
  • clean-up some ugly special cases by excluding __LD from outputSegments
gkm updated this revision to Diff 292643.Sep 17 2020, 3:36 PM
  • revise, expand & prune comments
  • improve variable names for CU entry folding
int3 accepted this revision.Sep 17 2020, 6:36 PM
int3 added inline comments.
73–74 ↗(On Diff #292643)

Instead of filtering it out here, how about checking for __LD,__compact_unwind inside createOutputSections() while looping over mergedOutputSections, and then doing something like unwindInfoSection->setCompactUnwind(...)? Benefits: no need for the loop in UnwindInfoSection::isNeeded(), no need for getOutputSection, and no need to create an outputSegment to be discarded later.

That said, the current setup is definitely much cleaner than what we had before, and I'm happy to have this refinement in a follow-up diff

201 ↗(On Diff #292643)

s/adddress/addresses are/

288 ↗(On Diff #292643)

nit: static_cast<size_t>

14–17 ↗(On Diff #292643)

nit: group includes with LLD headers first, then LLVM headers, then system headers


how about using INT_MAX - 1 here instead, to cover the unlikely case where we have more than 100 sections? ld64 seems to do that


nit: Synthetic sections that only need to be used within Writer -- like SymtabSection -- can be a member of Writer instead of InStruct.


maybe change this seed to something that's not company-specific


JFYI, the parens are unnecessary, but I'm fine if you prefer them


I don't suppose you meant to leave and False in?

gkm marked 10 inline comments as done.Sep 18 2020, 8:07 PM
gkm added inline comments.
73–74 ↗(On Diff #292643)

No followup necessary. I did it all here.

gkm updated this revision to Diff 292936.Sep 18 2020, 8:32 PM
gkm marked an inline comment as done.
  • Final round of cleanups & review-feedback integration
gkm removed a reviewer: Restricted Project.Sep 18 2020, 8:33 PM
gkm removed a project: Restricted Project.
This revision is now accepted and ready to land.Sep 18 2020, 8:33 PM
int3 added inline comments.Sep 18 2020, 8:34 PM
65 ↗(On Diff #292936)

is this still necessary?

gkm updated this revision to Diff 292937.Sep 18 2020, 8:48 PM
  • Remove redundant change to lld/MachO/OutputSegment.cpp
This revision was landed with ongoing or failed builds.Sep 18 2020, 10:02 PM
This revision was automatically updated to reflect the committed changes.
thakis added inline comments.

llvm still supports python2.7 for now.

thakis added inline comments.Sep 19 2020, 6:17 AM

I went ahead and made the scripts 2.7-compatible in e22a4fd59de668af1cb943e23a6f4bfc93090e0f. Once we drop 2.7 support, it's hopefully easy to revert. Sorry for the trouble!