This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Refactor segment/section creation, sorting, and merging
ClosedPublic

Authored by int3 on Jun 15 2020, 3:12 PM.

Details

Summary

There were a few issues with the previous setup:

  1. The section sorting comparator used a declarative map of section names to determine the correct order, but it turns out we need to match on more than just names -- in particular, an upcoming diff will sort based on whether the S_ZERO_FILL flag is set. This diff changes the sorter to a more imperative but flexible form.
  1. We were sorting OutputSections stored in a MapVector, which left the MapVector in an inconsistent state -- the wrong keys map to the wrong values! In practice, we weren't doing key lookups (only container iteration) after the sort, so this was fine, but it was still a dubious state of affairs. This diff copies the OutputSections to a vector before sorting them.
  1. We were adding unneeded OutputSections to OutputSegments and then filtering them out later, which meant that we had to remember whether an OutputSegment was in a pre- or post-filtered state. This diff only adds the sections to the segments if they are needed.

In addition to those major changes, two minor ones worth noting:

  1. I renamed all OutputSection variable names to osec, to parallel isec. Previously we were using some inconsistent combination of osec, os, and section.
  1. I added a check (and a test) for InputSections with names that clashed with those of our synthetic OutputSections.

Diff Detail

Event Timeline

int3 created this revision.Jun 15 2020, 3:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2020, 3:12 PM
int3 updated this revision to Diff 270884.Jun 15 2020, 3:15 PM
int3 edited the summary of this revision. (Show Details)

format commit msg

Overall, I'm liking this change. I feel like its simpler.

lld/MachO/OutputSegment.cpp
41–42

Nit: section instead of osec would be nicer IMO.

47–49

Similar

lld/MachO/Writer.cpp
137–138

Similar

347

I think that this might be easier to read as a llvm::StringSwitch.

364

Why does the section ordering check the segment name? This deserves a comment at least.

382

s/osec/section/?

438

Minor micro-opt? swap the condition as we expect the conflict to be rare, and that makes it easier to follow too IMO.

454–459

s/osec/section/?

int3 marked 2 inline comments as done.Jun 16 2020, 5:47 PM
int3 added inline comments.
lld/MachO/OutputSegment.cpp
41–42

in addition to paralleling isec, it's also the naming convention used by lld-ELF...

lld/MachO/Writer.cpp
347

TIL about StringSwitch, thanks

int3 updated this revision to Diff 271254.Jun 16 2020, 6:27 PM
int3 marked 3 inline comments as done.

update

compnerd accepted this revision.Jun 18 2020, 8:58 AM
This revision is now accepted and ready to land.Jun 18 2020, 8:58 AM
Ktwu added a subscriber: Ktwu.Jun 18 2020, 9:59 AM

This is nice; thanks for the cleanup!

lld/MachO/Writer.cpp
337

Why move the ordering functions back into the writer?

355

Why use std::numeric_limits instead of constants like -3, -2, -1, 0? Magic numbers suck but std::numeric_limits<int>::min() seems sort of verbose.

Nit: cache std::numeric_limits<int>::min() in a temp variable?

MaskRay accepted this revision.Jun 18 2020, 3:02 PM
MaskRay added a subscriber: MaskRay.

LGTM.

lld/MachO/Writer.cpp
454–459

osec looks fine. I appreciate that in some places of ELF/ we use osec to emphasize that it is an OutputSection, not an InputSection.

int3 marked 4 inline comments as done.Jun 21 2020, 5:32 PM
int3 added inline comments.
lld/MachO/OutputSegment.h
20–24

for future reference: the change from a pointer to an array was required so that these constants could be used in StringSwitch.

lld/MachO/Writer.cpp
337

I thought it was nice for all the sorting code to live together, so I brought the ordering functions next to sortSegmentsAndSections. Also while looking up the ordering functions I always forgot if they lived in OutputSegment.cpp or OutputSection.cpp...

I guess another way to look at it is an OO vs functional style of code organization -- I prefer the latter, especially since lld is mostly performing complex operations on relatively simple objects

355

fair enough, I switched to -3, .. 0 instead

This revision was automatically updated to reflect the committed changes.
int3 marked an inline comment as done.