This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Extend ICF to literal sections
ClosedPublic

Authored by int3 on Jun 21 2021, 2:22 PM.

Details

Reviewers
gkm
Group Reviewers
Restricted Project
Commits
rG557e1fa02f47: [lld-macho] Extend ICF to literal sections
Summary

Literal sections can be deduplicated before running ICF. That makes it
easy to compare them during ICF: we can tell if two literals are
constant-equal by comparing their offsets in their OutputSection.

LLD-ELF takes a similar approach.

Diff Detail

Event Timeline

int3 created this revision.Jun 21 2021, 2:22 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: dang. · View Herald Transcript
int3 requested review of this revision.Jun 21 2021, 2:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 21 2021, 2:22 PM
int3 added inline comments.Jun 21 2021, 2:37 PM
lld/MachO/ICF.cpp
107–108

we don't want to compare value early because two symbols with different values may point to the same literal

int3 updated this revision to Diff 354357.Jun 24 2021, 2:31 PM

rebase + fix absolute symbol check

gkm accepted this revision.Jun 24 2021, 2:47 PM

Groovy!

lld/MachO/ICF.cpp
112–119

I prefer this for convenient breakpoint locations when particular conditions fail.

117–119

Ditto.

122

Ditto; and I think you meant this, no?

return da->value == db->value;
142
211
212

Super nitty: I prefer to see the simpler oparand(s) come earlier in the expression.

lld/test/MachO/icf-literals.s
33–36
This revision is now accepted and ready to land.Jun 24 2021, 2:47 PM
int3 updated this revision to Diff 354369.Jun 24 2021, 3:06 PM

fix hash

int3 added inline comments.Jun 24 2021, 6:24 PM
lld/test/MachO/icf-literals.s
33–36

oops. I guess I should add a section of callqs like the icf.s test to check that we aren't deduplicating things unexpectedly

int3 added inline comments.Jun 24 2021, 7:14 PM
lld/MachO/ICF.cpp
112–119

it feels unfortunate to contort the code for easier breakpointing. Is the goal to be able to breakpoint right before function exit? for that case, could we just set a breakpoint at the retq instruction's address? (I checked the disassembly, and clang does indeed emit a single retq for this whole function)

gkm added inline comments.Jun 25 2021, 2:10 PM
lld/MachO/ICF.cpp
112–119

IMO, it's not contortion so much as vertical expansion! ;)
I want to see if a particular predicate fails, so I'll breakpoint the return false; beneath it.

int3 marked 3 inline comments as done.Jun 25 2021, 2:29 PM
int3 added inline comments.
lld/MachO/ICF.cpp
112–119

hm, so the idea is that you're more likely to want to debug when things don't compare to be equal?

int3 added inline comments.Jun 25 2021, 5:17 PM
lld/MachO/ICF.cpp
112–119

IMO, it's not contortion so much as vertical expansion! ;)

I think it makes for more confusing code, because it obscures when fall-through behavior (to the return true;) is actually needed.

gkm added inline comments.Jun 28 2021, 9:24 AM
lld/MachO/ICF.cpp
112–119

hm, so the idea is that you're more likely to want to debug when things don't compare to be equal?

So far, all of my debugging has been to see when predicates unexpectedly fail. We don't need to debate this any further. Do what you think best. ;)

112–119

It makes sense when you recognize that everything follows a regular pattern: A sequence of if (UNEQUAL) return false; falling through to return true;

gkm added inline comments.Jun 28 2021, 9:39 AM
lld/MachO/ICF.cpp
112–119

Overall, the if (UNEQUAL) return false; pattern serves all breakpointing purposes: when there is an unexpected predicate failure, I break on the return false;, and when there is an unexpected predicate success, I break on the statement following the entire if statement. I like it. It works for me. FYI, the earliest versions of equalsConstant() and equalsVariable() were beautiful single statements: return MASSIVE-MULTILINE-COMPOUND-PREDICATE, but then I needed to debug and wanted fine-grained control over breakpoints. ;)

int3 added a comment.Jun 28 2021, 11:43 AM

Mm I see. I think it's easy enough to switch between the two forms for debugging purposes, so I'm landing it as-is...

This revision was landed with ongoing or failed builds.Jun 28 2021, 11:50 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Jun 28 2021, 7:29 PM
thakis added inline comments.
lld/MachO/Writer.cpp
948

I rebased the cstring alignment patch across this naively (a8a6e5b094aac642f436390294ec837400c521bb) and this came out pretty weird (because for non-deduped cstring sections this doesn't do any folding).

Maybe CStringSection should keep overriding finalize() and also add an empty virtual foldLiterals that DedupCStringSection overrides?

int3 added inline comments.Jun 28 2021, 9:13 PM
lld/MachO/Writer.cpp
948

I think it would be confusing / a potential source of bugs to have CStringSection and DedupCStringSection finalized at different points in the link. Perhaps a comment here would be sufficient?

int3 added inline comments.Jun 28 2021, 10:37 PM
lld/MachO/Writer.cpp
948

comment added in D105044