This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho][nfc] Refactor in preparation for 32-bit support
ClosedPublic

Authored by int3 on Mar 30 2021, 6:58 PM.

Details

Reviewers
oontvoo
Group Reviewers
Restricted Project
Commits
rG817d98d84186: [lld-macho][nfc] Refactor in preparation for 32-bit support
Summary

The main challenge was handling the different on-disk structures (e.g.
mach_header vs mach_header_64). I tried to strike a balance between
sprinkling target->wordSize == 8 checks everywhere (branchy = slow, and ugly)
and templatizing everything (causes code bloat, also ugly). I think I struck a
decent balance by judicious use of type erasure.

Note that LLD-ELF has a similar architecture, though it seems to use more templating.

Linking chromium_framework takes about the same time before and after this
change:

    N           Min           Max        Median           Avg        Stddev
x  20          4.52          4.67         4.595        4.5945   0.044423204
+  20           4.5          4.71         4.575         4.582   0.056344803
No difference proven at 95.0% confidence

Diff Detail

Event Timeline

int3 created this revision.Mar 30 2021, 6:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2021, 6:58 PM
int3 requested review of this revision.Mar 30 2021, 6:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2021, 6:58 PM
int3 retitled this revision from [lld-macho] Refactor in preparation for 32-bit support to [lld-macho][nfc] Refactor in preparation for 32-bit support.Mar 30 2021, 6:59 PM
int3 edited the summary of this revision. (Show Details)Mar 30 2021, 7:02 PM
int3 added inline comments.Mar 30 2021, 7:11 PM
lld/MachO/Target.h
43

changed this parameter so we wouldn't have to templatize TargetInfo

tschuett added a subscriber: tschuett.EditedApr 1 2021, 1:10 AM

Please feel free to ignore me, but it looks odd to needle 32bit support through the ARM64 and X86_64 structs. Inheritance? Common base class?

Is sprinkling templates all over ObjFile better than ObjFile32 and ObjFile64?

int3 added a comment.Apr 1 2021, 8:17 AM

Please feel free to ignore me, but it looks odd to needle 32bit support through the ARM64 and X86_64 structs. Inheritance? Common base class?

I guess you're referring to the set32BitProperties thing? I was hoping to keep things like target->wordSize a direct field access instead of a virtual method, and I couldn't find a way to do it with inheritance... but maybe virtual methods are the way to go after all.

Is sprinkling templates all over ObjFile better than ObjFile32 and ObjFile64?

Pretty sure yes, otherwise every reference to ObjFile will either need to dispatch between the two, or have templates sprinkled on themselves

oontvoo added a subscriber: oontvoo.Apr 1 2021, 8:20 AM
oontvoo added inline comments.
lld/MachO/Arch/ARM64.cpp
31

seems inconsistent that this parameter has name and the other two don't

49–50

Interesting. How so? (In the previous version I saw you were using uint64_t WordSize = ..., which makes more sense)
Can you explain why using an anonymous enum here is better than that?

lld/MachO/Arch/X86_64.cpp
28

maybe just uint64_t ?

Please feel free to ignore me, but it looks odd to needle 32bit support through the ARM64 and X86_64 structs. Inheritance? Common base class?

I guess you're referring to the set32BitProperties thing? I was hoping to keep things like target->wordSize a direct field access instead of a virtual method, and I couldn't find a way to do it with inheritance... but maybe virtual methods are the way to go after all.

But target->wordSize also goes through the vtable?
And you are loosing section_64 and have to use the less type-safe offset.

oontvoo added inline comments.Apr 1 2021, 9:15 AM
lld/MachO/Target.h
90

not to be pedantic, but maybe ILP32 ?

int3 marked an inline comment as done.Apr 1 2021, 2:59 PM

But target->wordSize also goes through the vtable?

I don't think that's true...? Why would it?

And you are loosing section_64 and have to use the less type-safe offset.

I don't see how we can avoid this without templatizing TargetInfo.... inheritance / virtual dispatch doesn't get around the problem of having to handle both section and section_64. (And FWIW, LLD-ELF passes offsets to TargetInfo too, presumably for the same reason)

lld/MachO/Arch/ARM64.cpp
31

My philosophy is to avoid redundancy :) MemoryBufferRef mbref is saying the same thing twice, whereas offset actually provides information that uint64_t doesn't

49–50

lint doesn't approve of uint64_t WordSize (it wants wordSize). That would result in it being shadowed by TargetInfo::wordSize. I think this is less ugly than having to use ::wordSize...

Optimization-wise, I think this is the same as using constexpr -- at least, the compiler is able to use the constant value as an immediate in both cases

lld/MachO/Target.h
90

for sure

int3 updated this revision to Diff 334841.Apr 1 2021, 3:49 PM
int3 marked an inline comment as done.

LP32 -> ILP32 + move set32BitProperties to ILP32

int3 added a comment.Apr 1 2021, 3:54 PM

I do agree that having 32-bit specific code in TargetInfo itself (and its 64-bit child classes) is a bit weird though. So I moved the 32-bit/64-bit property setting code into ILP32 and LP64 respectively, plus made the sole TargetInfo constructor require that one of the two be passed. It's not quite inheritance, and it does require constructing a zero-size struct just for template type inference purposes, but it does the job :)

int3 added inline comments.Apr 1 2021, 10:03 PM
lld/MachO/Arch/ARM64.cpp
49–50

I moved wordSize into the LP* classes, so we don't have to do this anymore :)

LGTM!

lld/MachO/Target.h
32

I wonder if it's possible to have TargetInfo contains all the stuff the LP* structs below have. That way, we don't have to branch on target->wordSize everytime to decide which struct to use .

(doesn't have to be done in this patch)

int3 added inline comments.Apr 2 2021, 10:43 AM
lld/MachO/Target.h
32

We never branch on target->wordSize *just* to access different values though -- we are also doing it for the types. For instance, in the case of segmentLCType, we would have to branch anyway to choose between segment_command and segment_command_64. So I don't think there's any branching we could remove by moving more values from the LP* structs here.

int3 added a comment.Apr 2 2021, 10:51 AM

LGTM = stamp? :)

oontvoo accepted this revision.Apr 2 2021, 11:07 AM

LGTM = stamp? :)

Yes - I thought maybe @tschuett wanted to stamp ...
But here you go!

This revision is now accepted and ready to land.Apr 2 2021, 11:07 AM
This revision was landed with ongoing or failed builds.Apr 2 2021, 3:47 PM
This revision was automatically updated to reflect the committed changes.