This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Use intermediate arrays to store opcodes
ClosedPublic

Authored by thevinster on Jul 12 2021, 8:50 PM.

Details

Reviewers
int3
gkm
Group Reviewers
Restricted Project
Commits
rGf2b1264141b0: [lld-macho] Use intermediate arrays to store opcodes
Summary

We want to incorporate some of the optimization passes in bind opcodes from ld64.
This revision makes no functional changes but to start storing opcodes in intermediate
containers in preparation for implementing the optimization passes in a follow-up revision.

Diff Detail

Event Timeline

thevinster created this revision.Jul 12 2021, 8:50 PM
Herald added a reviewer: gkm. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added a subscriber: mgrang. · View Herald Transcript
thevinster requested review of this revision.Jul 12 2021, 8:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2021, 8:50 PM
thevinster retitled this revision from [lld-macho] Use intermediate arrays to store opcodes to [WIP] [lld-macho] Use intermediate arrays to store opcodes.Jul 12 2021, 8:51 PM
thevinster edited the summary of this revision. (Show Details)Jul 12 2021, 8:53 PM
thevinster edited the summary of this revision. (Show Details)

Fix clang-format, rename variables

thevinster retitled this revision from [WIP] [lld-macho] Use intermediate arrays to store opcodes to [WIP][lld-macho] Use intermediate arrays to store opcodes.Jul 13 2021, 9:59 PM
thevinster edited the summary of this revision. (Show Details)

Fix clang-tidy

thevinster retitled this revision from [WIP][lld-macho] Use intermediate arrays to store opcodes to [lld-macho] Use intermediate arrays to store opcodes.Jul 13 2021, 11:07 PM
thevinster added inline comments.
lld/MachO/SyntheticSections.cpp
395

@int3 - sortBindings already sorts the bindings. Is there any reason to have to re-sort again?

int3 added inline comments.Jul 13 2021, 11:20 PM
lld/MachO/SyntheticSections.cpp
395

nope, it's leftover code from before I added sortBindings. good catch :)

int3 added inline comments.Jul 15 2021, 11:59 AM
lld/MachO/SyntheticSections.cpp
280

how about BindIR?

282

how about renaming it to data so that there's less confusion about why we're using delta to refer to an offset or an addend?

(alternatively we could use a union here)

299–300

cast shouldn't be necessary any more I think. I used it for writing to the stream because we would otherwise call the integer overload of the stream operator. But here there's no overload to worry about

303
319

I don't think there's anything to mask here, since we are explicitly setting op.opcode to one of the opcode values minus any extra data. But we could do an assert to be on the safe side :)

int3 added a comment.Jul 15 2021, 12:02 PM

commit title nit: should mention *bind* opcodes for clarity

oontvoo added inline comments.Jul 15 2021, 1:03 PM
lld/MachO/SyntheticSections.cpp
318

Can this be const & instead? op isn't being modified by this.

432

const auto& ?

464

const auto& ?

thevinster marked 7 inline comments as done.Jul 15 2021, 3:43 PM
thevinster added inline comments.
lld/MachO/SyntheticSections.cpp
319

Doesn't this operation BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB | seg->index mean it's possible that extra data could be embedded in an opcode value? Otherwise, why perform seg->index in the first place?

Address feedback

int3 accepted this revision.Jul 15 2021, 3:54 PM

lgtm with minor changes. Do you have commit rights or should I commit it for you?

lld/MachO/SyntheticSections.cpp
281–282

do we need these default values?

298–299

ah I guess we need the cast

319

derp yeah I forgot about that one

This revision is now accepted and ready to land.Jul 15 2021, 3:54 PM
thevinster marked an inline comment as done.Jul 15 2021, 4:01 PM
thevinster added inline comments.
lld/MachO/SyntheticSections.cpp
281–282

Probably not, since we always construct these values explicitly. But, the reason for having it as default is 1/ follow the convention of Binding struct and 2/ in the event that something goes awry, 0xF0 is not valid opcode and should make the program scream instead of accidentally writing "valid" values.

Plus, I think it'll debugging easier if we see these specific values when debugging.

298–299

Fwiw, the tests did pass. I'll revert to the old implementation.

int3 added inline comments.Jul 15 2021, 4:09 PM
lld/MachO/SyntheticSections.cpp
281–282

0xF0 is not valid opcode and should make the program scream instead of accidentally writing "valid" values.

Can you add that as a comment? Otherwise it looks a bit like a magic number :)

thevinster marked 3 inline comments as done.Jul 15 2021, 4:35 PM

Add comments on default values, revert to cast

This revision was landed with ongoing or failed builds.Jul 15 2021, 5:11 PM
This revision was automatically updated to reflect the committed changes.
int3 added inline comments.Jul 15 2021, 6:53 PM
lld/MachO/SyntheticSections.cpp
298–299

I get why we added the cast back, but the temporary seems unnecessary... we can construct it inline as part of the push_back statement.

Also, the cast is only needed for BIND_OPCODE_SET_SEGMENT_AND_OFFSET_ULEB | seg->index as | returns an int (regardless of the size of its operands). We don't need the cast for the other opcodes below

(I thought you meant 'revert' as in 'partial revert', didn't realize you were going to do a full one)

thevinster added inline comments.Jul 15 2021, 6:56 PM
lld/MachO/SyntheticSections.cpp
298–299

My bad. I completely misunderstood. I'll make a forward fix of this on D105867 and make sure things are still running smoothly.