This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Port CallGraphSort from COFF/ELF
ClosedPublic

Authored by lgrey on Oct 20 2021, 11:12 AM.

Details

Reviewers
thakis
gkm
jhenderson
Group Reviewers
Restricted Project
Commits
rG6db04b97e6a2: [lld-macho] Port CallGraphSort from COFF/ELF
Summary

Depends on D112160

This adds the new options --call-graph-profile-sort (default), --no-call-graph-profile-sort and --print-symbol-order=. If call graph profile sorting is enabled, reads __LLVM,__cg_profile sections from object files and uses the resulting graph to put callees and callers close to each other in the final binary via the C3 clustering heuristic.

Diff Detail

Event Timeline

lgrey created this revision.Oct 20 2021, 11:12 AM
Herald added a reviewer: Restricted Project. · View Herald Transcript
Herald added subscribers: dang, mgorny. · View Herald Transcript
lgrey requested review of this revision.Oct 20 2021, 11:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 20 2021, 11:12 AM
lgrey added inline comments.Oct 20 2021, 11:14 AM
lld/MachO/CallGraphSort.cpp
153

ELF/COFF use a larger constant but the original paper says page size and I couldn't find anything in history/reviews that motivates the larger number.

lld/MachO/InputFiles.cpp
308

I couldn't find prior art on finding endianness, though there's code elsewhere that assumes little. Is there a better way to do this?

lgrey updated this revision to Diff 381035.Oct 20 2021, 11:19 AM

Removed strays from test

There are some effects when 4k pages are backed by contiguous physical memory, but that probably needs a custom loader:
https://docs.mellanox.com/display/rdmacore50/Contiguous+Pages

lgrey updated this revision to Diff 381074.Oct 20 2021, 12:53 PM
lgrey edited the summary of this revision. (Show Details)

Format

oontvoo added inline comments.Oct 20 2021, 1:20 PM
lld/MachO/CallGraphSort.cpp
252

TimeTraceScope here? (as it'd be doing non-negligible amount of work)

lld/MachO/CallGraphSort.h
17

why not #include the header instead?
(you'll need it in the .cpp anyway)

lld/MachO/Driver.cpp
1035

[optional] perhaps add a TimeTraceScope here?

1445

only if callGraphProfileSort==true ?

lld/MachO/InputFiles.cpp
308

I think the unwritten assumption here is that macho is all little-endian now.

lld/MachO/InputFiles.h
61

would be helpful to have some comment clarifying on "Index" into what.

lgrey updated this revision to Diff 381300.Oct 21 2021, 9:38 AM
lgrey marked 5 inline comments as done.

Address comments: document CallGraphEntry struct, add time traces, only extract data if --call-graph-profile-sort, shuffle includes

MaskRay added a subscriber: MaskRay.EditedOct 27 2021, 4:13 PM

Any performance number? This is a very minor optimization and probably rarely doesn't really benefit mobile/desktop applications.

Windows Chrome got 3% on Speedometer from adding this. Locally on Chromium Mac, I'm seeing about the same thing (3.3% if I'm measuring correctly: means of 151 vs 156 with similar confidence intervals).

Windows Chrome got 3% on Speedometer from adding this. Locally on Chromium Mac, I'm seeing about the same thing (3.3% if I'm measuring correctly: means of 151 vs 156 with similar confidence intervals).

Ah, nice. Thanks for the number.

lld/MachO/CallGraphSort.cpp
233
lgrey updated this revision to Diff 383160.Oct 28 2021, 2:06 PM

Remove braces around single-line if

lgrey marked an inline comment as done.Oct 28 2021, 2:06 PM
thakis accepted this revision.Jan 11 2022, 1:09 PM

Cool.

lld/MachO/Driver.cpp
1401

should we warn if there's both an explicit --call-graph-profile-sort flag and an -order_file flag? Or maybe later when we know that there's call graph info in the .o file but also an order file?

(…or are you planning on adding support for both flags at the same time in a follow-up?)

lld/MachO/InputFiles.cpp
307

Should this have a TimeTrace too?

lld/MachO/Options.td
77

nit: Put just (default) here, and omit it from the line below. That's consistent with the rest of the help texts.

This revision is now accepted and ready to land.Jan 11 2022, 1:09 PM
lgrey updated this revision to Diff 399061.Jan 11 2022, 1:34 PM

Var renames, missing newline

lgrey updated this revision to Diff 399068.Jan 11 2022, 1:40 PM

Restore accidentally stomped diff

lgrey updated this revision to Diff 399093.Jan 11 2022, 2:55 PM
lgrey marked 2 inline comments as done.

Add time trace for parsing section, options edit

lld/MachO/Driver.cpp
1401

Yes, planning to add support to both as a follow-up

This revision was landed with ongoing or failed builds.Jan 12 2022, 7:57 AM
This revision was automatically updated to reflect the committed changes.