This is an archive of the discontinued LLVM Phabricator instance.

[CMake] Add perf profiling for clang-bolt
Abandoned · Public

Authored by Amir on Dec 6 2022, 6:55 PM.

Details

Reviewers
phosek
Group Reviewers
Restricted Project
Summary

perf provides a faster and easier way to collect a BOLT profile than
instrumentation. Generalize the CMake handling of applying BOLT to Clang so
that perf can be used for profile collection, with or without branch stacks
(LBR). This also enables Clang-BOLT on AArch64 platforms when perf profiling is
enabled.
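
As a rough illustration of the generalization described in this summary, the sketch below maps a profiling mode to a `perf record` command line. It is not taken from the diff: the function name, the `cycles:u` event, and the `-j any,u` branch filter are assumptions modeled on commonly documented BOLT usage, and the mode names follow the `-DCLANG_BOLT={Instrument,perf,LBR}` values mentioned in the timeline.

```python
import shlex


def perf_record_command(mode, perf_data, workload):
    """Return a `perf record` argv list for a sampling-based profiling mode.

    mode      -- "perf" (plain samples, no LBR) or "LBR" (branch stacks)
    perf_data -- output path passed to `perf record -o`
    workload  -- argv list of the training command, e.g. a clang invocation
    """
    mode = mode.lower()
    if mode not in ("perf", "lbr"):
        # "Instrument" mode does not go through perf, so it is rejected here.
        raise ValueError(f"mode {mode!r} does not use perf record")

    cmd = ["perf", "record", "-e", "cycles:u", "-o", perf_data]
    if mode == "lbr":
        # Assumed branch filter: sample any taken branch in user space.
        cmd += ["-j", "any,u"]
    return cmd + ["--"] + list(workload)


if __name__ == "__main__":
    # Hypothetical training command; this only prints the command line
    # and does not run perf.
    print(shlex.join(perf_record_command("LBR", "perf.data", ["clang", "-c", "test.c"])))
```

Keeping the mode-to-command mapping in one small function is one possible way to keep this logic out of CMake; the actual revision may structure it differently.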

Event Timeline

Amir created this revision. Dec 6 2022, 6:55 PM
Herald added a project: Restricted Project. Dec 6 2022, 6:55 PM
Amir requested review of this revision. Dec 6 2022, 6:55 PM
Herald added a project: Restricted Project. Dec 6 2022, 6:55 PM
Herald added a subscriber: cfe-commits.
Amir updated this revision to Diff 480739. Dec 6 2022, 7:12 PM

Fix dependency between bolt-profile and clang (instrumented or not)

Amir updated this revision to Diff 480760. Dec 6 2022, 8:25 PM

Fixed COMPILER_LAUNCHER, perf2bolt invocation

Amir updated this revision to Diff 480982. Dec 7 2022, 10:56 AM

Avoid using perf2bolt, provide perf.data directly

Amir updated this revision to Diff 481078. Dec 7 2022, 2:59 PM

Fix instrumentation and no-LBR modes

Amir updated this revision to Diff 481138. Dec 7 2022, 7:02 PM

Documentation

Herald added a project: Restricted Project. Dec 7 2022, 7:02 PM
Amir retitled this revision from "[CMake] Use perf with LBR for clang-bolt (WIP)" to "[CMake] Add perf profiling for clang-bolt". Dec 7 2022, 7:30 PM
Amir edited the summary of this revision.
Amir edited the summary of this revision. Dec 9 2022, 12:48 PM
Amir added a subscriber: DavidSpickett.

Could we add the perf-related logic to https://github.com/llvm/llvm-project/blob/ba3d808feedaa7f31750d8bc02754e15b372c868/clang/utils/perf-training/perf-helper.py? I think that's a better place, since we eventually want to replace the use of ExternalProject_Add with https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training so we should try to keep the amount of logic in CMake to a minimum.

Amir added a comment. Dec 10 2022, 10:58 AM

> Could we add the perf-related logic to https://github.com/llvm/llvm-project/blob/ba3d808feedaa7f31750d8bc02754e15b372c868/clang/utils/perf-training/perf-helper.py? I think that's a better place, since we eventually want to replace the use of ExternalProject_Add with https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training so we should try to keep the amount of logic in CMake to a minimum.

Sure! I didn't realize perf-helper had dtrace functionality in place. Adding Linux perf functions would be logical.

Amir updated this revision to Diff 483400. Dec 15 2022, 5:33 PM

Generalize to -DCLANG_BOLT={Instrument,perf,LBR}, update documentation

Amir added a comment. Dec 15 2022, 5:36 PM

@phosek – this diff adds support for AArch64 via Linux perf. I believe it makes sense to add this functionality first, in an incremental fashion, and refactor it later by moving parts into the perf-training script. What do you think? (And thank you for reviewing this stuff!)

Amir updated this revision to Diff 484727. Dec 21 2022, 6:01 PM

Convert perf profile using perf2bolt (aggregate-only mode)
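
For readers unfamiliar with the conversion step named in this update, here is a minimal sketch of how a `perf2bolt` invocation might be assembled. It is an assumption based on BOLT's documented command-line usage (`-p` for the perf data, `-o` for the aggregated output, `-nl` for samples recorded without LBR), not the command used in this revision; the helper name and file names are hypothetical.

```python
def perf2bolt_command(binary, perf_data, fdata, lbr=True):
    """Return a perf2bolt argv list that aggregates perf.data into an .fdata profile.

    binary    -- path to the profiled executable (e.g. the clang binary)
    perf_data -- input file produced by `perf record`
    fdata     -- output path for the aggregated BOLT profile
    lbr       -- False if perf data was recorded without branch stacks
    """
    cmd = ["perf2bolt", binary, "-p", perf_data, "-o", fdata]
    if not lbr:
        # Assumed flag for aggregating plain samples without LBR information.
        cmd.append("-nl")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names; this only prints the argv list.
    print(perf2bolt_command("bin/clang", "perf.data", "clang.fdata", lbr=False))
```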

Matt added a subscriber: Matt. Jan 25 2023, 8:57 AM
Amir abandoned this revision. Feb 8 2023, 12:33 PM

Abandon in favor of D143553