This is an archive of the discontinued LLVM Phabricator instance.

[LPM] Port CGProfilePass from NPM to LPM
ClosedPublic

Authored by zequanwu on Jul 1 2020, 5:46 PM.

Diff Detail

Event Timeline

zequanwu created this revision.Jul 1 2020, 5:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 1 2020, 5:46 PM
zequanwu updated this revision to Diff 274979.Jul 1 2020, 5:59 PM

update test.

MaskRay added inline comments.
llvm/lib/Transforms/Instrumentation/CGProfile.cpp
29

a static function does not need to be wrapped inside an unnamed namespace

104–109

wrap a struct/class in an unnamed namespace to prevent name collision

hans added a comment.Jul 2 2020, 2:55 AM

Looks great, thanks! Just some minor comments.

llvm/include/llvm/InitializePasses.h
200

I think these declarations should be in alphabetic order.

llvm/include/llvm/Transforms/IPO.h
285

ultra nit: I would probably keep the empty line before the } here

llvm/lib/Transforms/Instrumentation/CGProfile.cpp
104–109

Might as well declare it 'final' while you're here.

hans added inline comments.Jul 2 2020, 2:55 AM
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
839

In the new pm this pass seems to be added conditionally based on some flag: if (PTO.CallGraphProfile)
Is it possible to do something similar here?

llvm/lib/Transforms/Instrumentation/CGProfile.cpp
133

Should it be "cg-profile" to match the npm pass name? Arthur has been working on making such pass names equal recently.

zequanwu updated this revision to Diff 275160.Jul 2 2020, 10:27 AM
zequanwu marked 7 inline comments as done.

Address comments.

Herald added a project: Restricted Project. · View Herald TranscriptJul 2 2020, 10:27 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
zequanwu updated this revision to Diff 275558.EditedJul 5 2020, 11:57 AM

Enable CGProfilePass for opt with LPM by default, like opt with NPM.

MaskRay added inline comments.
llvm/tools/opt/opt.cpp
281 ↗(On Diff #275558)

If there is no strong need for tuning this, please delete the option and PassManagerBuilder::CallGraphProfile


I know that -enable-npm-call-graph-profile exists, but it seems like a temporary workaround for me. @zhizhouy @void (D62627) Is the option still used?

zhizhouy added inline comments.Jul 5 2020, 2:35 PM
llvm/tools/opt/opt.cpp
281 ↗(On Diff #275558)

Does GNU assembler recognize .cgprofile section now?

I think we should keep this option as long as there is still usage of other than integrated assembler.

MaskRay added inline comments.Jul 5 2020, 3:02 PM
llvm/tools/opt/opt.cpp
281 ↗(On Diff #275558)

Does GNU assembler recognize .cgprofile section now?

I don't think it will ever support this section.

I think we should keep this option as long as there is still usage of other than integrated assembler.

Can you give a link about the use case?

zhizhouy added inline comments.Jul 5 2020, 3:35 PM
llvm/tools/opt/opt.cpp
281 ↗(On Diff #275558)

I have a link of the effort to migrate Linux kernel to integrated assembler in ChromeOS: https://bugs.chromium.org/p/chromium/issues/detail?id=1020923

I think they are only to migrate newer versions thus the old linux kernels are still using GNU assembler, and all kernels are built with LLVM now.

MaskRay added inline comments.Jul 5 2020, 4:55 PM
llvm/tools/opt/opt.cpp
281 ↗(On Diff #275558)

If we have to have an option, we should make both legacy pm and new pm use the same option. We should not create two cl::opt.

echristo added a subscriber: echristo.

Adding Chandler and Alina here as well.

In general, I don't think that this is such a great idea. Being able to have this sort of thing work more reliably is one of the reasons for the new pass manager. I think I'd like to see this split out into an old versus new pass manager pass to avoid the difficulty of cleaning this up after we finish migrating llvm to the new pass manager. This also seems to add some technical debt around options and other enablement which is also less than ideal. Is this compelling to add right now versus finishing work migrating llvm completely to the new pass manager and removing the old one? From speaking with Alina I think that work should be done in a short while.

Thanks.

-eric

Adding Chandler and Alina here as well.

In general, I don't think that this is such a great idea. Being able to have this sort of thing work more reliably is one of the reasons for the new pass manager. I think I'd like to see this split out into an old versus new pass manager pass to avoid the difficulty of cleaning this up after we finish migrating llvm to the new pass manager. This also seems to add some technical debt around options and other enablement which is also less than ideal. Is this compelling to add right now versus finishing work migrating llvm completely to the new pass manager and removing the old one? From speaking with Alina I think that work should be done in a short while.

Thanks.

-eric

I don't think we're that close yet, probably at least a couple months out, there are lots of loose ends to be tied up. I'll make a post soon in llvm-dev (maybe first we can sync up again) about what I think needs to be done before the NPM switch.

hans added a comment.Jul 6 2020, 1:25 AM

Adding Chandler and Alina here as well.

In general, I don't think that this is such a great idea. Being able to have this sort of thing work more reliably is one of the reasons for the new pass manager. I think I'd like to see this split out into an old versus new pass manager pass to avoid the difficulty of cleaning this up after we finish migrating llvm to the new pass manager. This also seems to add some technical debt around options and other enablement which is also less than ideal. Is this compelling to add right now versus finishing work migrating llvm completely to the new pass manager and removing the old one? From speaking with Alina I think that work should be done in a short while.

Given how long the new pass manager has been in progress, we definitely don't want to block on enabling it. So yes, porting this pass to the current pass manager is compelling to do right now. I also don't see why it should be a big deal.

As for splitting it into separate passes, this patch technically does that, although it extracts and changes the core code a bit so it can be shared between the passes. I think that's how most passes have been adapted to work with both pass managers, no?

zequanwu updated this revision to Diff 275763.Jul 6 2020, 10:48 AM

Delete enable-npm-call-graph-profile option for NPM, using enable-call-graph-profile for both LPM and NPM.

zequanwu marked an inline comment as done.Jul 6 2020, 10:49 AM

Adding Chandler and Alina here as well.

In general, I don't think that this is such a great idea. Being able to have this sort of thing work more reliably is one of the reasons for the new pass manager. I think I'd like to see this split out into an old versus new pass manager pass to avoid the difficulty of cleaning this up after we finish migrating llvm to the new pass manager. This also seems to add some technical debt around options and other enablement which is also less than ideal. Is this compelling to add right now versus finishing work migrating llvm completely to the new pass manager and removing the old one? From speaking with Alina I think that work should be done in a short while.

Thanks.

-eric

I don't think we're that close yet, probably at least a couple months out, there are lots of loose ends to be tied up. I'll make a post soon in llvm-dev (maybe first we can sync up again) about what I think needs to be done before the NPM switch.

+1 to sync up again and make progress towards the NPM switch.

I don't want to block this patch, but I do agree with Eric's point. We *really* want to focus more on the switch then invest into more LPM infra. Short term resolutions to unblock folks, with the best split possible, sure, keeping in mind they'll need to be cleaned up. But I don't want us to lose focus on tying up the remaining loose ends for the switch.
I think it's critical for LLVM's codebase health to focus on the NPM switch in the next couple of months.

hans added a comment.Jul 7 2020, 1:35 AM

I don't want to block this patch, but I do agree with Eric's point. We *really* want to focus more on the switch then invest into more LPM infra. Short term resolutions to unblock folks, with the best split possible, sure, keeping in mind they'll need to be cleaned up.

Sounds good to me.

hans accepted this revision.Jul 7 2020, 1:35 AM

LGTM

nikic requested changes to this revision.Jul 7 2020, 1:42 AM
nikic added a subscriber: nikic.
nikic added inline comments.
llvm/test/Other/opt-O2-pipeline.ll
289

Is it possible to switch this pass to use LazyBPI / LazyBFA, only fetched if PGO is actually in use?

PGO functionality that most people don't use adding expensive analysis passes like PDT should be avoided.

This revision now requires changes to proceed.Jul 7 2020, 1:42 AM
hans added inline comments.Jul 7 2020, 2:16 AM
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
170

Oh, just noticed: I think CallGraphProfile should be initialized along with the other flags here.

llvm/test/Other/opt-O2-pipeline.ll
289

I wonder if just switching to LazyBlockFrequencyInfo would help though. It looks to me like the CGProfile would request info about each function anyway.

I was surprised to see that Clang sets Opts.CallGraphProfile solely based on whether the integrated assembler is used. Maybe a better fix is to only set that to true when a profile is actually being used?

nikic added inline comments.Jul 7 2020, 8:52 AM
llvm/test/Other/opt-O2-pipeline.ll
289

I wonder if just switching to LazyBlockFrequencyInfo would help though. It looks to me like the CGProfile would request info about each function anyway.

It would only help if there is some way to only fetch the analysis conditionally. I believe many PGO passes use something like PSI.hasProfileSummary() or F.hasProfileData() for that.

I was surprised to see that Clang sets Opts.CallGraphProfile solely based on whether the integrated assembler is used. Maybe a better fix is to only set that to true when a profile is actually being used?

Right, just disabling this by default in clang/opt would also work.

For reference, the current compile-time numbers for this patch: https://llvm-compile-time-tracker.com/compare.php?from=516ff1d4baee28b1911737e47b42973567adf8ff&to=8df840660bb764b6653fcfd9ac7a72cc6adebde6&stat=instructions Not huge, but it adds up (some similar regressions have been introduced in LLVM 10).

zequanwu marked an inline comment as done.Jul 7 2020, 10:01 AM
zequanwu added inline comments.
llvm/test/Other/opt-O2-pipeline.ll
289

Do you mean disabling it just for LPM or both?

zequanwu marked an inline comment as done.Jul 7 2020, 11:32 AM
zequanwu added inline comments.
llvm/test/Other/opt-O2-pipeline.ll
289

I was surprised to see that Clang sets Opts.CallGraphProfile solely based on whether the integrated assembler is used. Maybe a better fix is to only set that to true when a profile is actually being used?

For Clang, a better fix I think is that Opts.CallGraphProfile should based on both whether the integrated assembler is used and whether profile instrumentation is turned on. What do you think?

MaskRay added inline comments.Jul 7 2020, 12:19 PM
llvm/test/Other/opt-O2-pipeline.ll
289

I'd prefer not having CallGraphProfile

  • -no-integrated-as -S => no .cgprofile (.llvm_addrsig behaves this way)
  • -S -> .cgprofile
zequanwu added inline comments.Jul 7 2020, 1:33 PM
llvm/test/Other/opt-O2-pipeline.ll
289

As discussed above, I think CGProfilePass should be disabled by default in clang unless -no-integrated-as is not given and -fprofile-instrument-use-path= is given. So, Opts.CallGraphProfile is a convenient switch for that.

zequanwu updated this revision to Diff 276208.Jul 7 2020, 2:25 PM
zequanwu marked an inline comment as done.
  • Disable enable-call-graph-profile by default in opt.
  • Disable CGProfilePass by default in clang unless -no-integrated-as is not given and -fprofile-instrument-use-path= is given, as this pass only generates module metadata when profile data is given.

I still haven't seen a strong argument keeping a command line option -enable-npm-call-graph-profile. Asked in D62627.

Opts.getProfileUse() != CodeGenOptions::ProfileNone in

Opts.CallGraphProfile = Opts.getProfileUse() != CodeGenOptions::ProfileNone &&
                          !Opts.DisableIntegratedAS;

is redundant. CGProfile.cpp is a no-op if no function provides getEntryFreq().

hans added a comment.Jul 8 2020, 4:53 AM

Opts.getProfileUse() != CodeGenOptions::ProfileNone in

Opts.CallGraphProfile = Opts.getProfileUse() != CodeGenOptions::ProfileNone &&
                          !Opts.DisableIntegratedAS;

is redundant. CGProfile.cpp is a no-op if no function provides getEntryFreq().

It's a functional no-op, but it runs the BFI analysis, which as Nikita pointed out above adds some compile-time cost. Not scheduling the pass unless we're using profile info seems like a reasonable way to avoid that cost to me.

The alternative of using LazyBlockFrequencyInfoPass and checking PSI->hasProfileSummary() first would also work I guess. If you think that's cleaner, maybe that's the better way to go.

zequanwu updated this revision to Diff 276526.Jul 8 2020, 12:36 PM
  • Remove "enable-call-graph-profile" option and enable CGProfilePass by default, unless -no-integrated-as is given in clang.
  • Use LazyBlockFrequencyInfoPass instead of BlockFrequencyInfoWrapperPass and check F.getEntryCount before get BFI to reduce cost.
zequanwu marked an inline comment as done.Jul 8 2020, 12:41 PM

The alternative of using LazyBlockFrequencyInfoPass and checking PSI->hasProfileSummary() first would also work I guess. If you think that's cleaner, maybe that's the better way to go.

Since PSI->hasProfileSummary() is not necessary for this pass, it relies on function entry count. So, I check for F.getEntryCount() before getting BFI.

The alternative of using LazyBlockFrequencyInfoPass and checking PSI->hasProfileSummary() first would also work I guess. If you think that's cleaner, maybe that's the better way to go.

Since PSI->hasProfileSummary() is not necessary for this pass, it relies on function entry count. So, I check for F.getEntryCount() before getting BFI.

Thanks. The last update looks good to me. I'll defer the approval to @nikic and folks who have expressed concerns about deleting legacy PM.

nikic accepted this revision.Jul 9 2020, 11:42 AM
llvm/test/CodeGen/AMDGPU/opt-pipeline.ll
285

This test is out of date.

hans accepted this revision.Jul 9 2020, 12:11 PM

Sounds great!

lgtm2 (with the test update Nikita mentioned)

zequanwu updated this revision to Diff 276808.Jul 9 2020, 12:40 PM

Update test case.

This revision was not accepted when it landed; it landed in state Needs Review.Jul 9 2020, 1:04 PM
This revision was automatically updated to reflect the committed changes.

Some inline nits. I see you've already committed and that's fine - I still don't think we should do it, but we can delete it again soon :)

clang/lib/CodeGen/BackendUtil.cpp
623

Comment here as to why.

1150–1152

Comment here as to why.

1570–1572

Ditto :)

llvm/lib/Transforms/Instrumentation/CGProfile.cpp
63

Extra space? Did clang-format put this in?

65–69

Comment? What's the change for?

MaskRay reopened this revision.Jul 9 2020, 1:42 PM
MaskRay requested changes to this revision.
This revision now requires changes to proceed.Jul 9 2020, 1:42 PM
zequanwu updated this revision to Diff 276834.Jul 9 2020, 2:15 PM

Add comments and fix test failure in http://45.33.8.238/linux/22500/step_7.txt.

zequanwu marked 5 inline comments as done.Jul 9 2020, 2:16 PM
zequanwu added inline comments.
llvm/lib/Transforms/Instrumentation/CGProfile.cpp
63

Yes, clang-format put this in.

hans accepted this revision.Jul 10 2020, 1:50 AM

Still lgtm. For what it's worth, I think you could have just re-committed with the fixes rather than uploading for review again.

This revision was not accepted when it landed; it landed in state Needs Review.Jul 10 2020, 9:05 AM
This revision was automatically updated to reflect the committed changes.

Still lgtm. For what it's worth, I think you could have just re-committed with the fixes rather than uploading for review again.

Gotcha, thanks.

Still lgtm. For what it's worth, I think you could have just re-committed with the fixes rather than uploading for review again.

This may be a difference of habits but I usually upload the last revision if it contains anything more than comment changes. The reviewed version might be read by posterity to get a quick overview about the patch. Browsing git log -p is not very convenient at times.

llvm/test/CodeGen/AMDGPU/opt-pipeline.ll