Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looks great, thanks! Just some minor comments.
llvm/include/llvm/InitializePasses.h | ||
---|---|---|
200 | I think these declarations should be in alphabetic order. | |
llvm/include/llvm/Transforms/IPO.h | ||
285 | ultra nit: I would probably keep the empty line before the } here | |
llvm/lib/Transforms/Instrumentation/CGProfile.cpp | ||
104–109 | Might as well declare it 'final' while you're here. |
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp | ||
---|---|---|
839 | In the new pm this pass seems to be added conditionally based on some flag: if (PTO.CallGraphProfile) | |
llvm/lib/Transforms/Instrumentation/CGProfile.cpp | ||
133 | Should it be "cg-profile" to match the npm pass name? Arthur has been working on making such pass names equal recently. |
llvm/tools/opt/opt.cpp | ||
---|---|---|
281 ↗ | (On Diff #275558) |
llvm/tools/opt/opt.cpp | ||
---|---|---|
281 ↗ | (On Diff #275558) | Does GNU assembler recognize .cgprofile section now? I think we should keep this option as long as there is still usage of other than integrated assembler. |
llvm/tools/opt/opt.cpp | ||
---|---|---|
281 ↗ | (On Diff #275558) |
I don't think it will ever support this section.
Can you give a link about the use case? |
llvm/tools/opt/opt.cpp | ||
---|---|---|
281 ↗ | (On Diff #275558) | I have a link of the effort to migrate Linux kernel to integrated assembler in ChromeOS: https://bugs.chromium.org/p/chromium/issues/detail?id=1020923 I think they are only to migrate newer versions thus the old linux kernels are still using GNU assembler, and all kernels are built with LLVM now. |
llvm/tools/opt/opt.cpp | ||
---|---|---|
281 ↗ | (On Diff #275558) | If we have to have an option, we should make both legacy pm and new pm use the same option. We should not create two cl::opt. |
Adding Chandler and Alina here as well.
In general, I don't think that this is such a great idea. Being able to have this sort of thing work more reliably is one of the reasons for the new pass manager. I think I'd like to see this split out into an old versus new pass manager pass to avoid the difficulty of cleaning this up after we finish migrating llvm to the new pass manager. This also seems to add some technical debt around options and other enablement which is also less than ideal. Is this compelling to add right now versus finishing work migrating llvm completely to the new pass manager and removing the old one? From speaking with Alina I think that work should be done in a short while.
Thanks.
-eric
I don't think we're that close yet, probably at least a couple months out, there are lots of loose ends to be tied up. I'll make a post soon in llvm-dev (maybe first we can sync up again) about what I think needs to be done before the NPM switch.
Given how long the new pass manager has been in progress, we definitely don't want to block on enabling it. So yes, porting this pass to the current pass manager is compelling to do right now. I also don't see why it should be a big deal.
As for splitting it into separate passes, this patch technically does that, although it extracts and changes the core code a bit so it can be shared between the passes. I think that's how most passes have been adapted to work with both pass managers, no?
Delete enable-npm-call-graph-profile option for NPM, using enable-call-graph-profile for both LPM and NPM.
+1 to sync up again and make progress towards the NPM switch.
I don't want to block this patch, but I do agree with Eric's point. We *really* want to focus more on the switch then invest into more LPM infra. Short term resolutions to unblock folks, with the best split possible, sure, keeping in mind they'll need to be cleaned up. But I don't want us to lose focus on tying up the remaining loose ends for the switch.
I think it's critical for LLVM's codebase health to focus on the NPM switch in the next couple of months.
I don't want to block this patch, but I do agree with Eric's point. We *really* want to focus more on the switch then invest into more LPM infra. Short term resolutions to unblock folks, with the best split possible, sure, keeping in mind they'll need to be cleaned up.
Sounds good to me.
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 | Is it possible to switch this pass to use LazyBPI / LazyBFA, only fetched if PGO is actually in use? PGO functionality that most people don't use adding expensive analysis passes like PDT should be avoided. |
llvm/lib/Transforms/IPO/PassManagerBuilder.cpp | ||
---|---|---|
170 | Oh, just noticed: I think CallGraphProfile should be initialized along with the other flags here. | |
llvm/test/Other/opt-O2-pipeline.ll | ||
289 | I wonder if just switching to LazyBlockFrequencyInfo would help though. It looks to me like the CGProfile would request info about each function anyway. I was surprised to see that Clang sets Opts.CallGraphProfile solely based on whether the integrated assembler is used. Maybe a better fix is to only set that to true when a profile is actually being used? |
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 |
It would only help if there is some way to only fetch the analysis conditionally. I believe many PGO passes use something like PSI.hasProfileSummary() or F.hasProfileData() for that.
Right, just disabling this by default in clang/opt would also work. For reference, the current compile-time numbers for this patch: https://llvm-compile-time-tracker.com/compare.php?from=516ff1d4baee28b1911737e47b42973567adf8ff&to=8df840660bb764b6653fcfd9ac7a72cc6adebde6&stat=instructions Not huge, but it adds up (some similar regressions have been introduced in LLVM 10). |
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 | Do you mean disabling it just for LPM or both? |
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 |
For Clang, a better fix I think is that Opts.CallGraphProfile should based on both whether the integrated assembler is used and whether profile instrumentation is turned on. What do you think? |
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 | I'd prefer not having CallGraphProfile
|
llvm/test/Other/opt-O2-pipeline.ll | ||
---|---|---|
289 | As discussed above, I think CGProfilePass should be disabled by default in clang unless -no-integrated-as is not given and -fprofile-instrument-use-path= is given. So, Opts.CallGraphProfile is a convenient switch for that. |
- Disable enable-call-graph-profile by default in opt.
- Disable CGProfilePass by default in clang unless -no-integrated-as is not given and -fprofile-instrument-use-path= is given, as this pass only generates module metadata when profile data is given.
I still haven't seen a strong argument keeping a command line option -enable-npm-call-graph-profile. Asked in D62627.
Opts.getProfileUse() != CodeGenOptions::ProfileNone in
Opts.CallGraphProfile = Opts.getProfileUse() != CodeGenOptions::ProfileNone && !Opts.DisableIntegratedAS;
is redundant. CGProfile.cpp is a no-op if no function provides getEntryFreq().
It's a functional no-op, but it runs the BFI analysis, which as Nikita pointed out above adds some compile-time cost. Not scheduling the pass unless we're using profile info seems like a reasonable way to avoid that cost to me.
The alternative of using LazyBlockFrequencyInfoPass and checking PSI->hasProfileSummary() first would also work I guess. If you think that's cleaner, maybe that's the better way to go.
- Remove "enable-call-graph-profile" option and enable CGProfilePass by default, unless -no-integrated-as is given in clang.
- Use LazyBlockFrequencyInfoPass instead of BlockFrequencyInfoWrapperPass and check F.getEntryCount before get BFI to reduce cost.
The alternative of using LazyBlockFrequencyInfoPass and checking PSI->hasProfileSummary() first would also work I guess. If you think that's cleaner, maybe that's the better way to go.
Since PSI->hasProfileSummary() is not necessary for this pass, it relies on function entry count. So, I check for F.getEntryCount() before getting BFI.
Thanks. The last update looks good to me. I'll defer the approval to @nikic and folks who have expressed concerns about deleting legacy PM.
LG from my side.
New compile-time numbers: https://llvm-compile-time-tracker.com/compare.php?from=0b39d2d75275b80994dac06b7ad05031cbd09393&to=fd070b79e063fff2fad3cd4a467f64dfca83eb90&stat=instructions It's nearly neutral now.
llvm/test/CodeGen/AMDGPU/opt-pipeline.ll | ||
---|---|---|
285 | This test is out of date. |
Some inline nits. I see you've already committed and that's fine - I still don't think we should do it, but we can delete it again soon :)
clang/lib/CodeGen/BackendUtil.cpp | ||
---|---|---|
623 | Comment here as to why. | |
1150–1152 | Comment here as to why. | |
1570–1572 | Ditto :) | |
llvm/lib/Transforms/Instrumentation/CGProfile.cpp | ||
63 | Extra space? Did clang-format put this in? | |
65–69 | Comment? What's the change for? |
llvm/lib/Transforms/Instrumentation/CGProfile.cpp | ||
---|---|---|
63 | Yes, clang-format put this in. |
Still lgtm. For what it's worth, I think you could have just re-committed with the fixes rather than uploading for review again.
This may be a difference of habits but I usually upload the last revision if it contains anything more than comment changes. The reviewed version might be read by posterity to get a quick overview about the patch. Browsing git log -p is not very convenient at times.
Comment here as to why.