Since this is a LangRef change, I suggest sending the summary of the patch as an RFC to the llvm-dev list first to hear opinions from the community.
Wed, Dec 2
For documentation, please update docs/LangRef.rst.
Sat, Nov 21
Thanks for the background. Regarding D90708 about the LangRef change, was there an RFC before the patch?
This patch does not add a new test case demonstrating what it is trying to fix, nor does it reference a bug number showing the bug end-to-end.
Fri, Nov 20
Thu, Nov 19
Wed, Nov 18
Can you split this patch into two: one for verification, the other for the fix?
Thu, Nov 12
Given Wei's feedback, I am ok with the current design now, including the external sample format.
Wed, Nov 11
Tue, Nov 10
Oct 31 2020
Oct 29 2020
Oct 28 2020
How about inline cost analysis? It needs to skip the new instructions. Similarly for the Partial inliner, the static cost of this should be set to zero.
Longer term, the profile will be dumped into PGO's raw file, so for now is there a need for a user-level option? Would an internal option be good enough?
There should be related LLVM-side changes. Are they in a different patch?
Oct 22 2020
Oct 21 2020
Oct 13 2020
Ok, makes sense. Basically, since the inner loop is laid out first, the outer loop cannot individually split out the blocks already laid out. How does the problem manifest: wrong profile data, a compiler crash, or something else?
Oct 9 2020
The profile runtime has 4 ways of setting the output: 1) the default; 2) a compiler command line arg; 3) an environment variable at runtime; and 4) user invocation of the runtime API at runtime. The order of precedence is 4 > 3 > 2 > 1. Is the behavior here consistent with that?
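To make the precedence question concrete, here is a minimal sketch of the 4 > 3 > 2 > 1 rule. The struct, the field names, and the convention that an empty string means "not set" are all hypothetical and chosen for illustration; this is not the actual profile runtime code.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the four sources listed above. An empty string is
// assumed to mean "this source was not set".
struct ProfileOutputSources {
  std::string Default;     // 1) built-in default (always set)
  std::string CommandLine; // 2) compiler command line arg
  std::string Environment; // 3) environment variable at runtime
  std::string RuntimeAPI;  // 4) user invocation of the runtime API
};

// Returns the name from the highest-precedence source that is set,
// following the order 4 > 3 > 2 > 1.
inline std::string resolveProfileOutput(const ProfileOutputSources &S) {
  if (!S.RuntimeAPI.empty())
    return S.RuntimeAPI;
  if (!S.Environment.empty())
    return S.Environment;
  if (!S.CommandLine.empty())
    return S.CommandLine;
  return S.Default;
}
```

A patch that is consistent with this order would, for example, let an environment variable override a name given on the compiler command line, but not a name set through the runtime API.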
Are the new PM inliner issues being looked into too?
Oct 2 2020
Wait a few days in case others have comments.
Then why modify existing tests at all? How about adding an explicit test using the i64 type?
Perhaps add a new test to demonstrate it (that i32 profile weight still works)?
Oct 1 2020
Sep 30 2020
I like the direction of the patch. There is one concern -- it makes the IR not backward compatible which may affect some users. +vsk.
Sep 25 2020
The reason is that the always inliner is still not on par with the regular inliner -- if you look at the IFI setup, it is missing BFI information.
Perhaps also add a description of the option to the clang manual.
Sep 24 2020
Looks good. Makes the tsan and instrumentation interaction also cleaner.
Sep 22 2020
Sep 21 2020
I am ok adding an internal option to control this with it being default for now.
Sep 17 2020
One major problem with this patch is that the AlwaysInliner is naive: it is not aware of profile information and thus won't be able to perform a proper profile update after inlining always-inline callees. This can be an issue for Frontend PGO or when ThinLTO/LTO is on (where there are cross-module calls to always-inline functions -- though that is not common).
Is it possible to add a test case?
Sep 15 2020
The AlwaysInliner does not operate on a DAG, but depends on function order in the module, so it is possible to regress with the change. I don't have a strong opinion on this, but you may want to hear others' opinions.
Sep 14 2020
What is the compile time implication?
LGTM (if the option is documented, the documentation part also needs to be updated).
Adding a test to make sure they are in sync will be useful.
Sep 13 2020
Change in this direction is welcome. Added vsk and beanz to comment on the implication on runtime builds.
Sep 11 2020
Can you share the performance data?
Sep 10 2020
Sep 8 2020
This is in theory similar to profile guided size optimization -- the difference is that optnone may not result in size reduction.
Sep 2 2020
For the x86 target, should it be turned on when the -fprofile-use= option is specified, unless -fno-split-machine-function is specified?
Sep 1 2020
I accidentally threw in a third choice :). Yes, -fmemory-profile is fine, which is consistent with -fsanitize=
I am fine with -fmemory-profiler.
Aug 19 2020
A heads up -- I won't be able to review the patch until mid-September. Hope this is fine.
looks good to me. Wait to see if Craig has any more comments.
Aug 18 2020
Aug 17 2020
I suggest the following steps to commit this patch:
Aug 14 2020
lgtm. The user option should be documented after the runtime is ready.
I think the user visible option needs to match the functionality. Internal naming just needs some comments to document.
I mean the check llvm::shouldOptimizeForSize(L->getHeader(), PSI, BFI, PGSOQueryType::IRPass) on the loop header.
Aug 13 2020
One nit: since the same instrumentation can be used to profile global variable accesses (especially those accessed indirectly), the option name seems to exclude those cases. Shall it be renamed to fmem-prof?
Aug 12 2020
Aug 11 2020
If shouldOptimizeForSize is not available for loop analysis, should the check be removed unconditionally instead of being gated on whether there is a hint?
Aug 10 2020
A lot of test changes can probably be extracted out as NFC (to make the string test more robust) -- this will reduce the size of the patch.
Please also document the new commands in docs/CommandGuide/llvm-profdata.rst.
Aug 8 2020
Aug 5 2020
Regarding the reroller -- a compiler with PGO will adjust the aggressiveness of the unroller based on an estimate of the instruction working set size. Doing this in a later pass or in Propeller can help catch cases that are mishandled.
Aug 4 2020
lgtm (as Eugene pointed out -- use std::accumulate instead).
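For reference, a minimal sketch of what replacing a hand-written summation loop with std::accumulate can look like. The helper name sumCounts and the element type are hypothetical; the code in the patch under review is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: sums a list of 64-bit counts.
inline std::uint64_t sumCounts(const std::vector<std::uint64_t> &Counts) {
  // The type of the initial value determines the accumulator type, so pass
  // a 64-bit zero rather than a plain int 0 to avoid truncating the sum.
  return std::accumulate(Counts.begin(), Counts.end(), std::uint64_t{0});
}
```

The explicit 64-bit initial value is the detail most worth checking when making this kind of change, since a literal 0 would make the running sum an int.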
What is the use case of the new APIs?
Aug 3 2020
LGTM (how much size savings can we expect?)
It is ok to push this patch, but please update the summary of this patch from https://reviews.llvm.org/D84766 plus the additional improvement bits. Also document the background history a little.
Looks good to me.
Jul 31 2020
Jul 30 2020
I think the test should use profile-gen option to compare generated hash of otherwise identical functions (except for memop).
Jul 29 2020
Right. It occurred to me during review, but I did not realize that the hard-coded values in proftext depend on LE.
Just realized that we need a test case to show it fixes the original issue (existence with memop --> different hash). Ok as a follow-up.
lgtm (with the small test enhancement)
Jul 28 2020
Changes like those in llvm/test/Transforms/PGOProfile/PR41279.ll etc. can be committed independently.