- User Since
- Apr 13 2015, 9:48 AM (235 w, 2 d)
Sun, Oct 13
Handling Wei's case would be a nice thing to have, but it may require a more significant change in JT. Currently (with this patch), the JT candidate BB selection is based on checking the conditional value used by a branch or the return value of a ret instruction.
Sat, Oct 12
Wenlei, this sounds like a good idea. Patches are welcome!
Fri, Oct 11
JumpThreading is basically basic block cloning followed by control flow simplification. This is just a special case where the second part is missing.
The code can be broken down and contributed in the following order:
SizeOpts and MachineSizeOpts changes can also be extracted into its own patch.
Fri, Oct 4
llvmprof command documentation also needs to be updated.
Thu, Oct 3
+petr, who has similar needs for the Fuchsia platform
Tue, Oct 1
This patch should add test cases for each affected codegen pass.
Sun, Sep 29
looks fine to me, but please wait for evghenii@'s comment.
lgtm, but please wait to see if xur@ has more comments.
Fri, Sep 27
This generally looks good, but I suggest changing the name of ValueProfileOracle to ValueProfileCollector to make it explicit.
Fri, Sep 20
Some unit tests can be added to llvm/unittests/ProfileData/InstrProfTest.cpp -- there are existing profile summary tests that can be extended.
Thu, Sep 19
Was any compile-time impact observed?
Sounds like a tricky situation to handle. Suppose the profile size exceeds some limit: what criteria should be used to trim it? Trim the symbol list, or drop more samples?
Is the intended workflow to create a profile with full symbol list and then have an option to trim symbol list according to size impact? Why not do the trimming at the time of the profile creation?
Tue, Sep 17
you are right. lgtm
Sep 15 2019
If we use NameTable, is ProfileSymbolList still needed? I assume NameTable is a superset?
Sep 11 2019
It is useful to have generic interface like isHotCountNthPercentile(..) to check if a count is hot relative to a given percentile cutoff. In theory, PGSO does not have a fixed hotness threshold as it depends on workingset size. If we partition working set size into ranges like small, medium, large, huge, then PGSO can use different hotness thresholds based on that.
Sep 10 2019
I don't think we should use 'PGSO' in the profile summary class. Instead, we should differentiate it from 'hot' with other categories like VeryHot, LukeWarm, etc.
Sep 6 2019
some benchmark numbers?
Sep 4 2019
you can probably speed it up a little by using the insert interface, but it may make the code less readable. LGTM
Sep 3 2019
Can you split the patches into smaller ones -- at least 3: ProfileSummary API changes, TTI API changes, and the rest (which may need to be split too, but we will see).
A separate mode is fine. Buffering the errors and reporting them later is also useful (allowing more context info to be emitted).
yes, the old behavior should be the default. It raises the awareness of the problem (and merge errors are usually rare in reality).
Aug 30 2019
lgtm (please consolidate two test cases by factoring the common source code out and have two wrapper tests referencing it).
Aug 29 2019
This is a behavior change and I think it should be guarded with an option. The default of the option should be set to 'FailOnAnyWhenMerge' instead of 'FailOnAllWhenMerge'.
A suggestion on the naming: instead of using SymbolWhiteList, use 'ProfileSymbolList', or just 'SymbolList'. Otherwise the patch looks ok.
Aug 22 2019
LLC miss info should be very sparse, so mixing it with LBR may not bring a lot of benefit. Having a mixture of different types of profile data can lead to combinatorial growth in the number of readers. As the number of profile types increases, it can make the reader unmaintainable. That is why I suggest making it clean -- one type per section.
Aug 21 2019
llvm-profdata should also have a round trip test:
Aug 19 2019
Can you split the format extension change into a separate patch?
Aug 9 2019
static prediction is intended to help common use cases, so for corner cases where NaNs are used frequently, the user will need to resort to source annotation (such as __builtin_expect).
Aug 7 2019
Also, see some minor name suggestions.
Jul 29 2019
This version looks fine to me, but please let other reviewers weigh in as well.
Jul 25 2019
I suspect other affected cases are due to bad static profile data too.
Jul 24 2019
I did some testing of the patch with SamplePGO and it works fine. LGTM
It is also better to introduce a coldness factor parameter 'F', i.e., if the source block's count is less than 1/F of the header count, suppress the hoisting.
Jul 22 2019
Hal probably remembers more details. IIRC, the argument was that LICM provides IR canonicalization which can simplify analysis -- there were some examples about exposed vectorization opportunities etc.
This was tried before. IIRC, the conclusion was to implement a loop sinking pass to undo non-profitable LICM. The loop sinking pass was in D22778. Can the loop sinking pass be enhanced to handle the case here (I have not looked in detail)?
Jul 19 2019
carrot, can you help look into this? The cost model should account for the extra direct jumps introduced as well.
Jul 18 2019
Does the test case slow down with PGO or not? Also, do you have branch misprediction perf data? (Large slowdowns like this are usually triggered by such side effects.) Is there a trimmed-down version to demonstrate the issue?
Jul 11 2019
Can you put an example (text CFG) showing cases where tailDup is bad -- possibly as a source comment?
Jul 9 2019
Some end-to-end tests as in compiler-rt are probably helpful here.
Is it possible to split the patch into a NFC refactoring and one functional change?
Jul 4 2019
please flip the flag to false by default and check in the rest first.
Jul 3 2019
my test hit some other unrelated debug assert. Can you first change the debug assert into an error message, so that there is no need to use an assert-enabled build to test the option?
Does bursty-style sampling provide additional benefit? Non-bursty sampling can be cheaper:
Do you have a functional test case to show what this patch is trying to accomplish/fix?
Jul 2 2019
let me do some internal PGO build tomorrow with the option on and get back.