- User Since
- Jan 23 2018, 4:17 PM (138 w, 4 d)
Wed, Sep 16
Tue, Sep 15
Minor change to test case
Thu, Sep 10
Fri, Aug 28
Thu, Aug 27
I won't have time to look into this in the near future unfortunately, as I am mainly focused on getting things in good shape to enable DSE & MemorySSA by default. It might be good to file a bug report so this opportunity does not drop off our radar. If anyone is interested in giving implementing it a try, even better :)
Wed, Aug 26
@fhahn is this patch supposed to support the following case?
Sun, Aug 23
Aug 18 2020
Added description of the algorithm to the header + formatting.
Aug 17 2020
I think making 3 transformation passes immune to optnone ruins the whole idea. You really should have mechanism to set very specific and well documented overrides to the default.
I think this should be put on hold until D85781 reworked and relanded again.
Aug 16 2020
How your last commit differs from the approved one? Why you have so many additional changes in the tests?
Aug 13 2020
I don't propose anything completely new here. New algorithm does essentially the same as existing one but
- fixes several correctness issues
- extends the approach to handle loops and invokes in a universal way
Aug 12 2020
Added a comment to updateEstimatedBlockWeight
Aug 11 2020
Unfortunately, in most cases, I don't see how this can be done with out loosing current coverage. There are few cases like 'successors: %bb.3(0x80000000), %bb.4(0x00000000)' where we could remove exact numbers and just check block names. I think we better keep exact probabilities in all other cases. That actually helps to catch unintended changes.
Do you have performance numbers related to the change?
I measured performance on bunch of java related benchmarks we have in house including SPECJVM2008, SPECJbb2015, DaCapo9.12 and others. I don't see any noticeable impact on performance.
Any perf data?
Aug 6 2020
Aug 4 2020
Updated according to comments + formatting.
Jul 31 2020
It's a bit shorter now. Let's see if it's manageable now. Further reduction would be problematic.
Rebased on top of D84838
Jul 30 2020
This change causes problems to us. Even though "enable-npm-optnone" is false by default it affects us because we use custom pass instrumentation callback which causes passes to be skipped under some conditions.
Jul 29 2020
Jul 28 2020
One more general comment. If we introduce budget for total number of scanned instructions do we still need to limit number of visited blocks?
Marked as WIP since I'm going to extract one more part out from this review.
I would suggest to separate an introduction of additional limit for a total number of scanned instructions and reduction of the current limit for number of scanned blocks to a different change sets. Also besides compile time numbers we should evaluate overall impact on performance before making such changes.
Jul 27 2020
Updated as requested.
Jul 24 2020
I have moved part of the change into a separate review D84514. Let's start with that...
Jul 21 2020
I'm glad to see this discussion alive again. Thanks Roman!
Jul 20 2020
Jul 9 2020
I measured performance on bunch of java related benchmarks we have in house including SPECJVM2008, SPECJbb2015, DaCapo9.12 and others. I don't see any noticeable impact on performance. Please let me know if you need more details on these experiments or some additional perf data.
Jun 26 2020
Jun 25 2020
Sorry for keeping silence for that long. I had to work on higher priority tasks before I got a chance to return back to this. I think I fixed all your remarks. Please take a look.
Fixes according to comments + rebase
Jun 17 2020
This change caused verification to catch "invalid" IR produced by instcombine. Filed a bug for that https://bugs.llvm.org/show_bug.cgi?id=46362
May 27 2020
May 26 2020
May 20 2020
May 19 2020
May 8 2020
Number of changes:
- Previous version had an issue and was not able to propagate estimated weight from loop exits (in case more than 1 exit) to its entries. To fix that we introduce LoopData structure and trakc weight for loops explicitly.
- Returned old behavior for 'invoke' heuristics to dominate 'cold' heuristics.
May 7 2020
I think the new API of setting probabilities for all edges "atomically" is a good move. There is one suggestion though. In fact, there is no need to "repack" vector of probabilities into map. We can simply copy or even move (in most cases we have rvalue) entire vector into a map. This can be done as a separate change though.
May 6 2020
Apr 30 2020
Apr 29 2020
Another attempt to fix failing lit test.
Fix Other/new-pm-thinlto-prelink-pgo-defaults.ll failing on AMD.
Request cached PDT through AM
Pass arguments directly to constructor.
Apr 28 2020
Functionally the change looks ok.
I think we should improve current API of BP (not in this change though) because it's easy to break invariant that sum of all probabilities is 1.
We better disallow cases when only part of edges have probability set and others unset. 1/NumSuccesors will likely be incorrect value in most cases with uneven distribution.