- User Since
- Feb 20 2015, 10:57 AM (266 w, 4 d)
Friendly ping. Because the fix for the patch contains some substantial change so I want it to be reviewed before I recommit the patch. Thanks in advance!
Thanks Guillaume for the fix!
Fri, Mar 27
Address David and Wenlei's comments.
Tue, Mar 24
reopen it for the review of the patch to be recommitted.
The original committed patch was reverted because a test failure in libc++ was reported.
Fix an unwinding problem exposed in libc++ test.
Fri, Mar 20
Thu, Mar 19
Tue, Mar 17
Mon, Mar 16
Wed, Mar 4
Thanks for the review and helpful suggestions!
Tue, Mar 3
Address @thegameg's comment.
Mon, Mar 2
Feb 22 2020
Address @thegameg's comment.
Feb 18 2020
Thanks for catching the mistake!
Feb 9 2020
Feb 4 2020
Jan 30 2020
Jan 27 2020
Address Florian's comments.
Jan 16 2020
Jan 15 2020
Jan 14 2020
Jan 10 2020
The additional pipeline testing will catch any future pass change to the pipeline. A related but separate question is do we have a way to check whether there is any other missing pass in thinlto newpm similar as that in D72386?
Jan 9 2020
Thanks for the review!
Address Teresa's comment.
Enable loop and slp vectorization in O2/O3 (Os for opt as well) in gold plugin, lld, opt and llvm-lto2. Add related tests.
Jan 8 2020
Teresa, do we have existing tests where I can add some check to see if some passes are executed for the lld and gold plugin changes?
Jan 7 2020
update flag in the test.
Jan 6 2020
Dec 13 2019
Dec 11 2019
Dec 6 2019
Dec 5 2019
Did performance test and I saw 0.4% improvement in an internal benchmark. That is a good improvement, thanks for the change!
I rerun perf test and I don't see any performance change. Last run I saw very small improvement on latency in a benchmark. This is fine since the benchmark has some fluctuation by itself.
Dec 2 2019
Nov 27 2019
I did performance test for this change and the result is neutral.
Nov 25 2019
The change makes sense to me. The only concern is for inline instance getEntrySamples is not precise and sometimes can have a large difference with the actual head count. It will be interesting to see the actual effect. I will test it combined with D70655 on my side.
That is a good catch. From my understanding it could potentially reduce inlining in some cold places and potentially increase regular inlinling. Do you see how much impact on performance and code size by changing it? I will do some evaluation on my side.
However the reality is most of the inlining actually happens during regular inliner.
Nov 21 2019
Nov 19 2019
Maybe we can rename current ThreadEdge to TryThreadEdge and let TryThreadEdge call ThreadEdge to do the transformation?
Nov 15 2019
Nov 14 2019
Nov 6 2019
Nov 5 2019
I saw 0.3% improvement in our server benchmark. That is a great improvement!
Sorry I missed the patch. Thanks for fixing the test!
Sorry, there is a little more refactoring which could be done.
I don't have further comment. It LGTM. I am doing some performance test for it and will report back its performance impact on our side (expect to be positive).
Nov 4 2019
Could you add a testcase to verify that no profile update for inlining in sampleloader pass?
Nov 3 2019
Thanks for the patch!
Oct 31 2019
Oct 29 2019
Looks like a very helpful patch. I saw the problem showing up at multiple places when I was looking at a halide testcase and I was wondering what was wrong. I can try it out and see if the patch can fix it.
Oct 18 2019
Oct 16 2019
Address Wenlei's comment.
Oct 15 2019
Address Wenlei and David's comments.
Change SampleProfileReaderItaniumRemapper from a proxy of SampleProfileReader to a helper object inside of SampleProfileReader, and it is used in two places: one in SampleProfileReader::getSamplesFor for looking up a function sample and one in SampleProfileReaderExtBinary::readFuncProfiles for loading profiles on demand.
Thanks Wenlei and David for the reviews.
Oct 14 2019
Oct 12 2019
I change the testcase a little so the terminator won't be ret, but the generated code pattern is the same. Should it be handled as well?