User Details
- User Since: Aug 6 2015, 11:49 PM (398 w, 2 d)
Nov 26 2020
LGTM.
Sep 21 2020
We could perhaps commit it without enabling it to begin with.
Sep 20 2020
@SjoerdMeijer, I can confirm that D51160 with this scheduler model shows an improvement, and it has been successfully applied to our in-house compiler.
Also, there were no performance issues or regressions with this patch, so I think you can commit it with the minor fixes @evgeny777 mentioned before.
May 29 2018
Hi Javed.
May 28 2018
When I tested this patch, it showed some improvements on several benchmarks, such as SPEC2000/2006.
But other benchmarks, such as Dhrystone and the sub-workloads of a commercial benchmark, showed performance degradation.
May 16 2018
I will update the benchmark results by next week.
Benchmark list: Dhrystone, SPEC2000, SPEC2006, and one commercial benchmark.
Apr 26 2018
Apr 17 2018
Apr 12 2018
Addressed SjoerdMeijer's new comments. Thanks.
Apr 11 2018
Sep 26 2017
Mar 27 2017
Thanks, Benjamin.
Committed in r298895.
Mar 6 2017
@evandro
This code review covers the case where some targets prefer CSEL over CSINV and CSINC.
Mar 5 2017
Changed the feature name from PreferCSEL to FastCSEL.
Addressing Renato's and James's comments.
Thanks.
Addressing Benjamin's comment.
Thanks.
Mar 2 2017
Thanks for your opinion and example, Renato.
Mar 1 2017
After an internal (Samsung-only) discussion, reverted the Exynos-M3 part of the patch.
Feb 27 2017
Thanks for the comment, Renato.
Feb 24 2017
Addressing Matthias's comment.
Thanks.
Sep 28 2016
Aug 15 2016
Thanks, Hal and Tim.
Aug 2 2016
OK, Tim.
Or how about not inserting the prefetch intrinsic when the loop contains inline asm?
I think users who write inline asm know whether a prefetch is necessary or not.
Thanks for the comment, Tim.
Aug 1 2016
ping2.
Jul 24 2016
ping?
Jul 18 2016
Is Diff 64111 too large a modification?
Jul 15 2016
Addressed Hal's comments.
Jul 14 2016
Jul 6 2016
Jun 21 2016
Thanks for the review, Hal!
May 8 2016
Apr 26 2016
LGTM.
Apr 25 2016
Committed in r267502.
Apr 24 2016
Apr 22 2016
Apr 13 2016
Hi Chad.
Apr 11 2016
I think this optimization will affect all ARM architectures. Is this optimization also good for Cortex-A57?
Apr 8 2016
Apr 7 2016
Addressed Gerolf's comments.
Apr 5 2016
Apr 4 2016
Sure, sorry I missed that. I looked at this too long, I guess :-). It is principally the same 'better ILP' story as for integers. The prototypical idea is this: imagine two fmul operands feeding the fadd. When the two fmuls can execute in parallel, it can be faster to issue fmul, fmul, fadd rather than fmul, fmadd.
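The ILP argument above can be illustrated with a toy dependency model. This is only a sketch: the latency numbers and the `critical_path` helper are made up for illustration and do not describe any real core, and the model assumes unlimited issue width so that only data dependences matter.

```python
def critical_path(schedule, latency):
    """Return the cycle at which the last result becomes available.

    schedule: list of (result_name, opcode, [producer names]) in issue order.
    latency:  dict mapping opcode -> latency in cycles (hypothetical values).
    """
    ready = {}
    for name, opcode, deps in schedule:
        # An instruction starts once all of its producers have finished.
        start = max((ready[d] for d in deps), default=0)
        ready[name] = start + latency[opcode]
    return max(ready.values())


# Hypothetical latencies, chosen only to make the effect visible.
LATENCY = {"fmul": 4, "fadd": 3, "fmadd": 7}

# a*b + c*d as one fmul followed by a dependent fmadd.
FUSED = [("p", "fmul", []), ("r", "fmadd", ["p"])]

# The same expression as two independent fmuls feeding an fadd.
SPLIT = [("p", "fmul", []), ("q", "fmul", []), ("r", "fadd", ["p", "q"])]

fused_cycles = critical_path(FUSED, LATENCY)
split_cycles = critical_path(SPLIT, LATENCY)
```

With these assumed latencies the split sequence finishes earlier because the two multiplies overlap. With other latency values the comparison can come out even or go the other way, which is why the choice depends on the target.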
Apr 3 2016
Mar 31 2016
Mar 30 2016
Hi Gerolf.
LGTM too, with a few minor nits.
Hi Gerolf.
Mar 29 2016
Mar 28 2016
Evandro, we have been waiting for your reply for over two weeks.
Could you please answer our questions or review the patch?
LGTM too!
Mar 26 2016
I also think this is a reasonable patch.
Mar 25 2016
I also think this is a reasonable change.
Mar 22 2016
Kindly ping ...
Mar 21 2016
Mar 18 2016
ping?
Mar 16 2016
Mar 14 2016
Rebased the patch against the latest trunk and modified comments.
Hi Evandro.
Mar 13 2016
Thanks for the review, Chad!
Mar 11 2016
Hi Sanjay.
I already explained by e-mail that we saw an improvement on a commercial benchmark.
Modified the patch so that it can take uArch information into account.
Mar 10 2016
Mar 7 2016
Mar 2 2016
Thanks, Chad!
Committed in r262580.