User Details
- User Since
- Jun 28 2018, 9:57 PM (133 w, 1 d)
Yesterday
Thanks for your review @MaskRay
Also thanks Hubert for your good comments. @hubert.reinterpretcast
That is quite interesting, why they designed the feature to work that way. Is it recommended to reference debug sections through the label minus the length field size (4 or 12) or they provide some means to simplify the calculation?
Thu, Jan 14
fix according to comments:
1: unchanged the Offset type
2: fix Lint warnings
Mon, Jan 11
gentle ping
Mon, Jan 4
gentle ping
Appreciate your review @jsji . Could you please help to have a look at this patch's patent https://reviews.llvm.org/D92069. Thanks again.
Wed, Dec 30
1: set the register pressure factor to 1.5
Sun, Dec 27
Wed, Dec 23
gentle ping
fix lint warnings
Tue, Dec 22
Mon, Dec 21
perf test shows some degradations together with some improvements. Plan change for now to get more tuning.
Since @efriedma has concerns about adding more hook in machineLICM pass, and the RA works as expected(@qcolombet please correct me if you find there is any issue in RA process in above comments), I made a new solution for this issue:
instead of hoisting all remat instructions, we now do:
// For remat instructions which are inside current working loop, we should // always hoist them. // For remat instructions which intend to be hoisted to outer parent loop, we // only hoist non-cheap ones as RA can not pull all remat instructions down to // inner loop as it will first try to split them in outer loop.
@qcolombet @efriedma , sorry for the late response and the long main. I made a summary for this issue.
Sun, Dec 20
Fri, Dec 18
address @arsenm comments:
1: handle the case when hoisted and dup registers are both dead.
address @arsenm comments:
1: do not use const_cast
2: add comments for the test point in test case.
update the comments and gentle ping
Dec 17 2020
I should have said this in my first comments. I like this refactor. I just want to make sure I understand this refactoring more clear^_^. Thanks for your detailed explanation. @nemanjai
Dec 16 2020
update according to @jsji comments
The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen.
I think it is better to explicitly give some reasons why we need this big refactoring, in other words, what's the disadvantage/limitation of legacy implementation? Thank you for the big effort.
Dec 15 2020
Thanks for clean-up. Some minor comments.
1: add side effect flag for instructions too.
Dec 13 2020
gentle ping...
Thank you! @spatel
Dec 11 2020
Dec 10 2020
1: update according to parent patch https://reviews.llvm.org/D92068 changes
1: address @spatel comments
1: don't require LiveIntervals analysis pass to estimate register pressure.
Hi @spatel @lebedev.ri , I changed the patch not requiring LiveIntervals pass. Now this patch is NFC.
1: don't require LiveIntervals analysis
2: revert the test case change as now this should be NFC
Dec 9 2020
gentle ping...
gentle ping...
Dec 8 2020
Dec 7 2020
Thanks for your comments @spatel @lebedev.ri
Dec 6 2020
1: fix comments
Dec 4 2020
@spatel Very much appreciate your comments. It is a good start for this patch to move on.
Dec 3 2020
gentle ping
lint warning fix
1: fix lint warning messages
2: fix x86 CodeGen/X86/coalescer-dce.ll verify error after liveintervals introduced in machine combiner pass
Dec 1 2020
thanks for your review, I will update the lint warning in the commit
This is more clear than the previous version. Some implementation comments.
Nov 26 2020
gentle ping
Nov 24 2020
Nov 19 2020
LGTM. Thanks for the new fix.
Nov 18 2020
After a discussion with the group I would like to correct what I said in the previous post.
There already is a plan to do this in ISel in a different patch. The reason we also want to do this optimization here is to try to catch situations where this pattern is not known in ISel and only appears after other optimizations later on. Ideally we do not want to have any situations where the XForm exists in the final binary and having this final check in the PreEmitPeephole should ensure that. Basically, we also want to do this check here to find anything that ISel may have missed.
Nov 17 2020
I saw some testcases are removed compared with your committed version. Is there any reason?
Nov 15 2020
ping
Nov 11 2020
Using dform with offset 0 can save one register r0/X0, this is benefit for register allocation? But adding it in PPCPreEmitPeephole pass which is after register allocation will make the benefit gone.
Maybe we need to do it before register allocation? For example at the place where the x-form with zero register is generated.