- User Since
- Feb 20 2015, 10:57 AM (122 w, 10 h)
Wed, Jun 21
Thanks for helping on fixing the bug!
Fri, Jun 16
Sat, Jun 10
Sorry for the delay. The rewrite based on SSA looks much cleaner now. About the algorithm, IIUC it tries to find loop based on define-use of tied operand or operand commutable with tied operand. However, I still have concern that the method can increase redundent copy sometimes.
Wed, Jun 7
One comment about simplifying the test. Other than that, LGTM.
Mon, Jun 5
Wed, May 31
LGTM. Only a minor comment.
Tue, May 30
I have no other comments. See whether Davide has further comments.
Sun, May 28
Fri, May 26
Thu, May 25
No more comments seen so prepare to commit the patch. Thanks for the review!
May 19 2017
Initialize commutative in Expression's constructor.
Fix a bug related with commutative: need to swap cmp predicate at the same time when we swap the operands.
May 18 2017
May 16 2017
Discussed with Chandler offline, and we decided to split the patch and tried to commit the store shrinking first.
May 14 2017
Overall looks good. Some minor comments inlined.
May 11 2017
May 10 2017
Chandler, Thanks for the comments. They are very helpful. I will address them in the next revision. I only replied some comments which I had questions or concerns.
May 9 2017
I change the DenseMap numberingExpression (mapping from value numbering to GVN::Expression) to std::vector. Although there is still some minor increase but better than before.
May 5 2017
Forgot to update the patch.
May 2 2017
May 1 2017
+ Andy, for the history on pre-RA-sched and misched.
Apr 28 2017
Address Eli, Matt and Chandler's comments.
Apr 26 2017
Some minor comments.
Apr 25 2017
Thanks for drafting the comments. It is apparently more descriptive and clearer, and I like the varnames -- (LargeVal and SmallVal), which are much better than what I used -- (OrigVal, MaskedVal). I will rewrite the comments based on your draft.
Apr 21 2017
Thanks for bearing with my poor English. I will fix the terminologies and comments according to your suggestions.
Apr 20 2017
I am thinking of another two issues:
- Is it possible that some of the BBs on the chain may be very big and we don't want to partial inline them?
- The existing pattern handles if (a || b || c ...) case, but it may not be easy to extend for cases like (a && b && c) and ((a && b) || c). Basically, we want to find a collection of bbs with small sizes starting from entry. The bb collection only have two exits. one of them is ReturnBlock and the other is NonReturnBlock. The edges inside of the collection can be of any pattern.
Apr 19 2017
Daniel, thanks for the comments.
Apr 18 2017
Apr 17 2017
Haicheng, the testcase LGTM. Thanks for working on it!
Sanjoy, thanks for all the helpful comments.
Apr 13 2017
Remove some code from the test that doesn't impact it.
Address Sanjoy's comments: Add another two proxy functions: getZeroExtendExprCached and getSignExtendExprCached.
Apr 12 2017
Address Sanjoy's comments: Use SmallDenseMap and lambda.
Address Sanjoy's comments.
- Implement local caches for getZeroExtendExpr and getSignExtendExpr.
- Generate the IR programatically for the testcase.
Apr 10 2017
Sanjoy, thanks for the comments.
Apr 5 2017
I am running into a problem about the testcase tail-merge-after-mbp.ll. I am working on a patch related with register allocation and it somehow triggers the tail merge before block placement, and breaks the checks in the test. The tail merge triggered is doing in exactly the same way as the test describes.
Apr 4 2017
I extended the test and now it took more than one hour on my sandybridge machine when built with clang in release mode.
I added early returns in getZeroExtendExpr/getSignExtendExpr for SCEVAddRecExpr with NW flag. Like the test shows, the compile explosion can only happen when the step of SCEVAddRecExpr is negative and NW flag can be marked. With the change, the test now takes less than one second.
Apr 3 2017
- Address Chandler's comments.
- Fix unittest errors.
- Add unittest for load shrinking part. Add the original motivation case as a unittest.
- Add cost evaluation for the case when there is multiple use node inside the shrinking pattern.
Chandler, thanks for the review and sorry about the delay of replying. It takes me a while to fix some issues of the patch found when I was adding test for the load shrinking part and doing the unittest.
Mar 25 2017
Quentin, it is really a good finding, thanks a lot! I was cheated by the large amount of reassociation candidates and I have verified the non-linear increase of compile time is indeed because of SCEVExpand!
Mar 24 2017
Revamp the patch.
Mar 8 2017
Thanks for the comment.
Mar 6 2017
Could you add a test-case like this?
Sure. I will add such testcase after other major issues are solved.
Mar 4 2017
Although I did't find regression in internal benchmarks testing, I still moved the transformation to codegenprepare because we want to use TargetLowering information to decide how to shrink in some cases.
Update patch according to Eli's comments.
Feb 27 2017
Your testcase has dead instructions (%conv).
Will fix it.
Fixed a typo in the testcase.
Add a testcase by checking the LSR debug output.
I don't like a hanging test to be killed after timeout. I will try to create another testcase by checking the LSR trace.
Feb 24 2017
Feb 22 2017
Feb 16 2017
Feb 15 2017
Feb 10 2017
Sorry, wrong reply. Drop the patch and try to push https://reviews.llvm.org/D27366
Feb 8 2017
nemanjai, do you have any update about the patch? We also depend on it on x86 side so want to know the status.
Feb 7 2017
First, sorry to revisit the patch after long time.
Feb 6 2017
LGTM with A nitpick.
Sorry to chime in late.
Feb 5 2017
Make change to save compile time and add assertion per David's suggestions.