
Carrot (Guozhi Wei)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 15 2015, 3:50 PM (214 w, 2 d)

Recent Activity

Thu, Aug 22

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

@hans I think this patch should be reverted from 9.0 so that rc3 is “fixed”.

I'd rather see the fix land on trunk first (reverting on the branch is also not trivial, there are merge conflicts in several test files). From the discussion at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190819/686087.html, it should be ready to go in again.

Thu, Aug 22, 2:34 PM · Restricted Project
Carrot committed rG51f48295cbe8: [MBP] Disable aggressive loop rotate in plain mode (authored by Carrot).
Thu, Aug 22, 9:23 AM
Carrot committed rL369664: [MBP] Disable aggressive loop rotate in plain mode.
Thu, Aug 22, 9:22 AM

Fri, Aug 16

Carrot committed rGe03f6a163176: [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization (authored by Carrot).
Fri, Aug 16, 9:26 AM
Carrot committed rL369125: [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization.
Fri, Aug 16, 9:26 AM
Carrot closed D66096: [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization.
Fri, Aug 16, 9:26 AM · Restricted Project

Mon, Aug 12

Carrot updated the diff for D66096: [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization.

Will commit this version.

Mon, Aug 12, 3:50 PM · Restricted Project
Carrot created D66096: [CodeGen/Analysis] Intrinsic llvm.assume should not block tail call optimization.
Mon, Aug 12, 10:52 AM · Restricted Project

Fri, Aug 9

Carrot added inline comments to D65303: [BPI] Adjust the probability for floating point unordered comparison.
Fri, Aug 9, 9:40 AM · Restricted Project
Carrot updated the diff for D65303: [BPI] Adjust the probability for floating point unordered comparison.
Fri, Aug 9, 9:40 AM · Restricted Project

Thu, Aug 8

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

@hjagasia, thank you for the verification.

Thu, Aug 8, 2:04 PM · Restricted Project
Carrot committed rG80347c3acc08: [MBP] Disable aggressive loop rotate in plain mode (authored by Carrot).
Thu, Aug 8, 1:26 PM
Carrot committed rL368339: [MBP] Disable aggressive loop rotate in plain mode.
Thu, Aug 8, 1:26 PM
Carrot closed D65673: [MBP] Disable aggressive loop rotate in plain mode.
Thu, Aug 8, 1:26 PM · Restricted Project
Carrot updated the diff for D65673: [MBP] Disable aggressive loop rotate in plain mode.
Thu, Aug 8, 10:19 AM · Restricted Project

Wed, Aug 7

Carrot updated the diff for D65673: [MBP] Disable aggressive loop rotate in plain mode.
Wed, Aug 7, 2:59 PM · Restricted Project

Tue, Aug 6

Carrot added a comment to D65303: [BPI] Adjust the probability for floating point unordered comparison.

ping

Tue, Aug 6, 3:55 PM · Restricted Project

Fri, Aug 2

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Patch https://reviews.llvm.org/D65673 for restoring the original layout in plain mode.

Fri, Aug 2, 1:29 PM · Restricted Project
Carrot created D65673: [MBP] Disable aggressive loop rotate in plain mode.
Fri, Aug 2, 1:27 PM · Restricted Project

Thu, Aug 1

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

I would suggest putting the optimization under an option and disabling it by default for now. Once all problems are resolved we can change the default. What do you think?

After the original behavior is restored in plain mode, you can use -force-precise-rotation-cost=true to enable this more aggressive loop layout.

Is there a patch in progress for restoring to the original behaviour in non-profile mode? It would be nice if we could get this resolved soon.

Thu, Aug 1, 4:00 PM · Restricted Project

Wed, Jul 31

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

I would suggest putting the optimization under an option and disabling it by default for now. Once all problems are resolved we can change the default. What do you think?

Wed, Jul 31, 8:27 AM · Restricted Project

Tue, Jul 30

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

We got a performance improvement in our internal search benchmark.

How does this transformation impact the benchmark when not using profile data?

In plain mode we also got a performance improvement; the speedup is a little smaller than in FDO mode.

I have a general question/comment. By now it's more or less evident that the benefit of this optimization depends heavily on the correctness of the profile information. That means that in the general case there is no way to reason about its effectiveness. Thus I believe it should be turned off if there is no profile.

Tue, Jul 30, 8:30 AM · Restricted Project

Fri, Jul 26

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

We got a performance improvement in our internal search benchmark.

How does this transformation impact the benchmark when not using profile data?

The benchmark is running; I will report the result once it is finished.

Fri, Jul 26, 3:50 PM · Restricted Project
Carrot added a comment to D65303: [BPI] Adjust the probability for floating point unordered comparison.

isnan() is lowered to fcmp uno. In this case the taken probability may be higher, but it is still much smaller than that of the ordered result, and isnan() is already used very rarely; most uno comparisons are generated from various math library functions.
On the other hand, I guess most of the taken uno comparisons in a correct program come from explicit isnan() calls.
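
For readers unfamiliar with the term, an "unordered" comparison is one where at least one operand is NaN. The following is a minimal C++ sketch of that semantics, not LLVM code; the helper names are hypothetical, and it assumes IEEE 754 behavior (no fast-math style flags that let the compiler assume NaN never occurs).

```cpp
#include <cmath>

// Hypothetical helper: true when a and b are "unordered", i.e. at least one
// of them is NaN. This mirrors the meaning of an unordered floating point
// comparison; it is an illustration, not how LLVM implements anything.
bool is_unordered(double a, double b) {
    // NaN is the only floating point value that compares unequal to itself.
    return a != a || b != b;
}

// Hypothetical helper: isnan(x) expressed as a self-comparison, which is the
// "unordered with itself" form of the check.
bool is_nan_via_compare(double x) {
    return is_unordered(x, x);
}

// Sanity helper: the self-comparison form agrees with std::isnan.
bool agrees_with_std_isnan(double x) {
    return is_nan_via_compare(x) == std::isnan(x);
}
```

For ordinary numeric inputs the unordered result is false, which is the intuition behind giving the unordered branch a low static probability.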

Fri, Jul 26, 1:55 PM · Restricted Project
Carrot added inline comments to D65303: [BPI] Adjust the probability for floating point unordered comparison.
Fri, Jul 26, 8:28 AM · Restricted Project
Carrot updated the diff for D65303: [BPI] Adjust the probability for floating point unordered comparison.

Comment change.

Fri, Jul 26, 8:28 AM · Restricted Project

Thu, Jul 25

Carrot created D65303: [BPI] Adjust the probability for floating point unordered comparison.
Thu, Jul 25, 3:49 PM · Restricted Project
Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

@ebrevnov, it's better to provide a reproducer. Otherwise I can't analyze the problem that impacts your code. Four is not a big number.

Thu, Jul 25, 3:26 PM · Restricted Project

Jul 24 2019

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

ebrevnov, with the following patch the test case can be laid out correctly. Can you try it with your actual code?

Jul 24 2019, 2:58 PM · Restricted Project

Jul 23 2019

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Thanks for the test case, I can reproduce it now.

Jul 23 2019, 10:49 AM · Restricted Project

Jul 22 2019

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Evgeniy, could you try to build your code with FDO? The layout code is based on profile information; if that is not available, statically estimated profile information is used. Since the loaded values are not NaN, there should be no branch from the loop header to the latch, but the estimated profile gives that edge a non-trivial weight, so the latch is moved before the header, which saves one taken branch on the NaN path. I think the static profile estimation can be enhanced to treat a floating point number as not a NaN, or as a NaN only with a very small probability.

Jul 22 2019, 10:51 AM · Restricted Project

Jul 18 2019

Carrot added a comment to D64376: [MBP] Avoid tail duplication if it can't bring benefit.

What about CTMark? SPEC?

Jul 18 2019, 10:40 AM · Restricted Project

Jul 15 2019

Carrot added inline comments to D64376: [MBP] Avoid tail duplication if it can't bring benefit.
Jul 15 2019, 9:47 AM · Restricted Project
Carrot updated the diff for D64376: [MBP] Avoid tail duplication if it can't bring benefit.

Add comments.

Jul 15 2019, 9:46 AM · Restricted Project

Jul 11 2019

Carrot updated the diff for D64376: [MBP] Avoid tail duplication if it can't bring benefit.

Add a simple CFG example to show the inefficient tail duplication.

Jul 11 2019, 4:02 PM · Restricted Project

Jul 8 2019

Carrot created D64376: [MBP] Avoid tail duplication if it can't bring benefit.
Jul 8 2019, 4:02 PM · Restricted Project

Jun 19 2019

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Do you have any benchmark numbers to show that this is generally profitable? From our downstream testing, it is not clear that this change is beneficial.

Jun 19 2019, 12:03 PM · Restricted Project

Jun 14 2019

Carrot committed rGd2210af3322d: [MBP] Move a latch block with conditional exit and multi predecessors to top of… (authored by Carrot).
Jun 14 2019, 4:06 PM
Carrot committed rL363471: [MBP] Move a latch block with conditional exit and multi predecessors to top of….
Jun 14 2019, 4:06 PM
Carrot closed D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.
Jun 14 2019, 4:05 PM · Restricted Project

Jun 7 2019

Carrot added a comment to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Some analysis of the test case changes.

Jun 7 2019, 3:31 PM · Restricted Project

Jun 4 2019

Carrot added inline comments to D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.
Jun 4 2019, 1:59 PM · Restricted Project
Carrot updated the diff for D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.
Jun 4 2019, 1:54 PM · Restricted Project

May 31 2019

Carrot committed rGc3a24e93d527: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals (authored by Carrot).
May 31 2019, 9:10 AM
Carrot committed rL362237: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 31 2019, 9:08 AM
Carrot closed D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 31 2019, 9:08 AM · Restricted Project

May 30 2019

Carrot updated the diff for D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.
May 30 2019, 2:05 PM · Restricted Project
Carrot added inline comments to D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 30 2019, 11:41 AM · Restricted Project
Carrot added a comment to D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.

Hi Carrot,
I agree with this change, conceptually. Have you done any performance measurements to see what the impact is?

May 30 2019, 11:34 AM · Restricted Project

May 29 2019

Carrot updated the diff for D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 29 2019, 10:18 AM · Restricted Project

May 28 2019

Carrot updated the diff for D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.

Changed the edge frequency distribution, so I can avoid floating point square root computation.

May 28 2019, 10:58 AM · Restricted Project
Carrot added inline comments to D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 28 2019, 8:20 AM · Restricted Project

May 24 2019

Carrot created D62430: [PPC] Correctly adjust branch probability in PPCReduceCRLogicals.
May 24 2019, 3:20 PM · Restricted Project

May 23 2019

Carrot updated the diff for D43256: [MBP] Move a latch block with conditional exit and multi predecessors to top of loop.

Update the patch to the current code base, also make two improvements:

May 23 2019, 2:51 PM · Restricted Project

May 21 2019

Carrot added a comment to D62079: [MBP] Rotate should bring more fallthrough.

Precisely computing the fallthrough frequency is very helpful in laying out the loop header and exit BB. It is included in my next patch.

May 21 2019, 8:28 AM · Restricted Project

May 17 2019

Carrot created D62079: [MBP] Rotate should bring more fallthrough.
May 17 2019, 1:54 PM · Restricted Project

Apr 17 2019

Carrot added a comment to D59535: [SelectionDAG] Compute known bits of CopyFromReg.

Just FYI.

Apr 17 2019, 11:09 AM · Restricted Project

Apr 5 2019

Carrot committed rG36fc9c31072f: [LCG] Add aliased functions as LCG roots (authored by Carrot).
Apr 5 2019, 11:51 AM
Carrot committed rL357795: [LCG] Add aliased functions as LCG roots.
Apr 5 2019, 11:49 AM
Carrot closed D59898: [LCG] Add aliased functions as LCG roots.
Apr 5 2019, 11:49 AM · Restricted Project
Carrot updated the diff for D59898: [LCG] Add aliased functions as LCG roots.

Will check in this version.

Apr 5 2019, 11:47 AM · Restricted Project

Apr 4 2019

Carrot added a comment to D59898: [LCG] Add aliased functions as LCG roots.

ping

Apr 4 2019, 11:01 AM · Restricted Project

Mar 28 2019

Carrot updated the diff for D59898: [LCG] Add aliased functions as LCG roots.
Mar 28 2019, 1:00 PM · Restricted Project

Mar 27 2019

Carrot updated the diff for D59898: [LCG] Add aliased functions as LCG roots.
Mar 27 2019, 2:14 PM · Restricted Project
Carrot created D59898: [LCG] Add aliased functions as LCG roots.
Mar 27 2019, 1:25 PM · Restricted Project

Mar 26 2019

Carrot committed rG330dcd9dabd3: [PPC] Refactor PPCBranchSelector.cpp (authored by Carrot).
Mar 26 2019, 2:28 PM
Carrot committed rL357033: [PPC] Refactor PPCBranchSelector.cpp.
Mar 26 2019, 2:26 PM
Carrot closed D59623: [PPC] Refactor PPCBranchSelector.cpp.
Mar 26 2019, 2:26 PM · Restricted Project

Mar 20 2019

Carrot accepted D59602: [CodeGenPrepare] limit formation of overflow intrinsics (PR41129).

Thanks a lot!

Mar 20 2019, 6:34 PM · Restricted Project
Carrot created D59623: [PPC] Refactor PPCBranchSelector.cpp.
Mar 20 2019, 5:20 PM · Restricted Project

Mar 19 2019

Carrot added inline comments to D59153: [MBP] Make sure the exit BB is the most possible successor before rotating a loop.
Mar 19 2019, 1:20 PM · Restricted Project

Mar 18 2019

Carrot added a comment to D57789: [CGP] form usub with overflow from sub+icmp.

Bug https://bugs.llvm.org/show_bug.cgi?id=41129 has been filed for the regression.
Thanks a lot for the investigation.
Please let me know if more information is required.

Mar 18 2019, 10:58 AM · Restricted Project

Mar 14 2019

Carrot added a comment to D57789: [CGP] form usub with overflow from sub+icmp.

This patch causes a 5% regression in one of our Eigen benchmarks on Haswell.

Mar 14 2019, 11:47 AM · Restricted Project

Mar 8 2019

Carrot created D59153: [MBP] Make sure the exit BB is the most possible successor before rotating a loop.
Mar 8 2019, 1:40 PM · Restricted Project

Mar 6 2019

Carrot committed rG11308bdb433e: [PPC] Adjust the computed branch offset for the possible shorter distance (authored by Carrot).
Mar 6 2019, 10:24 AM
Carrot committed rL355529: [PPC] Adjust the computed branch offset for the possible shorter distance.
Mar 6 2019, 10:24 AM
Carrot closed D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.
Mar 6 2019, 10:24 AM · Restricted Project

Mar 5 2019

Carrot committed rGf124e75656dd: [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdges (authored by Carrot).
Mar 5 2019, 10:54 AM
Carrot committed rL355430: [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdges.
Mar 5 2019, 10:54 AM
Carrot closed D58646: [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdges.
Mar 5 2019, 10:54 AM · Restricted Project

Mar 4 2019

Carrot added a comment to D58646: [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdges.

ping

Mar 4 2019, 9:37 AM · Restricted Project

Feb 27 2019

Carrot added a comment to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Any other comments?

Feb 27 2019, 1:10 PM · Restricted Project

Feb 25 2019

Carrot created D58646: [X86] In X86DomainReassignment.cpp add enclosed registers to EnclosedEdges.
Feb 25 2019, 2:42 PM · Restricted Project

Feb 22 2019

Carrot committed rG4c8e480358c3: [MBP] Factor out function hasViableTopFallthrough and enhancement (authored by Carrot).
Feb 22 2019, 10:05 AM
Carrot committed rL354682: [MBP] Factor out function hasViableTopFallthrough and enhancement.
Feb 22 2019, 10:04 AM
Carrot closed D58393: [MBP] Factor out function hasViableTopFallthrough and enhancement.
Feb 22 2019, 10:04 AM · Restricted Project

Feb 21 2019

Carrot updated the diff for D58393: [MBP] Factor out function hasViableTopFallthrough and enhancement.
Feb 21 2019, 3:24 PM · Restricted Project
Carrot added inline comments to D58393: [MBP] Factor out function hasViableTopFallthrough and enhancement.
Feb 21 2019, 2:23 PM · Restricted Project

Feb 20 2019

Carrot updated the diff for D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Add comment to explain the case when the inline asm occurs between branch block and dest block.

Feb 20 2019, 2:48 PM · Restricted Project
Carrot added inline comments to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.
Feb 20 2019, 2:48 PM · Restricted Project

Feb 19 2019

Carrot added a comment to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

I can't help but feel like this patch adds further complication to an already excessively complicated function. To put things in perspective - this file is currently 282 lines and 210 of those lines are in a single function. Frankly, I think refactoring this into a few conceptual sections implemented in separate functions would go a long way.

Seems to me that conceptually this pass does the following:

  • Renumber blocks
  • Compute the size of each block
  • Identify PC-relative branches and compute the branch distance
  • Convert PC-relative branches that branch too far into the "long branch sequence" (i.e. invert branch condition, convert to bc+8, b)

I understand if you don't want to do this refactoring and want to proceed with this patch as-is to solve the immediate problem. However, I thought I would bring this up since it seems like refactoring this would go a long way to making this readable.

Feb 19 2019, 2:32 PM · Restricted Project
Carrot created D58393: [MBP] Factor out function hasViableTopFallthrough and enhancement.
Feb 19 2019, 10:55 AM · Restricted Project

Feb 15 2019

Carrot added inline comments to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.
Feb 15 2019, 7:10 PM · Restricted Project
Carrot added inline comments to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.
Feb 15 2019, 1:38 PM · Restricted Project
Carrot updated the diff for D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Also handle the backward branch.

Feb 15 2019, 11:58 AM · Restricted Project

Feb 13 2019

Carrot updated the diff for D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Add a test case.

Feb 13 2019, 4:11 PM · Restricted Project
Carrot added a comment to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Any testcases that can show the problem and test the fix? Thanks.

It can only be triggered by a very large (>32KB) function body because the range of conditional branch is +/- 32KB.

In our case, the large function body is caused by aggressive ThinLTO-guided inlining.

So no small test case can demonstrate the problem :(

You can use the assembly directive '.space' to create arbitrary sized basic blocks. See test/CodeGen/RISCV/branch-relaxation.ll for examples.

Feb 13 2019, 3:31 PM · Restricted Project

Feb 12 2019

Carrot added inline comments to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.
Feb 12 2019, 3:08 PM · Restricted Project
Carrot added a comment to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

I think we should be able to come up with a smaller test case that demonstrates the problem here.
It need not cause a run-time problem due to a wrong branch,
only a smaller calculated branch offset, which we can check by adding a few dbgs() calls that print the calculated branch offset in the code.

Because of the code at line 137

Feb 12 2019, 2:44 PM · Restricted Project

Feb 11 2019

Carrot added a comment to D57718: [PPC] Adjust the computed branch offset for the possible shorter distance.

Any testcases that can show the problem and test the fix? Thanks.

Feb 11 2019, 1:22 PM · Restricted Project