bmakam (Balaram Makam)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 19 2014, 10:34 AM (165 w, 1 d)

Recent Activity

Fri, Oct 27

bmakam added inline comments to D38631: [SimplifyCFG] use pass options and remove the latesimplifycfg pass.
Fri, Oct 27, 7:48 AM

Thu, Oct 26

bmakam committed rL316721: Revert "[CGP] Merge empty case blocks if no extra moves are added.".
Revert "[CGP] Merge empty case blocks if no extra moves are added."
Thu, Oct 26, 5:35 PM
bmakam committed rL316711: [CGP] Merge empty case blocks if no extra moves are added..
[CGP] Merge empty case blocks if no extra moves are added.
Thu, Oct 26, 3:34 PM
bmakam closed D37343: [CGP] Merge empty case blocks if no extra moves are added. by committing rL316711: [CGP] Merge empty case blocks if no extra moves are added..
Thu, Oct 26, 3:34 PM
bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

Thanks for the reviews.

Thu, Oct 26, 3:24 PM
bmakam committed rL316669: Reapply r316582 [Local] Fix a bug in the domtree update logic for….
Reapply r316582 [Local] Fix a bug in the domtree update logic for…
Thu, Oct 26, 8:05 AM

Wed, Oct 25

bmakam committed rL316612: Revert r316582 [Local] Fix a bug in the domtree update logic for….
Revert r316582 [Local] Fix a bug in the domtree update logic for…
Wed, Oct 25, 2:33 PM
bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

Hi everyone,

Wed, Oct 25, 9:30 AM
bmakam added a comment to D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

Thanks for the reviews.

Wed, Oct 25, 7:57 AM
bmakam committed rL316582: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..
[Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.
Wed, Oct 25, 7:56 AM
bmakam closed D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred. by committing rL316582: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..
Wed, Oct 25, 7:56 AM

Oct 17 2017

bmakam added reviewers for D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.: dberlin, hfinkel.
Oct 17 2017, 9:50 AM
bmakam updated the diff for D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

Reduce unittest.

Oct 17 2017, 9:37 AM
bmakam added a comment to D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

Thanks for adding the comment.

The testcase seems really long and it's not obvious why what it tries to exercise -- have you tried reducing it with bugpoint to something shorter and easier to reason about?

Oct 17 2017, 9:11 AM

Oct 16 2017

bmakam updated the diff for D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

Add unittest and a comment.

Oct 16 2017, 3:50 PM
bmakam added a comment to D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

I think that this deserves a comment with a (partial) example, otherwise the change looks suspicious.

Oct 16 2017, 10:05 AM
bmakam added a comment to D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..

For some irreducible CFG the domtree nodes might be dead, do not update domtree for dead nodes.

Can you show some example of such CFG and add a comment explaining this check?

DomTree::getNode return nullptr only for forward-unreachable CFG nodes.

Oct 16 2017, 9:53 AM
bmakam added a reviewer for D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred.: kuhar.
Oct 16 2017, 9:41 AM
bmakam created D38960: [Local] Fix a bug in the domtree update logic for MergeBasicBlockIntoOnlyPred..
Oct 16 2017, 9:40 AM

Oct 10 2017

bmakam added inline comments to D38558: [JumpThreading] Preserve DT and LVI across the pass..
Oct 10 2017, 9:16 AM

Oct 5 2017

bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Minor nit, use is_contained. NFCI.

Oct 5 2017, 1:04 PM
bmakam added a comment to D38590: [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for ARM/AArch64.

I modified the test case per your suggestions before commiting. Thanks for taking a look.

Oct 5 2017, 11:36 AM
bmakam committed rL315002: [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for….
[ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for…
Oct 5 2017, 11:35 AM
bmakam closed D38590: [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for ARM/AArch64 by committing rL315002: [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for….
Oct 5 2017, 11:35 AM
bmakam created D38590: [ARM/AARCH64] Make test MachineBranchProb.ll more robust and re-enable for ARM/AArch64.
Oct 5 2017, 10:58 AM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Oct 5 2017, 7:17 AM

Oct 4 2017

bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

fix the sequence of deleting an edge from DT.

Oct 4 2017, 2:36 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Oct 4 2017, 2:34 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Oct 4 2017, 2:18 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Address review comments.

Oct 4 2017, 1:17 PM
bmakam committed rL314912: "[ARM] Mark flaky test MachineBranchProb.ll unsupported again for ARM/AArch64".
"[ARM] Mark flaky test MachineBranchProb.ll unsupported again for ARM/AArch64"
Oct 4 2017, 9:47 AM

Oct 3 2017

bmakam committed rL314857: [AArch64] Use LateSimplifyCFG after expanding atomic operations..
[AArch64] Use LateSimplifyCFG after expanding atomic operations.
Oct 3 2017, 3:41 PM
bmakam closed D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations. by committing rL314857: [AArch64] Use LateSimplifyCFG after expanding atomic operations..
Oct 3 2017, 3:41 PM
bmakam added inline comments to D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations..
Oct 3 2017, 1:28 PM
bmakam updated the diff for D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations..

Added a comment in the test case to explain that the preheader is simplified during latesimplifycfg.

Oct 3 2017, 1:25 PM

Oct 2 2017

bmakam added a comment to D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations..

Kindly ping.

Oct 2 2017, 1:27 PM

Sep 29 2017

bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

Thanks Daniel/Eli, I learnt something new today and am glad I asked the question. :)

Sep 29 2017, 1:51 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Sep 29 2017, 1:06 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Address review changes.

Sep 29 2017, 1:05 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Fix make-check asserts.

Sep 29 2017, 11:46 AM

Sep 28 2017

bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Use batch updater to inform DT about CFG updates.

Sep 28 2017, 11:56 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Sep 28 2017, 11:42 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Rebased.

Sep 28 2017, 11:04 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Sep 28 2017, 8:38 PM
bmakam added inline comments to D37343: [CGP] Merge empty case blocks if no extra moves are added..
Sep 28 2017, 8:09 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Precompute DT as suggested by Hal and Eli. Please take a look.

Sep 28 2017, 4:17 PM
bmakam added a comment to D37866: [LateJumpThreading] Enable LateJumpThreading right before CGP..

What's going on with this patch? We also saw a performance regression on one of our benchmarks. And I was interested in whether this woudl help.

Sep 28 2017, 8:18 AM
bmakam updated the diff for D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations..

Added a testcase. Please take a look.

Sep 28 2017, 8:15 AM

Sep 27 2017

bmakam abandoned D38288: [InstCombine] Restrict transforming shared selects using SimplifySelectsFeedingBinaryOp when we cannot simplify the binary op..

abandon this revision since it is a duplicate of D38263

Sep 27 2017, 7:29 AM

Sep 26 2017

bmakam updated the diff for D38288: [InstCombine] Restrict transforming shared selects using SimplifySelectsFeedingBinaryOp when we cannot simplify the binary op..

Address Chad's testcase.

Sep 26 2017, 3:30 PM
bmakam created D38288: [InstCombine] Restrict transforming shared selects using SimplifySelectsFeedingBinaryOp when we cannot simplify the binary op..
Sep 26 2017, 3:30 PM

Sep 25 2017

bmakam created D38262: [AArch64] Use LateSimplifyCFG after expanding atomic operations..
Sep 25 2017, 2:32 PM
bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

Remove quadratic complexity to calculate DT.

You're still computing DT once for every empty block?

Sep 25 2017, 12:30 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Address review comments:

Sep 25 2017, 5:03 AM

Sep 22 2017

bmakam committed rL313998: [Falkor] Add falkor CPU to host detection.
[Falkor] Add falkor CPU to host detection
Sep 22 2017, 10:48 AM

Sep 20 2017

bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

I think this needs more testcases to illustrate the different possibilities here.

Sep 20 2017, 10:11 AM

Sep 14 2017

bmakam updated the diff for D37866: [LateJumpThreading] Enable LateJumpThreading right before CGP..

Address Krzysztof's comment.

Sep 14 2017, 2:30 PM
bmakam added a comment to D36404: Disable jump threading into loop headers.

We need to figure out if LSR is doing something reasonable. It's possible that LSR is actually causing the regression, and old jumpthreading is just serving to obscure the loop structure (thus preventing LSR from doing whatever bad thing it's doing).

Sep 14 2017, 2:09 PM
bmakam created D37866: [LateJumpThreading] Enable LateJumpThreading right before CGP..
Sep 14 2017, 2:08 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..

Address Eli's comment.

Sep 14 2017, 9:59 AM
bmakam added inline comments to D37816: Experimental late jump threading pass.
Sep 14 2017, 8:31 AM
bmakam added inline comments to D37816: Experimental late jump threading pass.
Sep 14 2017, 8:08 AM

Sep 13 2017

bmakam added a comment to D36404: Disable jump threading into loop headers.

Have you tried it? Changing jump threading to have early/late versions is trivial, the question is when should the late one run, and if that would help in the first place.

I reverted this change and it recovered the regression. I was going to try to re-run the whole jumpthreading pass after latesimplifycfg to see if it would help, but I don't know if it is a good place because LSR that runs after, will not have the loop in canonical form and loopsimplify will again turn it into irreducible loop. Another place I was thinking was to try before CGP.

If it'll work right before CGP, that's probably best. That way targets can run IR-level passes that want to look at loops before that.

I tried right after latesimplifycfg and it almost recovers the regression. However, running it right before CGP made it even worse for performance of spec2017/perlbench. It seems to be due to some kind of bad interaction with LSR, I haven't digged deeper. Would running right before LSR be reasonable?

Sep 13 2017, 2:31 PM
bmakam added a comment to D37343: [CGP] Merge empty case blocks if no extra moves are added..

kindly ping.

Sep 13 2017, 11:56 AM
bmakam added a comment to D36404: Disable jump threading into loop headers.

Have you tried it? Changing jump threading to have early/late versions is trivial, the question is when should the late one run, and if that would help in the first place.

Sep 13 2017, 8:59 AM
bmakam added a comment to D36404: Disable jump threading into loop headers.

On our internal benchmarks it's mostly neutral, but the benchmark that motivated it improved by 5.3%. We have some extra code, though, that is supposed to deal with similar cases, so some of the impact may be reduced. By default it keeps the behavior unchanged, so for all other architectures it should have no effect.

Sep 13 2017, 8:12 AM

Sep 9 2017

bmakam abandoned D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

Subsumed by D37343

Sep 9 2017, 10:07 PM
bmakam updated the diff for D37343: [CGP] Merge empty case blocks if no extra moves are added..
  • Added some more checks to allow merging empty cases when we can estimate extra moves are not added.
  • Added test cases.
  • spec2017/perlbench improves by 4.5%. No other improvements or regressions observed in Spec.
Sep 9 2017, 10:05 PM

Aug 31 2017

bmakam added a comment to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

Is the actual value of the PHI operand relevant here? It looks like some of the testcases on r289988 involve constants, which are materialized during isel (and can be substantially more expensive).

I took a stab at this and created D37343. Please take a look.

Aug 31 2017, 11:04 AM
bmakam created D37343: [CGP] Merge empty case blocks if no extra moves are added..
Aug 31 2017, 11:04 AM

Aug 30 2017

bmakam added a comment to D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..

Thanks for the review.

Aug 30 2017, 8:05 AM
bmakam committed rL312126: Re-land MachineInstr: Reason locally about some memory objects before going to….
Re-land MachineInstr: Reason locally about some memory objects before going to…
Aug 30 2017, 7:58 AM
bmakam closed D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA. by committing rL312126: Re-land MachineInstr: Reason locally about some memory objects before going to….
Aug 30 2017, 7:58 AM

Aug 29 2017

bmakam updated the diff for D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..

Move assert closer to AA->alias call.

Aug 29 2017, 2:39 PM
bmakam updated the diff for D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..

rebase and fix x86 lit test.

Aug 29 2017, 10:33 AM

Aug 25 2017

bmakam added inline comments to D37076: [LICM] Allow sinking when foldable in loop.
Aug 25 2017, 8:12 AM
bmakam added a comment to D37076: [LICM] Allow sinking when foldable in loop.

ContainFolderableUsersInLoop -> ContainFoldableUsersInLoop?

Aug 25 2017, 8:10 AM

Aug 24 2017

bmakam updated the diff for D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..
Aug 24 2017, 1:54 PM

Aug 21 2017

bmakam updated the diff for D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..
Aug 21 2017, 3:51 PM

Aug 20 2017

bmakam updated the diff for D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..

Trim the patch and avoid handling stack references for non-value MMOs to prevent LNT test failures.

Aug 20 2017, 1:33 PM

Aug 18 2017

bmakam created D36900: Re-land MachineInstr: Reason locally about some memory objects before going to AA..
Aug 18 2017, 2:39 PM

Aug 16 2017

bmakam committed rL311008: Revert "MachineInstr: Reason locally about some memory objects before going to….
Revert "MachineInstr: Reason locally about some memory objects before going to…
Aug 16 2017, 7:18 AM

Aug 14 2017

bmakam added a comment to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

I have looked into the phi operands closely and I am not convinced if phi operands involving constants have any influence on the profitability of merging empty blocks. I identified empty exit blocks if when merged with their successors improved the performance a bit, yet if another similar empty exit block was merged with the destination block, it sinks the performance. The only difference between them is that the successor block is also empty when the performance regressed.

Aug 14 2017, 1:22 PM
bmakam closed D34181: MachineInstr: Reason locally about some memory objects before going to AA..

Thanks for taking a look. Rebased and committed in r310825.

Aug 14 2017, 2:46 AM
bmakam committed rL310825: MachineInstr: Reason locally about some memory objects before going to AA..
MachineInstr: Reason locally about some memory objects before going to AA.
Aug 14 2017, 2:42 AM

Aug 8 2017

bmakam added a comment to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

To clarify, the original version of this patch recovered the full 3% regression, but the new version only recovers 0.7%?

Correct.

Is the actual value of the PHI operand relevant here? It looks like some of the testcases on r289988 involve constants, which are materialized during isel (and can be substantially more expensive).

Thanks Eli, this looks interesting. Another observation is that if we increase cgp-freq-ratio-to-skip-merge to 1000 it will recover the full 3% regression. I am trying to reduce the exact basic block which when skipped merging in CGP removes placing copies in hot path and improves performance due to reduced dynamic instruction count and the basic block(s) which when merged eliminate the unnecessary branches and improve the performance due to better branching/i-cache utilization. This might provide an answer to your question about the relevance of PHI operands.

Aug 8 2017, 10:00 AM

Aug 7 2017

bmakam added a comment to D36404: Disable jump threading into loop headers.

@bmakam: Isn't this very similar to some of your recent work?

Aug 7 2017, 11:19 AM

Aug 2 2017

bmakam updated the diff for D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

Update to address Eli's comments.

Aug 2 2017, 1:02 PM
bmakam added a comment to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

A gentle ping.

Aug 2 2017, 11:49 AM

Jul 31 2017

bmakam added inline comments to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..
Jul 31 2017, 8:37 AM

Jul 28 2017

bmakam updated the diff for D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

Updated patch to address the underlying issue.

Jul 28 2017, 1:48 PM

Jul 20 2017

bmakam planned changes to D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..

The underlying issue is that after loopsimplify creates empty exit blocks, CGP cannot clean up the empty blocks if it happens to be coming from a switch case. I am working on an alternative approach to address this issue.

Jul 20 2017, 10:50 AM

Jul 19 2017

bmakam committed rL308422: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can….
[SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can…
Jul 19 2017, 1:55 AM
bmakam closed D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure. by committing rL308422: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can….
Jul 19 2017, 1:55 AM
bmakam removed a dependent revision for D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify.: D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure..
Jul 19 2017, 1:50 AM
bmakam removed a dependency for D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.: D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..
Jul 19 2017, 1:50 AM

Jul 18 2017

bmakam added a dependency for D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.: D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..
Jul 18 2017, 3:18 PM
bmakam added a dependent revision for D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify.: D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure..
Jul 18 2017, 3:18 PM
bmakam removed a dependent revision for D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure.: D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify..
Jul 18 2017, 3:18 PM
bmakam removed a dependency for D35584: [CGP] Fold empty dedicated exit blocks created by loopsimplify.: D35411: [SimplifyCFG] Defer folding unconditional branches to LateSimplifyCFG if it can destroy canonical loop structure..
Jul 18 2017, 3:18 PM