iteratee (Kyle Butt)
User

Projects

User does not belong to any projects.

User Details

User Since
Oct 6 2015, 11:37 AM (106 w, 5 d)

Recent Activity

Mon, Oct 16

iteratee added inline comments to D38848: {ARM} IfConversion does not handle un-analyzable branch correctly.
Mon, Oct 16, 9:52 AM

Thu, Oct 12

iteratee added a comment to D38848: {ARM} IfConversion does not handle un-analyzable branch correctly.

In general, please follow existing style for the file you're editing.

Thu, Oct 12, 10:24 AM

Fri, Oct 6

iteratee added inline comments to D35844: Correct dwarf unwind information in function epilogue.
Fri, Oct 6, 10:09 AM

Thu, Oct 5

iteratee added inline comments to D35844: Correct dwarf unwind information in function epilogue.
Thu, Oct 5, 1:32 PM

Wed, Oct 4

iteratee added a comment to D33850: Inlining: Don't re-map simplified cloned instructions..

If this got dropped I'm sorry.

Wed, Oct 4, 4:54 PM
iteratee accepted D38563: [MachineBlockPlacement] Make sure PreferredLoopExit is cleared everytime new loop is processed.
Wed, Oct 4, 2:40 PM

Sep 11 2017

iteratee accepted D32776: [PowerPC] Update branch coalescing to be a PowerPC specific pass.

I'm OK with this.

Sep 11 2017, 1:47 PM

Sep 8 2017

iteratee committed rL312843: PPC: Don't select lxv/stxv for insufficiently aligned stack slots..
Sep 8 2017, 5:40 PM
iteratee created D37654: PPC: Don't select lxv/stxv for insufficiently aligned stack slots..
Sep 8 2017, 4:58 PM
iteratee added a comment to D37611: [IfConversion] More simple, correct dead/kill liveness handling.

This looks fine overall to me.

Sep 8 2017, 10:53 AM

Sep 5 2017

iteratee requested changes to D32776: [PowerPC] Update branch coalescing to be a PowerPC specific pass.

I don't think this should go in, even off by default, as long as it's broken. You can pull the produceSameValue check or fix it, but I don't like having a known-broken pass in tree.

Sep 5 2017, 11:53 AM

Aug 31 2017

iteratee added inline comments to D32776: [PowerPC] Update branch coalescing to be a PowerPC specific pass.
Aug 31 2017, 12:45 PM

Aug 30 2017

iteratee added inline comments to D28249: Improve scheduling with branch coalescing.
Aug 30 2017, 3:47 PM

Aug 24 2017

iteratee added a comment to D29865: [PDSE] Add a no-op pass..

dannyb asked me to take over the review.
Can you fix the issues that filcab identified? I'll see if I can get you an answer about critical edge splitting.

Aug 24 2017, 2:16 PM
iteratee added a reviewer for D29865: [PDSE] Add a no-op pass.: iteratee.
Aug 24 2017, 2:15 PM

Aug 16 2017

iteratee added a comment to D34396: Adding code padding for performance stability - first policy (BranchesWithSameTargetAvoidancePolicy).

Do you plan on doing something similar for the DSB decode cache issues that occasionally arise?
Specifically this: https://bugs.llvm.org/show_bug.cgi?id=5615

Aug 16 2017, 2:27 PM
iteratee added inline comments to D34396: Adding code padding for performance stability - first policy (BranchesWithSameTargetAvoidancePolicy).
Aug 16 2017, 2:23 PM

Aug 4 2017

iteratee added a comment to D36338: BlockPlacement: Consider hotness of blocks relative to a loop iteration rather than relative to the loop as a whole.

Should we adjust the ratio if it's not specified explicitly and no profile data is available?

Aug 4 2017, 2:18 PM
iteratee closed D36296: BlockPlacement: add a flag to force cold block outlining w/o a profile..

Submitted in r310129

Aug 4 2017, 2:15 PM
iteratee committed rL310129: BlockPlacement: add a flag to force cold block outlining w/o a profile..
Aug 4 2017, 2:14 PM
iteratee added inline comments to D35844: Correct dwarf unwind information in function epilogue.
Aug 4 2017, 1:16 PM

Aug 3 2017

iteratee added a comment to D36296: BlockPlacement: add a flag to force cold block outlining w/o a profile..

Known issues: this needs a test case and an adjustment to the frequency for non-profile functions.

Aug 3 2017, 5:03 PM
iteratee created D36296: BlockPlacement: add a flag to force cold block outlining w/o a profile..
Aug 3 2017, 5:03 PM
iteratee added a comment to D35844: Correct dwarf unwind information in function epilogue.

Can you provide a description of what you had to change relative to the rollback, and how you're verifying that the issue that caused the rollback has been fixed?

Aug 3 2017, 1:06 PM

Jul 10 2017

iteratee accepted D34745: Revert Revert [MBP] do not rotate loop if it creates extra branch.

Looks fine. I would prefer the fallthrough checks be successor checks, but I can accept it either way.

Jul 10 2017, 11:35 AM

Jun 30 2017

iteratee updated the diff for D33574: PPC: Verify that branch fixups fit within the range..

All the checks use unreachable now.

Jun 30 2017, 3:59 PM

Jun 27 2017

iteratee committed rL306495: Inlining: Don't re-map simplified cloned instructions..
Jun 27 2017, 6:41 PM
iteratee closed D33850: Inlining: Don't re-map simplified cloned instructions. by committing rL306495: Inlining: Don't re-map simplified cloned instructions..
Jun 27 2017, 6:41 PM

Jun 26 2017

iteratee updated the diff for D33850: Inlining: Don't re-map simplified cloned instructions..

Use cast instead of dyn_cast

Jun 26 2017, 6:58 PM

Jun 23 2017

iteratee accepted D34271: [MBP] do not rotate loop if it creates extra branch.

Approved with the changes I've marked.

Jun 23 2017, 11:39 AM
iteratee accepted D18046: [X86] Providing correct unwind info in function epilogue.

It would be nice to get rid of that negative completely. Not in this patch, but just remove it completely wherever we create CFI Instructions.

Jun 23 2017, 11:32 AM

Jun 22 2017

iteratee added a comment to D18046: [X86] Providing correct unwind info in function epilogue.

This is looking really good. Thanks.

Jun 22 2017, 11:38 AM
iteratee accepted D34388: [IfConversion] Hoist removeBranch calls out of if/else clauses [NFC].
Jun 22 2017, 11:27 AM
iteratee added inline comments to D34271: [MBP] do not rotate loop if it creates extra branch.
Jun 22 2017, 11:26 AM

Jun 21 2017

iteratee added inline comments to D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 21 2017, 12:02 PM
iteratee updated the diff for D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 21 2017, 12:02 PM
iteratee reclaimed D33850: Inlining: Don't re-map simplified cloned instructions..

Chandler, after r305934 a test for this isn't possible. We should still commit the fix.

Jun 21 2017, 11:18 AM
iteratee accepted D34456: Do not inline recursive direct calls in sample loader pass..
Jun 21 2017, 10:56 AM

Jun 19 2017

iteratee added a comment to D18046: [X86] Providing correct unwind info in function epilogue.

OK. It's looking pretty good overall. I looked a lot closer at the actual CFI Code. Mostly it looks great. I don't think you need AdjustCFAOffset at all. You spend a lot of work maintaining it, but in the one case that it's used, it's just (OutgoingOffset - IncomingOffset). It's only used with blocks that don't contain an offset def.

Jun 19 2017, 5:13 PM
iteratee added a comment to D34099: [IfConversion] Maintain the CFG when predicating/merging blocks in IfConvert*.

Mostly looks good. It will be easier to follow when you factor out the NFC sections and rebase.

Jun 19 2017, 1:59 PM
iteratee added inline comments to D34271: [MBP] do not rotate loop if it creates extra branch.
Jun 19 2017, 1:52 PM

Jun 15 2017

iteratee added a comment to D18046: [X86] Providing correct unwind info in function epilogue.

This is looking pretty good. I'm going to go back over it, but most of my initial concerns have been satisfied.

Jun 15 2017, 11:08 AM

Jun 8 2017

iteratee abandoned D33850: Inlining: Don't re-map simplified cloned instructions..

No longer needed because of https://reviews.llvm.org/D34017

Jun 8 2017, 3:29 PM
iteratee accepted D34017: Do not early-inline recursive calls in sample profile loader..
Jun 8 2017, 11:26 AM

Jun 7 2017

iteratee added inline comments to D34017: Do not early-inline recursive calls in sample profile loader..
Jun 7 2017, 5:13 PM
iteratee added a comment to D34017: Do not early-inline recursive calls in sample profile loader..

It doesn't have to blow it up exponentially. Can we clone the function to be inlined, re-writing the recursive calls to call the clone, and then inline that? Similar to a worker-wrapper transformation.

You mean clone the caller before any inlining, and inline that copy instead of the caller itself in early inlining? Yes we could do that, but that adds unnecessary copy overhead for every caller (which could be large). As llvm does not do recursive inlining anyway, this only solves cases where profile is collected from gcc binary, which should be temporary. So I guess it's not worth the effort to special-handle recursive early inlining.

Jun 7 2017, 4:53 PM
iteratee added a comment to D34017: Do not early-inline recursive calls in sample profile loader..

It doesn't have to blow it up exponentially. Can we clone the function to be inlined, re-writing the recursive calls to call the clone, and then inline that? Similar to a worker-wrapper transformation.

Jun 7 2017, 4:25 PM
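
The cloning idea in the comment above can be illustrated at the source level. This is only a sketch, not how LLVM's inliner or the sample profile loader works: `fact`, `fact_clone`, and `caller_inlined` are hypothetical names, and a recursive factorial is assumed purely as an example. The clone is a copy whose recursive calls target the clone, and the caller inlines one copy of the body with its recursive call redirected to the clone, so the inlined code contains no call back to the function being inlined.

```cpp
#include <cassert>
#include <cstdint>

// Original recursive function (hypothetical example).
std::uint64_t fact(std::uint64_t n) {
  return n <= 1 ? 1 : n * fact(n - 1);
}

// Clone of fact: same body, but its recursive calls target the clone.
std::uint64_t fact_clone(std::uint64_t n) {
  return n <= 1 ? 1 : n * fact_clone(n - 1);
}

// A caller of fact after one level of inlining done by hand.
// The recursive call inside the inlined body has been rewritten
// to call fact_clone, so no further inlining of fact is triggered.
std::uint64_t caller_inlined(std::uint64_t n) {
  // Inlined body of fact(n):
  return n <= 1 ? 1 : n * fact_clone(n - 1);
}
```

Calling `caller_inlined(n)` should give the same result as `fact(n)` for any `n`; the cost is one extra copy of the function body, which is the overhead the reply in the review thread objects to.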
iteratee added inline comments to D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 7 2017, 3:12 PM
iteratee added a comment to D18046: [X86] Providing correct unwind info in function epilogue.

This is coming along nicely. I forgot to say last time that I was pleased overall.

Jun 7 2017, 1:27 PM
iteratee added inline comments to D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 7 2017, 10:52 AM
iteratee updated the summary of D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 7 2017, 10:52 AM

Jun 5 2017

iteratee added a reviewer for D33850: Inlining: Don't re-map simplified cloned instructions.: eraman.
Jun 5 2017, 4:15 PM

Jun 2 2017

iteratee created D33850: Inlining: Don't re-map simplified cloned instructions..
Jun 2 2017, 1:53 PM
iteratee added a comment to D18046: [X86] Providing correct unwind info in function epilogue.

Some initial thoughts. I would like to hide the actual CFI algorithms from the existing passes as much as possible.

Jun 2 2017, 1:51 PM

May 30 2017

iteratee added inline comments to D33574: PPC: Verify that branch fixups fit within the range..
May 30 2017, 10:22 AM
iteratee updated the diff for D33574: PPC: Verify that branch fixups fit within the range..

Tidy up comparisons.

May 30 2017, 10:21 AM
iteratee added a comment to D33562: MachineLICM: Add new condition for hoisting of caller preserved registers.

You talk about a call instruction? Is X2 saved and restored in the called function? Then it's just a CSR and should not be mentioned in the clobber list so no problem with my proposal above.

But it is the caller that saves and restores it, not the callee. The sequence is essentially this (all in the caller of course):

  • Save X2 to its stack slot
  • Update X2 prior to the call
  • Call the function through a pointer
  • Restore X2 immediately after the call
May 30 2017, 10:18 AM

May 26 2017

iteratee updated the summary of D33577: CodeGen: BlockPlacement: Use Branching factor to choose between near equals..
May 26 2017, 3:22 PM
iteratee added a comment to D33577: CodeGen: BlockPlacement: Use Branching factor to choose between near equals..

It is basically a choice between a layout (existing) that has a 50% chance of not taking any branches, 25% of taking one branch, and 25% of taking more than one branch vs the new layout that has only a 6.25% chance of taking zero branches and 93.75% of taking only one branch.

The existing layout only has a 25% chance of taking more than 2 branches -- is it worth sacrificing a 43.75% chance of not taking any branches to improve the 25% case?

May 26 2017, 3:19 PM
iteratee updated the diff for D33577: CodeGen: BlockPlacement: Use Branching factor to choose between near equals..

Add comments to the change summary about the global heuristic that we're approximating (maybe badly) by calculating the branch factor.

May 26 2017, 2:04 PM
iteratee accepted D33562: MachineLICM: Add new condition for hoisting of caller preserved registers.

This looks fine to me.

May 26 2017, 12:56 PM
iteratee added a comment to D33577: CodeGen: BlockPlacement: Use Branching factor to choose between near equals..

I think you counted wrong. The dynamic taken branch count is the same. But the distribution of the taken branch count is more consistent.

May 26 2017, 10:46 AM

May 25 2017

iteratee created D33577: CodeGen: BlockPlacement: Use Branching factor to choose between near equals..
May 25 2017, 4:36 PM
iteratee added a comment to D33574: PPC: Verify that branch fixups fit within the range..

I'm fairly certain these are right, but I wanted to get another set of eyes on them before I committed them.

May 25 2017, 4:04 PM
iteratee created D33574: PPC: Verify that branch fixups fit within the range..
May 25 2017, 4:04 PM
iteratee abandoned D25484: Post commit review of changes to D18226.
May 25 2017, 4:01 PM
iteratee committed rL303904: PPC: Correct Size for GETtlsADDR.
May 25 2017, 12:38 PM

May 17 2017

iteratee committed rL303316: CodeGen: BlockPlacement: Add Message strings to asserts. NFC.
May 17 2017, 4:58 PM
iteratee closed D33078: CodeGen: BlockPlacement: Add Message strings to asserts. NFC by committing rL303316: CodeGen: BlockPlacement: Add Message strings to asserts. NFC.
May 17 2017, 4:58 PM
iteratee committed rL303307: CodeGen: Power: Add lowering for shifts of v1i128..
May 17 2017, 3:08 PM
iteratee closed D32774: CodeGen: Power: Add lowering for shifts of v1i128. by committing rL303307: CodeGen: Power: Add lowering for shifts of v1i128..
May 17 2017, 3:08 PM

May 15 2017

iteratee added inline comments to D32774: CodeGen: Power: Add lowering for shifts of v1i128..
May 15 2017, 11:25 AM
iteratee updated the diff for D32774: CodeGen: Power: Add lowering for shifts of v1i128..
May 15 2017, 11:25 AM
iteratee closed D32324: CodeGen: BlockPlacement: Increase tail duplication size for O3..

Committed in rL303084

May 15 2017, 10:53 AM
iteratee committed rL303084: CodeGen: BlockPlacement: Increase tail duplication size for O3..
May 15 2017, 10:44 AM

May 12 2017

iteratee updated the diff for D32324: CodeGen: BlockPlacement: Increase tail duplication size for O3..

If either threshold is the only one explicitly set, use that threshold.
Otherwise, if both or neither are set, use the aggressive threshold at O3.

May 12 2017, 2:48 PM
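
The selection rule described in the May 12 diff update for D32324 can be sketched as a small function. This is a minimal sketch under stated assumptions, not the code in the patch: `ThresholdOpt` and `pickTailDupThreshold` are hypothetical names, and falling back to the normal threshold below O3 is an assumption, since the comment only says what happens at O3.

```cpp
#include <cassert>

// Hypothetical stand-in for a command-line option: its value and
// whether the user set it explicitly.
struct ThresholdOpt {
  unsigned value;
  bool explicitlySet;
};

// Pick the tail-duplication threshold following the rule in the comment.
unsigned pickTailDupThreshold(ThresholdOpt normal, ThresholdOpt aggressive,
                              bool isO3) {
  // Exactly one threshold was set explicitly: that one wins.
  if (normal.explicitlySet != aggressive.explicitlySet)
    return normal.explicitlySet ? normal.value : aggressive.value;
  // Both or neither set: use the aggressive threshold at O3.
  // Assumption: below O3 the normal threshold applies.
  return isO3 ? aggressive.value : normal.value;
}
```

For example, with illustrative values of 2 for the normal threshold and 4 for the aggressive one, and neither set explicitly, this returns 4 at O3 and 2 otherwise.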
iteratee added a comment to D32324: CodeGen: BlockPlacement: Increase tail duplication size for O3..

No, I wanted to get agreement before I re-wrote it. I can do it if you'd like to see it before deciding.

May 12 2017, 10:17 AM

May 11 2017

iteratee accepted D33076: [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC..

Looks fine with the comments I gave addressed.

May 11 2017, 3:54 PM
iteratee updated the diff for D33074: InstCombine: Allow sinking instructions with more uses in the same block..

Now with more Test Case!

May 11 2017, 3:24 PM
iteratee added inline comments to D33076: [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC..
May 11 2017, 2:42 PM
iteratee added inline comments to D33076: [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC..
May 11 2017, 1:11 PM
iteratee added inline comments to D33076: [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC..
May 11 2017, 11:36 AM
iteratee added inline comments to D33076: [PPC] Move the combine "a << (b % (sizeof(a) * 8)) -> (PPCshl a, b)" to the backend. NFC..
May 11 2017, 11:21 AM
iteratee accepted D33037: [IfConversion] Keep the CFG updated incrementally in IfConvertTriangle.

Feel free to send me patches that update the other parts of IfConversion

May 11 2017, 11:09 AM
iteratee added inline comments to D33074: InstCombine: Allow sinking instructions with more uses in the same block..
May 11 2017, 7:41 AM

May 10 2017

iteratee updated the diff for D32774: CodeGen: Power: Add lowering for shifts of v1i128..

Add splat to the output.

May 10 2017, 4:20 PM
iteratee created D33078: CodeGen: BlockPlacement: Add Message strings to asserts. NFC.
May 10 2017, 4:19 PM
iteratee created D33074: InstCombine: Allow sinking instructions with more uses in the same block..
May 10 2017, 2:56 PM
iteratee added a comment to D33037: [IfConversion] Keep the CFG updated incrementally in IfConvertTriangle.

I have on purpose not attacked IfConvertSimple, IfConvertForkedDiamond
and IfConvertDiamond to keep the size of the patch down, and I haven't seen any
test case where they actually go wrong, but I think they suffer from similar
problems since they also use RemoveExtraEdges/analyzeBranch to fix the CFG at
the end.

May 10 2017, 9:33 AM

May 9 2017

iteratee added a comment to D32774: CodeGen: Power: Add lowering for shifts of v1i128..

Ping? Do we know if this vector operation requires the byte splat on power9?

May 9 2017, 1:42 PM
iteratee accepted D32996: [IfConversion] Add missing check in IfConversion/canFallThroughTo.

Looks good to me with the changes mentioned.

May 9 2017, 11:13 AM

May 5 2017

iteratee added inline comments to D32774: CodeGen: Power: Add lowering for shifts of v1i128..
May 5 2017, 1:30 PM
iteratee added inline comments to D32774: CodeGen: Power: Add lowering for shifts of v1i128..
May 5 2017, 10:09 AM
iteratee updated the diff for D32774: CodeGen: Power: Add lowering for shifts of v1i128..

Change vendor to unknown

May 5 2017, 10:09 AM
iteratee abandoned D30728: CodeGen: Placement: Apply triangle heuristic more aggressively at O3..
May 5 2017, 9:58 AM

May 4 2017

iteratee added inline comments to D32324: CodeGen: BlockPlacement: Increase tail duplication size for O3..
May 4 2017, 3:44 PM

May 2 2017

iteratee updated the diff for D32324: CodeGen: BlockPlacement: Increase tail duplication size for O3..

Made the aggressive threshold an option.

May 2 2017, 6:00 PM
iteratee added a comment to D32774: CodeGen: Power: Add lowering for shifts of v1i128..

Note that we should eventually produce the power9 sequence for the non-vector case as well.

May 2 2017, 5:18 PM
iteratee updated the diff for D32774: CodeGen: Power: Add lowering for shifts of v1i128..

Update test to check for the power9 sequence.

May 2 2017, 5:18 PM
iteratee created D32774: CodeGen: Power: Add lowering for shifts of v1i128..
May 2 2017, 4:50 PM