tjablin (Thomas Jablin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 6 2013, 2:59 PM (353 w, 4 d)

Recent Activity

Aug 15 2016

tjablin added a comment to D22243: [PPC] Handling CallInst in PPCBoolRetToInt.

Did this ever land upstream? PR25548 is still open and I can't find a matching commit.

Aug 15 2016, 10:25 AM

Aug 2 2016

tjablin added a comment to D22243: [PPC] Handling CallInst in PPCBoolRetToInt.

LGTM

Aug 2 2016, 6:38 PM

Aug 1 2016

tjablin added a comment to D22243: [PPC] Handling CallInst in PPCBoolRetToInt.

I believe the code is correct, but I have a few quibbles.

Aug 1 2016, 4:14 PM

Jul 28 2016

tjablin retitled D22948: Add Percent Symbol Before Named PPC Registers from to Add Percent Symbol Before Named PPC Registers.
Jul 28 2016, 4:51 PM

Jul 5 2016

tjablin updated the diff for D20310: Teach LLVM about Power 9 D-Form VSX Instructions.

Posting for CY.

Jul 5 2016, 9:32 AM

Jun 23 2016

tjablin added a comment to D21397: Teach SimplifyCFG to Create Switches from InstCombine Or Mask'd Comparisons.

Ping?

Jun 23 2016, 6:48 AM

Jun 20 2016

tjablin committed rL273197: test commit: remove trailing whitespace.
test commit: remove trailing whitespace
Jun 20 2016, 1:50 PM

Jun 16 2016

tjablin updated the diff for D21397: Teach SimplifyCFG to Create Switches from InstCombine Or Mask'd Comparisons.

I have split out and applied the bug fix and the m_ConstantInt => m_APInt portions of the patch separately. What remains is just the new functionality.

Jun 16 2016, 7:00 PM
tjablin retitled D21440: Use m_APInt in SimplifyCFG from to Use m_APInt in SimplifyCFG.
Jun 16 2016, 11:44 AM

Jun 15 2016

tjablin added inline comments to D21417: Fix Side-Conditions in SimplifyCFG for Creating Switches from InstCombine And Mask'd Comparisons.
Jun 15 2016, 6:40 PM
tjablin retitled D21417: Fix Side-Conditions in SimplifyCFG for Creating Switches from InstCombine And Mask'd Comparisons from to Fix Side-Conditions in SimplifyCFG for Creating Switches from InstCombine And Mask'd Comparisons.
Jun 15 2016, 3:52 PM
tjablin abandoned D21315: Reorder SimplifyCFG and SROA?.
Jun 15 2016, 12:43 PM
tjablin retitled D21397: Teach SimplifyCFG to Create Switches from InstCombine Or Mask'd Comparisons from to Teach SimplifyCFG to Create Switches from InstCombine Or Mask'd Comparisons.
Jun 15 2016, 12:41 PM

Jun 14 2016

tjablin added a comment to D21315: Reorder SimplifyCFG and SROA?.

Hi Chandler,
The FoldValueComparisonIntoPredecessors logic already mostly understands the code pattern produced by FoldBranchToCommonDest, but it would need to duplicate most of the logic from EarlyCSE to get the rest of the way there. The current pass order is: CFGSimplification, SROA, EarlyCSE. If either EarlyCSE or SROA runs before SimplifyCFG, there's no problem. Alternatively, a second pass through SimplifyCFG will also generate good code, as long as it runs before InstCombine. InstCombine is problematic because it "strength reduces" some equality comparisons to bitwise operations. For example:

(i == 5334 || i == 5335)

becomes

((i | 1) == 5335)
Jun 14 2016, 10:33 AM
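
The rewrite quoted in the D21315 comment above works because 5334 and 5335 differ only in bit 0, so or-ing in that bit maps both values onto 5335. The following is a minimal C++ sketch of that equivalence, not code from the patch. The function names are hypothetical, and a 32-bit unsigned `i` is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two equality comparisons joined by a logical or.
constexpr bool matchesEither(std::uint32_t i) {
  return i == 5334u || i == 5335u;
}

// Rewritten form: setting bit 0 maps both 5334 and 5335 onto 5335,
// so a single comparison is enough.
constexpr bool matchesOrMask(std::uint32_t i) {
  return (i | 1u) == 5335u;
}

// Hypothetical helper: checks that both forms agree on every value in
// [lo, hi]. hi must be below the maximum uint32_t value so the loop ends.
constexpr bool formsAgree(std::uint32_t lo, std::uint32_t hi) {
  for (std::uint32_t i = lo; i <= hi; ++i)
    if (matchesEither(i) != matchesOrMask(i))
      return false;
  return true;
}

// Compile-time check over a window around the two constants.
static_assert(formsAgree(5000u, 6000u), "the two forms must agree");
```

The same trick applies to any pair of constants that differ in exactly one bit. That is why a pass that looks for chains of equality comparisons no longer recognizes the pattern once it has been rewritten this way.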
tjablin updated the diff for D20310: Teach LLVM about Power 9 D-Form VSX Instructions.

Another update from CY.

Jun 14 2016, 8:03 AM

Jun 13 2016

tjablin retitled D21315: Reorder SimplifyCFG and SROA? from to Reorder SimplifyCFG and SROA?.
Jun 13 2016, 4:31 PM

May 31 2016

tjablin updated the diff for D20310: Teach LLVM about Power 9 D-Form VSX Instructions.

From CY:

May 31 2016, 8:13 AM

May 16 2016

tjablin retitled D20310: Teach LLVM about Power 9 D-Form VSX Instructions from to Teach LLVM about Power 9 D-Form VSX Instructions.
May 16 2016, 3:58 PM

May 12 2016

tjablin updated subscribers of D19825: Power9 - Add exploitation of vector load and store that do not require swaps.
May 12 2016, 12:58 PM

May 9 2016

tjablin updated the diff for D19564: Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation.

Switch to if !() continue; construct instead of nested if per Adrian's comment.

May 9 2016, 4:23 PM
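
The "if !() continue;" construct mentioned in the D19564 update above is the usual early-continue style. The sketch below is a generic C++ illustration with hypothetical function names and conditions, not code from the patch. Both functions compute the same sum.

```cpp
#include <cassert>
#include <vector>

// Nested form: every extra condition adds another indentation level.
int sumNested(const std::vector<int> &values) {
  int sum = 0;
  for (int v : values) {
    if (v > 0) {
      if (v % 2 == 0) {
        sum += v;
      }
    }
  }
  return sum;
}

// Early-continue form: each failed condition skips to the next element,
// so the main work stays at a single indentation level.
int sumEarlyContinue(const std::vector<int> &values) {
  int sum = 0;
  for (int v : values) {
    if (!(v > 0))
      continue;
    if (!(v % 2 == 0))
      continue;
    sum += v;
  }
  return sum;
}
```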

May 5 2016

tjablin updated the diff for D19564: Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation.

Address Adrian's concerns:

May 5 2016, 1:22 PM

Apr 27 2016

tjablin updated the diff for D19564: Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation.

Update to address David's comments. This version of the patch will never insert additional PHINodes for DbgInfoIntrinsics. Instead, DbgInfoIntrinsics are updated when the needed Value is already live in the DbgInfoIntrinsic's BasicBlock. If the Value is dead, it is replaced with Undef.

Apr 27 2016, 4:33 PM

Apr 26 2016

tjablin retitled D19564: Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation from to Update Debug Intrinsics in RewriteUsesOfClonedInstructions in LoopRotation.
Apr 26 2016, 3:34 PM

Apr 22 2016

tjablin added a comment to D18448: Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker.

This is also a fix for 25503 which is important for mesa/llvmpipe on
ppc64le. I know some IBMers would like this to be fixed for 3.8.1.

Apr 22 2016, 7:16 AM

Apr 21 2016

tjablin added a comment to D18448: Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker.

Can this patch please be considered for inclusion in LLVM 3.8.1?

Apr 21 2016, 9:19 AM

Apr 7 2016

tjablin updated D17533: CXX_FAST_TLS calling convention: performance improvement for PPC64.
Apr 7 2016, 6:19 PM
tjablin retitled D18884: Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0 from to Mark CR0 Live if PPCInstrInfo::optimizeCompareInstr Creates a Use of CR0.
Apr 7 2016, 5:54 PM

Apr 4 2016

tjablin updated the diff for D17533: CXX_FAST_TLS calling convention: performance improvement for PPC64.

Simplify logic in getCalleeSavedRegsViaCopy. Add comments.

Apr 4 2016, 5:37 PM

Mar 31 2016

tjablin updated the diff for D18448: Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker.

Add comments per Hal's suggestion.

Mar 31 2016, 5:13 PM

Mar 30 2016

tjablin abandoned D15271: Split functions to create shrink wrapping opportunities.

Replaced with D17948, D17533, and D16984.

Mar 30 2016, 7:49 AM

Mar 24 2016

tjablin retitled D18448: Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker from to Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker.
Mar 24 2016, 8:43 AM

Mar 8 2016

tjablin updated the diff for D17948: Use CXX_FAST_TLS to enable shrink wrapping on PPC.

Switch from a ModulePass to a FunctionPass since outlining is no longer performed.

Mar 8 2016, 5:20 PM

Mar 7 2016

tjablin retitled D17948: Use CXX_FAST_TLS to enable shrink wrapping on PPC from to Use CXX_FAST_TLS to enable shrink wrapping on PPC.
Mar 7 2016, 4:46 PM

Feb 23 2016

tjablin added a reviewer for D17533: CXX_FAST_TLS calling convention: performance improvement for PPC64: hfinkel.
Feb 23 2016, 7:21 AM
tjablin updated the diff for D17533: CXX_FAST_TLS calling convention: performance improvement for PPC64.

Improve test case. Fix formatting errors. Include changes to PPCCallingConv.td. Omit changes to PPCTargetMachine.cpp that were part of another patch.

Feb 23 2016, 7:19 AM

Feb 22 2016

tjablin retitled D17533: CXX_FAST_TLS calling convention: performance improvement for PPC64 from to CXX_FAST_TLS calling convention: performance improvement for PPC64.
Feb 22 2016, 8:06 PM

Feb 8 2016

tjablin updated the diff for D16984: Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge.

Accidentally omitted code for iterating over inner loops. Fix spelling error.

Feb 8 2016, 1:10 PM
tjablin retitled D16984: Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge from to Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge.
Feb 8 2016, 6:49 AM

Feb 6 2016

tjablin updated subscribers of D16893: Keep CodeGenPrepare from preserving the dominator tree.
Feb 6 2016, 8:40 AM

Jan 19 2016

tjablin updated the diff for D15271: Split functions to create shrink wrapping opportunities.

Fix style errors pointed out by David.

Jan 19 2016, 7:33 AM

Jan 18 2016

tjablin updated the diff for D15271: Split functions to create shrink wrapping opportunities.

Per Hal's comments, use a more conservative behavior in the case of Extract and Cast Instructions.

Jan 18 2016, 3:49 PM

Dec 11 2015

tjablin updated the diff for D14851: Add Branch Hints for Highly Biased Branches on PPC.

Enable by default. Remove accidental whitespace additions.

Dec 11 2015, 2:41 PM

Dec 6 2015

tjablin updated the diff for D14851: Add Branch Hints for Highly Biased Branches on PPC.
Dec 6 2015, 4:53 PM
tjablin retitled D15271: Split functions to create shrink wrapping opportunities from to Split functions to create shrink wrapping opportunities .
Dec 6 2015, 4:47 PM

Dec 2 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.

Fix the bug caught by Kit. Sorry.

Dec 2 2015, 8:20 AM

Nov 29 2015

tjablin added a comment to D14064: Convert Returned Constant i1 Values to i32 on PPC64.

Hi Hal,
If you are satisfied, would you mind committing the patch? I don't have
commit access. Thanks!
Tom

Nov 29 2015, 8:00 AM

Nov 25 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.

Move pass to PPC backend, address formatting issues, add early exit when searching for disqualifying PHINodes.

Nov 25 2015, 11:37 AM

Nov 19 2015

tjablin retitled D14851: Add Branch Hints for Highly Biased Branches on PPC from to Add Branch Hints for Highly Biased Branches on PPC.
Nov 19 2015, 5:07 PM

Nov 12 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.

Expand comments to discuss the performance benefits, the limitations of the current implementation, and how these limitations could be addressed in the future. Also, add support for promoting i1 Arguments to i32 when it makes sense.

Nov 12 2015, 5:53 PM

Nov 10 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.

I have added some llvm::Statistics to collect data on how frequently booleans are promoted to integers, and fixed the spacing irregularity.

Nov 10 2015, 7:58 AM

Nov 5 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.
Nov 5 2015, 5:09 PM

Oct 26 2015

tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.
Oct 26 2015, 6:53 AM
tjablin updated D14064: Convert Returned Constant i1 Values to i32 on PPC64.
Oct 26 2015, 6:52 AM
tjablin updated the diff for D14064: Convert Returned Constant i1 Values to i32 on PPC64.
Oct 26 2015, 6:46 AM
tjablin retitled D14064: Convert Returned Constant i1 Values to i32 on PPC64 from to Convert Returned Constant i1 Values to i32 on PPC64.
Oct 26 2015, 6:45 AM

Sep 17 2014

tjablin updated the diff for D5032: Don't constant fold through zero-length fields.

I've updated the patch by replacing the CHECK-NOT test cases with CHECK test cases and included a new test demonstrating that indexing past the end of an array that is not the last field in a structure is already handled correctly.

Sep 17 2014, 4:40 PM
tjablin updated the diff for D5032: Don't constant fold through zero-length fields.
Sep 17 2014, 4:19 PM

Sep 2 2014

tjablin added a comment to D5032: Don't constant fold through zero-length fields.

Ping

Sep 2 2014, 9:52 AM
tjablin added a reviewer for D5032: Don't constant fold through zero-length fields: chandlerc.
Sep 2 2014, 9:52 AM

Aug 29 2014

tjablin updated the diff for D5113: Exit ScalarEvolution::getMulExpr Early when Choose Overflows.

I have made the changes you suggested. The code is functionally equivalent to the original version, but the design of the original version is a bit unclear to me. Basically, the code is trying to combine many AddRecs into a single expression. The run-time is "only" N^2 in the size of Ops, because the second and third loop levels share indexes. I think the underlying issue is the expensive recursive calls to getMulExpr in the inner-most loop.

Aug 29 2014, 5:15 PM
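
Assuming "Choose" in the D5113 title refers to a binomial-coefficient helper, the idea of detecting overflow and letting the caller bail out early can be sketched as below. This is an illustrative C++ version with hypothetical names (`ChooseResult`, `chooseChecked`), not the ScalarEvolution code. Its overflow check is conservative: it also reports overflow when only an intermediate product exceeds 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical result type: the coefficient plus an overflow flag.
struct ChooseResult {
  std::uint64_t value;
  bool overflow;
};

// Computes C(n, k) in 64-bit arithmetic. Reports overflow instead of
// wrapping, so a caller can stop early rather than use a wrong value.
ChooseResult chooseChecked(std::uint64_t n, std::uint64_t k) {
  if (k > n)
    return {0, false};
  if (k > n - k)
    k = n - k; // C(n, k) == C(n, n - k); use the smaller k.
  std::uint64_t r = 1;
  for (std::uint64_t i = 1; i <= k; ++i) {
    const std::uint64_t factor = n - k + i;
    // Conservative check: would r * factor exceed 64 bits?
    if (r > std::numeric_limits<std::uint64_t>::max() / factor)
      return {0, true};
    // After this step r == C(n - k + i, i), so the division is exact.
    r = r * factor / i;
  }
  return {r, false};
}
```

A caller that combines many such coefficients can test the `overflow` flag after each call and return immediately, which avoids the remaining iterations of an otherwise expensive loop.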

Aug 28 2014

tjablin retitled D5113: Exit ScalarEvolution::getMulExpr Early when Choose Overflows from to Exit ScalarEvolution::getMulExpr Early when Choose Overflows.
Aug 28 2014, 5:17 PM
tjablin updated the diff for D5092: Use Store Size not Alloc Size when Coercing.

Add REQUIRES: aarch64-registered-target since the new test looks at the ARM assembly.

Aug 28 2014, 4:42 PM
tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Add more full-stops.

Aug 28 2014, 11:54 AM
tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

As per request, I have modified the comments to use \brief, added full-stops at the ends of sentences, and modified the unit tests to search for whole prototypes. Please let me know if any additional modifications are necessary. Otherwise, could you please push this upstream?

Aug 28 2014, 11:45 AM
tjablin updated the diff for D5092: Use Store Size not Alloc Size when Coercing.

I have adjusted the comment as per your request. If you are satisfied, could you please push it upstream for me. Thanks.

Aug 28 2014, 7:59 AM

Aug 27 2014

tjablin retitled D5092: Use Store Size not Alloc Size when Coercing from to Use Store Size not Alloc Size when Coercing.
Aug 27 2014, 6:38 PM
tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Hi Reid,
I have updated the patch to remove the dependence on PointerMayBeCaptured, and also to add support for searching through PHINodes. Per your suggestion, I have added a new test case to verify that captured values will not be promoted.
Tom

Aug 27 2014, 4:04 PM
tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Hi Reid,
I have prepared a new version of the patch to address your feedback. The patch includes new tests. Arguments are promoted only if they have no padding or if we can prove the padding bytes are not accessed. In either case, it is safe to pass the elements by value without worrying about complications due to padding bytes. I have added a data layout field to the tail.ll test case. On a platform where i32 is 64-bit aligned, the promotion tested by tail.ll is unsound.
Tom

Aug 27 2014, 1:46 PM

Aug 26 2014

tjablin added a comment to D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Unless you think that the padding bytes between the first and second members of the {i32, i64} structure should be preserved, in which case this is not correct.

Aug 26 2014, 10:29 AM
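
The gap discussed in the D5057 comment above can be made concrete with a C++ analogue of the {i32, i64} structure. This is a sketch under the assumption that the C++ struct layout mirrors the IR type on the target in question. `Pair` and the helper names are hypothetical. On ABIs that align 64-bit integers to 8 bytes the gap is 4 bytes; on others it may be 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical C++ analogue of the {i32, i64} structure.
struct Pair {
  std::int32_t first;
  std::int64_t second;
};

// Bytes between the end of `first` and the start of `second`.
constexpr std::size_t paddingAfterFirst() {
  return offsetof(Pair, second) -
         (offsetof(Pair, first) + sizeof(std::int32_t));
}

// Total bytes in the struct that belong to no member.
constexpr std::size_t totalPadding() {
  return sizeof(Pair) - (sizeof(std::int32_t) + sizeof(std::int64_t));
}
```

Passing `first` and `second` as separate scalar arguments drops whatever is stored in those bytes, which is why the promotion is safe only when nothing reads them.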
tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Hi Reid,

Aug 26 2014, 10:21 AM

Aug 25 2014

tjablin updated the diff for D5057: Don't Promote x86_fp80 byval Pointer Arguments.

If the members of a structure passed by value are used only through geps and loads, the isSafeToPromoteArgument function will handle them correctly, so contrary to its name, the patch will allow byval x86_fp80*s to be promoted, provided they are used in a sane way. I have added a new testcase where the argument is used sanely, and consequently its argument is promoted.

Aug 25 2014, 6:52 PM
tjablin added a comment to D5057: Don't Promote x86_fp80 byval Pointer Arguments.

Okay. I'll rethink it tomorrow.

Aug 25 2014, 6:19 PM
tjablin updated D5057: Don't Promote x86_fp80 byval Pointer Arguments.
Aug 25 2014, 6:00 PM
tjablin retitled D5057: Don't Promote x86_fp80 byval Pointer Arguments from to Don't Promote x86_fp80 byval Pointer Arguments.
Aug 25 2014, 5:58 PM
tjablin updated the diff for D5055: Don't Promote Args for Variadic Functions.
Aug 25 2014, 4:24 PM
tjablin added a comment to D5055: Don't Promote Args for Variadic Functions.

I think the answer is 'no' as long as LLVM handles variadic args the way it
currently does. Currently, the fits_in_gp predicate is computed based on
assumptions about the size of arguments on the stack. Anything that could
change that will break variadic functions.

Aug 25 2014, 3:05 PM
tjablin retitled D5055: Don't Promote Args for Variadic Functions from to Don't Promote Args for Variadic Functions.
Aug 25 2014, 1:28 PM

Aug 22 2014

tjablin updated the diff for D5032: Don't constant fold through zero-length fields.

Fix a related bug in ConstantFoldLoadThroughBitcast and add an additional test case.

Aug 22 2014, 4:00 PM
tjablin updated subscribers of D5032: Don't constant fold through zero-length fields.
Aug 22 2014, 1:37 PM
tjablin retitled D5032: Don't constant fold through zero-length fields from to Don't constant fold through zero-length fields.
Aug 22 2014, 1:35 PM

Aug 21 2014

tjablin added a comment to D5012: SROA: Handle a case of store size being smaller than allocation size.

I think this patch might not be the right approach. The underlying issue is
that clang translates:

Aug 21 2014, 4:03 PM