
kmclaughlin (Kerry McLaughlin)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 10 2019, 8:51 AM (92 w, 6 d)

Recent Activity

Yesterday

kmclaughlin committed rG62ee638a8700: [NFC] Add tests for scalable vectorization of loops with in-order reductions (authored by kmclaughlin).
Mon, Apr 19, 3:17 AM
kmclaughlin closed D100385: [NFC] Add tests for scalable vectorization of loops with in-order reductions.
Mon, Apr 19, 3:17 AM · Restricted Project

Fri, Apr 16

kmclaughlin added inline comments to D100385: [NFC] Add tests for scalable vectorization of loops with in-order reductions.
Fri, Apr 16, 6:36 AM · Restricted Project
kmclaughlin updated the diff for D100385: [NFC] Add tests for scalable vectorization of loops with in-order reductions.
  • Addressing review comments
  • Removed -instcombine from the RUN line of scalable-strict-fadd.ll (this was also removed from the fixed-width tests in rG93f54fae9dda)
Fri, Apr 16, 6:31 AM · Restricted Project

Thu, Apr 15

kmclaughlin added inline comments to D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
Thu, Apr 15, 8:50 AM · Restricted Project
kmclaughlin requested review of D100570: [LoopVectorize] Prevent multiple Phis being generated with in-order reductions.
Thu, Apr 15, 8:37 AM · Restricted Project
kmclaughlin committed rG93f54fae9dda: [NFC] Remove the -instcombine flag from strict-fadd.ll (authored by kmclaughlin).
Thu, Apr 15, 7:13 AM

Wed, Apr 14

kmclaughlin accepted D99254: [SVE][LoopVectorize] Fix crash in InnerLoopVectorizer::widenPHIInstruction.
Wed, Apr 14, 11:19 AM · Restricted Project

Tue, Apr 13

kmclaughlin requested review of D100385: [NFC] Add tests for scalable vectorization of loops with in-order reductions.
Tue, Apr 13, 7:06 AM · Restricted Project

Mon, Apr 12

kmclaughlin accepted D100294: [AArch64][SVE] Fix dup/dupq intrinsics for C++..

LGTM!

Mon, Apr 12, 8:43 AM · Restricted Project

Tue, Apr 6

kmclaughlin committed rG7344f3d39a0d: [LoopVectorize] Add strict in-order reduction support for fixed-width… (authored by kmclaughlin).
Tue, Apr 6, 6:47 AM
kmclaughlin closed D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
Tue, Apr 6, 6:46 AM · Restricted Project
kmclaughlin committed rG857b8a73da91: [LoopVectorize] Change the identity element for FAdd (authored by kmclaughlin).
Tue, Apr 6, 4:14 AM
kmclaughlin closed D98963: [LoopVectorize] Change the identity element for FAdd.
Tue, Apr 6, 4:14 AM · Restricted Project

Fri, Mar 26

kmclaughlin accepted D98715: [LoopVectorize] Add support for scalable vectorization of induction variables.

I think this looks good, I just have a couple of minor comments

Fri, Mar 26, 11:41 AM · Restricted Project
kmclaughlin added a comment to D98963: [LoopVectorize] Change the identity element for FAdd.

@dmgreen, @spatel, @david-arm Thank you for all the comments and discussion on this patch. I've reverted back to the original version and, in the FIXME added to getRecurrenceIdentity(), tried to summarise the changes required before we can always return -0.0.

Fri, Mar 26, 7:16 AM · Restricted Project
kmclaughlin updated the diff for D98963: [LoopVectorize] Change the identity element for FAdd.
  • Reverted back to the original patch
Fri, Mar 26, 7:14 AM · Restricted Project

Thu, Mar 25

kmclaughlin committed rG1f4649969062: [SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads (authored by kmclaughlin).
Thu, Mar 25, 7:28 AM
kmclaughlin closed D98506: [SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads.
Thu, Mar 25, 7:28 AM · Restricted Project

Wed, Mar 24

kmclaughlin updated the diff for D98963: [LoopVectorize] Change the identity element for FAdd.

Changed getRecurrenceIdentity to always generate -0.0 for FAdd.

Wed, Mar 24, 11:45 AM · Restricted Project
kmclaughlin added inline comments to D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
Wed, Mar 24, 8:25 AM · Restricted Project
kmclaughlin updated the diff for D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
  • Addressing review comments & fixing clang-format warnings
Wed, Mar 24, 8:24 AM · Restricted Project

Mon, Mar 22

kmclaughlin updated the diff for D98506: [SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads.
  • Added a CHECK line for warnings and another test for the following:
if (cond[i])
 a[i] = b[i] + b[42];
Mon, Mar 22, 9:15 AM · Restricted Project

Mar 19 2021

kmclaughlin accepted D98968: [AArch64] Fix LowerMGATHER to return the chain result for floating point gathers..

Thanks for this fix, LGTM!

Mar 19 2021, 11:02 AM · Restricted Project
kmclaughlin added a comment to D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.

Thanks all for reviewing this!

Mar 19 2021, 10:30 AM · Restricted Project
kmclaughlin updated the diff for D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
  • Changed the name of the flag to -enable-strict-reductions
  • Removed unnecessary changes to create a mask as this was already handled by VPReductionRecipe::execute
  • Simplified checkOrderedReductions for this patch. For now, if Exit is a Phi node we will not set IsOrdered to true.
Mar 19 2021, 10:27 AM · Restricted Project
kmclaughlin requested review of D98963: [LoopVectorize] Change the identity element for FAdd.
Mar 19 2021, 10:09 AM · Restricted Project
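
The choice of -0.0 as the identity for FAdd discussed in D98963 can be illustrated with a small check. This is only a sketch, assuming default IEEE-754 round-to-nearest arithmetic with no fast-math flags; the helper name preservesValue is hypothetical and does not come from the patch.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: returns true if `identity + x` gives back exactly `x`,
// comparing the sign bit as well so that +0.0 and -0.0 are distinguished.
bool preservesValue(double identity, double x) {
  double sum = identity + x;
  return sum == x && std::signbit(sum) == std::signbit(x);
}
```

With these assumptions, preservesValue(-0.0, x) holds for both zeros, while preservesValue(0.0, -0.0) does not, because +0.0 + -0.0 rounds to +0.0.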

Mar 16 2021

kmclaughlin accepted D97858: [AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible.

LGTM! I think it's probably worth checking if we can do this for scatter stores at some point as well.

Mar 16 2021, 5:11 AM · Restricted Project

Mar 12 2021

kmclaughlin requested review of D98506: [SVE][LoopVectorize] Verify support for vectorizing loops with invariant loads.
Mar 12 2021, 7:12 AM · Restricted Project

Mar 11 2021

kmclaughlin requested review of D98435: [LoopVectorize] Add strict in-order reduction support for fixed-width vectorization.
Mar 11 2021, 9:47 AM · Restricted Project
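
As background for why a strict in-order reduction differs from a reassociated one, the sketch below sums the same values in source order and with two independent partial sums. It is an illustration only, not code from D98435: the function names are hypothetical, the two partial sums merely stand in for a vectorization factor of 2, and default floating-point semantics (no fast-math) are assumed.

```cpp
#include <cassert>
#include <cstddef>

// Strict, in-order sum: each element is added to the running total in
// source order, as the scalar loop would do.
double sumInOrder(const double *a, std::size_t n) {
  double acc = 0.0;
  for (std::size_t i = 0; i < n; ++i)
    acc += a[i];
  return acc;
}

// Reassociated sum with two independent partial sums, combined after the
// loop. n is assumed to be even.
double sumTwoLanes(const double *a, std::size_t n) {
  double lane0 = 0.0, lane1 = 0.0;
  for (std::size_t i = 0; i + 1 < n; i += 2) {
    lane0 += a[i];
    lane1 += a[i + 1];
  }
  return lane0 + lane1;
}
```

For the input {1e30, 1.0, -1e30, 1.0}, the in-order sum absorbs the first 1.0 into 1e30 and returns 1.0, whereas the two-lane version cancels the large values in one lane and returns 2.0.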

Mar 4 2021

kmclaughlin added inline comments to D97858: [AArch64][SVE] Fold vector ZExt/SExt into gather loads where possible.
Mar 4 2021, 11:16 AM · Restricted Project

Feb 24 2021

kmclaughlin added inline comments to D97299: [IR][SVE] Add new llvm.experimental.stepvector intrinsic.
Feb 24 2021, 5:27 AM · Restricted Project

Feb 16 2021

kmclaughlin added a comment to D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.

Thanks all for reviewing these changes!

Feb 16 2021, 5:51 AM · Restricted Project
kmclaughlin committed rGba1e150d03ca: [SVE] Add support for scalable vectorization of loops with int/fast FP… (authored by kmclaughlin).
Feb 16 2021, 5:50 AM
kmclaughlin closed D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
Feb 16 2021, 5:50 AM · Restricted Project

Feb 15 2021

kmclaughlin committed rG5fe15934388f: [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax (authored by kmclaughlin).
Feb 15 2021, 5:48 AM
kmclaughlin closed D96604: [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax.
Feb 15 2021, 5:48 AM · Restricted Project
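
One way to see why signed zeros matter for fmin/fmax reductions is that a minimum written as compare-and-select returns a zero whose sign depends on operand order. The sketch below is an assumption-laden illustration rather than anything taken from D96604; selectMin is a hypothetical name and default IEEE-754 comparison semantics are assumed.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: minimum written as a compare followed by a select.
// Because -0.0 < +0.0 is false, the accumulator is kept whenever both
// operands are zeros, so the sign of the result follows the visit order.
double selectMin(double acc, double x) { return x < acc ? x : acc; }
```

Here selectMin(0.0, -0.0) yields +0.0 while selectMin(-0.0, 0.0) yields -0.0, so reordering the elements of a reduction can change the sign of a zero result unless signed zeros are declared not to matter.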

Feb 12 2021

kmclaughlin added inline comments to D96604: [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax.
Feb 12 2021, 9:27 AM · Restricted Project
kmclaughlin updated the diff for D96604: [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax.
  • Reordered the conditions in isRecurrenceInstr
  • Added a comment to explain the @minloopmissingnsz test
Feb 12 2021, 9:27 AM · Restricted Project
kmclaughlin added a comment to D96350: [SVE][LoopVectorize] Enable vectorization of fmin/fmax with nnan.

Hi @spatel, thanks for the explanation. I've created D96604 to try and address the missing check for no-signed-zeros at the function-level.

Feb 12 2021, 6:53 AM · Restricted Project
kmclaughlin requested review of D96604: [LoopVectorizer] Require no-signed-zeros-fp-math=true for fmin/fmax.
Feb 12 2021, 6:52 AM · Restricted Project
kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.

Rebased changes

Feb 12 2021, 6:02 AM · Restricted Project
kmclaughlin committed rGfea06efe7c92: [SVE][LoopVectorize] Support for vectorization of loops with function calls (authored by kmclaughlin).
Feb 12 2021, 5:48 AM
kmclaughlin closed D96356: [SVE][LoopVectorize] Support for vectorization of loops with function calls.
Feb 12 2021, 5:48 AM · Restricted Project

Feb 11 2021

kmclaughlin updated the diff for D96356: [SVE][LoopVectorize] Support for vectorization of loops with function calls.
  • Added the -force-vector-interleave=1 flag to the scalable-call.ll
Feb 11 2021, 5:47 AM · Restricted Project
kmclaughlin updated the diff for D96356: [SVE][LoopVectorize] Support for vectorization of loops with function calls.
  • Removed unnecessary braces from widenCallInstruction
  • Added a test for a loop containing an LLVM intrinsic (@llvm.sin.f64)
Feb 11 2021, 5:28 AM · Restricted Project

Feb 9 2021

kmclaughlin requested review of D96356: [SVE][LoopVectorize] Support for vectorization of loops with function calls.
Feb 9 2021, 10:31 AM · Restricted Project
kmclaughlin added inline comments to D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
Feb 9 2021, 9:13 AM · Restricted Project
kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.

Changes to the tests in scalable-reductions.ll:

  • Removed dso_local from definitions
  • Added a comment on the purpose of the memory_dependence test
  • Added CHECK-REMARK lines for each test in the file
  • Removed the unnecessary fmin/fmax tests where we can't vectorize
Feb 9 2021, 9:12 AM · Restricted Project
kmclaughlin requested review of D96350: [SVE][LoopVectorize] Enable vectorization of fmin/fmax with nnan.
Feb 9 2021, 9:09 AM · Restricted Project
kmclaughlin accepted D96018: [LoopVectorize] NFC: Change computeFeasibleMaxVF to operate on ElementCount..

LGTM!

Feb 9 2021, 6:02 AM · Restricted Project

Feb 5 2021

kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
  • Merged isLegalScalarTypeForSVEMaskedMemOp & isLegalScalarTypeForSVE
  • Return false from isLegalToVectorizeReduction for bfloat types
  • Included isa<ScalableVectorType>(Ty) in the switch statement conditions of useReductionIntrinsic
Feb 5 2021, 6:57 AM · Restricted Project

Feb 4 2021

kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
  • Added a function called isLegalScalarTypeForSVE which checks that the reduction type is supported & added a new test which uses bfloat to scalable-reductions.ll
Feb 4 2021, 7:31 AM · Restricted Project

Feb 3 2021

kmclaughlin added inline comments to D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
Feb 3 2021, 8:23 AM · Restricted Project
kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
  • Moved the Legal->isSafeForAnyVectorWidth() check in computeFeasibleMaxVF further down so that we always check the reductions even if the loop contains memory dependencies. Added a test for this scenario to scalable_reductions.ll.
Feb 3 2021, 8:23 AM · Restricted Project

Feb 1 2021

kmclaughlin accepted D95350: [SVE][LoopVectorize] Add masked load/store and gather/scatter support for SVE.

Thanks for making these changes @david-arm, LGTM

Feb 1 2021, 10:23 AM · Restricted Project
kmclaughlin added inline comments to D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
Feb 1 2021, 10:11 AM · Restricted Project
kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
  • Moved the canVectorizeReductions check to earlier in computeFeasibleMaxVF
  • Updated the RUN lines in scalable_reductions.ll
  • Removed duplicate test for FAdd
Feb 1 2021, 10:10 AM · Restricted Project
kmclaughlin committed rG9b4fcfaa9e8f: [SVE][CodeGen] Remove performMaskedGatherScatterCombine (authored by kmclaughlin).
Feb 1 2021, 6:11 AM
kmclaughlin closed D94525: [SVE][CodeGen] Remove performMaskedGatherScatterCombine.
Feb 1 2021, 6:11 AM · Restricted Project

Jan 29 2021

kmclaughlin accepted D95350: [SVE][LoopVectorize] Add masked load/store and gather/scatter support for SVE.

LGTM!

Jan 29 2021, 5:52 AM · Restricted Project

Jan 28 2021

kmclaughlin added inline comments to D94525: [SVE][CodeGen] Remove performMaskedGatherScatterCombine.
Jan 28 2021, 5:38 AM · Restricted Project

Jan 26 2021

kmclaughlin added a comment to D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.

Thanks for reviewing this patch, all!

Jan 26 2021, 9:04 AM · Restricted Project
kmclaughlin updated the diff for D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
  • Removed changes to LoopVectorizationPlanner::plan and instead check whether reductions can be vectorized in computeFeasibleMaxVF. If any reduction in the loop cannot be vectorized with a scalable VF, we fall back on fixed-width vectorization.
Jan 26 2021, 8:44 AM · Restricted Project

Jan 22 2021

kmclaughlin requested review of D95245: [SVE] Add support for scalable vectorization of loops with int/fast FP reductions.
Jan 22 2021, 9:28 AM · Restricted Project

Jan 13 2021

kmclaughlin committed rG2170e0ee60db: [SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates) (authored by kmclaughlin).
Jan 13 2021, 4:26 AM
kmclaughlin closed D94428: [SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates).
Jan 13 2021, 4:25 AM · Restricted Project

Jan 12 2021

kmclaughlin requested review of D94525: [SVE][CodeGen] Remove performMaskedGatherScatterCombine.
Jan 12 2021, 9:52 AM · Restricted Project

Jan 11 2021

kmclaughlin requested review of D94428: [SVE][CodeGen] CTLZ, CTTZ & CTPOP operations (predicates).
Jan 11 2021, 9:25 AM · Restricted Project
kmclaughlin committed rGc37f68a8885c: [SVE][CodeGen] Fix legalisation of floating-point masked gathers (authored by kmclaughlin).
Jan 11 2021, 3:28 AM
kmclaughlin closed D94171: [SVE][CodeGen] Fix legalisation of floating-point masked gathers.
Jan 11 2021, 3:28 AM · Restricted Project

Jan 8 2021

kmclaughlin updated the diff for D94171: [SVE][CodeGen] Fix legalisation of floating-point masked gathers.

Removed the isVectorUnpack helper added in the previous revision. If the index values are already extended to i64 by an unpkhi/lo, then the gather does not also need to extend the index.
This affects the masked_gather_nxv4f64 test, which has been updated as follows:

Jan 8 2021, 6:42 AM · Restricted Project

Jan 7 2021

kmclaughlin updated the diff for D94171: [SVE][CodeGen] Fix legalisation of floating-point masked gathers.
  • Added a new helper function, isVectorUnpack
  • Added tests which load <vscale x 4 x half> & <vscale x 2 x float>
Jan 7 2021, 3:43 AM · Restricted Project

Jan 6 2021

kmclaughlin requested review of D94171: [SVE][CodeGen] Fix legalisation of floating-point masked gathers.
Jan 6 2021, 5:32 AM · Restricted Project

Dec 18 2020

kmclaughlin committed rG52e4084d9c3b: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter (authored by kmclaughlin).
Dec 18 2020, 3:57 AM
kmclaughlin closed D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter.
Dec 18 2020, 3:57 AM · Restricted Project

Dec 17 2020

kmclaughlin committed rG7c504b6dd063: [AArch64] Renamed sve-masked-scatter-legalise.ll. NFC. (authored by kmclaughlin).
Dec 17 2020, 3:41 AM
kmclaughlin committed rG6d2a78996bee: [SVE][CodeGen] Add bfloat16 support to scalable masked gather (authored by kmclaughlin).
Dec 17 2020, 3:10 AM
kmclaughlin closed D93307: [SVE][CodeGen] Add bfloat16 support to scalable masked gather.
Dec 17 2020, 3:09 AM · Restricted Project

Dec 16 2020

kmclaughlin updated the diff for D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter.
  • Improve codegen where the splat value is a constant, but out of range for the immediate addressing mode, e.g.
mov x8, xzr
add z1.d, z1.d, #32 // =0x20
st1b { z0.d }, p0, [x8, z1.d]
ret

->

mov w8, #32
st1b { z0.d }, p0, [x8, z1.d]
ret
Dec 16 2020, 8:49 AM · Restricted Project
kmclaughlin added inline comments to D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter.
Dec 16 2020, 7:04 AM · Restricted Project
kmclaughlin updated the diff for D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter.
  • Refactored selectGatherScatterAddrMode based on the suggestions from @sdesmalen
  • Added bfloat tests to the new test files added by this patch
  • Removed unused %offset from sve-masked-gather.ll and removed duplicate tests from sve-masked-scatter-legalise.ll
Dec 16 2020, 7:03 AM · Restricted Project
kmclaughlin updated the diff for D93307: [SVE][CodeGen] Add bfloat16 support to scalable masked gather.
  • Added patterns for bfloat16 extract_subvector to AArch64SVEInstrInfo.td
Dec 16 2020, 3:27 AM · Restricted Project
kmclaughlin added inline comments to D93307: [SVE][CodeGen] Add bfloat16 support to scalable masked gather.
Dec 16 2020, 2:52 AM · Restricted Project

Dec 15 2020

kmclaughlin requested review of D93307: [SVE][CodeGen] Add bfloat16 support to scalable masked gather.
Dec 15 2020, 8:35 AM · Restricted Project

Dec 14 2020

kmclaughlin added a comment to D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.

Thanks @cameron.mcinally & @paulwalker-arm for reviewing this patch!

Dec 14 2020, 3:49 AM · Restricted Project
kmclaughlin committed rGc5ced82c8e49: [SVE][CodeGen] Lower scalable floating-point vector reductions (authored by kmclaughlin).
Dec 14 2020, 3:48 AM
kmclaughlin closed D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.
Dec 14 2020, 3:48 AM · Restricted Project

Dec 11 2020

kmclaughlin requested review of D93132: [SVE][CodeGen] Vector + immediate addressing mode for masked gather/scatter.
Dec 11 2020, 10:42 AM · Restricted Project
kmclaughlin added inline comments to D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.
Dec 11 2020, 4:26 AM · Restricted Project
kmclaughlin updated the diff for D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.
  • Reordered the condition added to LowerReductionToSVE which sets RdxVT
Dec 11 2020, 4:23 AM · Restricted Project

Dec 10 2020

kmclaughlin requested review of D93050: [SVE][CodeGen] Lower scalable floating-point vector reductions.
Dec 10 2020, 9:32 AM · Restricted Project
kmclaughlin committed rGabe7775f5a43: [SVE][CodeGen] Extend index of masked gathers (authored by kmclaughlin).
Dec 10 2020, 5:55 AM
kmclaughlin closed D91433: [SVE][CodeGen] Extend index of masked gathers.
Dec 10 2020, 5:55 AM · Restricted Project

Dec 9 2020

kmclaughlin committed rG05edfc54750b: [SVE][CodeGen] Add DAG combines for s/zext_masked_gather (authored by kmclaughlin).
Dec 9 2020, 3:55 AM
kmclaughlin closed D92230: [SVE][CodeGen] Add DAG combines for s/zext_masked_gather.
Dec 9 2020, 3:54 AM · Restricted Project
kmclaughlin committed rG4519ff4b6f02: [SVE][CodeGen] Add the ExtensionType flag to MGATHER (authored by kmclaughlin).
Dec 9 2020, 3:20 AM
kmclaughlin closed D91084: [SVE][CodeGen] Add the ExtensionType flag to MGATHER.
Dec 9 2020, 3:20 AM · Restricted Project

Dec 7 2020

kmclaughlin updated the diff for D92230: [SVE][CodeGen] Add DAG combines for s/zext_masked_gather.
  • Renamed ResNeedsExtend -> ResNeedsSignExtend
  • Added a test to sve-masked-gather-legalize.ll for a zero-extended gather load with multiple uses
Dec 7 2020, 8:03 AM · Restricted Project
kmclaughlin committed rG111f559bbd12: [SVE][CodeGen] Call refineIndexType & refineUniformBase from visitMGATHER (authored by kmclaughlin).
Dec 7 2020, 5:31 AM