samparker (Sam Parker)
User

Projects

User does not belong to any projects.

User Details

User Since
May 11 2015, 7:59 AM (179 w, 4 d)

Recent Activity

Wed, Oct 17

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Cheers Sjoerd, I've added two helper functions to try to clean this up a bit.

Wed, Oct 17, 4:54 AM

Tue, Oct 16

samparker accepted D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).

Thanks, LGTM. One bonus question: are the fused operations fast on the M7?

Tue, Oct 16, 8:19 AM
samparker added a comment to D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).

Good point. Would it be worth adding a test for the M7 though? We seem to be a little lacking in our M-class FP tests.

Tue, Oct 16, 7:09 AM
samparker added inline comments to D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).
Tue, Oct 16, 6:22 AM
samparker accepted D53314: [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2).

Great, LGTM

Tue, Oct 16, 5:55 AM
samparker added inline comments to D53314: [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2).
Tue, Oct 16, 2:14 AM

Mon, Oct 15

samparker accepted D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).

LGTM.

Mon, Oct 15, 5:30 AM
samparker added inline comments to D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).
Mon, Oct 15, 5:25 AM
samparker added inline comments to D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).
Mon, Oct 15, 3:23 AM

Mon, Oct 1

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Two breaking assumptions were that:

  • the base load would be before the offset load.
  • each load would only have one user. This is true, but I had also meant, and assumed, that the sext of the load has one user.
Mon, Oct 1, 8:56 AM
samparker reopened D51983: [ARM] bottom-top mul support in ARMParallelDSP.

This patch was reverted again in rL343082.

Mon, Oct 1, 8:50 AM

Fri, Sep 28

samparker created D52644: [ARM] Prevent DSP and SIM32 being set for v6m.
Fri, Sep 28, 2:10 AM

Thu, Sep 27

samparker added a comment to D52550: [ARM] Check for sel intrinsic use in ARM CGP.

Hi Eli,

Thu, Sep 27, 1:51 AM

Wed, Sep 26

samparker added a reviewer for D52550: [ARM] Check for sel intrinsic use in ARM CGP: dmgreen.
Wed, Sep 26, 7:35 AM
samparker created D52550: [ARM] Check for sel intrinsic use in ARM CGP.
Wed, Sep 26, 7:11 AM

Tue, Sep 25

samparker updated the diff for D52463: [ARM] Fix for PR39060.

Thanks for the suggestions, Sjoerd. I've evidently had a difficult time wrapping (pun intended) my head around this and really should have put some comments up before. Hopefully I've now also illustrated when and how we can use this.

Tue, Sep 25, 10:53 AM
samparker created D52463: [ARM] Fix for PR39060.
Tue, Sep 25, 6:29 AM

Mon, Sep 24

samparker closed D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Committed in rL342870.

Mon, Sep 24, 7:19 AM

Thu, Sep 20

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

The patch caused an assert on some vectorized code because I missed a check that the muls are plain integer types.

Thu, Sep 20, 3:53 AM
samparker reopened D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Commit was reverted in rL342260.

Thu, Sep 20, 3:47 AM
samparker added a comment to D52289: [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33.

Shouldn't we also consider code size here?

Thu, Sep 20, 3:02 AM

Sep 17 2018

samparker added a comment to D52080: [ARM] Cleanup ARM CGP isSupportedValue.

Thanks for the reproducer and for the revert.

Sep 17 2018, 9:31 PM

Sep 14 2018

samparker created D52102: [ARM] Disallow icmp with negative imm and overflow.
Sep 14 2018, 9:22 AM
samparker created D52080: [ARM] Cleanup ARM CGP isSupportedValue.
Sep 14 2018, 2:26 AM
samparker added inline comments to D51983: [ARM] bottom-top mul support in ARMParallelDSP.
Sep 14 2018, 1:07 AM

Sep 13 2018

samparker created D52032: [ARM] Fix FixConsts for ARMCodeGenPrepare.
Sep 13 2018, 6:52 AM

Sep 12 2018

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Added a negative test for loading i8s as well as a positive test for 64-bit macs.

Sep 12 2018, 7:32 AM
samparker created D51983: [ARM] bottom-top mul support in ARMParallelDSP.
Sep 12 2018, 7:01 AM
samparker created D51978: [ARM] Allow truncs as sources in ARM CGP.
Sep 12 2018, 5:02 AM

Sep 11 2018

samparker added inline comments to D50758: [ARM] Allow bitcasts in ARMCodeGenPrepare.
Sep 11 2018, 7:15 AM
samparker created D51920: [ARM] Enable ARMCodeGenPrepare by default.
Sep 11 2018, 3:13 AM

Aug 29 2018

samparker created D51424: [ARM] Exchange MAC operands in ARMParallelDSP.
Aug 29 2018, 6:52 AM

Aug 28 2018

samparker updated the diff for D51101: [ARM] Add smlald support in ARMParallelDSP.

Fixed a couple of typos and added an assert to AddMACCandidate.

Aug 28 2018, 3:12 AM

Aug 22 2018

samparker updated the diff for D51101: [ARM] Add smlald support in ARMParallelDSP.

Hi Sjoerd,

Aug 22 2018, 7:52 AM
samparker created D51101: [ARM] Add smlald support in ARMParallelDSP.
Aug 22 2018, 6:44 AM
samparker updated the diff for D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.

Added test for armv8m.main+dsp.

Aug 22 2018, 5:10 AM
samparker updated the summary of D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:48 AM
samparker updated the summary of D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:47 AM
samparker created D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:33 AM

Aug 21 2018

samparker accepted D51066: [ARM] Handle all-ones mask explicitly in targetShrinkDemandedConstant..

LGTM

Aug 21 2018, 11:40 PM
samparker created D51034: [ARM] Rotated operand patterns for *xtb16.
Aug 21 2018, 7:09 AM

Aug 16 2018

samparker accepted D50846: [ARM][NFC] ARMCodeGenPrepare: some refactoring and algorithm description..

Thanks for putting the time into this, just one nit before it's committed please.

Aug 16 2018, 8:51 AM
samparker closed D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Fixed and recommitted in r339858.

Aug 16 2018, 5:06 AM
samparker updated the diff for D50432: [DAGCombiner] Reduce load widths of shifted masks.

Re-adjusted ShAmt for big endian targets.

Aug 16 2018, 3:44 AM
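
As a rough illustration of why a shift amount needs re-adjusting on big-endian targets when a wide load is narrowed, here is a minimal sketch. It is not the code from D50432; the helper name `narrowLoadByteOffset` and its parameters are hypothetical. It assumes a narrow field that sits `shiftBits` above the least significant bit of a wider value in memory, with all sizes being multiples of 8.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset at which a narrow load must start in order
// to read the field that sits `shiftBits` above the LSB of a wider value.
// On little-endian targets the LSB lives at the lowest address, so the offset
// follows the shift directly. On big-endian targets the LSB lives at the
// highest address, so the bit offset is mirrored within the wide value.
constexpr unsigned narrowLoadByteOffset(unsigned wideBits, unsigned narrowBits,
                                        unsigned shiftBits, bool bigEndian) {
  unsigned bitOffset =
      bigEndian ? wideBits - narrowBits - shiftBits : shiftBits;
  return bitOffset / 8;
}
```

For example, extracting bits 8..15 of a 32-bit value needs a byte load at offset 1 on a little-endian target but at offset 2 on a big-endian one.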
samparker added inline comments to D50432: [DAGCombiner] Reduce load widths of shifted masks.
Aug 16 2018, 3:09 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Thanks for reverting. The issue was that I was assuming that the instruction operands mapped to arguments for CallInsts. Will be recommitting.

Aug 16 2018, 2:09 AM

Aug 15 2018

samparker created D50769: [ARM] Typesize lower bound for ARMCodeGenPrepare.
Aug 15 2018, 5:15 AM
samparker created D50762: [ARM] Ignore GEPs in ARMCodeGenPrepare.
Aug 15 2018, 3:56 AM
samparker created D50759: [ARM] Allow zext in ARMCodeGenPrepare.
Aug 15 2018, 2:59 AM
samparker created D50758: [ARM] Allow bitcasts in ARMCodeGenPrepare.
Aug 15 2018, 2:05 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Rebased.

Aug 15 2018, 1:09 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Changed test regexes.

Aug 15 2018, 12:46 AM

Aug 14 2018

samparker updated the diff for D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.

Removed the unnecessary isa<Instruction> checks and updated the test to actually test.

Aug 14 2018, 8:01 AM
samparker added inline comments to D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.
Aug 14 2018, 7:49 AM
samparker retitled D50054: [ARM] Allow pointer values in ARMCodeGenPrepare from [ARM] Ignore pointer values in ARMCodeGenPrepare to [ARM] Allow pointer values in ARMCodeGenPrepare.
Aug 14 2018, 7:47 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Rebased.

Aug 14 2018, 7:21 AM
samparker updated the diff for D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.

Rebased and fixed the handling of undef values. I've also moved the tests around so we have a standalone file for the call tests. Also added a small piece of control to decide whether we bother to promote or not: if we find nothing but sources, sinks and the icmp, then we don't bother doing anything.

Aug 14 2018, 7:02 AM
samparker added inline comments to D50432: [DAGCombiner] Reduce load widths of shifted masks.
Aug 14 2018, 5:13 AM
samparker updated the diff for D50432: [DAGCombiner] Reduce load widths of shifted masks.

Rebased and updated changes to the x86 codegen tests.

Aug 14 2018, 5:12 AM
samparker added a reviewer for D50432: [DAGCombiner] Reduce load widths of shifted masks: john.brawn.
Aug 14 2018, 1:20 AM
samparker accepted D50667: [ARM] Make PerformSHLSimplify add nodes to the DAG worklist correctly..

Thanks for fixing this, LGTM.

Aug 14 2018, 12:16 AM

Aug 10 2018

samparker updated the diff for D50518: [ARM] Disallow zexts in ARMCodeGenPrepare.
  • Removed commented out code
  • Added some TODOs
  • Expanded the description of what a source and sink are
Aug 10 2018, 5:50 AM
samparker added inline comments to D50518: [ARM] Disallow zexts in ARMCodeGenPrepare.
Aug 10 2018, 3:12 AM

Aug 9 2018

samparker created D50518: [ARM] Disallow zexts in ARMCodeGenPrepare.
Aug 9 2018, 9:17 AM
samparker accepted D50454: [ARM] FP16: codegen support for VTRN.

LGTM

Aug 9 2018, 3:56 AM
samparker updated the diff for D50432: [DAGCombiner] Reduce load widths of shifted masks.

Moved arguments to occupy a single line.

Aug 9 2018, 12:47 AM

Aug 8 2018

samparker accepted D50427: [ARM] FP16: codegen support for VEXT.

LGTM

Aug 8 2018, 5:01 AM
samparker accepted D50030: [ARM] Adjust AND immediates to make them cheaper to select..

Thanks for the extra tests, LGTM.

Aug 8 2018, 4:51 AM
samparker updated the diff for D50079: [ARM] arm.codegen.zeroext intrinsics.

To try to make it clear that these are not user-facing intrinsics, I've renamed them to arm.codegen.zeroext.

Aug 8 2018, 3:58 AM
samparker accepted D50329: [ARM] FP16: vector vmov and vdup support.

Ok, from your reply on the other ticket - LGTM.

Aug 8 2018, 3:51 AM
samparker created D50432: [DAGCombiner] Reduce load widths of shifted masks.
Aug 8 2018, 3:43 AM
samparker accepted D50326: [ARM] FP16: vector VMUL variants.

Cheers, shufflevector always confuses me. LGTM.

Aug 8 2018, 3:03 AM
samparker accepted D50393: [ARM] FP16: support vector INT_TO_FP and FP_TO_INT.

LGTM

Aug 8 2018, 1:37 AM
samparker added inline comments to D50329: [ARM] FP16: vector vmov and vdup support.
Aug 8 2018, 1:30 AM
samparker added inline comments to D50326: [ARM] FP16: vector VMUL variants.
Aug 8 2018, 1:17 AM

Aug 2 2018

samparker added a comment to D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value.

So this started life in the DAGCombiner, where issues around the implementation were raised, along with the point that it would be useful to have earlier in the pipeline. But it seems that there hasn't really been much thought, or discussion, about how this would fit into the existing passes... I think DAG combine has always been the right place for this because we're trying to reuse values, which is something that DAGs are good for. In DAGCombiner::visitANDLike, we already handle ANDs with SRL operands and the motivating example can be addressed with very little effort:

Aug 2 2018, 8:00 AM
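
For readers unfamiliar with the kind of fold under discussion in D49229, the following is a minimal sketch of the underlying arithmetic identity only. It is neither the snippet referred to in the comment above nor code from the patch, and the function names are hypothetical. For unsigned 32-bit values and a shift below 32, masking after a logical shift right gives the same result as masking with the pre-shifted constant and then shifting, so an existing `x & (mask << shift)` value can be reused.

```cpp
#include <cstdint>

// Mask applied after the logical shift right: (x >> shift) & mask.
constexpr uint32_t maskAfterShift(uint32_t x, uint32_t mask, unsigned shift) {
  return (x >> shift) & mask;
}

// Equivalent form that masks first with the shifted constant, which allows an
// already-computed `x & (mask << shift)` to be reused. Assumes shift < 32.
constexpr uint32_t maskBeforeShift(uint32_t x, uint32_t mask, unsigned shift) {
  return (x & (mask << shift)) >> shift;
}

// Compile-time spot checks of the identity.
static_assert(maskAfterShift(0x12345678u, 0xFFu, 8) ==
                  maskBeforeShift(0x12345678u, 0xFFu, 8),
              "both forms extract the same byte");
static_assert(maskAfterShift(0x12345678u, 0xFFu, 8) == 0x56u,
              "bits 8..15 of 0x12345678 are 0x56");
```

Bits of the mask that are lost by `mask << shift` do not matter, because the corresponding high bits of `x >> shift` are already zero.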

Aug 1 2018

samparker added a comment to D50079: [ARM] arm.codegen.zeroext intrinsics.

And what would I need to do about testing for a generic intrinsic? Add to BitCode/compatibility-6.0.ll test and just keep this codegen one too?

Aug 1 2018, 3:46 AM
samparker added inline comments to D50030: [ARM] Adjust AND immediates to make them cheaper to select..
Aug 1 2018, 3:33 AM
samparker added a comment to D50079: [ARM] arm.codegen.zeroext intrinsics.

Yes exactly, I would like to use these for loops. In ARMCodeGenPrepare I need to insert truncs to keep the IR legal, although I've already proved that the value is already zero extended. In those cases I want to use these intrinsics to carry that knowledge through. So it's not trying to work around the lack of info of target specific nodes. Another idea I wondered about was adding flags to instructions, but that seems far more intrusive.

Aug 1 2018, 2:43 AM

Jul 31 2018

samparker created D50079: [ARM] arm.codegen.zeroext intrinsics.
Jul 31 2018, 9:54 AM
samparker created D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.
Jul 31 2018, 7:36 AM
samparker created D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.
Jul 31 2018, 5:35 AM
samparker added a comment to D50030: [ARM] Adjust AND immediates to make them cheaper to select..

Hi Eli,

Jul 31 2018, 5:13 AM

Jul 25 2018

samparker accepted D49585: [ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1..

This LGTM. I don't think there's a problem with solving this niche in the backend.

Jul 25 2018, 2:27 AM
samparker retitled D49239: [ARM]{WIP] SADD16 support in ParallelDSP from [ARM] SADD16 support in ParallelDSP to [ARM]{WIP] SADD16 support in ParallelDSP.
Jul 25 2018, 1:10 AM

Jul 24 2018

samparker added a comment to D49239: [ARM]{WIP] SADD16 support in ParallelDSP.

Is it such a bad idea? Sure, I would like to check whether the sel intrinsic has been used or not, but what happens in the case of inline assembly? The AAPCS is also vague; I'm not sure what a 'public interface' is in terms of an LLVM module. I'd like to have an option where the user can explicitly say it's fine to use these instructions.

Jul 24 2018, 5:32 AM

Jul 23 2018

samparker updated the diff for D49239: [ARM]{WIP] SADD16 support in ParallelDSP.

Added tests for:

  • non load operand to the add,
  • immediate operand to the add,
  • volatile store
  • non-consecutive loads
Jul 23 2018, 9:10 AM
samparker updated the diff for D49239: [ARM]{WIP] SADD16 support in ParallelDSP.

Performed a rebase and added a test from a manually unrolled example. I've also added an option to control the use of the GE writing flags - really I think this should go as a subtarget feature so this can be used across this pass and ARMCodeGenPrepare.

Jul 23 2018, 8:35 AM
samparker added inline comments to D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value.
Jul 23 2018, 7:13 AM
samparker updated the diff for D48832: [ARM] ARMCodeGenPrepare backend pass.

Added another test

Jul 23 2018, 5:07 AM
samparker updated the diff for D48832: [ARM] ARMCodeGenPrepare backend pass.

Now disabled at -O0

Jul 23 2018, 3:40 AM
samparker added inline comments to D49585: [ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1..
Jul 23 2018, 2:47 AM

Jul 20 2018

samparker added a comment to D49585: [ARM] Prefer lsls+lsrs over lsls+ands or lsrs+ands in Thumb1..

I did some work for Thumb-2 last year in a similar vein during the combine phase, in PerformSHLSimplify, but this (unsurprisingly) doesn't handle lshr. I didn't find any headaches from changing the canonical form in those cases, so it would probably be worth having it there.

Jul 20 2018, 3:31 AM
samparker updated subscribers of D32530: [SVE][IR] Scalable Vector IR Type.
Jul 20 2018, 2:41 AM
samparker added a comment to D49229: [AggressiveInstCombine] Fold redundant masking operations of shifted value.

All of your test cases are rooted at an or, so it makes sense to search up from there. Why not start with searching just from or (and xors?) and then add the search from more operators in later patches?

Jul 20 2018, 2:33 AM

Jul 19 2018

samparker accepted D49444: [DAG] Avoid Node Update assertion due to AND simplification.

LGTM, thanks!

Jul 19 2018, 12:49 AM

Jul 18 2018

samparker abandoned D49380: [ARM] Remove some code from PerformCMOVCombine.

Ok, thanks for the clarification. I'll have a look in instcombine.

Jul 18 2018, 8:39 AM
samparker updated the diff for D48832: [ARM] ARMCodeGenPrepare backend pass.

Fixed support for handling switch instructions and added another test.

Jul 18 2018, 8:26 AM
samparker added a comment to D49444: [DAG] Avoid Node Update assertion due to AND simplification.

This looks like an odd solution to me; I haven't seen TokenFactors used like that before. Isn't it okay for the AND to be folded? Why not just check that the AND hasn't been folded into a constant before trying to update its, now non-existent, operands?

Jul 18 2018, 4:55 AM
samparker updated the diff for D48832: [ARM] ARMCodeGenPrepare backend pass.

Fixed a couple of bugs around the truncating of values into the root users. Also added explicit checks to reject signed compares.

Jul 18 2018, 4:04 AM