
samparker (Sam Parker)
User

Projects

User does not belong to any projects.

User Details

User Since
May 11 2015, 7:59 AM (187 w, 4 d)

Recent Activity

Wed, Dec 12

samparker updated the diff for D55373: [LSR] Generate formulae to enable more post-incs.

I've moved the logic under the control of a new TTI flag as it seems that the current shouldFavourPostInc is trying to achieve different things. Hopefully I've also addressed Gil's comments.

Wed, Dec 12, 9:55 AM
samparker added inline comments to D55373: [LSR] Generate formulae to enable more post-incs.
Wed, Dec 12, 1:09 AM

Tue, Dec 11

samparker added a comment to D55373: [LSR] Generate formulae to enable more post-incs.

Okay, thanks. We're also seeing some regressions, so I know I've got some tuning to do. Do you have any idea of the characteristics of your regressions? At the moment I'm thinking:

  • That the costs that I've added here are overly simplistic; for one, I think I need to add a setup cost.
  • It's also probably not worth doing when we know that the loop iteration count is low.
  • In the current state, we also see code size regressions whereas your previous work helps us reduce code size. It may mean that I'll need a different flag to enable this change, but it may also be a symptom of the performance regressions.
Tue, Dec 11, 9:36 AM
samparker added inline comments to D55373: [LSR] Generate formulae to enable more post-incs.
Tue, Dec 11, 12:51 AM

Thu, Dec 6

samparker created D55373: [LSR] Generate formulae to enable more post-incs.
Thu, Dec 6, 8:08 AM

Fri, Nov 30

samparker added a comment to D54899: [LoopStrengthReduce] ComplexityLimit as an option.

Ah, ok, I don't think they're being generated. I will have a look at GenerateConstantOffsetsImpl. Thanks!

Fri, Nov 30, 8:15 AM

Thu, Nov 29

samparker added a comment to D54899: [LoopStrengthReduce] ComplexityLimit as an option.

@qcolombet maybe you could suggest the areas in LSR that would help produce these post-inc loads? What we have with the default complexity is:

Thu, Nov 29, 3:52 AM

Wed, Nov 28

samparker added a comment to D54899: [LoopStrengthReduce] ComplexityLimit as an option.

Hi Quentin

Wed, Nov 28, 11:52 PM
samparker accepted D53190: ARM: avoid infinite combining loop.

Great to see those other test changes! LGTM with the few minor comments, no need to re-review. cheers!

Wed, Nov 28, 7:24 AM
samparker accepted D54961: [AArch64] Add command-line option for SSBS.

LGTM

Wed, Nov 28, 3:43 AM
samparker accepted D54629: [AArch64] Add command-line option for SSBS.

Thanks for the explanation, LGTM!

Wed, Nov 28, 3:43 AM
samparker updated the diff for D54899: [LoopStrengthReduce] ComplexityLimit as an option.

Thanks for taking a look and for the value renamer tip! The test case has now been renamed and reduced.

Wed, Nov 28, 3:37 AM

Tue, Nov 27

samparker added a comment to D53190: ARM: avoid infinite combining loop.

My point was that, even with this extra work, and as you mentioned in the comments, the test case isn't generating optimal code. We could introduce a subs node to remove the unnecessary cmp, making the sub opaque and improving codegen. Unless there's a reason why we couldn't do this?

Tue, Nov 27, 2:59 AM

Mon, Nov 26

samparker created D54899: [LoopStrengthReduce] ComplexityLimit as an option.
Mon, Nov 26, 7:39 AM

Fri, Nov 23

samparker added a comment to D54629: [AArch64] Add command-line option for SSBS.

Hi Pablo,

Fri, Nov 23, 2:23 AM
samparker accepted D54842: [ARM][NFC] codegen tests cleanup: remove dangling check prefixes.

LGTM.

Fri, Nov 23, 1:43 AM

Wed, Nov 21

samparker created D54790: [ARM] Prevent parallel macs for unsigned values.
Wed, Nov 21, 5:21 AM

Mon, Nov 19

samparker updated the diff for D54515: [ARM] Replace trunc for switch as a sink.

Added some comments and created a test file for switch statements, which includes existing tests plus a couple of new ones.

Mon, Nov 19, 2:39 AM

Thu, Nov 15

samparker added inline comments to D54546: [ARM] Don't expand sdiv when optimising for minsize.
Thu, Nov 15, 6:16 AM
samparker created D54570: [DAGCombine] Fix non-deterministic debug output .
Thu, Nov 15, 1:00 AM

Nov 14 2018

samparker created D54515: [ARM] Replace trunc for switch as a sink.
Nov 14 2018, 12:43 AM

Nov 9 2018

samparker created D54308: [ARM] Don't promote i1 types in ARM CGP.
Nov 9 2018, 6:22 AM

Nov 8 2018

samparker created D54254: [ARM] Reorganise some functionality in ARMParallelDSP.
Nov 8 2018, 3:14 AM

Nov 6 2018

samparker updated the diff for D54108: [ARM] Enable mixed types in ARM CGP.

Now disallowing icmps that operate on types that are smaller than TypeSize. The handling of truncs being sources and sinks has also been reverted.

  • we allow casts if their result or operand is <= TypeSize.
  • zexts are sinks if their result > TypeSize.
  • truncs are still sinks if their operand == TypeSize.
  • truncs are still sources if their result == TypeSize.
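
The four rules above can be read as a small classification over cast instructions. The following is only a minimal sketch of that reading, not the code from D54108: the `Cast` struct, `CastKind` enum, and the helper names `isAllowedCast`, `isSink`, and `isSource` are hypothetical and exist purely for illustration, with bit widths standing in for LLVM types.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class CastKind { ZExt, Trunc };

struct Cast {
  CastKind kind;
  unsigned operandBits; // width of the value being cast
  unsigned resultBits;  // width of the value produced
};

// Rule 1: a cast is allowed if its result or its operand is <= TypeSize.
bool isAllowedCast(const Cast &c, unsigned typeSize) {
  assert(typeSize > 0 && "TypeSize must be a positive bit width");
  return c.resultBits <= typeSize || c.operandBits <= typeSize;
}

// Rules 2 and 3: a zext is a sink if its result is wider than TypeSize;
// a trunc is a sink if its operand is exactly TypeSize.
bool isSink(const Cast &c, unsigned typeSize) {
  if (c.kind == CastKind::ZExt)
    return c.resultBits > typeSize;
  return c.operandBits == typeSize;
}

// Rule 4: a trunc is a source if its result is exactly TypeSize.
bool isSource(const Cast &c, unsigned typeSize) {
  return c.kind == CastKind::Trunc && c.resultBits == typeSize;
}
```

Under this sketch, with a TypeSize of 8, a zext from 8 to 32 bits is an allowed sink and a trunc from 32 to 8 bits is an allowed source.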
Nov 6 2018, 6:58 AM
samparker added inline comments to D54108: [ARM] Enable mixed types in ARM CGP.
Nov 6 2018, 6:29 AM
samparker updated the diff for D52550: [ARM] Check for sel intrinsic use in ARM CGP.

On initialisation, record the functions that contain uses of the sel intrinsic so that we now check on a function basis and not module.

Nov 6 2018, 2:14 AM
samparker added a comment to D52550: [ARM] Check for sel intrinsic use in ARM CGP.

Yes, it's certainly complicating that the ABI and ACLE don't talk about inline assembly or other intrinsics and this is why I still want to have the option guarded by a command line option. I will update the patch to look at the function only.

Nov 6 2018, 1:34 AM

Nov 5 2018

samparker created D54108: [ARM] Enable mixed types in ARM CGP.
Nov 5 2018, 8:39 AM
samparker created D54094: [ARM] Turn assert into condition in ARMCGP.
Nov 5 2018, 2:26 AM

Nov 2 2018

samparker updated the diff for D54032: [ARM][ARMCGP] Remove unecessary zexts and truncs.

Fixed a bug that allowed zext to be generated with the same source and destination types.

Nov 2 2018, 8:23 AM
samparker created D54032: [ARM][ARMCGP] Remove unecessary zexts and truncs.
Nov 2 2018, 3:39 AM
samparker added a comment to D53485: [ScheduleDAGRRList] Do not preschedule the node has ADJCALLSTACKDOWN parent.

The one patch I made in this area was corrected by Eli, so he's far more informed than I! The problem that time was also due to interleaving of down/up, and, as Matthias said in his email, I'm not sure why we'd need this ability. I'll take some time today to look into the DAG builder to see if serialising these nodes isn't too much of a pain, because I'd hope that expressing this via dependencies would be better long-term.

Nov 2 2018, 3:24 AM

Nov 1 2018

samparker updated the diff for D53972: [ARM][CGP] Negative constant operand handling.

Thanks Sjoerd.

Nov 1 2018, 4:27 AM
samparker created D53972: [ARM][CGP] Negative constant operand handling.
Nov 1 2018, 2:01 AM

Oct 31 2018

samparker added a comment to D53190: ARM: avoid infinite combining loop.

Ok, fair point. If we are going to introduce a new node to fix this issue, could we have a SUBS node that can be glued to the CMOV?

Oct 31 2018, 12:47 AM

Oct 30 2018

samparker added a comment to D53190: ARM: avoid infinite combining loop.

I thought the normal way to stop combining was to return the original node. Could you not manually replace N with Res and then return N?

Oct 30 2018, 1:32 AM

Oct 26 2018

samparker accepted D53746: [ARM] Fix ARMCodeGenPrepare test cases.

Helpful FileCheck change! Thanks!

Oct 26 2018, 6:53 AM

Oct 24 2018

samparker added a comment to D53562: [ARM] Use the Cortex-A57 sched model for Cortex-A72.

Ok, fair enough, my LNT numbers show that the MISched results are more variable:

Oct 24 2018, 5:48 AM
samparker updated the diff for D53562: [ARM] Use the Cortex-A57 sched model for Cortex-A72.

Added the a72 to a couple of scheduling tests, as well as the basic unroll one.

Oct 24 2018, 12:43 AM
samparker updated the diff for D53562: [ARM] Use the Cortex-A57 sched model for Cortex-A72.

Hey Florian,

Oct 24 2018, 12:25 AM

Oct 23 2018

samparker created D53562: [ARM] Use the Cortex-A57 sched model for Cortex-A72.
Oct 23 2018, 3:43 AM

Oct 17 2018

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Cheers Sjoerd, I've added two helper functions to try to clean this up a bit.

Oct 17 2018, 4:54 AM

Oct 16 2018

samparker accepted D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).

Thanks, LGTM. With one bonus question, are the fused operations fast on the M7..?

Oct 16 2018, 8:19 AM
samparker added a comment to D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).

Good point. Would it be worth adding a test for the M7 though? We seem to be a little lacking in our m-class FP tests.

Oct 16 2018, 7:09 AM
samparker added inline comments to D53315: [ARM] Do not fuse VADD and VMUL, continued (2/2).
Oct 16 2018, 6:22 AM
samparker accepted D53314: [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2).

Great, LGTM

Oct 16 2018, 5:55 AM
samparker added inline comments to D53314: [ARM][NFCI] Do not fuse VADD and VMUL, continued (1/2).
Oct 16 2018, 2:14 AM

Oct 15 2018

samparker accepted D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).

LGTM.

Oct 15 2018, 5:30 AM
samparker added inline comments to D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).
Oct 15 2018, 5:25 AM
samparker added inline comments to D53257: [ARM][NEON] Improve vector popcnt lowering with PADDL (PR39281).
Oct 15 2018, 3:23 AM

Oct 1 2018

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Two breaking assumptions were that:

  • the base load would be before the offset load.
  • each load would only have one user - this is true but I also really meant and assumed that the sext of the load has one user.
Oct 1 2018, 8:56 AM
samparker reopened D51983: [ARM] bottom-top mul support in ARMParallelDSP.

This patch was reverted again in rL343082.

Oct 1 2018, 8:50 AM

Sep 28 2018

samparker created D52644: [ARM] Prevent DSP and SIM32 being set for v6m.
Sep 28 2018, 2:10 AM

Sep 27 2018

samparker added a comment to D52550: [ARM] Check for sel intrinsic use in ARM CGP.

Hi Eli,

Sep 27 2018, 1:51 AM

Sep 26 2018

samparker added a reviewer for D52550: [ARM] Check for sel intrinsic use in ARM CGP: dmgreen.
Sep 26 2018, 7:35 AM
samparker created D52550: [ARM] Check for sel intrinsic use in ARM CGP.
Sep 26 2018, 7:11 AM

Sep 25 2018

samparker updated the diff for D52463: [ARM] Fix for PR39060.

Thanks for the suggestions, Sjoerd. I've evidently had a difficult time wrapping (pun intended) my head around this and really should have put some comments up before. Hopefully I've now also illustrated when and how we can use this.

Sep 25 2018, 10:53 AM
samparker created D52463: [ARM] Fix for PR39060.
Sep 25 2018, 6:29 AM

Sep 24 2018

samparker closed D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Committed in rL342870.

Sep 24 2018, 7:19 AM

Sep 20 2018

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

The patch caused an assert on some vectorized code because I missed a check that the muls are plain integer types. parallel-dsp-top-bottom-neg-vec.ll has been added which was the reproducer provided.

Sep 20 2018, 3:53 AM
samparker reopened D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Commit was reverted in rL342260.

Sep 20 2018, 3:47 AM
samparker added a comment to D52289: [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33.

Shouldn't we also consider code size here?

Sep 20 2018, 3:02 AM

Sep 17 2018

samparker added a comment to D52080: [ARM] Cleanup ARM CGP isSupportedValue.

Thanks for the reproducer and for the revert.

Sep 17 2018, 9:31 PM

Sep 14 2018

samparker created D52102: [ARM] Disallow icmp with negative imm and overflow.
Sep 14 2018, 9:22 AM
samparker created D52080: [ARM] Cleanup ARM CGP isSupportedValue.
Sep 14 2018, 2:26 AM
samparker added inline comments to D51983: [ARM] bottom-top mul support in ARMParallelDSP.
Sep 14 2018, 1:07 AM

Sep 13 2018

samparker created D52032: [ARM] Fix FixConsts for ARMCodeGenPrepare.
Sep 13 2018, 6:52 AM

Sep 12 2018

samparker updated the diff for D51983: [ARM] bottom-top mul support in ARMParallelDSP.

Added a negative test for loading i8s as well as a positive test for 64-bit macs.

Sep 12 2018, 7:32 AM
samparker created D51983: [ARM] bottom-top mul support in ARMParallelDSP.
Sep 12 2018, 7:01 AM
samparker created D51978: [ARM] Allow truncs as sources in ARM CGP.
Sep 12 2018, 5:02 AM

Sep 11 2018

samparker added inline comments to D50758: [ARM] Allow bitcasts in ARMCodeGenPrepare.
Sep 11 2018, 7:15 AM
samparker created D51920: [ARM] Enable ARMCodeGenPrepare by default.
Sep 11 2018, 3:13 AM

Aug 29 2018

samparker created D51424: [ARM] Exchange MAC operands in ARMParallelDSP.
Aug 29 2018, 6:52 AM

Aug 28 2018

samparker updated the diff for D51101: [ARM] Add smlald support in ARMParallelDSP.

Fixed a couple of typos and added an assert to AddMACCandidate.

Aug 28 2018, 3:12 AM

Aug 22 2018

samparker updated the diff for D51101: [ARM] Add smlald support in ARMParallelDSP.

Hi Sjoerd,

Aug 22 2018, 7:52 AM
samparker created D51101: [ARM] Add smlald support in ARMParallelDSP.
Aug 22 2018, 6:44 AM
samparker updated the diff for D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.

Added test for armv8m.main+dsp.

Aug 22 2018, 5:10 AM
samparker updated the summary of D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:48 AM
samparker updated the summary of D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:47 AM
samparker created D51093: [ARM] Set __ARM_FEATURE_SIMD32 for +dsp cores.
Aug 22 2018, 3:33 AM

Aug 21 2018

samparker accepted D51066: [ARM] Handle all-ones mask explicitly in targetShrinkDemandedConstant..

LGTM

Aug 21 2018, 11:40 PM
samparker created D51034: [ARM] Rotated operand patterns for *xtb16.
Aug 21 2018, 7:09 AM

Aug 16 2018

samparker accepted D50846: [ARM][NFC] ARMCodeGenPrepare: some refactoring and algorithm description..

Thanks for putting the time into this, just one nit before it's committed please.

Aug 16 2018, 8:51 AM
samparker closed D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Fixed and recommitted in r339858.

Aug 16 2018, 5:06 AM
samparker updated the diff for D50432: [DAGCombiner] Reduce load widths of shifted masks.

Re-adjusted ShAmt for big endian targets.

Aug 16 2018, 3:44 AM
samparker added inline comments to D50432: [DAGCombiner] Reduce load widths of shifted masks.
Aug 16 2018, 3:09 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Thanks for reverting. The issue was that I was assuming that the instruction operands mapped to arguments for CallInsts. Will be recommitting.

Aug 16 2018, 2:09 AM

Aug 15 2018

samparker created D50769: [ARM] Typesize lower bound for ARMCodeGenPrepare.
Aug 15 2018, 5:15 AM
samparker created D50762: [ARM] Ignore GEPs in ARMCodeGenPrepare.
Aug 15 2018, 3:56 AM
samparker created D50759: [ARM] Allow zext in ARMCodeGenPrepare.
Aug 15 2018, 2:59 AM
samparker created D50758: [ARM] Allow bitcasts in ARMCodeGenPrepare.
Aug 15 2018, 2:05 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Rebased.

Aug 15 2018, 1:09 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Changed test regexes

Aug 15 2018, 12:46 AM

Aug 14 2018

samparker updated the diff for D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.

Removed the unnecessary isa<Instruction> checks and updated the test to actually test.

Aug 14 2018, 8:01 AM
samparker added inline comments to D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.
Aug 14 2018, 7:49 AM
samparker retitled D50054: [ARM] Allow pointer values in ARMCodeGenPrepare from [ARM] Ignore pointer values in ARMCodeGenPrepare to [ARM] Allow pointer values in ARMCodeGenPrepare.
Aug 14 2018, 7:47 AM
samparker updated the diff for D50067: [ARM] Handle signed icmps in ARMCodeGenPrepare.

Rebased.

Aug 14 2018, 7:21 AM
samparker updated the diff for D50054: [ARM] Allow pointer values in ARMCodeGenPrepare.

Rebased and fixed the handling of undef values. I've also moved the tests around so we have a standalone file for the call tests. Also added a small piece of control to decide whether we bother to promote or not: if we find nothing but sources, sinks and the icmp, then we don't bother doing anything.

Aug 14 2018, 7:02 AM
samparker added inline comments to D50432: [DAGCombiner] Reduce load widths of shifted masks.
Aug 14 2018, 5:13 AM
samparker updated the diff for D50432: [DAGCombiner] Reduce load widths of shifted masks.

Rebased and updated changes to the x86 codegen tests.

Aug 14 2018, 5:12 AM