Page MenuHomePhabricator

SjoerdMeijer (Sjoerd Meijer)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 26 2016, 2:17 AM (260 w, 3 d)

Recent Activity

Today

SjoerdMeijer added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Fri, Jan 22, 7:21 AM · Restricted Project
SjoerdMeijer added a comment to D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1..

Thanks. @fhahn @SjoerdMeijer what do we think about the edge case where the width==1? As far as I understand (with this patch):

#pragma clang loop vectorize_predicate(disable) vectorize_width(4)
Gives llvm.loop.vectorize.predicate.enable=false, llvm.loop.vectorize.width=4, llvm.loop.vectorize.scalable.enable=false, llvm.loop.vectorize.enable=true

#pragma clang loop vectorize_predicate(disable) vectorize_width(4)

I guess this is a typo, and should be vectorize_predicate(enable)

Gives llvm.loop.vectorize.predicate.enable=true, llvm.loop.vectorize.width=4, llvm.loop.vectorize.scalable.enable=false, llvm.loop.vectorize.enable=true

#pragma clang loop vectorize_predicate(enable) vectorize_width(1)
Gives llvm.loop.vectorize.width=1, llvm.loop.vectorize.scalable.enable=false

#pragma clang loop vectorize_predicate(disable) vectorize_width(1)
Gives llvm.loop.vectorize.width=1, llvm.loop.vectorize.scalable.enable=false

Should we be adding llvm.loop.vectorize.predicate.enable metadata even when width=1? I would be inclined to emit the predication pragma anyway.
Fri, Jan 22, 6:44 AM · Restricted Project
SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Thanks @shchenz, comments addressed.

Fri, Jan 22, 5:47 AM · Restricted Project
SjoerdMeijer added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Fri, Jan 22, 5:44 AM · Restricted Project

Yesterday

SjoerdMeijer accepted D93476: [LV][ARM] Inloop reduction cost modelling.

Thanks, LGTM too.

Thu, Jan 21, 7:00 AM · Restricted Project
SjoerdMeijer accepted D94457: [AArch64] Add some missing fusion subtarget features.

The existing fusion tests seem a little bit weak—the machine scheduler with the base Cortex-A57 machine models that most of those use seemed to pass the address and literal fusion on its own (by coincidence?) without the explicit features requested. I was unable to make that test case stronger to see the A57 machine model not exploiting those fusion pairs when the feature was disabled. Is that worth worrying about?

Thu, Jan 21, 6:10 AM · Restricted Project
SjoerdMeijer added inline comments to D93476: [LV][ARM] Inloop reduction cost modelling.
Thu, Jan 21, 5:24 AM · Restricted Project
SjoerdMeijer added inline comments to D93476: [LV][ARM] Inloop reduction cost modelling.
Thu, Jan 21, 4:11 AM · Restricted Project
SjoerdMeijer added inline comments to D93476: [LV][ARM] Inloop reduction cost modelling.
Thu, Jan 21, 3:47 AM · Restricted Project
SjoerdMeijer added inline comments to D93476: [LV][ARM] Inloop reduction cost modelling.
Thu, Jan 21, 3:39 AM · Restricted Project

Wed, Jan 20

SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
  • Ported over the only positive test test/CodeGen/X86/sink-cheap-instructions.ll to a MIR test: this is sink_add in the new MIR test file.
  • Walk candidates in reverse order, to start at the bottom of def-use chains, and try using the uses first before attempting to sink a def.
Wed, Jan 20, 6:38 AM · Restricted Project

Tue, Jan 19

SjoerdMeijer accepted D94958: [ARM] Add MVE add.sat costs.

Looks reasonable.

Tue, Jan 19, 5:48 AM · Restricted Project
SjoerdMeijer accepted D94946: [ARM] Expand vXi1 VSELECT's.
Tue, Jan 19, 5:37 AM · Restricted Project
SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
  • Added statistics NumLoopSunk, the number of instructions sunk into a loop
  • Check that a use of candidate is in a loop,
  • Added a bunch of negative tests.
Tue, Jan 19, 5:30 AM · Restricted Project

Mon, Jan 18

SjoerdMeijer added inline comments to D94946: [ARM] Expand vXi1 VSELECT's.
Mon, Jan 18, 11:37 PM · Restricted Project
SjoerdMeijer added a comment to D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1..

I had a look at the Clang Language Extension ... and I saw this:

Specifying a width/count of 1 disables the optimization, and is equivalent to vectorize(disable) or interleave(disable).

Ah yes, thanks. I think we should improve the langref spec also, this should be described in the same paragraph, not floating below some example.

Mon, Jan 18, 5:06 AM · Restricted Project
SjoerdMeijer added reviewers for D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1.: fhahn, Meinersbur.
Mon, Jan 18, 3:39 AM · Restricted Project
SjoerdMeijer added a comment to D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1..

I am trying to remember details here, but first about this:

Mon, Jan 18, 3:38 AM · Restricted Project
SjoerdMeijer added a comment to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Thank you very much for looking at this @shchenz !

Mon, Jan 18, 2:38 AM · Restricted Project

Fri, Jan 15

SjoerdMeijer accepted D94687: [AArch64] Make target intrinsics DefaultAttrIntrinsics..

Looks like we get a lot of happiness from using these attributes. ;-)
I had a look at it before and the general direction also looked okay to me too, which I was happy to see confirmed. Therefore I am happy to accept this, even though I haven't checked all intrinsics because this is a lot of changes. So perhaps wait a day with committing in case anyone wants to have one more look.

Fri, Jan 15, 7:35 AM · Restricted Project
SjoerdMeijer added inline comments to D94779: [Clang] Ensure vector predication pragma is ignored only when vectorization width is 1..
Fri, Jan 15, 7:26 AM · Restricted Project
SjoerdMeijer added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Fri, Jan 15, 7:20 AM · Restricted Project

Thu, Jan 14

SjoerdMeijer accepted D94691: [ARM] Don't run the block placement pass at O0.

Yep, don't think it is useful or necessary to run this at O0.
LGTM.

Thu, Jan 14, 7:58 AM · Restricted Project
SjoerdMeijer accepted D94650: [NewPM] Fix placement of LoopFlatten.

I was going to look at this today, but thanks for fixing! :-)
LGTM

Thu, Jan 14, 1:24 AM · Restricted Project

Wed, Jan 13

SjoerdMeijer accepted D94559: [NewPM] Only non-trivially loop unswitch at -O3 and for non-optsize functions.

To avoid code-size bloat, this is indeed the behaviour that we want.
Thanks, LGTM.

Wed, Jan 13, 12:46 PM · Restricted Project
SjoerdMeijer added inline comments to rG60fda8ebb6dc: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Wed, Jan 13, 9:42 AM
SjoerdMeijer accepted D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Nice one, thanks, LGTM.

Wed, Jan 13, 8:00 AM · Restricted Project

Tue, Jan 12

SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Tue, Jan 12, 9:02 AM · Restricted Project
SjoerdMeijer added a comment to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Little friendly ping: how do we like this little reshuffle?

Tue, Jan 12, 1:20 AM · Restricted Project

Mon, Jan 11

SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Mon, Jan 11, 12:01 PM · Restricted Project
SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Mon, Jan 11, 6:23 AM · Restricted Project
SjoerdMeijer added a comment to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Adding to my previous comment:

Mon, Jan 11, 2:53 AM · Restricted Project
SjoerdMeijer added a comment to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Here's my "pen-and-paper exercise" of the problem that we are solving here, and lets start with some examples.
In these examples, empty blocks means there are no instructions in it that are relevant to this problem.

Mon, Jan 11, 2:35 AM · Restricted Project

Fri, Jan 8

SjoerdMeijer added a comment to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

I have upload WIP patch D93694 to show the direction and motivation for this change here.

Fri, Jan 8, 7:34 AM · Restricted Project
SjoerdMeijer requested review of D94308: [MachineSink] SinkIntoLoop: analyse stores in between. WIP.
Fri, Jan 8, 7:31 AM · Restricted Project
SjoerdMeijer committed rG8af859d514fa: [MachineLoop] New helper isLoopInvariant() (authored by SjoerdMeijer).
[MachineLoop] New helper isLoopInvariant()
Fri, Jan 8, 1:27 AM
SjoerdMeijer closed D94082: [MachineLoop] Add new helper isLoopInvariant().
Fri, Jan 8, 1:27 AM · Restricted Project

Thu, Jan 7

SjoerdMeijer accepted D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line.

Thanks for getting to the bottom of this. Agreed, and also LGTM.

Thu, Jan 7, 7:35 AM · Restricted Project
SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Fixed formatting and description of the command line option. Related to this, and in addition to my previous message, this is off-by-default and is currently covered by exactly 1 test:
test/CodeGen/X86/sink-cheap-instructions.ll. After this NFC change, I will look into extending this.

Thu, Jan 7, 6:55 AM · Restricted Project
SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

See updated Title and Summary.

Thu, Jan 7, 6:46 AM · Restricted Project

Wed, Jan 6

SjoerdMeijer accepted D94189: [ARM] Update isVMOVNOriginalMask to handle single input shuffle vectors.

Looks like a good fix to me.

Wed, Jan 6, 12:58 PM · Restricted Project
SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Wed, Jan 6, 12:36 PM · Restricted Project
SjoerdMeijer accepted D89031: [SVE] Add support to vectorize_width loop pragma for scalable vectors.

LGTM, perhaps wait a day with committing in case there are more comments.

Wed, Jan 6, 9:39 AM · Restricted Project

Tue, Jan 5

SjoerdMeijer accepted D94103: [ConstProp] Constant propagation for get.active.lane.mask instrinsics.

Nice one, LGTM.

Tue, Jan 5, 11:16 AM · Restricted Project
SjoerdMeijer updated the diff for D94082: [MachineLoop] Add new helper isLoopInvariant().

Yep, thanks, that should have been const.

Tue, Jan 5, 7:41 AM · Restricted Project
SjoerdMeijer added a comment to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Yeah, so thanks again for looking at this. After looking more into this, I don't think this change at the moment makes much sense.

Tue, Jan 5, 5:54 AM · Restricted Project
SjoerdMeijer requested review of D94082: [MachineLoop] Add new helper isLoopInvariant().
Tue, Jan 5, 5:38 AM · Restricted Project
SjoerdMeijer accepted D93014: [Clang] Add AArch64 VCMLA LANE variants..
Tue, Jan 5, 1:06 AM · Restricted Project

Mon, Jan 4

SjoerdMeijer added inline comments to D93014: [Clang] Add AArch64 VCMLA LANE variants..
Mon, Jan 4, 9:04 AM · Restricted Project
SjoerdMeijer accepted D93997: [InterleavedAccess] Return correct 'modified' status..

Cheers, LGTM

Mon, Jan 4, 5:30 AM · Restricted Project
SjoerdMeijer accepted D92947: [AArch64] Add patterns for FMCLA*_indexed..

LGTM

Mon, Jan 4, 4:06 AM · Restricted Project
SjoerdMeijer accepted D93622: [ARM] Extend lowering for i64 reductions.
Mon, Jan 4, 3:57 AM · Restricted Project
SjoerdMeijer added inline comments to D93997: [InterleavedAccess] Return correct 'modified' status..
Mon, Jan 4, 1:08 AM · Restricted Project
SjoerdMeijer added inline comments to D93622: [ARM] Extend lowering for i64 reductions.
Mon, Jan 4, 12:58 AM · Restricted Project

Dec 22 2020

SjoerdMeijer added a comment to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Hmmmm, good points, will check. I was assuming that the legality checks were performed, but yeah, this is impressively broken if not!

Dec 22 2020, 8:33 AM · Restricted Project
SjoerdMeijer updated the diff for D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.

Now with a MIR test.

Dec 22 2020, 7:22 AM · Restricted Project
SjoerdMeijer added inline comments to D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Dec 22 2020, 6:04 AM · Restricted Project
SjoerdMeijer requested review of D93694: [MachineLICM][MachineSink] Move SinkIntoLoop from MachineLICM to MachineSink.
Dec 22 2020, 5:49 AM · Restricted Project
SjoerdMeijer committed rGb9b62c28677d: [AArch64] Add a test for MachineLICM SinkIntoLoop. NFC. (authored by SjoerdMeijer).
[AArch64] Add a test for MachineLICM SinkIntoLoop. NFC.
Dec 22 2020, 4:24 AM
SjoerdMeijer committed rG9a6de74d5a9e: [MachineLICM] Add llvm debug messages to SinkIntoLoop. NFC. (authored by SjoerdMeijer).
[MachineLICM] Add llvm debug messages to SinkIntoLoop. NFC.
Dec 22 2020, 1:20 AM
SjoerdMeijer added a comment to D93054: [ARM] Add bank conflict hazarding.

LGTM too.

Dec 22 2020, 12:29 AM · Restricted Project

Dec 21 2020

SjoerdMeijer accepted D93615: [LV] Avoid needless fold tail.

LGTM

Dec 21 2020, 6:46 AM · Restricted Project
SjoerdMeijer added inline comments to D93615: [LV] Avoid needless fold tail.
Dec 21 2020, 5:42 AM · Restricted Project
SjoerdMeijer added inline comments to D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line.
Dec 21 2020, 3:02 AM · Restricted Project
SjoerdMeijer added inline comments to D93395: [clang][cli] Remove -f[no-]trapping-math from -cc1 command line.
Dec 21 2020, 2:29 AM · Restricted Project
SjoerdMeijer added inline comments to D93615: [LV] Avoid needless fold tail.
Dec 21 2020, 12:43 AM · Restricted Project

Dec 15 2020

SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 15 2020, 1:23 AM · Restricted Project

Dec 14 2020

SjoerdMeijer added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Dec 14 2020, 8:36 AM · Restricted Project
SjoerdMeijer added a comment to D93054: [ARM] Add bank conflict hazarding.

Perhaps some tests are missing: one inlined suggestion is related to AssumeITCMBankConflict, and I was wondering about other cases like Size > 4 and SP relative access?

Dec 14 2020, 3:39 AM · Restricted Project

Dec 10 2020

SjoerdMeijer committed rG99ad078b91ed: [AArch64] Cortex-R82: remove crypto (authored by SjoerdMeijer).
[AArch64] Cortex-R82: remove crypto
Dec 10 2020, 4:57 AM
SjoerdMeijer closed D91994: [AArch64] Cortex-R82: remove crypto.
Dec 10 2020, 4:57 AM · Restricted Project, Restricted Project

Dec 9 2020

SjoerdMeijer accepted D91358: [ARM][RegAlloc] Add t2LoopEndDec.

@SjoerdMeijer : I've reviewed the non-target-specific changes now. And that part looks good to me now.

I don't know if the target-specific parts already has been reviewed (or if anything has changed lately). Can someone with ARM/Thumb2 knowledge can acknowledge that part and set "accept revision".

Dec 9 2020, 7:34 AM · Restricted Project
SjoerdMeijer accepted D92553: [ARM] Match dual lane vmovs from insert_vector_elt.

LGTM (with one minor) but an ARM guru should really approve it.

Dec 9 2020, 6:10 AM · Restricted Project

Dec 8 2020

SjoerdMeijer added a reviewer for D91994: [AArch64] Cortex-R82: remove crypto: pbarrio.
Dec 8 2020, 11:20 AM · Restricted Project, Restricted Project
SjoerdMeijer added a comment to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Looking good in general. I think we would need some perf numbers to see if we haven't missed something.
Some more remarks inlined.

Dec 8 2020, 7:05 AM · Restricted Project
SjoerdMeijer accepted D91267: [ARM] Remove copies from low overhead phi inductions..

Looks like a good change to me, and this looked reasonable:

Dec 8 2020, 6:06 AM · Restricted Project
SjoerdMeijer accepted D92227: [ARM] Remove dead instructions before creating VPT block bundles.

Cheers, looks like a good fix to me.

Dec 8 2020, 5:32 AM · Restricted Project
SjoerdMeijer abandoned D92488: [LICM] Add a maximum for the number of instructions to be hoisted.
Dec 8 2020, 4:03 AM · Restricted Project
SjoerdMeijer added a comment to D92488: [LICM] Add a maximum for the number of instructions to be hoisted.

Thanks for the feedback.

Dec 8 2020, 4:02 AM · Restricted Project
SjoerdMeijer committed rG1e260f955d31: [LICM][docs] Document that LICM is also a canonicalization transform. NFC. (authored by SjoerdMeijer).
[LICM][docs] Document that LICM is also a canonicalization transform. NFC.
Dec 8 2020, 3:57 AM

Dec 7 2020

SjoerdMeijer updated the diff for D92488: [LICM] Add a maximum for the number of instructions to be hoisted.
Dec 7 2020, 5:08 AM · Restricted Project
SjoerdMeijer added inline comments to D91267: [ARM] Remove copies from low overhead phi inductions..
Dec 7 2020, 4:04 AM · Restricted Project
SjoerdMeijer accepted D91273: [ARM] Revert low overhead loops with calls before registry allocation..

LGTM

Dec 7 2020, 3:36 AM · Restricted Project

Dec 3 2020

SjoerdMeijer updated the diff for D91994: [AArch64] Cortex-R82: remove crypto.

The short story is that this was "accidentally" getting things right.

Dec 3 2020, 3:51 AM · Restricted Project, Restricted Project

Dec 2 2020

SjoerdMeijer updated the summary of D92488: [LICM] Add a maximum for the number of instructions to be hoisted.
Dec 2 2020, 8:53 AM · Restricted Project
SjoerdMeijer requested review of D92488: [LICM] Add a maximum for the number of instructions to be hoisted.
Dec 2 2020, 8:51 AM · Restricted Project
SjoerdMeijer added a comment to D88962: [SVE] Add support for scalable vectors with vectorize.scalable.enable loop attribute.

Yep, LGTM too

Dec 2 2020, 1:56 AM · Restricted Project
SjoerdMeijer added a comment to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

I have some high-level questions:

  • Are we fighting another optimisations here, some sort of loop-rotate or is this just MBP reshulffling blocks in a way that is not good for us?
  • I know we don't nest WLSTPs for profitability reasons, but in theory we could. Not sure we need to check this though. But in general my impression is that some more test can be added, but perhaps you were still working on that.
  • I am wondering if some cost-modeling is required. For example, if the iteration count is very low, would that change things?
Dec 2 2020, 1:39 AM · Restricted Project

Dec 1 2020

SjoerdMeijer accepted D92373: [ARM] Mark select and selectcc of MVE vector operations as expand..

LGTM

Dec 1 2020, 2:58 AM · Restricted Project
SjoerdMeijer added inline comments to D92369: [ARM] Improve handling of empty VPT blocks in tail predicated loops.
Dec 1 2020, 2:53 AM · Restricted Project
SjoerdMeijer committed rGf44ba251354f: ExtractValue instruction costs (authored by SjoerdMeijer).
ExtractValue instruction costs
Dec 1 2020, 2:43 AM
SjoerdMeijer closed D92317: [LV] ExtractValue instruction costs.
Dec 1 2020, 2:43 AM · Restricted Project
SjoerdMeijer added a comment to D92317: [LV] ExtractValue instruction costs.

Thanks, and I will look into InsertValue too. That may take a few days, but will certainly do that.

Dec 1 2020, 1:05 AM · Restricted Project

Nov 30 2020

SjoerdMeijer added inline comments to D92317: [LV] ExtractValue instruction costs.
Nov 30 2020, 11:43 AM · Restricted Project
SjoerdMeijer committed rG630d37dc1be1: [AArch64] Enable Cortex-A55 schedmodel (authored by SjoerdMeijer).
[AArch64] Enable Cortex-A55 schedmodel
Nov 30 2020, 11:29 AM
SjoerdMeijer closed D88017: [AArch64] Enable Cortex-A55 schedmodel.
Nov 30 2020, 11:28 AM · Restricted Project
SjoerdMeijer updated the diff for D92317: [LV] ExtractValue instruction costs.

Decided to use TTI.getInstructionCost because getVectorInstrCost doesn't work here; it's an aggregate type, not really a vector.
That means the cost for a ExtractValue will be Free now, which I think is actually more accurate (than 1, and also the original 4).
I haven't added InsertValue yet, because that is e.g. not supported by getInstructionCost as it seems a little bit of a different beast.

Nov 30 2020, 9:31 AM · Restricted Project
SjoerdMeijer added a comment to D88017: [AArch64] Enable Cortex-A55 schedmodel.

Thanks. While I do have plans to look into this more soon, I think it's time to commit this thing as it has been proven to do some good, so will do that a bit later today.

Nov 30 2020, 6:50 AM · Restricted Project
SjoerdMeijer added inline comments to D92317: [LV] ExtractValue instruction costs.
Nov 30 2020, 6:30 AM · Restricted Project
SjoerdMeijer requested review of D92317: [LV] ExtractValue instruction costs.
Nov 30 2020, 6:12 AM · Restricted Project