
Today

SjoerdMeijer accepted D88549: [ARM][LowOverheadLoops] Iteration count liveness.

LGTM

Thu, Oct 1, 1:27 AM · Restricted Project
SjoerdMeijer accepted D88542: [ARM][LowOverheadLoops] Start insertion point.
Thu, Oct 1, 1:24 AM · Restricted Project
SjoerdMeijer added inline comments to D88542: [ARM][LowOverheadLoops] Start insertion point.
Thu, Oct 1, 1:07 AM · Restricted Project

Yesterday

SjoerdMeijer updated the diff for D42365: [LoopFlatten] Add a loop-flattening pass.

I've added the test cases from PR40581. Test v0 does not trigger yet, test v1 triggers. I propose adding support for v0 once we've got something in-tree.

Wed, Sep 30, 7:56 AM · Restricted Project
SjoerdMeijer accepted D88554: [RDA] isSafeToDefRegAt: Look at global uses.

Looks like a good fix to me.

Wed, Sep 30, 6:03 AM · Restricted Project
SjoerdMeijer updated the diff for D42365: [LoopFlatten] Add a loop-flattening pass.

This addresses minor issues from @samparker.
@dmgreen had already answered the other question.

Wed, Sep 30, 3:38 AM · Restricted Project
SjoerdMeijer updated the diff for D42365: [LoopFlatten] Add a loop-flattening pass.

Arg, silly! Thanks for letting me know.

Wed, Sep 30, 3:19 AM · Restricted Project
SjoerdMeijer updated the diff for D42365: [LoopFlatten] Add a loop-flattening pass.

Rebased

Wed, Sep 30, 2:13 AM · Restricted Project
SjoerdMeijer added reviewers for D42365: [LoopFlatten] Add a loop-flattening pass: ostannard, samparker, alanphipps.
Wed, Sep 30, 1:46 AM · Restricted Project
SjoerdMeijer commandeered D42365: [LoopFlatten] Add a loop-flattening pass.

With Dave's and Oliver's permission I am commandeering this because I really would like to see this getting committed soonish and I have some bandwidth to progress this.

Wed, Sep 30, 1:45 AM · Restricted Project

Tue, Sep 29

SjoerdMeijer added inline comments to D88419: [RDA] Switch isSafeToMove iterators.
Tue, Sep 29, 1:19 AM · Restricted Project
SjoerdMeijer accepted D88419: [RDA] Switch isSafeToMove iterators.

If Sam has no further questions, this looks good to me.

Tue, Sep 29, 1:07 AM · Restricted Project

Mon, Sep 28

SjoerdMeijer committed rG1696dd27fb61: [ARM][MVE] Enable tail-predication by default (authored by SjoerdMeijer).
[ARM][MVE] Enable tail-predication by default
Mon, Sep 28, 6:08 AM
SjoerdMeijer closed D88093: [ARM][MVE] Enable tail-predication by default.
Mon, Sep 28, 6:08 AM · Restricted Project
SjoerdMeijer added a comment to D88093: [ARM][MVE] Enable tail-predication by default.

Thanks Dave. With D88086 committed now, I don't think there's anything in our way anymore.

Mon, Sep 28, 5:57 AM · Restricted Project
SjoerdMeijer committed rGf39f92c1f610: [ARM][MVE] tail-predication: overflow checks for elementcount, cont'd (authored by SjoerdMeijer).
[ARM][MVE] tail-predication: overflow checks for elementcount, cont'd
Mon, Sep 28, 1:22 AM
SjoerdMeijer closed D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.
Mon, Sep 28, 1:22 AM · Restricted Project
SjoerdMeijer added a comment to D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.

Many thanks @efriedma and @samparker for your help with this work.

Mon, Sep 28, 1:10 AM · Restricted Project

Fri, Sep 25

SjoerdMeijer added a reviewer for D88307: [DON'T MERGE] Jump-threading for finite state automata: alanphipps.
Fri, Sep 25, 8:01 AM · Restricted Project
SjoerdMeijer updated the diff for D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.

I am so happy that this approach works! I.e., this determines equality of TC and ElementCount by calculating two SCEV expressions, subtracting them, and testing the result for 0. A check for the base of the AddRec has also been added now, so I think this addresses all comments.
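
The "subtract and test for 0" idea can be illustrated with a toy model. This is only a sketch in Python, not the LLVM ScalarEvolution API: `linear`, `combine` and `is_zero` are hypothetical helpers, and an expression is assumed to be a plain linear form (a constant plus integer multiples of named symbols).

```python
def linear(const=0, **coeffs):
    """Toy linear expression: const + sum(coeff * symbol).

    The integer key 1 holds the constant term; string keys are symbols.
    """
    expr = dict(coeffs)
    expr[1] = const
    return expr


def combine(lhs, rhs, sign):
    """Return lhs + sign * rhs, term by term."""
    keys = set(lhs) | set(rhs)
    return {k: lhs.get(k, 0) + sign * rhs.get(k, 0) for k in keys}


def is_zero(expr):
    """An expression is zero when every coefficient cancels."""
    return all(c == 0 for c in expr.values())


# Two differently built expressions that should denote the same value:
expected = linear(3, n=1)                            # n + 3
observed = combine(linear(-1, n=1), linear(4), +1)   # (n - 1) + 4

difference = combine(expected, observed, -1)
print(is_zero(difference))
```

Equality is decided on the difference rather than by comparing the two expressions structurally, so two forms that were built in different ways can still be recognised as equal once their terms cancel.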

Fri, Sep 25, 3:39 AM · Restricted Project

Thu, Sep 24

SjoerdMeijer added inline comments to D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.
Thu, Sep 24, 12:12 PM · Restricted Project
SjoerdMeijer updated the diff for D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.

I wanted to write the new checks in a separate patch, as I thought they would be a new lump of code and I wanted to get this clean-up out of the way first. But given our last idea, it is probably best to continue here: the TC == (ElemCount+VW-1) / VW check is hopefully just a minor addition.

Thu, Sep 24, 8:23 AM · Restricted Project
SjoerdMeijer committed rG2fc690ac904c: [ARM] LowoverheadLoops: add an option to disable tail-predication (authored by SjoerdMeijer).
[ARM] LowoverheadLoops: add an option to disable tail-predication
Thu, Sep 24, 5:36 AM
SjoerdMeijer closed D88212: [ARM] LowoverheadLoops: add an option to disable tail-predication.
Thu, Sep 24, 5:36 AM · Restricted Project
SjoerdMeijer added a comment to D88209: [ARM] Check for LSTP side-effects..

Okidoki, nice one

Thu, Sep 24, 5:22 AM · Restricted Project
SjoerdMeijer accepted D88209: [ARM] Check for LSTP side-effects..

Thanks, perfectly clear, LGTM.

Thu, Sep 24, 5:13 AM · Restricted Project
SjoerdMeijer added a comment to D88209: [ARM] Check for LSTP side-effects..

Looks good. Nits aside, I have one inline question asking why we are doing this, and I would like to read the explanation first.

Thu, Sep 24, 4:17 AM · Restricted Project
SjoerdMeijer updated the diff for D88212: [ARM] LowoverheadLoops: add an option to disable tail-predication.
Thu, Sep 24, 4:02 AM · Restricted Project
SjoerdMeijer requested review of D88212: [ARM] LowoverheadLoops: add an option to disable tail-predication.
Thu, Sep 24, 3:59 AM · Restricted Project
SjoerdMeijer added a comment to D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.

Actually, I guess if you could prove that the tripcount is precisely equal to (ElementCount + VectorWidth - 1)/VectorWidth, you could also use that to prove the subtraction doesn't overflow.

This sounds like the same suggestion that I made many moons ago... I suggested taking these values and substituting them into the expected SCEV expression, and then performing some SCEV algebra on it and the vector TC expression, until hopefully they both just equal ElementCount == ElementCount. My quick prototype 'worked', but I don't know if that says much.
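
A small numeric sketch of why the exact trip count helps. It assumes that "the subtraction" refers to the number of elements still to be processed in each vector iteration, ElementCount - i * VectorWidth; that reading is an assumption, as are the helper names `ceil_div` and `remaining_per_iteration`.

```python
def ceil_div(a, b):
    """(a + b - 1) / b for positive integers, i.e. a / b rounded up."""
    return (a + b - 1) // b


def remaining_per_iteration(element_count, vector_width, trip_count):
    """Elements left to process at the start of each vector iteration."""
    return [element_count - i * vector_width for i in range(trip_count)]


element_count, vector_width = 10, 4

# With the exact trip count, every iteration still has work left,
# so the subtraction never drops below zero.
exact_tc = ceil_div(element_count, vector_width)
print(exact_tc, remaining_per_iteration(element_count, vector_width, exact_tc))

# With a trip count that is one too large, the last subtraction goes
# negative, which would wrap around in unsigned arithmetic.
print(remaining_per_iteration(element_count, vector_width, exact_tc + 1))
```

Python integers do not wrap, so the negative value in the second list stands in for the unsigned wrap-around that fixed-width arithmetic would produce.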

Thu, Sep 24, 12:44 AM · Restricted Project

Wed, Sep 23

SjoerdMeijer updated the diff for D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.

Thanks for looking Eli.

Wed, Sep 23, 8:51 AM · Restricted Project

Tue, Sep 22

SjoerdMeijer requested review of D88093: [ARM][MVE] Enable tail-predication by default.
Tue, Sep 22, 6:39 AM · Restricted Project
SjoerdMeijer requested review of D88086: [ARM][MVE] tail-predication: checks for the elementcount, cont'd.
Tue, Sep 22, 3:56 AM · Restricted Project
SjoerdMeijer added a comment to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.

Sorry, I wrote a reply end of last week, but apparently forgot to push submit. So please see my reply inline, but I will open a new review soon, where it's probably best to continue this discussion and my reply.

Tue, Sep 22, 3:35 AM · Restricted Project
SjoerdMeijer accepted D87681: [ARM] Improve VPT predicate tracking.

Thanks, nice one.

Tue, Sep 22, 2:34 AM · Restricted Project

Mon, Sep 21

SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

@SjoerdMeijer

sounds like you've got your environment all setup. Would it be easy for you to quickly test the changes that you suggested earlier?

Sorry, I'm involved in slightly different activity atm. Will do whenever I can.

Mon, Sep 21, 12:31 PM · Restricted Project
SjoerdMeijer updated the diff for D88017: [AArch64] Enable Cortex-A55 schedmodel.
  • pruned the test case, added comments for the different instruction categories to make the test more readable,
  • fixed latencies for the FDIV and FSQRT instructions,
  • regarding @evgeny777's comment "WriteLD should be 3 cycles, not 4": I have kept it as it was, because changing it regressed things a bit in a first benchmark run.
Mon, Sep 21, 12:29 PM · Restricted Project
SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

FYI: I have created https://reviews.llvm.org/D88017 to enable the model.
I will now look at the optimisation guide, correct the obvious mistakes, and create a diff for that, unless @evgeny777 you get there first, let me know.

Mon, Sep 21, 4:15 AM · Restricted Project
SjoerdMeijer requested review of D88017: [AArch64] Enable Cortex-A55 schedmodel.
Mon, Sep 21, 4:12 AM · Restricted Project
SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

I have committed a first version that we can now iterate on; it is not enabled/used yet.

Mon, Sep 21, 2:59 AM · Restricted Project
SjoerdMeijer committed rG4b8ade837e36: [AArch64] Cortex-A55 scheduler model (authored by SjoerdMeijer).
[AArch64] Cortex-A55 scheduler model
Mon, Sep 21, 2:55 AM
SjoerdMeijer closed D46884: [AArch64] Cortex-A55 scheduler model.
Mon, Sep 21, 2:55 AM · Restricted Project
SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

We could perhaps commit it without enabling it to begin with.

Mon, Sep 21, 1:11 AM · Restricted Project
SjoerdMeijer added a comment to D46884: [AArch64] Cortex-A55 scheduler model.

@evgeny777 : how about we commit this to have a baseline that we iterate on?

Mon, Sep 21, 12:43 AM · Restricted Project

Thu, Sep 17

SjoerdMeijer added inline comments to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.
Thu, Sep 17, 7:15 AM · Restricted Project
SjoerdMeijer added inline comments to D87681: [ARM] Improve VPT predicate tracking.
Thu, Sep 17, 5:36 AM · Restricted Project
SjoerdMeijer committed rG6637d72ddd3c: [Lint] Add check for intrinsic get.active.lane.mask (authored by SjoerdMeijer).
[Lint] Add check for intrinsic get.active.lane.mask
Thu, Sep 17, 1:22 AM
SjoerdMeijer closed D87228: [Lint] Add check for intrinsic get.active.lane.mask.
Thu, Sep 17, 1:22 AM · Restricted Project

Wed, Sep 16

SjoerdMeijer updated the summary of D87769: [ARM][MVE] tail-predication: predicate new checks on force-enabled option.
Wed, Sep 16, 9:12 AM · Restricted Project
SjoerdMeijer committed rGb5c3efeb7bc9: [ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled (authored by SjoerdMeijer).
[ARM][MVE] Tail-predication: predicate new elementcount checks on force-enabled
Wed, Sep 16, 9:06 AM
SjoerdMeijer closed D87769: [ARM][MVE] tail-predication: predicate new checks on force-enabled option.
Wed, Sep 16, 9:05 AM · Restricted Project
SjoerdMeijer updated the diff for D87769: [ARM][MVE] tail-predication: predicate new checks on force-enabled option.

Thanks Dave, just for completeness, uploading a new diff with the codegen changes gone, which shouldn't have been there.

Wed, Sep 16, 8:59 AM · Restricted Project
SjoerdMeijer requested review of D87769: [ARM][MVE] tail-predication: predicate new checks on force-enabled option.
Wed, Sep 16, 8:44 AM · Restricted Project
SjoerdMeijer accepted D87751: [RDA] Fix getUniqueReachingDef for self loops.

Cheers, LGTM

Wed, Sep 16, 4:39 AM · Restricted Project
SjoerdMeijer added inline comments to D87751: [RDA] Fix getUniqueReachingDef for self loops.
Wed, Sep 16, 4:05 AM · Restricted Project
SjoerdMeijer accepted D87610: [ARM] Fix tail predication predicate tracking.

Thanks for that. There was a lot going on before, but this now looks like a small, nice change.

Wed, Sep 16, 3:55 AM · Restricted Project
SjoerdMeijer accepted D87753: [ARM] Add more validForTailPredication.

Looks reasonable

Wed, Sep 16, 3:48 AM · Restricted Project
SjoerdMeijer committed rGcb1ef0eaff87: Follow up rG635b87511ec3: forgot to add/commit the new test file. NFC. (authored by SjoerdMeijer).
Follow up rG635b87511ec3: forgot to add/commit the new test file. NFC.
Wed, Sep 16, 1:39 AM
SjoerdMeijer added a comment to D82678: [CGP] Set debug locations when optimizing phi types.

I am also not a debug expert, but this looks like an "innocent" patch to me that makes things a bit better, so that's good. What I am wondering about, though, is why setDebugLoc isn't done in the constructor, which would make the code cleaner here and also means it won't be forgotten. But I don't want to make this bigger than it is, and since I have never really looked into debug info, I also don't know if there would be any disadvantages to doing that. Perhaps others can comment on that.

Wed, Sep 16, 1:13 AM · debug-info, Restricted Project
SjoerdMeijer added a comment to D87610: [ARM] Fix tail predication predicate tracking.

I understood there is an NFC and a non-NFC part of this patch. Is it worth separating this out?

Wed, Sep 16, 12:52 AM · Restricted Project

Tue, Sep 15

SjoerdMeijer committed rG635b87511ec3: [ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount (authored by SjoerdMeijer).
[ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount
Tue, Sep 15, 5:23 AM
SjoerdMeijer closed D87608: [ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount.
Tue, Sep 15, 5:23 AM · Restricted Project
SjoerdMeijer committed rGb4b1b84106a0: [MVE] fix typo in llvm debug message. NFC. (authored by SjoerdMeijer).
[MVE] fix typo in llvm debug message. NFC.
Tue, Sep 15, 2:14 AM
SjoerdMeijer committed rG487412988cea: [MVE] Rename of tests making them consistent with tail-predication tests. NFC. (authored by SjoerdMeijer).
[MVE] Rename of tests making them consistent with tail-predication tests. NFC.
Tue, Sep 15, 1:25 AM
SjoerdMeijer added a comment to D87228: [Lint] Add check for intrinsic get.active.lane.mask.

Little non-urgent ping, but would be nice to get this little guy out of the way.

Tue, Sep 15, 12:37 AM · Restricted Project

Mon, Sep 14

SjoerdMeijer added inline comments to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.
Mon, Sep 14, 3:55 PM · Restricted Project
SjoerdMeijer added inline comments to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.
Mon, Sep 14, 2:15 PM · Restricted Project
SjoerdMeijer updated the diff for D87608: [ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount.

test case clean up.

Mon, Sep 14, 6:11 AM · Restricted Project
SjoerdMeijer requested review of D87608: [ARM][MVE] Tail-predication: use unsigned SCEV ranges for tripcount.
Mon, Sep 14, 6:07 AM · Restricted Project
SjoerdMeijer committed rG676febc044ec: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value (authored by SjoerdMeijer).
[ARM][MVE] Tail-predication: check get.active.lane.mask's TC value
Mon, Sep 14, 3:32 AM
SjoerdMeijer closed D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.
Mon, Sep 14, 3:32 AM · Restricted Project
SjoerdMeijer added a comment to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.

Thanks for that, and agreed with your remarks. I think this is already a bit more generic/flexible and thus better than what we had, but certainly isn't fully generic. I am willing to review this once that becomes important. Then, this logic has to be moved to ScalarEvolution and be made generic.

Mon, Sep 14, 2:54 AM · Restricted Project

Thu, Sep 10

SjoerdMeijer added inline comments to D46884: [AArch64] Cortex-A55 scheduler model.
Thu, Sep 10, 10:48 AM · Restricted Project
SjoerdMeijer updated the diff for D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.

Cheers, comments addressed.

Thu, Sep 10, 5:40 AM · Restricted Project

Wed, Sep 9

SjoerdMeijer added inline comments to D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.
Wed, Sep 9, 7:47 AM · Restricted Project
SjoerdMeijer updated the diff for D86074: [ARM][MVE] Tail-predication: check get.active.lane.mask's TC value.

This is a (partial) rewrite of the patch after we changed the semantics of get.active.lane.mask to accept the loop tripcount as its second argument, and not the backedge-taken count. This now implements several checks to see if the tripcount belongs to this loop.
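
To see why the choice between the tripcount and the backedge-taken count matters, here is a toy Python model of the mask. It assumes that lane i is active when base + i is less than the second argument, which is how I read the revised LangRef wording; `active_lane_mask` is a hypothetical helper, not the intrinsic itself.

```python
def active_lane_mask(base, n, vector_width):
    """Toy model: lane i is active when base + i < n."""
    return [base + lane < n for lane in range(vector_width)]


trip_count = 10                       # scalar loop iterations
backedge_taken_count = trip_count - 1
vector_width = 4

# Last vector iteration starts at element 8; elements 8 and 9 remain.
print(active_lane_mask(8, trip_count, vector_width))

# Passing the backedge-taken count under this reading drops element 9.
print(active_lane_mask(8, backedge_taken_count, vector_width))
```

Under this reading, passing a value that is off by one silently disables the last valid lane, which is one reason to check that the argument really is this loop's tripcount.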

Wed, Sep 9, 7:38 AM · Restricted Project
SjoerdMeijer committed rG8cb8cea1bd7f: [ARM] Fixup of a few test cases. NFC. (authored by SjoerdMeijer).
[ARM] Fixup of a few test cases. NFC.
Wed, Sep 9, 3:16 AM

Tue, Sep 8

SjoerdMeijer accepted D87280: [ARM] Try to rematerialize VCTP instructions.

Rebased after pre-committing the test, of which I've changed the function name too.

Tue, Sep 8, 6:58 AM · Restricted Project
SjoerdMeijer added inline comments to D87280: [ARM] Try to rematerialize VCTP instructions.
Tue, Sep 8, 5:43 AM · Restricted Project

Mon, Sep 7

SjoerdMeijer added inline comments to D86784: [ARM] Skip combining base updates for vld1x NEON intrinsics.
Mon, Sep 7, 7:21 AM · Restricted Project
SjoerdMeijer committed rG288c582fc939: Follow up of rG5f1cad4d296a, slightly reduced test case. NFC. (authored by SjoerdMeijer).
Follow up of rG5f1cad4d296a, slightly reduced test case. NFC.
Mon, Sep 7, 7:12 AM
SjoerdMeijer added a comment to D75512: [LoopVectorizer][ARM] Add preferInloopReduction target hook..

There are some tests for 64bit reductions. We will probably want to enable inloop reductions for them in the future too, as we have the instructions. That will require a lot of costmodel improvements though.

Mon, Sep 7, 3:21 AM · Restricted Project
SjoerdMeijer updated the diff for D87228: [Lint] Add check for intrinsic get.active.lane.mask.

Updated test

Mon, Sep 7, 2:39 AM · Restricted Project
SjoerdMeijer requested review of D87228: [Lint] Add check for intrinsic get.active.lane.mask.
Mon, Sep 7, 2:34 AM · Restricted Project
SjoerdMeijer accepted D86525: [ARM][CostModel] CodeSize costs for i1 arith ops.
Mon, Sep 7, 1:24 AM · Restricted Project
SjoerdMeijer added inline comments to D86525: [ARM][CostModel] CodeSize costs for i1 arith ops.
Mon, Sep 7, 1:16 AM · Restricted Project
SjoerdMeijer added a comment to D86147: [LangRef] Revise semantics of get.active.lane.mask.

Hi Luke, thanks for sharing your thoughts. I agree with your analysis. The in-tree vector extension that I am aware of that supports first faulting loads is Arm's SVE. While I work on Arm's MVE, I hope and think this is useful for SVE (and other targets) too, i.e. I think ffirst mask capability can be used. But since the devil is in the details here, an implementation would need to prove this. Hopefully that happens soon.

Mon, Sep 7, 12:42 AM · Restricted Project
SjoerdMeijer accepted D75512: [LoopVectorizer][ARM] Add preferInloopReduction target hook..

Looks good to me.

Mon, Sep 7, 12:27 AM · Restricted Project

Aug 28 2020

SjoerdMeijer committed rG5f1cad4d296a: [ARM] Skip combining base updates for vld1x NEON intrinsics (authored by SjoerdMeijer).
[ARM] Skip combining base updates for vld1x NEON intrinsics
Aug 28 2020, 12:32 PM
SjoerdMeijer closed D86784: [ARM] Skip combining base updates for vld1x NEON intrinsics.
Aug 28 2020, 12:31 PM · Restricted Project
SjoerdMeijer added inline comments to D86784: [ARM] Skip combining base updates for vld1x NEON intrinsics.
Aug 28 2020, 12:26 PM · Restricted Project
SjoerdMeijer requested review of D86784: [ARM] Skip combining base updates for vld1x NEON intrinsics.
Aug 28 2020, 7:01 AM · Restricted Project
SjoerdMeijer added a comment to D86776: [ARM]{MVE] Enable MVE gathers and scatters by default.

Before we flip the switch, can you give an impression of the performance impact of this? Does this not regress cases, is it overall a win, etc.?

Aug 28 2020, 5:33 AM · Restricted Project
SjoerdMeijer accepted D86613: [ARM][LowOverheadLoops] Liveouts and reductions.

LGTM

Aug 28 2020, 12:25 AM · Restricted Project

Aug 27 2020

SjoerdMeijer added inline comments to D86613: [ARM][LowOverheadLoops] Liveouts and reductions.
Aug 27 2020, 6:50 AM · Restricted Project
SjoerdMeijer added a comment to D86702: [ARM] Fold predicate_cast(load) into vldr p0.

Sorry for kind of asking the usual testing question... but I was curious whether there's a negative test with a pattern where the condition isn't met, i.e. alignment < 4, if that makes sense.

Aug 27 2020, 6:38 AM · Restricted Project
SjoerdMeijer added a comment to D86301: [Verifier] Additional check for get.active.lane.mask.

Ah yes, thanks Eli!
This is reverted in rG1d8af682ef1d, and I will move this to Lint.

Aug 27 2020, 3:06 AM · Restricted Project
SjoerdMeijer committed rGff6dbb231923: Follow up of rGca243b07276a: fixed a typo. NFC. (authored by SjoerdMeijer).
Follow up of rGca243b07276a: fixed a typo. NFC.
Aug 27 2020, 2:54 AM
SjoerdMeijer added a reverting change for rG8d5f64c4edbc: [Verifier] Additional check for intrinsic get.active.lane.mask: rG1d8af682ef1d: Revert "[Verifier] Additional check for intrinsic get.active.lane.mask".
Aug 27 2020, 1:28 AM
SjoerdMeijer committed rG1d8af682ef1d: Revert "[Verifier] Additional check for intrinsic get.active.lane.mask" (authored by SjoerdMeijer).
Revert "[Verifier] Additional check for intrinsic get.active.lane.mask"
Aug 27 2020, 1:28 AM