
samtebbs (Sam Tebbs)
User

Projects

User does not belong to any projects.

User Details

User Since
May 31 2019, 2:34 AM (79 w, 23 h)

Recent Activity

Thu, Dec 3

samtebbs updated the diff for D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Fix tests

Thu, Dec 3, 9:46 AM · Restricted Project

Wed, Dec 2

samtebbs updated the diff for D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Inline canMoveBasicBlock, update BB offsets and make fixFallthrough a lambda.

Wed, Dec 2, 8:40 AM · Restricted Project
samtebbs added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Wed, Dec 2, 7:00 AM · Restricted Project
samtebbs updated the diff for D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

Use BBUtils to get BB offsets, loop over terminators instead of instructions and other clean-up.

Wed, Dec 2, 4:29 AM · Restricted Project
samtebbs added a comment to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.

I have some high-level questions:

  • Are we fighting another optimisation here, some sort of loop-rotate, or is this just MBP reshuffling blocks in a way that is not good for us?

Yeah, MBP moves the loop blocks (preheader, body etc.) closer together but unfortunately moves the WLS's branch target above the WLS. So it does some good things but also some bad things. After looking at MBP I thought it would be simpler to make a target pass rather than juggle things around in MBP in a way that could end up affecting other targets.

  • I know we don't nest WLSTPs for profitability reasons, but in theory we could. Not sure we need to check this though. But in general my impression is that some more tests can be added, but perhaps you were still working on that.

I wasn't sure if it was possible or not, but thought I'd add the checks in there just in case. I'm OK with removing them if they definitely are unnecessary. I'm also working on testing a nested while loop, can you think of any other tests I should add?

  • I am wondering if some cost-modeling is required. For example, if the iteration count is very low, would that change things?

That is a good idea. It would be good to get an idea of the cycles saved by converting the while loop into a LOL compared to those needed by the branches that replace fallthrough.

Wed, Dec 2, 3:31 AM · Restricted Project
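
The layout problem described in the comment above (block placement leaving a WLS's branch target above the WLS itself) can be illustrated with a small sketch. This is not the code from D92385; the block names, the `backwards_wls_branches` helper, and the list-of-names layout model are all hypothetical, chosen only to show what "backwards WLS branch" means in terms of layout order.

```python
from typing import Dict, List, Tuple


def backwards_wls_branches(
    layout: List[str], wls_branches: List[Tuple[str, str]]
) -> List[Tuple[str, str]]:
    """Return the (wls_block, target_block) pairs whose target is laid out
    earlier than the block containing the WLS, i.e. backwards branches."""
    # Map each block name to its position in the final layout order.
    position: Dict[str, int] = {name: i for i, name in enumerate(layout)}
    return [
        (src, dst) for src, dst in wls_branches if position[dst] < position[src]
    ]


# Hypothetical layout: the WLS target ("exit") ended up above the WLS block.
layout = ["entry", "exit", "preheader", "body"]
wls_branches = [("preheader", "exit")]
print(backwards_wls_branches(layout, wls_branches))
```

A pass along the lines discussed in the review would then have to re-arrange the flagged blocks and repair any fallthroughs it breaks, which is where the cost-modelling question raised above comes in.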

Tue, Dec 1

samtebbs added inline comments to D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Tue, Dec 1, 9:07 AM · Restricted Project
samtebbs added inline comments to D92369: [ARM] Improve handling of empty VPT blocks in tail predicated loops.
Tue, Dec 1, 5:36 AM · Restricted Project
samtebbs updated the summary of D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Tue, Dec 1, 5:23 AM · Restricted Project
samtebbs requested review of D92385: [ARM] Add a pass that re-arranges blocks when there is a backwards WLS branch.
Tue, Dec 1, 5:21 AM · Restricted Project

Wed, Nov 25

samtebbs accepted D91938: [ARM] MVE vabd.

Very nice change, looks good to me.

Wed, Nov 25, 3:42 AM · Restricted Project

Fri, Nov 20

samtebbs accepted D91866: [ARM] Cleanup for the MVETailPrediction pass.

Nice, LGTM

Fri, Nov 20, 6:06 AM · Restricted Project
samtebbs added inline comments to D91857: [ARM] Remove dead mov's in preheader of tail predicated loops.
Fri, Nov 20, 6:04 AM · Restricted Project
samtebbs updated the diff for D89800: [ARM][LowOverheadLoops] Don't generate a LOL if lr is redefined after the start.

Rebase

Fri, Nov 20, 5:54 AM · Restricted Project

Thu, Nov 19

samtebbs committed rG8ecb015ed5ad: [ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition (authored by samtebbs).
[ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition
Thu, Nov 19, 9:16 AM
samtebbs closed D91790: [ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition.
Thu, Nov 19, 9:15 AM · Restricted Project
samtebbs updated the diff for D91790: [ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition.

Fix test ordering. This is an NFC so I will commit with the previous approval.

Thu, Nov 19, 8:44 AM · Restricted Project
samtebbs requested review of D91790: [ARM][LowOverheadLoops] Convert intermediate vpr use assertion to condition.
Thu, Nov 19, 5:40 AM · Restricted Project

Wed, Nov 18

samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Wed, Nov 18, 8:01 AM · Restricted Project
samtebbs added a comment to D91705: Fix unused variables in release build.

Thank you @goncharov and @kadircet

Wed, Nov 18, 7:04 AM · Restricted Project
samtebbs committed rGda2e4728c71f: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks (authored by samtebbs).
[ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks
Wed, Nov 18, 4:55 AM
samtebbs closed D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Wed, Nov 18, 4:54 AM · Restricted Project

Tue, Nov 17

samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Use std::any_of and hasVPRUse.

Tue, Nov 17, 5:42 AM · Restricted Project
samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Tue, Nov 17, 3:56 AM · Restricted Project
samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Add another test case and format

Tue, Nov 17, 3:21 AM · Restricted Project
samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Tue, Nov 17, 3:20 AM · Restricted Project
samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Tue, Nov 17, 2:25 AM · Restricted Project

Mon, Nov 16

samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Get VPR def when merging across blocks rather than keeping a pointer to the VCMP.

Mon, Nov 16, 3:17 AM · Restricted Project

Fri, Nov 13

samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Fri, Nov 13, 2:54 AM · Restricted Project
samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Fri, Nov 13, 2:52 AM · Restricted Project

Thu, Nov 12

samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Clean up tests

Thu, Nov 12, 7:59 AM · Restricted Project
samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Thu, Nov 12, 6:00 AM · Restricted Project
samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Check for VPR use rather than predicate, pass VCMP as argument to ReplaceVCMPWithVPT, make nullptr the default VCMP value and remove impossible VPST case.

Thu, Nov 12, 6:00 AM · Restricted Project

Wed, Nov 11

samtebbs updated the diff for D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Add more tests and check for instructions between the VCMP and VPST that use vpr.

Wed, Nov 11, 8:33 AM · Restricted Project

Tue, Nov 10

samtebbs added inline comments to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Tue, Nov 10, 6:17 AM · Restricted Project

Mon, Nov 9

samtebbs added a comment to D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.

Is it possible to write some tests that are hopefully simple but still tail predicated, and contain different kinds of vpt blocks with various instructions in various orders? It probably doesn't matter if the instructions are super sensible so long as they show the different sets of blocks with vcmps followed by predicated/unpredicated instructions.

Mon, Nov 9, 8:05 AM · Restricted Project
samtebbs committed rG40a3f7e48d6b: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT (authored by samtebbs).
[ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT
Mon, Nov 9, 7:05 AM
samtebbs closed D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.
Mon, Nov 9, 7:04 AM · Restricted Project
samtebbs accepted D90964: [ARM] Remove kill flags between VCMP and insertion point.

LGTM! Will we need to make a similar change for the combination that happens in the LowOverheadLoops pass as well?

Mon, Nov 9, 2:01 AM · Restricted Project

Fri, Nov 6

samtebbs added inline comments to D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.
Fri, Nov 6, 8:51 AM · Restricted Project
samtebbs updated the diff for D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.

Don't merge a VCMP and a preceding VPST.

Fri, Nov 6, 8:51 AM · Restricted Project
samtebbs requested review of D90935: [ARM][LowOverheadLoops] Merge VCMP and VPST across VPT blocks.
Fri, Nov 6, 6:12 AM · Restricted Project
samtebbs updated the diff for D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.

Add a check for the VPR def at the VPST.

Fri, Nov 6, 5:41 AM · Restricted Project

Nov 2 2020

samtebbs updated the diff for D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.

Clean up the test.

Nov 2 2020, 5:26 AM · Restricted Project
samtebbs added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

Yeah, I was wondering which way to go with that. The t2DoLoopStartTP is meant to mean "a t2DoLoopStart that is almost certainly going to become a DLSTP". A t2DoLoopStart is for all the low overhead loops that are not expected to change to tail predicated loops. That way we could treat them differently elsewhere in the pipeline if we need to.

Nov 2 2020, 5:04 AM · Restricted Project
samtebbs accepted D90439: [ARM] Treat memcpy/memset/memmove as call instructions for low overhead loops.

LGTM

Nov 2 2020, 5:00 AM · Restricted Project
samtebbs added a comment to D90591: [ARM] Introduce t2DoLoopStartTP.

The TP at the end of the name somewhat implies that this is only for tail predication. Would it make sense to change t2DoLoopStart to take the extra register?

Nov 2 2020, 3:21 AM · Restricted Project
samtebbs added inline comments to D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.
Nov 2 2020, 2:59 AM · Restricted Project

Oct 30 2020

samtebbs abandoned D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.

The code that this change depends on has been reverted so I will close this and re-visit it once those changes have been re-worked.

Oct 30 2020, 8:25 AM · Restricted Project
samtebbs abandoned D89549: [ARM][LowOverheadLoops] Check live-out for InsertPt instead of Start.

The code that this change depends on has been reverted so I will close this and re-visit it once those changes have been re-worked.

Oct 30 2020, 8:25 AM · Restricted Project
samtebbs added a comment to D89800: [ARM][LowOverheadLoops] Don't generate a LOL if lr is redefined after the start.

Thanks for the review. I will delay working on this until @dmgreen's lr patch is in, as it affects this fix.

Oct 30 2020, 8:23 AM · Restricted Project
samtebbs requested review of D90461: [ARM][LowOverheadLoops] Merge a VCMP and the new VPST into a VPT.
Oct 30 2020, 7:51 AM · Restricted Project

Oct 20 2020

samtebbs requested review of D89800: [ARM][LowOverheadLoops] Don't generate a LOL if lr is redefined after the start.
Oct 20 2020, 8:48 AM · Restricted Project

Oct 19 2020

samtebbs added inline comments to D89549: [ARM][LowOverheadLoops] Check live-out for InsertPt instead of Start.
Oct 19 2020, 3:27 AM · Restricted Project
samtebbs added a comment to D89549: [ARM][LowOverheadLoops] Check live-out for InsertPt instead of Start.

Can we add tests for this?

Oct 19 2020, 3:27 AM · Restricted Project

Oct 16 2020

samtebbs added inline comments to D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.
Oct 16 2020, 6:31 AM · Restricted Project
samtebbs updated the diff for D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.

Rebase on top of https://reviews.llvm.org/D89549 and format code.

Oct 16 2020, 6:31 AM · Restricted Project
samtebbs requested review of D89549: [ARM][LowOverheadLoops] Check live-out for InsertPt instead of Start.
Oct 16 2020, 6:18 AM · Restricted Project

Oct 9 2020

samtebbs added inline comments to D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.
Oct 9 2020, 8:02 AM · Restricted Project
samtebbs updated the diff for D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.

Simplify logic and add check for LR being live-out.

Oct 9 2020, 7:58 AM · Restricted Project

Oct 8 2020

samtebbs closed D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Oct 8 2020, 9:24 AM · Restricted Project
samtebbs updated the diff for D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.

Fix test

Oct 8 2020, 8:52 AM · Restricted Project
samtebbs requested review of D89048: [ARM][LowOverheadLoops] Insert loop start at end of block in more cases.
Oct 8 2020, 7:53 AM · Restricted Project

Oct 7 2020

samtebbs accepted D88926: [ARM] Attempt to make Tail predication / RDA more resilient to empty blocks.

Looks good to me, thanks Dave.

Oct 7 2020, 4:05 AM · Restricted Project

Oct 6 2020

samtebbs added a comment to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Hi Amara.

Apologies for this. The fix is just a matter of updating the two failing
tests. What is the process for getting the change back in?

If it's just a trivial update of a test, you can just go ahead and recommit it. If you're in doubt, you can always put it for review again.

Oct 6 2020, 6:45 AM · Restricted Project
samtebbs committed rG68e002e1819f: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV (authored by samtebbs).
[ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV
Oct 6 2020, 6:45 AM
samtebbs added a comment to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Hi Amara.

Oct 6 2020, 4:46 AM · Restricted Project

Oct 5 2020

samtebbs committed rG2573cf3c3d42: [ARM]Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV (authored by samtebbs).
[ARM]Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV
Oct 5 2020, 7:53 AM
samtebbs added a comment to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Thanks. LGTM

Oct 5 2020, 7:53 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Add check for the valid vector types.

Oct 5 2020, 3:00 AM · Restricted Project
samtebbs added a comment to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

One last thing I thought of. Because this is pre-lowering we have to be careful with illegal types.

Can you check that the types are the ones we expect them to be (v16i8/v8i16/v4i32), and add some tests for v2i64 (which should be scalarized)?

Oct 5 2020, 2:53 AM · Restricted Project

Oct 2 2020

samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Oct 2 2020, 6:59 AM · Restricted Project
samtebbs added inline comments to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Oct 2 2020, 6:52 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Perform the folding in a DAG combination rather than lowering.

Oct 2 2020, 4:49 AM · Restricted Project

Sep 29 2020

samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Add type promotion checks and two new tests

Sep 29 2020, 8:16 AM · Restricted Project

Sep 28 2020

samtebbs added inline comments to D88419: [RDA] Switch isSafeToMove iterators.
Sep 28 2020, 8:54 AM · Restricted Project

Sep 24 2020

samtebbs accepted D87819: [ARM] Find VPT implicitly predicated by VCTP.

Nice change! LGTM

Sep 24 2020, 7:17 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Add truncate_sext test

Sep 24 2020, 6:45 AM · Restricted Project
samtebbs abandoned D88022: [ARM][LowOverheadLoops] Check VCMP operands have same def as the VPT before combining.

Closed since it's no longer needed.

Sep 24 2020, 6:34 AM · Restricted Project

Sep 23 2020

samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Allow inverted condition codes

Sep 23 2020, 3:48 AM · Restricted Project

Sep 21 2020

samtebbs added inline comments to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Sep 21 2020, 9:35 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Be a bit more strict.

Sep 21 2020, 9:35 AM · Restricted Project
samtebbs added inline comments to D87819: [ARM] Find VPT implicitly predicated by VCTP.
Sep 21 2020, 7:51 AM · Restricted Project
samtebbs requested review of D88022: [ARM][LowOverheadLoops] Check VCMP operands have same def as the VPT before combining.
Sep 21 2020, 7:28 AM · Restricted Project
samtebbs added inline comments to D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.
Sep 21 2020, 7:27 AM · Restricted Project
samtebbs added inline comments to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Sep 21 2020, 6:35 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Fold using the LHS and check for LHS and RHS opcodes explicitly.

Sep 21 2020, 6:35 AM · Restricted Project

Sep 18 2020

samtebbs added inline comments to D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Sep 18 2020, 5:25 AM · Restricted Project
samtebbs updated the diff for D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.

Support commuted operands and add tests to cover them.

Sep 18 2020, 5:25 AM · Restricted Project

Sep 17 2020

samtebbs requested review of D87836: [ARM] Fold select_cc(vecreduce_[u|s][min|max], x) into VMINV or VMAXV.
Sep 17 2020, 8:19 AM · Restricted Project

Sep 16 2020

samtebbs added inline comments to D87751: [RDA] Fix getUniqueReachingDef for self loops.
Sep 16 2020, 3:06 AM · Restricted Project
samtebbs added a comment to D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.

Tests updated in rGac2717bfdd0d

Sep 16 2020, 3:02 AM · Restricted Project
samtebbs committed rGac2717bfdd0d: [ARM][LowOverheadLoops] Fix tests after ef0b9f3 (authored by samtebbs).
[ARM][LowOverheadLoops] Fix tests after ef0b9f3
Sep 16 2020, 3:01 AM
samtebbs closed D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.

Closed by rGef0b9f3307a1

Sep 16 2020, 1:36 AM · Restricted Project
samtebbs committed rGef0b9f3307a1: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT (authored by samtebbs).
[ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT
Sep 16 2020, 1:34 AM

Sep 15 2020

samtebbs added inline comments to D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.
Sep 15 2020, 6:36 AM · Restricted Project
samtebbs updated the diff for D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.

Clean up the formatting a little.

Sep 15 2020, 6:33 AM · Restricted Project
samtebbs updated the diff for D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.

Remove -O3 from test and improve VCMP detection.

Sep 15 2020, 6:28 AM · Restricted Project

Sep 14 2020

samtebbs requested review of D87616: [ARM][LowOverheadLoops] Combine a VCMP and VPST into a VPT.
Sep 14 2020, 8:02 AM · Restricted Project

Sep 10 2020

samtebbs committed rGb81c57d646e4: [ARM][LowOverheadLoops] Allow tail predication on predicated instructions with… (authored by samtebbs).
[ARM][LowOverheadLoops] Allow tail predication on predicated instructions with…
Sep 10 2020, 2:35 AM
samtebbs closed D87376: [ARM][LowOverheadLoops] Allow tail predication on predicated instructions with unknown lane values.
Sep 10 2020, 2:35 AM · Restricted Project