
Yesterday

dmgreen edited reviewers for D100585: [ARM][disassembler] Fix incorrect number of operands MCInst generated by the disassembler, added: dmgreen, DavidSpickett, ostannard; removed: greened.

Can you add a test using llvm-mc -show-inst?

Tue, Apr 20, 11:23 PM · Restricted Project
dmgreen added a comment to D100464: [DSE] Remove stores in the same loop iteration.

Sigh. I should probably have found that problem. I hadn't considered multiple overriding stores happening like that.

Tue, Apr 20, 2:27 PM · Restricted Project
dmgreen accepted D100871: [COST][AARCH64] Improve cost of reverse shuffles for AArch64..

Thanks. LGTM

Tue, Apr 20, 12:59 PM · Restricted Project
dmgreen requested review of D100882: [AArch64] Improve vector reverse lowering.
Tue, Apr 20, 12:57 PM · Restricted Project
dmgreen added inline comments to D100871: [COST][AARCH64] Improve cost of reverse shuffles for AArch64..
Tue, Apr 20, 12:46 PM · Restricted Project
dmgreen added inline comments to D100245: [ARM] Expand VMOVRRD simplification pattern.
Tue, Apr 20, 10:20 AM · Restricted Project
dmgreen committed rG21a8b9d9e9e1: [ARM] Limit PerformExtractEltToVMOVRRD to when f64 is legal. (authored by dmgreen).
Tue, Apr 20, 8:25 AM
dmgreen added inline comments to D100244: [ARM] Create VMOVRRD from adjacent vector extracts.
Tue, Apr 20, 7:51 AM · Restricted Project
dmgreen committed rG48cef1fa8ee6: [ARM] Create VMOVRRD from adjacent vector extracts (authored by dmgreen).
Tue, Apr 20, 7:16 AM
dmgreen closed D100244: [ARM] Create VMOVRRD from adjacent vector extracts.
Tue, Apr 20, 7:16 AM · Restricted Project
dmgreen committed rG806b47ade3f6: [ARM] Regenerate a couple of tests. NFC (authored by dmgreen).
Tue, Apr 20, 2:56 AM
dmgreen added a comment to D97947: [AArch64] Force runtime unrolling for in-order scheduling models.

I was under the impression that without a -mcpu it defaulted to the cortex-a53 schedule. It looks like it's no-schedule though, which still counts as an in-order core as it has no MicroOpBufferSize. Can we check whether ST->getSchedModel().ProcID != 0? A ProcID of 0 will be the "NoSchedModel".

Tue, Apr 20, 2:48 AM · Restricted Project
dmgreen added a comment to D100245: [ARM] Expand VMOVRRD simplification pattern.

ping

Tue, Apr 20, 12:33 AM · Restricted Project

Mon, Apr 19

dmgreen committed rGca8eef7e3da8: [CodeGen] Use ProcResGroup information in SchedBoundary (authored by dpenry).
Mon, Apr 19, 1:28 PM
dmgreen closed D98976: [CodeGen] Use ProcResGroup information in SchedBoundary.
Mon, Apr 19, 1:28 PM · Restricted Project
dmgreen committed rG78a871abf701: [ARM] Use ProcResGroup in Cortex-M7 scheduling model (authored by dpenry).
Mon, Apr 19, 1:23 PM
dmgreen closed D98977: [ARM] Use ProcResGroup in Cortex-M7 scheduling model.
Mon, Apr 19, 1:23 PM · Restricted Project
dmgreen updated the diff for D100244: [ARM] Create VMOVRRD from adjacent vector extracts.

Fix typo.

Mon, Apr 19, 4:53 AM · Restricted Project
dmgreen added inline comments to D100244: [ARM] Create VMOVRRD from adjacent vector extracts.
Mon, Apr 19, 4:47 AM · Restricted Project
dmgreen added inline comments to D100435: [ARM] Transforming memset to Tail predicated Loop.
Mon, Apr 19, 2:06 AM · Restricted Project
dmgreen added a comment to D100244: [ARM] Create VMOVRRD from adjacent vector extracts.

ping

Mon, Apr 19, 1:26 AM · Restricted Project
dmgreen accepted D99723: [ARM] Transforming memcpy to Tail predicated Loop.
Mon, Apr 19, 1:23 AM · Restricted Project
dmgreen added a comment to D100476: [AArch64][SVEIntrinsicOpts] Replace last{a,b} intrinsic calls with extracts....

Thanks for the change. This looks very sensible to me, so long as the other SVE folks agree.

Mon, Apr 19, 1:11 AM · Restricted Project

Fri, Apr 16

dmgreen updated the diff for D100464: [DSE] Remove stores in the same loop iteration.

Rebase and move the condition logic into the start of IsGuaranteedLoopInvariant.

Fri, Apr 16, 11:02 AM · Restricted Project
dmgreen committed rG093f1828e58c: [ARM] Prevent phi-node-elimination from generating copy above t2WhileLoopStartLR (authored by malharJ).
Fri, Apr 16, 8:45 AM
dmgreen closed D100376: [ARM] Prevent phi-node-elimination from generating copy above t2WhileLoopStartLR.
Fri, Apr 16, 8:45 AM · Restricted Project
dmgreen accepted D100463: [AArch64][SVEIntrinsicOpts] Fold sve_convert_from_svbool(zero) to zero.

Nice one, thanks. This looks good to me.

Fri, Apr 16, 5:57 AM · Restricted Project
dmgreen committed rG00a60454734c: [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC (authored by dmgreen).
Fri, Apr 16, 3:53 AM
dmgreen closed D99940: [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC.
Fri, Apr 16, 3:52 AM · Restricted Project
dmgreen accepted D100121: [LV] Let selectVectorizationFactor reason directly on VectorizationFactor..

Thanks. Seems like a useful step forward. LGTM.

Fri, Apr 16, 1:07 AM · Restricted Project
dmgreen added a comment to D100464: [DSE] Remove stores in the same loop iteration.

Thanks for the patch! Given that this only shifts the invocation of LI, there shouldn't be any problems in terms of compile time.

Fri, Apr 16, 12:54 AM · Restricted Project
dmgreen updated the diff for D100464: [DSE] Remove stores in the same loop iteration.

Add a ContainsIrreducibleLoops flag, move LoopInfo and add some irreducible tests.

Fri, Apr 16, 12:53 AM · Restricted Project

Thu, Apr 15

dmgreen added a comment to D99272: [AArch64] Adds a pre-indexed paired Load/Store optimization for LDR-STR..

It would be good to see some extra tests for various edge cases, like offsets near to the boundaries and different pairs of instructions being combined/not.

Thu, Apr 15, 9:17 AM · Restricted Project
dmgreen added inline comments to D100121: [LV] Let selectVectorizationFactor reason directly on VectorizationFactor..
Thu, Apr 15, 7:53 AM · Restricted Project
dmgreen requested review of D100550: [ARM] Ensure loop invariant active.lane.mask operands.
Thu, Apr 15, 4:35 AM · Restricted Project
dmgreen added inline comments to D99723: [ARM] Transforming memcpy to Tail predicated Loop.
Thu, Apr 15, 2:44 AM · Restricted Project
dmgreen added a comment to D100476: [AArch64][SVEIntrinsicOpts] Replace last{a,b} intrinsic calls with extracts....

Like D100463, could this be done in instcombine/TTI::instCombineIntrinsic?

Thu, Apr 15, 2:07 AM · Restricted Project
dmgreen added a comment to D100463: [AArch64][SVEIntrinsicOpts] Fold sve_convert_from_svbool(zero) to zero.

Hi @dmgreen! This is SVE-specific, and SVEIntrinsicOpts.cpp is where such transformations are typically placed (at least for now). I did a quick grep and it seems SVE intrinsics don't currently have much of a presence in generic passes like instcombine/constant folding, perhaps because some SVE optimisations are more complex than others and I guess it makes sense to keep them all in the same place.

X86 also has llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp which houses instcombine-like optimisations for X86. 🙂

Thu, Apr 15, 2:06 AM · Restricted Project
dmgreen accepted D98564: [AArch64] Peephole rule to remove redundant cmp after cset..

I think I managed to convince myself that this is correct. But there is a lot that could go wrong or be subtly forgotten; this code has been a bit error-prone in the past. Hopefully now that it's gained some infrastructure that's less likely.

Thu, Apr 15, 1:34 AM · Restricted Project

Wed, Apr 14

dmgreen accepted D100376: [ARM] Prevent phi-node-elimination from generating copy above t2WhileLoopStartLR.

Thanks. LGTM

Wed, Apr 14, 11:56 PM · Restricted Project
dmgreen added inline comments to D99723: [ARM] Transforming memcpy to Tail predicated Loop.
Wed, Apr 14, 1:22 PM · Restricted Project
dmgreen accepted D100317: [TTI] NFC: Change getArithmeticInstrCost to return InstructionCost.

Looks good

Wed, Apr 14, 5:32 AM · Restricted Project
dmgreen accepted D100315: [TTI] NFC: Change getVectorInstrCost to return InstructionCost.

LGTM

Wed, Apr 14, 5:27 AM · Restricted Project
dmgreen accepted D100314: [TTI] NFC: Change getShuffleCost to return InstructionCost.

Looks like a mechanical extension to the other changes.

Wed, Apr 14, 5:24 AM · Restricted Project
dmgreen accepted D100304: [AArch64][NEON] Match (or (and -a b) (and (a+1) b)) => bit select.

I would have gone the other way, making the mtriple and mattr both command line args. LGTM either way though.

Wed, Apr 14, 5:20 AM · Restricted Project
dmgreen added a comment to D100463: [AArch64][SVEIntrinsicOpts] Fold sve_convert_from_svbool(zero) to zero.

Why is this kind of thing not done in instcombine? Or even constant folding if that is what this is really doing.

Wed, Apr 14, 5:14 AM · Restricted Project
dmgreen requested review of D100464: [DSE] Remove stores in the same loop iteration.
Wed, Apr 14, 4:15 AM · Restricted Project
dmgreen added a comment to D100435: [ARM] Transforming memset to Tail predicated Loop.

Can you base this on top of D99723? Some of the code may be able to be shared, even if there will be differences.

Wed, Apr 14, 1:30 AM · Restricted Project
dmgreen added inline comments to D100376: [ARM] Prevent phi-node-elimination from generating copy above t2WhileLoopStartLR.
Wed, Apr 14, 1:02 AM · Restricted Project

Tue, Apr 13

dmgreen added a comment to D98976: [CodeGen] Use ProcResGroup information in SchedBoundary.

You folks all know more about scheduling than I do, but if you are in accord that is great. I can certainly verify that this improves the accuracy of the M7 schedule for the samples I've seen.

Tue, Apr 13, 11:25 AM · Restricted Project
dmgreen added inline comments to D100383: [LSR] Fix for pre-indexed generated constant offset.
Tue, Apr 13, 8:42 AM · Restricted Project

Mon, Apr 12

dmgreen accepted D98977: [ARM] Use ProcResGroup in Cortex-M7 scheduling model.

This looks sensible to me, if we can get the scheduler to agree.

Mon, Apr 12, 10:32 AM · Restricted Project
dmgreen committed rGdd31b2c6e546: [ARM] Add a number of intrinsics for MVE lane interleaving (authored by dmgreen).
Mon, Apr 12, 9:23 AM
dmgreen closed D97293: [ARM] Add a number of intrinsics for MVE lane interleaving.
Mon, Apr 12, 9:23 AM · Restricted Project
dmgreen accepted D100205: [TTI] NFC: Change get[Interleaved]MemoryOpCost to return InstructionCost.

This LGTM too.

Mon, Apr 12, 7:45 AM · Restricted Project
dmgreen committed rG6c0a1ed3a94f: [ARM] Add FP handling for MVE lane interleaving (authored by dmgreen).
Mon, Apr 12, 7:28 AM
dmgreen closed D97292: [ARM] Add FP handling for MVE lane interleaving.
Mon, Apr 12, 7:28 AM · Restricted Project
dmgreen added inline comments to D100304: [AArch64][NEON] Match (or (and -a b) (and (a+1) b)) => bit select.
Mon, Apr 12, 7:26 AM · Restricted Project
dmgreen committed rG58f3201a20f7: [ARM] Updates to arm-block-placement pass (authored by malharJ).
Mon, Apr 12, 6:46 AM
dmgreen closed D99649: [ARM] Updates to arm-block-placement pass.
Mon, Apr 12, 6:46 AM · Restricted Project
dmgreen added inline comments to D100121: [LV] Let selectVectorizationFactor reason directly on VectorizationFactor..
Mon, Apr 12, 6:29 AM · Restricted Project
dmgreen added inline comments to D99723: [ARM] Transforming memcpy to Tail predicated Loop.
Mon, Apr 12, 6:28 AM · Restricted Project
dmgreen accepted D100203: [TTI] NFC: Change getCmpSelInstrCost to return InstructionCost.

Thanks. LGTM

Mon, Apr 12, 3:45 AM · Restricted Project
dmgreen updated the diff for D97292: [ARM] Add FP handling for MVE lane interleaving.

Yep. "cheap" -> "beneficial to convert"

Mon, Apr 12, 3:35 AM · Restricted Project
dmgreen added reviewers for D100225: [Clang][AArch64] Coerce integer return values through an undef vector: ostannard, sdesmalen, momchil.velikov, SjoerdMeijer.
Mon, Apr 12, 1:21 AM · Restricted Project
dmgreen updated the diff for D99940: [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC.

Add some brackets to a comment, to help readability.

Mon, Apr 12, 12:53 AM · Restricted Project
dmgreen added a comment to D99940: [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC.

I don't have much context, but I'm just wondering if a similar optimization for csneg might be useful?
sub(0, csneg(X, Y, <cc>)) = csinv -X, -Y-1, <cc>
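
Both the combine in this patch and the csneg variant suggested above can be sanity-checked with a small model. The sketch below is not LLVM code: it assumes 32-bit registers and the usual conditional-select semantics (csinc gives Y+1, csinv gives ~Y, csneg gives -Y when the condition is false), and the helper names are made up for illustration.

```python
MASK = 0xFFFFFFFF  # assumption: model 32-bit registers


def neg(x):
    """Two's-complement negation, i.e. sub 0, x."""
    return (-x) & MASK


def inv(x):
    """Bitwise NOT on a 32-bit value."""
    return (~x) & MASK


def csinc(x, y, cc):
    """cc ? x : y + 1"""
    return x if cc else (y + 1) & MASK


def csinv(x, y, cc):
    """cc ? x : ~y"""
    return x if cc else inv(y)


def csneg(x, y, cc):
    """cc ? x : -y"""
    return x if cc else neg(y)


def check_identities(samples):
    """Check both rewrites for every pair of sample values and both conditions."""
    for x in samples:
        for y in samples:
            for cc in (False, True):
                # sub 0, csinc X, Y, CC -> csinv -X, Y, CC
                assert neg(csinc(x, y, cc)) == csinv(neg(x), y, cc)
                # sub(0, csneg(X, Y, cc)) = csinv -X, -Y-1, cc
                assert neg(csneg(x, y, cc)) == csinv(neg(x), (neg(y) - 1) & MASK, cc)
    return True
```

Calling check_identities([0, 1, 2, 0x7FFFFFFF, 0x80000000, MASK]) exercises the wrap-around cases. Both identities rest on -(Y+1) == ~Y in two's complement.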

Mon, Apr 12, 12:49 AM · Restricted Project

Sat, Apr 10

dmgreen accepted D100204: [TTI] NFC: Change getMaskedMemoryOpCost to return InstructionCost.

Looks good, except for the one getValue.

Sat, Apr 10, 6:57 AM · Restricted Project
dmgreen added inline comments to D100203: [TTI] NFC: Change getCmpSelInstrCost to return InstructionCost.
Sat, Apr 10, 6:56 AM · Restricted Project
dmgreen accepted D100202: [TTI] NFC: Change getMinMaxReductionCost to return InstructionCost.

Looks simple. LGTM

Sat, Apr 10, 6:55 AM · Restricted Project
dmgreen accepted D100201: [TTI] NFC: Change getArithmeticReductionCost to return InstructionCost.

LGTM

Sat, Apr 10, 6:54 AM · Restricted Project
dmgreen accepted D100200: [TTI] NFC: Change getGatherScatterOpCost to return InstructionCost.

Seems straightforward.

Sat, Apr 10, 6:53 AM · Restricted Project
dmgreen accepted D100199: [TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost.

Thanks. LGTM

Sat, Apr 10, 6:49 AM · Restricted Project
dmgreen added inline comments to D98781: [AArch64] Enable UseAA globally in the AArch64 backend.
Sat, Apr 10, 6:16 AM · Restricted Project
dmgreen updated the diff for D97293: [ARM] Add a number of intrinsics for MVE lane interleaving.

Rebase

Sat, Apr 10, 5:58 AM · Restricted Project
dmgreen updated the diff for D97292: [ARM] Add FP handling for MVE lane interleaving.

Rebase

Sat, Apr 10, 5:56 AM · Restricted Project
dmgreen added inline comments to D100245: [ARM] Expand VMOVRRD simplification pattern.
Sat, Apr 10, 4:38 AM · Restricted Project
dmgreen requested review of D100245: [ARM] Expand VMOVRRD simplification pattern.
Sat, Apr 10, 4:36 AM · Restricted Project
dmgreen requested review of D100244: [ARM] Create VMOVRRD from adjacent vector extracts.
Sat, Apr 10, 4:24 AM · Restricted Project

Fri, Apr 9

dmgreen added inline comments to D100199: [TTI] NFC: Change getCastInstrCost and getExtractWithExtendCost to return InstructionCost.
Fri, Apr 9, 9:01 AM · Restricted Project

Thu, Apr 8

dmgreen added a comment to D99723: [ARM] Transforming memcpy to Tail predicated Loop.

I'm a little worried that WLSTP is going to cause problems, with it not used anywhere else. Let's at least add an option for disabling it if needed.

Thu, Apr 8, 3:07 AM · Restricted Project
dmgreen committed rG8675ef100f8c: [LV] Logical and/or select costs (authored by dmgreen).
Thu, Apr 8, 2:40 AM
dmgreen closed D99884: [LV] Logical and/or select costs.
Thu, Apr 8, 2:40 AM · Restricted Project
dmgreen added a comment to D99884: [LV] Logical and/or select costs.

Thanks

Thu, Apr 8, 2:28 AM · Restricted Project
dmgreen committed rG1a4d3d0bca2b: [LV] Add a logical and/or select cost test. NFC (authored by dmgreen).
Thu, Apr 8, 2:27 AM

Wed, Apr 7

dmgreen added a comment to D99662: [AArch64] Add Machine InstCombiner patterns for FMUL indexed variant.

But the other thing I was just wondering, not that I mind these patterns here, but are we not expecting that the VDUP is sunk to its user? I think that's probably what I would expect, but don't know if that is a fair expectation.

Wed, Apr 7, 11:57 AM · Restricted Project
dmgreen accepted D99649: [ARM] Updates to arm-block-placement pass.

Thanks. LGTM

Wed, Apr 7, 3:22 AM · Restricted Project
dmgreen accepted D97407: [LoopUnrollAndJam] Avoid repeated instructions for UAJ analysis.

LGTM, so long as the test is cleaned up a little

Wed, Apr 7, 2:40 AM · Restricted Project
dmgreen added a comment to D79155: [CodeGen] Increase applicability of ffine-grained-bitfield-accesses for targets with limited native integer widths.

Hello. We've received reports that this is bloating codesize of some code, quite a lot in places. There is an example in https://godbolt.org/z/66TEKa1xK. Essentially the glomming together of reads/writes into i32's (in our case) helps to reduce the total number of loads/stores needed. Splitting that up into individual i8/i16's creates a lot more load/mask/load/mask/or/store sequences.

Wed, Apr 7, 1:45 AM · Restricted Project

Tue, Apr 6

dmgreen requested review of D99940: [ARM] Combine sub 0, csinc X, Y, CC -> csinv -X, Y, CC.
Tue, Apr 6, 3:42 AM · Restricted Project
dmgreen added inline comments to D99884: [LV] Logical and/or select costs.
Tue, Apr 6, 2:43 AM · Restricted Project
dmgreen accepted D99602: [test, AArch64] Fix use of var defined in CHECK-NOT.

LGTM

Tue, Apr 6, 2:20 AM · Restricted Project

Mon, Apr 5

dmgreen added inline comments to D99509: [RISCV] Add legality check for vectoring reduction.
Mon, Apr 5, 10:57 AM · Restricted Project
dmgreen requested review of D99884: [LV] Logical and/or select costs.
Mon, Apr 5, 7:22 AM · Restricted Project

Fri, Apr 2

dmgreen accepted D99699: [AArch64][SVE] Lowering sve.dot to DOT node.

Thanks. LGTM

Fri, Apr 2, 4:40 AM · Restricted Project

Thu, Apr 1

dmgreen committed rGda98177cda16: [ARM] Allow v6m runtime loop unrolling (authored by dmgreen).
Thu, Apr 1, 1:22 PM
dmgreen closed D99588: [ARM] Allow v6m runtime loop unrolling.
Thu, Apr 1, 1:22 PM · Restricted Project
dmgreen accepted D99586: [AArch64] Default to zero-cycle-zeroing FP registers..

Thanks. My tests agreed, LGTM

Thu, Apr 1, 8:54 AM · Restricted Project
dmgreen added inline comments to D99723: [ARM] Transforming memcpy to Tail predicated Loop.
Thu, Apr 1, 8:18 AM · Restricted Project
dmgreen added a comment to D99649: [ARM] Updates to arm-block-placement pass.

addressed comments,
renamed some functions,
added comments to test (and updated)
updated some incorrect code:

  • adjustBBOffsetsAfter() is now called with BBPrevious as input, since BB is moved, which would change the offsets after it.
  • the code checking for the LE to loopExit now starts its search from the MBB after loopExit. Updated the test accordingly.
Thu, Apr 1, 7:46 AM · Restricted Project