Today

dmgreen committed rG93aa445e16f7: Revert "[ARM] Extend narrow values to allow using truncating scatters" (authored by dmgreen).
Revert "[ARM] Extend narrow values to allow using truncating scatters"
Tue, Jun 15, 10:19 AM
dmgreen committed rGb9bd2936f9cf: [ARM] Extend narrow values to allow using truncating scatters (authored by dmgreen).
[ARM] Extend narrow values to allow using truncating scatters
Tue, Jun 15, 9:45 AM
dmgreen closed D103704: [ARM] Extend narrow values to allow using truncating scatters.
Tue, Jun 15, 9:45 AM · Restricted Project
dmgreen committed rG680d3f8f1785: [ARM] Use rq gather/scatters for smaller v4 vectors (authored by dmgreen).
[ARM] Use rq gather/scatters for smaller v4 vectors
Tue, Jun 15, 9:06 AM
dmgreen closed D103674: [ARM] Use rq gather/scatters for smaller v4 vectors.
Tue, Jun 15, 9:06 AM · Restricted Project
dmgreen committed rG09924cbab780: [ARM] Rejig some of the MVE gather/scatter lowering pass. NFC (authored by dmgreen).
[ARM] Rejig some of the MVE gather/scatter lowering pass. NFC
Tue, Jun 15, 7:39 AM
dmgreen added inline comments to D103903: [ARM] Transform a fixed-point to floating-point conversion into a VCVT_fix.
Tue, Jun 15, 7:32 AM · Restricted Project
dmgreen added inline comments to D104255: [InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass.
Tue, Jun 15, 1:08 AM · Restricted Project
dmgreen updated the diff for D104255: [InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass.
Tue, Jun 15, 1:08 AM · Restricted Project

Yesterday

dmgreen added a comment to D103799: [CostModel] Express cost(urem) as cost(div+mul+sub) when set to Expand..

This sounds OK to me, so long as the X86 numbers are not bonkers.

Mon, Jun 14, 2:38 PM · Restricted Project
dmgreen added inline comments to D103903: [ARM] Transform a fixed-point to floating-point conversion into a VCVT_fix.
Mon, Jun 14, 2:18 PM · Restricted Project
dmgreen added a comment to D104236: [AArch64] Add a TableGen pattern to generate uaddlv from uaddlp and addv.

Can we add the other types too? It's good to add all the varieties if we can.

Mon, Jun 14, 2:15 PM · Restricted Project
dmgreen requested review of D104255: [InterleaveAccess] Copy fast math flags when adjusting binary operators in interleave access pass.
Mon, Jun 14, 12:04 PM · Restricted Project
dmgreen added inline comments to D104247: [DAGCombine] reassoc flag shouldn't enable contract.
Mon, Jun 14, 11:43 AM · Restricted Project
dmgreen accepted D104042: [AArch64] Improve SAD pattern.

Thanks. LGTM

Mon, Jun 14, 6:33 AM · Restricted Project

Sun, Jun 13

dmgreen committed rG562593ff82f8: [DSE] Extra multiblock loop tests, NFC. (authored by dmgreen).
[DSE] Extra multiblock loop tests, NFC.
Sun, Jun 13, 2:33 PM
dmgreen committed rGbee2f618d599: [ARM] Introduce t2WhileLoopStartTP (authored by dmgreen).
[ARM] Introduce t2WhileLoopStartTP
Sun, Jun 13, 5:56 AM
dmgreen closed D103236: [ARM] Introduce t2WhileLoopStartTP.
Sun, Jun 13, 5:56 AM · Restricted Project

Fri, Jun 11

dmgreen added a comment to D104042: [AArch64] Improve SAD pattern.

It is a good point! I have tried the pattern below, following your suggestion. It seems to work. If you are OK with it, let me add the pattern below to this patch.

let AddedComplexity = 10 in { 
def : Pat<(i32 (extractelt
                 (v4i32 (AArch64uaddv (v4i32 (AArch64uaddlp (v8i16 V128:$op))))),
                 (i64 0))),
          (UADDLVv8i16v V128:$op)>;
}

Do you mind doing this as a new patch, as it does feel logically separable? If we can test them, it would be good to add the various other sizes too. And, I'm not sure about this, but maybe it doesn't need to start from the extract, and can produce an INSERT_SUBREG like some of the other patterns do (like the ones from SIMDAcrossLanesIntrinsic). That might remove the need for the added complexity, and the INSERT_SUBREG / EXTRACT_SUBREG should all get cleared up later in the pipeline.

Fri, Jun 11, 9:38 AM · Restricted Project
dmgreen accepted D103952: [CostModel][AArch64] Improve the cost estimate of CTPOP intrinsic.

Looks sensible to me, thanks

Fri, Jun 11, 1:00 AM · Restricted Project

Thu, Jun 10

dmgreen committed rG5d5b686f6bf6: [ARM] Fix Changed status in MVEGatherScatterLoweringPass. (authored by dmgreen).
[ARM] Fix Changed status in MVEGatherScatterLoweringPass.
Thu, Jun 10, 1:53 PM
dmgreen committed rGe0c605f6383c: [ARM] Ensure instructions are simplified prior to GatherScatter lowering. (authored by dmgreen).
[ARM] Ensure instructions are simplified prior to GatherScatter lowering.
Thu, Jun 10, 12:19 PM
dmgreen closed D103150: [ARM] Ensure instructions are simplified prior to GatherScatter lowering..
Thu, Jun 10, 12:18 PM · Restricted Project
dmgreen added a comment to D104042: [AArch64] Improve SAD pattern.

Nice patch. It's a big combine, but we certainly have bigger elsewhere. Sometimes it's possible to do things like this as smaller pieces, that add up to the same thing. Speaking of which, would it be possible to combine (in a separate patch, maybe as a tablegen pattern) addv (uaddlp x) -> uaddlv x ?

Thu, Jun 10, 12:16 PM · Restricted Project
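
The addv (uaddlp x) -> uaddlv x fold suggested in the D104042 comment above can be sanity-checked with a small lane-level model. This is only a sketch in Python, not LLVM code: the function names mirror the instruction mnemonics, and the semantics (pairwise widening add, wrapping add across lanes, widening add across lanes) are a simplified assumption about how the instructions behave.

```python
def uaddlp(lanes, src_bits):
    """Unsigned add-long pairwise: add adjacent lanes into lanes twice as wide."""
    mask = (1 << (2 * src_bits)) - 1
    return [(lanes[i] + lanes[i + 1]) & mask for i in range(0, len(lanes), 2)]


def addv(lanes, bits):
    """Add across vector: sum all lanes, wrapping at the lane width."""
    return sum(lanes) & ((1 << bits) - 1)


def uaddlv(lanes, src_bits):
    """Unsigned add-long across vector: sum all lanes into a result twice as wide."""
    return sum(lanes) & ((1 << (2 * src_bits)) - 1)


def fold_holds(lanes, src_bits=16):
    """Check addv(uaddlp(x)) == uaddlv(x) for one input vector."""
    return addv(uaddlp(lanes, src_bits), 2 * src_bits) == uaddlv(lanes, src_bits)
```

In this model the two sides agree because the widened sum of eight 16-bit lanes (or sixteen 8-bit lanes) cannot overflow the doubled lane width, so no wraparound distinguishes them.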
dmgreen committed rG9872551ca09b: [ARM] Skip debug during vpt block creation (authored by dmgreen).
[ARM] Skip debug during vpt block creation
Thu, Jun 10, 6:49 AM
dmgreen committed rGdb9ba830d4b3: [ARM] MVE VPT block tests with debug info. NFC (authored by dmgreen).
[ARM] MVE VPT block tests with debug info. NFC
Thu, Jun 10, 6:49 AM
dmgreen closed D103610: [ARM] Skip debug during vpt block creation.
Thu, Jun 10, 6:49 AM · Restricted Project
dmgreen accepted D102755: [AArch64] Add cost tests for bitreverse.

Thanks. LGTM

Thu, Jun 10, 2:58 AM · Restricted Project

Wed, Jun 9

dmgreen updated subscribers of D103999: [AArch64][GlobalISel] Mark some G_BITREVERSE types as legal + select them.
Wed, Jun 9, 11:32 PM · Restricted Project
dmgreen added a comment to D103882: [CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> expensive..

It seems like a simpler overall route to remove the assert from the vectorizer, and allow it to deal with invalid costs. But I am not blocking this patch.

Wed, Jun 9, 11:01 AM · Restricted Project
dmgreen updated subscribers of D102755: [AArch64] Add cost tests for bitreverse.
Wed, Jun 9, 10:27 AM · Restricted Project
dmgreen updated subscribers of D103952: [CostModel][AArch64] Improve the cost estimate of CTPOP intrinsic.
Wed, Jun 9, 10:27 AM · Restricted Project
dmgreen added a comment to D102755: [AArch64] Add cost tests for bitreverse.

Costs look good, as far as I can tell.

Wed, Jun 9, 10:04 AM · Restricted Project
dmgreen accepted D103836: [ARM][NEON] Combine base address updates for vld1Ndup intrinsics.

I got confused by the _fixed vs _register vs _UPD for a while, but I see that's pre-existing :)

Wed, Jun 9, 2:08 AM · Restricted Project
dmgreen added a comment to D103882: [CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> expensive..

Why 9999? Why not an invalid cost? I thought that was what invalid was for, so we didn't need hacky random numbers like this.

We've chosen not to conflate legalization and cost-modelling, so that the cost-model must return a valid cost when the legalization phase says that a given VF is legal to use. The benefit of that is that it guards the requirement to have a complete (albeit not necessarily accurate) cost-model. For SVE a VF of vscale x 1 is legal in principle, meaning that we should be able to code-generate it. But because our code-generator is incomplete, we want to avoid using these VFs. I haven't added an interface to query the minimum legal VF, since I know we'll remove that interface at some point. The LV will expect all costs calculated to be Valid and will fail this assertion otherwise. So basically return <magic number> is just a temporary stop-gap to avoid tripping that (and other) assertions.

Does that make more sense?

Wed, Jun 9, 12:59 AM · Restricted Project
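
The trade-off discussed in the D103882 thread above, a large magic number versus an explicit invalid cost, can be illustrated with a toy model. This is a sketch only: `Cost` and `cheapest_vf` are hypothetical names invented for illustration and are not the LLVM cost-model or LoopVectorizer API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Cost:
    """Toy cost value; None stands for an invalid cost (hypothetical type)."""
    value: Optional[int]

    @property
    def valid(self) -> bool:
        return self.value is not None


def cheapest_vf(costs: Dict[str, Cost]) -> Optional[str]:
    """Pick the VF with the lowest valid cost, skipping invalid entries."""
    valid = {vf: c.value for vf, c in costs.items() if c.valid}
    if not valid:
        return None
    return min(valid, key=valid.get)


# A magic number and an invalid cost steer the choice the same way,
# provided the consumer tolerates invalid costs instead of asserting.
with_magic = {"vscale x 1": Cost(9999), "vscale x 2": Cost(4), "vscale x 4": Cost(6)}
with_invalid = {"vscale x 1": Cost(None), "vscale x 2": Cost(4), "vscale x 4": Cost(6)}
```

Under these assumptions both tables select the same VF; the difference is that the invalid entry states its intent directly, while the magic number only works as long as no real cost exceeds it.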
dmgreen added inline comments to D103939: [SVE][LSR] Teach LSR to enable simple scaled-index addressing mode generation for SVE..
Wed, Jun 9, 12:13 AM · Restricted Project

Tue, Jun 8

dmgreen committed rG0178ae734ca3: [DSE] Add another multiblock loop DSE test. NFC (authored by dmgreen).
[DSE] Add another multiblock loop DSE test. NFC
Tue, Jun 8, 1:55 PM
dmgreen added a reverting change for rG222aeb4d51a4: [DSE] Remove stores in the same loop iteration: rG297088d1add7: Revert "[DSE] Remove stores in the same loop iteration".
Tue, Jun 8, 1:23 PM
dmgreen committed rG297088d1add7: Revert "[DSE] Remove stores in the same loop iteration" (authored by dmgreen).
Revert "[DSE] Remove stores in the same loop iteration"
Tue, Jun 8, 1:23 PM
dmgreen added a comment to D100464: [DSE] Remove stores in the same loop iteration.

Oh so it does. Thanks for the report. I thought we had a test case for that...

Tue, Jun 8, 1:07 PM · Restricted Project
dmgreen committed rGd7853bae9410: [ARM] Generate VDUP(Const) from constant buildvectors (authored by dmgreen).
[ARM] Generate VDUP(Const) from constant buildvectors
Tue, Jun 8, 12:52 PM
dmgreen closed D103808: [ARM] Generate VDUP(Const) from constant buildvectors.
Tue, Jun 8, 12:52 PM · Restricted Project
dmgreen committed rGf44770c32992: [ARM] A couple of extra VMOVimm tests, useful for showing BE codegen. NFC (authored by dmgreen).
[ARM] A couple of extra VMOVimm tests, useful for showing BE codegen. NFC
Tue, Jun 8, 11:40 AM
dmgreen added inline comments to D103704: [ARM] Extend narrow values to allow using truncating scatters.
Tue, Jun 8, 10:50 AM · Restricted Project
dmgreen added a comment to D103903: [ARM] Transform a fixed-point to floating-point conversion into a VCVT_fix.

Nice one. Sounds funky.

Tue, Jun 8, 10:37 AM · Restricted Project
dmgreen added a comment to D103882: [CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> expensive..

Why 9999? Why not an invalid cost? I thought that was what invalid was for, so we didn't need hacky random numbers like this.

Tue, Jun 8, 9:31 AM · Restricted Project
dmgreen added a comment to D103755: [DAG] Fold neg(bvsplat(neg(x)) -> bvsplat(x).

Funnily enough I was wondering about this pattern the other day as a followup to D98778...

Should we always be folding unaryop(splat(x)) -> splat(unaryop(x)) if the unaryop is legal/custom on the scalar type? And then maybe extend that to binop(splat(x),splat(y)) -> splat(binop(x,y)) as well?

Tue, Jun 8, 2:22 AM · Restricted Project
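
The folds raised in the D103755 comment above can be checked on a small lane-wise model. This is a Python sketch under simplifying assumptions (fixed 32-bit wrapping lanes, four lanes); `splat` and `lanewise` are illustrative helpers, not SelectionDAG functions.

```python
BITS = 32
MASK = (1 << BITS) - 1


def splat(x, lanes=4):
    """Broadcast a scalar into every lane, wrapped to the lane width."""
    return [x & MASK] * lanes


def lanewise(op, *vectors):
    """Apply a scalar operation lane by lane, wrapping each result."""
    return [op(*elts) & MASK for elts in zip(*vectors)]


def neg(x):
    return -x


def add(x, y):
    return x + y
```

With these helpers, `lanewise(neg, splat(neg(7)))` equals `splat(7)`, and more generally applying an operation to splats gives the same lanes as splatting the scalar result, which is what makes the scalar form attractive when the operation is legal on the scalar type.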
dmgreen updated the diff for D103755: [DAG] Fold neg(bvsplat(neg(x)) -> bvsplat(x).

Rebase over D103756

Tue, Jun 8, 2:21 AM · Restricted Project
dmgreen committed rGb889c6ee9911: [DAG] Allow isNullOrNullSplat to see truncated zeroes (authored by dmgreen).
[DAG] Allow isNullOrNullSplat to see truncated zeroes
Tue, Jun 8, 2:19 AM
dmgreen closed D103756: [DAG] Allow isNullOrNullSplat to see truncated zeroes.
Tue, Jun 8, 2:19 AM · Restricted Project
dmgreen added a comment to D103756: [DAG] Allow isNullOrNullSplat to see truncated zeroes.

Thanks Folks.

Tue, Jun 8, 2:13 AM · Restricted Project
dmgreen updated the diff for D103808: [ARM] Generate VDUP(Const) from constant buildvectors.

Added two new test cases: mov_int8_1234, which, as you suggested, uses i8 <1,2,3,4,1,2,3,4,..>, and mov_int32_16908546, which is 0x1020102 VDUP'd as an i16.

Tue, Jun 8, 12:42 AM · Restricted Project
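
The two constants named in the D103808 update above can be examined at the byte level to see which element size would splat to the same vector. This is a Python sketch assuming a 128-bit little-endian vector; `splat_bytes` and `narrowest_splat` are hypothetical helpers, not part of the ARM backend.

```python
def splat_bytes(value, elt_bits, vector_bits=128):
    """Little-endian byte image of `value` repeated across a vector."""
    lanes = vector_bits // elt_bits
    return value.to_bytes(elt_bits // 8, "little") * lanes


def narrowest_splat(image):
    """Return (elt_bits, value) for the narrowest element size that splats to `image`."""
    for elt_bits in (8, 16, 32, 64):
        step = elt_bits // 8
        chunk = image[:step]
        if chunk * (len(image) // step) == image:
            return elt_bits, int.from_bytes(chunk, "little")
    return None
```

Under these assumptions an i32 splat of 16908546 (0x01020102) has the same byte image as an i16 splat of 0x0102, and the i8 pattern <1,2,3,4,...> has the same image as an i32 splat of 0x04030201.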
dmgreen added inline comments to D102755: [AArch64] Add cost tests for bitreverse.
Tue, Jun 8, 12:34 AM · Restricted Project

Mon, Jun 7

dmgreen requested review of D103808: [ARM] Generate VDUP(Const) from constant buildvectors.
Mon, Jun 7, 6:20 AM · Restricted Project
dmgreen added inline comments to D103674: [ARM] Use rq gather/scatters for smaller v4 vectors.
Mon, Jun 7, 12:12 AM · Restricted Project
dmgreen updated the diff for D103674: [ARM] Use rq gather/scatters for smaller v4 vectors.

Update comment

Mon, Jun 7, 12:12 AM · Restricted Project

Sun, Jun 6

dmgreen updated the diff for D103756: [DAG] Allow isNullOrNullSplat to see truncated zeroes.
Sun, Jun 6, 11:56 PM · Restricted Project
dmgreen added a comment to D103756: [DAG] Allow isNullOrNullSplat to see truncated zeroes.

Whilst I'm here also add a call to peekThroughBitcasts as bitcast Zero is still Zero, and removed a related TODO comment from isOneOrOneSplat as the bitcast of 1 isn't always still 1.

Is this necessary for any of the test changes? I just wonder whether we should add this separately.

Sun, Jun 6, 11:55 PM · Restricted Project
dmgreen committed rGc85766f79b2e: [ARM] MVE tests for vmull from a splat. NFC (authored by dmgreen).
[ARM] MVE tests for vmull from a splat. NFC
Sun, Jun 6, 2:30 PM
dmgreen committed rG8f8273c54db9: [AArch64] Extra tests for vector shift. NFC (authored by dmgreen).
[AArch64] Extra tests for vector shift. NFC
Sun, Jun 6, 2:30 PM

Sat, Jun 5

dmgreen committed rG12f53e5392d6: [AArch64] Remove AArch64ISD::NEG (authored by dmgreen).
[AArch64] Remove AArch64ISD::NEG
Sat, Jun 5, 11:55 AM
dmgreen closed D103703: [AArch64] Remove AArch64ISD::NEG.
Sat, Jun 5, 11:55 AM · Restricted Project
dmgreen requested review of D103756: [DAG] Allow isNullOrNullSplat to see truncated zeroes.
Sat, Jun 5, 11:51 AM · Restricted Project
dmgreen requested review of D103755: [DAG] Fold neg(bvsplat(neg(x)) -> bvsplat(x).
Sat, Jun 5, 11:49 AM · Restricted Project

Fri, Jun 4

dmgreen requested review of D103704: [ARM] Extend narrow values to allow using truncating scatters.
Fri, Jun 4, 9:53 AM · Restricted Project
dmgreen requested review of D103703: [AArch64] Remove AArch64ISD::NEG.
Fri, Jun 4, 9:19 AM · Restricted Project
dmgreen accepted D103604: [AArch64] Further enable UnrollAndJam.

Thanks. LGTM

Fri, Jun 4, 1:21 AM · Restricted Project
dmgreen requested review of D103674: [ARM] Use rq gather/scatters for smaller v4 vectors.
Fri, Jun 4, 1:04 AM · Restricted Project

Thu, Jun 3

dmgreen added a reviewer for D103629: [AArch64] Cost-model i8 vector loads/stores: asavonic.
Thu, Jun 3, 10:18 PM · Restricted Project
dmgreen added a comment to D103604: [AArch64] Further enable UnrollAndJam.

UnJ isn't enabled by default upstream yet in the pass manager, but it should be OK to set the option in the target.

Thu, Jun 3, 8:53 AM · Restricted Project
dmgreen requested review of D103610: [ARM] Skip debug during vpt block creation.
Thu, Jun 3, 5:18 AM · Restricted Project
dmgreen committed rG929c54379a48: [ARM] Prettify gather/scatter debug comments. NFC (authored by dmgreen).
[ARM] Prettify gather/scatter debug comments. NFC
Thu, Jun 3, 4:36 AM
dmgreen updated the diff for D103236: [ARM] Introduce t2WhileLoopStartTP.

Sounds good. Now with a getWhileLoopStartTargetBB used in more places.

Thu, Jun 3, 4:14 AM · Restricted Project
dmgreen added inline comments to D103180: [InstSimplify] Add constant fold for extractelement + splat for scalable vectors.
Thu, Jun 3, 1:58 AM · Restricted Project

Wed, Jun 2

dmgreen accepted D103105: [AArch64] Optimise bitreverse lowering in ISel.

Thanks. LGTM.

Wed, Jun 2, 12:22 AM · Restricted Project

Tue, Jun 1

dmgreen added inline comments to D103105: [AArch64] Optimise bitreverse lowering in ISel.
Tue, Jun 1, 12:42 AM · Restricted Project

Mon, May 31

dmgreen closed D100464: [DSE] Remove stores in the same loop iteration.

I did some rewording, but I'm not sure it's really better than it was before. Feel free to update further if you see fit.

Mon, May 31, 2:26 AM · Restricted Project
dmgreen committed rG222aeb4d51a4: [DSE] Remove stores in the same loop iteration (authored by dmgreen).
[DSE] Remove stores in the same loop iteration
Mon, May 31, 2:23 AM

Sun, May 30

dmgreen committed rG2176be556b44: [ARM] Guard against loop variant gather ptr operands (authored by dmgreen).
[ARM] Guard against loop variant gather ptr operands
Sun, May 30, 10:02 AM

Sat, May 29

dmgreen committed rG65831422a98f: [ARM] Guard against WhileLoopStart kill flags (authored by dmgreen).
[ARM] Guard against WhileLoopStart kill flags
Sat, May 29, 1:04 PM

Thu, May 27

dmgreen requested review of D103263: [AArch64] Add S/UQXTRN tablegen patterns..
Thu, May 27, 9:12 AM · Restricted Project
dmgreen committed rG1d5b976b7783: [ARM] Extra test for reverted WLS memset. NFC (authored by dmgreen).
[ARM] Extra test for reverted WLS memset. NFC
Thu, May 27, 4:20 AM
dmgreen requested review of D103236: [ARM] Introduce t2WhileLoopStartTP.
Thu, May 27, 3:43 AM · Restricted Project

Wed, May 26

dmgreen committed rGa409fcddaed9: [ARM] Extra test for reverted WLS memset. NFC (authored by dmgreen).
[ARM] Extra test for reverted WLS memset. NFC
Wed, May 26, 6:55 AM
dmgreen requested review of D103150: [ARM] Ensure instructions are simplified prior to GatherScatter lowering..
Wed, May 26, 3:32 AM · Restricted Project
dmgreen committed rG2cf0e52b8548: [ARM] Add patterns for vmulh (authored by dmgreen).
[ARM] Add patterns for vmulh
Wed, May 26, 1:22 AM
dmgreen closed D88011: [ARM] Add patterns for vmulh.
Wed, May 26, 1:22 AM · Restricted Project

Tue, May 25

dmgreen updated the diff for D100464: [DSE] Remove stores in the same loop iteration.

Changed IsGuaranteedLoopInvariant to isGuaranteedLoopInvariant (if this wasn't what you meant, let me know, and do you have suggestions for a better name?)
Reworded comments.
Fixed Later->Earlier memref, which then needed a rejig of one of the tests to move the initial memloc out of the entry block.

Tue, May 25, 2:46 PM · Restricted Project
dmgreen committed rG8cc437a8a16e: [ARM] Extra predicated tests for VMULH. NFC (authored by dmgreen).
[ARM] Extra predicated tests for VMULH. NFC
Tue, May 25, 2:24 PM
dmgreen accepted D102938: [AArch64] Generate LD1 for anyext i8 or i16 vector load.

Thanks. LGTM.

Tue, May 25, 2:20 PM · Restricted Project
dmgreen added a comment to D103105: [AArch64] Optimise bitreverse lowering in ISel.

Can we add v2i32 and maybe v1i64 handling too?

Tue, May 25, 2:19 PM · Restricted Project

Mon, May 24

dmgreen added a comment to D102904: [LoopNest][LoopFlatten] Change LoopFlattenPass to LoopNest pass.

OK. So because we only delete the inner loop, not the current one, the loop nest pass remains valid? That sounds OK then.

Mon, May 24, 8:39 AM · Restricted Project
dmgreen committed rG543406a69b33: [ARM] Allow findLoopPreheader to return headers with multiple loop successors (authored by dmgreen).
[ARM] Allow findLoopPreheader to return headers with multiple loop successors
Mon, May 24, 4:22 AM
dmgreen closed D102747: [ARM] Allow findLoopPreheader to return headers with multiple loop successors.
Mon, May 24, 4:22 AM · Restricted Project
dmgreen accepted D102855: [ARM][NEON] Combine base address updates for vld1x intrinsics.

Looks great. I was wondering on D102256 if you were going to do loads too, but didn't want to presume.

Mon, May 24, 3:54 AM · Restricted Project
dmgreen added a comment to D102938: [AArch64] Generate LD1 for anyext i8 or i16 vector load.

Sounds good. Can we make sure there is test coverage for big endian too?

Mon, May 24, 3:31 AM · Restricted Project
dmgreen committed rG53c42f7700e8: [ARM] Ensure WLS preheader blocks have branches during memcpy lowering (authored by dmgreen).
[ARM] Ensure WLS preheader blocks have branches during memcpy lowering
Mon, May 24, 3:27 AM
dmgreen committed rG6cc78b9245bc: [ARM] Fix inline memcpy trip count sequence (authored by dmgreen).
[ARM] Fix inline memcpy trip count sequence
Mon, May 24, 3:02 AM
dmgreen closed D102629: [ARM] Fix inline memcpy trip count sequence.
Mon, May 24, 3:02 AM · Restricted Project
dmgreen added a comment to D102747: [ARM] Allow findLoopPreheader to return headers with multiple loop successors.

Thanks

Mon, May 24, 2:39 AM · Restricted Project