dmgreen (Dave Green)
User

Projects

User does not belong to any projects.

User Details

User Since
May 24 2016, 8:35 AM (355 w, 6 d)

Recent Activity

Today

dmgreen committed rGcd22e7c3ad98: [AArch64] Regenerate neon-vcmla.ll tests and add tests for combining fadd with… (authored by dmgreen).
Mon, Mar 20, 9:30 AM · Restricted Project, Restricted Project
dmgreen updated the diff for D146407: [AArch64] Combine fadd into fcmla.

We possibly do have a test like that in another file, but I've made sure we have a specific one in reassoc_nonfast_f32x4 at the end of the newly added tests.

Mon, Mar 20, 3:44 AM · Restricted Project, Restricted Project
dmgreen requested review of D146409: [ComplexDeinterleaving] Propagate fast math flags to symmetric operations..
Mon, Mar 20, 1:38 AM · Restricted Project, Restricted Project
dmgreen added a comment to D146404: Improve min/max vector reductions on arm.

Sounds like a nice idea. This may want to custom lower 128bit vectors too, to turn them into vpmin.u8 lowhalf, highhalf as a first step.

Mon, Mar 20, 1:32 AM · Restricted Project, Restricted Project
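
The "vpmin.u8 lowhalf, highhalf as a first step" idea from the D146404 comment can be pictured with a scalar model. This is only a sketch: `pairwiseMin` and `minReduce16` are hypothetical names, plain arrays stand in for vector registers, and the real lowering in the patch may look quite different.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Models one 64-bit vector holding eight u8 lanes.
using V8 = std::array<std::uint8_t, 8>;

// Scalar model of a pairwise minimum: the low four result lanes come from
// adjacent pairs of `a`, the high four from adjacent pairs of `b`.
V8 pairwiseMin(const V8 &a, const V8 &b) {
  V8 r{};
  for (int i = 0; i < 4; ++i) {
    r[i] = std::min(a[2 * i], a[2 * i + 1]);
    r[4 + i] = std::min(b[2 * i], b[2 * i + 1]);
  }
  return r;
}

// Reduce 16 lanes. The first step combines the low and high halves
// (16 -> 8 candidates); three more pairwise steps on the result itself
// narrow that to 4, 2 and finally 1 candidate in lane 0.
std::uint8_t minReduce16(const std::array<std::uint8_t, 16> &v) {
  V8 lo{}, hi{};
  std::copy(v.begin(), v.begin() + 8, lo.begin());
  std::copy(v.begin() + 8, v.end(), hi.begin());
  V8 r = pairwiseMin(lo, hi);
  for (int step = 0; step < 3; ++step)
    r = pairwiseMin(r, r);
  return r[0];
}
```

Calling `minReduce16` on any 16-element array returns its smallest element; the point of the model is that the first pairwise step already halves the problem to a single 64-bit vector.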
dmgreen requested review of D146407: [AArch64] Combine fadd into fcmla.
Mon, Mar 20, 1:23 AM · Restricted Project, Restricted Project

Yesterday

dmgreen added a reviewer for D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask.: fhahn.
Sun, Mar 19, 11:47 AM · Restricted Project, Restricted Project
dmgreen added a comment to D146212: [AArch64] Use NEON's tbl1 for 16xi8 and 8xi8 build vector with mask..

Hi - I like the idea of generating tbl instructions from non-constant masks. Does the patch need an update to be based against main?

Sun, Mar 19, 11:46 AM · Restricted Project, Restricted Project

Fri, Mar 17

dmgreen added a comment to D146033: [AArch64][TTI] Cost model FADD/FSUB/FNEG.

I think this makes a lot of sense. Especially if we are treating many shuffles with a cost of 1, floating point operations should not be twice the cost. We could consider doing the same for fmul from looking at software optimization guides, but the changes for fadd/fsub already have a high likelihood of causing some large changes. Adding fneg is worth it though as that should be a simple operation.

Fri, Mar 17, 6:18 AM · Restricted Project, Restricted Project
dmgreen added a comment to D146128: [SVE][LoopVectorize] Add option to disable tail-folding for reverse loops.

Hello - yeah I was thinking of 2, as isConsecutivePtr is fairly simple. The Arm backend uses getPtrStride and is more careful about the strides it allows. LD2/LD4 are ruled out, but other strides are allowed to use the gather/scatters for example. It will miss the SymbolicStrides and "Assume" checks though.

Fri, Mar 17, 2:32 AM · Restricted Project, Restricted Project

Thu, Mar 16

dmgreen added a comment to D146128: [SVE][LoopVectorize] Add option to disable tail-folding for reverse loops.

LoopVectorizationLegality::containsDecreasingPointers seems to loop over all the instructions and call isConsecutivePtr, which just calls getPtrStride. Could that logic just be placed in AArch64TTIImpl::preferPredicateOverEpilogue? That is how it has worked in ARMTTIImpl::preferPredicateOverEpilogue via canTailPredicateLoop. Otherwise the code in containsDecreasingPointers is run for any architecture, but only used by AArch64.

Thu, Mar 16, 3:01 AM · Restricted Project, Restricted Project
dmgreen added a comment to D146199: [LoopVectorize] Don't tail-fold for scalable VFs when there is no scalar tail.

I had written a very similar patch recently, but it would only use the fixed length if the scalable was unknown. The performance of it was pretty bad though, so I ended up dropping it. I had noticed that there is an xfail in llvm/test/Transforms/LoopVectorize/AArch64/eliminate-tail-predication.ll at the moment. Can it now be replaced with a check for store <vscale x 4 x i32>?

Thu, Mar 16, 2:55 AM · Restricted Project, Restricted Project

Wed, Mar 15

dmgreen accepted D146055: [AArch64] Change GeneratePerfectShuffle to return one destination operand for zip and transpose operations..

Thanks. Sometimes you wonder how things ever worked. LGTM.

Wed, Mar 15, 10:18 AM · Restricted Project, Restricted Project

Tue, Mar 14

dmgreen committed rG180865a50085: [AArch64] Add FP16 broadcast and transpose costs (authored by dmgreen).
Tue, Mar 14, 2:25 PM · Restricted Project, Restricted Project
dmgreen closed D146035: [AArch64] Add FP16 broadcast and transpose costs.
Tue, Mar 14, 2:25 PM · Restricted Project, Restricted Project
dmgreen added a comment to D146055: [AArch64] Change GeneratePerfectShuffle to return one destination operand for zip and transpose operations..

Does this fix https://github.com/llvm/llvm-project/issues/61203?

Tue, Mar 14, 2:21 PM · Restricted Project, Restricted Project
dmgreen requested review of D146035: [AArch64] Add FP16 broadcast and transpose costs.
Tue, Mar 14, 5:05 AM · Restricted Project, Restricted Project
dmgreen added a comment to D145939: [DAG] Fold multiple insert_vector_elt of zero values into an AND mask.

It looks like this comes up from odd numbered vector elements quite a bit. The AArch64 changes all look OK.

Tue, Mar 14, 1:26 AM · Restricted Project, Restricted Project
dmgreen accepted D144789: [DAG] Match select(icmp(x,y),sub(x,y),sub(y,x)) -> abd(x,y) patterns.

I checked the AArch64 tests and they all seem to be valid. LGTM, thanks.

Tue, Mar 14, 1:23 AM · Restricted Project, Restricted Project
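
As a rough illustration of the pattern named in the D144789 title, the select-of-two-subtractions form can be compared against an absolute-difference reference. This sketch only models the unsigned 8-bit case; `selectPattern`, `abdReference` and `patternMatchesAbd` are hypothetical helpers, and the exact predicates and types the patch handles are not stated here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// select(icmp ugt x, y), sub(x, y), sub(y, x): pick the subtraction that
// does not wrap.
std::uint8_t selectPattern(std::uint8_t x, std::uint8_t y) {
  return x > y ? static_cast<std::uint8_t>(x - y)
               : static_cast<std::uint8_t>(y - x);
}

// Reference absolute difference, computed in a wider type.
std::uint8_t abdReference(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>(std::abs(int(x) - int(y)));
}

// Exhaustively compare the two forms over all 8-bit input pairs.
bool patternMatchesAbd() {
  for (int x = 0; x < 256; ++x)
    for (int y = 0; y < 256; ++y)
      if (selectPattern(static_cast<std::uint8_t>(x),
                        static_cast<std::uint8_t>(y)) !=
          abdReference(static_cast<std::uint8_t>(x),
                       static_cast<std::uint8_t>(y)))
        return false;
  return true;
}
```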

Mon, Mar 13

dmgreen committed rG98481bc723c8: [LV][VPlan] Fix printing TripCount liveins. NFC (authored by dmgreen).
Mon, Mar 13, 12:44 PM · Restricted Project, Restricted Project
dmgreen closed D145507: [LV][VPlan] Fix printing TripCount liveins. NFC.
Mon, Mar 13, 12:44 PM · Restricted Project, Restricted Project
dmgreen accepted D145578: [AArch64] Cost-model vector splat LD1Rs to avoid unprofitable SLP vectorisation.

Brilliant, thanks. LGTM.

Mon, Mar 13, 7:02 AM · Restricted Project, Restricted Project
dmgreen abandoned D89323: [LV] Costing for VPInstructions.
Mon, Mar 13, 5:36 AM · Restricted Project, Restricted Project
dmgreen abandoned D89322: [LV] Initial VPlan cost modelling.

This is too old to be useful now and I don't have any plans to work on it in the near term. (It would be good to see improvements though, where the VPlan is costed more directly as opposed to continuing to go through the IR instructions.)

Mon, Mar 13, 5:35 AM · Restricted Project, Restricted Project
dmgreen added a comment to D145578: [AArch64] Cost-model vector splat LD1Rs to avoid unprofitable SLP vectorisation.

The CostKind can be TCK_RecipThroughput (the default and the one we usually care most about), TCK_Latency, TCK_CodeSize or TCK_SizeAndLatency. I think if we have the code we might as well get TCK_CodeSize correct and return 0 in that case, so the load+dup have a combined cost of 1. TCK_Latency and TCK_SizeAndLatency I'm less sure about, perhaps leave them with the same costs as TCK_RecipThroughput?

Mon, Mar 13, 5:32 AM · Restricted Project, Restricted Project
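
The cost-kind suggestion in the D145578 comment above could be sketched as follows. This is a standalone model, not LLVM code: the `CostKind` enum merely mirrors the four kinds named in the comment, `splatLoadCost` and `combinedCost` are hypothetical names, the dup cost of 1 is an assumption, and the non-code-size value simply follows the "leave them the same as throughput" idea.

```cpp
#include <cassert>

// Local stand-in for the four cost kinds mentioned in the comment.
enum class CostKind { RecipThroughput, Latency, CodeSize, SizeAndLatency };

// Assumed cost of the dup that the load is later folded into.
constexpr int kDupCost = 1;

// Cost charged to the load that feeds the splat. For code size the load is
// free, because load+dup becomes a single instruction. The other kinds are
// treated like throughput here, which is an assumption.
constexpr int splatLoadCost(CostKind kind) {
  return kind == CostKind::CodeSize ? 0 : 1;
}

// Combined cost of the load and the dup.
constexpr int combinedCost(CostKind kind) {
  return splatLoadCost(kind) + kDupCost;
}

static_assert(combinedCost(CostKind::CodeSize) == 1,
              "load+dup should count as one instruction for code size");
```

With these assumptions the throughput-style kinds see a combined cost of 2, which lines up with the "effective cost of 2" mentioned in the earlier comment on the same review.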
dmgreen requested review of D145927: [LV] Fix the combination of predicated epilogs and DataAndControlFlow.
Mon, Mar 13, 5:24 AM · Restricted Project, Restricted Project
dmgreen requested review of D145925: [LV] Add a UsePredicatedEpilogue epilog vectorization scheme option.
Mon, Mar 13, 4:58 AM · Restricted Project, Restricted Project
dmgreen added a comment to D142875: [LV] Predicated epilog vectorization.

Sorry for the delay - I've had less time than I would like to get back to this. I have updated and rebased the patch. There is still one large MVE issue I need to work through, and the combo of epilog vectorization + DataAndControlFlow is currently not working correctly. I will split that off into another patch though as it is a bit of a more involved change. Plus there is another patch for letting this be controlled by the target or an option.

Mon, Mar 13, 3:25 AM · Restricted Project, Restricted Project
dmgreen updated the diff for D142875: [LV] Predicated epilog vectorization.
Mon, Mar 13, 3:24 AM · Restricted Project, Restricted Project

Fri, Mar 10

dmgreen committed rGfb8839110c04: [AArch64] Remove fixed FIXMEs from D120706. NFC (authored by dmgreen).
Fri, Mar 10, 6:28 AM · Restricted Project, Restricted Project
dmgreen added a comment to D145614: [AARCH64] Enable STORE of v4i8 to help more vectorization opportunities.

Hello. This looks like a nice idea. We have done some work to make v4i8 better in the recent past. I hadn't realized that the SLP vectorizer wasn't making use of that for stores.

Fri, Mar 10, 4:47 AM · Restricted Project, Restricted Project
dmgreen accepted D142482: [Codegen] Support symmetric operations on complex numbers.

If you can move the call to identifySymmetricOperation into this patch, then it LGTM. Cheers

Fri, Mar 10, 4:35 AM · Restricted Project, Restricted Project
dmgreen accepted D143177: Cleanup of Complex Deinterleaving pass (NFCI).

If you can move the identifySymmetricOperation call to the other patch then what remains LGTM. Thanks.

Fri, Mar 10, 4:35 AM · Restricted Project, Restricted Project
dmgreen added a comment to D143177: Cleanup of Complex Deinterleaving pass (NFCI).

This LGTM but please make sure the correct parts are in the correct patches. I don't think this will actually build on its own.

Fri, Mar 10, 4:32 AM · Restricted Project, Restricted Project
dmgreen accepted D145370: [AArch64] Fix N2 SchedModel for arithmetic and logic ops with cheap LSL.

LGTM. Thanks

Fri, Mar 10, 4:07 AM · Restricted Project, Restricted Project
dmgreen accepted D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..

Yeah nothing else from me. LGTM, thanks for the changes.

Fri, Mar 10, 3:14 AM · Restricted Project, Restricted Project
dmgreen added a reviewer for D145578: [AArch64] Cost-model vector splat LD1Rs to avoid unprofitable SLP vectorisation: vporpo.

Hello. I had to remind myself where this came from. It looks like it was introduced in D123638, and there were some comments already about the performance not always being ideal. It apparently helped for some <2 x double> vectorization. I'm not sure if there is a perfect answer, but an effective cost of 2 for the throughput of the ld1r would seem to match the hardware better. This doesn't alter isLegalBroadcastLoad and the tests added in D123638 don't seem to change.

Fri, Mar 10, 2:41 AM · Restricted Project, Restricted Project

Wed, Mar 8

dmgreen added a comment to D144086: [AArch64] Load into zero vector patterns.

There is hopefully a fix in 1c6ea961938488997712763762079e535b8b704. Please let me know if that does or doesn't fix your issue, and if you have details on getting assembly from mlir. Thanks

Wed, Mar 8, 5:09 AM · Restricted Project, Restricted Project
dmgreen committed rG1c6ea9619384: [AArch64] Fix load-insert-zero patterns with i8 and negative offsets. (authored by dmgreen).
Wed, Mar 8, 4:48 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144086: [AArch64] Load into zero vector patterns.

Hi - thanks for the report. It sounds like the offset might be wrong, from a look at the assembly. These instructions specifically:

10534: 40 f4 7f 3d  	ldr	b0, [x2, #4093]
vs
10534: 46 0c 00 d1  	sub	x6, x2, #3
1053c: c0 00 40 0d  	ld1	{ v0.b }[0], [x6]
Wed, Mar 8, 4:35 AM · Restricted Project, Restricted Project
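
One possible reading of the two listings above, which the comment itself does not spell out: the second sequence addresses x2 - 3, while the first uses #4093, which is what -3 becomes if it is squeezed into a 12-bit unsigned immediate field. The arithmetic can be checked with a small sketch (`truncateToImm12` is a hypothetical helper, not something from the patch):

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 12 bits of an offset, as an unsigned 12-bit immediate
// field would if a negative value were encoded into it unchecked.
constexpr std::uint32_t truncateToImm12(std::int32_t offset) {
  return static_cast<std::uint32_t>(offset) & 0xFFFu;
}

// -3 wraps to 4096 - 3 = 4093, matching the #4093 seen in the first listing.
static_assert(truncateToImm12(-3) == 4093, "-3 wraps to 4093 in 12 bits");
```

That would be consistent with the title of the follow-up fix, which mentions i8 and negative offsets.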

Tue, Mar 7

dmgreen committed rG9aa39481d9eb: [AArch64] Prefer to fold dup into fmul/fma as opposed to ld1r (authored by dmgreen).
Tue, Mar 7, 1:24 PM · Restricted Project, Restricted Project
dmgreen closed D145184: [AArch64] Prefer to fold dup into fmul/fma as opposed to ld1r.
Tue, Mar 7, 1:24 PM · Restricted Project, Restricted Project
dmgreen requested review of D145507: [LV][VPlan] Fix printing TripCount liveins. NFC.
Tue, Mar 7, 7:32 AM · Restricted Project, Restricted Project
dmgreen committed rG5a45d21a0866: [AArch64] Tests for dup in load vs mul. NFC (authored by dmgreen).
Tue, Mar 7, 5:21 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..
Tue, Mar 7, 5:10 AM · Restricted Project, Restricted Project
dmgreen accepted D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..

I think it's worth adding tests for both the ashr and lshr versions, but otherwise I think this LGTM. Thanks

Tue, Mar 7, 2:59 AM · Restricted Project, Restricted Project

Mon, Mar 6

dmgreen committed rGa10ac6554db4: [AArch64] Extend load insert into zero patterns to SVE. (authored by dmgreen).
Mon, Mar 6, 3:26 PM · Restricted Project, Restricted Project
dmgreen added a comment to D145370: [AArch64] Fix N2 SchedModel for arithmetic and logic ops with cheap LSL.

The add/sub sounds like a nice change.

Mon, Mar 6, 6:18 AM · Restricted Project, Restricted Project
dmgreen added a comment to D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..

Thanks. Here are some Alive proofs for the transform: https://alive2.llvm.org/ce/z/N6hwQY and https://alive2.llvm.org/ce/z/u_GjYJ.

Mon, Mar 6, 12:10 AM · Restricted Project, Restricted Project

Sun, Mar 5

dmgreen committed rGad002398c9ca: [AArch64] Add missing bf16 SVE extract vector patterns (authored by dmgreen).
Sun, Mar 5, 8:52 AM · Restricted Project, Restricted Project
dmgreen committed rG2c958a5aa9ff: [AArch64] Add missing bf16 SVE insert vector patterns. (authored by dmgreen).
Sun, Mar 5, 8:25 AM · Restricted Project, Restricted Project

Fri, Mar 3

dmgreen committed rGae12e57a777f: [AArch64] Add missing bf16 load insert pattern (authored by dmgreen).
Fri, Mar 3, 2:01 PM · Restricted Project, Restricted Project

Thu, Mar 2

dmgreen accepted D143713: [ARM] Fix Chain/Glue Bug in PerformVMOVhrCombine.

Thanks. LGTM

Thu, Mar 2, 2:05 PM · Restricted Project, Restricted Project
dmgreen accepted D143712: [ARM] Pre-Commit Tests for PR60510.

It's a bit odd to read code with the filter-out push/pop, but LGTM.

Thu, Mar 2, 2:04 PM · Restricted Project, Restricted Project
dmgreen accepted D145185: [AArch64] Fix crash in LowerBUILD_VECTOR trying to create invalid EXTRACT_SUBVECTOR..

Sounds good to me.

Thu, Mar 2, 2:02 PM · Restricted Project, Restricted Project
dmgreen requested review of D145184: [AArch64] Prefer to fold dup into fmul/fma as opposed to ld1r.
Thu, Mar 2, 1:46 PM · Restricted Project, Restricted Project
dmgreen added a comment to D145131: [Arm][AArch64] Setting IsX18ReservedByDefault() to true for Unknown OSes.

I don't think this is the right approach to take. As far as I understand this will affect any aarch64-none-eabi targets that are compiling for bare metal (as the tests show), causing a performance hit to any bare metal application. They shouldn't have to pay the price if they don't actually use x18 for anything, which many will not.

Thu, Mar 2, 3:09 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..
Thu, Mar 2, 12:21 AM · Restricted Project, Restricted Project

Wed, Mar 1

dmgreen added reviewers for D145113: [SelectionDAG][AArch64] Constant fold in SelectionDAG::getVScale if VScaleMin==VScaleMax.: sdesmalen, david-arm.
Wed, Mar 1, 11:50 PM · Restricted Project, Restricted Project
dmgreen added a comment to rG201b7858f695: [AArch64] Disable aarch64-enable-gep-opt.

Hey @dmgreen, this is Hanhan working on the IREE team (which is an MLIR-based compiler project). We've noticed that this commit causes regressions in our tracking models. If there was a way to control this flag through the C++ API, that would help us configure the options more easily during setup of the target machine. Could you help expose the option or give us some advice on how we can control the flag through some C++ API? Thank you!

Wed, Mar 1, 10:22 AM · Restricted Project, Restricted Project
dmgreen accepted D145067: [AArch64] Precommit some more LD1R splat tests for scalar int/fp loads.

Additional tests sound OK. We don't always match splats to a single cost like we should.

Wed, Mar 1, 9:44 AM · Restricted Project, Restricted Project
dmgreen committed rG337215ddf93f: [DAG] ABD is not reassociative (authored by dmgreen).
Wed, Mar 1, 8:22 AM · Restricted Project, Restricted Project
dmgreen closed D145064: [DAG] ABD is not reassociative.
Wed, Mar 1, 8:22 AM · Restricted Project, Restricted Project
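
A small counterexample shows why absolute difference cannot be regrouped freely, which is the claim in the D145064 title. The `abd` helper here is a hypothetical scalar stand-in for the DAG node, not code from the patch.

```cpp
#include <cassert>

// Scalar absolute difference of two unsigned values.
constexpr unsigned abd(unsigned a, unsigned b) {
  return a > b ? a - b : b - a;
}

// Grouping changes the result: abd(abd(1, 2), 4) = abd(1, 4) = 3,
// but abd(1, abd(2, 4)) = abd(1, 2) = 1.
static_assert(abd(abd(1, 2), 4) == 3, "left grouping");
static_assert(abd(1, abd(2, 4)) == 1, "right grouping");
```

The operation is still commutative, since `abd(a, b)` equals `abd(b, a)`; it is only the regrouping of three operands that breaks.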
dmgreen added a comment to D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..

Hi - I was just looking at the patch whilst you updated it! Please ignore any comments that don't apply any more.

Wed, Mar 1, 7:53 AM · Restricted Project, Restricted Project
dmgreen committed rG83bbd3fdbd75: [AArch64] Load into zero vector patterns (authored by dmgreen).
Wed, Mar 1, 5:54 AM · Restricted Project, Restricted Project
dmgreen closed D144086: [AArch64] Load into zero vector patterns.
Wed, Mar 1, 5:54 AM · Restricted Project, Restricted Project
dmgreen requested review of D145064: [DAG] ABD is not reassociative.
Wed, Mar 1, 5:50 AM · Restricted Project, Restricted Project
dmgreen accepted D145004: [AArch64] More patterns to generate LD1R vector splats.

Sounds good. I'm a little surprised that if we have a vector load where only one lane is demanded that we don't change it into a scalar load.

Wed, Mar 1, 1:53 AM · Restricted Project, Restricted Project
dmgreen committed rG18af85302200: [AArch64] Remove 64bit->128bit vector insert lowering (authored by dmgreen).
Wed, Mar 1, 1:40 AM · Restricted Project, Restricted Project
dmgreen closed D144550: [AArch64] Remove 64bit->128bit vector insert lowering.
Wed, Mar 1, 1:40 AM · Restricted Project, Restricted Project

Mon, Feb 27

dmgreen committed rG06daa515b270: [AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to… (authored by dmgreen).
Mon, Feb 27, 11:20 AM · Restricted Project, Restricted Project
dmgreen closed D144850: [AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts.
Mon, Feb 27, 11:20 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144850: [AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts.

Cheers

Mon, Feb 27, 11:20 AM · Restricted Project, Restricted Project
dmgreen committed rG9e5bfa1ae30b: [AArch64] Add some tests for multiple uses of extended vector extracts. NFC (authored by dmgreen).
Mon, Feb 27, 6:35 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144550: [AArch64] Remove 64bit->128bit vector insert lowering.

The idea makes sense I think, but just to put things into context, do you already have a case or patch where we can see the benefit of this?

Mon, Feb 27, 4:42 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144733: [ARM] Add Thumb Attributes for thumb thunks created in SLSHarding.

Looks good but where are the tests?

Mon, Feb 27, 1:29 AM · Restricted Project, Restricted Project
dmgreen requested review of D144850: [AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts.
Mon, Feb 27, 12:52 AM · Restricted Project, Restricted Project

Sun, Feb 26

dmgreen added reviewers for D144791: Reorder stack up-adjustment and register copies: aemerson, paquette, efriedma.
Sun, Feb 26, 10:52 AM · Restricted Project, Restricted Project

Fri, Feb 24

dmgreen requested review of D144733: [ARM] Add Thumb Attributes for thumb thunks created in SLSHarding.
Fri, Feb 24, 7:06 AM · Restricted Project, Restricted Project
dmgreen added a comment to D143283: [AArch64][SVE]: custom lower AVGFloor/AVGCeil..

Hello. Sorry for the delay in looking at this but I wasn't sure exactly what you were trying to do, and I've never been a huge fan of DAG combines that create the wrong node just to expand it later. It looks like for legal types this can lead to a nice decrease in instruction count though.

Fri, Feb 24, 2:39 AM · Restricted Project, Restricted Project

Wed, Feb 22

dmgreen committed rG74b67e53c638: [LSR] Fix incorrect check in 73cd3d4391ad47ae7 (authored by dmgreen).
Wed, Feb 22, 3:42 PM · Restricted Project, Restricted Project
dmgreen committed rG73cd3d4391ad: [LSR] Prevent creating SCEVs of addrecs from mismatching loops (authored by dmgreen).
Wed, Feb 22, 2:51 PM · Restricted Project, Restricted Project
dmgreen added a comment to D141940: [SLP]Add shuffling of extractelements to avoid extra costs/data movement..

Fixed in cbcdd747e85b8d33b821d94d8114b971f31fd0d2

Wed, Feb 22, 5:31 AM · Restricted Project, Restricted Project
dmgreen committed rGc33fd3b47faa: [AArch64] Lower all fp zero buildvectors through BUILD_VECTOR. (authored by dmgreen).
Wed, Feb 22, 3:27 AM · Restricted Project, Restricted Project
dmgreen requested review of D144550: [AArch64] Remove 64bit->128bit vector insert lowering.
Wed, Feb 22, 3:25 AM · Restricted Project, Restricted Project

Tue, Feb 21

dmgreen accepted D144508: [AArch64] Fix N2 SchedModel INS instruction latencies.

Sounds good to me. This LGTM but you may be able to remove the extra pattern, and it may be worth quickly adding tests for each of the 4 type sizes.

Tue, Feb 21, 11:47 PM · Restricted Project, Restricted Project
dmgreen committed rGafa557fad61a: [AArch64] Add a test for loading into a zerovector. NFC (authored by dmgreen).
Tue, Feb 21, 6:43 AM · Restricted Project, Restricted Project
dmgreen updated subscribers of D141940: [SLP]Add shuffling of extractelements to avoid extra costs/data movement..

Hi - We noticed some regressions from this. The largest ones appear to be similar to the potential regressions that were reported in https://reviews.llvm.org/D142359#4131731, but with code that includes intrinsics and didn't change the vector cost: https://godbolt.org/z/P1zoxz85f.

Tue, Feb 21, 3:07 AM · Restricted Project, Restricted Project

Mon, Feb 20

dmgreen accepted D144399: [AArch64] Add tests for saba (NFC).

Yeah, they sound OK to add to me. The tests LGTM, whether you want to update the triple or not.

Mon, Feb 20, 9:05 AM · Restricted Project, Restricted Project
dmgreen added a comment to D143713: [ARM] Fix Chain/Glue Bug in PerformVMOVhrCombine.

Do you mean the AES pass tests? Do you think that should happen before we land this? They've been an issue for a little while.

Mon, Feb 20, 9:01 AM · Restricted Project, Restricted Project
dmgreen accepted D142594: [AArch64] Eliminating the use of integer unit in moving from a Neon scalar result of a uaddlv to a Neon vector.

Thanks. From what I can see this LGTM.

Mon, Feb 20, 6:29 AM · Restricted Project, Restricted Project
dmgreen committed rGc6c6723189f4: [AArch64] More consistently use buildvector for zero and all-ones constants (authored by dmgreen).
Mon, Feb 20, 6:14 AM · Restricted Project, Restricted Project
dmgreen closed D144018: [AArch64] More consistently use buildvector for zero and all-ones constants.
Mon, Feb 20, 6:13 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144379: [AArch64] Fix abs(sub nsw) -> absd.

Can you make sure you upload with full context (-U999999), as per https://llvm.org/docs/DeveloperPolicy.html#making-and-submitting-a-patch? It can help to make the patch easier to review.

Mon, Feb 20, 5:22 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D142359: [TTI][AArch64] Cost model vector INS instructions.
Mon, Feb 20, 4:44 AM · Restricted Project, Restricted Project
dmgreen added a comment to D144116: [DAGCombiner] Avoid converting (x or/xor const) + y to (x + y) + const if benefit is unclear.

Would it be possible to optimize the ADDCARRY to the same result as without this fold? Similar to combineADDCARRYDiamond. I looked at the DAG that was being produced, but it's not obvious to me how it would be sensibly combined to the same result as before.

Mon, Feb 20, 12:56 AM · Restricted Project, Restricted Project
dmgreen added reviewers for D144128: [SLP] Check with target before vectorizing GEP Indices: fhahn, SjoerdMeijer.

I think this change is OK for AArch64 too; I don't think it will change much in practice. Some of the tests may not be testing what they did in the past though.

Mon, Feb 20, 12:45 AM · Restricted Project, Restricted Project

Sun, Feb 19

dmgreen committed rGfd4d29808efa: [ARM] Add targets for Arm DebugInfo tests. NFC (authored by dmgreen).
Sun, Feb 19, 11:14 AM · Restricted Project, Restricted Project

Sat, Feb 18

dmgreen committed rG8e3dc1366fb8: [AArch64] Concat zip1 and zip2 is a wider zip1 (authored by dmgreen).
Sat, Feb 18, 11:54 AM · Restricted Project, Restricted Project
dmgreen closed D121088: [AArch64] Concat zip1 and zip2 is a wider zip1.
Sat, Feb 18, 11:54 AM · Restricted Project, Restricted Project

Feb 16 2023

dmgreen added inline comments to D144018: [AArch64] More consistently use buildvector for zero and all-ones constants.
Feb 16 2023, 10:04 AM · Restricted Project, Restricted Project