dmgreen (Dave Green)
User

Projects

User does not belong to any projects.

User Details

User Since
May 24 2016, 8:35 AM (201 w, 3 d)

Recent Activity

Today

dmgreen added a comment to D76909: [MachineScheduler] Update available queue on the first mop of a new cycle.

LGTM on the power side changes, as no perf regression was found with this change. I will leave it to others to accept this patch, as I am not qualified to do so.

Fri, Apr 3, 9:40 AM
dmgreen accepted D76514: [ARM] Avoid pointless vrev of element-wise vmov.

LGTM, Thanks.

Fri, Apr 3, 9:07 AM · Restricted Project
dmgreen added a reviewer for D77387: [ARM] Fix conditions for lowering to S[LR]I: t.p.northover.

Looks good.

Fri, Apr 3, 6:57 AM · Restricted Project
dmgreen accepted D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.

Nice one. LGTM

Fri, Apr 3, 2:39 AM · Restricted Project
dmgreen added a comment to rGf2fbdf76d8d0: [InstCombine] do not exclude min/max from icmp with casted operand fold.

Thanks, much appreciated. And thanks for adding the extra test.

Fri, Apr 3, 2:07 AM

Yesterday

dmgreen added inline comments to D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.
Thu, Apr 2, 11:57 PM · Restricted Project
dmgreen committed rGfbd53ffc3ad9: [ARM] MVE VMULL patterns (authored by dmgreen).
[ARM] MVE VMULL patterns
Thu, Apr 2, 3:15 AM
dmgreen committed rGc697dd9ffdb1: [ARM] Make remaining MVE instruction predictable (authored by dmgreen).
[ARM] Make remaining MVE instruction predictable
Thu, Apr 2, 3:14 AM
dmgreen closed D76740: [ARM] MVE VMULL patterns.
Thu, Apr 2, 3:14 AM · Restricted Project
dmgreen closed D76910: [ARM] Make remaining MVE instruction predictable.
Thu, Apr 2, 3:14 AM · Restricted Project
dmgreen added a comment to D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.

Looking good.

Thu, Apr 2, 1:35 AM · Restricted Project
dmgreen added a comment to rGf2fbdf76d8d0: [InstCombine] do not exclude min/max from icmp with casted operand fold.

Yeah, the multiple uses are a pain.

Thu, Apr 2, 12:30 AM

Wed, Apr 1

dmgreen added inline comments to D76518: [ARM] CMSE code generation.
Wed, Apr 1, 11:31 AM
dmgreen committed rGa0c537834ae8: [ARM] Extra vmull loop tests. NFC (authored by dmgreen).
[ARM] Extra vmull loop tests. NFC
Wed, Apr 1, 6:35 AM
dmgreen added a comment to D76740: [ARM] MVE VMULL patterns.

Thanks.

Wed, Apr 1, 5:30 AM · Restricted Project
dmgreen added a comment to D76514: [ARM] Avoid pointless vrev of element-wise vmov.

The code looks OK. I think update_llc_test_checks should work, I've used it elsewhere in the past.

Wed, Apr 1, 5:30 AM · Restricted Project
dmgreen added inline comments to D76786: [ARM][MVE] Add support for incrementing gathers.
Wed, Apr 1, 4:58 AM · Restricted Project
dmgreen added a comment to D77202: [Target][ARM] Fold or(A, B) more aggressively for I1 Vectors.

The code that already exists in the function you are changing is essentially already doing the same thing as you add here, just in a more constrained set of circumstances. It is saying that if the operands are obviously invertible, then the VPNOT will be really free and we can go ahead and invert the and to an or. The question is whether that is true for all cases or not. For the test cases you have here it is true that the operands are easily invertible, but that won't be true for everything.
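
As a rough illustration of the identity behind this fold (not the actual ARM backend code), the sketch below models predicate masks as 16-bit integers and checks that an or can be rewritten as a not of an and of nots. The 16-bit width and the helper names `vpnot` and `or_via_inverted_and` are assumptions made for the example only.

```python
# Illustrative model only: predicate masks are treated as 16-bit integers.
# The width and helper names are assumptions, not taken from the patch.
MASK = 0xFFFF


def vpnot(p):
    """Invert every bit of a 16-bit predicate mask."""
    return ~p & MASK


def or_via_inverted_and(a, b):
    """Compute a | b using De Morgan: not(and(not a, not b))."""
    return vpnot(vpnot(a) & vpnot(b))


def extra_nots(a_invertible_for_free, b_invertible_for_free):
    """Count the explicit inversions the rewritten form still needs.

    The outer not is always counted; each operand adds one more
    unless its inverse is available at no cost.
    """
    return 1 + (not a_invertible_for_free) + (not b_invertible_for_free)
```

The rewrite is always correct as a boolean identity; whether it is profitable depends on how many of the inversions are actually free, which is the open question raised in the comment.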

Wed, Apr 1, 3:54 AM · Restricted Project
dmgreen accepted D75993: [Target][ARM] Improvements to the VPT Block Insertion Pass.

Oh yeah. Because we are now sinking float splats. Still LGTM.

Wed, Apr 1, 3:22 AM · Restricted Project
dmgreen added inline comments to D76681: [ARM][MVE] Optimise offset addresses of gathers/scatters.
Wed, Apr 1, 3:18 AM · Restricted Project
dmgreen added a comment to rGf2fbdf76d8d0: [InstCombine] do not exclude min/max from icmp with casted operand fold.

Hello. Unfortunately we have some tests where it looks like this is needed. This code: https://godbolt.org/z/DpJuFm is no longer reaching the same minimum after this patch. Any ideas?

Wed, Apr 1, 1:03 AM

Tue, Mar 31

dmgreen committed rG2c5f43f9ddbb: [ARM] Fix qdadd operand order (authored by dmgreen).
[ARM] Fix qdadd operand order
Tue, Mar 31, 2:47 AM
dmgreen closed D77049: [ARM] Fix qdadd operand order.
Tue, Mar 31, 2:45 AM · Restricted Project

Mon, Mar 30

dmgreen added a comment to D76570: [AArch64] Homogeneous Prolog and Epilog for Size Optimization.

Hello. I like the idea. It's something we thought about internally, but no one has ever worked on it enough to see how much of an improvement it gives in general.

Mon, Mar 30, 9:43 AM · Restricted Project
dmgreen added a comment to D77065: [ARM][MVE] Add VHADD and VHSUB patterns.

We had these patterns before and took them out because they were not correct. My understanding is that these instructions do trunc(shift(add(sext(a), sext(b)), 1)). They internally operate in a higher bitwidth than we natively have.
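
A minimal sketch of why the width matters, assuming signed 8-bit lanes and following the trunc(shift(add(sext(a), sext(b)), 1)) description given in the comment rather than any architecture reference. The helper names are hypothetical.

```python
def wrap_s8(x):
    """Wrap an integer to a signed 8-bit two's-complement value."""
    x &= 0xFF
    return x - 0x100 if x & 0x80 else x


def halving_add_wide(a, b):
    """Add at full precision, shift right by one, then truncate to 8 bits."""
    return wrap_s8((a + b) >> 1)


def halving_add_narrow(a, b):
    """Add in 8 bits (wrapping on overflow), then arithmetic shift right by one."""
    return wrap_s8(wrap_s8(a + b) >> 1)


# The two agree while the sum fits in 8 bits, and differ once it overflows.
print(halving_add_wide(1, 2), halving_add_narrow(1, 2))
print(halving_add_wide(100, 100), halving_add_narrow(100, 100))
```

A pattern that matches a plain add followed by a shift at the native lane width corresponds to the narrow version, so it is only equivalent to the wide form when the intermediate sum cannot overflow.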

Mon, Mar 30, 9:43 AM · Restricted Project
dmgreen created D77049: [ARM] Fix qdadd operand order.
Mon, Mar 30, 4:17 AM · Restricted Project
dmgreen added a comment to D76909: [MachineScheduler] Update available queue on the first mop of a new cycle.

Looks like this change has more impact on pre-P9 models, especially ppc32/ppc64, for which we were using itineraries. (And I know there might be bugs in those itineraries.)
Although some changes don't look good at first glance, we should dig into the itineraries of each model to see whether this is a problem in the itineraries or due to this change.

@steven.zhang Can you help look into some of them, especially the P6-P8 tests? Thanks.
@jhibbits FYI, you may want to check whether those changes in G3/SPE are good.

Mon, Mar 30, 3:44 AM
dmgreen closed D76629: [ARM] MVE VMOV.i64.

rGc9eaed514929

Mon, Mar 30, 3:12 AM
dmgreen added inline comments to D76681: [ARM][MVE] Optimise offset addresses of gathers/scatters.
Mon, Mar 30, 1:35 AM · Restricted Project
dmgreen committed rGc9eaed514929: [ARM] MVE VMOV.i64 (authored by dmgreen).
[ARM] MVE VMOV.i64
Mon, Mar 30, 12:30 AM

Sun, Mar 29

dmgreen added a comment to D76786: [ARM][MVE] Add support for incrementing gathers.

Tests look nice on this one.

Sun, Mar 29, 3:32 PM · Restricted Project
dmgreen added a comment to D76847: [Target][ARM] Replace re-uses of old VPR values with VPNOTs.

I'm surprised not to see more test changes, does this really have no effect on any existing tests?

Sun, Mar 29, 3:32 PM · Restricted Project
dmgreen added inline comments to D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.
Sun, Mar 29, 3:32 PM · Restricted Project
dmgreen committed rG7c1a6873aa53: [ARM] VMOV.64 immediate tests. NFC (authored by dmgreen).
[ARM] VMOV.64 immediate tests. NFC
Sun, Mar 29, 1:24 PM

Fri, Mar 27

dmgreen updated the diff for D76910: [ARM] Make remaining MVE instruction predictable.

Unpredicatable -> Unpredictable

Fri, Mar 27, 4:50 AM · Restricted Project
dmgreen added a comment to D76910: [ARM] Make remaining MVE instruction predictable.

Are we talking about CONSTRAINED_UNPREDICTABLE here? If so, why is this modeled with hasSideEffects?

Fri, Mar 27, 4:50 AM · Restricted Project
dmgreen committed rG8689f98e9ba5: [ARM] Fix MVE VCMPr f16 pattern (authored by dmgreen).
[ARM] Fix MVE VCMPr f16 pattern
Fri, Mar 27, 4:19 AM
dmgreen closed D76841: [ARM] Fix MVE VCMPr f16 pattern.
Fri, Mar 27, 4:19 AM · Restricted Project
dmgreen created D76910: [ARM] Make remaining MVE instruction predictable.
Fri, Mar 27, 2:38 AM · Restricted Project
dmgreen created D76909: [MachineScheduler] Update available queue on the first mop of a new cycle.
Fri, Mar 27, 2:06 AM
dmgreen added inline comments to D76909: [MachineScheduler] Update available queue on the first mop of a new cycle.
Fri, Mar 27, 2:06 AM

Thu, Mar 26

dmgreen added inline comments to D76740: [ARM] MVE VMULL patterns.
Thu, Mar 26, 9:11 AM · Restricted Project
dmgreen updated the diff for D76740: [ARM] MVE VMULL patterns.

Added some tests for both top and bottom vmull's

Thu, Mar 26, 9:11 AM · Restricted Project
dmgreen accepted D75993: [Target][ARM] Improvements to the VPT Block Insertion Pass.

LGTM

Thu, Mar 26, 9:11 AM · Restricted Project
dmgreen added inline comments to D76681: [ARM][MVE] Optimise offset addresses of gathers/scatters.
Thu, Mar 26, 7:00 AM · Restricted Project
dmgreen created D76841: [ARM] Fix MVE VCMPr f16 pattern.
Thu, Mar 26, 6:28 AM · Restricted Project
dmgreen committed rG37b9cc8f29e9: [ARM] Sink splats to vector float instructions (authored by dmgreen).
[ARM] Sink splats to vector float instructions
Thu, Mar 26, 2:08 AM
dmgreen closed D76023: [ARM] Sink splats to vector float instructions.
Thu, Mar 26, 2:08 AM · Restricted Project

Wed, Mar 25

dmgreen added a comment to D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.

Unfortunately, I've written this pass in a single commit, so there is no easy way for me to split this patch in 2.
I can do it if you want, it's not impossible, but it's going to take me a while to get right. Also, the second optimisation is a relatively small part of this patch, so patch #1 would still be a large patch.

Wed, Mar 25, 4:17 AM · Restricted Project

Tue, Mar 24

dmgreen created D76740: [ARM] MVE VMULL patterns.
Tue, Mar 24, 4:32 PM · Restricted Project
dmgreen added a comment to D76709: [Target][ARM] Adding MVE VPT Optimisation Pass.

Is it possible to split this into two patches? The pass and "replaces VCMPs with VPNOTs when possible" part, then the second part to replace the re-use with the not. I think that would make each part easier to review, more manageable.

Tue, Mar 24, 4:17 PM · Restricted Project
dmgreen added a comment to D76514: [ARM] Avoid pointless vrev of element-wise vmov.

We added a VECTOR_REG_CAST, which is like a bitcast but doesn't change the bits. Similar to the AArch64 NVCAST.

Tue, Mar 24, 10:11 AM · Restricted Project
dmgreen committed rGf8c79b94af71: [ARM] Fold VMOVrh VLDR to LDRH (authored by dmgreen).
[ARM] Fold VMOVrh VLDR to LDRH
Tue, Mar 24, 9:07 AM
dmgreen closed D76485: [ARM] Fold VMOVrh VLDR to LDRH.
Tue, Mar 24, 9:07 AM · Restricted Project
dmgreen added inline comments to D76681: [ARM][MVE] Optimise offset addresses of gathers/scatters.
Tue, Mar 24, 8:34 AM · Restricted Project
dmgreen accepted D76695: [CMSE] Fix bogus save-temps.c tests.

SGTM

Tue, Mar 24, 5:20 AM
dmgreen updated the diff for D76135: [MachineLICM] Don't treat cross copies as cheap.

No longer a virtual method.

Tue, Mar 24, 4:48 AM
dmgreen added inline comments to D75993: [Target][ARM] Improvements to the VPT Block Insertion Pass.
Tue, Mar 24, 4:48 AM · Restricted Project
dmgreen committed rG1232cfa385c1: [ARM] Don't split trunc stores that can be better handled as VMOVN (authored by dmgreen).
[ARM] Don't split trunc stores that can be better handled as VMOVN
Tue, Mar 24, 3:12 AM
dmgreen closed D76511: [ARM] Don't split trunc stores that can be better handled as VMOVN.
Tue, Mar 24, 3:12 AM · Restricted Project

Mon, Mar 23

dmgreen created D76629: [ARM] MVE VMOV.i64.
Mon, Mar 23, 10:21 AM
dmgreen accepted D76515: [ARM] Fix incorrect handling of big-endian vmov.i64.

Amazing. I was going to say I'm not surprised we get big endian wrong. But it's a movimm. I am a little surprised.

Mon, Mar 23, 10:21 AM · Restricted Project
dmgreen committed rGe10af89d9917: [ARM] Extra VMOVN and VMULL tests. NFC (authored by dmgreen).
[ARM] Extra VMOVN and VMULL tests. NFC
Mon, Mar 23, 9:49 AM
dmgreen added a comment to D76514: [ARM] Avoid pointless vrev of element-wise vmov.

I see. Because we are just swapping around the same values anyway. Makes sense.

Mon, Mar 23, 8:41 AM · Restricted Project
dmgreen added a comment to D76135: [MachineLICM] Don't treat cross copies as cheap.

I have altered the way that MVE VDUP are lowered, and we end up with a VMOV now instead of using a COPY. That means I don't need this any more for MVE at least.

Mon, Mar 23, 7:03 AM
dmgreen updated the diff for D76023: [ARM] Sink splats to vector float instructions.

Rebased onto the VDUP type changes.

Mon, Mar 23, 5:58 AM · Restricted Project
dmgreen added inline comments to D76485: [ARM] Fold VMOVrh VLDR to LDRH.
Mon, Mar 23, 5:26 AM · Restricted Project
dmgreen updated the diff for D76485: [ARM] Fold VMOVrh VLDR to LDRH.
Mon, Mar 23, 5:26 AM · Restricted Project
dmgreen added inline comments to D75993: [Target][ARM] Improvements to the VPT Block Insertion Pass.
Mon, Mar 23, 4:20 AM · Restricted Project
dmgreen added inline comments to D76132: [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest..
Mon, Mar 23, 3:48 AM · Restricted Project

Sun, Mar 22

dmgreen added a reviewer for D76570: [AArch64] Homogeneous Prolog and Epilog for Size Optimization: paquette.
Sun, Mar 22, 3:34 PM · Restricted Project
dmgreen accepted D75847: [DAGCombine] Skip PostInc combine with later users.

Looks nice.

Sun, Mar 22, 3:01 PM · Restricted Project
dmgreen accepted D76060: [NFC][DAGCombine] Extract post-inc logic.

LGTM

Sun, Mar 22, 3:01 PM · Restricted Project
dmgreen added inline comments to D76132: [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest..
Sun, Mar 22, 3:01 PM · Restricted Project

Fri, Mar 20

dmgreen created D76511: [ARM] Don't split trunc stores that can be better handled as VMOVN.
Fri, Mar 20, 9:44 AM · Restricted Project
dmgreen accepted D76491: [ARM,MVE] Add ACLE intrinsics for the vaddv/vaddlv family..

Sounds great, from what I can see. The predicated lowering looks useful when/if we try and get predicated vecreduce's working.

Fri, Mar 20, 8:38 AM · Restricted Project
dmgreen committed rGb3499f572d37: [ARM] Change VDUP type to i32 for MVE (authored by dmgreen).
[ARM] Change VDUP type to i32 for MVE
Fri, Mar 20, 3:13 AM
dmgreen closed D76292: [ARM] Change VDUP type to i32 for MVE.
Fri, Mar 20, 3:13 AM · Restricted Project
dmgreen created D76485: [ARM] Fold VMOVrh VLDR to LDRH.
Fri, Mar 20, 3:13 AM · Restricted Project
dmgreen committed rG9cf920e64d18: [ARM] Extra MVE float loop tests. NFC (authored by dmgreen).
[ARM] Extra MVE float loop tests. NFC
Fri, Mar 20, 2:41 AM

Thu, Mar 19

dmgreen added a comment to D75388: Expand interleaved memory access pass to identify certain shuffle_vector and transform it into target specific intrinsics..

I'm still not convinced that this shouldn't be done in ISel. There's nothing cross-block going on, so this is what ISel is designed for. It might make sense not to think of this as trying to convert vector_shuffle to something else, but instead trying to convert what vector_shuffle has turned into into something more optimal. In that one case I looked at, there was something like a (v4i32 (ext (v2i32 (buildvector (v4i32,..))). We get this way because a v2i16 was legalised to a v2i32, but everything around it was a v4i32. Can we "flatten" the ext into a single BUILDVECTOR? I have not had time to see if that is or isn't possible, but it sounds more sensible than very special case pre-isel legalisation for certain shuffle_vector's.

Thu, Mar 19, 4:49 AM · Restricted Project
dmgreen added inline comments to D76132: [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest..
Thu, Mar 19, 4:49 AM · Restricted Project
dmgreen added a comment to D76292: [ARM] Change VDUP type to i32 for MVE.

I haven't been following the MVE work that closely, but changing the operand type of MVE vdup makes sense. My one concern here is the potential for confusion due to the opcode; VDUP for NEON and MVE have the same opcode and result type, but the operand types are different. Doesn't really matter much for isel patterns, but could be confusing for writing target-specific combines.

Thu, Mar 19, 3:44 AM · Restricted Project

Tue, Mar 17

dmgreen created D76292: [ARM] Change VDUP type to i32 for MVE.
Tue, Mar 17, 9:05 AM · Restricted Project

Mon, Mar 16

dmgreen added inline comments to D76132: [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest..
Mon, Mar 16, 4:57 PM · Restricted Project
dmgreen accepted D76122: [ARM,MVE] Add intrinsics and isel for MVE integer VMLA..

LGTM

Mon, Mar 16, 8:08 AM · Restricted Project
dmgreen added inline comments to D76132: [LoopUnrollAndJam] Changed safety checks to consider more than 2-levels loop nest..
Mon, Mar 16, 5:52 AM · Restricted Project
dmgreen updated the diff for D76135: [MachineLICM] Don't treat cross copies as cheap.

Now using TargetRegisterInfo::shareSameRegisterFile.

Mon, Mar 16, 3:53 AM

Fri, Mar 13

dmgreen updated the diff for D76135: [MachineLICM] Don't treat cross copies as cheap.

Thanks for the suggestion.

Fri, Mar 13, 10:44 AM
dmgreen added a comment to D76024: [MachineLICM] Let targets decide to hoist cheap instructions.

I'm not sure what, exactly, is specific to Cortex-M here. Copies between int and SIMD registers aren't really cheaper on other CPUs, relatively speaking. I don't really like adding a new target hook without a better justification for why Cortex-M is different from other targets; having target-specific codepaths is going to make it harder for anyone to make improvements here in the future.

Fri, Mar 13, 8:33 AM · Restricted Project
dmgreen created D76135: [MachineLICM] Don't treat cross copies as cheap.
Fri, Mar 13, 8:33 AM
dmgreen added inline comments to D76124: [TTI] Remove getOperationCost.
Fri, Mar 13, 5:06 AM · Restricted Project
dmgreen committed rG2c6c169dbd60: [ARM] Optimise ASRL/LSRL to smaller shifts using demand bits. (authored by dmgreen).
[ARM] Optimise ASRL/LSRL to smaller shifts using demand bits.
Fri, Mar 13, 3:42 AM
dmgreen closed D75371: [ARM] Optimise ASRL/LSRL to smaller shifts using demand bits..
Fri, Mar 13, 3:41 AM · Restricted Project
dmgreen added a comment to D75371: [ARM] Optimise ASRL/LSRL to smaller shifts using demand bits..

Thanks

Fri, Mar 13, 2:58 AM · Restricted Project
dmgreen committed rGf67d93dc23f9: [ARM] Constant long shift combines (authored by dmgreen).
[ARM] Constant long shift combines
Fri, Mar 13, 2:16 AM
dmgreen closed D75553: [ARM] Constant long shift combines.
Fri, Mar 13, 2:15 AM · Restricted Project

Thu, Mar 12

dmgreen committed rG05334de67976: [ARM] Long shift tests. NFC (authored by dmgreen).
[ARM] Long shift tests. NFC
Thu, Mar 12, 12:29 PM
dmgreen added a comment to D76023: [ARM] Sink splats to vector float instructions.

Annoying about the vmovs.... I can't see, with register aliasing, how this codegen wouldn't be a regression.

Thu, Mar 12, 7:35 AM · Restricted Project

Wed, Mar 11

dmgreen added a comment to D75993: [Target][ARM] Improvements to the VPT Block Insertion Pass.

Good to see this getting updated.

Wed, Mar 11, 2:07 PM · Restricted Project