Page MenuHomePhabricator

dmgreen (Dave Green)
User

Projects

User does not belong to any projects.

User Details

User Since
May 24 2016, 8:35 AM (190 w, 3 d)

Recent Activity

Today

dmgreen added inline comments to D72919: [AArch64] Add custom store lowering for 256 bit non-temporal stores..
Sat, Jan 18, 2:31 AM · Restricted Project

Yesterday

dmgreen created D72939: [Schedule] Add a MultiHazardRecognizer.
Fri, Jan 17, 11:11 AM · Restricted Project
dmgreen added inline comments to D72856: [ARM][MVE] Enable masked scatter.
Fri, Jan 17, 8:05 AM · Restricted Project
dmgreen added a comment to D72916: [IPO] Don't run jump threading at Oz.

I feel that this should be an decision made inside the pass, not necessarily in the pass manager. We may have a single module with both functions that are and are not optsize. The inliner/unroller would change thresholds based on this, for example.

Fri, Jan 17, 6:49 AM · Restricted Project
dmgreen accepted D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..

This looks useful as a separate function, the interface looks pretty clean and I think it lives nicely in LoopUtils.

Fri, Jan 17, 6:11 AM · Restricted Project
dmgreen added inline comments to D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..
Fri, Jan 17, 1:28 AM · Restricted Project
dmgreen added a comment to D72602: [IndVarSimplify][LoopUtils] rewriteLoopExitValues. NFC..

These seems like a sensible thing to go along with D72714.

Fri, Jan 17, 1:24 AM · Restricted Project
dmgreen added a comment to D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

Seems good to me, if Sam is happy with the tail predication parts.

Fri, Jan 17, 1:24 AM · Restricted Project

Thu, Jan 16

dmgreen accepted D72451: [ARM][MVE] Enable extending gathers.

LGTM. Thanks

Thu, Jan 16, 2:08 AM · Restricted Project

Wed, Jan 15

dmgreen added inline comments to D72753: [ARM] Fixup FP16 bitcasts.
Wed, Jan 15, 3:24 PM · Restricted Project
dmgreen added a comment to D72761: [ARM][MVE][Intrinsics] Add VMINAQ, VMINNMAQ, VMAXAQ, VMAXNMAQ intrinsics..

Nice one. Good to see codegen changes coming out of these intrinsics.

Wed, Jan 15, 10:22 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.
Wed, Jan 15, 8:49 AM · Restricted Project
dmgreen added a comment to D71194: [ARM] MVE VLDn addressing modes.

Thanks! This is mostly a copy of the existing Neon code, adjusted for MVE. Hopefully fairly straight forward.

Wed, Jan 15, 6:48 AM · Restricted Project
dmgreen added inline comments to D72451: [ARM][MVE] Enable extending gathers.
Wed, Jan 15, 4:19 AM · Restricted Project
dmgreen updated the diff for D71194: [ARM] MVE VLDn addressing modes.

I just copied the existing tests with post-inc versions.

Wed, Jan 15, 2:46 AM · Restricted Project
dmgreen added inline comments to D71194: [ARM] MVE VLDn addressing modes.
Wed, Jan 15, 2:37 AM · Restricted Project
dmgreen updated the diff for D72753: [ARM] Fixup FP16 bitcasts.

Update another test.

Wed, Jan 15, 12:55 AM · Restricted Project
dmgreen created D72753: [ARM] Fixup FP16 bitcasts.
Wed, Jan 15, 12:27 AM · Restricted Project
dmgreen committed rG1b264a8263f8: [ARM] Reegenerate MVE tests. NFC (authored by dmgreen).
[ARM] Reegenerate MVE tests. NFC
Wed, Jan 15, 12:18 AM

Tue, Jan 14

dmgreen added a comment to D71194: [ARM] MVE VLDn addressing modes.

Ping

Tue, Jan 14, 11:49 PM · Restricted Project
dmgreen committed rGb891490ceb39: [Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use… (authored by dmgreen).
[Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use…
Tue, Jan 14, 11:40 PM
dmgreen added a comment to D72714: [ARM][MVE] Tail-Predication: rematerialise iteration count in exit blocks.

I had not realised that this required LCSSA form. That will likely cause many more changed than you really want. It's probably best not to create LCSSA form here without destroying it afterwards. Do you know if LoopSimpleForm is needed too? It looks from the code that it shouldn't (it can handle multiple exits).

Tue, Jan 14, 2:17 PM · Restricted Project
dmgreen accepted D72629: [ARM][MVE] Disallow VPSEL for tail predication.

Thanks. LGTM

Tue, Jan 14, 3:33 AM · Restricted Project
dmgreen accepted D72509: [ARM][LowOverheadLoops] Allow all MVE instrs..

LGTM by the way. I hadn't meant to tread on Sjoerd toes, I didn't see he was taking a look already. The issues were mostly unrelated, this looks like a nice improvement on it's own.

Tue, Jan 14, 12:17 AM · Restricted Project

Mon, Jan 13

dmgreen added a comment to D72629: [ARM][MVE] Disallow VPSEL for tail predication.

Sounds good. I think that if we wanted to make VPSEL work, it would be better to try to convert them to VMOVt in a VPT block. (Although if there is only one of them, creating the block maybe more overhead. Hopefully the VCMP can be folded in).

Mon, Jan 13, 10:25 AM · Restricted Project
dmgreen committed rG90555d925343: [Scheduler] Remove superfluous casts. NFC (authored by dmgreen).
[Scheduler] Remove superfluous casts. NFC
Mon, Jan 13, 8:41 AM
dmgreen added inline comments to D72509: [ARM][LowOverheadLoops] Allow all MVE instrs..
Mon, Jan 13, 7:25 AM · Restricted Project
dmgreen added inline comments to D72451: [ARM][MVE] Enable extending gathers.
Mon, Jan 13, 6:26 AM · Restricted Project
dmgreen added a comment to D72509: [ARM][LowOverheadLoops] Allow all MVE instrs..

This looks sensible, using the VPT blocks to know that the instructions are predicated on something that makes tail predication valid.

Mon, Jan 13, 5:39 AM · Restricted Project
dmgreen updated the diff for D70790: [ARM] Favour post inc for MVE loops.

Sorry for the long delay here. This wasn't making things much better in the tests I was trying (it was a bit up-and-down). I've adjusted it now to include shouldFavorPostInc for MVE subtargets, not just disable shouldFavorBackedgeIndex. That should get the costmodel more correct in LSR, where the AddRec is now free because it can just be the postinc. I also removed the old "containsVectors(L)" check as adding something that is O(n) to the inner parts of LSR, something that is already O(something large), was probably a bad idea.

Mon, Jan 13, 5:39 AM · Restricted Project
dmgreen accepted D72330: [ARM][MVE] Enable masked gathers from base + vector of offsets.

LGTM. Nice one.

Mon, Jan 13, 2:41 AM · Restricted Project

Sun, Jan 12

dmgreen added a comment to D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..

Yeah, thanks. The square case comes up quite a lot.

Sun, Jan 12, 10:48 AM · Restricted Project

Fri, Jan 10

dmgreen added a comment to D72330: [ARM][MVE] Enable masked gathers from base + vector of offsets.

This I like. I like the structure. I think it's nice to split it out like this.

Fri, Jan 10, 6:27 AM · Restricted Project
dmgreen accepted D72502: [ARM][MVE] Tail predicate VMAX,VMAXA,VMIN,VMINA.

Sounds good to me.

Fri, Jan 10, 5:47 AM · Restricted Project
dmgreen accepted D72496: [ARM,MVE] Make `vqrshrun` generate the right instruction..

Yeah, I said these would be tough to read! LGTM

Fri, Jan 10, 3:22 AM · Restricted Project

Thu, Jan 9

dmgreen added inline comments to D72330: [ARM][MVE] Enable masked gathers from base + vector of offsets.
Thu, Jan 9, 2:00 PM · Restricted Project
dmgreen added inline comments to D72451: [ARM][MVE] Enable extending gathers.
Thu, Jan 9, 12:35 PM · Restricted Project
dmgreen accepted D72445: [ARM,MVE] Add missing IntrNoMem flag on IR intrinsics..

Thanks. LGTM.

Thu, Jan 9, 6:18 AM · Restricted Project
dmgreen accepted D72444: [ARM,MVE] Fix valid immediate range for vsliq_n..

Nice test. LGTM

Thu, Jan 9, 5:50 AM · Restricted Project
dmgreen added a comment to D72445: [ARM,MVE] Add missing IntrNoMem flag on IR intrinsics..

I agree that IntrNoMem makes a very sensible default.

Thu, Jan 9, 5:50 AM · Restricted Project
dmgreen accepted D72440: [ARM][MVE] Don't unroll intrinsic loops..

Ah. LGTM. Thanks

Thu, Jan 9, 3:13 AM · Restricted Project
dmgreen accepted D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..

LGTM. Thanks.

Thu, Jan 9, 1:31 AM · Restricted Project

Wed, Jan 8

dmgreen added a comment to D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..

The discussion in the link above sounds like using addRequiredID on a transformation pass is not a good idea. Beside that, I don't know a way to require transformations to run before a pass in the new pass manager.
I suggested to give up on loops that are not simplified or loop nests that are not in LCSSA. And change the LITs to be all in simplified and LCSSA form. In additional, add the two transformation passes before LoopUnrollAndJamPass in the pipeline. What do you think?

Wed, Jan 8, 6:53 AM · Restricted Project
dmgreen accepted D72328: [ARM,MVE] Intrinsics for partial-overwrite imm shifts..

LGTM I think

Wed, Jan 8, 5:23 AM · Restricted Project, Restricted Project
dmgreen accepted D72329: [ARM,MVE] Intrinsics for variable shift instructions..

I imagine trying to read what a llvm.arm.mve.vshl means in IR would be quite difficult, but it does make lowering them simpler. LGTM

Wed, Jan 8, 5:21 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..
Wed, Jan 8, 1:08 AM · Restricted Project
dmgreen created D72387: [LoopVectorize][TTI] Add an isLegalMaskedLoadStore method. NFC.
Wed, Jan 8, 1:08 AM · Restricted Project
dmgreen accepted D72131: [ARM][LowOverheadLoops] Update liveness info.

LGTM then. Thanks.

Wed, Jan 8, 12:50 AM · Restricted Project

Tue, Jan 7

dmgreen added a comment to D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..

If UnJ needs loops in LoopSimplify form then it should request (or enforce) that loops are in that form. I think it can be done with AU.addRequiredID(LoopSimplifyID); (and maybe a DEPENDENCY on LoopSimplify?) That should hopefully mean that the tests don't need these adjustments.

I am not familiar with addRequiredID, do you know if there is an equivalent in the new pass manager?

Do we have any tests for the new pass manager? I feel we would have added some, but I don't see any here.

I don't see any LIT tests using the new pass manager in llvm/test/Transforms/LoopUnrollAndJam, is it in some other folder I missed?

Tue, Jan 7, 2:35 PM · Restricted Project
dmgreen added a comment to D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..

If UnJ needs loops in LoopSimplify form then it should request (or enforce) that loops are in that form. I think it can be done with AU.addRequiredID(LoopSimplifyID); (and maybe a DEPENDENCY on LoopSimplify?) That should hopefully mean that the tests don't need these adjustments.

Tue, Jan 7, 12:49 PM · Restricted Project
dmgreen added inline comments to D72330: [ARM][MVE] Enable masked gathers from base + vector of offsets.
Tue, Jan 7, 9:55 AM · Restricted Project
dmgreen added inline comments to D72131: [ARM][LowOverheadLoops] Update liveness info.
Tue, Jan 7, 9:37 AM · Restricted Project
dmgreen accepted D71743: [ARM][MVE] Enable masked gathers from vector of pointers.

Thanks. LGTM, if no-one else has any other comments.

Tue, Jan 7, 3:23 AM · Restricted Project

Mon, Jan 6

dmgreen added inline comments to D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..
Mon, Jan 6, 2:54 PM · Restricted Project
dmgreen committed rGf88d52728b9c: [ARM] Use the correct opcodes for Thumb2 segmented stack frame lowering (authored by dmgreen).
[ARM] Use the correct opcodes for Thumb2 segmented stack frame lowering
Mon, Jan 6, 8:41 AM
dmgreen closed D72074: [ARM] Use the correct opcodes for Thumb2 segmented stack frame lowering.
Mon, Jan 6, 8:41 AM · Restricted Project
dmgreen committed rG0eb981b8ce70: [ARM] Use correct TRAP opcode for thumb in FastISel (authored by dmgreen).
[ARM] Use correct TRAP opcode for thumb in FastISel
Mon, Jan 6, 8:41 AM
dmgreen closed D72075: [ARM] Use correct TRAP opcode for thumb in FastISel.
Mon, Jan 6, 8:41 AM · Restricted Project
dmgreen accepted D72268: [ARM,MVE] Support -ve offsets in gather-load intrinsics..

LGTM.

Mon, Jan 6, 8:32 AM · Restricted Project, Restricted Project
dmgreen accepted D72270: [ARM,MVE] Fix many signedness errors in MVE intrinsics..

OK. LGTM then.

Mon, Jan 6, 8:31 AM · Restricted Project, Restricted Project
dmgreen added inline comments to D72270: [ARM,MVE] Fix many signedness errors in MVE intrinsics..
Mon, Jan 6, 8:15 AM · Restricted Project, Restricted Project
dmgreen accepted D72269: [ARM,MVE] Generate the right instruction for vmaxnmq_m_f16..

LGTM

Mon, Jan 6, 8:02 AM · Restricted Project
dmgreen added inline comments to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.
Mon, Jan 6, 6:51 AM · Restricted Project
dmgreen added a reviewer for D71743: [ARM][MVE] Enable masked gathers from vector of pointers: samparker.

Nice, thanks!

Mon, Jan 6, 6:24 AM · Restricted Project
dmgreen created D72257: [ARM] Use reduction intrinsics for larger than legal reductions.
Mon, Jan 6, 5:40 AM · Restricted Project
dmgreen updated the diff for D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..

Update as per the comments. I may leave this to be committed until the MVE codegen is further along (it depends on the first patch at least).

Mon, Jan 6, 4:52 AM · Restricted Project
dmgreen added inline comments to D72131: [ARM][LowOverheadLoops] Update liveness info.
Mon, Jan 6, 2:26 AM · Restricted Project
dmgreen added a comment to D72230: [NFCI][LoopUnrollAndJam] Changing LoopUnrollAndJamPass to a function pass..

This is only the new pass manager, right? It's probably best to keep the old pass manager working in the same way too, to keep them in sync. That will be the one that we use downstream at the mo.

Mon, Jan 6, 1:30 AM · Restricted Project

Sun, Jan 5

dmgreen committed rGfb8c9a339a9d: [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors (authored by dmgreen).
[ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors
Sun, Jan 5, 4:01 AM
dmgreen closed D72139: [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors.
Sun, Jan 5, 4:01 AM · Restricted Project
dmgreen committed rGc15a56f61a56: [ARM] Fill in FP16 FMA patterns (authored by dmgreen).
[ARM] Fill in FP16 FMA patterns
Sun, Jan 5, 4:01 AM
dmgreen closed D72138: [ARM] Fill in FP16 FMA patterns.
Sun, Jan 5, 4:01 AM · Restricted Project
dmgreen committed rG5a2539922124: [ARM] Add and update FMA tests. NFC (authored by dmgreen).
[ARM] Add and update FMA tests. NFC
Sun, Jan 5, 4:01 AM
dmgreen committed rG170de3de2eea: [ParserTest] Move raw string literal out of macro (authored by dmgreen).
[ParserTest] Move raw string literal out of macro
Sun, Jan 5, 3:33 AM

Fri, Jan 3

dmgreen updated the diff for D72139: [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors.

Move more of the logic into Subtarget.

Fri, Jan 3, 8:40 AM · Restricted Project
dmgreen added inline comments to D72139: [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors.
Fri, Jan 3, 8:32 AM · Restricted Project
dmgreen added inline comments to D72138: [ARM] Fill in FP16 FMA patterns.
Fri, Jan 3, 7:54 AM · Restricted Project
dmgreen added a comment to D72131: [ARM][LowOverheadLoops] Update liveness info.

Looks interesting. Thanks for splitting it out, it makes it a lot clearer.

Fri, Jan 3, 4:51 AM · Restricted Project
dmgreen created D72139: [ARM] Use isFMAFasterThanFMulAndFAdd for scalars as well as MVE vectors.
Fri, Jan 3, 4:10 AM · Restricted Project
dmgreen created D72138: [ARM] Fill in FP16 FMA patterns.
Fri, Jan 3, 3:42 AM · Restricted Project
dmgreen accepted D72061: [AArch64][test] Merge arm64-$i.ll Linux tests into $i.ll.

Sounds good. We could add an arm64 triple run line to the existing tests, but I'm not sure it adds much.

Fri, Jan 3, 1:03 AM · Restricted Project

Thu, Jan 2

dmgreen committed rG6b067c6a91e5: [ARM] Update ifcvt test target triples and opcodes. NFC (authored by dmgreen).
[ARM] Update ifcvt test target triples and opcodes. NFC
Thu, Jan 2, 6:19 AM
dmgreen added a reviewer for D72075: [ARM] Use correct TRAP opcode for thumb in FastISel: t.p.northover.
Thu, Jan 2, 3:41 AM · Restricted Project
dmgreen created D72075: [ARM] Use correct TRAP opcode for thumb in FastISel.
Thu, Jan 2, 3:41 AM · Restricted Project
dmgreen created D72074: [ARM] Use the correct opcodes for Thumb2 segmented stack frame lowering.
Thu, Jan 2, 3:29 AM · Restricted Project
dmgreen updated the diff for D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..
Thu, Jan 2, 3:20 AM · Restricted Project
dmgreen added inline comments to D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..
Thu, Jan 2, 3:20 AM · Restricted Project
dmgreen added a comment to D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..

OK.

Longer version:

Agreed, this is the place that gathers all symbolic strides for which runtime checks are later added by replaceSymbolicStrideSCEV().

Agreed, a gather or scatter would probably be a better option than a runtime check, certainly if the stride turns out to be other than 1, in which case the runtime check option will execute the original scalar loop.
Note that if the stride does turn out to be 1, a runtime check may be faster: the cost of a vector load/store is typically less than that of a gather/scatter, disregarding the overhead of the runtime check itself. So having a way to "manually" restore original performance for such cases may be useful (in addition to EnableMemAccessVersioning). Always preferring a gather or scatter as suggested should be good step forward, given the expected complexity of devising a cost-based preference.

Instead of teaching LAI to make such target/cost-based decisions, it would have been better to let this analysis continue to collect *all* symbolic strides potentially subject to runtime checks, and teach the planning/transform to prune/decide which strides to actually specialize; e.g., have LVP::plan() start by calling "CM.setVersioningForStrideDecisions()", analogous to InterleavedAccessInfo::analyzeInterleaving() which collects all potential interleave groups, and CM::setCostBasedWideningDecision() which decides which of the groups to materialize (per VF). However, this requires a fair amount of refactoring; worth a TODO?

Thu, Jan 2, 3:11 AM · Restricted Project
dmgreen accepted D70000: [DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG for vector type as 'expand' instead of 'legal'.

The AArch64 changes look good to me. Thanks

Thu, Jan 2, 1:39 AM · Restricted Project

Wed, Jan 1

dmgreen committed rGf323ab919a70: [ARM] Add +mve feature to mve tests. NFC (authored by dmgreen).
[ARM] Add +mve feature to mve tests. NFC
Wed, Jan 1, 9:38 AM

Mon, Dec 30

dmgreen accepted D72000: [ARM][Thumb][FIX] Add unwinding information to t4.

Thanks. LGTM

Mon, Dec 30, 8:03 AM · Restricted Project
dmgreen added a comment to D72000: [ARM][Thumb][FIX] Add unwinding information to t4.

Sounds good. Can you add a test?

Mon, Dec 30, 6:30 AM · Restricted Project
dmgreen committed rGb4abe7afbf52: [ARM] Sink splat to ICmp (authored by dmgreen).
[ARM] Sink splat to ICmp
Mon, Dec 30, 5:04 AM
dmgreen closed D70997: [ARM] Sink splat to ICmp.
Mon, Dec 30, 5:04 AM · Restricted Project
dmgreen committed rGa5a141544d0b: [ARM] MVE sink ICmp test. NFC (authored by dmgreen).
[ARM] MVE sink ICmp test. NFC
Mon, Dec 30, 5:03 AM
dmgreen added a comment to D71992: [ARM] Unrestrict Armv8 IT blocks.

Sounds interesting. My understanding is that there were several features of IT instructions that were performance deprecated. Some of the instructions inside the blocks and accesses to PC were ruled out too. Take a look at isV8EligibleForIT for how they are disabled (I happened to be looking at it the other day for different reasons. Someone wanted to try restrict-it on other cpus).

Mon, Dec 30, 3:02 AM · Restricted Project

Fri, Dec 27

dmgreen created D71919: [LoopVectorize] Disable single stride access predicates when gather loads are available..
Fri, Dec 27, 1:01 AM · Restricted Project

Mon, Dec 23

dmgreen added inline comments to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.
Mon, Dec 23, 8:30 AM · Restricted Project

Fri, Dec 20

dmgreen added inline comments to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.
Fri, Dec 20, 2:49 AM · Restricted Project
dmgreen added inline comments to D71743: [ARM][MVE] Enable masked gathers from vector of pointers.
Fri, Dec 20, 2:11 AM · Restricted Project