dmgreen (Dave Green)
User

Projects

User does not belong to any projects.

User Details

User Since
May 24 2016, 8:35 AM (111 w, 6 d)

Recent Activity

Yesterday

dmgreen added a comment to D49281: [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes..

I like the idea of this, giving control to the user to figure out the best way to manipulate loops (even if they end up shooting themselves in the foot with it :) )

Sun, Jul 15, 1:15 PM
dmgreen created D49349: [UnJ] Document loop metadata.
Sun, Jul 15, 1:05 PM

Sat, Jul 14

dmgreen added a comment to D40369: Support sext instruction in SCEV delinearization algorithm (new revision).

Hello. I just have some test Nits.

Sat, Jul 14, 2:00 AM

Thu, Jul 12

dmgreen committed rL336897: [UnJ] Use SmallPtrSets for block collections. NFC.
Thu, Jul 12, 3:50 AM
dmgreen closed D49060: [UnJ] Use SmallPtrSets for block collections NFC.
Thu, Jul 12, 3:49 AM
dmgreen added a comment to D49061: [UnJ] Common some code. NFC.

Thanks

Thu, Jul 12, 1:55 AM
dmgreen added a comment to D49060: [UnJ] Use SmallPtrSets for block collections NFC.

Thanks

Thu, Jul 12, 1:54 AM

Sun, Jul 8

dmgreen created D49061: [UnJ] Common some code. NFC.
Sun, Jul 8, 2:16 PM
dmgreen created D49060: [UnJ] Use SmallPtrSets for block collections NFC.
Sun, Jul 8, 1:40 PM
dmgreen accepted D48914: [PGOMemOPSize] Preserve the DominatorTree.

LGTM

Sun, Jul 8, 1:42 AM

Fri, Jul 6

dmgreen added a comment to D48914: [PGOMemOPSize] Preserve the DominatorTree.

The updates look correct to me. Can you add a test for pgo-memop-opt that preserves the DT and verifies it's correct?

Fri, Jul 6, 1:04 AM

Thu, Jul 5

dmgreen added inline comments to D48971: [ARM] ParallelDSP: added statistics, NFC..
Thu, Jul 5, 11:57 PM

Wed, Jul 4

dmgreen accepted D48937: [TableGen] Increase the number of supported decoder fix-ups..

Thanks Sander!

Wed, Jul 4, 7:10 AM
dmgreen added inline comments to D48914: [PGOMemOPSize] Preserve the DominatorTree.
Wed, Jul 4, 12:18 AM

Mon, Jul 2

dmgreen added a comment to D48383: [Dominators] Add the DomTreeUpdater class.

+1. Nice work guys.

Mon, Jul 2, 1:14 AM

Sun, Jul 1

dmgreen committed rL336062: [UnrollAndJam] New Unroll and Jam pass.
Sun, Jul 1, 5:52 AM
dmgreen closed D41953: [LoopUnroll] Unroll and Jam.
Sun, Jul 1, 5:52 AM

Thu, Jun 28

dmgreen added a comment to D41953: [LoopUnroll] Unroll and Jam.

OK. The important DA patches are in. Unless anyone objects, I will commit this soon (hopefully over the weekend).

Thu, Jun 28, 12:17 PM

Mon, Jun 25

dmgreen committed rL335481: [DA] Delinearise AddRecs if we can prove they don't wrap.
Mon, Jun 25, 8:18 AM
dmgreen closed D48481: [DA] Delinearise AddRecs if we can prove they don't wrap.
Mon, Jun 25, 8:18 AM
dmgreen added a comment to D48481: [DA] Delinearise AddRecs if we can prove they don't wrap.

Cheers!

Mon, Jun 25, 8:17 AM

Fri, Jun 22

dmgreen added a comment to D48383: [Dominators] Add the DomTreeUpdater class.

Hello. Glad to see this is moving forward! This looks like it's going to be very useful.

Fri, Jun 22, 7:54 AM
dmgreen created D48481: [DA] Delinearise AddRecs if we can prove they don't wrap.
Fri, Jun 22, 6:39 AM

Thu, Jun 21

dmgreen committed rL335249: [ARM] Enable useAA() for the in-order Cortex-R52.
Thu, Jun 21, 8:53 AM
dmgreen closed D48074: [ARM] Enable useAA() for the in-order Cortex-R52.
Thu, Jun 21, 8:53 AM
dmgreen committed rL335217: [DA] Enable -da-delinearize by default.
Thu, Jun 21, 4:59 AM
dmgreen closed D45872: [DA] Enable -da-delinearize by default.
Thu, Jun 21, 4:59 AM
dmgreen added a comment to D45872: [DA] Enable -da-delinearize by default.

Thanks!

Thu, Jun 21, 4:32 AM
dmgreen committed rL335210: [DAGCombine] Fix alignment for offset loads/stores.
Thu, Jun 21, 1:34 AM
dmgreen closed D48029: [DAGCombine] Fix alignment for offset loads/stores.
Thu, Jun 21, 1:34 AM
dmgreen added a comment to D48029: [DAGCombine] Fix alignment for offset loads/stores.

Thanks Eli.

Thu, Jun 21, 1:34 AM

Wed, Jun 20

dmgreen updated the diff for D48029: [DAGCombine] Fix alignment for offset loads/stores.

Very minor update as NewLoad is unused in release builds.

Wed, Jun 20, 2:41 AM
dmgreen accepted D48202: Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred..

LGTM, thanks.

Wed, Jun 20, 1:20 AM

Tue, Jun 19

dmgreen accepted D48279: [PatternMatch] Add m_Store pattern match helper.

LGTM. This looks simple enough. Perhaps wait long enough for anyone in a different time zone to object.

Tue, Jun 19, 7:44 AM
dmgreen added a comment to D48202: Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred..

You may need to update Transforms/LoopSimplifyCFG/scev.ll now too (which is hopefully simple). Otherwise this looks sensible to me.

Tue, Jun 19, 7:20 AM
dmgreen added inline comments to D48029: [DAGCombine] Fix alignment for offset loads/stores.
Tue, Jun 19, 6:26 AM
dmgreen added inline comments to D48279: [PatternMatch] Add m_Store pattern match helper.
Tue, Jun 19, 4:55 AM
dmgreen updated the diff for D48029: [DAGCombine] Fix alignment for offset loads/stores.

I have changed these to asserts, which I believe will never fire. I'm not sure if it breaks the design of DAGCombiner to work like this, just refining the alignment on the MMO.

Tue, Jun 19, 4:44 AM
dmgreen committed rL335036: [LoopSimplifyCFG] Invalidate SCEV in LoopSimplifyCFG.
Tue, Jun 19, 2:48 AM
dmgreen closed D48258: [LoopSimplifyCFG] Preserve Scalar Evolution in LoopSimplifyCFG.
Tue, Jun 19, 2:48 AM
dmgreen added a comment to D48258: [LoopSimplifyCFG] Preserve Scalar Evolution in LoopSimplifyCFG.

Thanks

Tue, Jun 19, 2:47 AM

Mon, Jun 18

dmgreen added inline comments to D48279: [PatternMatch] Add m_Store pattern match helper.
Mon, Jun 18, 6:56 AM
dmgreen added a comment to D48029: [DAGCombine] Fix alignment for offset loads/stores.

Ping. Does this look OK now?

Mon, Jun 18, 3:12 AM
dmgreen added a comment to D48074: [ARM] Enable useAA() for the in-order Cortex-R52.

Yes, I can see that. I would have liked to turn this on for more in-order cores, but without enough scheduling information to at least say that a load takes multiple cycles, I didn't feel I had a great justification. For the record, these were the changes I saw on an A53 with useAA returning true (units are time, so lower is better; these are the changes of more than 2%):

Mon, Jun 18, 3:11 AM
dmgreen added a comment to D45872: [DA] Enable -da-delinearize by default.

Yes, I agree. There are hopefully multiple improvements we can make in the future. I remember seeing one patch recently (D40369) that should help delinearise when sexts are involved (which I think happens a lot on 64-bit machines). One thing that would probably be good to add somehow would be versioning based on the needs of the loop. For example, in your case it could check that ss is more than w, or in a more general case check that arrays do not alias. I know Polly is very good at that, but it's not something we have for DA yet.

Mon, Jun 18, 3:08 AM

Sun, Jun 17

dmgreen added a comment to D48256: Fix bug to merge away entry block and update DT correctly..

It is worth looking at D48202, which plans to remove this function in favour of MergeBlockIntoPredecessor (which does the merge the other way, so doesn't run into the same problem when removing the entry block).

Sun, Jun 17, 12:40 PM

Sat, Jun 16

dmgreen added inline comments to D48202: Generalize MergeBlockIntoPredecessor. Replace uses of MergeBasicBlockIntoOnlyPred..
Sat, Jun 16, 3:08 PM
dmgreen created D48258: [LoopSimplifyCFG] Preserve Scalar Evolution in LoopSimplifyCFG.
Sat, Jun 16, 3:06 PM

Jun 15 2018

dmgreen added a comment to D45872: [DA] Enable -da-delinearize by default.

Hello. Thanks for running the tests. I did not know that there were uses outside of loop interchange and now unroll and jam. It's good to know it will get some more use.

Jun 15 2018, 2:30 AM

Jun 14 2018

dmgreen updated the diff for D41953: [LoopUnroll] Unroll and Jam.

OK. One DA patch is in, the other is still waiting for review.

Jun 14 2018, 5:57 AM

Jun 13 2018

dmgreen accepted D47917: [ARM] Lower llvm.ctlz.i32 to a libcall when clz is not available..

So we (Arm Compiler 6) don't ship/compile against compiler-rt. At least not at the moment. I'm not sure why; it's been like that from before my time, and we have just survived like that for a long while. I'm guessing our C library has always just filled in the gaps (at least the parts that we need).

Jun 13 2018, 8:21 AM
dmgreen added a comment to D48074: [ARM] Enable useAA() for the in-order Cortex-R52.

Thanks

Jun 13 2018, 2:25 AM

Jun 12 2018

dmgreen added a reviewer for D48074: [ARM] Enable useAA() for the in-order Cortex-R52: hfinkel.
Jun 12 2018, 5:09 AM
dmgreen edited reviewers for D48074: [ARM] Enable useAA() for the in-order Cortex-R52, added: t.p.northover, rengolin; removed: hfinkel.
Jun 12 2018, 5:09 AM
dmgreen created D48074: [ARM] Enable useAA() for the in-order Cortex-R52.
Jun 12 2018, 5:00 AM
dmgreen added a comment to D48074: [ARM] Enable useAA() for the in-order Cortex-R52.

Requires D48029 to survive a bootstrap, but that looks like a more generic error than one specific to using this option. Otherwise I believe this is safe.

Jun 12 2018, 5:00 AM
dmgreen added a comment to D47917: [ARM] Lower llvm.ctlz.i32 to a libcall when clz is not available..

I like the idea I think. Should this be guarded by some sort of gnueabi though?

Jun 12 2018, 3:00 AM

Jun 11 2018

dmgreen updated the diff for D48029: [DAGCombine] Fix alignment for offset loads/stores.

getOriginalAlignment() -> getAlignment()

Jun 11 2018, 9:30 AM
dmgreen added inline comments to D48029: [DAGCombine] Fix alignment for offset loads/stores.
Jun 11 2018, 9:26 AM
dmgreen added a comment to D48029: [DAGCombine] Fix alignment for offset loads/stores.

Hello.

Jun 11 2018, 9:06 AM
dmgreen created D48029: [DAGCombine] Fix alignment for offset loads/stores.
Jun 11 2018, 8:31 AM
dmgreen added a comment to D45872: [DA] Enable -da-delinearize by default.

https://rise4fun.com/Z3/hP6p is a slightly cleaner version of the proof for the change in GCD.ll.

Jun 11 2018, 5:11 AM

Jun 6 2018

dmgreen added a comment to D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

I have a preference for using the underscores as our primary spelling. I think that it's easier to read.

Jun 6 2018, 10:04 AM
dmgreen committed rL334099: [GlobalMerge] Set the alignment on merged global structs.
Jun 6 2018, 7:53 AM
dmgreen closed D47633: [GlobalMerge] Set the alignment on merged global structs.
Jun 6 2018, 7:52 AM

Jun 5 2018

dmgreen added a comment to D47633: [GlobalMerge] Set the alignment on merged global structs.

Thanks.

Jun 5 2018, 3:05 PM
dmgreen added a comment to D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

I quite like the UnrollAndFuse naming. I'd not heard that the xlc compiler called it that. The UnrollAndJam pass was originally named that before I renamed it for similar reasons (UnrollAndJam being more well known).

Jun 5 2018, 3:04 PM
dmgreen added a comment to D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

I noticed in the paper that you used the name "unrollandjam", minus underscores. Should I change this to use that spelling here? I have no strong opinion of one over the other (I was just using what I had found in the Intel docs).

Jun 5 2018, 7:16 AM
dmgreen updated the diff for D47633: [GlobalMerge] Set the alignment on merged global structs.

Added a comment. I could change it to calculate the MaxAlignment from the alignment on the globals, if you think that's better in the long run.

Jun 5 2018, 7:15 AM

Jun 4 2018

dmgreen updated the diff for D45872: [DA] Enable -da-delinearize by default.

Thanks for the comments. I've moved things around as a result of your suggestions.

Jun 4 2018, 10:06 AM

Jun 1 2018

dmgreen updated the diff for D45872: [DA] Enable -da-delinearize by default.

Rebase

Jun 1 2018, 9:09 AM
dmgreen created D47633: [GlobalMerge] Set the alignment on merged global structs.
Jun 1 2018, 7:00 AM

May 31 2018

dmgreen committed rL333658: [DA] Fix direction vectors for weakZeroSrcSIV.
May 31 2018, 8:01 AM
dmgreen closed D46678: [DA] Fix direction vectors for weakZeroSrcSIV.
May 31 2018, 8:01 AM
dmgreen added a comment to D46678: [DA] Fix direction vectors for weakZeroSrcSIV.

Thanks! I have a couple more DA patches around, but they may not be as straightforward as this one.

May 31 2018, 8:01 AM

May 30 2018

dmgreen updated the diff for D41953: [LoopUnroll] Unroll and Jam.

Attempted to clean up tests and make them not rely on arm. Removed the outer loop IV check, some other small code edits and some comment improvements/formatting.

May 30 2018, 9:57 AM

May 29 2018

dmgreen planned changes to D41953: [LoopUnroll] Unroll and Jam.
May 29 2018, 11:47 AM
dmgreen reopened D41953: [LoopUnroll] Unroll and Jam.

There are a number of obvious enhancements that might be worthwhile, but I think it's best to work on those after this lands.

May 29 2018, 11:47 AM
dmgreen added reviewers for D46678: [DA] Fix direction vectors for weakZeroSrcSIV: hiraditya, philip.pfaffe, bcahoon.

friendly ping

May 29 2018, 11:07 AM
dmgreen added inline comments to D47408: [PM/LoopUnswitch] When using the new SimpleLoopUnswitch pass, schedule loop-cleanup passes at the beginning of the loop pass pipeline, and re-enqueue loops after even trivial unswitching..
May 29 2018, 4:27 AM

May 28 2018

dmgreen added a comment to D47113: [CVP] Teach CorrelatedValuePropagation to reduce the width of lshr instruction..

Sorry, I forgot to mention the important fact that those results were for Arm, specifically thumbv8m.baseline on a cortex-m23 (where the pointers are 32-bit, and only i32 is a legal type). Try this data layout for Arm code:
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"

May 28 2018, 3:46 AM
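
As a rough illustration of what the data layout string in the comment above encodes, the following sketch pulls out a few of its fields. The helper name `parse_datalayout` and the returned dictionary keys are hypothetical and exist only for this example; it handles just the `e`/`E`, default-address-space `p`, `n`, and `S` specifiers and ignores everything else.

```python
def parse_datalayout(layout: str) -> dict:
    """Extract a few fields from an LLVM datalayout string.

    Hypothetical helper for illustration only; it is not part of LLVM
    and it deliberately skips specifiers it does not understand.
    """
    info = {
        "endian": None,
        "pointer_bits": None,
        "native_int_bits": [],
        "stack_align_bits": None,
    }
    for spec in layout.split("-"):
        if spec == "e":
            info["endian"] = "little"
        elif spec == "E":
            info["endian"] = "big"
        elif spec.startswith("p"):
            # p[n]:<size>:<abi>[:<pref>]; only address space 0 is handled.
            head, *fields = spec.split(":")
            if head in ("p", "p0"):
                info["pointer_bits"] = int(fields[0])
        elif spec.startswith("n"):
            # n<w1>:<w2>... lists the native integer widths.
            info["native_int_bits"] = [int(w) for w in spec[1:].split(":")]
        elif spec.startswith("S"):
            info["stack_align_bits"] = int(spec[1:])
    return info


LAYOUT = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
parsed = parse_datalayout(LAYOUT)
print(parsed)
```

For this string the sketch reports 32-bit pointers and a single native integer width of 32, which matches the description of the target in the comment.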

May 27 2018

dmgreen added a comment to D41953: [LoopUnroll] Unroll and Jam.

Reverted in 333359 as it's failing on some of the builders. I believe the tests are dependent on the Arm backend being built.

May 27 2018, 6:30 AM
dmgreen committed rL333359: Revert 333358 as it's failing on some builders..
May 27 2018, 5:58 AM
dmgreen added a reverting commit for rL333358: [UnrollAndJam] Add a new Unroll and Jam pass: rL333359: Revert 333358 as it's failing on some builders..
May 27 2018, 5:58 AM
dmgreen committed rL333358: [UnrollAndJam] Add a new Unroll and Jam pass.
May 27 2018, 5:15 AM
dmgreen closed D41953: [LoopUnroll] Unroll and Jam.
May 27 2018, 5:15 AM

May 24 2018

dmgreen added a comment to D47113: [CVP] Teach CorrelatedValuePropagation to reduce the width of lshr instruction..

So yes, I ran some quick benchmarks and I believe this will cause regressions in some circumstances. In one case I looked at (which is running under our special LTO pipeline and may be a little difficult to replicate), we start off with this:

%shr = lshr i32 %sub, 6
%arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr

This is turned into:

%shr.lhs.trunc = trunc i32 %sub to i16
%shr.rhs.trunc = trunc i32 6 to i16
%shr = lshr i16 %shr.lhs.trunc, %shr.rhs.trunc
%shr.zext = zext i16 %shr to i32
%arrayidx = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext

Which gets turned right back into:

%shr = lshr i32 %sub, 6
%shr.zext = and i32 %shr, 1023
%arrayidx11 = getelementptr inbounds i16, i16* %AllocationMap, i32 %shr.zext
May 24 2018, 7:27 AM
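
The three IR snippets in the D47113 comment above can be modelled with plain integer arithmetic. This is only a sketch: the function names are made up for the example, and it assumes (this is not stated in the comment) that the narrowing is applied only when `%sub` is known to fit in 16 bits. Under that assumption all three forms agree, but the final form carries an extra `and` compared with the original.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def original(sub: int) -> int:
    # %shr = lshr i32 %sub, 6
    return (sub & MASK32) >> 6


def narrowed(sub: int) -> int:
    # trunc both operands to i16, lshr i16, then zext back to i32
    lhs = sub & MASK16
    rhs = 6 & MASK16
    return lhs >> rhs


def recombined(sub: int) -> int:
    # %shr = lshr i32 %sub, 6 ; %shr.zext = and i32 %shr, 1023
    return ((sub & MASK32) >> 6) & 1023


# The narrowed and recombined forms agree for every 32-bit input.
for value in (0, 63, 64, 0xFFFF, 0x10000, 0x12345, MASK32):
    assert narrowed(value) == recombined(value)

# They match the original only while the input fits in 16 bits.
assert original(0xFFFF) == narrowed(0xFFFF)
assert original(0x10000) != narrowed(0x10000)
```

The mask of 1023 follows from the widths involved: a 16-bit value shifted right by 6 has at most 10 significant bits.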
dmgreen updated the diff for D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

This splits out the pragma clang loop unroll_and_jam handling into D47320, for if/when we need it. Which I believe is what you wanted, correct me if I'm wrong.

May 24 2018, 5:53 AM
dmgreen created D47320: [UnrollAndJam] Add pragma clang loop unroll_and_jam handling..
May 24 2018, 5:51 AM

May 23 2018

dmgreen added a comment to D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.

In my experience, they are used.

May 23 2018, 11:33 AM
dmgreen created D47267: [UnrollAndJam] Add unroll_and_jam pragma handling.
May 23 2018, 9:00 AM

May 22 2018

dmgreen updated the diff for D41953: [LoopUnroll] Unroll and Jam.

Review Updates.

May 22 2018, 4:23 AM

May 21 2018

dmgreen updated the diff for D41953: [LoopUnroll] Unroll and Jam.

OK thanks. I will leave this for at least a couple more days whilst I run extra tests. Any other comments from anyone are welcome/appreciated.

May 21 2018, 10:57 AM
dmgreen added inline comments to D41953: [LoopUnroll] Unroll and Jam.
May 21 2018, 10:57 AM
dmgreen closed D46893: [CVP] Require DomTree for new Pass Manager.

I also forgot to mention I took a quick look and it appears that getBestSimplifyQuery is only used in two places, here in CVP and LoopRotate.

May 21 2018, 4:17 AM
dmgreen committed rL332836: [CVP] Require DomTree for new Pass Manager.
May 21 2018, 4:13 AM
dmgreen updated the diff for D46893: [CVP] Require DomTree for new Pass Manager.

Thanks for the info. I see the problem. I will commit this, to make it so this isn't crashing, and we can go from there.

May 21 2018, 4:07 AM

May 18 2018

dmgreen added inline comments to D47041: [ValueTracking] Teach computeKnownBits that the result of an absolute value pattern that uses nsw flag is always positive..
May 18 2018, 5:11 AM
dmgreen added a comment to D46678: [DA] Fix direction vectors for weakZeroSrcSIV.

Ping :)

May 18 2018, 2:56 AM
dmgreen added a comment to D46893: [CVP] Require DomTree for new Pass Manager.

I may have sowed some confusion here by not updating commit messages as the code changed.

May 18 2018, 2:55 AM