dmgreen (Dave Green)
User

Projects

User does not belong to any projects.

User Details

User Since
May 24 2016, 8:35 AM (129 w, 5 d)

Recent Activity

Tue, Nov 13

dmgreen updated the diff for D50141: Add errors for tiny codemodel on targets other than AArch64.

I'm happy to change stuff around. I believe it's the way it is at the moment because of the place this is called, halfway into a constructor (making virtuals difficult), and because some archs require different pieces of info (like AArch64 being different for JITs). I figured if in doubt, don't change too much at once.

Tue, Nov 13, 10:04 AM
dmgreen added a comment to D54411: [Codegen] Merge tail blocks with no successors after block placement.

Hello. I agree that this looks like an improvement. Can you add a testcase?

Tue, Nov 13, 8:25 AM

Wed, Nov 7

dmgreen updated the diff for D54142: [ARM] Cortex-M4 schedule.

Cleanup using tablegen classes.

Wed, Nov 7, 3:09 AM
dmgreen added inline comments to D54142: [ARM] Cortex-M4 schedule.
Wed, Nov 7, 3:09 AM

Tue, Nov 6

dmgreen added a comment to D54142: [ARM] Cortex-M4 schedule.

Unfortunately, this also increased codesize a little at -Oz, which I will have to look into.

Tue, Nov 6, 2:49 AM
dmgreen created D54142: [ARM] Cortex-M4 schedule.
Tue, Nov 6, 2:46 AM

Mon, Nov 5

dmgreen planned changes to D53405: [Inliner] Attempt to more accurately model the cost of loops at minsize.

(but can cause problems for cases where the blocks are not in the form they will appear in assembly).

I'm not sure what sort of issue you're running into here?

Mon, Nov 5, 8:17 AM
dmgreen committed rL346134: [Inliner] Penalise inlining of calls with loops at Oz.
[Inliner] Penalise inlining of calls with loops at Oz
Mon, Nov 5, 6:56 AM
dmgreen closed D52716: [Inliner] Penalise inlining of calls with loops at Oz.
Mon, Nov 5, 6:56 AM

Sun, Nov 4

dmgreen added inline comments to D54021: [LoopSimplifyCFG] Teach LoopSimplifyCFG to constant-fold branches and switches.
Sun, Nov 4, 1:02 PM

Sat, Nov 3

dmgreen added reviewers for D53980: [ARM, AArch64] Move ARM/AArch64 target parsers into separate files to enable future changes.: SjoerdMeijer, olista01, efriedma, peter.smith, t.p.northover.

Hello! Drive-by comments. I've not looked in any depth.

Sat, Nov 3, 1:31 PM

Thu, Nov 1

dmgreen updated subscribers of D53876: Preserve loop metadata when splitting exit blocks.
Thu, Nov 1, 2:31 AM

Wed, Oct 31

dmgreen added a comment to D52716: [Inliner] Penalise inlining of calls with loops at Oz.

Friendly Ping

Wed, Oct 31, 3:21 PM

Thu, Oct 25

dmgreen accepted D53582: [AArch64] Add EXT patterns for 64-bit EXT of a subvector of a 128-bit vector.

I've managed to convince myself that this looks OK. A couple of nits depending on what you think of them.

Thu, Oct 25, 4:22 AM
dmgreen accepted D53580: [AArch64] Refactor definition of EXT patterns to use a multiclass.

Nice cleanup. LGTM

Thu, Oct 25, 4:20 AM
dmgreen accepted D53579: [AArch64] Do 64-bit vector move of 0 and -1 by extracting from the 128-bit move.

I was thinking about how this might affect other little cores like the A53/A55, especially around the dual issue on q registers. I don't think it will make much difference though, and the CSE benefits look like a bigger win.

Thu, Oct 25, 4:20 AM

Wed, Oct 24

dmgreen updated the diff for D52716: [Inliner] Penalise inlining of calls with loops at Oz.

Now ignores loops that will never be executed. I also have some code that uses SCEV to calculate if the backedge count is <= 1 and allow inlining there. It doesn't seem to come up very often though and needed some plumbing to get SE's/TLI's in the right places.

Wed, Oct 24, 9:12 AM

Mon, Oct 22

dmgreen accepted D53453: [ARM] Make InstrEmitter mark CPSR defs dead for Thumb1..

Anyway, this looks good to me.

Mon, Oct 22, 3:18 AM
dmgreen updated subscribers of D53453: [ARM] Make InstrEmitter mark CPSR defs dead for Thumb1..

Interesting. I didn't realise it worked like that. I had presumed that a lot of passes would have to be taught about the optional defs, as opposed to them not being marked as dead correctly.

Mon, Oct 22, 3:10 AM
dmgreen accepted D53452: [ARM] Allow TBB formation for Thumb1 in more cases..

Yep, this came up a few times in the tests I ran. Looks like a nice improvement, and I don't think it complicates things too much.

Mon, Oct 22, 2:49 AM

Oct 18 2018

dmgreen created D53405: [Inliner] Attempt to more accurately model the cost of loops at minsize.
Oct 18 2018, 10:35 AM
dmgreen closed D51780: ARM: align loops to 4 bytes on Cortex-M3 and Cortex-M4..
Oct 18 2018, 1:41 AM

Oct 16 2018

dmgreen added a comment to D53136: [LNT] Come up with MIN_PERCENTAGE_CHANGE value.

This looks like something that would be useful for us, where some of our benchmarks are very low noise. I currently have a couple of test-ish LNT instances with changes similar to this, either changing this value or setting ignore_small=False from the daily report page.

Oct 16 2018, 2:11 PM
Herald updated subscribers of D53190: ARM: avoid infinite combining loop.
Oct 16 2018, 9:13 AM
Herald updated subscribers of D32564: AArch64: compress jump tables to minimum size needed to reach destinations.
Oct 16 2018, 9:12 AM

Oct 15 2018

dmgreen added inline comments to D52508: [InstCombine] Clean up after IndVarSimplify.
Oct 15 2018, 12:41 PM
dmgreen added a comment to D52508: [InstCombine] Clean up after IndVarSimplify.

I think we should deal with do while in another patch.

Yeah, defo. I just need to come up with a sensible way to fix it. I feel some of this is pushing against the edges of what instcombine should be doing, but I'll keep pushing until someone tells me to stop.

Oct 15 2018, 2:52 AM
dmgreen updated the diff for D52508: [InstCombine] Clean up after IndVarSimplify.
Oct 15 2018, 2:51 AM

Oct 14 2018

dmgreen added a comment to D53245: Teach the DominatorTree fallback to recalculation when applying updates to speedup JT (PR37929).

Interesting. Nice improvements. What about small trees? It would seem that any tree with fewer than 75 nodes would always be recalculated. Do the timings you ran include things to show that this is better? Or was that just looking at larger trees at the time?

Oct 14 2018, 1:22 PM

Oct 12 2018

dmgreen accepted D53177: [builtins] Implement __aeabi_uread4/8 and __aeabi_uwrite4/8..

Yeah, I agree. LGTM.

Oct 12 2018, 1:29 PM
dmgreen added a comment to D53177: [builtins] Implement __aeabi_uread4/8 and __aeabi_uwrite4/8..

Nice idea. The rtabi seems to say:

Write functions return the value written, read functions the value read.
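
The return-value rule quoted above can be modelled with a small sketch. This is only a Python illustration over a byte buffer, not the compiler-rt implementation: the function names mirror the RTABI helpers, but the `mem`/`addr` parameters and the little-endian layout are assumptions made for the example.

```python
import struct

# Hypothetical Python models of the RTABI unaligned-access helpers.
# "mem" is a bytearray standing in for memory; "addr" is a byte offset
# that may be unaligned. Little-endian layout is assumed.

def aeabi_uread4(mem, addr):
    """Read a 4-byte value at any offset and return the value read."""
    return struct.unpack_from("<i", mem, addr)[0]

def aeabi_uwrite4(value, mem, addr):
    """Write a 4-byte value at any offset and return the value written."""
    struct.pack_into("<i", mem, addr, value)
    return value

def aeabi_uread8(mem, addr):
    """Read an 8-byte value at any offset and return the value read."""
    return struct.unpack_from("<q", mem, addr)[0]

def aeabi_uwrite8(value, mem, addr):
    """Write an 8-byte value at any offset and return the value written."""
    struct.pack_into("<q", mem, addr, value)
    return value

buf = bytearray(16)
written = aeabi_uwrite4(0x12345678, buf, 1)   # offset 1 is not 4-byte aligned
read_back = aeabi_uread4(buf, 1)
```

Returning the written value lets a caller chain the store with further use of the value, which is the behaviour the quoted sentence describes.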

Oct 12 2018, 5:49 AM
dmgreen added a comment to D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Yeah, that looks like similar IR to what I was looking at. The vectorised version on Skylake (https://godbolt.org/z/RBS2Os) has a lot of shuffling; perhaps that's deemed unprofitable on Goldmont?

Oct 12 2018, 5:42 AM

Oct 11 2018

dmgreen updated subscribers of D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Oh, no. That's not what I wanted to hear. I presume we are looking at the same bit of code!

Oct 11 2018, 3:21 PM
dmgreen added inline comments to D52508: [InstCombine] Clean up after IndVarSimplify.
Oct 11 2018, 11:32 AM
dmgreen updated the diff for D52508: [InstCombine] Clean up after IndVarSimplify.

Whilst we're here, can anyone think of a good way to simplify:
(S + -32) - (-32 & (S + umax(31 - S, -32)))
That's the "do" case. I think if we distribute the -32& through the add, that with the rest of instcombine + cse + instcombine again does get us down to:
S & 31
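
As a sanity check on the simplification above, the expression can be evaluated directly. This is a minimal sketch that assumes 32-bit wrap-around arithmetic (the comment does not state the width); `do_case`, `umax`, and `check` are names made up for the example.

```python
MASK = 0xFFFFFFFF  # assumption: model i32 wrap-around arithmetic

def umax(a, b):
    """Unsigned maximum of two values already reduced modulo 2**32."""
    return a if a >= b else b

def do_case(s):
    """Evaluate (S + -32) - (-32 & (S + umax(31 - S, -32))) as unsigned i32."""
    neg32 = -32 & MASK
    inner = umax((31 - s) & MASK, neg32)
    lhs = (s + neg32) & MASK
    rhs = neg32 & ((s + inner) & MASK)
    return (lhs - rhs) & MASK

def check(values):
    """True if the expression equals S & 31 for every sampled S."""
    return all(do_case(s) == (s & 31) for s in values)

# Sample both ends of the range plus the signed boundary values.
samples = list(range(4096)) + [MASK - i for i in range(4096)] + [0x7FFFFFFF, 0x80000000]
result = check(samples)
```

The reasoning behind it: the umax only picks `31 - S` when S lies in 32..63, where the masked term becomes 0; everywhere else the masked term rounds `S - 32` down to a multiple of 32. Either way the difference is the low five bits of S.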

Oct 11 2018, 11:32 AM
dmgreen updated the diff for D52508: [InstCombine] Clean up after IndVarSimplify.

OK, now one big ol' matcher. Thanks for the suggestions.

Oct 11 2018, 8:02 AM
dmgreen committed rL344239: [InstCombine] Demand bits of UMin.
[InstCombine] Demand bits of UMin
Oct 11 2018, 4:31 AM
dmgreen closed D53036: [InstCombine] Demand bits of UMin.
Oct 11 2018, 4:31 AM
dmgreen added a comment to D53036: [InstCombine] Demand bits of UMin.

This is the page I look at as reference for alive coding:
https://github.com/nunoplopes/alive/blob/newsema/rise4fun/language

Oct 11 2018, 4:09 AM
dmgreen committed rL344237: [InstCombine] Demand bits of UMax.
[InstCombine] Demand bits of UMax
Oct 11 2018, 4:06 AM
dmgreen closed D53033: [InstCombine] Demand bits of UMAX.
Oct 11 2018, 4:06 AM
dmgreen committed rL344236: [InstCombine] Add tests for demand bits of min/max. NFC..
[InstCombine] Add tests for demand bits of min/max. NFC.
Oct 11 2018, 3:48 AM

Oct 10 2018

dmgreen updated subscribers of D33935: Allow rematerialization of ARM Thumb MOVi8 instruction in some contexts.

I ran some benchmarks for Thumb1, and they look great.

Oct 10 2018, 10:31 AM
dmgreen added inline comments to D52508: [InstCombine] Clean up after IndVarSimplify.
Oct 10 2018, 8:54 AM
dmgreen added inline comments to D53036: [InstCombine] Demand bits of UMin.
Oct 10 2018, 7:01 AM
dmgreen updated the diff for D53036: [InstCombine] Demand bits of UMin.

Just added a new test, and changed some of the others around a little.

Oct 10 2018, 7:01 AM
dmgreen updated the diff for D52508: [InstCombine] Clean up after IndVarSimplify.

I've taken the demand parts out of this, and added some extra tests for the do / while and signed / unsigned cases.

Oct 10 2018, 4:48 AM
dmgreen updated the diff for D53036: [InstCombine] Demand bits of UMin.

Luckily, it appears that Alive can do countLeadingOnes too (although I don't see it anywhere in the sources I have).
https://rise4fun.com/Alive/O9i
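
For readers without Alive to hand, the kind of demanded-bits fact involved can be brute-forced at a small width. The conditions below are an assumption about the shape of the fold, not quoted from the patch: umax(A, C) agrees with A on the demanded bits when the lowest demanded bit is at or above C's active bits, and umin(A, C) agrees with A when the lowest demanded bit is at or above the width minus C's leading ones. All function names are hypothetical.

```python
BITS = 8                 # assumption: small width so every case can be enumerated
MASK = (1 << BITS) - 1

def active_bits(c):
    """Number of bits needed to represent c (index of highest set bit + 1)."""
    return c.bit_length()

def count_leading_ones(c):
    """Number of consecutive one bits starting from the top bit of c."""
    n = 0
    for bit in range(BITS - 1, -1, -1):
        if (c >> bit) & 1:
            n += 1
        else:
            break
    return n

def lowest_demanded_bit(demanded):
    """Index of the lowest set bit in the demanded mask (BITS if the mask is empty)."""
    return (demanded & -demanded).bit_length() - 1 if demanded else BITS

def umax_is_lhs(c, demanded):
    return lowest_demanded_bit(demanded) >= active_bits(c)

def umin_is_lhs(c, demanded):
    return lowest_demanded_bit(demanded) >= BITS - count_leading_ones(c)

def verify():
    """Check both conditions against every A, C and every 'high bits only' mask."""
    for c in range(1 << BITS):
        for k in range(BITS):
            demanded = (MASK << k) & MASK
            for a in range(1 << BITS):
                if umax_is_lhs(c, demanded) and (max(a, c) & demanded) != (a & demanded):
                    return False
                if umin_is_lhs(c, demanded) and (min(a, c) & demanded) != (a & demanded):
                    return False
    return True

all_ok = verify()
```

Masks of the form "all bits from k upward" are enough here, because any other mask with the same lowest set bit demands a subset of those bits.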

Oct 10 2018, 3:54 AM
dmgreen updated the diff for D53033: [InstCombine] Demand bits of UMAX.

Updated to use activeBits. Thanks for the suggestions.

Oct 10 2018, 3:46 AM

Oct 9 2018

dmgreen added a comment to D53033: [InstCombine] Demand bits of UMAX.

Ah, I must have been looking at the wrong bit of code. Thanks!

Oct 9 2018, 12:25 PM
dmgreen created D53036: [InstCombine] Demand bits of UMin.
Oct 9 2018, 12:18 PM
dmgreen created D53033: [InstCombine] Demand bits of UMAX.
Oct 9 2018, 12:00 PM
dmgreen added a comment to D52508: [InstCombine] Clean up after IndVarSimplify.

It turns out the other case I ran into above ((S + -32) - (32 & (S + umax(31 - S, -32)))) was from do loops, not while loops. Signed will also be different to unsigned, with signed cases not having quite as small simplified forms.
These are the cases:
https://godbolt.org/z/SE-xhD
With some possible simplifications:
https://rise4fun.com/Alive/slxj

Oct 9 2018, 11:57 AM
dmgreen added a comment to D53005: Implement machine unroller utility class.

Hello, see the part about context in https://llvm.org/docs/Phabricator.html#phabricator-request-review-web. It's easier to review things if we can see the code around the patch as well as the code in the patch.

Oct 9 2018, 11:34 AM
dmgreen added a comment to D53005: Implement machine unroller utility class.

Hello. Very nice. I don't think I can speak to much of the detail here, especially the Hexagon parts, but can you:

  • Add full context to the patch (-U99999)
  • Replace the copyright headers to be more "llvmy"
Oct 9 2018, 2:48 AM

Oct 5 2018

dmgreen committed rC343843: [AArch64] Use filecheck captures for metadata node numbers in test. NFC.
[AArch64] Use filecheck captures for metadata node numbers in test. NFC
Oct 5 2018, 3:23 AM
dmgreen committed rL343843: [AArch64] Use filecheck captures for metadata node numbers in test. NFC.
[AArch64] Use filecheck captures for metadata node numbers in test. NFC
Oct 5 2018, 3:23 AM

Oct 2 2018

dmgreen updated the diff for D52716: [Inliner] Penalise inlining of calls with loops at Oz.

I've added a new memcpy test from the original reproducer. It's a byte memcpy (people seem to love writing those), which I think is worth focusing on because it's small, but still increases codesize. It expands to:

Oct 2 2018, 11:38 AM
dmgreen planned changes to D52508: [InstCombine] Clean up after IndVarSimplify.

Unfortunately, since I did this everything seems to have changed and we now end up with something like:
(S + -32) - (32 & (S + umax(31 - S, -32)))

Oct 2 2018, 6:04 AM
dmgreen committed rL343569: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.
[InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A
Oct 2 2018, 2:50 AM
dmgreen closed D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.
Oct 2 2018, 2:50 AM
dmgreen added a comment to D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Let me know if you see anything funny from this patch.
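
For anyone wanting to spot-check the fold in this patch's title, ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A, an exhaustive check at a small bit width is cheap. This is a minimal sketch that assumes 8-bit wrap-around arithmetic and covers both the unsigned and signed flavours of min/max; the helper names are made up for the example and it is not a substitute for the Alive proofs.

```python
BITS = 8                 # assumption: small width so all (A, O) pairs can be enumerated
MASK = (1 << BITS) - 1

def to_signed(x):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x

def bnot(x):
    """Bitwise not within BITS bits."""
    return ~x & MASK

def smin(a, b):
    return a if to_signed(a) <= to_signed(b) else b

def smax(a, b):
    return a if to_signed(a) >= to_signed(b) else b

def fold_holds(before_op, after_op):
    """Check ~A - before_op(~A, O) == after_op(A, ~O) - A for all A, O."""
    for a in range(1 << BITS):
        for o in range(1 << BITS):
            before = (bnot(a) - before_op(bnot(a), o)) & MASK
            after = (after_op(a, bnot(o)) - a) & MASK
            if before != after:
                return False
    return True

# (op on the left-hand side, op it becomes on the right-hand side);
# the builtin min/max act as umin/umax on non-negative values.
PAIRS = [(min, max), (max, min), (smin, smax), (smax, smin)]
results = [fold_holds(before_op, after_op) for before_op, after_op in PAIRS]
```

The fold works because bitwise not reverses both the signed and the unsigned ordering, so Min(~A, O) equals ~Max(A, ~O), and subtracting two not-ed values swaps the operands.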

Oct 2 2018, 2:49 AM
dmgreen committed rL343561: [InstCombine] Tests for ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A. NFC.
[InstCombine] Tests for ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A. NFC
Oct 2 2018, 2:09 AM

Oct 1 2018

dmgreen added a comment to D52716: [Inliner] Penalise inlining of calls with loops at Oz.

For loop-noinline.ll the "Simplified" instructions (the ones that cost nothing) appear to be:
br %body
3 x phis
one of the geps in the loop
the gep outside the loop
the ret

Oct 1 2018, 1:20 PM
dmgreen added inline comments to D35035: [InstCombine] Prevent memcpy generation for small data size.
Oct 1 2018, 6:26 AM
dmgreen created D52716: [Inliner] Penalise inlining of calls with loops at Oz.
Oct 1 2018, 4:01 AM

Sep 30 2018

dmgreen abandoned D44043: [DAGCombine] Remove AND in SETCC if we can prove they are unneeded.

Yep sure, with D52177, I no longer have a motivating case for this.

Sep 30 2018, 2:18 AM

Sep 28 2018

dmgreen accepted D52644: [ARM] Prevent DSP and SIM32 being set for v6m.

LGTM, Thanks

Sep 28 2018, 2:36 AM

Sep 26 2018

dmgreen accepted D52470: [ARM/AArch64][v8.5A] Add Armv8.5-A target.

LGTM

Sep 26 2018, 4:31 AM
dmgreen committed rL343091: [CodeGen] Enable tail calls for functions with NonNull attributes..
[CodeGen] Enable tail calls for functions with NonNull attributes.
Sep 26 2018, 3:48 AM
dmgreen closed D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..
Sep 26 2018, 3:47 AM
dmgreen added a comment to D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..

Thanks!

Sep 26 2018, 3:47 AM

Sep 25 2018

dmgreen accepted D52471: [ARM/AArch64] Add target parser unit tests for Armv8.4-A.

LGTM.

Sep 25 2018, 3:19 PM
dmgreen added inline comments to D52470: [ARM/AArch64][v8.5A] Add Armv8.5-A target.
Sep 25 2018, 3:19 PM
dmgreen added inline comments to D52470: [ARM/AArch64][v8.5A] Add Armv8.5-A target.
Sep 25 2018, 3:18 PM
dmgreen updated the diff for D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Added a comment and corrected spelling.

Sep 25 2018, 10:43 AM
dmgreen added a comment to D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

I was hoping to find a more general solution for min/max with nots, but I'm not seeing it, so just a few nits in the inline comments.

Sep 25 2018, 10:32 AM
dmgreen created D52508: [InstCombine] Clean up after IndVarSimplify.
Sep 25 2018, 10:32 AM
dmgreen accepted D52243: [ConstHoist] Do not rebase single (or few) dependent constant.

LGTM, with one minor Nit.

Sep 25 2018, 5:52 AM
dmgreen committed rL342958: [LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder.
[LoopUnroll] Add check to Latch's terminator in UnrollRuntimeLoopRemainder
Sep 25 2018, 3:12 AM
dmgreen closed D51486: Add check to Latch's terminator in UnrollRuntimeLoopRemainder.
Sep 25 2018, 3:12 AM
dmgreen added a comment to D51486: Add check to Latch's terminator in UnrollRuntimeLoopRemainder.

Thanks!

Sep 25 2018, 3:09 AM

Sep 24 2018

dmgreen updated the diff for D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Turns out there wasn't any conflict, but I've tried to clean this up a little and add a few more tests.

Sep 24 2018, 12:01 PM
dmgreen added inline comments to D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..
Sep 24 2018, 7:20 AM
dmgreen accepted D52257: [Thumb1] Any imm of i8 type on Thumb1 should have cost of 1.

Sounds good. I'm happy with this, if no one else has any issues.

Sep 24 2018, 7:19 AM

Sep 21 2018

dmgreen added inline comments to D52243: [ConstHoist] Do not rebase single (or few) dependent constant.
Sep 21 2018, 2:06 AM
dmgreen added a comment to D52243: [ConstHoist] Do not rebase single (or few) dependent constant.

This does seem, from the tests I ran, to reduce codesize on average, especially on Thumb1.

Sep 21 2018, 1:56 AM
dmgreen added a comment to D52257: [Thumb1] Any imm of i8 type on Thumb1 should have cost of 1.

Is this because we can just use a MOVS and won't have to fill in any higher bits? And MOVS's aren't trivially rematerialisable? And Thumb2/Arm are handled by getT2SOImmVal?

Sep 21 2018, 1:20 AM

Sep 20 2018

dmgreen added inline comments to D52289: [ARM] Do not fuse VADD and VMUL on the Cortex-M4 and Cortex-M33.
Sep 20 2018, 3:41 AM

Sep 19 2018

dmgreen updated the diff for D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..

Minor edit to test (and renamed it to tailcall-dup.ll).

Sep 19 2018, 4:43 AM
dmgreen updated the diff for D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..

Added a tailcall duplication opt test to Transforms/CodeGenPrepare.

Sep 19 2018, 4:42 AM
dmgreen added inline comments to D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..
Sep 19 2018, 4:39 AM

Sep 18 2018

dmgreen added inline comments to D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..
Sep 18 2018, 2:50 PM
dmgreen added a comment to D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.

Thanks for pointing to D52070. I think I saw that (it gave me the idea for this), but hadn't realised it had come back out.

Sep 18 2018, 10:34 AM
dmgreen created D52238: [CodeGen] Enable tail calls for functions with NonNull attributes..
Sep 18 2018, 8:51 AM
dmgreen added a comment to D51486: Add check to Latch's terminator in UnrollRuntimeLoopRemainder.

Sorry for the delay. Sure, I can do that.

Sep 18 2018, 3:38 AM
dmgreen committed rL342455: [AArch64] Attempt to parse more operands as expressions.
[AArch64] Attempt to parse more operands as expressions
Sep 18 2018, 2:48 AM
dmgreen closed D51792: [AArch64] Attempt to parse expressions as adr/adrp operands.
Sep 18 2018, 2:48 AM
dmgreen added a comment to D51792: [AArch64] Attempt to parse expressions as adr/adrp operands.

OK. Thanks guys.

Sep 18 2018, 2:47 AM

Sep 17 2018

dmgreen created D52177: [InstCombine] Fold ~A - Min/Max(~A, O) -> Max/Min(A, ~O) - A.
Sep 17 2018, 10:02 AM