
dstuttard (David Stuttard)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 25 2017, 7:29 AM (133 w, 4 d)

Recent Activity

Mon, Jul 29

dstuttard committed rG20235ef3e751: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions (authored by dstuttard).
Mon, Jul 29, 8:16 AM
dstuttard committed rL367206: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions.
Mon, Jul 29, 8:16 AM
dstuttard closed D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions.
Mon, Jul 29, 8:16 AM · Restricted Project

Fri, Jul 26

dstuttard updated the diff for D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions.

Managed to get the fmac test to keep using fmac
Also updated the test to use non-anonymous values

Fri, Jul 26, 9:41 AM · Restricted Project
dstuttard updated the diff for D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions.

Changed test to use fma intrinsic

Fri, Jul 26, 7:21 AM · Restricted Project
dstuttard added a reviewer for D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions: rampitec.

+Stas to comment on the v_fmac_f16 test change.
Is it acceptable to change the result to look for v_pk_fma_f16 rather than 2 v_fmac_f16 instructions? If not, any suggestions on how to get the compiler to generate 2 x fmac instead?

Fri, Jul 26, 5:13 AM · Restricted Project
dstuttard added reviewers for D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions: arsenm, piotr.
Fri, Jul 26, 3:39 AM · Restricted Project
dstuttard created D65325: [AMDGPU] Enable v4f16 and above for v_pk_fma instructions.
Fri, Jul 26, 3:38 AM · Restricted Project

Jul 17 2019

dstuttard abandoned D63639: [AMDGPU] Prevent backend override of WGP when using PAL.

I might revisit this one - setting cumode seems a messy way to enable driver control of the WGP setting, but it seems the most pragmatic option at the moment.

Jul 17 2019, 1:31 AM · Restricted Project
dstuttard added inline comments to D63639: [AMDGPU] Prevent backend override of WGP when using PAL.
Jul 17 2019, 1:31 AM · Restricted Project

Jul 15 2019

dstuttard added a comment to D63639: [AMDGPU] Prevent backend override of WGP when using PAL.

ping

Jul 15 2019, 1:21 AM · Restricted Project

Jun 21 2019

dstuttard added inline comments to D63639: [AMDGPU] Prevent backend override of WGP when using PAL.
Jun 21 2019, 7:10 AM · Restricted Project
dstuttard added reviewers for D63639: [AMDGPU] Prevent backend override of WGP when using PAL: tpr, rampitec.
Jun 21 2019, 2:38 AM · Restricted Project
dstuttard created D63639: [AMDGPU] Prevent backend override of WGP when using PAL.
Jun 21 2019, 2:36 AM · Restricted Project

May 9 2019

dstuttard committed rG411488b11edf: [CodeGenPrepare] Limit recursion depth for collectBitParts (authored by dstuttard).
May 9 2019, 8:00 AM
dstuttard committed rL360347: [CodeGenPrepare] Limit recursion depth for collectBitParts.
May 9 2019, 8:00 AM
dstuttard closed D61728: [CodeGenPrepare] Limit recursion depth for collectBitParts.
May 9 2019, 7:59 AM · Restricted Project
dstuttard created D61728: [CodeGenPrepare] Limit recursion depth for collectBitParts.
May 9 2019, 5:53 AM · Restricted Project
dstuttard added a reviewer for D61728: [CodeGenPrepare] Limit recursion depth for collectBitParts: jmolloy.
May 9 2019, 5:53 AM · Restricted Project

Apr 23 2019

dstuttard accepted D60999: AMDGPU: Fix LCSSA phi lowering in SILowerI1Copies.

LGTM

Apr 23 2019, 4:20 AM · Restricted Project

Mar 20 2019

dstuttard committed rGfc2a74734574: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel (authored by dstuttard).
Mar 20 2019, 2:29 AM
dstuttard committed rL356540: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.
Mar 20 2019, 2:29 AM
dstuttard closed D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.
Mar 20 2019, 2:29 AM · Restricted Project

Mar 18 2019

dstuttard added a comment to D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.

ping

Mar 18 2019, 3:24 AM · Restricted Project

Mar 12 2019

dstuttard committed rG20ea21c6ede8: [AMDGPU] Add support for immediate operand for S_ENDPGM (authored by dstuttard).
Mar 12 2019, 2:52 AM
dstuttard committed rL355902: [AMDGPU] Add support for immediate operand for S_ENDPGM.
Mar 12 2019, 2:52 AM
dstuttard closed D59213: [AMDGPU] Add support for immediate operand for S_ENDPGM.
Mar 12 2019, 2:52 AM · Restricted Project

Mar 11 2019

dstuttard added reviewers for D59213: [AMDGPU] Add support for immediate operand for S_ENDPGM: rampitec, arsenm.
Mar 11 2019, 8:48 AM · Restricted Project
dstuttard created D59213: [AMDGPU] Add support for immediate operand for S_ENDPGM.
Mar 11 2019, 8:45 AM · Restricted Project

Mar 7 2019

dstuttard added inline comments to D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.
Mar 7 2019, 7:31 AM · Restricted Project
dstuttard updated the diff for D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.

Modified test in line with review comments

Mar 7 2019, 7:31 AM · Restricted Project

Mar 6 2019

dstuttard added inline comments to D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.
Mar 6 2019, 4:53 AM · Restricted Project

Mar 5 2019

dstuttard added reviewers for D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel: nhaehnle, tpr.
Mar 5 2019, 6:42 AM · Restricted Project
dstuttard created D58964: [AMDGPU] Allow MIMG with no uses in adjustWritemask in isel.
Mar 5 2019, 6:42 AM · Restricted Project
dstuttard added inline comments to D57737: [AMDGPU] Fix DPP sequence in atomic optimizer..
Mar 5 2019, 5:39 AM · Restricted Project, Restricted Project
dstuttard committed rG81eec58a0d55: [AMDGPU] Omit KILL instructions from hazard recognizer (authored by dstuttard).
Mar 5 2019, 2:25 AM
dstuttard committed rL355384: [AMDGPU] Omit KILL instructions from hazard recognizer.
Mar 5 2019, 2:24 AM
dstuttard closed D58898: [AMDGPU] Omit KILL instructions from hazard recognizer.
Mar 5 2019, 2:24 AM · Restricted Project

Mar 4 2019

dstuttard added a reviewer for D58898: [AMDGPU] Omit KILL instructions from hazard recognizer: arsenm.
Mar 4 2019, 6:32 AM · Restricted Project
dstuttard added reviewers for D58898: [AMDGPU] Omit KILL instructions from hazard recognizer: nhaehnle, sheredom.
Mar 4 2019, 6:30 AM · Restricted Project
dstuttard created D58898: [AMDGPU] Omit KILL instructions from hazard recognizer.
Mar 4 2019, 6:29 AM · Restricted Project

Feb 11 2019

dstuttard accepted D57737: [AMDGPU] Fix DPP sequence in atomic optimizer..

Not really an area I'm 100% sure about - but looks ok to me. One of the other reviewers will have to sign off too.
Minor niggle on the comment (if my understanding is correct).

Feb 11 2019, 2:37 AM · Restricted Project, Restricted Project

Feb 4 2019

dstuttard accepted D57681: [InstCombine] Cleanup the TFE/LWE check in AMDGPU SimplifyDemanded.

LGTM

Feb 4 2019, 5:14 AM · Restricted Project

Jan 14 2019

dstuttard committed rL351054: [AMDGPU] Add support for TFE/LWE in image intrinsics. 2nd try.
Jan 14 2019, 3:59 AM

Dec 11 2018

dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

ping

Dec 11 2018, 8:35 AM
dstuttard added a comment to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

ping

Dec 11 2018, 8:35 AM · Restricted Project
dstuttard accepted D55267: [AMDGPU] Set metadata access for explicit section.

LGTM - but probably need approval from one of the other reviewers as well

Dec 11 2018, 1:14 AM

Nov 29 2018

dstuttard committed rL347911: Revert r347871 "Fix: Add support for TFE/LWE in image intrinsic".
Nov 29 2018, 12:17 PM
dstuttard committed rL347876: Fix: Add support for TFE/LWE in image intrinsic.
Nov 29 2018, 7:59 AM
dstuttard committed rL347871: Add support for TFE/LWE in image intrinsics.
Nov 29 2018, 7:24 AM
dstuttard closed D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.
Nov 29 2018, 7:24 AM

Nov 28 2018

dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

ping

What happens if you just drop the optimization entirely?

Nov 28 2018, 3:28 AM

Nov 27 2018

dstuttard added a comment to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

ping

Nov 27 2018, 7:20 AM · Restricted Project
dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

ping

Nov 27 2018, 7:20 AM
dstuttard added a comment to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

ping

Nov 27 2018, 7:19 AM

Nov 19 2018

dstuttard committed rL347221: [AMDGPU] Derive GCNSubtarget from MF to get overridden target features.
Nov 19 2018, 7:47 AM
dstuttard closed D54301: [AMDGPU] Derive GCNSubtarget from MF to get overridden target features.
Nov 19 2018, 7:47 AM
dstuttard updated the diff for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Thanks for the review - made all the suggested changes

Nov 19 2018, 7:35 AM
dstuttard added inline comments to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.
Nov 19 2018, 7:35 AM

Nov 13 2018

dstuttard added a comment to D54301: [AMDGPU] Derive GCNSubtarget from MF to get overridden target features.

I don't quite understand what distinguishes the two versions of MCSubtargetInfo (getSTI() vs STM), but it appears a patch from Konstantin earlier this year changed this specific instance from STM.getFeatureBits() to getSTI(). Changing this back LGTM but I don't know what the rationale was in the first place, if it wasn't a typo, so I don't want to sign off without asking.

Nov 13 2018, 8:53 AM

Nov 9 2018

dstuttard added a reviewer for D54301: [AMDGPU] Derive GCNSubtarget from MF to get overridden target features: scott.linder.

Scott - you made some changes here most recently so adding you as the reviewer.

Nov 9 2018, 3:28 AM
dstuttard created D54301: [AMDGPU] Derive GCNSubtarget from MF to get overridden target features.
Nov 9 2018, 3:27 AM

Nov 7 2018

dstuttard added a reviewer for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics: sheredom.

Added Neil as a reviewer as I've made some changes to some of his a16 tests. I'm pretty certain that the modifications are correct, but wanted to get feedback on that as well.

Nov 7 2018, 6:24 AM
dstuttard updated the diff for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Modified based on review feedback from Nicolai

Nov 7 2018, 6:21 AM

Nov 6 2018

dstuttard abandoned D35073: [RegisterCoalescer] Fix for subrange join unreachable.

This patch has been superseded by D49097 - so I'm abandoning this one.

Nov 6 2018, 4:15 AM

Oct 30 2018

dstuttard added a comment to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Minor changes made.

Oct 30 2018, 7:57 AM
dstuttard updated the diff for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Made minor code changes suggested in review

Oct 30 2018, 7:51 AM

Oct 24 2018

dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

@arsenm Matt, any more comments? Would you be happy with a clarification comment as per the last suggestion from me?

Oct 24 2018, 1:08 AM
dstuttard added a comment to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

@arsenm Matt - good to go?

Oct 24 2018, 1:06 AM · Restricted Project
dstuttard added a comment to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Covered all the requested changes (I think). Also implemented a test to make sure that simplifyDemanded doesn't run when TFE/LWE is enabled.

Oct 24 2018, 1:03 AM
dstuttard updated the diff for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Changed the implementation of the intrinsic return type to be an aggregate type

Oct 24 2018, 1:00 AM

Sep 24 2018

dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

Looking again at the code - you're correct that it attempts to only do this transformation if the high bits are zero.
However, the code that checks this has the following telling comment:

// (i32 zext (i16 (bitcast f16:$src))) -> fp16_zext $src
// FIXME: It is not universally true that the high bits are zeroed on gfx9.
if (Src.getOpcode() == ISD::BITCAST) {
  SDValue BCSrc = Src.getOperand(0);
  if (BCSrc.getValueType() == MVT::f16 &&
      fp16SrcZerosHighBits(BCSrc.getOpcode()))
    return DCI.DAG.getNode(AMDGPUISD::FP16_ZEXT, SDLoc(N), VT, BCSrc);
}

In this particular case the BCSrc operation was an fptrunc, which passes the fp16SrcZerosHighBits test, but it eventually ends up as v_mad_mixlo_f16, which doesn't ensure that the high bits are zero.

Any suggestions on how to proceed? I agree that it seems a shame to have to insert the extra AND operation blindly.

I guess you could check the subtarget in fp16SrcZerosHighBits. However that's pretty risky since it's depending on things we can't guarantee. Something could transform any other instruction into something else that won't preserve this. Overall I'm very unhappy this hardware change happened and it's a lot of work to handle all of this properly. I think what we really need is to drop this combine/node, and a separate machine instruction for every operation that preserves the high bits (with a tied source operand) vs. zeros them, and then have a machine pass that tries to clean up the extra ands while dropping this combine. We'll have to do extra work because we will have missed out on combines that this was enabling.
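
The problem discussed above can be sketched outside of LLVM. The following is a minimal model, not the backend's code: the helper names (writeLowZeroHigh, writeLowKeepHigh, zextAssumingZeroHigh, zextWithMask) and the stale register value are hypothetical and chosen only for illustration. It assumes a 32-bit register lane in which one kind of f16 producer clears bits 31:16 while another (as described for v_mad_mixlo_f16) writes bits 15:0 and leaves the upper half untouched.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one 32-bit register lane; none of this is LLVM API.

// Models an f16 producer that zeroes bits 31:16 of its destination.
constexpr uint32_t writeLowZeroHigh(uint32_t /*oldDst*/, uint16_t half) {
  return half;
}

// Models a producer that writes bits 15:0 only and keeps whatever was
// previously held in bits 31:16 (the behaviour described for
// v_mad_mixlo_f16 in the comment above).
constexpr uint32_t writeLowKeepHigh(uint32_t oldDst, uint16_t half) {
  return (oldDst & 0xFFFF0000u) | half;
}

// What the combine effectively assumes: the register already holds the
// zero-extended value, so no extra instruction is needed.
constexpr uint32_t zextAssumingZeroHigh(uint32_t reg) { return reg; }

// The conservative lowering: an explicit AND with 0xFFFF.
constexpr uint32_t zextWithMask(uint32_t reg) { return reg & 0xFFFFu; }

constexpr uint32_t kStale = 0xDEAD0000u; // arbitrary leftover high bits
constexpr uint16_t kOne = 0x3C00;        // bit pattern of 1.0 in IEEE half

// Producer clears the high half: skipping the mask is safe.
static_assert(zextAssumingZeroHigh(writeLowZeroHigh(kStale, kOne)) == 0x3C00u,
              "assumption holds when high bits are cleared");

// Producer keeps the high half: skipping the mask leaks the stale bits.
static_assert(zextAssumingZeroHigh(writeLowKeepHigh(kStale, kOne)) ==
                  0xDEAD3C00u,
              "stale high bits survive without a mask");

// The explicit AND gives the correct zero-extended result in both cases.
static_assert(zextWithMask(writeLowKeepHigh(kStale, kOne)) == 0x3C00u,
              "mask restores the expected value");
```

In this model the mask is redundant only when the producer is known to clear the upper half, which is the property the fp16SrcZerosHighBits check is trying to establish from the opcode alone.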

Sep 24 2018, 7:38 AM
dstuttard added a comment to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

I don't actually understand why this code is where it is. Why is SIFixSGPRCopies doing this? To clarify, is this just an optimization? My initial reaction was that it was a fix, but looking at it again it seems like an optimization to me.

Sep 24 2018, 7:30 AM · Restricted Project
dstuttard updated the diff for D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

Updated a mov to a copy as per review comment

Sep 24 2018, 7:26 AM · Restricted Project

Sep 14 2018

dstuttard committed rL342222: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.
Sep 14 2018, 3:29 AM
dstuttard closed D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.
Sep 14 2018, 3:29 AM

Sep 13 2018

dstuttard updated the diff for D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

Moved foldToImm into SIInstrInfo as suggested
Implemented check in verifyInstruction and checked that it worked when the fix was removed

Sep 13 2018, 6:51 AM · Restricted Project
dstuttard added inline comments to D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.
Sep 13 2018, 4:10 AM
dstuttard added inline comments to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.
Sep 13 2018, 1:35 AM · Restricted Project
dstuttard updated the diff for D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

Added missing break

Sep 13 2018, 1:33 AM · Restricted Project
dstuttard added inline comments to D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.
Sep 13 2018, 1:19 AM
dstuttard updated the diff for D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.

De-duplicated 1/2PI constant

Sep 13 2018, 1:18 AM

Sep 12 2018

dstuttard updated the diff for D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.

Made suggested change

Sep 12 2018, 9:51 AM
dstuttard added inline comments to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.
Sep 12 2018, 9:38 AM · Restricted Project
dstuttard updated the diff for D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.

Made suggested changes

Sep 12 2018, 9:38 AM · Restricted Project

Sep 11 2018

dstuttard added inline comments to D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.
Sep 11 2018, 2:01 PM · Restricted Project
dstuttard added inline comments to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.
Sep 11 2018, 10:38 AM
dstuttard updated the diff for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

Folded in most of the changes highlighted in the review

Sep 11 2018, 10:31 AM
dstuttard added reviewers for D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it: arsenm, tpr.
Sep 11 2018, 7:53 AM
dstuttard created D51933: [AMDGPU] Ensure trig range reduction only used for subtargets that require it.
Sep 11 2018, 7:52 AM
dstuttard added reviewers for D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands: rampitec, tpr.
Sep 11 2018, 7:41 AM · Restricted Project
dstuttard created D51932: [AMDGPU] Fix-up cases where writelane has 2 SGPR operands.
Sep 11 2018, 7:40 AM · Restricted Project
dstuttard added a comment to D51925: [AMDGPU] Fix issue for zext of f16 to i32.

Looking again at the code - you're correct that it attempts to only do this transformation if the high bits are zero.
However, the code that checks this has the following telling comment:

Sep 11 2018, 6:43 AM
dstuttard added inline comments to D51925: [AMDGPU] Fix issue for zext of f16 to i32.
Sep 11 2018, 6:13 AM
dstuttard added reviewers for D51925: [AMDGPU] Fix issue for zext of f16 to i32: arsenm, tpr.
Sep 11 2018, 4:52 AM
dstuttard created D51925: [AMDGPU] Fix issue for zext of f16 to i32.
Sep 11 2018, 4:47 AM

Jul 23 2018

dstuttard accepted D49026: [AMDGPU] New tbuffer intrinsics.

LGTM - but you might want further reviews from others not so involved in implementation.

Jul 23 2018, 7:39 AM

Jul 2 2018

dstuttard added a comment to D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.

@nhaehnle - just added you as reviewer at the moment.

Jul 2 2018, 4:35 AM
dstuttard added a reviewer for D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics: nhaehnle.
Jul 2 2018, 4:33 AM
dstuttard created D48826: [AMDGPU] Add support for TFE/LWE in image intrinsics.
Jul 2 2018, 4:32 AM