dstuttard (David Stuttard)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 25 2017, 7:29 AM (65 w, 1 h)

Recent Activity

Wed, Apr 18

dstuttard committed rL330257: [AMDGPU] Fix issues for backend divergence tracking.
Wed, Apr 18, 6:56 AM
dstuttard closed D45372: [AMDGPU] Fix issues for backend divergence tracking.
Wed, Apr 18, 6:56 AM

Mon, Apr 16

dstuttard updated the diff for D45372: [AMDGPU] Fix issues for backend divergence tracking.

Removed the test as overkill for the missed state clear being added (2 other
tests still remain)

Mon, Apr 16, 8:46 AM
dstuttard added a comment to D45372: [AMDGPU] Fix issues for backend divergence tracking.
  1. Remove the test altogether since the VirtReg2Value.clear() is definitely required and was originally missed as an oversight - and the test might be overkill.

Sounds reasonable. It really was an oversight :) It's funny that the change was reviewed by lots of people for 2 months but nobody noticed that the state was not being cleared.
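
The kind of oversight discussed here (per-function state that is never reset when a pass object is reused) can be illustrated with a minimal sketch. The `ToyPass` class, its methods, and the `clearState` switch are hypothetical and are not the LLVM code under review; only the name `VirtReg2Value` is taken from the comment above.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a pass object that is reused across functions.
// This is NOT the AMDGPU backend code; it only models the stale-state problem.
class ToyPass {
public:
  // Per-function map from a virtual register number to a value name.
  std::unordered_map<unsigned, std::string> VirtReg2Value;

  // Process one function. `clearState` models whether the reset is present.
  void runOnFunction(unsigned Reg, const std::string &Value, bool clearState) {
    if (clearState)
      VirtReg2Value.clear(); // drop entries left over from the previous function
    VirtReg2Value[Reg] = Value;
  }

  // True if the map still holds an entry for Reg.
  bool knows(unsigned Reg) const { return VirtReg2Value.count(Reg) != 0; }
};
```

Without the `clear()` call, an entry recorded for one function is still visible while the next function is processed, which is why such a bug can depend on an earlier, seemingly unrelated function in the same test file.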

Mon, Apr 16, 8:23 AM
dstuttard added a comment to D45372: [AMDGPU] Fix issues for backend divergence tracking.

In general LGTM.
I am also concerned about the magic test that has no checks and no visible side effects on shared data but is necessary to reproduce the buggy behavior.
Could you please clarify what you need it for?

Mon, Apr 16, 7:12 AM

Sat, Apr 7

dstuttard added inline comments to D45372: [AMDGPU] Fix issues for backend divergence tracking.
Sat, Apr 7, 1:40 AM

Fri, Apr 6

dstuttard updated the diff for D45372: [AMDGPU] Fix issues for backend divergence tracking.

Folded the triple into the RUN line

Fri, Apr 6, 9:44 AM
dstuttard added a reviewer for D45372: [AMDGPU] Fix issues for backend divergence tracking: alex-t.
Fri, Apr 6, 8:34 AM
dstuttard added a reviewer for D45372: [AMDGPU] Fix issues for backend divergence tracking: nhaehnle.
Fri, Apr 6, 8:32 AM
dstuttard created D45372: [AMDGPU] Fix issues for backend divergence tracking.
Fri, Apr 6, 8:27 AM

Mar 14 2018

dstuttard accepted D44468: [AMDGPU] For OS type AMDPAL, fixed scratch on compute shader.

LGTM

Mar 14 2018, 7:54 AM

Feb 2 2018

dstuttard added inline comments to D42838: [AMDGPU] added writelane intrinsic.
Feb 2 2018, 4:04 AM

Jan 17 2018

dstuttard added inline comments to D42079: AMDGPU: Add a function attribute that shrinks buggy s_buffer opcodes on GFX9.
Jan 17 2018, 1:05 AM

Jan 16 2018

dstuttard added a comment to D40308: [RegisterCoalescer] More fixes for subreg join failure in RegCoalescer.

Any comment on this change? If not I'll land in the next few days.

Jan 16 2018, 2:18 AM
dstuttard added a comment to D40300: [RegisterCoalescer] Fix for SubRegJoin failures.

Any comment on this change? If not I'll land in the next few days.

Jan 16 2018, 2:16 AM
dstuttard added a comment to D35073: [RegisterCoalescer] Fix for subrange join unreachable.

Any more comments on this change? If not I'll land this in the next few days.

Jan 16 2018, 2:15 AM

Jan 9 2018

dstuttard updated the diff for D35073: [RegisterCoalescer] Fix for subrange join unreachable.

Removing a debug llvm_unreachable

Jan 9 2018, 6:02 AM
dstuttard updated the diff for D35073: [RegisterCoalescer] Fix for subrange join unreachable.

A better fix for this problem.

Jan 9 2018, 3:58 AM

Nov 28 2017

dstuttard added a comment to D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.

The intent of this code is really to allow anyone debugging to insert a check anywhere in the code. In particular, for the problems I was looking at, I needed to check the live intervals after each change in order to pinpoint where an error was being introduced. As implemented here (with the check after coalescing) it looks redundant, so having that extra check is probably useless.
Running a verification phase after coalescing is sometimes too late, and indeed may well have asserted before this point.

Nov 28 2017, 2:18 AM

Nov 21 2017

dstuttard updated the diff for D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.

Removing debug code

Nov 21 2017, 11:03 AM
dstuttard added a comment to D35073: [RegisterCoalescer] Fix for subrange join unreachable.

I'm going to take another look at this bug.
I've also uploaded a couple of other fixes for SubRange join failures. See D40300 and D40308. I've also uploaded a verifier method that is helpful in tracking these problems down. See D40297.

Nov 21 2017, 7:58 AM
dstuttard added reviewers for D40308: [RegisterCoalescer] More fixes for subreg join failure in RegCoalescer: qcolombet, MatzeB.
Nov 21 2017, 7:54 AM
dstuttard updated subscribers of D40308: [RegisterCoalescer] More fixes for subreg join failure in RegCoalescer.
Nov 21 2017, 7:54 AM
dstuttard created D40308: [RegisterCoalescer] More fixes for subreg join failure in RegCoalescer.
Nov 21 2017, 7:54 AM
dstuttard added a reviewer for D40300: [RegisterCoalescer] Fix for SubRegJoin failures: MatzeB.
Nov 21 2017, 5:53 AM
dstuttard updated subscribers of D40300: [RegisterCoalescer] Fix for SubRegJoin failures.
Nov 21 2017, 5:53 AM
dstuttard created D40300: [RegisterCoalescer] Fix for SubRegJoin failures.
Nov 21 2017, 5:53 AM
dstuttard updated the diff for D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.

Removed a blank line causing unnecessary differences

Nov 21 2017, 5:50 AM
dstuttard added a comment to D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.

I've got a couple of new SubRange join fixes that I'm about to upload for review. This routine was useful in tracking those down so I thought it would be useful to add as a generic aid for future problems.

Nov 21 2017, 5:08 AM
dstuttard added reviewers for D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments: qcolombet, MatzeB.
Nov 21 2017, 5:08 AM
dstuttard updated subscribers of D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.
Nov 21 2017, 5:04 AM
dstuttard created D40297: [RegisterCoalescer] Add verification method to check LiveInterval Segments.
Nov 21 2017, 5:04 AM

Oct 10 2017

dstuttard committed rL315307: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.
Oct 10 2017, 5:46 AM
dstuttard closed D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors by committing rL315307: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.
Oct 10 2017, 5:46 AM

Oct 9 2017

dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

@dstuttard Are you ready to commit this?

Oct 9 2017, 1:43 AM

Sep 28 2017

dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

Test now reduced further - thanks @RKSimon - confirmed that it still exhibits the problem and that the fix still fixes it.
I managed to remove a couple more lines from the cut-down version you provided.

Sep 28 2017, 3:04 AM
dstuttard updated the diff for D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

Cutting down the test size and rebasing

Sep 28 2017, 3:01 AM

Sep 25 2017

dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

@dstuttard Are you still looking at this?

Sep 25 2017, 5:03 AM

Aug 4 2017

dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

Can you reduce the test any further?

Aug 4 2017, 6:44 AM
dstuttard abandoned D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic.

Agreed - @cwabbott your changes provide something much more complete than this change, plus remove the need for it. Thanks.

Aug 4 2017, 6:41 AM
dstuttard added a comment to D34889: [ScheduleDAG] Fix bug in check for use of dead defs.

Testcase? It seems that your change wants to allow the following:

vreg4:sub0<def, dead> = ...
...
  = use vreg4:sub1

This is not legal! The dead modifier is about the full vreg not just about the sub0 part defined. (-verify-machineinstrs should also lead to such inputs getting rejected).
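
The rule as stated in this comment can be sketched as a small check. This is a toy model under stated assumptions: `ToyOperand` and `deadDefsRespected` are hypothetical names and are not part of LLVM or its machine verifier. The sketch simply treats a `dead` def of any sub-register as making the whole virtual register dead until it is defined again.

```cpp
#include <set>
#include <vector>

// Hypothetical operand record: a def or use of (part of) a virtual register.
struct ToyOperand {
  unsigned VReg;   // virtual register number, e.g. 4 for vreg4
  unsigned SubIdx; // sub-register index, e.g. 0 for sub0, 1 for sub1
  bool IsDef;      // true for a def, false for a use
  bool IsDead;     // only meaningful for defs
};

// Returns false if any use follows a dead def of the same vreg.
// SubIdx is deliberately ignored: per the comment above, the dead
// modifier applies to the full vreg, not only to the part defined.
bool deadDefsRespected(const std::vector<ToyOperand> &Ops) {
  std::set<unsigned> DeadRegs;
  for (const ToyOperand &Op : Ops) {
    if (Op.IsDef) {
      if (Op.IsDead)
        DeadRegs.insert(Op.VReg);
      else
        DeadRegs.erase(Op.VReg); // a live redefinition makes it usable again
    } else if (DeadRegs.count(Op.VReg)) {
      return false; // use of a register whose last def was marked dead
    }
  }
  return true;
}
```

Applied to the sequence quoted above (a dead def of `vreg4:sub0` followed by a use of `vreg4:sub1`), the check reports a violation.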

Aug 4 2017, 6:39 AM

Jul 26 2017

dstuttard updated the diff for D35073: [RegisterCoalescer] Fix for subrange join unreachable.

Implemented a slightly different approach to resolving the issue.

Jul 26 2017, 7:24 AM

Jul 25 2017

dstuttard added a comment to D35073: [RegisterCoalescer] Fix for subrange join unreachable.

I'm confused by this change (and the testcase doesn't really help in understanding what is going on). You are probably onto a real bug here, where we cannot assume the same lanemasks work for the input/output register class of a copy. However, that should be independent of whether a full or partial copy is present.

Jul 25 2017, 7:29 AM

Jul 17 2017

dstuttard added a comment to D34889: [ScheduleDAG] Fix bug in check for use of dead defs.

ping

Jul 17 2017, 9:04 AM
dstuttard added a comment to D35073: [RegisterCoalescer] Fix for subrange join unreachable.

ping

Jul 17 2017, 9:03 AM
dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

@arsenm - any comments?

Jul 17 2017, 9:03 AM

Jul 11 2017

dstuttard added a comment to D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.

It's already bugpoint reduced - I tried trimming it further by hand but it hid the issue (unsurprisingly).

Jul 11 2017, 5:03 AM
dstuttard added a reviewer for D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors: RKSimon.

Not sure who is an appropriate reviewer for this, adding @RKSimon as he seems to have made several changes in the same area.

Jul 11 2017, 3:28 AM
dstuttard updated subscribers of D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.
Jul 11 2017, 3:27 AM
dstuttard created D35241: [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors.
Jul 11 2017, 3:26 AM

Jul 7 2017

dstuttard updated the diff for D35073: [RegisterCoalescer] Fix for subrange join unreachable.

Adding the -verify-machineinstrs flag to the llc invocation in the test

Jul 7 2017, 5:35 AM

Jul 6 2017

dstuttard updated subscribers of D35073: [RegisterCoalescer] Fix for subrange join unreachable.
Jul 6 2017, 10:52 AM
dstuttard added reviewers for D35073: [RegisterCoalescer] Fix for subrange join unreachable: MatzeB, arsenm.
Jul 6 2017, 10:51 AM
dstuttard created D35073: [RegisterCoalescer] Fix for subrange join unreachable.
Jul 6 2017, 10:50 AM
dstuttard committed rL307247: [RegisterCoalescer] Fix for SubRange join unreachable.
Jul 6 2017, 3:08 AM
dstuttard closed D34391: [RegisterCoalescer] Fix for SubRange join unreachable by committing rL307247: [RegisterCoalescer] Fix for SubRange join unreachable.
Jul 6 2017, 3:08 AM

Jul 3 2017

dstuttard added a comment to D34889: [ScheduleDAG] Fix bug in check for use of dead defs.

I do have a reproducer for this (a .ll test). I've attempted to turn this into a .mir test, but it is too fragile to allow me to change the target specific intrinsics to something generic (required in order to print out the .mir).
Hopefully, the check is an obvious omission and an easy one to approve?
If not I'll see if I can create a smaller regression for it - but I've already spent quite a lot of time attempting to do this.

Jul 3 2017, 7:34 AM
dstuttard updated the diff for D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Updating the test to a .mir test
Replacing the buffer.load intrinsics with load volatile worked

Jul 3 2017, 7:34 AM
dstuttard added a comment to D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

@qcolombet - any further comments or are you happy for this to go in?

Jul 3 2017, 7:34 AM

Jun 30 2017

dstuttard added inline comments to D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic.
Jun 30 2017, 8:39 AM
dstuttard added a reviewer for D34391: [RegisterCoalescer] Fix for SubRange join unreachable: arsenm.

The code fix LGTM, but please wait for @qcolombet/@arsenm before committing.

Making a .mir test does indeed seem hard at the moment, as the printer already fails; manually stripping memory operands only works after printing, I presume?

Jun 30 2017, 8:35 AM
dstuttard added a reviewer for D34889: [ScheduleDAG] Fix bug in check for use of dead defs: MatzeB.
Jun 30 2017, 8:30 AM
dstuttard updated subscribers of D34889: [ScheduleDAG] Fix bug in check for use of dead defs.
Jun 30 2017, 8:27 AM
dstuttard created D34889: [ScheduleDAG] Fix bug in check for use of dead defs.
Jun 30 2017, 8:26 AM

Jun 27 2017

dstuttard added a comment to D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic.

I think we will require a more extensive change to properly support WQM and all it can do. However, this is a minor extension to an existing intrinsic that allows implementation of functions required in some shader implementations.
This can be removed if necessary if more fully featured support is implemented.

Jun 27 2017, 3:45 AM
dstuttard added reviewers for D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic: arsenm, tpr.
Jun 27 2017, 3:42 AM
dstuttard updated subscribers of D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic.
Jun 27 2017, 3:39 AM
dstuttard created D34677: [AMDGPU] Whole Quad Mode variant of mov.dpp intrinsic.
Jun 27 2017, 3:38 AM

Jun 22 2017

dstuttard committed rL306034: [AMDGPU] Add intrinsics for tbuffer load and store - build error fix.
Jun 22 2017, 10:16 AM
dstuttard committed rL306031: [AMDGPU] Add intrinsics for tbuffer load and store.
Jun 22 2017, 9:43 AM
dstuttard closed D30687: [AMDGPU] Add intrinsics for tbuffer load and store by committing rL306031: [AMDGPU] Add intrinsics for tbuffer load and store.
Jun 22 2017, 9:43 AM
dstuttard updated the diff for D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Updating in line with comments from reviewers

Jun 22 2017, 7:40 AM

Jun 21 2017

dstuttard updated the diff for D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Added the missing null check on RmValNo

Jun 21 2017, 12:11 PM
dstuttard added a comment to D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

FYI I tried this patch in my out-of-tree backend (hoping to resolve http://llvm.org/PR32773). I observed a segfault in SR.removeValNo(RmValNo) because RmValNo may be null.

I don't know yet whether this is specific to my backend, but I thought I'd mention it in case it indicates a more general problem.

Jun 21 2017, 12:09 PM
dstuttard updated the diff for D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Updating the test as per review comments

Jun 21 2017, 4:46 AM
dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Inadvertently included extra changes in last diff

Jun 21 2017, 1:52 AM
dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Corrected 80 column formatting and variable names

Jun 21 2017, 1:50 AM

Jun 20 2017

dstuttard added a comment to D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Removed the commented out debug statement

Jun 20 2017, 6:36 AM
dstuttard updated the diff for D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Removing debug comment
Also adding the test case from the bugzilla as a new test case (I realised it is
a bit annoying to have to go elsewhere to get hold of the reproducer)
I'll update it as a test if this change looks promising

Jun 20 2017, 6:35 AM
dstuttard added a comment to D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Test case?

Jun 20 2017, 6:15 AM
dstuttard added a comment to D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

I've added this change specifically to address a problem seen in the associated PR (http://llvm.org/PR33524); however, I'm not sure that this is necessarily the right way to go about fixing this issue.

Jun 20 2017, 2:44 AM
dstuttard added a reviewer for D34391: [RegisterCoalescer] Fix for SubRange join unreachable: MatzeB.

Adding MatzeB as reviewer - you've made some recent changes in the same area.

Jun 20 2017, 2:37 AM
dstuttard updated subscribers of D34391: [RegisterCoalescer] Fix for SubRange join unreachable.

Added llvm-commits

Jun 20 2017, 2:28 AM
dstuttard created D34391: [RegisterCoalescer] Fix for SubRange join unreachable.
Jun 20 2017, 2:25 AM

Jun 13 2017

dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Removed the bulk of the legacy implementation and now lower to the new form earlier

Jun 13 2017, 5:59 AM

Jun 9 2017

dstuttard committed rL305079: [AMDGPU] Fix for issue in alloca to vector promotion pass.
Jun 9 2017, 7:16 AM
dstuttard closed D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass by committing rL305079: [AMDGPU] Fix for issue in alloca to vector promotion pass.
Jun 9 2017, 7:16 AM

Jun 8 2017

dstuttard added a comment to D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass.

Made suggested changes prior to submission

Jun 8 2017, 5:44 AM
dstuttard updated the diff for D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass.

Made changes to comments
Changed address spaces in tests and added some checks

Jun 8 2017, 5:42 AM

Jun 6 2017

dstuttard added inline comments to D30687: [AMDGPU] Add intrinsics for tbuffer load and store.
Jun 6 2017, 8:58 AM
dstuttard abandoned D33929: [IR] Remove line causing build issues for VS2015.
Jun 6 2017, 8:56 AM
dstuttard created D33929: [IR] Remove line causing build issues for VS2015.
Jun 6 2017, 2:46 AM

May 25 2017

dstuttard added a comment to D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

See responses to your individual comments - it could be that I've missed something in the way this could be done.

May 25 2017, 7:25 AM

May 24 2017

dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Forgot to update comment about float as an option for overloading

May 24 2017, 10:58 AM
dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Data can now be float or int
I decided that since the load variant would support this, so should the store. This meant re-jigging
the implementation slightly to use an approach more similar to load to enable the use of any for the
store as well (unless you can see a more efficient way to do it).

May 24 2017, 10:56 AM

May 22 2017

dstuttard added a comment to D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

Any chance of another review of this change?
Does it look good to go now?
Thanks

May 22 2017, 9:20 AM
dstuttard added a comment to D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass.

Any chance of a final review on this change?

May 22 2017, 9:20 AM

May 16 2017

dstuttard added a comment to D31710: [AMDGPU] Fix for issue in alloca to vector promotion pass.

ping

May 16 2017, 8:51 AM
dstuttard added a comment to D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

ping

May 16 2017, 8:50 AM

May 11 2017

dstuttard updated the diff for D30687: [AMDGPU] Add intrinsics for tbuffer load and store.

As suggested by Tom Stellard (due to potential issues with range checking), I've changed the
intrinsics to have explicit operands for vindex, voffset, soffset and offset. The backend no longer
attempts to optimise by folding in any offsets it can spot in preceding instructions.

May 11 2017, 8:02 AM