cwabbott (Connor Abbott)
User

Projects

User does not belong to any projects.

User Details

User Since
Jun 13 2017, 6:06 PM (159 w, 5 d)

Recent Activity

Mar 18 2020

cwabbott added inline comments to D76364: [AMDGPU] Fix AMDGPUUnifyDivergentExitNodes.
Mar 18 2020, 9:14 AM · Restricted Project

Feb 27 2020

cwabbott added a comment to D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.

Hi @cwabbott,
This commit causes a GPU hang on amdvlk in one test, due to the missing "done" bit on the normal export. I think in the attached case the patch incorrectly classifies the export as being in an infinite loop.

Feb 27 2020, 4:26 AM · Restricted Project

Jan 30 2020

cwabbott committed rGce06d50756e9: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns (authored by cwabbott).
AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Jan 30 2020, 2:08 AM
cwabbott closed D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.
Jan 30 2020, 2:08 AM · Restricted Project
cwabbott added a comment to D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.

LGTM. Did you manage to sort out your commit access issues?

Jan 30 2020, 1:52 AM · Restricted Project

Jan 29 2020

cwabbott updated the diff for D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.

It seems I forgot to test this against the entire testsuite, which made a bunch
of buildbots unhappy. This version fixes the issues:

Jan 29 2020, 9:12 AM · Restricted Project
cwabbott closed D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
Jan 29 2020, 9:12 AM · Restricted Project
cwabbott reopened D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.
Jan 29 2020, 9:12 AM · Restricted Project
cwabbott reopened D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
Jan 29 2020, 9:03 AM · Restricted Project
cwabbott committed rG87d98c149504: AMDGPU: Fix handling of infinite loops in fragment shaders (authored by cwabbott).
AMDGPU: Fix handling of infinite loops in fragment shaders
Jan 29 2020, 8:16 AM
cwabbott committed rG13ab22ab22de: Revert "AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns" (authored by cwabbott).
Revert "AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns"
Jan 29 2020, 7:22 AM
cwabbott committed rG08b205bb4808: Revert "AMDGPU: Fix handling of infinite loops in fragment shaders" (authored by cwabbott).
Revert "AMDGPU: Fix handling of infinite loops in fragment shaders"
Jan 29 2020, 7:22 AM
cwabbott added a reverting change for rG0994c485e613: AMDGPU: Fix handling of infinite loops in fragment shaders: rG08b205bb4808: Revert "AMDGPU: Fix handling of infinite loops in fragment shaders".
Jan 29 2020, 7:22 AM
cwabbott added a reverting change for rG323bfde20c5f: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns: rG13ab22ab22de: Revert "AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns".
Jan 29 2020, 7:22 AM
cwabbott committed rG0994c485e613: AMDGPU: Fix handling of infinite loops in fragment shaders (authored by cwabbott).
AMDGPU: Fix handling of infinite loops in fragment shaders
Jan 29 2020, 6:36 AM
cwabbott committed rG323bfde20c5f: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns (authored by cwabbott).
AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns
Jan 29 2020, 6:35 AM
cwabbott closed D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.
Jan 29 2020, 6:35 AM · Restricted Project
cwabbott closed D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
Jan 29 2020, 6:35 AM · Restricted Project
cwabbott added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

Can someone else please commit this? @nhaehnle? It's been almost a month, excluding Christmas, and my commit access situation still hasn't been resolved. This fix is necessary for radv to pass VK 1.2 conformance, and we'd like to cherry-pick it to LLVM 10 before it's released.

Jan 29 2020, 5:50 AM · Restricted Project

Dec 12 2019

cwabbott added a comment to D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.

I just asked earlier this week to get my commit rights back, but I haven't heard back right away, so you can commit it if I don't get it first.

Dec 12 2019, 1:39 AM · Restricted Project

Dec 10 2019

cwabbott added inline comments to D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
Dec 10 2019, 5:43 AM · Restricted Project
cwabbott updated the diff for D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.

Suppress unused variable warnings in release builds.

Dec 10 2019, 5:43 AM · Restricted Project
cwabbott updated the diff for D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.

Add diff for new test.

Dec 10 2019, 5:25 AM · Restricted Project
cwabbott updated the diff for D71260: InstCombine: Add test for bugzilla 44242.

Add a test for multiple phis as suggested by @nikic.

Dec 10 2019, 5:25 AM · Restricted Project
cwabbott updated the diff for D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
  • Update precommitted test in this commit.
  • Make the test match the original bug better. It turns out that in my attempt to make it harder for other transforms to happen beforehand and break the test, I accidentally made another transform kick in which broke it anyways. Use this form with a loop which should hopefully defeat other optimizations better.
Dec 10 2019, 4:57 AM · Restricted Project
cwabbott added a child revision for D71260: InstCombine: Add test for bugzilla 44242: D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
Dec 10 2019, 4:57 AM · Restricted Project
cwabbott added a parent revision for D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users: D71260: InstCombine: Add test for bugzilla 44242.
Dec 10 2019, 4:57 AM · Restricted Project
cwabbott created D71260: InstCombine: Add test for bugzilla 44242.
Dec 10 2019, 4:57 AM · Restricted Project
cwabbott added inline comments to D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
Dec 10 2019, 3:17 AM · Restricted Project

Dec 9 2019

cwabbott added inline comments to D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
Dec 9 2019, 7:47 AM · Restricted Project
cwabbott added inline comments to D71164: [InstCombine] Fix infinite loop due to bitcast <-> phi transforms.
Dec 9 2019, 7:38 AM · Restricted Project
cwabbott created D71209: InstCombine: Don't rewrite phi-of-bitcast when the phi has other users.
Dec 9 2019, 7:19 AM · Restricted Project
cwabbott added a parent revision for D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns: D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
Dec 9 2019, 3:15 AM · Restricted Project
cwabbott created D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.
Dec 9 2019, 3:15 AM · Restricted Project
cwabbott added a child revision for D70781: AMDGPU: Fix handling of infinite loops in fragment shaders: D71192: AMDGPU: Fix AMDGPUUnifyDivergentExitNodes with no normal returns.
Dec 9 2019, 3:15 AM · Restricted Project
cwabbott added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

If two exports with done bit are executed, I suspect we could enter race conditions where the export allocation is freed after the first export with done, and then another wave gets the same export memory spot and its data could be overwritten by the second, newly introduced, export with done bit. Or enter some hang condition in the hardware. Who knows.

Dec 9 2019, 2:41 AM · Restricted Project
cwabbott updated the diff for D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
  • Fix wrong indentation and missing "immarg" on intrinsic declaration
  • Make sure that we remove the done bit from existing exports, and test it
Dec 9 2019, 2:39 AM · Restricted Project

Dec 6 2019

cwabbott updated the diff for D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

Update comment to explain why this works even when only some threads are killed.

Dec 6 2019, 9:44 AM · Restricted Project

Dec 3 2019

cwabbott added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

I think this isn't an issue because the exec mask at the end of the program is going to be the same as the mask when the last "real" export happens. The way kill is implemented means that control flow never reconverges for a killed thread, so it stays dead until the very end. I've done some manual tests with the aforementioned CTS test, and it does seem to be properly discarding the right pixels too. But that's a good question :)

I agree in the general case that should hold; however, if there was a discard after the true export done then it wouldn't. I don't have a good use case for why such a thing would happen -- exports should in the main be the last thing a (PS) shader does. So if we consider kill after export done semantically invalid then this is fine.

Dec 3 2019, 2:33 AM · Restricted Project

Dec 2 2019

cwabbott added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

This seems fine to me, although I know nothing about exports, and I would prefer if we could eventually fix the kill intrinsic design

Dec 2 2019, 8:52 AM · Restricted Project
cwabbott added a comment to D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
The extra export is not great for performance as it introduces an unnecessary stall at the end of the shader.
Dec 2 2019, 8:05 AM · Restricted Project

Nov 27 2019

cwabbott updated the diff for D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.

Fix leftover extraneous change from an earlier version.

Nov 27 2019, 6:45 AM · Restricted Project
cwabbott created D70781: AMDGPU: Fix handling of infinite loops in fragment shaders.
Nov 27 2019, 6:27 AM · Restricted Project

Sep 2 2019

cwabbott accepted D65813: Partially revert D61491 "AMDGPU: Be explicit about whether the high-word in SI_PC_ADD_REL_OFFSET is 0".

FWIW, this LGTM. Maybe it would be a good idea to add more lines to global-constant.ll to test radeonsi (amdgcn-- at the moment) and radv (amdgcn-mesa-mesa3d) under the NOPAL label.

Sep 2 2019, 4:28 AM · Restricted Project
cwabbott abandoned D67003: AMDGPU: Don't put constants in .text for Mesa.

I just noticed that this already came up in D65813 and it does the right thing; it's just awaiting review.

Sep 2 2019, 1:57 AM · Restricted Project

Aug 30 2019

cwabbott added a comment to D67003: AMDGPU: Don't put constants in .text for Mesa.

Is mesa actually using mesa3d now? I thought we were still in the unpleasant situation where "unknown" OS was treated like mesa3d, but not quite. I thought clover and radv were using mesa3d, but OpenGL was not. If it has, can we finally drop the second scratch implementation?

Ugh, you're right, radeonsi still has target triple = "amdgcn--". Why exactly is that? Would anything break if we just changed it to amdgcn-mesa-mesa3d like what radv does? I don't want that to block this, though, so is there any other user that needs to return true here? AMDVLK maybe?

Mesa is the only oddball relying on the unknown OS behavior. IIRC this required work in mesa to switch the scratch ABI from using relocations to what radv does, but besides that I think everything would work

Aug 30 2019, 8:27 AM · Restricted Project
cwabbott added a comment to D67003: AMDGPU: Don't put constants in .text for Mesa.

Is mesa actually using mesa3d now? I thought we were still in the unpleasant situation where "unknown" OS was treated like mesa3d, but not quite. I thought clover and radv were using mesa3d, but OpenGL was not. If it has, can we finally drop the second scratch implementation?

Aug 30 2019, 8:08 AM · Restricted Project
cwabbott added a comment to D67003: AMDGPU: Don't put constants in .text for Mesa.

FYI, I think I don't have commit access anymore because of the whole licensing thing, so I'll need someone else to commit for me.

Aug 30 2019, 6:32 AM · Restricted Project
cwabbott created D67003: AMDGPU: Don't put constants in .text for Mesa.
Aug 30 2019, 6:26 AM · Restricted Project

Mar 13 2019

cwabbott added inline comments to D59295: [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure..
Mar 13 2019, 12:19 PM · Restricted Project, Restricted Project
cwabbott added a comment to D59295: [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure..

Actually, now that I think about it, I believe we realized that SIFixWWMLiveness has a giant hole in that if any of the extra live ranges we insert are split, it'll fall over. I don't think anyone has come up with a way to express the constraints only with extra defs and uses in a way that always works, and I'm not sure it's possible. The issue is that we're lying to LLVM RA by pretending that vector instructions always fully clobber their destinations. Before, we were careful to never write to any inactive channels in order to keep up the charade, but WWM instructions force us to deal with it somehow. Fully informing LLVM of what's going on would involve marking every vector instruction as partially clobbering its destination, even the move instructions and load/store instructions LLVM emits during RA, which of course would tank performance unless LLVM is taught about predicated liveness -- but of course that's a whole lot of work that opens another can of worms (register pressure is suddenly not that meaningful anymore...).

Mar 13 2019, 12:02 PM · Restricted Project, Restricted Project
cwabbott added a comment to D59295: [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure..

If you can express the constraints with some series of uses and defs, that would be preferable.

Mar 13 2019, 11:46 AM · Restricted Project, Restricted Project
cwabbott added a comment to D59295: [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure..

I really don't like introducing new, dynamically reserved registers for this. It's going to introduce hell for dealing with any kind of ABI, and reserved registers are generally a bad idea. There's also nothing guaranteeing there are any free registers available to reserve, since you are just grabbing totally unused ones. This is going to just hit some variant of the problem I've been working on solving for handling SGPR->VGPR spills. Can WWM code be moved into a bundle or something?

Mar 13 2019, 11:16 AM · Restricted Project, Restricted Project

Feb 21 2019

cwabbott updated the diff for D57894: AMDGPU: Fix @llvm.amdgcn.wqm.vote implementation.

Fixed a regression in the llvm.amdgcn.kill tests.

Feb 21 2019, 5:33 AM · Restricted Project

Feb 8 2019

cwabbott accepted D55444: AMDGPU: Fix DPP combiner.

I'm not going to read everything in detail, but the combining rules look correct to me and everything passes with this pass enabled. Feel free to re-enable it.

Feb 8 2019, 2:54 AM · Restricted Project, Restricted Project
cwabbott added a comment to D57748: AMDGPU: Add inverse ballot intrinsic.

Adding a pattern for this wouldn't work for what I wanted to do, which was a ballot/inverseballot pair to operate directly on the bitmask representation of a boolean, since there's a bug where SelectionDAG forgets that ballot removes divergence, and it needs to be non-divergent for the pattern to fire. That being said, inserting two readlanes isn't that much better, so maybe I should just fix that instead...

Feb 8 2019, 2:22 AM · Restricted Project

Feb 7 2019

cwabbott created D57894: AMDGPU: Fix @llvm.amdgcn.wqm.vote implementation.
Feb 7 2019, 7:31 AM · Restricted Project
cwabbott updated the diff for D57748: AMDGPU: Add inverse ballot intrinsic.
  • Remove spurious whitespace change.
  • Lower S_INV_BALLOT with EmitInstrWithCustomInserter.
Feb 7 2019, 6:48 AM · Restricted Project

Feb 5 2019

cwabbott updated the diff for D57748: AMDGPU: Add inverse ballot intrinsic.
  • Added a pseudoinstruction lowered in SILowerI1Copies, and legalized it more similarly to how other instructions are legalized.
  • Added tests where the source is a uniform and non-uniform phi node. I had to use the amdgpu_ps calling convention here to get the arguments into SGPR's.
Feb 5 2019, 8:30 AM · Restricted Project
cwabbott added inline comments to D57748: AMDGPU: Add inverse ballot intrinsic.
Feb 5 2019, 6:25 AM · Restricted Project
cwabbott added inline comments to D57748: AMDGPU: Add inverse ballot intrinsic.
Feb 5 2019, 6:19 AM · Restricted Project
cwabbott added a comment to D57748: AMDGPU: Add inverse ballot intrinsic.

The user can't actually guarantee the argument is uniform, since transforms are allowed to touch the argument. We're accumulating quite a few places that assume this though

Feb 5 2019, 6:16 AM · Restricted Project
cwabbott created D57748: AMDGPU: Add inverse ballot intrinsic.
Feb 5 2019, 5:57 AM · Restricted Project

Jan 28 2019

cwabbott added inline comments to D55897: Add constrained fptrunc and fpext intrinsics.
Jan 28 2019, 9:16 AM · Restricted Project
cwabbott added inline comments to D55897: Add constrained fptrunc and fpext intrinsics.
Jan 28 2019, 7:31 AM · Restricted Project

Jan 24 2019

cwabbott added inline comments to D55897: Add constrained fptrunc and fpext intrinsics.
Jan 24 2019, 3:57 AM · Restricted Project

Jan 16 2019

cwabbott added inline comments to D55444: AMDGPU: Fix DPP combiner.
Jan 16 2019, 1:14 AM · Restricted Project, Restricted Project

Jan 15 2019

cwabbott added inline comments to D55444: AMDGPU: Fix DPP combiner.
Jan 15 2019, 7:44 AM · Restricted Project, Restricted Project

Jan 14 2019

cwabbott added inline comments to D55444: AMDGPU: Fix DPP combiner.
Jan 14 2019, 10:27 AM · Restricted Project, Restricted Project

Jan 11 2019

cwabbott added a comment to D55444: AMDGPU: Fix DPP combiner.

I figured it would be a little easier if I looked at these cases by myself. It turns out there are more problems with isIdentityValue, including some correctness issues. After fixing these, everything works correctly now.

Jan 11 2019, 2:22 AM · Restricted Project, Restricted Project

Jan 10 2019

cwabbott added a comment to D55444: AMDGPU: Fix DPP combiner.

Sorry, I just got back from break this week. I've run CTS with the pass enabled, and it now passes, although it seems most of the patterns we use don't get folded. Firstly AND, XOR, unsigned max, and unsigned min are most troubling, since the code that gets generated looks like it should be optimized:

Jan 10 2019, 5:34 AM · Restricted Project, Restricted Project

Dec 12 2018

cwabbott added inline comments to D55444: AMDGPU: Fix DPP combiner.
Dec 12 2018, 10:24 AM · Restricted Project, Restricted Project

Dec 6 2018

cwabbott added a comment to D55314: AMDGPU: Turn on the DPP combiner by default.

Ok, I'll disable it. I'm not sure about the 3rd point: are you saying the pass doesn't actually perform the optimization, or that it's fundamentally wrong? Because it is implemented to handle "identity" cases for add, mul and min/max.

Dec 6 2018, 6:29 AM
cwabbott added a comment to D55314: AMDGPU: Turn on the DPP combiner by default.

Hi,

This change breaks most of the subgroups tests with RADV (ie. dEQP-VK.subgroups.arithmetic.*).

Any reasons why you enabled it by default? Looks like it now triggers a new bug in the AMDGPU backend.

Thanks!

Hi,

This is an awaited feature, and the other reason is to detect a situation such as yours. I think it's better to turn it off and reproduce the failures: how can I do this?

Dec 6 2018, 6:08 AM

Nov 16 2018

cwabbott added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.

I believe the combination of Convergent + not Speculatable should mean that the compiler shouldn't hoist it to a non-control-equivalent block and shouldn't CSE it. In particular, IIRC it's not guaranteed that a readnone function always returns the same value when it's called with the same arguments, so it's not safe to CSE -- it just means that LLVM can move other things across it, since it doesn't modify *caller-visible* state. What pass is causing a problem? Maybe it's a bug in the pass?

readnone is literally readnone. Maybe you're mixing this up with inaccessiblememonly?

Nov 16 2018, 4:50 AM

Nov 14 2018

cwabbott added a comment to D54516: [AMDGPU] Do not mark llvm.amdgcn.set.inactive as IntrNoMem.

I believe the combination of Convergent + not Speculatable should mean that the compiler shouldn't hoist it to a non-control-equivalent block and shouldn't CSE it. In particular, IIRC it's not guaranteed that a readnone function always returns the same value when it's called with the same arguments, so it's not safe to CSE -- it just means that LLVM can move other things across it, since it doesn't modify *caller-visible* state. What pass is causing a problem? Maybe it's a bug in the pass?

Nov 14 2018, 2:04 AM

May 6 2018

cwabbott added a comment to D46470: [AMDGPU] Fixed a couple of SIFixWWMLiveness problems.
In D46470#1088984, @tpr wrote:

You made a comment in this code that "this is a workaround anyways until LLVM gains the notion of predicated uses and definitions of variables". How do you see that working? Are you aware of anyone else thinking along those lines in the LLVM community? Something to think about for the future.

May 6 2018, 10:37 AM

May 4 2018

cwabbott added a comment to D46470: [AMDGPU] Fixed a couple of SIFixWWMLiveness problems.

For the liveness issue, maybe a better way to solve it would be to add a new ENTER_WWM pseudoinstruction similar to EXIT_WWM, and add a matching implicit def to the matching ENTER_WWM whenever we insert an implicit use on EXIT_WWM, and mark both of them as kills. After all, any affected registers only need to interfere with the instructions run in WWM, so that should help with code quality too. I'm not sure why I didn't do that in the first place.

May 4 2018, 2:56 PM
cwabbott added inline comments to D46470: [AMDGPU] Fixed a couple of SIFixWWMLiveness problems.
May 4 2018, 2:37 PM

Oct 16 2017

cwabbott added inline comments to D38543: AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic.
Oct 16 2017, 12:39 PM

Aug 8 2017

cwabbott committed rL310399: [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic.
[AMDGPU] Add llvm.amdgpu.update.dpp intrinsic
Aug 8 2017, 11:54 AM
cwabbott closed D34718: [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic by committing rL310399: [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic.
Aug 8 2017, 11:54 AM

Aug 7 2017

cwabbott added a comment to D34718: [AMDGPU] Add llvm.amdgpu.update.dpp intrinsic.

Ping on this one. This is the last outstanding patch for implementing AMD_shader_ballot in Mesa.

Aug 7 2017, 4:25 PM
cwabbott committed rL310283: [AMDGPU] Add pseudo "old" source to all DPP instructions.
[AMDGPU] Add pseudo "old" source to all DPP instructions
Aug 7 2017, 12:12 PM
cwabbott closed D34716: [AMDGPU] Add pseudo "old" and "wqm_mode" source to all DPP instructions by committing rL310283: [AMDGPU] Add pseudo "old" source to all DPP instructions.
Aug 7 2017, 12:12 PM

Aug 4 2017

cwabbott committed rL310088: [AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic.
[AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic
Aug 4 2017, 11:38 AM
cwabbott closed D34719: [AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic by committing rL310088: [AMDGPU] Implement llvm.amdgcn.set.inactive intrinsic.
Aug 4 2017, 11:37 AM
cwabbott committed rL310087: [AMDGPU] Add support for Whole Wavefront Mode.
[AMDGPU] Add support for Whole Wavefront Mode
Aug 4 2017, 11:37 AM
cwabbott committed rL310085: [AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM.
[AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM
Aug 4 2017, 11:37 AM
cwabbott committed rL310086: [AMDGPU] refactor WQM pass in preparation for WWM (NFCI).
[AMDGPU] refactor WQM pass in preparation for WWM (NFCI)
Aug 4 2017, 11:37 AM
cwabbott closed D35524: [AMDGPU] Add support for Whole Wavefront Mode by committing rL310087: [AMDGPU] Add support for Whole Wavefront Mode.
Aug 4 2017, 11:37 AM
cwabbott closed D35167: [AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM by committing rL310085: [AMDGPU] Add an llvm.amdgcn.wqm intrinsic for WQM.
Aug 4 2017, 11:37 AM
cwabbott closed D35523: [AMDGPU] refactor WQM pass in preparation for WWM (NFCI) by committing rL310086: [AMDGPU] refactor WQM pass in preparation for WWM (NFCI).
Aug 4 2017, 11:37 AM
cwabbott added a reviewer for D34716: [AMDGPU] Add pseudo "old" and "wqm_mode" source to all DPP instructions: SamWot.

It looks like Sam has worked a lot on the assembler, including adding support for DPP instructions, so I'm adding him for the assembler bits. I'd like to get this in before I leave next week, though.

Aug 4 2017, 10:58 AM
cwabbott updated the diff for D34716: [AMDGPU] Add pseudo "old" and "wqm_mode" source to all DPP instructions.

Remove spurious change to AMDGPUAsmParser.cpp

Aug 4 2017, 10:54 AM
cwabbott added inline comments to D34716: [AMDGPU] Add pseudo "old" and "wqm_mode" source to all DPP instructions.
Aug 4 2017, 8:48 AM

Aug 3 2017

cwabbott committed rL310013: [AMDGPU] Add missing hazard for DPP-after-EXEC-write.
[AMDGPU] Add missing hazard for DPP-after-EXEC-write
Aug 3 2017, 6:10 PM
cwabbott closed D34849: [AMDGPU] Add missing hazard for DPP-after-EXEC-write by committing rL310013: [AMDGPU] Add missing hazard for DPP-after-EXEC-write.
Aug 3 2017, 6:10 PM
cwabbott updated the diff for D34716: [AMDGPU] Add pseudo "old" and "wqm_mode" source to all DPP instructions.

Fix assembling DPP instructions. Also, adopt a more conservative version of
D34715. In particular, we ignore Constraints/DisableEncoding from the original
instruction for the DPP version. The only instruction with any special
constraints is MAC, because of its fake third source, and there it doesn't make
sense to keep the fake third source since it has to be the same as the normal
"old" source anyways. We can revisit this if something else comes up, but I
think this is a good plan for now.

Aug 3 2017, 5:48 PM
cwabbott abandoned D34715: [AMDGPU] don't set Constraints/DisableEncoding from the Profile.
Aug 3 2017, 5:39 PM