Page MenuHomePhabricator

critson (Carl Ritson)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 4 2018, 4:49 AM (200 w, 1 h)

Recent Activity

Wed, Jun 29

critson added a comment to D128800: [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra.

@arsenm I hope you don't mind, I merged your tests (and some fixes) from D128110 and D128315 into a new review D128882. We can proceed with the merged review, or I am happy to contribute to reviewing on your existing patch sets.

Wed, Jun 29, 11:35 PM · Restricted Project, Restricted Project
critson requested review of D128882: [AMDGPU] Additional liveness tests for si-optimize-exec-masking-pre-ra.
Wed, Jun 29, 11:32 PM · Restricted Project, Restricted Project
critson committed rGd0f664161575: [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra (authored by critson).
[AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra
Wed, Jun 29, 11:30 PM · Restricted Project, Restricted Project
critson closed D128800: [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra.
Wed, Jun 29, 11:29 PM · Restricted Project, Restricted Project
critson added a comment to D128800: [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra.

I think D128315 D128110 would also avoid this. I assumed cndmask in a different block was not useful, but presumably you found this in the wild?

Wed, Jun 29, 8:26 PM · Restricted Project, Restricted Project
critson requested review of D128800: [AMDGPU] Fix liveness for loops in si-optimize-exec-masking-pre-ra.
Wed, Jun 29, 3:39 AM · Restricted Project, Restricted Project

Fri, Jun 24

critson added a comment to D123230: [UnifyLoopExits] Reduce number of guard blocks.

@bcahoon Thanks! I have attached what I think is a semi-representative test for the issue.

Fri, Jun 24, 8:28 PM · Restricted Project, Restricted Project
critson added a comment to D123230: [UnifyLoopExits] Reduce number of guard blocks.

Unfortunately this can breaks reconvergence in some edge cases.
This occurs because exits in an inner loop may end up no longer reconvergencing with exits in an outer loop.
I am trying to clean up on a test case based on a Vulkan CTS test failure this causes.

Fri, Jun 24, 12:54 AM · Restricted Project, Restricted Project

Thu, Jun 23

critson committed rG874fbe2cbbe6: [MachineSink] Clear kill flags on operands outside loop (authored by critson).
[MachineSink] Clear kill flags on operands outside loop
Thu, Jun 23, 10:03 PM · Restricted Project, Restricted Project
critson closed D126754: [MachineSink] Clear kill flags on operands outside loop.
Thu, Jun 23, 10:03 PM · Restricted Project, Restricted Project
critson updated the diff for D126754: [MachineSink] Clear kill flags on operands outside loop.
  • Rebase before commit
Thu, Jun 23, 10:02 PM · Restricted Project, Restricted Project

Tue, Jun 21

critson accepted D128193: AMDGPU: Skip unexpected CFG in SIOptimizeVGPRLiveRange.

LGTM

Tue, Jun 21, 1:20 AM · Restricted Project, Restricted Project

Sun, Jun 19

critson added inline comments to D127963: [AMDGPU] Add support for GFX11 LDSDIR hazards.
Sun, Jun 19, 11:41 PM · Restricted Project, Restricted Project
critson updated the diff for D126754: [MachineSink] Clear kill flags on operands outside loop.
  • readsReg() instead of isUse()
Sun, Jun 19, 11:13 PM · Restricted Project, Restricted Project

Thu, Jun 16

critson added a comment to D126754: [MachineSink] Clear kill flags on operands outside loop.

Could the test case by just a MIR test case that only runs MachineSink?

Thu, Jun 16, 2:14 AM · Restricted Project, Restricted Project
critson updated the diff for D126754: [MachineSink] Clear kill flags on operands outside loop.
  • Address reviewer comments
  • Change to MIR test
Thu, Jun 16, 1:39 AM · Restricted Project, Restricted Project

Tue, Jun 14

critson added inline comments to D127664: [AMDGPU] gfx11 ldsdir intrinsics and ISel.
Tue, Jun 14, 8:11 PM · Restricted Project, Restricted Project

May 31 2022

critson requested review of D126754: [MachineSink] Clear kill flags on operands outside loop.
May 31 2022, 11:10 PM · Restricted Project, Restricted Project

May 24 2022

critson accepted D125989: [AMDGPU] gfx11 vinterp instructions MC support.

LGTM

May 24 2022, 6:33 PM · Restricted Project, Restricted Project

May 19 2022

critson added a comment to D125989: [AMDGPU] gfx11 vinterp instructions MC support.

As with LDSDIR, probably needs a couple of tests for invalid values of wait_exp.

May 19 2022, 9:26 PM · Restricted Project, Restricted Project

May 18 2022

critson accepted D125820: [AMDGPU] gfx11 LDSDIR instructions MC support.

LGTM, thanks!

May 18 2022, 5:26 PM · Restricted Project, Restricted Project

May 12 2022

critson committed rG698fda0e3ecc: [AMDGPU] Remove pre-committed test for D124981. NFC. (authored by critson).
[AMDGPU] Remove pre-committed test for D124981. NFC.
May 12 2022, 12:05 AM · Restricted Project, Restricted Project

May 10 2022

critson abandoned D124981: [AMDGPU] Enable WQM if demotes and softwqm are combined.

This change is not required if front-end uses explicit WQM.

May 10 2022, 4:35 PM · Restricted Project, Restricted Project
critson added a comment to D124981: [AMDGPU] Enable WQM if demotes and softwqm are combined.

I think there is perhaps another way to see this.

May 10 2022, 4:52 AM · Restricted Project, Restricted Project
critson updated the diff for D124981: [AMDGPU] Enable WQM if demotes and softwqm are combined.
  • Rebase on to pre-committed test
May 10 2022, 4:36 AM · Restricted Project, Restricted Project

May 9 2022

critson committed rGdcd69d82baad: [AMDGPU] Generate checks in llvm.amdgcn.softwqm.ll (authored by critson).
[AMDGPU] Generate checks in llvm.amdgcn.softwqm.ll
May 9 2022, 5:49 PM · Restricted Project, Restricted Project
critson committed rG78ab7adbd39e: [AMDGPU] Pre-commit test for D124981. NFC. (authored by critson).
[AMDGPU] Pre-commit test for D124981. NFC.
May 9 2022, 5:49 PM · Restricted Project, Restricted Project

May 4 2022

critson requested review of D124981: [AMDGPU] Enable WQM if demotes and softwqm are combined.
May 4 2022, 10:56 PM · Restricted Project, Restricted Project

Apr 14 2022

critson committed rG35ea326047ef: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch (authored by critson).
[AMDGPU] Try to avoid inserting duplicate s_inst_prefetch
Apr 14 2022, 12:21 AM · Restricted Project, Restricted Project
critson closed D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.
Apr 14 2022, 12:21 AM · Restricted Project, Restricted Project

Apr 13 2022

critson added a comment to D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.

Thanks! Did we ever run any benchmarking on this? I have written this before actual HW was available.

Apr 13 2022, 9:16 PM · Restricted Project, Restricted Project

Apr 12 2022

critson updated the diff for D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.
  • Incorporate reviewer comments
Apr 12 2022, 3:35 AM · Restricted Project, Restricted Project
critson added inline comments to D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.
Apr 12 2022, 3:35 AM · Restricted Project, Restricted Project

Apr 11 2022

critson committed rG2bca7d859ac2: [AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange (authored by critson).
[AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange
Apr 11 2022, 10:16 PM · Restricted Project, Restricted Project
critson closed D123480: [AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange.
Apr 11 2022, 10:16 PM · Restricted Project, Restricted Project
critson updated the diff for D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.

Rebase on to pre-committed test.

Apr 11 2022, 9:49 PM · Restricted Project, Restricted Project
critson committed rG4c59fc53299d: [AMDGPU] Pre-commit test for D123569. NFC. (authored by critson).
[AMDGPU] Pre-commit test for D123569. NFC.
Apr 11 2022, 9:48 PM · Restricted Project, Restricted Project
critson requested review of D123569: [AMDGPU] Try to avoid inserting duplicate s_inst_prefetch.
Apr 11 2022, 9:27 PM · Restricted Project, Restricted Project

Apr 10 2022

critson added a comment to D123480: [AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange.

Ideally we would support VGPR live range optimisation for nested waterfalls, but this is none trivial.

Apr 10 2022, 11:03 PM · Restricted Project, Restricted Project
critson requested review of D123480: [AMDGPU] Graceful abort for waterfalls in SIOptimizeVGPRLiveRange.
Apr 10 2022, 9:39 PM · Restricted Project, Restricted Project

Mar 28 2022

critson committed rG1f52d02cebf1: [AMDGPU] Split waterfall loop exec manipulation (authored by critson).
[AMDGPU] Split waterfall loop exec manipulation
Mar 28 2022, 1:45 AM · Restricted Project, Restricted Project
critson closed D122200: [AMDGPU] Split waterfall loop exec manipulation.
Mar 28 2022, 1:45 AM · Restricted Project, Restricted Project

Mar 24 2022

critson added inline comments to D122200: [AMDGPU] Split waterfall loop exec manipulation.
Mar 24 2022, 5:56 PM · Restricted Project, Restricted Project
critson updated the diff for D122200: [AMDGPU] Split waterfall loop exec manipulation.
  • Update liveness for new registers
Mar 24 2022, 5:56 PM · Restricted Project, Restricted Project

Mar 23 2022

critson accepted D122332: [AMDGPU] Improve v_cmpx usage on GFX10.3..

Please run "arc lint" or otherwise apply clang-format rules to fix the formatting errors highlighted by Lint pre-merge checks.

Mar 23 2022, 7:40 PM · Restricted Project, Restricted Project
critson added a comment to D122200: [AMDGPU] Split waterfall loop exec manipulation.

I agree this is a good idea and the change looks good, just only a few minor comments. btw, Is this the only case that we modify EXEC in middle of a block?

I don't think so. WWM/WQM still ends up in the middle of a block I believe. Kills also start in the middle of blocks, but are moved to terminators

Mar 23 2022, 6:14 PM · Restricted Project, Restricted Project
critson updated the diff for D122200: [AMDGPU] Split waterfall loop exec manipulation.
  • Address reviewer comments
Mar 23 2022, 6:14 PM · Restricted Project, Restricted Project
critson abandoned D122364: [AMDGPU] Split waterfall loop exec manipulation.

arc diff failure...

Mar 23 2022, 6:10 PM · Restricted Project, Restricted Project
critson requested review of D122364: [AMDGPU] Split waterfall loop exec manipulation.
Mar 23 2022, 6:09 PM · Restricted Project, Restricted Project

Mar 21 2022

critson committed rG8e64d84995dd: [MachineSink] Check block prologue interference (authored by critson).
[MachineSink] Check block prologue interference
Mar 21 2022, 7:17 PM · Restricted Project
critson closed D121277: [MachineSink] Check block prologue interference.
Mar 21 2022, 7:16 PM · Restricted Project, Restricted Project
critson retitled D121277: [MachineSink] Check block prologue interference from [MachineSink] Check block prologue does not clobber uses to [MachineSink] Check block prologue interference.
Mar 21 2022, 7:08 PM · Restricted Project, Restricted Project
critson added inline comments to D122200: [AMDGPU] Split waterfall loop exec manipulation.
Mar 21 2022, 7:00 PM · Restricted Project, Restricted Project
critson requested review of D122200: [AMDGPU] Split waterfall loop exec manipulation.
Mar 21 2022, 6:50 PM · Restricted Project, Restricted Project

Mar 20 2022

critson accepted D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3..

LGTM, with a few minor nits.

Mar 20 2022, 7:54 PM · Restricted Project, Restricted Project

Mar 17 2022

critson added a comment to D121277: [MachineSink] Check block prologue interference.

Thanks!
I won't submit this for at least 72 hours.

Mar 17 2022, 5:58 PM · Restricted Project, Restricted Project
critson updated the diff for D121277: [MachineSink] Check block prologue interference.
  • Incorporate reviewer feedback.
Mar 17 2022, 5:57 PM · Restricted Project, Restricted Project
critson updated the diff for D121277: [MachineSink] Check block prologue interference.
  • Address case of sunk instruction clobbering register partially defined in block prologue
Mar 17 2022, 2:55 AM · Restricted Project, Restricted Project

Mar 15 2022

critson added a comment to D121277: [MachineSink] Check block prologue interference.

Try this. I don't know what would be the most likely pattern in real cases, this is just to show that sinking $sgpr0_sgpr1 definition instruction into successor block would break the register dependency of type (a).

snipped
Mar 15 2022, 11:22 PM · Restricted Project, Restricted Project
critson updated the diff for D121277: [MachineSink] Check block prologue interference.
  • Generalize to any target defined block prologue instruction interfering with a sink candidate
  • Extend to pre-RA machine sink
Mar 15 2022, 11:18 PM · Restricted Project, Restricted Project
critson added a comment to D121277: [MachineSink] Check block prologue interference.

I agree this issue should be fixed here. Machine sinking should check for register dependency between the sunk instruction and the prologue instruction in the successor block.
But I think there are two kinds of register dependency need to be checked:

a.) the definition of a register which would be used in successor prologue instruction.
b.) the instruction which has a source physical register being overwritten by successor prologue instruction.

I think you are fixing the second kind of dependency currently. Maybe check the first kind of dependency in the same change?
The first kind of def-use dependency should also apply to the pre-RA sinking.

Mar 15 2022, 4:33 PM · Restricted Project, Restricted Project
critson requested changes to D119696: [AMDGPU] Improve v_cmpx usage on GFX10.3..

Need to fix missing TRI for modifiesRegister calls.
Also please add a test of the form:

$sgpr0 = S_BFE_U32 killed renamable $sgpr3, 524296, implicit-def dead $scc
$vcc = V_CMP_GT_U32_e64 killed $sgpr0, killed $vgpr0, implicit $exec
$sgpr0_sgpr1 = COPY $exec, implicit-def $exec
$sgpr0_sgpr1 = S_AND_B64 killed renamable $sgpr0_sgpr1, killed renamable $vcc, implicit-def dead $scc
$exec = S_MOV_B64_term killed renamable $sgpr0_sgpr
Mar 15 2022, 12:11 AM · Restricted Project, Restricted Project

Mar 14 2022

critson added a comment to D121277: [MachineSink] Check block prologue interference.

LGTM.

Mar 14 2022, 5:11 PM · Restricted Project, Restricted Project
critson added a comment to D117014: AMDGPU: Use removeAllRegUnitsForPhysReg().

No objections from me to you submitting this.
I assume the posted bug report would be sufficient as a starting point for someone (perhaps myself) to try and root cause the why liveness is being introduced for phys subregs?

Mar 14 2022, 1:38 AM · Restricted Project, Restricted Project

Mar 9 2022

critson added a comment to D121277: [MachineSink] Check block prologue interference.

What do you mean by "block prologue"? Can you tell me which of the 2 COPYs in your test is falsely moved with your test and which instruction clobbers the register?

(Intuitively I would expect a live-in value to be available after all PHI instructions so I don't see why there can be a clobber when something is inserted at the beginning of the basic block...)

Mar 9 2022, 6:44 PM · Restricted Project, Restricted Project
critson updated the diff for D121277: [MachineSink] Check block prologue interference.
  • Rebase on top of pre-committed test
Mar 9 2022, 6:39 PM · Restricted Project, Restricted Project
critson committed rG3cb9af1be2b4: [MachineSink] Pre-commit test for D121277. NFC. (authored by critson).
[MachineSink] Pre-commit test for D121277. NFC.
Mar 9 2022, 6:33 PM · Restricted Project
critson abandoned D121268: [AMDGPU] Control flow pseudos are not part of block prologue.

This is unnecessary, the remaining pseudo instructions are terminators so already skipped.

Mar 9 2022, 4:00 AM · Restricted Project, Restricted Project
critson added a comment to D121268: [AMDGPU] Control flow pseudos are not part of block prologue.

I need to rewrite the test.

Mar 9 2022, 1:34 AM · Restricted Project, Restricted Project
critson requested review of D121277: [MachineSink] Check block prologue interference.
Mar 9 2022, 1:33 AM · Restricted Project, Restricted Project

Mar 8 2022

critson requested review of D121268: [AMDGPU] Control flow pseudos are not part of block prologue.
Mar 8 2022, 6:55 PM · Restricted Project, Restricted Project

Mar 2 2022

critson added inline comments to D120855: [AMDGPU] gfx940 uses new names for coherency bits.
Mar 2 2022, 8:04 PM · Restricted Project, Restricted Project

Feb 26 2022

critson committed rG2bbe6506d4a9: [AMDGPU] Remove redundant isVALU in SIPreEmitPeephole. NFC (authored by critson).
[AMDGPU] Remove redundant isVALU in SIPreEmitPeephole. NFC
Feb 26 2022, 11:10 PM

Feb 24 2022

critson added inline comments to D120202: [AMDGPU] Extend pre-emit peephole to redundantly masked VCC.
Feb 24 2022, 5:25 PM · Restricted Project
critson committed rG565af157efac: [AMDGPU] Extend pre-emit peephole to redundantly masked VCC (authored by critson).
[AMDGPU] Extend pre-emit peephole to redundantly masked VCC
Feb 24 2022, 5:19 PM
critson closed D120202: [AMDGPU] Extend pre-emit peephole to redundantly masked VCC.
Feb 24 2022, 5:19 PM · Restricted Project

Feb 21 2022

critson updated the diff for D120202: [AMDGPU] Extend pre-emit peephole to redundantly masked VCC.
  • Add V_CMP_CLASS support
  • Add MIR tests
Feb 21 2022, 9:40 PM · Restricted Project

Feb 19 2022

critson requested review of D120202: [AMDGPU] Extend pre-emit peephole to redundantly masked VCC.
Feb 19 2022, 8:09 PM · Restricted Project

Feb 15 2022

critson committed rGef949ecba574: [MachineSink] Use SkipPHIsAndLabels for sink insertion points (authored by critson).
[MachineSink] Use SkipPHIsAndLabels for sink insertion points
Feb 15 2022, 7:45 PM
critson closed D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
Feb 15 2022, 7:45 PM · Restricted Project

Feb 14 2022

critson updated the diff for D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
  • Address reviewer comments
  • Further prune test
Feb 14 2022, 4:34 PM · Restricted Project

Feb 13 2022

critson retitled D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points from [MachineSink] Allow target to adjust sink insertion point to [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
Feb 13 2022, 6:49 PM · Restricted Project
critson added inline comments to D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
Feb 13 2022, 6:46 PM · Restricted Project
critson updated the diff for D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
  • Rework to fixing MachineSink with SkipPHIsAndLabels
Feb 13 2022, 6:45 PM · Restricted Project

Feb 10 2022

critson added inline comments to D119235: [AMDGPU][NFC] Fix typos.
Feb 10 2022, 12:56 AM · Restricted Project

Feb 9 2022

critson added a comment to D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.

The test is based on new wave transform control flow, and does not occur with current control flow which uses pseudo instructions which are typically lowered after machine sinking.

Feb 9 2022, 7:05 PM · Restricted Project
critson requested review of D119399: [MachineSink] Use SkipPHIsAndLabels for sink insertion points.
Feb 9 2022, 7:03 PM · Restricted Project

Feb 7 2022

critson committed rG42ac4e1a120c: [MachineLICM] Add shouldHoist method to TargetInstrInfo (authored by critson).
[MachineLICM] Add shouldHoist method to TargetInstrInfo
Feb 7 2022, 10:53 PM
critson closed D118773: [MachineLICM] Add shouldHoist method to TargetInstrInfo.
Feb 7 2022, 10:53 PM · Restricted Project
critson committed rGa1fb307b4b8d: [AMDGPU] Allow hoisting of some VALU compare instructions (authored by critson).
[AMDGPU] Allow hoisting of some VALU compare instructions
Feb 7 2022, 6:28 PM
critson closed D118975: [AMDGPU] Allow hoisting of some VALU compare instructions.
Feb 7 2022, 6:28 PM · Restricted Project
critson updated the diff for D118975: [AMDGPU] Allow hoisting of some VALU compare instructions.
  • Address reviewer comments.
Feb 7 2022, 6:27 PM · Restricted Project
critson added a comment to D118975: [AMDGPU] Allow hoisting of some VALU compare instructions.

I suspect that V_CNDMASK uses are fairly common, and those implicitly mask with EXEC as well.

This also doesn't address more complex cases like

%0 = V_CMP ...
%1 = V_CMP ...
%2 = S_OR %0, %1
V_CNDMASK lhs, rhs, %2

though perhaps the change is good as-is for now.

Feb 7 2022, 6:25 PM · Restricted Project
critson updated the diff for D118975: [AMDGPU] Allow hoisting of some VALU compare instructions.
  • Address reviewer comments.
Feb 7 2022, 12:11 AM · Restricted Project

Feb 4 2022

critson requested review of D118975: [AMDGPU] Allow hoisting of some VALU compare instructions.
Feb 4 2022, 1:18 AM · Restricted Project

Feb 2 2022

critson updated the diff for D118773: [MachineLICM] Add shouldHoist method to TargetInstrInfo.
  • Address reviewer comments.
Feb 2 2022, 11:57 PM · Restricted Project
critson requested review of D118773: [MachineLICM] Add shouldHoist method to TargetInstrInfo.
Feb 2 2022, 3:51 AM · Restricted Project

Feb 1 2022

critson added a comment to D118185: [AMDGPU] Mark v_cmp* convergent.

As an aside, I am working on a patch to add a "shouldHoist" to control the behaviour of MachineLICM.
This mostly mirrors the shouldSink that already exists.

Feb 1 2022, 1:27 AM · Restricted Project

Jan 25 2022

critson added a comment to D117482: AMDGPU: Don't clobber source register for V_SET_INACTIVE_*.

If I understand correctly, the root issue is the Register Coalescer not being aware of the wave-level CFG?
It is unfortunate that we have to add these copies, but I agree this seems the simplest fix right now.

Jan 25 2022, 11:48 PM · Restricted Project
critson accepted D118096: [AMDGPU][SIWholeQuadMode] Use the right VCC register to activate the correct lanes..

LGTM

Jan 25 2022, 12:17 AM · Restricted Project