kerbowa (Austin Kerbow)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 31 2018, 12:07 PM (183 w, 1 d)

Recent Activity

Yesterday

kerbowa abandoned D127665: [AMDGPU] Remove FillMFMAShadowMutation after switch to postmisched.
Tue, Jul 5, 10:16 PM · Restricted Project, Restricted Project
kerbowa requested review of D129172: [AMDGPU] Disable FillMFMAShadowMutation by default.
Tue, Jul 5, 10:15 PM · Restricted Project, Restricted Project

Thu, Jun 30

kerbowa added a comment to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

JFYI, I'm planning to reintroduce this change in the next few weeks. Before this change, the compiler would ALWAYS wait for outstanding VMEM/LGKM at barriers. There is no HW requirement for this. As an optimization, we will be omitting these waitcnt on Navi/MI200. In order to retain the same behavior as before the change, fences must be added before barriers.

Thu, Jun 30, 9:56 PM · Restricted Project, Restricted Project

Wed, Jun 22

kerbowa accepted D128313: AMDGPU: Use isMeta flags on pseudoinstructions.

LGTM

Wed, Jun 22, 6:57 PM · Restricted Project, Restricted Project

Tue, Jun 21

kerbowa added a comment to D128313: AMDGPU: Use isMeta flags on pseudoinstructions.

Any idea why there is this slight change to the scheduling in the test? This patch will definitely be helpful especially for sched_group_barrier if it doesn't change the way any edges are added. Thanks!

Tue, Jun 21, 10:59 PM · Restricted Project, Restricted Project

Mon, Jun 20

kerbowa added a comment to D128158: [AMDGPU] Add amdgcn_sched_group_barrier builtin.

Somewhat WIP; needs more tests and cleanup. Posted for dependent work.

Mon, Jun 20, 12:15 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa requested review of D128158: [AMDGPU] Add amdgcn_sched_group_barrier builtin.
Mon, Jun 20, 12:13 AM · Restricted Project, Restricted Project, Restricted Project

Thu, Jun 16

kerbowa added inline comments to D127994: [AMDGPU] Expose CLI controls for IGroup ordering.
Thu, Jun 16, 4:24 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D127994: [AMDGPU] Expose CLI controls for IGroup ordering.
Thu, Jun 16, 12:18 PM · Restricted Project, Restricted Project

Wed, Jun 15

kerbowa committed rG4bba82116a1c: [AMDGPU] Fix buildbot failures after 48ebc1af29 (authored by kerbowa).
Wed, Jun 15, 12:30 AM · Restricted Project, Restricted Project

Tue, Jun 14

kerbowa committed rG48ebc1af2948: [AMDGPU] Add more expressive sched_barrier controls (authored by kerbowa).
Tue, Jun 14, 10:24 PM · Restricted Project, Restricted Project
kerbowa closed D127123: [AMDGPU] Add more expressive sched_barrier controls.
Tue, Jun 14, 10:24 PM · Restricted Project, Restricted Project
kerbowa committed rGbd9eed3aecc6: [AMDGPU] Add isMFMA helper function. NFC (authored by kerbowa).
Tue, Jun 14, 10:02 PM · Restricted Project, Restricted Project
kerbowa closed D127124: [AMDGPU] Add isMFMA helper function. NFC.
Tue, Jun 14, 10:02 PM · Restricted Project, Restricted Project

Mon, Jun 13

kerbowa abandoned D127666: [AMDGPU] Remove ShouldPreferAnother after switch to postmisched.
Mon, Jun 13, 3:34 PM · Restricted Project, Restricted Project
kerbowa requested review of D127666: [AMDGPU] Remove ShouldPreferAnother after switch to postmisched.
Mon, Jun 13, 8:56 AM · Restricted Project, Restricted Project
kerbowa requested review of D127665: [AMDGPU] Remove FillMFMAShadowMutation after switch to postmisched.
Mon, Jun 13, 8:55 AM · Restricted Project, Restricted Project

Sun, Jun 12

kerbowa updated the diff for D127123: [AMDGPU] Add more expressive sched_barrier controls.

Fix test after name change.

Sun, Jun 12, 5:37 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D127123: [AMDGPU] Add more expressive sched_barrier controls.

Fix cl options.

Sun, Jun 12, 5:28 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D127123: [AMDGPU] Add more expressive sched_barrier controls.
Sun, Jun 12, 5:25 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D127123: [AMDGPU] Add more expressive sched_barrier controls.

Address comments.

Sun, Jun 12, 5:23 PM · Restricted Project, Restricted Project

Sat, Jun 11

kerbowa updated the diff for D127124: [AMDGPU] Add isMFMA helper function. NFC.

Use function directly instead of wrapping in lambda.

Sat, Jun 11, 6:16 PM · Restricted Project, Restricted Project

Mon, Jun 6

kerbowa requested review of D127124: [AMDGPU] Add isMFMA helper function. NFC.
Mon, Jun 6, 9:01 AM · Restricted Project, Restricted Project
kerbowa requested review of D127123: [AMDGPU] Add more expressive sched_barrier controls.
Mon, Jun 6, 8:57 AM · Restricted Project, Restricted Project

May 31 2022

kerbowa accepted D125997: [AMDGPU] Instruction Type Pipeline.
May 31 2022, 10:43 AM · Restricted Project, Restricted Project

May 26 2022

kerbowa added a comment to D125997: [AMDGPU] Instruction Type Pipeline.

Thanks! Good catch on a region that ends with a bundle. I think your last diff overwrote the changes before last. I.e. the added comments and renamed cl opts are gone.

May 26 2022, 5:09 PM · Restricted Project, Restricted Project

May 24 2022

kerbowa accepted D125997: [AMDGPU] Instruction Type Pipeline.

As a basis for our planned extensions, this looks OK.

May 24 2022, 10:53 PM · Restricted Project, Restricted Project

May 19 2022

kerbowa added a comment to D125997: [AMDGPU] Instruction Type Pipeline.

Can you name these files something that makes it clear this is a DAG mutation involving MFMA? I'm also not sure "pipeline" is the correct terminology here, though I know that library folks have been calling it that.

May 19 2022, 1:09 PM · Restricted Project, Restricted Project

May 11 2022

kerbowa committed rG2db700215a2e: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic (authored by kerbowa).
May 11 2022, 1:42 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa closed D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.
May 11 2022, 1:41 PM · Restricted Project, Restricted Project, Restricted Project

May 9 2022

kerbowa accepted D124678: [AMDGPU] Allow for MFMA Inst Clustering.
May 9 2022, 8:46 AM · Restricted Project, Restricted Project

May 6 2022

kerbowa accepted D124678: [AMDGPU] Allow for MFMA Inst Clustering.

LGTM with nits. Thanks!

May 6 2022, 3:03 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.

Use i32.
Output hex.
Fix hazard rec tests for pseudo instructions.

May 6 2022, 2:22 PM · Restricted Project, Restricted Project, Restricted Project

May 4 2022

kerbowa added inline comments to D124678: [AMDGPU] Allow for MFMA Inst Clustering.
May 4 2022, 9:41 AM · Restricted Project, Restricted Project

May 2 2022

kerbowa added inline comments to D124678: [AMDGPU] Allow for MFMA Inst Clustering.
May 2 2022, 6:44 PM · Restricted Project, Restricted Project
kerbowa added a comment to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

We are going to insert amdgcn.s.waitcnt instead of a fence because fences wait for memory stores, which we don't want.

May 2 2022, 5:54 PM · Restricted Project, Restricted Project

Apr 29 2022

kerbowa added a comment to D124678: [AMDGPU] Allow for MFMA Inst Clustering.

Regarding your note -- yes, this is something I spent some time thinking about -- SDep::Cluster doesn't guarantee a single cluster. In fact, I believe there is a hardware dependency between MFMAs, so the scheduler will try to fill this gap with an independent instruction.

This is a conflicting thing: we need to make sure it does not succeed in filling the gap. It probably needs some tweaking in FillMFMAShadowMutation and GCNHazardRecognizer::ShouldPreferAnother if this option is set. In any case, you need some more tests with different clustering/non-clustering scenarios, and you need to check the final code: do we get the resulting clusters? Especially given that the post-RA scheduler will try to tear them apart.

It may be that we want the cluster edges to be a suggestion rather than a hard limit. The cluster edges already work this way, but the priority for them is low, so it usually doesn't matter.

Apr 29 2022, 5:25 PM · Restricted Project, Restricted Project
kerbowa added a comment to D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.

You do not handle masks other than 0 yet?

We handle 0 and 1 only.

Do you mean 1 is supported simply because it has side effects? If I understand it right you will need to remove this to support more flexible masks, right?

Apr 29 2022, 5:23 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa added a comment to D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.

You do not handle masks other than 0 yet?

Apr 29 2022, 3:37 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa added reviewers for D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic: rampitec, vangthao95, jrbyrnes, foad, arsenm.
Apr 29 2022, 2:50 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.

Add mir tests.

Apr 29 2022, 2:48 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa requested review of D124700: [AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic.
Apr 29 2022, 2:28 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa added inline comments to D124678: [AMDGPU] Allow for MFMA Inst Clustering.
Apr 29 2022, 10:33 AM · Restricted Project, Restricted Project
kerbowa added a comment to D124678: [AMDGPU] Allow for MFMA Inst Clustering.

Would be nice to have some tests that show the results of the clustering as well.

Apr 29 2022, 10:15 AM · Restricted Project, Restricted Project

Apr 28 2022

kerbowa accepted D124647: [NFC] Fix typo.

LGTM

Apr 28 2022, 5:17 PM · Restricted Project, Restricted Project

Apr 21 2022

kerbowa added a comment to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

I believe HIP adds a release fence before the barrier and an acquire directly after. Would something like that make sense?

Apr 21 2022, 11:00 PM · Restricted Project, Restricted Project

Apr 18 2022

kerbowa added a comment to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

Reverted until I can investigate the failures.

Apr 18 2022, 9:37 PM · Restricted Project, Restricted Project
kerbowa added a reverting change for rG8d0c34fd4fb6: [AMDGPU] Omit unnecessary waitcnt before barriers: rG7f97ac94f713: Revert "[AMDGPU] Omit unnecessary waitcnt before barriers".
Apr 18 2022, 9:36 PM · Restricted Project
kerbowa committed rG7f97ac94f713: Revert "[AMDGPU] Omit unnecessary waitcnt before barriers" (authored by kerbowa).
Apr 18 2022, 9:35 PM · Restricted Project, Restricted Project
kerbowa added a reverting change for D120544: [AMDGPU] Omit unnecessary waitcnt before barriers: rG7f97ac94f713: Revert "[AMDGPU] Omit unnecessary waitcnt before barriers".
Apr 18 2022, 9:35 PM · Restricted Project, Restricted Project

Apr 7 2022

kerbowa committed rG26b14c3ea77f: [InferAddressSpaces] Fix assert on invalid bitcast placement (authored by kerbowa).
Apr 7 2022, 8:13 PM · Restricted Project, Restricted Project
kerbowa closed D122964: [InferAddressSpaces] Fix assert on invalid bitcast placement.
Apr 7 2022, 8:12 PM · Restricted Project, Restricted Project

Apr 1 2022

kerbowa requested review of D122964: [InferAddressSpaces] Fix assert on invalid bitcast placement.
Apr 1 2022, 11:13 PM · Restricted Project, Restricted Project

Mar 23 2022

kerbowa committed rG1e15adba62a9: [AMDGPU] Add s_nop WaitStates between neighboring mfma (authored by kerbowa).
Mar 23 2022, 1:57 PM · Restricted Project
kerbowa closed D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma.
Mar 23 2022, 1:57 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma.
Mar 23 2022, 11:41 AM · Restricted Project, Restricted Project

Mar 21 2022

kerbowa updated the diff for D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma.

Remove gfx90a for now.
Don't parse HWXDL proc resource since all gfx908 MFMA use HWXDL.
Add more detailed comments.

Mar 21 2022, 10:38 AM · Restricted Project, Restricted Project

Mar 12 2022

kerbowa committed rG62bcfcb5a588: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic (authored by kerbowa).
Mar 12 2022, 10:16 PM · Restricted Project
kerbowa closed D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.
Mar 12 2022, 10:16 PM · Restricted Project, Restricted Project, Restricted Project

Mar 10 2022

kerbowa updated the diff for D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma.

Add early exit.

Mar 10 2022, 10:35 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.

Add clang builtin and tests.

Mar 10 2022, 7:00 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa requested review of D121437: [AMDGPU] Add s_nop WaitStates between neighboring mfma.
Mar 10 2022, 6:48 PM · Restricted Project, Restricted Project

Mar 7 2022

kerbowa committed rG0c0636f7822d: [AMDGPU] Fix uninitialized value after 8d0c34fd4f (authored by kerbowa).
Mar 7 2022, 11:34 AM · Restricted Project
kerbowa added inline comments to D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.
Mar 7 2022, 10:10 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.

Update type to match encoding.

Mar 7 2022, 10:08 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa added inline comments to D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.
Mar 7 2022, 9:19 AM · Restricted Project, Restricted Project, Restricted Project
Herald added a project to D102536: [AMDGPU] Rename MUBUF_Invalidate to MUBUF_CacheControl: Restricted Project.

ping

Mar 7 2022, 9:07 AM · Restricted Project, Restricted Project
kerbowa updated the diff for D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.

Remove gisel test and add gisel runline instead.
Test out-of-bounds values.

Mar 7 2022, 8:53 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa committed rG8d0c34fd4fb6: [AMDGPU] Omit unnecessary waitcnt before barriers (authored by kerbowa).
Mar 7 2022, 8:26 AM · Restricted Project
kerbowa closed D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.
Mar 7 2022, 8:25 AM · Restricted Project, Restricted Project

Mar 4 2022

kerbowa requested review of D120976: [AMDGPU] Add llvm.amdgcn.s.setprio intrinsic.
Mar 4 2022, 12:27 AM · Restricted Project, Restricted Project, Restricted Project

Mar 3 2022

kerbowa abandoned D120587: [AMDGPU] Use workgroup fences in test waitcnt-vscnt.ll.
Mar 3 2022, 10:43 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

Update test to use -mattr.

Mar 3 2022, 10:42 PM · Restricted Project, Restricted Project

Feb 25 2022

kerbowa added a comment to D120587: [AMDGPU] Use workgroup fences in test waitcnt-vscnt.ll.

It does not seem to do what it did before. For example, it used to test that only a vmcnt is produced, or only a vscnt. Maybe duplicate these tests instead to create _workgroup versions (like the existing barrier_vmcnt_vscnt_flat_workgroup)?

Feb 25 2022, 12:57 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.
Feb 25 2022, 12:48 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

Fix typo and rebase.

Feb 25 2022, 12:44 PM · Restricted Project, Restricted Project
kerbowa requested review of D120587: [AMDGPU] Use workgroup fences in test waitcnt-vscnt.ll.
Feb 25 2022, 12:36 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.
Feb 25 2022, 12:13 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.
Feb 25 2022, 11:51 AM · Restricted Project, Restricted Project
kerbowa added a comment to D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.

The test waitcnt-vscnt.ll is kind of muddied by this patch. It was relying on the behavior of s_barrier always adding s_waitcnt_vscnt at barriers if there were any outstanding events. The singlethread fences were not actually doing anything. I've added workgroup fences instead. With workgroup fences we currently always add the waitcnt. Once my memory legalizer/waitcnt overhaul optimizes these, the test will look as it did before. Consider this a precommit of this test for now.

Feb 25 2022, 12:58 AM · Restricted Project, Restricted Project
kerbowa requested review of D120544: [AMDGPU] Omit unnecessary waitcnt before barriers.
Feb 25 2022, 12:55 AM · Restricted Project, Restricted Project

Feb 21 2022

kerbowa added inline comments to D119475: [AMDGPU] Add scheduler pass to rematerialize trivial defs.
Feb 21 2022, 5:51 PM · Restricted Project, Restricted Project

Feb 11 2022

kerbowa committed rG0bb25b46034a: [InferAddressSpaces] Fix assert on invalid cast ordering (authored by kerbowa).
Feb 11 2022, 10:03 AM
kerbowa closed D119524: [InferAddressSpaces] Fix assert on invalid cast ordering.
Feb 11 2022, 10:03 AM · Restricted Project

Feb 10 2022

kerbowa requested review of D119524: [InferAddressSpaces] Fix assert on invalid cast ordering.
Feb 10 2022, 11:28 PM · Restricted Project
kerbowa added inline comments to D119022: [AMDGPU] Fix debug values in scheduler not placed correctly when reverting.
Feb 10 2022, 4:38 PM · Restricted Project

Jan 11 2022

kerbowa committed rG8470bf2b0884: [AMDGPU] Do not reserve any VGPR for SGPR spills (authored by kerbowa).
Jan 11 2022, 10:16 PM
kerbowa closed D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.
Jan 11 2022, 10:15 PM · Restricted Project

Jan 9 2022

kerbowa updated the diff for D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.

Remove dead code.

Jan 9 2022, 10:20 PM · Restricted Project

Dec 30 2021

kerbowa added a reviewer for D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills: ruiling.
Dec 30 2021, 12:21 AM · Restricted Project
kerbowa updated the diff for D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.

Update WillHaveFP prediction.

Dec 30 2021, 12:12 AM · Restricted Project

Dec 10 2021

kerbowa updated the diff for D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.

Add tail call test.

Dec 10 2021, 2:33 PM · Restricted Project
kerbowa added inline comments to D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.
Dec 10 2021, 2:04 PM · Restricted Project
kerbowa requested review of D115551: [AMDGPU] Do not reserve any VGPR for SGPR spills.
Dec 10 2021, 1:45 PM · Restricted Project

Dec 1 2021

kerbowa added a comment to D114777: [AMDGPU] Set most sched model resource's BufferSize to one.

Would it be possible to write some llvm-mca tests that show exactly how this change affects the scheduling model? schedule-ilp.mir is better than nothing, but it is still a very indirect way of observing what's going on under the covers.

Dec 1 2021, 10:39 PM · Restricted Project
kerbowa committed rGda067ed569e0: [AMDGPU] Set most sched model resource's BufferSize to one (authored by kerbowa).
Dec 1 2021, 10:35 PM
kerbowa closed D114777: [AMDGPU] Set most sched model resource's BufferSize to one.
Dec 1 2021, 10:35 PM · Restricted Project

Nov 30 2021

kerbowa requested review of D114777: [AMDGPU] Set most sched model resource's BufferSize to one.
Nov 30 2021, 12:26 AM · Restricted Project

Nov 18 2021

kerbowa accepted D114202: [AMDGPU] Fix SIPostRABundler crash on null register used by dbg value.

LGTM

Nov 18 2021, 4:52 PM · Restricted Project

Oct 26 2021

kerbowa committed rG02e60f2e7725: [AMDGPU] Use max waves for scheduler's initial occupancy target (authored by kerbowa).
Oct 26 2021, 3:31 PM