kerbowa (Austin Kerbow)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 31 2018, 12:07 PM (230 w, 2 d)

Recent Activity

Tue, May 30

kerbowa accepted D149393: [AMDGPU][IGLP] Parameterize the SchedGroup processing / linking in Solver.

LGTM, thanks!

Tue, May 30, 8:53 AM · Restricted Project, Restricted Project

Thu, May 25

kerbowa updated the diff for D151126: [AMDGPU] Don't flush vmcnt for loops with use/def pairs.

Use member function.

Thu, May 25, 9:45 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D149393: [AMDGPU][IGLP] Parameterize the SchedGroup processing / linking in Solver.
Thu, May 25, 2:04 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D149393: [AMDGPU][IGLP] Parameterize the SchedGroup processing / linking in Solver.
Thu, May 25, 10:36 AM · Restricted Project, Restricted Project

Wed, May 24

kerbowa accepted D151311: [ScheduleDAG] Fix error assert target.
Wed, May 24, 8:56 PM · Restricted Project, Restricted Project
kerbowa added a comment to D151311: [ScheduleDAG] Fix error assert target.

LGTM, thanks!

Wed, May 24, 8:56 PM · Restricted Project, Restricted Project

Mon, May 22

kerbowa requested review of D151126: [AMDGPU] Don't flush vmcnt for loops with use/def pairs.
Mon, May 22, 10:50 AM · Restricted Project, Restricted Project

Mon, May 8

kerbowa added inline comments to D149393: [AMDGPU][IGLP] Parameterize the SchedGroup processing / linking in Solver.
Mon, May 8, 9:42 AM · Restricted Project, Restricted Project

Apr 27 2023

kerbowa added inline comments to D149332: [AMDGPU] Also consider global and scratch instructions when flushing vmcnt counter in loop preheader.
Apr 27 2023, 8:14 AM · Restricted Project, Restricted Project

Apr 3 2023

kerbowa added inline comments to D147363: [AMDGPU] Add target hook to isGlobalMemoryObject.
Apr 3 2023, 8:18 AM · Restricted Project, Restricted Project

Apr 2 2023

kerbowa added a comment to D146774: [AMDGPU][IGLP]: Add rules to SchedGroups.

I like the general approach. It seems like things could get unwieldy with larger SchedGroups. You would need to have lots of checks vs Collection.size() which could be somewhat hard to work with.

Apr 2 2023, 7:01 PM · Restricted Project, Restricted Project

Mar 31 2023

kerbowa requested review of D147363: [AMDGPU] Add target hook to isGlobalMemoryObject.
Mar 31 2023, 4:39 PM · Restricted Project, Restricted Project

Mar 19 2023

kerbowa committed rG09f756c8800a: [AMDGPU] Add release note for ommited barrier waitcnt (authored by kerbowa).
[AMDGPU] Add release note for ommited barrier waitcnt
Mar 19 2023, 9:18 PM · Restricted Project, Restricted Project
kerbowa closed D146353: [AMDGPU] Add release note for ommited barrier waitcnt.
Mar 19 2023, 9:18 PM · Restricted Project, Restricted Project

Mar 17 2023

kerbowa requested review of D146353: [AMDGPU] Add release note for ommited barrier waitcnt.
Mar 17 2023, 11:46 PM · Restricted Project, Restricted Project
kerbowa committed rG864a2b25beac: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting (authored by kerbowa).
[AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting
Mar 17 2023, 9:12 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa closed D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.
Mar 17 2023, 9:12 PM · Restricted Project, Restricted Project, Restricted Project

Mar 13 2023

kerbowa added a comment to D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.

Added AMDGPU group to reviewers.

Mar 13 2023, 8:32 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa added a reviewer for D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting: Restricted Project.
Mar 13 2023, 8:24 AM · Restricted Project, Restricted Project, Restricted Project

Mar 8 2023

kerbowa updated the diff for D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.

Update tests.

Mar 8 2023, 11:38 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa added a comment to D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.

Actually, this is breaking tests with non-HSA. Is TargetID relevant for pal/graphics/etc., @foad, or should the default there be XNACK- in the absence of any explicit subtarget features being added?

Mar 8 2023, 11:20 AM · Restricted Project, Restricted Project, Restricted Project

Mar 7 2023

kerbowa added inline comments to D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.
Mar 7 2023, 5:11 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.

Add readelf run-line to test.

Mar 7 2023, 9:48 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa added inline comments to D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.
Mar 7 2023, 9:40 AM · Restricted Project, Restricted Project, Restricted Project

Mar 6 2023

kerbowa requested review of D145401: [AMDGPU] Reserve extra SGPR blocks wth XNACK "any" TID Setting.
Mar 6 2023, 10:28 AM · Restricted Project, Restricted Project, Restricted Project

Feb 14 2023

kerbowa accepted D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.

LGTM. Can you renumber the regs in the test, please?

Feb 14 2023, 12:00 PM · Restricted Project, Restricted Project

Feb 4 2023

kerbowa added a comment to D129667: [AMDGPU] Update the mechanism used to check for cycles and add eges in power-sched mutation.

Didn't this land?

Feb 4 2023, 10:58 AM · Restricted Project, Restricted Project

Jan 25 2023

kerbowa committed rG913837eaa3d1: [ScheduleDAG] Fix removing edges with weak deps (authored by kerbowa).
[ScheduleDAG] Fix removing edges with weak deps
Jan 25 2023, 10:06 AM · Restricted Project, Restricted Project
kerbowa closed D142325: [ScheduleDAG] Fix removing edges with weak deps.
Jan 25 2023, 10:06 AM · Restricted Project, Restricted Project

Jan 22 2023

kerbowa accepted D142262: [AMDGPU] Use more consistemt way to avoid overflow in the scheduler.
Jan 22 2023, 10:27 PM · Restricted Project, Restricted Project
kerbowa accepted D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.

LGTM.

Jan 22 2023, 10:26 PM · Restricted Project, Restricted Project
kerbowa requested review of D142325: [ScheduleDAG] Fix removing edges with weak deps.
Jan 22 2023, 9:58 PM · Restricted Project, Restricted Project

Jan 20 2023

kerbowa added a comment to D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.

Can you add some debug printouts so we know when this is being triggered in logs?

Jan 20 2023, 12:28 PM · Restricted Project, Restricted Project

Jan 18 2023

kerbowa accepted D142062: [AMDGPU] Treat WMMA the same as MFMA for sched_barrier.

LGTM

Jan 18 2023, 10:30 PM · Restricted Project, Restricted Project
kerbowa accepted D142051: [AMDGPU] Introduce separate register limit bias in scheduler.

LGTM, thanks!

Jan 18 2023, 10:25 PM · Restricted Project, Restricted Project

Jan 15 2023

kerbowa added a comment to D141728: [AMDGPU] Tune scheduler on GFX10 and GFX11.

I think this makes sense, but it also makes the concept of an occupancy target a misnomer.

Jan 15 2023, 11:53 PM · Restricted Project, Restricted Project

Jan 11 2023

kerbowa added a comment to D141379: [AMDGPU] Temporarily disable FeatureBackOffBarrier for GFX11.

Is it possible to just check for cumode? I don't really have a problem with this since on Navi we can't really take advantage of barriers being too conservative without addrspace info on fences, but it really only has an impact when compiling for cumode. Also, I thought the workaround was pretty simple.

Jan 11 2023, 9:51 AM · Restricted Project, Restricted Project

Dec 12 2022

kerbowa added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 12 2022, 10:37 PM · Restricted Project, Restricted Project

Dec 9 2022

kerbowa committed rGf9c76a119834: [AMDGPU] Update MFMASmallGemmOpt with better performing stategy (authored by kerbowa).
[AMDGPU] Update MFMASmallGemmOpt with better performing stategy
Dec 9 2022, 7:10 PM · Restricted Project, Restricted Project
kerbowa closed D139227: [AMDGPU] Update MFMASmallGemmOpt with better performing stategy.
Dec 9 2022, 7:10 PM · Restricted Project, Restricted Project
kerbowa added a comment to D139227: [AMDGPU] Update MFMASmallGemmOpt with better performing stategy.

Hi Austin,

Changes look fine -- and if experiments show it has better performance then I suppose it is better. But the pipeline seems rather arbitrary -- in fact, in the test the previous pipeline fits the requirements of the new one. Maybe since the DAG is less constrained the scheduler is better able to produce an improved schedule?

Also, a pipeline with 3x as many MFMA SchedGroups as there are MFMAs is an impossible pipeline. I assume you also tried I < MFMACount?

Dec 9 2022, 2:02 PM · Restricted Project, Restricted Project

Dec 2 2022

kerbowa requested review of D139227: [AMDGPU] Update MFMASmallGemmOpt with better performing stategy.
Dec 2 2022, 1:35 PM · Restricted Project, Restricted Project

Oct 17 2022

kerbowa added a comment to D136069: [AMDGPU] Scheduler: Don't revert the schedule if the register pressure isn't changed for a region.

Interesting. Why exactly does this improve compile time so much? I thought reverting scheduling wasn't exactly expensive and the RP tracking was the problem.

Oct 17 2022, 8:23 AM · Restricted Project, Restricted Project
kerbowa added a comment to D135733: AMDGPU: Treat asm as a hazard for all register read-after-write hazards.

Could you detect empty inline asms and do this only for the non-empty ones? In graphics we use empty inline asms as some kind of scheduling barrier. (Granted this is just a workaround for problems elsewhere, but I don't really want it broken or penalised just now.)

Oct 17 2022, 8:21 AM · Restricted Project, Restricted Project

Sep 7 2022

kerbowa accepted D133459: [AMDGPU] Fix liveness verifier error in hazard recognizer.

LGTM

Sep 7 2022, 4:14 PM · Restricted Project, Restricted Project
kerbowa accepted D133067: [AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR.

LGTM

Sep 7 2022, 2:03 PM · Restricted Project, Restricted Project

Sep 6 2022

kerbowa added inline comments to D133067: [AMDGPU] W/a hazard if 64 bit shift amount is a highest allocated VGPR.
Sep 6 2022, 9:52 PM · Restricted Project, Restricted Project

Aug 19 2022

kerbowa committed rGb0f4678b9058: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy (authored by kerbowa).
[AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy
Aug 19 2022, 3:50 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa closed D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.
Aug 19 2022, 3:50 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa added inline comments to D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.
Aug 19 2022, 12:51 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.

Address comments.

Aug 19 2022, 12:51 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.

Don't loop over all instructions again in pre-RA scheduler.

Aug 19 2022, 11:55 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.

Address comment.

Aug 19 2022, 8:53 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa updated the diff for D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.

Update pipeline, remove edges from iglp_opt.

Aug 19 2022, 12:46 AM · Restricted Project, Restricted Project, Restricted Project

Aug 17 2022

kerbowa requested review of D132079: [AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy.
Aug 17 2022, 3:55 PM · Restricted Project, Restricted Project, Restricted Project
kerbowa accepted D130797: [AMDGPU] Implement pipeline solver for non-trivial pipelines.

LGTM

Aug 17 2022, 1:40 PM · Restricted Project, Restricted Project

Aug 9 2022

kerbowa added inline comments to D130797: [AMDGPU] Implement pipeline solver for non-trivial pipelines.
Aug 9 2022, 6:27 PM · Restricted Project, Restricted Project

Aug 4 2022

kerbowa added inline comments to D130797: [AMDGPU] Implement pipeline solver for non-trivial pipelines.
Aug 4 2022, 11:47 PM · Restricted Project, Restricted Project
kerbowa committed rGb568cb10648f: [AMDGPU] Pre-commit tests for D130797 (authored by kerbowa).
[AMDGPU] Pre-commit tests for D130797
Aug 4 2022, 10:59 PM · Restricted Project, Restricted Project

Aug 2 2022

kerbowa committed rG3dfa5626434b: [AMDGPU] Add CL option for max-ilp scheduler. (authored by kerbowa).
[AMDGPU] Add CL option for max-ilp scheduler.
Aug 2 2022, 4:53 PM · Restricted Project, Restricted Project
kerbowa closed D131022: [AMDGPU] Add CL option for max-ilp scheduler..
Aug 2 2022, 4:53 PM · Restricted Project, Restricted Project
kerbowa added a comment to D131022: [AMDGPU] Add CL option for max-ilp scheduler..

I do not follow, what's the difference in passing -mllvm -misched vs -mllvm -another-option?

Aug 2 2022, 4:39 PM · Restricted Project, Restricted Project
kerbowa requested review of D131022: [AMDGPU] Add CL option for max-ilp scheduler..
Aug 2 2022, 2:21 PM · Restricted Project, Restricted Project
kerbowa committed rG40eec27618d0: [AMDGPU] Add llvm_unreachable to switch statement added in d7100b398. (authored by kerbowa).
[AMDGPU] Add llvm_unreachable to switch statement added in d7100b398.
Aug 2 2022, 1:48 PM · Restricted Project, Restricted Project
kerbowa committed rGd7100b398b76: [AMDGPU] Add GCNMaxILPSchedStrategy (authored by kerbowa).
[AMDGPU] Add GCNMaxILPSchedStrategy
Aug 2 2022, 1:21 PM · Restricted Project, Restricted Project
kerbowa closed D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.
Aug 2 2022, 1:21 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.

Address comments.

Aug 2 2022, 12:42 PM · Restricted Project, Restricted Project

Aug 1 2022

kerbowa added inline comments to D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.
Aug 1 2022, 11:55 AM · Restricted Project, Restricted Project
kerbowa added a comment to D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.

Is scheduling for maximum ILP the same thing as scheduling for minimum latency?

Aug 1 2022, 9:49 AM · Restricted Project, Restricted Project
kerbowa added inline comments to D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.
Aug 1 2022, 8:18 AM · Restricted Project, Restricted Project

Jul 31 2022

kerbowa requested review of D130869: [AMDGPU] Add GCNMaxILPSchedStrategy.
Jul 31 2022, 11:26 PM · Restricted Project, Restricted Project

Jul 30 2022

kerbowa added inline comments to D128158: [AMDGPU] Add amdgcn_sched_group_barrier builtin.
Jul 30 2022, 7:48 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa committed rG7898426a7244: [AMDGPU] Remove unused function (authored by kerbowa).
[AMDGPU] Remove unused function
Jul 30 2022, 7:48 AM · Restricted Project, Restricted Project

Jul 29 2022

kerbowa added a comment to D130797: [AMDGPU] Implement pipeline solver for non-trivial pipelines.

Thanks! I like the idea behind the greedy solver. Not sure about SchedGroupSU. Maybe just a map between SUs and lists of schedgroups? I think trying to track sched_group_barriers by their order and assigning that an index is a bit confusing.

Jul 29 2022, 2:35 PM · Restricted Project, Restricted Project
kerbowa committed rG2c82a126d762: [AMDGPU] Omit unnecessary waitcnt before barriers (authored by kerbowa).
[AMDGPU] Omit unnecessary waitcnt before barriers
Jul 29 2022, 11:22 AM · Restricted Project, Restricted Project
kerbowa closed D130722: [AMDGPU] Omit unnecessary waitcnt before barriers.
Jul 29 2022, 11:21 AM · Restricted Project, Restricted Project

Jul 28 2022

kerbowa added a comment to D130722: [AMDGPU] Omit unnecessary waitcnt before barriers.

What's the resolution of the Mesa problem?

Jul 28 2022, 12:33 PM · Restricted Project, Restricted Project
kerbowa added inline comments to D130677: [AMDGPU] Fix DGEMM hazard for GFX90a.
Jul 28 2022, 12:22 PM · Restricted Project, Restricted Project
kerbowa added a comment to D130722: [AMDGPU] Omit unnecessary waitcnt before barriers.

Resubmit https://reviews.llvm.org/D120544.

Jul 28 2022, 11:46 AM · Restricted Project, Restricted Project
kerbowa requested review of D130722: [AMDGPU] Omit unnecessary waitcnt before barriers.
Jul 28 2022, 11:45 AM · Restricted Project, Restricted Project
kerbowa committed rG0f93a45b118e: [AMDGPU] Add isMeta flag to SCHED_GROUP_BARRIER (authored by kerbowa).
[AMDGPU] Add isMeta flag to SCHED_GROUP_BARRIER
Jul 28 2022, 11:24 AM · Restricted Project, Restricted Project
kerbowa committed rGf5b21680d122: [AMDGPU] Add amdgcn_sched_group_barrier builtin (authored by kerbowa).
[AMDGPU] Add amdgcn_sched_group_barrier builtin
Jul 28 2022, 10:43 AM · Restricted Project, Restricted Project, Restricted Project
kerbowa closed D128158: [AMDGPU] Add amdgcn_sched_group_barrier builtin.
Jul 28 2022, 10:43 AM · Restricted Project, Restricted Project, Restricted Project

Jul 27 2022

kerbowa committed rGba0d079c7aa5: [AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions (authored by kerbowa).
[AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions
Jul 27 2022, 10:43 PM · Restricted Project, Restricted Project
kerbowa closed D130329: [AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions.
Jul 27 2022, 10:42 PM · Restricted Project, Restricted Project
kerbowa added a comment to D130654: [AMDGPU] Consider S_SETPRIO a scheduling boundary.

Why not make it a scheduling boundary?

Jul 27 2022, 1:24 PM · Restricted Project, Restricted Project
kerbowa added a comment to D130654: [AMDGPU] Consider S_SETPRIO a scheduling boundary.

Why not make it a scheduling boundary?

Jul 27 2022, 1:23 PM · Restricted Project, Restricted Project

Jul 26 2022

kerbowa committed rG7ca9e471fe5b: [AMDGPU] Start refactoring GCNSchedStrategy (authored by kerbowa).
[AMDGPU] Start refactoring GCNSchedStrategy
Jul 26 2022, 8:55 AM · Restricted Project, Restricted Project
kerbowa closed D130147: [AMDGPU] Start refactoring GCNSchedStrategy.
Jul 26 2022, 8:55 AM · Restricted Project, Restricted Project

Jul 22 2022

kerbowa updated the diff for D130329: [AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions.

Fix not running unclustered pass on non-excess RP regions.

Jul 22 2022, 9:41 AM · Restricted Project, Restricted Project

Jul 21 2022

kerbowa requested review of D130329: [AMDGPU] Aggressively schedule to reduce RP in occupancy limited regions.
Jul 21 2022, 8:38 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D130147: [AMDGPU] Start refactoring GCNSchedStrategy.

Minor fixes.

Jul 21 2022, 7:46 PM · Restricted Project, Restricted Project

Jul 19 2022

kerbowa requested review of D130147: [AMDGPU] Start refactoring GCNSchedStrategy.
Jul 19 2022, 11:05 PM · Restricted Project, Restricted Project
kerbowa updated the diff for D128158: [AMDGPU] Add amdgcn_sched_group_barrier builtin.

Fix some bugs. Add better pipeline fitting. Address comments.

Jul 19 2022, 3:44 PM · Restricted Project, Restricted Project, Restricted Project

Jul 14 2022

kerbowa added a comment to D129522: [mlir][AMDGPU] Add lds_barrier op.

I'm not objecting to the change, just pointing out that you may miss out on some optimization since this is lowered to inline ASM, and that you may want to lower it to intrinsics in the future.

Jul 14 2022, 8:44 AM · Restricted Project, Restricted Project

Jul 13 2022

kerbowa added inline comments to D129667: [AMDGPU] Update the mechanism used to check for cycles and add eges in power-sched mutation.
Jul 13 2022, 4:57 PM · Restricted Project, Restricted Project
kerbowa added a comment to D129667: [AMDGPU] Update the mechanism used to check for cycles and add eges in power-sched mutation.

Can you add a test with the bug?

Jul 13 2022, 10:50 AM · Restricted Project, Restricted Project

Jul 11 2022

kerbowa added a comment to D129522: [mlir][AMDGPU] Add lds_barrier op.

This is a single op that expands to both a waitcnt on LDS and a barrier. I
can go digging tomorrow for what we used to lower this to (some sort of
fence and a barrier), but I recall (@whchung may have more detail) that
using this bit of inline assembly to work around what I think was the lack
of an LDS-only fence gave a noticeable performance increase.

Jul 11 2022, 5:25 PM · Restricted Project, Restricted Project
kerbowa added a comment to D129522: [mlir][AMDGPU] Add lds_barrier op.

Is this really necessary if we have https://reviews.llvm.org/D120544? A barrier has no requirement to wait for LDS and VMEM; it only does so currently because of a bug. Using inline asm like this seems like it will eventually cause problems, although I'm not too familiar with MLIR.

Jul 11 2022, 4:18 PM · Restricted Project, Restricted Project

Jul 7 2022

kerbowa added a comment to D129322: [AMDGPU][Scheduler] Avoid initializing Register pressure tracker when tracking is disabled.

Can we not rename initCandidate since it is mimicking GenericScheduler::initCandidate?

Jul 7 2022, 1:17 PM · Restricted Project, Restricted Project