
rampitec (Stanislav Mekhanoshin)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 4 2014, 4:14 AM (467 w, 3 d)

Recent Activity

Thu, Mar 16

rampitec committed rG3e12cc9463e6: [AMDGPU] Simplify AGPR reservation. NFC. (authored by rampitec).
[AMDGPU] Simplify AGPR reservation. NFC.
Thu, Mar 16, 4:16 PM · Restricted Project, Restricted Project
rampitec accepted D146137: [AMDGPU] Select flat atomic fmin/fmax.

LGTM

Thu, Mar 16, 9:59 AM · Restricted Project, Restricted Project

Wed, Mar 15

rampitec added inline comments to D146137: [AMDGPU] Select flat atomic fmin/fmax.
Wed, Mar 15, 3:46 PM · Restricted Project, Restricted Project

Mon, Mar 13

rampitec added inline comments to D145586: [AMDGPU] Tweak PromoteAlloca limits.
Mon, Mar 13, 3:06 PM · Restricted Project, Restricted Project
rampitec accepted D145936: [AMDGPU] Fix .amdhsa_shared_vgpr_count error checking for GFX11.
Mon, Mar 13, 2:51 PM · Restricted Project, Restricted Project

Thu, Mar 9

rampitec added inline comments to D145586: [AMDGPU] Tweak PromoteAlloca limits.
Thu, Mar 9, 11:39 AM · Restricted Project, Restricted Project

Wed, Mar 8

rampitec committed rGe7ec123c6af9: [AMDGPU] Implement idempotent atomic lowering (authored by rampitec).
[AMDGPU] Implement idempotent atomic lowering
Wed, Mar 8, 2:10 PM · Restricted Project, Restricted Project
rampitec closed D144759: [AMDGPU] Implement idempotent atomic lowering.
Wed, Mar 8, 2:10 PM · Restricted Project, Restricted Project
rampitec added a comment to D144759: [AMDGPU] Implement idempotent atomic lowering.

Still don't understand why this isn't just a generic / default implementation

Wed, Mar 8, 2:08 PM · Restricted Project, Restricted Project
rampitec updated the diff for D144759: [AMDGPU] Implement idempotent atomic lowering.
Wed, Mar 8, 1:42 PM · Restricted Project, Restricted Project
rampitec updated the diff for D144759: [AMDGPU] Implement idempotent atomic lowering.

Simplified the patch to avoid the optimization on any atomicrmw with release semantics. A monotonic or acquire atomicrmw does not require a fence or cache flush.
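
The gating rule described in this update can be sketched as a small predicate. This is only an illustration under stated assumptions, not the code from D144759: the `Ordering` enum and both function names are hypothetical stand-ins, and no LLVM headers are used.

```cpp
// Hypothetical stand-in for the atomic orderings discussed in the review;
// this is NOT the real llvm::AtomicOrdering type.
enum class Ordering {
  Monotonic,
  Acquire,
  Release,
  AcquireRelease,
  SequentiallyConsistent
};

// True when the ordering is release or stronger.
constexpr bool hasReleaseSemantics(Ordering O) {
  return O == Ordering::Release || O == Ordering::AcquireRelease ||
         O == Ordering::SequentiallyConsistent;
}

// Assumed gate (hypothetical name): allow the idempotent-atomicrmw
// optimization only when the operation has no release semantics, so no
// fence or cache flush is needed to preserve ordering.
constexpr bool mayLowerIdempotentRMW(Ordering O) {
  return !hasReleaseSemantics(O);
}
```

With this predicate, monotonic and acquire operations qualify, while release, acq_rel and seq_cst operations are left untouched.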

Wed, Mar 8, 12:33 PM · Restricted Project, Restricted Project
rampitec added inline comments to D145586: [AMDGPU] Tweak PromoteAlloca limits.
Wed, Mar 8, 12:02 PM · Restricted Project, Restricted Project
rampitec added a comment to D145586: [AMDGPU] Tweak PromoteAlloca limits.

Increasing the limit may result in more spilling in other cases. In general, thorough performance testing is needed to judge whether this is beneficial.

Wed, Mar 8, 11:34 AM · Restricted Project, Restricted Project
rampitec committed rG59162e38590f: [AMDGPU] Skip buffer_wbl2 before atomic fence acquire (authored by rampitec).
[AMDGPU] Skip buffer_wbl2 before atomic fence acquire
Wed, Mar 8, 1:25 AM · Restricted Project, Restricted Project
rampitec closed D145524: [AMDGPU] Skip buffer_wbl2 before atomic fence acquire.
Wed, Mar 8, 1:24 AM · Restricted Project, Restricted Project

Tue, Mar 7

rampitec planned changes to D144759: [AMDGPU] Implement idempotent atomic lowering.

Just discussed it with Tony. This seems somewhat problematic, as it exploits a general lack of other atomic optimizations and the fact that we cannot really reorder a fence. But we only really need it for relaxed atomics, and we can safely do without a fence for a relaxed or acquire atomic. So let's keep it simple and only do the optimization if there are no release semantics on the atomicrmw. I will update the patch.

Tue, Mar 7, 4:23 PM · Restricted Project, Restricted Project
rampitec added inline comments to D145524: [AMDGPU] Skip buffer_wbl2 before atomic fence acquire.
Tue, Mar 7, 3:33 PM · Restricted Project, Restricted Project
rampitec updated the diff for D144759: [AMDGPU] Implement idempotent atomic lowering.

OK, let's be on the safe side. https://www.hpl.hp.com/techreports/2012/HPL-2012-68.pdf says that a release fence is needed for load ordering if the rmw is release or stronger. The legalizer does not do it by itself, although the only noticeable difference in codegen is with seq_cst, which looks reasonable.

Tue, Mar 7, 3:26 PM · Restricted Project, Restricted Project
rampitec requested review of D145524: [AMDGPU] Skip buffer_wbl2 before atomic fence acquire.
Tue, Mar 7, 12:43 PM · Restricted Project, Restricted Project
rampitec accepted D136918: [AMDGPU] Scheduler: fix RP calculation for a MBB with one successor.
Tue, Mar 7, 12:14 PM · Restricted Project, Restricted Project
rampitec accepted D136918: [AMDGPU] Scheduler: fix RP calculation for a MBB with one successor.

LGTM, but please add a comment explaining why you are checking predecessors here. Something like: "Live-ins of a successor are the same as live-outs of a predecessor, but subreg mask may be different for different predecessors."

Tue, Mar 7, 9:59 AM · Restricted Project, Restricted Project

Fri, Mar 3

rampitec added a comment to D144759: [AMDGPU] Implement idempotent atomic lowering.

I feel like it's time to ask Tony.

Fri, Mar 3, 5:02 PM · Restricted Project, Restricted Project
rampitec added a reviewer for D144759: [AMDGPU] Implement idempotent atomic lowering: t-tye.
Fri, Mar 3, 5:02 PM · Restricted Project, Restricted Project
rampitec added inline comments to D144759: [AMDGPU] Implement idempotent atomic lowering.
Fri, Mar 3, 4:58 PM · Restricted Project, Restricted Project
rampitec added inline comments to D144759: [AMDGPU] Implement idempotent atomic lowering.
Fri, Mar 3, 1:51 PM · Restricted Project, Restricted Project
rampitec updated the diff for D144759: [AMDGPU] Implement idempotent atomic lowering.
Fri, Mar 3, 1:51 PM · Restricted Project, Restricted Project

Thu, Mar 2

rampitec accepted D145170: [AMDGPU] Vectorize misaligned global loads & stores.

LGTM, please give Matt a chance to review.

Thu, Mar 2, 5:31 PM · Restricted Project, Restricted Project
rampitec added inline comments to D145170: [AMDGPU] Vectorize misaligned global loads & stores.
Thu, Mar 2, 3:42 PM · Restricted Project, Restricted Project
rampitec added inline comments to D144759: [AMDGPU] Implement idempotent atomic lowering.
Thu, Mar 2, 12:12 PM · Restricted Project, Restricted Project
rampitec added a comment to D145170: [AMDGPU] Vectorize misaligned global loads & stores.

LGTM, modulo Matt's comment.

Thu, Mar 2, 11:51 AM · Restricted Project, Restricted Project
rampitec added inline comments to D144759: [AMDGPU] Implement idempotent atomic lowering.
Thu, Mar 2, 11:43 AM · Restricted Project, Restricted Project
rampitec added a comment to D144759: [AMDGPU] Implement idempotent atomic lowering.

PSDB passed.

Thu, Mar 2, 11:26 AM · Restricted Project, Restricted Project

Fri, Feb 24

rampitec requested review of D144759: [AMDGPU] Implement idempotent atomic lowering.
Fri, Feb 24, 1:34 PM · Restricted Project, Restricted Project

Feb 15 2023

rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

My current understanding is that the cherry-pick will go into the already forked clang-16, but not into ROCm 5.4. So ROCm device-libs will be accompanied by the older clang-16 without this change and stay compatible. Someone building from scratch will use the latest clang-16 and staging device-libs with this change. Do you think this will work?

I wouldn't recommend it. I would patch whatever device libs are being built in association with clang-16, not staging. Staging device libs is only appropriate for the staging compiler. A hash of device libs from around the time that clang-16 stable released would probably be safe.

Feb 15 2023, 11:37 AM · Restricted Project, Restricted Project, Restricted Project

Feb 14 2023

rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

I have no objection to backporting this, but it may need to be accompanied with a device-libs patch, and I don't know where that patch would be checked in. The ROCm-Device-Libs in github certainly doesn't have a "clang-16" branch.

Feb 14 2023, 3:20 PM · Restricted Project, Restricted Project, Restricted Project
rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

I think unless conflicts arise creating an issue similar to this https://github.com/llvm/llvm-project/issues/60600 with the cherry-pick line set to this commit should be enough. (See also https://llvm.org/docs/GitHub.html).

Feb 14 2023, 2:58 PM · Restricted Project, Restricted Project, Restricted Project
rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

I cannot say there was much choice. The only real choice was to postpone the split and magnify the problem in the future. As for the ifdefs, this might be possible in the device-libs, but I do not see how to do it in Builtins.def.

Hmm, maybe ifdefs in the device libs would also just delay the issue. Maybe it really is best to pull this change into Clang 16 and accept the fact that it's an unfortunate situation, but at least give users with very recent hardware the option to use a regular Clang to build ROCm. Realistically, those upgrading to Clang 16 early will also be those upgrading to ROCm 5.5 early, and also those most likely to have 7900 GPUs.

Somehow, telling users "if you have a new GPU you need new Clang + ROCm" and "if you want new ROCm for your old GPU you need to also upgrade Clang" sounds better to me than telling them "if you have a new GPU you are SOL unless you use binary releases or build the amd-llvm-fork" 😅

Feb 14 2023, 2:24 PM · Restricted Project, Restricted Project, Restricted Project
rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

Well, I can already feel the pain of distro maintainers having to build the next ROCm releases 😅

I wonder what the better course of action is here:

  1. Port this patch to Clang 16 so that users with new hardware will be able to build ROCm 5.5, but make it impossible to build ROCm 5.4 and older with clang 16.
  2. Don't port this patch and have a ~6 months gap during which users with the 7900 GPUs won't be able to build ROCm with a stable Clang version, requiring distro maintainers to use several toolchains and source-based distro users to use different compatibility patches for different ROCm releases. So basically when 8900 GPUs are announced, clang would support ROCm for 7900 GPUs 😅

Would there be a way to retain at least *some* backwards compatibility or version interoperability? For instance, via an #ifdef CLANG_VERSION_MAJOR in the device libs and an #ifdef INCOMPATIBLE_AMDGPU_INSTS in Clang?

This would obviously be very ugly, but it still seems better to me than locking out users (and more likely, ROCm contributors) from using 7900 GPUs if they are unable to build Clang themselves. Users already complain about how hard it is to build ROCm, and they also complain about the frequent breaking changes in Clang. I'm very much in favor of moving fast, but I'm worried that complete disregard for backwards compatibility like this, with no clear upgrade path or fallback mechanism, could cause a lot of frustration for users and distro maintainers.

Maybe there is some other, prettier way to solve this? 🥹

Feb 14 2023, 1:51 PM · Restricted Project, Restricted Project, Restricted Project
rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

It should be complemented by the device-libs change in the corresponding release, so it is not that simple.

@rampitec I'm not sure I understand. Does this mean that this is breaking in a way that Clang 17 won't be able to build ROCm 5.4?

I thought it was like "we need D142507 to build device-libs after 8dc779e" and for older device libs we just fall back to some older behavior.

Feb 14 2023, 12:26 PM · Restricted Project, Restricted Project, Restricted Project
rampitec committed rG12b4f9e2af95: [AMDGPU] Do not apply schedule metric for regions with spilling (authored by rampitec).
[AMDGPU] Do not apply schedule metric for regions with spilling
Feb 14 2023, 12:17 PM · Restricted Project, Restricted Project
rampitec closed D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.
Feb 14 2023, 12:17 PM · Restricted Project, Restricted Project
rampitec updated the diff for D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.

Renumbered registers.

Feb 14 2023, 12:08 PM · Restricted Project, Restricted Project
rampitec accepted D143963: AMDGPU: Override getNegatedExpression constant handling.
Feb 14 2023, 10:10 AM · Restricted Project, Restricted Project
rampitec added a comment to D142507: [AMDGPU] Split dot7 feature.

Would it be possible to backport this to Clang 16?

If https://github.com/RadeonOpenCompute/ROCm-Device-Libs/commit/8dc779e19cbf2ccfd3307b60f7db57cf4203a5be makes it into ROCm 5.5 no distro would be able to build it with "vanilla" Clang 16, potentially causing pain for users that try to build ROCm 5.5 with a Clang from a package manager (a realistic scenario, considering that one may want to invest 5 min to build ROCm but not 40 min to build Clang). ROCm 5.5 will be the first release to officially support the 7900XT and 7900XTX, so not having this potentially causes issues for users with recent AMD hardware. (See https://github.com/RadeonOpenCompute/ROCm/issues/1880 for extensive, related discussion).

@jhuber6 This wouldn't exactly "solve" https://github.com/llvm/llvm-project/issues/60660, but I think this could also be a workaround (with potentially better user experience), as allowing users to build ROCm with regular Clang 16 prevents that deadlock where we can't build ROCm anymore. This is entirely based on speculation that ROCm 5.5 won't introduce other breakages before its release though, so I'd totally understand if this is not a satisfactory solution.

Feb 14 2023, 9:58 AM · Restricted Project, Restricted Project, Restricted Project

Feb 13 2023

rampitec updated the diff for D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.

Added testcase.

Feb 13 2023, 1:55 PM · Restricted Project, Restricted Project
rampitec accepted D143941: AMDGPU: Teach getNegatedExpression about rcp.
Feb 13 2023, 12:57 PM · Restricted Project, Restricted Project
rampitec added a comment to D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.

testcase?

Feb 13 2023, 12:12 PM · Restricted Project, Restricted Project
rampitec requested review of D143934: [AMDGPU] Do not apply schedule metric for regions with spilling.
Feb 13 2023, 11:45 AM · Restricted Project, Restricted Project

Feb 10 2023

rampitec accepted D143740: [AMDGPU] Add GFX11 HW_REG_PERF_SNAPSHOT_*.
Feb 10 2023, 11:03 AM · Restricted Project, Restricted Project
rampitec accepted D143706: [AMDGPU] Add switch to enable architected SGPRs..
Feb 10 2023, 10:56 AM · Restricted Project, Restricted Project

Feb 9 2023

rampitec accepted D143662: [AMDGPU] Refactor multiclass FLAT_Atomic_Pseudo. NFC..

LGTM

Feb 9 2023, 11:40 AM · Restricted Project, Restricted Project
rampitec added inline comments to D143662: [AMDGPU] Refactor multiclass FLAT_Atomic_Pseudo. NFC..
Feb 9 2023, 11:21 AM · Restricted Project, Restricted Project

Feb 8 2023

rampitec committed rG94def1b44eef: [AMDGPU] Do not exapnd fp atomics on gfx940 (authored by rampitec).
[AMDGPU] Do not exapnd fp atomics on gfx940
Feb 8 2023, 1:22 PM · Restricted Project, Restricted Project
rampitec closed D143603: [AMDGPU] Do not exapnd fp atomics on gfx940.
Feb 8 2023, 1:22 PM · Restricted Project, Restricted Project
rampitec added a comment to D143603: [AMDGPU] Do not exapnd fp atomics on gfx940.

It's even safe for system scope?

Feb 8 2023, 1:21 PM · Restricted Project, Restricted Project
rampitec requested review of D143603: [AMDGPU] Do not exapnd fp atomics on gfx940.
Feb 8 2023, 1:10 PM · Restricted Project, Restricted Project
rampitec committed rG3e9f2af27ae7: [AMDGPU] Update atomic tests. NFC. (authored by rampitec).
[AMDGPU] Update atomic tests. NFC.
Feb 8 2023, 12:56 PM · Restricted Project, Restricted Project

Feb 7 2023

rampitec added inline comments to D143420: AMDGPU: Replace certain llvm.amdgcn.class uses with llvm.is.fpclass.
Feb 7 2023, 2:42 AM · Restricted Project, Restricted Project

Feb 6 2023

rampitec accepted D143420: AMDGPU: Replace certain llvm.amdgcn.class uses with llvm.is.fpclass.
Feb 6 2023, 10:58 AM · Restricted Project, Restricted Project

Feb 5 2023

rampitec committed rGdd0caa82de59: [AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp (authored by rampitec).
[AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp
Feb 5 2023, 12:23 PM · Restricted Project, Restricted Project
rampitec closed D143302: [AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp.
Feb 5 2023, 12:22 PM · Restricted Project, Restricted Project

Feb 4 2023

rampitec updated the summary of D143302: [AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp.
Feb 4 2023, 12:05 AM · Restricted Project, Restricted Project
rampitec updated the summary of D143302: [AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp.
Feb 4 2023, 12:04 AM · Restricted Project, Restricted Project

Feb 3 2023

rampitec requested review of D143302: [AMDGPU] Fix liveness in the SIOptimizeExecMaskingPreRA.cpp.
Feb 3 2023, 3:29 PM · Restricted Project, Restricted Project
rampitec accepted D143263: AMDGPU: Ensure flat loads are broken into dword in functions.

LGTM, thanks!

Feb 3 2023, 11:23 AM · Restricted Project, Restricted Project

Jan 26 2023

rampitec committed rGdf0488369d32: [AMDGPU] Split dot7 feature (authored by rampitec).
[AMDGPU] Split dot7 feature
Jan 26 2023, 10:35 AM · Restricted Project, Restricted Project, Restricted Project
rampitec closed D142507: [AMDGPU] Split dot7 feature.
Jan 26 2023, 10:34 AM · Restricted Project, Restricted Project, Restricted Project

Jan 25 2023

rampitec committed rGccdd4ae1db23: [AMDGPU] Remove predicates from real dot instructions. NFCI. (authored by rampitec).
[AMDGPU] Remove predicates from real dot instructions. NFCI.
Jan 25 2023, 12:51 PM · Restricted Project, Restricted Project
rampitec closed D142575: [AMDGPU] Remove predicates from real dot instructions. NFCI..
Jan 25 2023, 12:50 PM · Restricted Project, Restricted Project
rampitec added inline comments to D142507: [AMDGPU] Split dot7 feature.
Jan 25 2023, 12:09 PM · Restricted Project, Restricted Project, Restricted Project
rampitec updated the diff for D142507: [AMDGPU] Split dot7 feature.

Split the cleanup NFCI.

Jan 25 2023, 12:08 PM · Restricted Project, Restricted Project, Restricted Project
rampitec requested review of D142575: [AMDGPU] Remove predicates from real dot instructions. NFCI..
Jan 25 2023, 12:05 PM · Restricted Project, Restricted Project
rampitec added inline comments to D142507: [AMDGPU] Split dot7 feature.
Jan 25 2023, 11:33 AM · Restricted Project, Restricted Project, Restricted Project

Jan 24 2023

rampitec added inline comments to D142507: [AMDGPU] Split dot7 feature.
Jan 24 2023, 2:55 PM · Restricted Project, Restricted Project, Restricted Project
rampitec requested review of D142507: [AMDGPU] Split dot7 feature.
Jan 24 2023, 2:25 PM · Restricted Project, Restricted Project, Restricted Project
rampitec committed rG870b92977e89: [AMDGPU] Split dot8 feature (authored by rampitec).
[AMDGPU] Split dot8 feature
Jan 24 2023, 11:16 AM · Restricted Project, Restricted Project, Restricted Project
rampitec closed D142407: [AMDGPU] Split dot8 feature.
Jan 24 2023, 11:16 AM · Restricted Project, Restricted Project, Restricted Project
rampitec updated the diff for D142407: [AMDGPU] Split dot8 feature.

Rebased.

Jan 24 2023, 11:13 AM · Restricted Project, Restricted Project, Restricted Project
rampitec committed rG4ab2246d486b: [AMDGPU] Remove dot1 and dot6 features from clang for gfx11 (authored by rampitec).
[AMDGPU] Remove dot1 and dot6 features from clang for gfx11
Jan 24 2023, 10:53 AM · Restricted Project, Restricted Project
rampitec closed D142493: [AMDGPU] Remove dot1 and dot6 features from clang for gfx11.
Jan 24 2023, 10:53 AM · Restricted Project, Restricted Project
rampitec added a comment to D142493: [AMDGPU] Remove dot1 and dot6 features from clang for gfx11.

[AMDGPU] Remove dot1 and dot5 features from clang for gfx11

"dot1 and dot6"?

Jan 24 2023, 10:45 AM · Restricted Project, Restricted Project
rampitec retitled D142493: [AMDGPU] Remove dot1 and dot6 features from clang for gfx11 from [AMDGPU] Remove dot1 and dot5 features from clang for gfx11 to [AMDGPU] Remove dot1 and dot6 features from clang for gfx11.
Jan 24 2023, 10:45 AM · Restricted Project, Restricted Project
rampitec requested review of D142493: [AMDGPU] Remove dot1 and dot6 features from clang for gfx11.
Jan 24 2023, 10:36 AM · Restricted Project, Restricted Project
rampitec committed rG296838071751: [AMDGPU] Add missing gfx11 tests in the directive-amdgcn-target.ll. NFC. (authored by rampitec).
[AMDGPU] Add missing gfx11 tests in the directive-amdgcn-target.ll. NFC.
Jan 24 2023, 9:58 AM · Restricted Project, Restricted Project

Jan 23 2023

rampitec added a reviewer for D142407: [AMDGPU] Split dot8 feature: b-sumner.
Jan 23 2023, 3:01 PM · Restricted Project, Restricted Project, Restricted Project
rampitec requested review of D142407: [AMDGPU] Split dot8 feature.
Jan 23 2023, 2:43 PM · Restricted Project, Restricted Project, Restricted Project
rampitec committed rG7d0145cc4748: [AMDGPU] Use more consistemt way to avoid overflow in the scheduler (authored by rampitec).
[AMDGPU] Use more consistemt way to avoid overflow in the scheduler
Jan 23 2023, 11:01 AM · Restricted Project, Restricted Project
rampitec closed D142262: [AMDGPU] Use more consistemt way to avoid overflow in the scheduler.
Jan 23 2023, 11:01 AM · Restricted Project, Restricted Project
rampitec committed rGd1c0febeab41: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling (authored by rampitec).
[AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling
Jan 23 2023, 10:43 AM · Restricted Project, Restricted Project
rampitec closed D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.
Jan 23 2023, 10:42 AM · Restricted Project, Restricted Project
rampitec accepted D142325: [ScheduleDAG] Fix removing edges with weak deps.

LGTM

Jan 23 2023, 10:28 AM · Restricted Project, Restricted Project
rampitec updated the diff for D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.

Moved initialization.

Jan 23 2023, 10:24 AM · Restricted Project, Restricted Project
rampitec accepted D142329: [AMDGPU][NFC] Apply new naming convention for feature fmacf64.

LGTM

Jan 23 2023, 10:22 AM · Restricted Project, Restricted Project

Jan 20 2023

rampitec updated the diff for D142262: [AMDGPU] Use more consistemt way to avoid overflow in the scheduler.
Jan 20 2023, 9:28 PM · Restricted Project, Restricted Project
rampitec requested review of D142262: [AMDGPU] Use more consistemt way to avoid overflow in the scheduler.
Jan 20 2023, 3:22 PM · Restricted Project, Restricted Project
rampitec updated the diff for D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.

Added debug output.

Jan 20 2023, 12:41 PM · Restricted Project, Restricted Project
rampitec updated the summary of D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.
Jan 20 2023, 11:14 AM · Restricted Project, Restricted Project
rampitec added a comment to D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling.

ping

Jan 20 2023, 11:12 AM · Restricted Project, Restricted Project
rampitec abandoned D141728: [AMDGPU] Tune scheduler on GFX10 and GFX11.

Testing showed negative performance impact. D141876 is the way to go.

Jan 20 2023, 11:11 AM · Restricted Project, Restricted Project

Jan 19 2023

rampitec committed rG63e7e9c8756a: [AMDGPU] Treat WMMA the same as MFMA for sched_barrier (authored by rampitec).
[AMDGPU] Treat WMMA the same as MFMA for sched_barrier
Jan 19 2023, 11:06 AM · Restricted Project, Restricted Project