
alex-t (Alexander)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 26 2016, 7:17 AM (357 w, 4 d)

Recent Activity

Wed, May 24

alex-t updated the diff for D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

Test simplified

Wed, May 24, 7:12 AM · Restricted Project, Restricted Project
alex-t added inline comments to D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.
Wed, May 24, 6:58 AM · Restricted Project, Restricted Project

Tue, May 23

alex-t updated the diff for D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

clang-format error corrected

Tue, May 23, 5:54 AM · Restricted Project, Restricted Project
alex-t retitled D149281: Don't disable loop unroll for vectorized loops on AMDGPU target from Not disable loop unroll for vectorized loops on AMDGPU target to Don't disable loop unroll for vectorized loops on AMDGPU target.
Tue, May 23, 5:01 AM · Restricted Project, Restricted Project
alex-t added inline comments to D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.
Tue, May 23, 5:00 AM · Restricted Project, Restricted Project
alex-t updated the diff for D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

Minor changes: comment corrected, class member name capitalized

Tue, May 23, 4:49 AM · Restricted Project, Restricted Project

Wed, May 17

alex-t added a comment to D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.
Wed, May 17, 9:30 AM · Restricted Project, Restricted Project

Mon, May 8

alex-t added a comment to D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

The question is why the vectorizer failed to unroll the loop in your workload.

Mon, May 8, 6:14 PM · Restricted Project, Restricted Project

Fri, May 5

alex-t updated the diff for D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

The review title was changed to better reflect the changes being submitted.
The variable name was changed accordingly.
The test was added to check that

Fri, May 5, 7:11 AM · Restricted Project, Restricted Project

May 3 2023

alex-t added a comment to D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.

Add a test?

That would be helpful. It would be good to understand why runtime unrolling is needed here. Does the interleave heuristic not kick in?

May 3 2023, 11:35 AM · Restricted Project, Restricted Project
alex-t retitled D149281: Don't disable loop unroll for vectorized loops on AMDGPU target from Must unroll epilogue loops after vectorization on AMDGPU target to Not disable loop unroll for vectorized loops on AMDGPU target.
May 3 2023, 11:27 AM · Restricted Project, Restricted Project

Apr 26 2023

alex-t requested review of D149281: Don't disable loop unroll for vectorized loops on AMDGPU target.
Apr 26 2023, 12:10 PM · Restricted Project, Restricted Project

Apr 3 2023

alex-t abandoned D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call.
Apr 3 2023, 9:56 AM · Restricted Project, Restricted Project
alex-t added a comment to D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call.

@nikic, thanks a lot, I have cherry-picked those commits, which helps.
I suspected we were "lucky" enough to hit the middle of an actively developed feature.

Apr 3 2023, 9:56 AM · Restricted Project, Restricted Project

Mar 29 2023

alex-t added a comment to D140798: [InstCombine] Fold zero check followed by decrement to usub.sat.

This change caused a regression in AMDGPU backend.
If this optimization runs before inlining, it hides opportunities for other InstCombine folds that may have higher precedence.
As a result, we get suboptimal code.
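
For context, here is a minimal sketch of the fold named in the review title, assuming it rewrites `(a == 0) ? 0 : a - 1` as `usub.sat(a, 1)`. The helper names and the 8-bit width are illustrative assumptions and are not taken from the patch; the sketch only checks that the two forms agree on unsigned values.

```python
BITS = 8                      # illustrative width (assumption, not from the review)
MASK = (1 << BITS) - 1


def usub_sat(a, b):
    """Unsigned saturating subtraction: clamps at zero instead of wrapping."""
    return a - b if a >= b else 0


def zero_check_then_decrement(a):
    """The source pattern: select(a == 0, 0, a - 1) on unsigned values."""
    return 0 if a == 0 else (a - 1) & MASK


# Exhaustively compare both forms over the whole 8-bit range.
mismatches = [a for a in range(MASK + 1)
              if zero_check_then_decrement(a) != usub_sat(a, 1)]
print(mismatches)
```

The equivalence itself is not in question in the comment above; the concern is about when the fold fires relative to inlining.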

Mar 29 2023, 6:29 AM · Restricted Project, Restricted Project
alex-t added a comment to D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call.

My idea is just to postpone the zero-check folding until the compare argument, which is a call, has been inlined.
If it is inlined, the next InstCombine invocation will have a chance to do a better job; if not, it will do the same as before.

Mar 29 2023, 6:27 AM · Restricted Project, Restricted Project
alex-t added a comment to D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call.
define i32 @zot(i32 noundef %arg) {
bb:
  %inst = icmp eq i32 %arg, 0
  %inst1 = call i32 @llvm.cttz.i32(i32 %arg, i1 true)
  %inst2 = select i1 %inst, i32 -1, i32 %inst1
  %inst3 = add nsw i32 %inst2, 1
  ret i32 %inst3
}
Mar 29 2023, 6:24 AM · Restricted Project, Restricted Project
alex-t requested review of D147146: [InstCombine] Should postpone zero check folding if the compare argument is a call.
Mar 29 2023, 6:23 AM · Restricted Project, Restricted Project

Feb 20 2023

alex-t added inline comments to D143134: [AMDGPU][SDAG] attempt to custom legalize uaddo/usubo for long operands.
Feb 20 2023, 9:36 AM · Restricted Project, Restricted Project
alex-t added inline comments to D143134: [AMDGPU][SDAG] attempt to custom legalize uaddo/usubo for long operands.
Feb 20 2023, 9:16 AM · Restricted Project, Restricted Project

Jan 5 2023

alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Jan 5 2023, 12:04 PM · Restricted Project, Restricted Project
alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Jan 5 2023, 11:32 AM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

minor changes according to the reviewer request

Jan 5 2023, 11:31 AM · Restricted Project, Restricted Project

Dec 20 2022

alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

schedule printing function moved out from the GCNSchedStage class.
objects passed by reference are marked as const

Dec 20 2022, 3:14 AM · Restricted Project, Restricted Project

Dec 19 2022

alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

refactored getScheduleMetrics to avoid copying

Dec 19 2022, 9:50 AM · Restricted Project, Restricted Project

Dec 15 2022

alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

amdgpu-schedule-metric-bias description updated

Dec 15 2022, 1:27 PM · Restricted Project, Restricted Project
alex-t added a comment to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
amdgpu-schedule-metric-bias=100

makes the scheduler always prefer occupancy over latency,
but the schedule metrics are still computed.
Did you mean the option to completely switch the getScheduleMetrics OFF?

Dec 15 2022, 12:54 PM · Restricted Project, Restricted Project
alex-t added a comment to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

I still want to have an option to disable this heuristic.
Also are there any performance measurements done?

Dec 15 2022, 12:45 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

register class and move size from the virtual source register
operand type check is loosened to !isReg()

Dec 15 2022, 5:47 AM · Restricted Project, Restricted Project
alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 15 2022, 5:15 AM · Restricted Project, Restricted Project
alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 15 2022, 5:06 AM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
  1. metric calculation calls localized to the UnclusteredHighRPStage::shouldRevertSchedule. Now the getScheduleMetric is only called if it is really necessary.
  2. -amdgpu-schedule-metric-bias=<unsigned value> compiler option was added to ease the further testing and tuning. It defaults to 10 which means the schedule w/o bubbles gets 10 points reward.
  3. Several other changes according to the reviewer request.
Dec 15 2022, 4:57 AM · Restricted Project, Restricted Project

Dec 13 2022

alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 13 2022, 12:14 PM · Restricted Project, Restricted Project
alex-t added inline comments to D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.
Dec 13 2022, 12:13 PM · Restricted Project, Restricted Project
alex-t added inline comments to D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.
Dec 13 2022, 12:08 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

reviewers comments addressed

Dec 13 2022, 12:08 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

add operand instead of adding immediate

Dec 13 2022, 9:20 AM · Restricted Project, Restricted Project

Dec 12 2022

alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 12 2022, 3:04 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

assign getImm() result to unsigned for implicit conversion

Dec 12 2022, 2:58 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

changed as requested

Dec 12 2022, 2:26 PM · Restricted Project, Restricted Project
alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 12 2022, 2:11 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

test renamed

Dec 12 2022, 1:32 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.

test added

Dec 12 2022, 1:29 PM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

floating points changed to scaled integers

Dec 12 2022, 1:17 PM · Restricted Project, Restricted Project
alex-t added a comment to D139852: [amdgpu] Lower CopyToReg into SGPR explicitly to avoid illegal vgpr to sgpr copy.

Fixed in https://reviews.llvm.org/D139874

Dec 12 2022, 12:23 PM · Restricted Project, Restricted Project
alex-t requested review of D139874: [AMDGPU] Lower VGPR to physical SGPR COPY to S_MOV_B32 if VGPR contains the compile time constant.
Dec 12 2022, 12:21 PM · Restricted Project, Restricted Project
alex-t added a comment to D139852: [amdgpu] Lower CopyToReg into SGPR explicitly to avoid illegal vgpr to sgpr copy.

Since we ensure all the VGPR to SGPR copies are uniform, we just need to V_READFIRSTLANE_B32 here.

What ensures this copy is uniform?

Dec 12 2022, 11:26 AM · Restricted Project, Restricted Project
alex-t added a comment to D139852: [amdgpu] Lower CopyToReg into SGPR explicitly to avoid illegal vgpr to sgpr copy.

Since we ensure all the VGPR to SGPR copies are uniform, we just need to V_READFIRSTLANE_B32 here.

Dec 12 2022, 10:28 AM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

Model map renamed to ReadyCycles

Dec 12 2022, 8:10 AM · Restricted Project, Restricted Project
alex-t added inline comments to D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 12 2022, 7:36 AM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

changes as requested

Dec 12 2022, 7:36 AM · Restricted Project, Restricted Project

Dec 9 2022

alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

yet one more minor code cleanup

Dec 9 2022, 6:13 AM · Restricted Project, Restricted Project
alex-t updated the diff for D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.

odd changes removed

Dec 9 2022, 6:06 AM · Restricted Project, Restricted Project
alex-t requested review of D139710: [AMDGPU] MachineScheduler: schedule execution metric added for the UnclusteredHighRPStage.
Dec 9 2022, 5:58 AM · Restricted Project, Restricted Project

Nov 22 2022

alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Nov 22 2022, 10:47 AM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Nov 22 2022, 10:06 AM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

removed changes not relevant directly to the current revision
added assert to check if the flat scratch offset is aligned

Nov 22 2022, 9:44 AM · Restricted Project, Restricted Project
alex-t added a comment to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

There are still setIsDead changes and whitespace changes. Please try to strip them all out so we get a minimal patch to review.

Nov 22 2022, 9:02 AM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Nov 22 2022, 4:42 AM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

removed unrelated changes

Nov 22 2022, 4:41 AM · Restricted Project, Restricted Project

Nov 21 2022

alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Nov 21 2022, 2:36 PM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

added test for using and restoring frame register for offset calculation.
S_SUBB_U32 changed to S_ADDC_U32 with "-Offset"

Nov 21 2022, 2:36 PM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

Since https://reviews.llvm.org/D137574 has landed, this review is updated to use backward PEI.

Nov 21 2022, 11:06 AM · Restricted Project, Restricted Project

Nov 18 2022

alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

rebased

Nov 18 2022, 6:56 AM · Restricted Project, Restricted Project

Nov 17 2022

alex-t added inline comments to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 17 2022, 9:09 AM · Restricted Project, Restricted Project

Nov 16 2022

alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

TargetRegisterInfo::eliminateFrameIndex signature changed to return "true" if the MachineInstr passed in by iterator was removed

Nov 16 2022, 1:50 PM · Restricted Project, Restricted Project
alex-t reopened D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 16 2022, 1:49 PM · Restricted Project, Restricted Project

Nov 14 2022

alex-t added a comment to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

LGTM. Can you create follow up patches to start migrating all the other targets?

Nov 14 2022, 12:34 PM · Restricted Project, Restricted Project
alex-t added inline comments to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 14 2022, 10:31 AM · Restricted Project, Restricted Project
alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

TargetRegisterInfo::supportsBackwardScavenger() description changed

Nov 14 2022, 10:31 AM · Restricted Project, Restricted Project
alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

Added test case for the frame index in the last instruction in the BB. "::llvm" removed.

Nov 14 2022, 10:19 AM · Restricted Project, Restricted Project

Nov 10 2022

alex-t added a comment to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..

BTW, if %5 is divergent we have a bug in ISel. We now should not have any V2S copy with the divergent source.

Look at the MIR that @skc7 quoted. %5 is divergent - it's copied from a vgpr function argument.

Nov 10 2022, 10:15 AM · Restricted Project, Restricted Project
alex-t added a comment to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..
define <4 x i32> @extract0_bitcast_raw_buffer_load_v4i32(<4 x i32> inreg %rsrc, i32 %ofs, i32 %sofs) local_unnamed_addr #0 {
%var = tail call <4 x i32> @llvm.amdgcn.raw.buffer.load.v4i32(<4 x i32> %rsrc, i32 %ofs, i32 %sofs, i32 0)
ret <4 x i32> %var
}

IR dump after amdgpu-isel:

bb.0 (%ir-block.0):
liveins: $vgpr0, $vgpr1, $vgpr2, $vgpr3, $vgpr4, $vgpr5
Nov 10 2022, 10:12 AM · Restricted Project, Restricted Project
alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

Common code for debug instructions frame index operand replacement is factored
out to separate function.

Nov 10 2022, 7:54 AM · Restricted Project, Restricted Project
alex-t updated the diff for D137741: [PEI][NFC] Refactoring of the debug instructions frame index replacement.

no else after return

Nov 10 2022, 5:00 AM · Restricted Project, Restricted Project

Nov 9 2022

alex-t added inline comments to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 9 2022, 2:11 PM · Restricted Project, Restricted Project
alex-t requested review of D137741: [PEI][NFC] Refactoring of the debug instructions frame index replacement.
Nov 9 2022, 2:09 PM · Restricted Project, Restricted Project
alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

Formatting

Nov 9 2022, 12:46 PM · Restricted Project, Restricted Project
alex-t added reviewers for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward.: foad, rampitec, arsenm.
Nov 9 2022, 6:15 AM · Restricted Project, Restricted Project
alex-t updated the diff for D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

The first really working version of the frame index elimination with backward walk.

Nov 9 2022, 6:14 AM · Restricted Project, Restricted Project

Nov 7 2022

alex-t added a comment to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..

TargetRegisterInfo::eliminateFrameIndex invalidates the iterator passed to it. It can remove the instruction or change the number of its operands.
I am thinking of changing its interface to make it return an iterator that points to the new/changed instruction to let the caller handle this.

Nov 7 2022, 10:34 AM · Restricted Project, Restricted Project
alex-t added inline comments to D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 7 2022, 10:31 AM · Restricted Project, Restricted Project
alex-t requested review of D137574: PEI should be able to use backward walk in replaceFrameIndicesBackward..
Nov 7 2022, 10:06 AM · Restricted Project, Restricted Project
alex-t added a comment to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..

%8:sreg_32 = COPY %5:vgpr_32
%7:vgpr_32 = BUFFER_LOAD_DWORD_OFFEN %4:vgpr_32, killed %6:sgpr_128, %8:sreg_32, 0, 0, 0, 0, implicit $exec ::

I need more context. Is %5 uniform?

Nov 7 2022, 9:08 AM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

Temporary workaround to avoid SCC liveness scanning loop and using manually set
"dead" flags to decrease the amount of unnecessary SCC preserving code.

Nov 7 2022, 8:53 AM · Restricted Project, Restricted Project

Nov 2 2022

alex-t added inline comments to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..
Nov 2 2022, 3:10 PM · Restricted Project, Restricted Project

Oct 31 2022

alex-t added inline comments to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..
Oct 31 2022, 5:17 PM · Restricted Project, Restricted Project
alex-t added inline comments to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..
Oct 31 2022, 4:32 PM · Restricted Project, Restricted Project
alex-t added a comment to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..
Oct 31 2022, 11:26 AM · Restricted Project, Restricted Project
alex-t added a comment to D134423: [AMDGPU] Fix vgpr2sgpr copy analysis to check scalar operands of buffer instructions use scalar registers..

Sorry for my tediousness but
I would like to see any inspirational reason for this change.

Oct 31 2022, 7:06 AM · Restricted Project, Restricted Project

Oct 24 2022

alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Oct 24 2022, 9:20 AM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

Corrected wrong Frame Register restore code.
Several minor changes as requested.

Oct 24 2022, 9:16 AM · Restricted Project, Restricted Project

Oct 21 2022

alex-t added a comment to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

Am I correct to assume that the frame offset is always DWORD aligned? If so, we have a couple of spare bits:

S_ADDC_U32 $reg, offset
S_BITCMP1_B32 $reg, 0
S_BITSET0_B32 $reg, 0
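
The three-instruction sequence above can be sketched as a small simulation. This is only an illustration under assumptions: the instruction semantics are modeled as commonly documented for the AMDGPU scalar ALU (S_ADDC_U32 adds SCC as carry-in and writes carry-out to SCC, S_BITCMP1_B32 sets SCC to the tested bit, S_BITSET0_B32 clears a bit), and the Python function names are hypothetical. With a DWORD-aligned register and offset, the incoming SCC lands in bit 0 of the sum, so it can be read back and then cleared.

```python
MASK32 = 0xFFFFFFFF


def s_addc_u32(a, b, scc):
    """Model of S_ADDC_U32: returns (a + b + SCC, carry-out)."""
    total = a + b + scc
    return total & MASK32, total >> 32


def s_bitcmp1_b32(value, bit):
    """Model of S_BITCMP1_B32: new SCC is 1 if the given bit is set."""
    return (value >> bit) & 1


def s_bitset0_b32(value, bit):
    """Model of S_BITSET0_B32: clears the given bit."""
    return value & ~(1 << bit) & MASK32


def add_offset_preserving_scc(reg, offset, scc):
    """Add an aligned offset to an aligned register without losing SCC."""
    # Assumption from the comment above: both values are DWORD aligned.
    assert reg % 4 == 0 and offset % 4 == 0
    reg, _ = s_addc_u32(reg, offset, scc)  # old SCC is now stashed in bit 0
    scc = s_bitcmp1_b32(reg, 0)            # recover the old SCC from bit 0
    reg = s_bitset0_b32(reg, 0)            # drop the stashed bit from the sum
    return reg, scc
```

For example, `add_offset_preserving_scc(0x1000, 0x40, 1)` yields the sum `0x1040` with SCC still 1, and the same call with SCC 0 yields `0x1040` with SCC 0.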
Oct 21 2022, 1:05 PM · Restricted Project, Restricted Project
alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

Applied approach suggested by @rampitec

Oct 21 2022, 12:59 PM · Restricted Project, Restricted Project

Oct 20 2022

alex-t added a comment to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

In fact, to avoid this unwanted loop I can use another hack: move the scavenger forward one position to look up the SCC liveness at the insertion slot, then immediately move it backward to restore its position and avoid breaking the PEI logic.
This looks weird but avoids the loop.

Oct 20 2022, 11:41 AM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Oct 20 2022, 10:17 AM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Oct 20 2022, 9:49 AM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Oct 20 2022, 3:16 AM · Restricted Project, Restricted Project

Oct 19 2022

alex-t updated the diff for D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.

SCC copy restore has been made wave size independent.

Oct 19 2022, 2:24 PM · Restricted Project, Restricted Project
alex-t added inline comments to D136169: [AMDGPU] Avoid SCC clobbering before S_CSELECT_B32.
Oct 19 2022, 2:21 PM · Restricted Project, Restricted Project