- User Since
- Apr 29 2019, 11:13 PM (202 w, 6 d)
Tue, Mar 7
Split the assertion into smaller chunks.
Sun, Mar 5
Wed, Mar 1
Extended only for amdgpu_cs and when the subtarget has architected SGPRs enabled.
Renamed the testcase.
Feb 16 2023
The IsPredicable field will always be turned to 1 for Pred_Copy to indicate that it is the Predicated version of the copy.
The real motivation for these buildCopy helper functions is to minimize the additional code changes required while introducing getCopyOpcode() as part of D143754.
One could probably say that "still you could have avoided the buildCopy functions, and instead replace TargetOpcode::COPY with getCopyOpcode()".
I think it would be ok to refactor the code if found an opportunity while working around that code.
For instance, should we do the same thing for PHI, INSERT_SUBREG, etc. and why or why not.
Also, should these helper be virtual or not?
As I said earlier, COPY wasn't a random pick, but rather a code refactor for another patch.
I guess where I am going is what do you intend to simplify/achieve with them?
I felt it was better with buildCopy rather than the direct replacement of TargetOpcode::COPY with getCopyOpcode() in their original BuildMI instances. You tend to agree with that towards the end - "(I actually like the patch)".
The bottom line is without a proper rationale on why we need this helper, I feel that it gets difficult to know which method should be used to construct a copy.
Right now we have BuildMI (and the underlying MachineInstrBuilder object that can be called directly), MachineFunction::CreateMachineInstr, MachineIRBuilder::buildInstr (BTW MachineIRBuilder has its own MachineIRBuilder::buildCopy) and we add a new method here.
I agree. All my buildCopy instances currently return MachineInstr*. I wasn't sure they should return MachineInstrBuilder or any other form as you mentioned. It can be worked out if anyone has a better suggestion.
And about MachineIRBuilder::buildCopy, I feel it is better to retain this very specific instance considering the actual function being called ( the generic MachineIRBuilder::buildInstr) and the operands used which are different from the instances I added.
I believe we need to disambiguate when this one should be used and why.
Admittedly I'm playing the devil's advocate here (I actually like the patch) but I think it is important to flush out the goals and intended usages to be able to better guide future users of this (and the other) API(s).
Feb 15 2023
Feb 13 2023
Added an assertion as per review comment.
Adjusted NumRequiredSystemSGPRs before force initializing the 16 input SGPRs.
Feb 10 2023
D143757 in the review chain enables predicated copy for AMDGPU target.
The application of pred-copy is here: D143762.
Testcase llvm/test/CodeGen/AMDGPU/whole-wave-register-copy.ll added along with D124196 demonstrates the need for pred-copy.
Rebased after whole-wave copy implementation.
Dec 23 2022
D138656 handled it better and hence abanding it.
Dec 21 2022
Dec 20 2022
Dec 16 2022
Dec 15 2022
Dec 14 2022
I will wait for a couple of days before I upstream it to see if there is any comment from others.
Fixed the comment appropriately.
Dec 12 2022
Addressed the review comment.
Dec 8 2022
Dec 5 2022
Ping. Much appreciate if this can be reviewed.
Nov 29 2022
Nov 24 2022
Posted D138656 to pass the virtual register as an additional argument to the spiller interface. This review can be abandoned once D138656 gets approved.
This is a better and a stable approach than D138515.
The use case of the extra argument can be found in D124196. See the implementations SIInstrInfo::storeRegToStackSlot and SIInstrInfo::loadRegFromStackSlot.
Implemented the WWM spill during RegAllocFast using the additional argument to the spiller interface introduced with patch D138656.
Nov 23 2022
The idea of mapping a physical register into the current virtual register in RegAllocFast would be likely a risky choice with a global variable. The same was true for the initial patch I posted D134951 which does the same thing specifically around the spill and reload functions via. the delegate.
The effort was to identify the physical register allocated for some special machine operands (whole-wave vector registers for AMDGPU), especially inside storeRegToStackSlot & loadRegFromStackSlot spill interfaces called from fast regalloc that directly spills the physRegs.
The better approach would be by passing an additional parameter (the virtual register) to the spiller interface. The interface is common to Greedy, PEI, and to some other places that have no use for this additional parameter. Still, I think it is the right way to handle the problem.
See the Usecase of PhysToCurrentVirtReg in D124196 (look for getPhysToCurrentVirtReg)
PhysToCurrentVirtReg should also be updated while reloading a physical register from the spill location. It happens when a RegMask (for instance, a function call) clobbers a register that is used in an instruction later in the BB. Also, at a BB entry, the LiveIn registers are reloaded. The reload point begins a new live range and it should be properly reflected in the mapping.
Rebase + Incorporated changes after D138515 to move the handling of physReg to current VirtReg mapping entirely into the generic design.
Nov 22 2022
This should replace the patch D134951 which used the delegate method for a specific purpose.
This is a more generic approach for targets to help identify the exact live range (virtual register) a physical register was currently assigned for.
While the allocation is in progress, this would serve the purpose of mapping a physical register to its virtual register.
Nov 21 2022
Nov 15 2022
Rebase + Suggestions incorporated.
Removed the IsIgnored flag and special-handled FP spill & restore cases.
For the downstream case, can we just erase the entry in the map after emitting the spill?
Can do that.
Nov 14 2022
Renamed IsPEI to IsPrologEpilog & removed the unwanted comment.
Fixed a typo in the IR test file name.