If a subreg is used in an instruction it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
improper register pressure calculation by examining operand's lane mask.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
test/CodeGen/AMDGPU/schedule-regpressure.mir | ||
---|---|---|
62–68 ↗ | (On Diff #88015) | Can you add this as a second test rather than modifying the existing test for m0 |
I did not implement it this way yet because in general a bit in the lane mask doesn't necessarily represent 1 register pressure unit. Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well. I'd recomment running with -verify-misched when touching the regpressure code.
You are right, yet this estimation seems more correct to me. I think if/when we have targets for which this is not true we could extend TRI interface to return the estimation.
Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.
You are probably right, I was going to look into these places...
I'd recomment running with -verify-misched when touching the regpressure code.
Thanks, added.
The current code is correct, just not particularily exact.
We cannot apply this without a plan to deal with those cases, even if it is just aborting/falling back to a pessimistic behavior.
Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.
You are probably right, I was going to look into these places...
We cannot apply a patch that just changes one part of the system.
I'd recomment running with -verify-misched when touching the regpressure code.
Thanks, added.
That was meant as a recommendation for running experiments (not necessarily for the test here) like llvm-lit -Dllc='llc -verify-misched'. I would expect the scheduler to fail in a number of cases when the pressure diffs and the decrease/increaseRegisterPressure() functions are out of sync.
I guess if we really see a problem like that then TRI interface can be extended to return a weight given RegUnit and Lanemask. In fact I'm trying to patch other places like increase/decreaseRegPressure and I had to separate this logic into a function. The very same function can be exposed by targets. Do you want to create it right now?
Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.
You are probably right, I was going to look into these places...
We cannot apply a patch that just changes one part of the system.
I'm adding this changes at the moment.
In fact PressureDiff calculation and general RP tracking are independent. Now as I'm trying to extend the change for general RP tracking as well I really see many failures in lit tests...
Added callback for a target to estimate register unit weight given a lane mask, since that is unclear if proportional weight is correct for all targets.
lib/CodeGen/RegisterPressure.cpp | ||
---|---|---|
43 | Given that getMaxLaneMaskForVReg returns all bit s that possibly can be set in the n=mask for given register class, what is the purpose of the AND? Unless you expect the bits not present in the register class to be set in LaneMask for the reg unit of this class. |
lib/CodeGen/RegisterPressure.cpp | ||
---|---|---|
43 | In certain situations I just see -1 passed as a lane mask, mostly with Hexagon. In fact I can also see quite a few places where lane mask is initialized with default argument = LaneBitmask::getAll(), so until it is properly tracked at all places we need to take extra care. Without this I see a lot of assertions about pressure underflow from many lit tests. |
The current implementation will miscalculate the PressureDiff objects. Could you revert this until we can further investigate and fix this.
llvm/trunk/include/llvm/Target/TargetRegisterInfo.h | ||
---|---|---|
723–724 ↗ | (On Diff #89545) | This description makes no sense like that: Register units have no subregisters. |
llvm/trunk/lib/CodeGen/MachineScheduler.cpp | ||
1088 ↗ | (On Diff #89545) | The LiveUses array here only contains register that have become live or completely dead, you do not get updates of some lanes getting added/remove at the moment. So the calculations for the pressure diffs will be out of sync when you do them at lane granularity like this. |
I will revert it for now. There is yet another problem, for a target which does not override getRegUnitWeight that is possible to get pressure underflow because a full register weight is decreased with each subreg def.
Given that getMaxLaneMaskForVReg returns all bit s that possibly can be set in the n=mask for given register class, what is the purpose of the AND? Unless you expect the bits not present in the register class to be set in LaneMask for the reg unit of this class.