This is an archive of the discontinued LLVM Phabricator instance.

Correct register pressure calculation in presence of subregs
ClosedPublic

Authored by rampitec on Feb 10 2017, 10:37 AM.

Details

Summary

If a subreg is used in an instruction it counts as a whole superreg
for the purpose of register pressure calculation. This patch corrects
improper register pressure calculation by examining operand's lane mask.

Diff Detail

Repository
rL LLVM

Event Timeline

rampitec created this revision.Feb 10 2017, 10:37 AM
arsenm added inline comments.Feb 10 2017, 10:42 AM
test/CodeGen/AMDGPU/schedule-regpressure.mir
62–68 ↗(On Diff #88015)

Can you add this as a second test rather than modifying the existing test for m0

rampitec updated this revision to Diff 88018.Feb 10 2017, 10:53 AM
rampitec marked an inline comment as done.

Separated test per Matt's request.

I did not implement it this way yet because in general a bit in the lane mask doesn't necessarily represent 1 register pressure unit. Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well. I'd recomment running with -verify-misched when touching the regpressure code.

rampitec updated this revision to Diff 88026.Feb 10 2017, 11:29 AM

Added -verify-misched to the test.

I did not implement it this way yet because in general a bit in the lane mask doesn't necessarily represent 1 register pressure unit.

You are right, yet this estimation seems more correct to me. I think if/when we have targets for which this is not true we could extend TRI interface to return the estimation.

Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.

You are probably right, I was going to look into these places...

I'd recomment running with -verify-misched when touching the regpressure code.

Thanks, added.

I did not implement it this way yet because in general a bit in the lane mask doesn't necessarily represent 1 register pressure unit.

You are right, yet this estimation seems more correct to me. I think if/when we have targets for which this is not true we could extend TRI interface to return the estimation.

The current code is correct, just not particularily exact.

We cannot apply this without a plan to deal with those cases, even if it is just aborting/falling back to a pessimistic behavior.

Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.

You are probably right, I was going to look into these places...

We cannot apply a patch that just changes one part of the system.

I'd recomment running with -verify-misched when touching the regpressure code.

Thanks, added.

That was meant as a recommendation for running experiments (not necessarily for the test here) like llvm-lit -Dllc='llc -verify-misched'. I would expect the scheduler to fail in a number of cases when the pressure diffs and the decrease/increaseRegisterPressure() functions are out of sync.

I did not implement it this way yet because in general a bit in the lane mask doesn't necessarily represent 1 register pressure unit.

You are right, yet this estimation seems more correct to me. I think if/when we have targets for which this is not true we could extend TRI interface to return the estimation.

The current code is correct, just not particularily exact.
We cannot apply this without a plan to deal with those cases, even if it is just aborting/falling back to a pessimistic behavior.

I guess if we really see a problem like that then TRI interface can be extended to return a weight given RegUnit and Lanemask. In fact I'm trying to patch other places like increase/decreaseRegPressure and I had to separate this logic into a function. The very same function can be exposed by targets. Do you want to create it right now?

Without a similar patch to {increase|decrease}RegPressure() pressure book keeping will go out of sync as well.

You are probably right, I was going to look into these places...

We cannot apply a patch that just changes one part of the system.

I'm adding this changes at the moment.

I'm adding this changes at the moment.

In fact PressureDiff calculation and general RP tracking are independent. Now as I'm trying to extend the change for general RP tracking as well I really see many failures in lit tests...

rampitec updated this revision to Diff 88085.Feb 10 2017, 11:16 PM
rampitec retitled this revision from Correct PressureDiff calculation in presence of subregs to Correct register pressure calculation in presence of subregs.
rampitec edited the summary of this revision. (Show Details)

Added the same logic for all register pressure, not just PressureDiff.

rampitec updated this revision to Diff 89245.Feb 21 2017, 10:55 AM

Added callback for a target to estimate register unit weight given a lane mask, since that is unclear if proportional weight is correct for all targets.

alex-t accepted this revision.Feb 22 2017, 5:57 AM
alex-t added inline comments.
lib/CodeGen/RegisterPressure.cpp
43 ↗(On Diff #88085)

Given that getMaxLaneMaskForVReg returns all bit s that possibly can be set in the n=mask for given register class, what is the purpose of the AND? Unless you expect the bits not present in the register class to be set in LaneMask for the reg unit of this class.

This revision is now accepted and ready to land.Feb 22 2017, 5:57 AM
rampitec added inline comments.Feb 22 2017, 8:57 AM
lib/CodeGen/RegisterPressure.cpp
43 ↗(On Diff #88085)

In certain situations I just see -1 passed as a lane mask, mostly with Hexagon. In fact I can also see quite a few places where lane mask is initialized with default argument = LaneBitmask::getAll(), so until it is properly tracked at all places we need to take extra care. Without this I see a lot of assertions about pressure underflow from many lit tests.

This revision was automatically updated to reflect the committed changes.
MatzeB edited edge metadata.Feb 24 2017, 11:35 AM

The current implementation will miscalculate the PressureDiff objects. Could you revert this until we can further investigate and fix this.

llvm/trunk/include/llvm/Target/TargetRegisterInfo.h
723–724

This description makes no sense like that: Register units have no subregisters.

llvm/trunk/lib/CodeGen/MachineScheduler.cpp
1088

The LiveUses array here only contains register that have become live or completely dead, you do not get updates of some lanes getting added/remove at the moment. So the calculations for the pressure diffs will be out of sync when you do them at lane granularity like this.

The current implementation will miscalculate the PressureDiff objects. Could you revert this until we can further investigate and fix this.

I will revert it for now. There is yet another problem, for a target which does not override getRegUnitWeight that is possible to get pressure underflow because a full register weight is decreased with each subreg def.

Reverted in r296182.

Reverted in r296182.

Thanks!