This is an archive of the discontinued LLVM Phabricator instance.

Differential D33289

[AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker
ClosedPublic

Authored by vpykhtin on May 17 2017, 10:25 AM.

Details

Reviewers

rampitec
arsenm

Commits

rG74cb9c88314a: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker
rL303548: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker

Summary

This change fixes an incorrect maximum register pressure calculation in GCNUpwardRPTracker: the tracker reduced pressure for defs before incrementing pressure for uses, and so missed the possible maximum pressure of defs + uses at the machine instruction.
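To make the ordering issue concrete, here is a minimal, self-contained sketch. It is not LLVM code: `ToyInstr`, `ToyTracker`, `recedeOld` and `recedeNew` are hypothetical names, and pressure is modelled simply as the number of live virtual registers, with no lane masks or register classes.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for a machine instruction: virtual registers by id.
struct ToyInstr {
  std::vector<int> Defs;
  std::vector<int> Uses;
};

// Hypothetical stand-in for the upward tracker. Pressure is the live-set size.
struct ToyTracker {
  std::set<int> Live;
  unsigned MaxPressure = 0;

  unsigned curPressure() const { return static_cast<unsigned>(Live.size()); }

  // Start from the registers live after the last instruction of the region.
  void reset(std::set<int> LiveAfter) {
    Live = std::move(LiveAfter);
    MaxPressure = curPressure();
  }

  // Ordering before the patch: drop defs, add uses, then sample the maximum.
  void recedeOld(const ToyInstr &MI) {
    for (int D : MI.Defs) Live.erase(D);
    for (int U : MI.Uses) Live.insert(U);
    MaxPressure = std::max(MaxPressure, curPressure());
  }

  // Ordering after the patch: sample defs + uses at the instruction first,
  // then move the live set to the point before the instruction.
  void recedeNew(const ToyInstr &MI) {
    std::set<int> AtMI = Live;
    for (int U : MI.Uses) AtMI.insert(U);
    MaxPressure = std::max(MaxPressure, static_cast<unsigned>(AtMI.size()));
    for (int D : MI.Defs) Live.erase(D);
    for (int U : MI.Uses) Live.insert(U);
  }
};
```

With register 10 live after an instruction that defines 10 and uses 1 and 2, the old ordering records a maximum of 2 and the new ordering records 3, while the current pressure after receding is 2 in both cases.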

After several attempts to fix it with fewer lines of code, I decided it would be easier to introduce a MachineInstrRegs class, which collects the register defs/uses of a machine instruction together with their lanemasks. This class does a job similar to LLVM's standard RegisterOperands class but is much smaller.

Trying to figure out a test for this.

Diff Detail

Repository: rL LLVM

Event Timeline

vpykhtin created this revision. May 17 2017, 10:25 AM

Herald added subscribers: t-tye, tpr, dstuttard and 4 others. May 17 2017, 10:25 AM

Where is GCNUpwardRPTracker::reset() now? What base revision is this diff taken against? GCNUpwardRPTracker::reset() is right before recede() in the current revision, but I do not see it in the left pane either. The same goes for some other functions, such as getDefRegMask.

lib/Target/AMDGPU/GCNRegPressure.cpp
295 ↗	(On Diff #99324)	You have GCNRPTracker::getDefRegMask() for this.
300 ↗	(On Diff #99324)	And there is getUsedRegMask() for this too.
311 ↗	(On Diff #99324)	Why not do it right above, where you assign getAll()? That seems to be less work.
346 ↗	(On Diff #99324)	Don't you also want to erase it from LiveRegs if LaneMask.none()?

rampitec added inline comments. May 17 2017, 2:51 PM

lib/Target/AMDGPU/GCNRegPressure.cpp
346 ↗	(On Diff #99324)	You are now updating MaxPressure correctly, but your CurPressure is incorrect. At this point defs still contribute to the current pressure; they drop out only with the next recede. That may lead the scheduler to wrong decisions about the current instruction. You can also run into the paradoxical situation that no single instruction has a pressure equal to the max. I still believe that where we want both defs and uses to contribute to the RP of an instruction, we naturally have a two-step advance/recede process: step before/past the instruction, and actually step to it.

vpykhtin added inline comments. May 18 2017, 3:15 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
295 ↗	(On Diff #99324)	This is another class, though getDefRegMask can be made a non-member.
300 ↗	(On Diff #99324)	I'm avoiding a getLiveLaneMask call here before all registers are collected.
311 ↗	(On Diff #99324)	This avoids calling getLiveLaneMask more than once for a register.
346 ↗	(On Diff #99324)	Why erase? recede actually moves from the point after the instruction (in top-down order) to the point before it, accounting for the max pressure in between. recede never stops at the instruction.

I inserted MachineInstrRegs between reset and recede, all functions are in old places. I should move MachineInstrRegs higher

In D33289#758351, @vpykhtin wrote:

I inserted MachineInstrRegs between reset and recede, all functions are in old places. I should move MachineInstrRegs higher

I see it now, thanks. Can you move it higher please?

lib/Target/AMDGPU/GCNRegPressure.cpp
295 ↗	(On Diff #99324)	Probably static inline is just right for it. Even for getUsedRegMask if you pass LIS to it.
311 ↗	(On Diff #99324)	Do you see a common situation where the same register is used in the same instruction more than once? This sounds quite exotic to me, provided we are speaking about uses only, not defs.
346 ↗	(On Diff #99324)	If you erase it, you will not iterate over it in getRegPressure(). I understand your point that recede does not stop in steps, but I'm still concerned that you will not get a correct CurPressure, or will not even get a CurPressure equal to the max pressure anywhere in the region. How about that?

In D33289#758647, @rampitec wrote:

I see it now, thanks. Can you move it higher please?

Yes I'll fix that.

lib/Target/AMDGPU/GCNRegPressure.cpp
311 ↗	(On Diff #99324)	I agree it's going to be a very rare occasion, but the main purpose of this class is to deduplicate register defs/uses so they aren't counted twice when calculating pressure. So if I have already deduplicated the registers, I can save some time on calculating the mask.
346 ↗	(On Diff #99324)	OK, now that live regs can be reused, there may be a point in clearing a register immediately. Previously I used stripEmpty on the set, but only for debug printing purposes. CurPressure isn't calculated at the at-the-instruction level; it's calculated for the point after recede. I put an assert at the end of recede that CurPressure is calculated correctly. CurPressure can never become MaxPressure, but I don't see a problem here. There is no at-the-instruction position in the tracker: it is always in between.
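
The deduplication of uses discussed in this thread can be sketched outside LLVM as follows. `ToyOperand`, `ToyRegMask` and `collectUses` are hypothetical names; lane masks are plain unsigned bit sets here, whereas the patch derives each mask from MachineRegisterInfo and LiveIntervals.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical register operand: a virtual register id plus the lanes it reads.
struct ToyOperand {
  unsigned Reg;
  unsigned LaneMask;
};

// One entry per distinct register, holding the union of all lanes read.
struct ToyRegMask {
  unsigned Reg;
  unsigned LaneMask;
};

// Merge repeated uses of the same register so that each register contributes
// to the pressure calculation only once.
std::vector<ToyRegMask> collectUses(const std::vector<ToyOperand> &Uses) {
  std::vector<ToyRegMask> Res;
  for (const ToyOperand &MO : Uses) {
    auto I = std::find_if(
        Res.begin(), Res.end(),
        [&MO](const ToyRegMask &RM) { return RM.Reg == MO.Reg; });
    if (I != Res.end())
      I->LaneMask |= MO.LaneMask; // same register seen again: widen the mask
    else
      Res.push_back({MO.Reg, MO.LaneMask});
  }
  return Res;
}
```

A linear search is used because an instruction has only a handful of operands; that is also an assumption of this sketch rather than a measured result.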

rampitec added inline comments. May 18 2017, 10:30 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
295 ↗	(On Diff #99324)	Actually getUsedRegMask is unused now, so you can just delete it.
311 ↗	(On Diff #99324)	I mean, if that is extremely rare you may lose more in the inexpensive but way more often called second loop.

vpykhtin added inline comments. May 18 2017, 10:37 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
311 ↗	(On Diff #99324)	OK, I'll make getDefRegMask and getUsedRegMask static and reuse them here, removing the bottom loop. Thanks.

rampitec added inline comments. May 19 2017, 2:21 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
269 ↗	(On Diff #99324)	It looks like defs are not really needed in this class. Uses are needed because you walk them twice, but defs can just be processed directly, i.e. the code can be simplified and the overhead somewhat reduced.
346 ↗	(On Diff #99324)	I see it as a sort of quantum tracker. It hides the intermediate step, where pressure actually peaks, from an observer. As long as we agree on that understanding, I have no objection to submitting such a tracker, where the actual pressure "tunnels" through the recede method, as we currently have no interested observers. We might want to split the method in the future if we have them.

rampitec mentioned this in D33087: [AMDGCN] Fix overly optimistic GCNUpwardRPTracker. May 19 2017, 2:25 AM

fixed as per comments

LGTM. Thanks!

This revision is now accepted and ready to land. May 19 2017, 9:21 AM

rampitec added inline comments. May 19 2017, 9:43 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
313 ↗	(On Diff #99544)	Are you sure it should always be present? What if we have a dead def, i.e. an instruction defines a register which is never used? I guess it will not be reported by LIS. If so, this should be if (I != LiveRegs.end()) continue;

rampitec added inline comments. May 19 2017, 9:44 AM

lib/Target/AMDGPU/GCNRegPressure.cpp
313 ↗	(On Diff #99544)	Sorry, if (I == LiveRegs.end()) continue;
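
The lookup pattern being asked for here can be illustrated with a small standalone sketch. `ToyLiveRegSet` and `clearDefLanes` are hypothetical names, and `std::map` stands in for the tracker's actual live-register container. The point is that `find` leaves the set untouched when the register is absent, whereas `operator[]` would insert an empty entry.

```cpp
#include <cassert>
#include <map>

// Hypothetical live set: virtual register id -> mask of live lanes.
using ToyLiveRegSet = std::map<unsigned, unsigned>;

// Clear the lanes written by a def while receding over an instruction.
// Returns false if the register was not live (for example, a dead def).
bool clearDefLanes(ToyLiveRegSet &LiveRegs, unsigned Reg, unsigned DefMask) {
  auto I = LiveRegs.find(Reg);
  if (I == LiveRegs.end())
    return false; // not tracked: nothing to remove, and no entry is created
  I->second &= ~DefMask;
  if (I->second == 0)
    LiveRegs.erase(I); // no live lanes left: drop the entry entirely
  return true;
}
```

Erasing empty entries keeps the tracked set directly comparable with a freshly computed live set, which is the role stripEmpty played before this change.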

Closed by commit rL303548: [AMDGPU] Fix incorrect register usage tracking in GCNUpwardTracker (authored by vpykhtin). May 22 2017, 6:09 AM

This revision was automatically updated to reflect the committed changes.

vpykhtin marked an inline comment as done. May 22 2017, 6:11 AM

vpykhtin added inline comments.

lib/Target/AMDGPU/GCNRegPressure.cpp
313 ↗	(On Diff #99544)	Done, thanks!

This change fixes incorrect maximum register pressure calculation in GCNUpwardRPTracker: it reduced pressure of defs before incrementing pressure on uses losing the possible maximum pressure of defs + uses at the machine instruction.

Hi @vpykhtin! I don't understand the reason for this change. Why should max pressure include both the uses and the defs of one instruction? The uses and defs are not live at the same time and can be allocated to overlapping physical registers (assuming the uses are killed by the instruction). There should be an exception for early-clobber def operands but they are not very common.

+ @piotr @critson

Herald added projects: Restricted Project, Restricted Project. Oct 16 2023, 6:46 AM

Herald added subscribers: llvm-commits, kerbowa, jvesely.

In D33289#4654033, @foad wrote:

This change fixes incorrect maximum register pressure calculation in GCNUpwardRPTracker: it reduced pressure of defs before incrementing pressure on uses losing the possible maximum pressure of defs + uses at the machine instruction.

Hi @vpykhtin! I don't understand the reason for this change. Why should max pressure include both the uses and the defs of one instruction? The uses and defs are not live at the same time and can be allocated to overlapping physical registers (assuming the uses are killed by the instruction). There should be an exception for early-clobber def operands but they are not very common.

+ @piotr @critson

Would it make sense to only do the AtMIPressure part for instructions with early clobber?

In D33289#4654038, @piotr wrote:

Would it make sense to only do the AtMIPressure part for instructions with early clobber?

You're right, we should not increment pressure for early-clobbers twice in AtMIPressure

Sorry, disregard this comment. We should only account for early-clobbers this way.
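
One possible reading of this suggestion is sketched below. It is not taken from any committed change: `ToyDef`, `ToyInstrWithClobbers` and `pressureAtInstr` are hypothetical names, pressure is a plain register count, and the sketch assumes that an ordinary def may share a physical register with a use, while an early-clobber def must stay live alongside every use.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical def operand: register id plus an early-clobber flag.
struct ToyDef {
  int Reg;
  bool EarlyClobber;
};

// Hypothetical instruction with flagged defs and plain use register ids.
struct ToyInstrWithClobbers {
  std::vector<ToyDef> Defs;
  std::vector<int> Uses;
};

// Pressure at the instruction: ordinary defs are dropped before the uses are
// added, so only early-clobber defs overlap with the uses.
unsigned pressureAtInstr(const std::set<int> &LiveAfter,
                         const ToyInstrWithClobbers &MI) {
  std::set<int> AtMI = LiveAfter;
  for (const ToyDef &D : MI.Defs)
    if (!D.EarlyClobber)
      AtMI.erase(D.Reg);
  for (int U : MI.Uses)
    AtMI.insert(U);
  return static_cast<unsigned>(AtMI.size());
}
```

With register 10 live after an instruction that defines 10 and uses 1 and 2, this gives 2 for an ordinary def and 3 when the def is early-clobber.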

Is anyone working on this at the moment?

In D33289#4654317, @vpykhtin wrote:

Is anyone working on this at the moment?

I'm not working on it but I have been thinking about it. The first problem is how to write regression tests for GCNRegPressure.

In D33289#4654324, @foad wrote:

I'm not working on it but I have been thinking about it. The first problem is how to write regression tests for GCNRegPressure.

I can probably come up with a unit test in the way similar to how we test LiveIntervals and LiveVariables.

In D33289#4654325, @vpykhtin wrote:

I can probably come up with a unit test in the way similar to how we test LiveIntervals and LiveVariables.

I wonder if we can test it more directly by adding an analysis pass like this: https://github.com/GPUOpen-Drivers/llvm-project/commit/042be23e3d98963fb02833511a86f4e26378a04d
and then using something like opt -passes='print<amdgpu-reg-press>'.

We can try to print reg pressure at every instruction

Something like this? (don't forget to expand *.mir file diff, it's not shown by default)

https://github.com/llvm/llvm-project/compare/main...vpykhtin:llvm-project:rp_printer

Revision Contents

Path	Size
llvm/trunk/lib/Target/AMDGPU/GCNRegPressure.h	2 lines
llvm/trunk/lib/Target/AMDGPU/GCNRegPressure.cpp	148 lines

Diff 99751

llvm/trunk/lib/Target/AMDGPU/GCNRegPressure.h

[92 lines not shown]

 protected:
   const LiveIntervals &LIS;
   LiveRegSet LiveRegs;
   GCNRegPressure CurPressure, MaxPressure;
   const MachineInstr *LastTrackedMI = nullptr;
   mutable const MachineRegisterInfo *MRI = nullptr;
   GCNRPTracker(const LiveIntervals &LIS_) : LIS(LIS_) {}
-  LaneBitmask getDefRegMask(const MachineOperand &MO) const;
-  LaneBitmask getUsedRegMask(const MachineOperand &MO) const;
 public:
   // live regs for the current state
   const decltype(LiveRegs) &getLiveRegs() const { return LiveRegs; }
   const MachineInstr *getLastTrackedMI() const { return LastTrackedMI; }

   void clearMaxPressure() { MaxPressure.clear(); }

   // returns MaxPressure, resetting it
[99 lines not shown]

llvm/trunk/lib/Target/AMDGPU/GCNRegPressure.cpp

 //===------------------------- GCNRegPressure.cpp - -----------------------===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//
 //
 /// \file
 //
 //===----------------------------------------------------------------------===//

 #include "GCNRegPressure.h"
+#include "llvm/CodeGen/RegisterPressure.h"

 using namespace llvm;

 #define DEBUG_TYPE "misched"

 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 LLVM_DUMP_METHOD
 void llvm::printLivesAt(SlotIndex SI,
[35 lines not shown; context: static bool isEqual(const GCNRPTracker::LiveRegSet &S1,]
   for (const auto &P : S1) {
     auto I = S2.find(P.first);
     if (I == S2.end() || I->second != P.second)
       return false;
   }
   return true;
 }

-static GCNRPTracker::LiveRegSet
-stripEmpty(const GCNRPTracker::LiveRegSet &LR) {
-  GCNRPTracker::LiveRegSet Res;
-  for (const auto &P : LR) {
-    if (P.second.any())
-      Res.insert(P);
-  }
-  return Res;
-}
 #endif

 ///////////////////////////////////////////////////////////////////////////////
 // GCNRegPressure

 unsigned GCNRegPressure::getRegKind(unsigned Reg,
                                     const MachineRegisterInfo &MRI) {
   assert(TargetRegisterInfo::isVirtualRegister(Reg));
[97 lines not shown; context: void GCNRegPressure::print(raw_ostream &OS, const SISubtarget *ST) const {]
   if (ST) OS << "(O" << ST->getOccupancyWithNumSGPRs(getSGPRNum()) << ')';
   OS << ", LVGPR WT: " << getVGPRTuplesWeight()
      << ", LSGPR WT: " << getSGPRTuplesWeight();
   if (ST) OS << " -> Occ: " << getOccupancy(*ST);
   OS << '\n';
 }
 #endif


+static LaneBitmask getDefRegMask(const MachineOperand &MO,
+                                 const MachineRegisterInfo &MRI) {
+  assert(MO.isDef() && MO.isReg() &&
+         TargetRegisterInfo::isVirtualRegister(MO.getReg()));
+
+  // We don't rely on read-undef flag because in case of tentative schedule
+  // tracking it isn't set correctly yet. This works correctly however since
+  // use mask has been tracked before using LIS.
+  return MO.getSubReg() == 0 ?
+    MRI.getMaxLaneMaskForVReg(MO.getReg()) :
+    MRI.getTargetRegisterInfo()->getSubRegIndexLaneMask(MO.getSubReg());
+}
+
+static LaneBitmask getUsedRegMask(const MachineOperand &MO,
+                                  const MachineRegisterInfo &MRI,
+                                  const LiveIntervals &LIS) {
+  assert(MO.isUse() && MO.isReg() &&
+         TargetRegisterInfo::isVirtualRegister(MO.getReg()));
+
+  if (auto SubReg = MO.getSubReg())
+    return MRI.getTargetRegisterInfo()->getSubRegIndexLaneMask(SubReg);
+
+  auto MaxMask = MRI.getMaxLaneMaskForVReg(MO.getReg());
+  if (MaxMask.getAsInteger() == 1) // cannot have subregs
+    return MaxMask;
+
+  // For a tentative schedule LIS isn't updated yet but livemask should remain
+  // the same on any schedule. Subreg defs can be reordered but they all must
+  // dominate uses anyway.
+  auto SI = LIS.getInstructionIndex(*MO.getParent()).getBaseIndex();
+  return getLiveLaneMask(MO.getReg(), SI, LIS, MRI);
+}
+
+SmallVector<RegisterMaskPair, 8> collectVirtualRegUses(const MachineInstr &MI,
+                                                       const LiveIntervals &LIS,
+                                                       const MachineRegisterInfo &MRI) {
+  SmallVector<RegisterMaskPair, 8> Res;
+  for (const auto &MO : MI.operands()) {
+    if (!MO.isReg() || !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
+      continue;
+    if (!MO.isUse() || !MO.readsReg())
+      continue;
+
+    auto const UsedMask = getUsedRegMask(MO, MRI, LIS);
+
+    auto Reg = MO.getReg();
+    auto I = std::find_if(Res.begin(), Res.end(), [Reg](const RegisterMaskPair &RM) {
+      return RM.RegUnit == Reg;
+    });
+    if (I != Res.end())
+      I->LaneMask |= UsedMask;
+    else
+      Res.push_back(RegisterMaskPair(Reg, UsedMask));
+  }
+  return Res;
+}

 ///////////////////////////////////////////////////////////////////////////////
 // GCNRPTracker

 LaneBitmask llvm::getLiveLaneMask(unsigned Reg,
                                   SlotIndex SI,
                                   const LiveIntervals &LIS,
                                   const MachineRegisterInfo &MRI) {
   LaneBitmask LiveMask;
[21 lines not shown; context: if (!LIS.hasInterval(Reg))]
       continue;
     auto LiveMask = getLiveLaneMask(Reg, SI, LIS, MRI);
     if (LiveMask.any())
       LiveRegs[Reg] = LiveMask;
   }
   return LiveRegs;
 }

-LaneBitmask GCNRPTracker::getDefRegMask(const MachineOperand &MO) const {
-  assert(MO.isDef() && MO.isReg() &&
-         TargetRegisterInfo::isVirtualRegister(MO.getReg()));
-
-  // We don't rely on read-undef flag because in case of tentative schedule
-  // tracking it isn't set correctly yet. This works correctly however since
-  // use mask has been tracked before using LIS.
-  return MO.getSubReg() == 0 ?
-    MRI->getMaxLaneMaskForVReg(MO.getReg()) :
-    MRI->getTargetRegisterInfo()->getSubRegIndexLaneMask(MO.getSubReg());
-}
-
-LaneBitmask GCNRPTracker::getUsedRegMask(const MachineOperand &MO) const {
-  assert(MO.isUse() && MO.isReg() &&
-         TargetRegisterInfo::isVirtualRegister(MO.getReg()));
-
-  if (auto SubReg = MO.getSubReg())
-    return MRI->getTargetRegisterInfo()->getSubRegIndexLaneMask(SubReg);
-
-  auto MaxMask = MRI->getMaxLaneMaskForVReg(MO.getReg());
-  if (MaxMask.getAsInteger() == 1) // cannot have subregs
-    return MaxMask;
-
-  // For a tentative schedule LIS isn't updated yet but livemask should remain
-  // the same on any schedule. Subreg defs can be reordered but they all must
-  // dominate uses anyway.
-  auto SI = LIS.getInstructionIndex(*MO.getParent()).getBaseIndex();
-  return getLiveLaneMask(MO.getReg(), SI, LIS, *MRI);
-}
-
 void GCNUpwardRPTracker::reset(const MachineInstr &MI,
                                const LiveRegSet *LiveRegsCopy) {
   MRI = &MI.getParent()->getParent()->getRegInfo();
   if (LiveRegsCopy) {
     if (&LiveRegs != LiveRegsCopy)
       LiveRegs = *LiveRegsCopy;
   } else {
     LiveRegs = getLiveRegsAfter(MI, LIS);
   }
   MaxPressure = CurPressure = getRegPressure(*MRI, LiveRegs);
 }

 void GCNUpwardRPTracker::recede(const MachineInstr &MI) {
   assert(MRI && "call reset first");

   LastTrackedMI = &MI;

   if (MI.isDebugValue())
     return;

-  // process all defs first to ensure early clobbers are handled correctly
-  // iterating over operands() to catch implicit defs
-  for (const auto &MO : MI.operands()) {
-    if (!MO.isReg() || !MO.isDef() ||
-        !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
+  auto const RegUses = collectVirtualRegUses(MI, LIS, *MRI);
+
+  // calc pressure at the MI (defs + uses)
+  auto AtMIPressure = CurPressure;
+  for (const auto &U : RegUses) {
+    auto LiveMask = LiveRegs[U.RegUnit];
+    AtMIPressure.inc(U.RegUnit, LiveMask, LiveMask | U.LaneMask, *MRI);
+  }
+  // update max pressure
+  MaxPressure = max(AtMIPressure, MaxPressure);
+
+  for (const auto &MO : MI.defs()) {
+    if (!MO.isReg() || !TargetRegisterInfo::isVirtualRegister(MO.getReg()) ||
+        MO.isDead())
       continue;

     auto Reg = MO.getReg();
-    auto &LiveMask = LiveRegs[Reg];
+    auto I = LiveRegs.find(Reg);
+    if (I == LiveRegs.end())
+      continue;
+    auto &LiveMask = I->second;
     auto PrevMask = LiveMask;
-    LiveMask &= ~getDefRegMask(MO);
+    LiveMask &= ~getDefRegMask(MO, *MRI);
     CurPressure.inc(Reg, PrevMask, LiveMask, *MRI);
+    if (LiveMask.none())
+      LiveRegs.erase(I);
   }
-
-  // then all uses
-  for (const auto &MO : MI.uses()) {
-    if (!MO.isReg() || !MO.readsReg() ||
-        !TargetRegisterInfo::isVirtualRegister(MO.getReg()))
-      continue;
-
-    auto Reg = MO.getReg();
-    auto &LiveMask = LiveRegs[Reg];
+  for (const auto &U : RegUses) {
+    auto &LiveMask = LiveRegs[U.RegUnit];
     auto PrevMask = LiveMask;
-    LiveMask |= getUsedRegMask(MO);
-    CurPressure.inc(Reg, PrevMask, LiveMask, *MRI);
+    LiveMask |= U.LaneMask;
+    CurPressure.inc(U.RegUnit, PrevMask, LiveMask, *MRI);
   }
-
-  MaxPressure = max(MaxPressure, CurPressure);
+  assert(CurPressure == getRegPressure(*MRI, LiveRegs));
 }

 bool GCNDownwardRPTracker::reset(const MachineInstr &MI,
                                  const LiveRegSet *LiveRegsCopy) {
   MRI = &MI.getParent()->getParent()->getRegInfo();
   LastTrackedMI = nullptr;
   MBBEnd = MI.getParent()->end();
   NextMI = &MI;
[52 lines not shown; context: void GCNDownwardRPTracker::advanceToNext() {]
   for (const auto &MO : LastTrackedMI->defs()) {
     if (!MO.isReg())
       continue;
     unsigned Reg = MO.getReg();
     if (!TargetRegisterInfo::isVirtualRegister(Reg))
       continue;
     auto &LiveMask = LiveRegs[Reg];
     auto PrevMask = LiveMask;
-    LiveMask |= getDefRegMask(MO);
+    LiveMask |= getDefRegMask(MO, *MRI);
     CurPressure.inc(Reg, PrevMask, LiveMask, *MRI);
   }

   MaxPressure = max(MaxPressure, CurPressure);
 }

 bool GCNDownwardRPTracker::advance() {
   // If we have just called reset live set is actual.
[45 lines not shown; context: if (I == TrackedLR.end()) {]
           << " isn't found in tracked set\n";
     }
   }
 }

 bool GCNUpwardRPTracker::isValid() const {
   const auto &SI = LIS.getInstructionIndex(*LastTrackedMI).getBaseIndex();
   const auto LISLR = llvm::getLiveRegs(SI, LIS, *MRI);
-  const auto TrackedLR = stripEmpty(LiveRegs);
+  const auto &TrackedLR = LiveRegs;

   if (!isEqual(LISLR, TrackedLR)) {
     dbgs() << "\nGCNUpwardRPTracker error: Tracked and"
               " LIS reported livesets mismatch:\n";
     printLivesAt(SI, LIS, *MRI);
     reportMismatch(LISLR, TrackedLR, MRI->getTargetRegisterInfo());
     return false;
   }
[25 lines not shown]