This is an archive of the discontinued LLVM Phabricator instance.

Thanks for updating the patch. I think it would be slightly easier to review this patch if you could provide a brief high-level description of the modelling and some test cases.

fhahn added inline comments.

lib/CodeGen/MachinePipeliner.cpp
1394	Could we use a range-based for loop here?
1397	I think we need to ignore debug instructions here.
lib/Target/AArch64/AArch64InstrInfo.cpp
5106	Does this implementation satisfy the interface? According to TargetInstrInfo::reduceLoopCount, it should generate code to reduce the loop iteration by one.
5114	Couldn't we stop after we found the SUBSXrr closest to the terminator?
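
The two MachinePipeliner.cpp comments above can be illustrated with a small standalone sketch. It is not LLVM code: `Instr` and `issueWidthResMII` are hypothetical names invented here, and the flags stand in for what the real `MachineInstr` queries would report. It only shows a range-based loop that skips debug instructions while computing the issue-width bound ceil(N / IssueWidth) used in the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for MachineInstr; only the flags this sketch needs.
struct Instr {
  bool IsPHI = false;
  bool IsDebug = false;
  bool IsTerminator = false;
};

// Issue-width-only lower bound on the initiation interval:
// ceil(N / IssueWidth), where N counts the instructions that are neither
// PHIs, debug instructions, nor terminators.
unsigned issueWidthResMII(const std::vector<Instr> &Block,
                          unsigned IssueWidth) {
  assert(IssueWidth > 0 && "issue width must be positive");
  unsigned Size = 0;
  for (const Instr &I : Block) {
    if (I.IsPHI || I.IsDebug || I.IsTerminator)
      continue; // debug info must not change the schedule
    ++Size;
  }
  return (Size + IssueWidth - 1) / IssueWidth;
}
```

Filtering by flag gives the same count as walking from the first non-PHI instruction to the first terminator, as the patch does, provided PHIs sit at the top of the block and terminators at the bottom.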

Thank you for your detailed comment.
Your points make sense to me.

I should have clarified the intent of this patch.
I do not think that this patch will be accepted.
It is meant to show that we can generate code without DFAPacketizer while making as few changes as possible.
For that reason, the method interfaces and the code stay as close to the original as possible.
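
As a rough illustration of scheduling without DFAPacketizer, the sketch below models one cycle's resources as plain counters, in the spirit of the `ResourceTable` used by the patch. It is a simplified, standalone model: `CycleResources` and `tryReserve` are hypothetical names, the candidate list stands in for an instruction's write-resource entries, and the sketch does not reproduce the patch's exact checks or the ResourceCycles handling that the patch's TODO mentions.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified model of the resources available in one cycle.
struct CycleResources {
  std::vector<unsigned> NumUnits; // units available per resource kind
  std::vector<unsigned> Used;     // units already reserved per resource kind
  unsigned IssueWidth;            // maximum instructions issued per cycle
  unsigned Issued = 0;            // instructions issued so far

  CycleResources(std::vector<unsigned> Units, unsigned Width)
      : NumUnits(std::move(Units)), Used(NumUnits.size(), 0),
        IssueWidth(Width) {}

  // Try to reserve one unit of any resource kind listed in Candidates.
  // Returns false and leaves the state unchanged if the cycle is full.
  bool tryReserve(const std::vector<std::size_t> &Candidates) {
    if (Issued >= IssueWidth)
      return false;
    if (Candidates.empty()) { // no modelled resource: only an issue slot
      ++Issued;
      return true;
    }
    for (std::size_t Idx : Candidates) {
      if (Used[Idx] < NumUnits[Idx]) {
        ++Used[Idx];
        ++Issued;
        return true;
      }
    }
    return false;
  }
};
```

Merging the "can reserve" and "reserve" steps into one call is a choice made for this sketch; the patch keeps them as two separate functions.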

masakiarai added inline comments.

lib/Target/AArch64/AArch64InstrInfo.cpp
5106	Hexagon uses special loop instructions that count the loop counter down. In most cases on AArch64 or x86_64, on the other hand, we will target loops that count the loop counter up. I therefore think it would be more appropriate to name the function 'TargetInstrInfo::fixLoopCount' rather than 'TargetInstrInfo::reduceLoopCount'. I thought it was inappropriate to rewrite the Hexagon code, so I did not change it.
5114	Yes, you are right. I should break out of the loop as soon as I find the first CompMI. X86InstrInfo::reduceLoopCount has the same bug.
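
The fix agreed on for line 5114 can be sketched outside LLVM as follows. The posted loop walks the block in reverse but keeps overwriting CompMI, so it ends up holding the compare farthest from the terminator. The version below returns at the first match and reports "not found" explicitly instead of dereferencing a null pointer. `Opcode`, `Instr`, and `findLastCompare` are hypothetical names for this sketch, which assumes C++17 for `std::optional`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical miniature instruction model for this sketch.
enum class Opcode { Add, Compare, Branch, Other };

struct Instr {
  Opcode Op;
  unsigned Reg; // register read by the instruction (illustrative only)
};

// Walk the block backwards and stop at the first compare found, which is
// the compare closest to the terminator. Returns its index, or
// std::nullopt if the block contains no compare.
std::optional<std::size_t> findLastCompare(const std::vector<Instr> &Block) {
  for (std::size_t I = Block.size(); I-- > 0;) {
    if (Block[I].Op == Opcode::Compare)
      return I; // stop here instead of scanning the rest of the block
  }
  return std::nullopt;
}
```

A caller would check the result before reading the loop-count register, which also covers the case where the block contains no compare at all.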

kparzysz added a subscriber: kparzysz. Jun 12 2018, 3:12 PM

kparzysz added inline comments.

lib/Target/AArch64/AArch64InstrInfo.cpp
5106	You can change the Hexagon code if it makes it easier to adopt the pipeliner for other architectures.

lsaba added a subscriber: lsaba. Jun 13 2018, 1:43 AM

lsaba removed a subscriber: lsaba.

lsaba added a subscriber: lsaba.

In D47943#1129236, @masakiarai wrote:

Thank you for your detailed comment.
Your points make sense to me.

I should have clarified the intent of this patch.
I do not think that this patch will be accepted.
It is meant to show that we can generate code without DFAPacketizer while making as few changes as possible.
For that reason, the method interfaces and the code stay as close to the original as possible.

Ok thanks. Is there anything else you want feedback on at the moment?

No, at the moment there is nothing.
Since I believe there was no objection to extending MachinePipeliner, I am currently preparing a patch intended for upstreaming.
Please review it.
Thank you very much.

Marking this as requiring changes, as the MachinePipeliner code is getting refactored a bit in D56084. Please let us know when you want us to review the patch again.

This revision now requires changes to proceed. Jan 11 2019, 8:47 AM

Revision Contents

Path	Size
lib/CodeGen/MachinePipeliner.cpp	146 lines
lib/Target/AArch64/AArch64InstrInfo.h	10 lines
lib/Target/AArch64/AArch64InstrInfo.cpp	55 lines
lib/Target/AArch64/AArch64TargetMachine.cpp	2 lines
lib/Target/X86/X86InstrInfo.h	10 lines
lib/Target/X86/X86InstrInfo.cpp	46 lines
lib/Target/X86/X86TargetMachine.cpp	3 lines

Diff 150668

lib/CodeGen/MachinePipeliner.cpp

@@ ... 113 lines not shown ... @@
 #include <functional>
 #include <iterator>
 #include <map>
 #include <memory>
 #include <tuple>
 #include <utility>
 #include <vector>

+/* #define USE_DFAPacketizer_P 1 */
+
 using namespace llvm;

 #define DEBUG_TYPE "pipeliner"

 STATISTIC(NumTrytoPipeline, "Number of loops that we attempt to pipeline");
 STATISTIC(NumPipelined, "Number of loops software pipelined");

 /// A command line option to turn software pipelining on or off.
@@ ... 466 lines not shown ... @@ private:
   int InitiationInterval = 0;

   /// Target machine information.
   const TargetSubtargetInfo &ST;

   /// Virtual register information.
   MachineRegisterInfo &MRI;

+#ifdef USE_DFAPacketizer_P
   std::unique_ptr<DFAPacketizer> Resources;
+#endif /* USE_DFAPacketizer_P */
 public:
+#ifdef USE_DFAPacketizer_P
   SMSchedule(MachineFunction *mf)
       : ST(mf->getSubtarget()), MRI(mf->getRegInfo()),
         Resources(ST.getInstrInfo()->CreateTargetScheduleState(ST)) {}
+#else
+  SMSchedule(MachineFunction *mf)
+      : ST(mf->getSubtarget()), MRI(mf->getRegInfo()) {}
+#endif /* USE_DFAPacketizer_P */

   void reset() {
     ScheduledInstrs.clear();
     InstrToCycle.clear();
     RegToStageDiff.clear();
     FirstCycle = 0;
     LastCycle = 0;
     InitiationInterval = 0;
@@ ... 686 lines not shown ... @@ bool operator()(const MachineInstr *IS1, const MachineInstr *IS2) const {
     if (MFUs1 == 1 && MFUs2 == 1)
       return Resources.lookup(F1) < Resources.lookup(F2);
     return MFUs1 > MFUs2;
   }
 };

 } // end anonymous namespace

+#ifdef USE_DFAPacketizer_P
 /// Calculate the resource constrained minimum initiation interval for the
 /// specified loop. We use the DFA to model the resources needed for
 /// each instruction, and we ignore dependences. A different DFA is created
 /// for each cycle that is required. When adding a new instruction, we attempt
 /// to add it to each existing DFA, until a legal space is found. If the
 /// instruction cannot be reserved in an existing DFA, we create a new one.
 unsigned SwingSchedulerDAG::calculateResMII() {
   SmallVector<DFAPacketizer *, 8> Resources;
@@ ... 52 lines not shown ... @@ unsigned SwingSchedulerDAG::calculateResMII() {
   // Delete the memory for each of the DFAs that were created earlier.
   for (DFAPacketizer *RI : Resources) {
     DFAPacketizer *D = RI;
     delete D;
   }
   Resources.clear();
   return Resmii;
 }
+#else
+unsigned SwingSchedulerDAG::calculateResMII() {
+  // Consider only issue width
+  MachineBasicBlock *MBB = Loop.getHeader();
+  unsigned size = 0;
+  for (MachineBasicBlock::iterator I = MBB->getFirstNonPHI(),
+                                   E = MBB->getFirstTerminator();
+       I != E; ++I) {
+    size++;
+  }
+  return (size + SchedModel.getIssueWidth() - 1) / SchedModel.getIssueWidth();
+}
+#endif /* USE_DFAPacketizer_P */

 /// Calculate the recurrence-constrainted minimum initiation interval.
 /// Iterate over each circuit. Compute the delay(c) and distance(c)
 /// for each circuit. The II needs to satisfy the inequality
 /// delay(c) - II*distance(c) <= 0. For each circuit, choose the smallest
 /// II that satistifies the inequality, and the RecMII is the maximum
 /// of those values.
 unsigned SwingSchedulerDAG::calculateRecMII(NodeSetType &NodeSets) {
@@ ... 2,090 lines not shown ... @@ bool SwingSchedulerDAG::isLoopCarriedOrder(SUnit *Source, const SDep &Dep,
   return true;
 }

 void SwingSchedulerDAG::postprocessDAG() {
   for (auto &M : Mutations)
     M->apply(this);
 }

+#ifdef USE_DFAPacketizer_P
 /// Try to schedule the node at the specified StartCycle and continue
 /// until the node is schedule or the EndCycle is reached. This function
 /// returns true if the node is scheduled. This routine may search either
 /// forward or backward for a place to insert the instruction based upon
 /// the relative values of StartCycle and EndCycle.
 bool SMSchedule::insert(SUnit *SU, int StartCycle, int EndCycle, int II) {
   bool forward = true;
   if (StartCycle > EndCycle)
@@ ... 37 lines not shown ... @@ for (int curCycle = StartCycle; curCycle != termCycle;
     }
     DEBUG({
       dbgs() << "\tfailed to insert at cycle " << curCycle << " ";
       SU->getInstr()->dump();
     });
   }
   return false;
 }
+#else
+static void clearResources(std::vector<unsigned> &ResourceTable) {
+  for (unsigned Idx = 0; Idx < ResourceTable.size(); ++Idx) {
+    ResourceTable[Idx] = 0;
+  }
+}
+
+static bool canReserveResources(const TargetSubtargetInfo *STI,
+                                const MCSchedModel &SchedModel,
+                                std::vector<unsigned> &ResourceTable,
+                                SUnit *SU) {
+  unsigned SchedClass = SU->getInstr()->getDesc().getSchedClass();
+  const MCSchedClassDesc *SC = SchedModel.getSchedClassDesc(SchedClass);
+  unsigned Issues = 0;
+  for (unsigned Idx = 0; Idx < ResourceTable.size(); ++Idx) {
+    Issues += ResourceTable[Idx];
+  }
+  if (Issues >= SchedModel.IssueWidth)
+    return false;
+
+  // TODO: This process is not accurate.
+  // Information on elements of the set of ProcResGroup as follows is necessary.
+  // [TargetSchedule] Expose sub-units of a ProcResGroup in MCProcResourceDesc.
+  // https://reviews.llvm.org/D43023
+  // In addition, resource management using ResourceCycles is necessary.
+  bool Check = false;
+  for (const MCWriteProcResEntry &PRE :
+       make_range(STI->getWriteProcResBegin(SC), STI->getWriteProcResEnd(SC))) {
+    unsigned Idx = PRE.ProcResourceIdx;
+    if (SchedModel.getProcResource(Idx)->NumUnits > ResourceTable[Idx]) {
+      return true;
+    }
+    Check = true;
+  }
+  if (!Check) {
+    ResourceTable[0]++;
+    return true;
+  }
+  return false;
+}
+
+static void reserveResources(const TargetSubtargetInfo *STI,
+                             const MCSchedModel &SchedModel,
+                             std::vector<unsigned> &ResourceTable, SUnit *SU) {
+  unsigned SchedClass = SU->getInstr()->getDesc().getSchedClass();
+  const MCSchedClassDesc *SC = SchedModel.getSchedClassDesc(SchedClass);
+  bool Check = false;
+  for (const MCWriteProcResEntry &PRE :
+       make_range(STI->getWriteProcResBegin(SC), STI->getWriteProcResEnd(SC))) {
+    unsigned Idx = PRE.ProcResourceIdx;
+    if (SchedModel.getProcResource(Idx)->NumUnits > ResourceTable[Idx]) {
+      ResourceTable[Idx]++;
+      return;
+    }
+    Check = true;
+  }
+  if (!Check) {
+    ResourceTable[0]++;
+    return;
+  }
+  llvm_unreachable("reserveResources");
+}
+
+bool SMSchedule::insert(SUnit *SU, int StartCycle, int EndCycle, int II) {
+  const TargetSubtargetInfo *STI = &ST;
+  std::vector<unsigned> ResourceTable;
+  const MCSchedModel &SchedModel = ST.getSchedModel();
+  {
+    unsigned NumRes = SchedModel.getNumProcResourceKinds();
+    for (unsigned Idx = 0; Idx < NumRes; ++Idx) {
+      ResourceTable.push_back(0);
+    }
+  }
+  bool forward = true;
+  if (StartCycle > EndCycle)
+    forward = false;
+
+  // The terminating condition depends on the direction.
+  int termCycle = forward ? EndCycle + 1 : EndCycle - 1;
+  for (int curCycle = StartCycle; curCycle != termCycle;
+       forward ? ++curCycle : --curCycle) {
+
+    // Add the already scheduled instructions at the specified cycle.
+    clearResources(ResourceTable);
+    for (int checkCycle = FirstCycle + ((curCycle - FirstCycle) % II);
+         checkCycle <= LastCycle; checkCycle += II) {
+      std::deque<SUnit *> &cycleInstrs = ScheduledInstrs[checkCycle];
+
+      for (std::deque<SUnit *>::iterator I = cycleInstrs.begin(),
+                                         E = cycleInstrs.end();
+           I != E; ++I) {
+        if (ST.getInstrInfo()->isZeroCost((*I)->getInstr()->getOpcode()))
+          continue;
+        assert(canReserveResources(STI, SchedModel, ResourceTable, (*I)) &&
+               "These instructions have already been scheduled.");
+        reserveResources(STI, SchedModel, ResourceTable, (*I));
+      }
+    }
+    if (ST.getInstrInfo()->isZeroCost(SU->getInstr()->getOpcode()) ||
+        canReserveResources(STI, SchedModel, ResourceTable, SU)) {
+      DEBUG({
+        dbgs() << "\tinsert at cycle " << curCycle << " ";
+        SU->getInstr()->dump();
+      });
+
+      ScheduledInstrs[curCycle].push_back(SU);
+      InstrToCycle.insert(std::make_pair(SU, curCycle));
+      if (curCycle > LastCycle)
+        LastCycle = curCycle;
+      if (curCycle < FirstCycle)
+        FirstCycle = curCycle;
+      return true;
+    }
+    DEBUG({
+      dbgs() << "\tfailed to insert at cycle " << curCycle << " ";
+      SU->getInstr()->dump();
+    });
+  }
+  return false;
+}
+#endif /* USE_DFAPacketizer_P */

 // Return the cycle of the earliest scheduled instruction in the chain.
 int SMSchedule::earliestCycleInChain(const SDep &Dep) {
   SmallPtrSet<SUnit *, 8> Visited;
   SmallVector<SDep, 8> Worklist;
   Worklist.push_back(Dep);
   int EarlyCycle = INT_MAX;
   while (!Worklist.empty()) {
@@ ... 485 lines not shown ... @@

lib/Target/AArch64/AArch64InstrInfo.h

@@ ... 287 lines not shown ... @@ bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
                      SmallVectorImpl<MachineOperand> &Cond,
                      bool AllowModify = false) const override;
   unsigned removeBranch(MachineBasicBlock &MBB,
                         int *BytesRemoved = nullptr) const override;
   unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
                         MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
                         const DebugLoc &DL,
                         int *BytesAdded = nullptr) const override;

+  bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
+                   MachineInstr *&CmpInst) const override;
+
+  unsigned reduceLoopCount(MachineBasicBlock &MBB, MachineInstr *IndVar,
+                           MachineInstr &Cmp,
+                           SmallVectorImpl<MachineOperand> &Cond,
+                           SmallVectorImpl<MachineInstr *> &PrevInsts,
+                           unsigned Iter, unsigned MaxIter) const override;
+
   bool
   reverseBranchCondition(SmallVectorImpl<MachineOperand> &Cond) const override;
   bool canInsertSelect(const MachineBasicBlock &, ArrayRef<MachineOperand> Cond,
                        unsigned, unsigned, int &, int &, int &) const override;
   void insertSelect(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
                     const DebugLoc &DL, unsigned DstReg,
                     ArrayRef<MachineOperand> Cond, unsigned TrueReg,
                     unsigned FalseReg) const override;
@@ ... 157 lines not shown ... @@

lib/Target/AArch64/AArch64InstrInfo.cpp

@@ ... 5,057 lines not shown ... @@ MachineInstr *LDRXpost = BuildMI(MF, DebugLoc(), get(AArch64::LDRXpost))
                                .addReg(AArch64::SP, RegState::Define)
                                .addReg(AArch64::LR, RegState::Define)
                                .addReg(AArch64::SP)
                                .addImm(16);
   It = MBB.insert(It, LDRXpost);

   return It;
 }

+bool AArch64InstrInfo::analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
+                                   MachineInstr *&CmpInst) const {
+  MachineBasicBlock *LoopEnd = L.getBottomBlock();
+  MachineBasicBlock::iterator I = LoopEnd->getFirstTerminator();
+  MachineBasicBlock::iterator E = LoopEnd->getFirstNonDebugInstr();
+  MachineInstr *BccMI = nullptr;
+  MachineInstr *CompMI = nullptr;
+  MachineInstr *CopyMI = nullptr;
+  MachineInstr *AddMI = nullptr;
+  for (; I != E; --I) {
+    if (!BccMI && I->getOpcode() == AArch64::Bcc) {
+      BccMI = &*I;
+      AArch64CC::CondCode CC =
+          (AArch64CC::CondCode)BccMI->getOperand(0).getImm();
+      if (CC != AArch64CC::LT)
+        return true;
+    } else if (BccMI && !CompMI && I->getOpcode() == AArch64::SUBSXrr) {
+      CompMI = &*I;
+    } else if (CompMI && !CopyMI && I->getOpcode() == AArch64::COPY) {
+      if (CompMI->getOperand(1).getReg() == I->getOperand(1).getReg()) {
+        CopyMI = &*I;
+      }
+    } else if (CopyMI && !AddMI && I->getOpcode() == AArch64::ADDXri) {
+      if (CompMI->getOperand(1).getReg() == I->getOperand(0).getReg()) {
+        AddMI = &*I;
+      }
+    } else if (AddMI && I->isPHI()) {
+      if (I->getOperand(0).getReg() == AddMI->getOperand(1).getReg() &&
+          I->getOperand(3).getReg() == CopyMI->getOperand(0).getReg()) {
+        IndVarInst = AddMI;
+        CmpInst = CompMI;
+        return false;
+      }
+    }
+  }
+  return true;
+}
+
+unsigned
+AArch64InstrInfo::reduceLoopCount(MachineBasicBlock &MBB, MachineInstr *IndVar,
+                                  MachineInstr &Cmp,
+                                  SmallVectorImpl<MachineOperand> &Cond,
+                                  SmallVectorImpl<MachineInstr *> &PrevInsts,
+                                  unsigned Iter, unsigned MaxIter) const {
+  MachineInstr *CompMI = nullptr;
+  for (auto I = MBB.instr_rbegin(), E = MBB.instr_rend(); I != E; ++I) {
+    if (I->getOpcode() == AArch64::SUBSXrr) {
+      CompMI = &*I;
+    }
+  }
+  unsigned LoopCount = CompMI->getOperand(1).getReg();
+  Cond.push_back(MachineOperand::CreateImm(AArch64CC::LT));
+  return LoopCount;
+}

lib/Target/AArch64/AArch64TargetMachine.cpp

@@ ... 482 lines not shown ... @@ void AArch64PassConfig::addPreRegAlloc() {

   // Use AdvSIMD scalar instructions whenever profitable.
   if (TM->getOptLevel() != CodeGenOpt::None && EnableAdvSIMDScalar) {
     addPass(createAArch64AdvSIMDScalar());
     // The AdvSIMD pass may produce copies that can be rewritten to
     // be register coaleascer friendly.
     addPass(&PeepholeOptimizerID);
   }
+  if (TM->getOptLevel() >= CodeGenOpt::Default)
+    addPass(&MachinePipelinerID);
 }

 void AArch64PassConfig::addPostRegAlloc() {
   // Remove redundant copy instructions.
   if (TM->getOptLevel() != CodeGenOpt::None && EnableRedundantCopyElimination)
     addPass(createAArch64RedundantCopyEliminationPass());

   if (TM->getOptLevel() != CodeGenOpt::None && usingDefaultRegAlloc())
@@ ... 28 lines not shown ... @@

lib/Target/X86/X86InstrInfo.h

@@ ... 360 lines not shown ... @@ bool analyzeBranchPredicate(MachineBasicBlock &MBB,
                               bool AllowModify = false) const override;

   unsigned removeBranch(MachineBasicBlock &MBB,
                         int *BytesRemoved = nullptr) const override;
   unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
                         MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
                         const DebugLoc &DL,
                         int *BytesAdded = nullptr) const override;

+  bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
+                   MachineInstr *&CmpInst) const override;
+
+  unsigned reduceLoopCount(MachineBasicBlock &MBB, MachineInstr *IndVar,
+                           MachineInstr &Cmp,
+                           SmallVectorImpl<MachineOperand> &Cond,
+                           SmallVectorImpl<MachineInstr *> &PrevInsts,
+                           unsigned Iter, unsigned MaxIter) const override;
+
   bool canInsertSelect(const MachineBasicBlock &, ArrayRef<MachineOperand> Cond,
                        unsigned, unsigned, int &, int &, int &) const override;
   void insertSelect(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
                     const DebugLoc &DL, unsigned DstReg,
                     ArrayRef<MachineOperand> Cond, unsigned TrueReg,
                     unsigned FalseReg) const override;
   void copyPhysReg(MachineBasicBlock &MBB, MachineBasicBlock::iterator MI,
                    const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,
@@ ... 271 lines not shown ... @@

lib/Target/X86/X86InstrInfo.cpp

@@ ... 10,866 lines not shown ... @@ if (MInfo.CallConstructionID == MachineOutlinerTailCall) {
     // No, insert a call.
     It = MBB.insert(It,
                     BuildMI(MF, DebugLoc(), get(X86::CALL64pcrel32))
                         .addGlobalAddress(M.getNamedValue(MF.getName())));
   }

   return It;
 }

+bool X86InstrInfo::analyzeLoop(MachineLoop &L,
+                               MachineInstr *&IndVarInst,
+                               MachineInstr *&CmpInst) const {
+  MachineBasicBlock *LoopEnd = L.getBottomBlock();
+  MachineBasicBlock::iterator I = LoopEnd->getFirstTerminator();
+  MachineBasicBlock::iterator E = LoopEnd->getFirstNonDebugInstr();
+  MachineInstr *JumpMI = nullptr;
+  MachineInstr *CompMI = nullptr;
+  MachineInstr *AddMI = nullptr;
+  for (; I != E; --I) {
+    if (!JumpMI && I->getOpcode() == X86::JL_1) {
+      JumpMI = &*I;
+    } else if (JumpMI && !CompMI && I->getOpcode() == X86::CMP64rr) {
+      CompMI = &*I;
+    } else if (CompMI && !AddMI &&
+               (I->getOpcode() == X86::INC64r || I->getOpcode() == X86::ADD64ri8)
+               && CompMI->getOperand(0).getReg() == I->getOperand(0).getReg()) {
+      AddMI = &*I;
+    } else if (AddMI && I->isPHI()) {
+      if (I->getOperand(0).getReg() == AddMI->getOperand(1).getReg()
+          && I->getOperand(3).getReg() == AddMI->getOperand(0).getReg()) {
+        IndVarInst = AddMI;
+        CmpInst = CompMI;
+        return false;
+      }
+    }
+  }
+  return true;
+}
+
+unsigned X86InstrInfo::reduceLoopCount(MachineBasicBlock &MBB,
+                                       MachineInstr *IndVar, MachineInstr &Cmp,
+                                       SmallVectorImpl<MachineOperand> &Cond,
+                                       SmallVectorImpl<MachineInstr *> &PrevInsts,
+                                       unsigned Iter, unsigned MaxIter) const {
+  MachineInstr *CompMI = nullptr;
+  for (auto I = MBB.instr_rbegin(), E = MBB.instr_rend(); I != E; ++I) {
+    if (I->getOpcode() == X86::CMP64rr) {
+      CompMI = &*I;
+    }
+  }
+  unsigned LoopCount = CompMI->getOperand(0).getReg();
+  Cond.push_back(MachineOperand::CreateImm(X86::COND_L));
+  return LoopCount;
+}

lib/Target/X86/X86TargetMachine.cpp

@@ ... 411 lines not shown ... @@
 void X86PassConfig::addPreRegAlloc() {
   if (getOptLevel() != CodeGenOpt::None) {
     addPass(&LiveRangeShrinkID);
     addPass(createX86FixupSetCC());
     addPass(createX86OptimizeLEAs());
     addPass(createX86CallFrameOptimization());
   }

+  if (TM->getOptLevel() >= CodeGenOpt::Default)
+    addPass(&MachinePipelinerID);
+
   addPass(createX86FlagsCopyLoweringPass());
   addPass(createX86WinAllocaExpander());
 }
 void X86PassConfig::addMachineSSAOptimization() {
   addPass(createX86DomainReassignmentPass());
   TargetPassConfig::addMachineSSAOptimization();
 }

@@ ... 24 lines not shown ... @@


Sample code for porting MachinePipeliner to AArch64+SVE
Needs Revision
Public
