This is an archive of the discontinued LLVM Phabricator instance.

Differential D67167

[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
ClosedPublic

Authored by jmolloy on Sep 4 2019, 6:27 AM.

Details

Reviewers

bcahoon
jsji
ThomasRaoux
jmolloy

Summary

The way MachinePipeliner uses these target hooks is stateful - we reduce trip
count by one per call to reduceLoopCount. It's a little overfit for hardware
loops, where we don't have to worry about stitching a loop induction variable
across prologs and epilogs (the induction variable is implicit).

This patch introduces a new API:

/// Analyze loop L, which must be a single-basic-block loop, and if the
/// conditions can be understood enough produce a PipelinerLoopInfo object.
virtual std::unique_ptr<PipelinerLoopInfo>
analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;

The return value is expected to be an implementation of the abstract class:

/// Object returned by analyzeLoopForPipelining. Allows software pipelining
/// implementations to query attributes of the loop being pipelined.
class PipelinerLoopInfo {
public:
  virtual ~PipelinerLoopInfo();
  /// Return true if the given instruction should not be pipelined and should
  /// be ignored. An example could be a loop comparison, or induction variable
  /// update with no users being pipelined.
  virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

  /// Create a condition to determine if the trip count of the loop is greater
  /// than TC.
  ///
  /// If the trip count is statically known to be greater than TC, return
  /// true. If the trip count is statically known to be not greater than TC,
  /// return false. Otherwise return nullopt and fill out Cond with the test
  /// condition.
  virtual Optional<bool>
  getTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
                               SmallVectorImpl<MachineOperand> &Cond) = 0;

  /// Modify the loop such that the trip count is
  /// OriginalTC + TripCountAdjust.
  /// Additionally the loop's preheader is now NewPreheader.
  virtual void adjust(int TripCountAdjust,
                      MachineBasicBlock *NewPreheader) = 0;

  /// Called when the loop is being removed. Any instructions in the preheader
  /// should be removed.
  virtual void disposed() = 0;
};

The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while
allowing the target to hold its own state across all calls. This API, in
particular the separation of creating a trip count check condition from
adjusting the loop, improves the code quality in ModuloSchedule.cpp.
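
To illustrate that separation, here is a minimal stand-alone sketch. It is not LLVM code: `ToyLoopInfo`, `ToyHardwareLoop`, `PrologExit` and `planPrologExits` are hypothetical names, the `MachineBasicBlock` parameters are dropped, and a branch condition is modelled as a list of strings. It only mirrors the shape of the interface in the summary: the driver asks for one "trip count > TC" answer per prolog without mutating the loop, then adjusts the trip count once at the end.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for the interface in the summary. NOT the LLVM class: block
// parameters are omitted and a condition is just a list of strings.
class ToyLoopInfo {
public:
  virtual ~ToyLoopInfo() = default;
  virtual std::optional<bool>
  getTripCountGreaterCondition(int TC, std::vector<std::string> &Cond) = 0;
  virtual void adjust(int TripCountAdjust) = 0;
  virtual void disposed() = 0;
};

// Hypothetical target state: the trip count is either a known constant or
// only available at run time.
class ToyHardwareLoop : public ToyLoopInfo {
public:
  std::optional<int> ConstantTC; // empty means "only known at run time"
  int Adjustment = 0;            // total adjustment requested so far
  bool Removed = false;

  explicit ToyHardwareLoop(std::optional<int> TC) : ConstantTC(TC) {}

  std::optional<bool>
  getTripCountGreaterCondition(int TC,
                               std::vector<std::string> &Cond) override {
    if (ConstantTC)
      return *ConstantTC > TC; // statically known either way
    Cond.push_back("tripcount > " + std::to_string(TC));
    return std::nullopt; // the caller must emit a run-time test
  }
  void adjust(int TripCountAdjust) override { Adjustment += TripCountAdjust; }
  void disposed() override { Removed = true; }
};

enum class PrologExit { RuntimeTest, AlwaysTaken, NeverTaken };

// Walk the prologs from the one nearest the kernel outwards and decide what
// kind of early exit each one needs. The loop is adjusted exactly once, and
// only if the kernel survived.
std::vector<PrologExit> planPrologExits(ToyLoopInfo &LI, int NumPrologs) {
  std::vector<PrologExit> Plan;
  bool KernelAlive = true;
  for (int J = NumPrologs - 1; J >= 0; --J) {
    std::vector<std::string> Cond;
    std::optional<bool> Greater = LI.getTripCountGreaterCondition(J + 1, Cond);
    if (!Greater) {
      Plan.push_back(PrologExit::RuntimeTest);
    } else if (!*Greater) {
      Plan.push_back(PrologExit::AlwaysTaken);
      if (J == NumPrologs - 1) { // the block being bypassed is the kernel
        LI.disposed();
        KernelAlive = false;
      }
    } else {
      Plan.push_back(PrologExit::NeverTaken);
    }
  }
  if (KernelAlive)
    LI.adjust(-NumPrologs); // the prologs already ran NumPrologs iterations
  return Plan;
}
```

With a constant trip count of 10 and two prologs, neither early exit is needed and the loop is adjusted by -2. With a constant trip count of 2, the prolog nearest the kernel always exits and the loop is disposed instead of adjusted. With a run-time trip count, every prolog gets a run-time test.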

Diff Detail

Repository: rL LLVM

Event Timeline

jmolloy created this revision. Sep 4 2019, 6:27 AM

Herald added a project: Restricted Project. Sep 4 2019, 6:27 AM

Herald added subscribers: llvm-commits, MaskRay, kbarton and 2 others.

The general idea is great; some comments from when I looked at the PowerPC part. Thanks.

llvm/include/llvm/CodeGen/TargetInstrInfo.h
666	Not just query, but also modify the loop?
673	What is the use of this API? I did not see it used in-tree.
683	I might be wrong, but it reads like a `get` accessor, so it looks read-only, yet we actually `create` things with this API.
689	I don't know why we would want to do two things in one API. Can we split it?
694	It might be a little confusing to call `LoopInfo->adjust` after calling `LoopInfo->disposed`. So maybe we should be more specific about what we dispose here in the API name?
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3999	I don't quite follow when a target should splice into `NewPreheader`, and why Hexagon needs it but PowerPC doesn't.
4033	I think we skipped this loop intentionally when enabling the PowerPC target; why do we need it back?

Herald added a subscriber: wuzish. Sep 9 2019, 2:51 PM

Thanks for the great feedback Jinsong! I've addressed your comments.

llvm/include/llvm/CodeGen/TargetInstrInfo.h
673	There are only two targets in-tree that use the pipeliner (Hexagon and PPC9). Both of these use dedicated hardware loop instructions, so there is no induction variable update. However, in the more traditional loop code sequence:
    %newindvar = ADDri %indvar, 1
    %done = CMPri %newindvar, TripCount
    BR %done
we have two non-terminator instructions that must not be pipelined, because there is no guarantee that they will end up in stage 0 (which is the only valid stage for them). Tanya Lattner's thesis (and the original SMS paper) states that they strip canonical indvar updates while pipelining and add them in later when constructing the pipelined loop. This function allows the target to specify these instructions to strip.
683	Thanks! I agree.
689	Good idea!
694	I agree. This is a hook point to allow PPC and Hexagon to remove the LOOP setup instruction in the preheader. I don't particularly want to bake that implementation detail into this target-independent header though. I've reworded this to remove references to preheaders and just mention that the loop will be removed and once this function is called no other functions can be called. Does this look alright?
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
3999	Hexagon wants the LOOP instruction in the immediate predecessor of the pipelined loop kernel:
    prolog1:
    prolog2:
      LOOP
    kernel:
      ENDLOOP
Whereas PPC wants it to stay in the preheader to the prolog, so it can use BDZ within the prolog to adapt the loop count:
      LOOP
    prolog1:
      BDZ
    prolog2:
      BDZ
    kernel:
      BDNZ
4033	We needed it because I was calling analyzeLoop after doing some modifications to the loop. I've changed so that ModuloScheduleExpander calls analyzeLoop early, so the preheader is in the right place and we don't need this change any more. Thanks for noticing this!
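
As an illustration of the reply on line 673, the following stand-alone sketch shows one way a target without hardware loops might decide which instructions to report from shouldIgnoreForPipelining. It is a toy model under stated assumptions, not LLVM code: `ToyInstr`, `findDef`, `countUsers` and `ignoredForPipelining` are hypothetical, registers are plain strings, and the loop body is assumed to end in a branch whose first operand is the compare result.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy instruction: NOT llvm::MachineInstr. Registers are plain strings.
struct ToyInstr {
  std::string Opcode;
  std::string Def;               // register written, empty if none
  std::vector<std::string> Uses; // registers read
};

// Index of the instruction in Body that defines Reg, or Body.size() if none.
static std::size_t findDef(const std::vector<ToyInstr> &Body,
                           const std::string &Reg) {
  for (std::size_t I = 0; I < Body.size(); ++I)
    if (Body[I].Def == Reg)
      return I;
  return Body.size();
}

// Number of instructions in Body that read Reg.
static int countUsers(const std::vector<ToyInstr> &Body,
                      const std::string &Reg) {
  int N = 0;
  for (const ToyInstr &MI : Body)
    for (const std::string &U : MI.Uses)
      if (U == Reg) {
        ++N;
        break;
      }
  return N;
}

// Flags[I] is true when Body[I] should be left out of the schedule: the
// compare feeding the final branch, plus any instruction whose only user is
// that compare (the induction variable update in the simple case).
std::vector<bool> ignoredForPipelining(const std::vector<ToyInstr> &Body) {
  std::vector<bool> Flags(Body.size(), false);
  if (Body.empty() || Body.back().Uses.empty())
    return Flags;
  std::size_t Cmp = findDef(Body, Body.back().Uses.front());
  if (Cmp == Body.size())
    return Flags;
  Flags[Cmp] = true;
  for (const std::string &Reg : Body[Cmp].Uses) {
    std::size_t Def = findDef(Body, Reg);
    if (Def != Body.size() && countUsers(Body, Reg) == 1)
      Flags[Def] = true;
  }
  return Flags;
}
```

For the ADDri/CMPri/BR sequence quoted above, both the add and the compare are flagged. If some other instruction in the body also reads %newindvar, only the compare is flagged, which matches the "induction variable update with no users being pipelined" wording in the header comment.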

In D67167#1667618, @jmolloy wrote:

Thanks for the great feedback Jinsong! I've addressed your comments.

Thanks @jmolloy . The API and PPC part LGTM.

Thanks Jinsong! I've committed this as rL372376.

This revision is now accepted and ready to land. Sep 20 2019, 1:56 AM

jmolloy closed this revision. Sep 20 2019, 1:56 AM

Revision Contents

Path                                              Size
llvm/include/llvm/CodeGen/ModuloSchedule.h        2 lines
llvm/include/llvm/CodeGen/TargetInstrInfo.h       44 lines
llvm/lib/CodeGen/MachinePipeliner.cpp             2 lines
llvm/lib/CodeGen/ModuloSchedule.cpp               40 lines
llvm/lib/CodeGen/TargetInstrInfo.cpp              2 lines
llvm/lib/Target/Hexagon/HexagonInstrInfo.h        19 lines
llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp      152 lines
llvm/lib/Target/PowerPC/PPCInstrInfo.h            28 lines
llvm/lib/Target/PowerPC/PPCInstrInfo.cpp          139 lines
llvm/test/CodeGen/Hexagon/swp-epilog-phi7.ll      4 lines

Diff 219889

llvm/include/llvm/CodeGen/ModuloSchedule.h

//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_LIB_CODEGEN_MODULOSCHEDULE_H		#ifndef LLVM_LIB_CODEGEN_MODULOSCHEDULE_H
#define LLVM_LIB_CODEGEN_MODULOSCHEDULE_H		#define LLVM_LIB_CODEGEN_MODULOSCHEDULE_H

#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineLoopInfo.h"		#include "llvm/CodeGen/MachineLoopInfo.h"
		#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"		#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include <vector>		#include <vector>

namespace llvm {		namespace llvm {
class MachineBasicBlock;		class MachineBasicBlock;
class MachineInstr;		class MachineInstr;
class LiveIntervals;		class LiveIntervals;

[...]
private:
const TargetSubtargetInfo &ST;		const TargetSubtargetInfo &ST;
MachineRegisterInfo &MRI;		MachineRegisterInfo &MRI;
const TargetInstrInfo *TII;		const TargetInstrInfo *TII;
LiveIntervals &LIS;		LiveIntervals &LIS;

MachineBasicBlock *BB;		MachineBasicBlock *BB;
MachineBasicBlock *Preheader;		MachineBasicBlock *Preheader;
MachineBasicBlock *NewKernel = nullptr;		MachineBasicBlock *NewKernel = nullptr;
		std::unique_ptr<TargetInstrInfo::PipelinerLoopInfo> LoopInfo;

/// Map for each register and the max difference between its uses and def.		/// Map for each register and the max difference between its uses and def.
/// The first element in the pair is the max difference in stages. The		/// The first element in the pair is the max difference in stages. The
/// second is true if the register defines a Phi value and loop value is		/// second is true if the register defines a Phi value and loop value is
/// scheduled before the Phi.		/// scheduled before the Phi.
std::map<unsigned, std::pair<unsigned, bool>> RegToStageDiff;		std::map<unsigned, std::pair<unsigned, bool>> RegToStageDiff;

/// Instructions to change when emitting the final schedule.		/// Instructions to change when emitting the final schedule.

llvm/include/llvm/CodeGen/TargetInstrInfo.h

public:
unsigned insertUnconditionalBranch(MachineBasicBlock &MBB,		unsigned insertUnconditionalBranch(MachineBasicBlock &MBB,
MachineBasicBlock *DestBB,		MachineBasicBlock *DestBB,
const DebugLoc &DL,		const DebugLoc &DL,
int *BytesAdded = nullptr) const {		int *BytesAdded = nullptr) const {
return insertBranch(MBB, DestBB, nullptr, ArrayRef<MachineOperand>(), DL,		return insertBranch(MBB, DestBB, nullptr, ArrayRef<MachineOperand>(), DL,
BytesAdded);		BytesAdded);
}		}

		/// Object returned by analyzeLoopForPipelining. Allows software pipelining
		/// implementations to query attributes of the loop being pipelined and to
		[inline] jsji: Not just query, but also modify the loop?
		/// apply target-specific updates to the loop once pipelining is complete.
		class PipelinerLoopInfo {
		public:
		virtual ~PipelinerLoopInfo();
		/// Return true if the given instruction should not be pipelined and should
		/// be ignored. An example could be a loop comparison, or induction variable
		/// update with no users being pipelined.
		[inline] jsji: What is the use of this API? I did not see it used in-tree.
		[inline] jmolloy: There are only two targets in-tree that use the pipeliner (Hexagon and PPC9). Both of these use dedicated hardware loop instructions, so there is no induction variable update. However, in the more traditional loop code sequence:
		    %newindvar = ADDri %indvar, 1
		    %done = CMPri %newindvar, TripCount
		    BR %done
		we have two non-terminator instructions that must not be pipelined, because there is no guarantee that they will end up in stage 0 (which is the only valid stage for them). Tanya Lattner's thesis (and the original SMS paper) states that they strip canonical indvar updates while pipelining and add them in later when constructing the pipelined loop. This function allows the target to specify these instructions to strip.
		virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

		/// Create a condition to determine if the trip count of the loop is greater
		/// than TC.
		///
		/// If the trip count is statically known to be greater than TC, return
		/// true. If the trip count is statically known to be not greater than TC,
		/// return false. Otherwise return nullopt and fill out Cond with the test
		/// condition.
		virtual Optional<bool>
		[inline] jsji: I might be wrong, but it reads like a `get` accessor, so it looks read-only, yet we actually `create` things with this API.
		[inline] jmolloy: Thanks! I agree.
		createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
		SmallVectorImpl<MachineOperand> &Cond) = 0;

		/// Modify the loop such that the trip count is
		/// OriginalTC + TripCountAdjust.
		virtual void adjustTripCount(int TripCountAdjust) = 0;
		[inline] jsji: I don't know why we would want to do two things in one API. Can we split it?
		[inline] jmolloy: Good idea!

		/// Called when the loop's preheader has been modified to NewPreheader.
		virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;

		/// Called when the loop is being removed. Any instructions in the preheader
		[inline] jsji: It might be a little confusing to call `LoopInfo->adjust` after calling `LoopInfo->disposed`. So maybe we should be more specific about what we dispose here in the API name?
		[inline] jmolloy: I agree. This is a hook point to allow PPC and Hexagon to remove the LOOP setup instruction in the preheader. I don't particularly want to bake that implementation detail into this target-independent header though. I've reworded this to remove references to preheaders and just mention that the loop will be removed and once this function is called no other functions can be called. Does this look alright?
		/// should be removed.
		///
		/// Once this function is called, no other functions on this object are
		/// valid; the loop has been removed.
		virtual void disposed() = 0;
		};

		/// Analyze loop L, which must be a single-basic-block loop, and if the
		/// conditions can be understood enough produce a PipelinerLoopInfo object.
		virtual std::unique_ptr<PipelinerLoopInfo>
		analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const {
		return nullptr;
		}

/// Analyze the loop code, return true if it cannot be understoo. Upon		/// Analyze the loop code, return true if it cannot be understoo. Upon
/// success, this function returns false and returns information about the		/// success, this function returns false and returns information about the
/// induction variable and compare instruction used at the end.		/// induction variable and compare instruction used at the end.
virtual bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,		virtual bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
MachineInstr *&CmpInst) const {		MachineInstr *&CmpInst) const {
return true;		return true;
}		}


llvm/lib/CodeGen/MachinePipeliner.cpp

if (TII->analyzeBranch(*L.getHeader(), LI.TBB, LI.FBB, LI.BrCond)) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "Unable to analyzeBranch, can NOT pipeline current Loop\n");		dbgs() << "Unable to analyzeBranch, can NOT pipeline current Loop\n");
NumFailBranch++;		NumFailBranch++;
return false;		return false;
}		}

LI.LoopInductionVar = nullptr;		LI.LoopInductionVar = nullptr;
LI.LoopCompare = nullptr;		LI.LoopCompare = nullptr;
if (TII->analyzeLoop(L, LI.LoopInductionVar, LI.LoopCompare)) {		if (!TII->analyzeLoopForPipelining(L.getTopBlock())) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "Unable to analyzeLoop, can NOT pipeline current Loop\n");		dbgs() << "Unable to analyzeLoop, can NOT pipeline current Loop\n");
NumFailLoop++;		NumFailLoop++;
return false;		return false;
}		}

if (!L.getLoopPreheader()) {		if (!L.getLoopPreheader()) {
LLVM_DEBUG(		LLVM_DEBUG(

llvm/lib/CodeGen/ModuloSchedule.cpp

for (unsigned i = 0, e = MI->getNumOperands(); i < e; ++i) {
RegToStageDiff[Reg] = std::make_pair(MaxDiff, PhiIsSwapped);		RegToStageDiff[Reg] = std::make_pair(MaxDiff, PhiIsSwapped);
}		}
}		}

generatePipelinedLoop();		generatePipelinedLoop();
}		}

void ModuloScheduleExpander::generatePipelinedLoop() {		void ModuloScheduleExpander::generatePipelinedLoop() {
		LoopInfo = TII->analyzeLoopForPipelining(BB);
		assert(LoopInfo && "Must be able to analyze loop!");

// Create a new basic block for the kernel and add it to the CFG.		// Create a new basic block for the kernel and add it to the CFG.
MachineBasicBlock *KernelBB = MF.CreateMachineBasicBlock(BB->getBasicBlock());		MachineBasicBlock *KernelBB = MF.CreateMachineBasicBlock(BB->getBasicBlock());

unsigned MaxStageCount = Schedule.getNumStages() - 1;		unsigned MaxStageCount = Schedule.getNumStages() - 1;

// Remember the registers that are used in different stages. The index is		// Remember the registers that are used in different stages. The index is
// the iteration, or stage, that the instruction is scheduled in. This is		// the iteration, or stage, that the instruction is scheduled in. This is
// a map between register names in the original block and the names created		// a map between register names in the original block and the names created
[...]
/// block. These edges are needed if the loop ends before reaching the		/// block. These edges are needed if the loop ends before reaching the
/// kernel.		/// kernel.
void ModuloScheduleExpander::addBranches(MachineBasicBlock &PreheaderBB,		void ModuloScheduleExpander::addBranches(MachineBasicBlock &PreheaderBB,
MBBVectorTy &PrologBBs,		MBBVectorTy &PrologBBs,
MachineBasicBlock *KernelBB,		MachineBasicBlock *KernelBB,
MBBVectorTy &EpilogBBs,		MBBVectorTy &EpilogBBs,
ValueMapTy *VRMap) {		ValueMapTy *VRMap) {
assert(PrologBBs.size() == EpilogBBs.size() && "Prolog/Epilog mismatch");		assert(PrologBBs.size() == EpilogBBs.size() && "Prolog/Epilog mismatch");
MachineInstr *IndVar;
MachineInstr *Cmp;
if (TII->analyzeLoop(*Schedule.getLoop(), IndVar, Cmp))
llvm_unreachable("Must be able to analyze loop!");
MachineBasicBlock *LastPro = KernelBB;		MachineBasicBlock *LastPro = KernelBB;
MachineBasicBlock *LastEpi = KernelBB;		MachineBasicBlock *LastEpi = KernelBB;

// Start from the blocks connected to the kernel and work "out"		// Start from the blocks connected to the kernel and work "out"
// to the first prolog and the last epilog blocks.		// to the first prolog and the last epilog blocks.
SmallVector<MachineInstr *, 4> PrevInsts;		SmallVector<MachineInstr *, 4> PrevInsts;
unsigned MaxIter = PrologBBs.size() - 1;		unsigned MaxIter = PrologBBs.size() - 1;
unsigned LC = UINT_MAX;
unsigned LCMin = UINT_MAX;
for (unsigned i = 0, j = MaxIter; i <= MaxIter; ++i, --j) {		for (unsigned i = 0, j = MaxIter; i <= MaxIter; ++i, --j) {
// Add branches to the prolog that go to the corresponding		// Add branches to the prolog that go to the corresponding
// epilog, and the fall-thru prolog/kernel block.		// epilog, and the fall-thru prolog/kernel block.
MachineBasicBlock *Prolog = PrologBBs[j];		MachineBasicBlock *Prolog = PrologBBs[j];
MachineBasicBlock *Epilog = EpilogBBs[i];		MachineBasicBlock *Epilog = EpilogBBs[i];
// We've executed one iteration, so decrement the loop count and check for
// the loop end.
SmallVector<MachineOperand, 4> Cond;
// Check if the LOOP0 has already been removed. If so, then there is no need
// to reduce the trip count.
if (LC != 0)
LC = TII->reduceLoopCount(Prolog, PreheaderBB, IndVar, Cmp, Cond,
PrevInsts, j, MaxIter);

// Record the value of the first trip count, which is used to determine if
// branches and blocks can be removed for constant trip counts.
if (LCMin == UINT_MAX)
LCMin = LC;

		SmallVector<MachineOperand, 4> Cond;
		Optional<bool> StaticallyGreater =
		LoopInfo->createTripCountGreaterCondition(j + 1, *Prolog, Cond);
unsigned numAdded = 0;		unsigned numAdded = 0;
if (Register::isVirtualRegister(LC)) {		if (!StaticallyGreater.hasValue()) {
Prolog->addSuccessor(Epilog);		Prolog->addSuccessor(Epilog);
numAdded = TII->insertBranch(*Prolog, Epilog, LastPro, Cond, DebugLoc());		numAdded = TII->insertBranch(*Prolog, Epilog, LastPro, Cond, DebugLoc());
} else if (j >= LCMin) {		} else if (*StaticallyGreater == false) {
Prolog->addSuccessor(Epilog);		Prolog->addSuccessor(Epilog);
Prolog->removeSuccessor(LastPro);		Prolog->removeSuccessor(LastPro);
LastEpi->removeSuccessor(Epilog);		LastEpi->removeSuccessor(Epilog);
numAdded = TII->insertBranch(*Prolog, Epilog, nullptr, Cond, DebugLoc());		numAdded = TII->insertBranch(*Prolog, Epilog, nullptr, Cond, DebugLoc());
removePhis(Epilog, LastEpi);		removePhis(Epilog, LastEpi);
// Remove the blocks that are no longer referenced.		// Remove the blocks that are no longer referenced.
if (LastPro != LastEpi) {		if (LastPro != LastEpi) {
LastEpi->clear();		LastEpi->clear();
LastEpi->eraseFromParent();		LastEpi->eraseFromParent();
}		}
		if (LastPro == KernelBB) {
		LoopInfo->disposed();
		NewKernel = nullptr;
		}
LastPro->clear();		LastPro->clear();
LastPro->eraseFromParent();		LastPro->eraseFromParent();
if (LastPro == KernelBB)
NewKernel = nullptr;
} else {		} else {
numAdded = TII->insertBranch(*Prolog, LastPro, nullptr, Cond, DebugLoc());		numAdded = TII->insertBranch(*Prolog, LastPro, nullptr, Cond, DebugLoc());
removePhis(Epilog, Prolog);		removePhis(Epilog, Prolog);
}		}
LastPro = Prolog;		LastPro = Prolog;
LastEpi = Epilog;		LastEpi = Epilog;
for (MachineBasicBlock::reverse_instr_iterator I = Prolog->instr_rbegin(),		for (MachineBasicBlock::reverse_instr_iterator I = Prolog->instr_rbegin(),
E = Prolog->instr_rend();		E = Prolog->instr_rend();
I != E && numAdded > 0; ++I, --numAdded)		I != E && numAdded > 0; ++I, --numAdded)
updateInstruction(&*I, false, j, 0, VRMap);		updateInstruction(&*I, false, j, 0, VRMap);
}		}

		if (NewKernel) {
		LoopInfo->setPreheader(PrologBBs[MaxIter]);
		LoopInfo->adjustTripCount(-(MaxIter + 1));
		}
}		}

/// Return true if we can compute the amount the instruction changes		/// Return true if we can compute the amount the instruction changes
/// during each iteration. Set Delta to the amount of the change.		/// during each iteration. Set Delta to the amount of the change.
bool ModuloScheduleExpander::computeDelta(MachineInstr &MI, unsigned &Delta) {		bool ModuloScheduleExpander::computeDelta(MachineInstr &MI, unsigned &Delta) {
const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();		const TargetRegisterInfo *TRI = MF.getSubtarget().getRegisterInfo();
const MachineOperand *BaseOp;		const MachineOperand *BaseOp;
int64_t Offset;		int64_t Offset;

llvm/lib/CodeGen/TargetInstrInfo.cpp

	BaseReg.Reg = MOBaseReg.getReg();	BaseReg.Reg = MOBaseReg.getReg();
	BaseReg.SubReg = MOBaseReg.getSubReg();	BaseReg.SubReg = MOBaseReg.getSubReg();

	InsertedReg.Reg = MOInsertedReg.getReg();	InsertedReg.Reg = MOInsertedReg.getReg();
	InsertedReg.SubReg = MOInsertedReg.getSubReg();	InsertedReg.SubReg = MOInsertedReg.getSubReg();
	InsertedReg.SubIdx = (unsigned)MOSubIdx.getImm();	InsertedReg.SubIdx = (unsigned)MOSubIdx.getImm();
	return true;	return true;
	}	}

		TargetInstrInfo::PipelinerLoopInfo::~PipelinerLoopInfo() {}

llvm/lib/Target/Hexagon/HexagonInstrInfo.h

public:
/// cases where AnalyzeBranch doesn't apply because there was no original		/// cases where AnalyzeBranch doesn't apply because there was no original
/// branch to analyze. At least this much must be implemented, else tail		/// branch to analyze. At least this much must be implemented, else tail
/// merging needs to be disabled.		/// merging needs to be disabled.
unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,		unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,		MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
const DebugLoc &DL,		const DebugLoc &DL,
int *BytesAdded = nullptr) const override;		int *BytesAdded = nullptr) const override;

/// Analyze the loop code, return true if it cannot be understood. Upon		/// Analyze loop L, which must be a single-basic-block loop, and if the
/// success, this function returns false and returns information about the		/// conditions can be understood enough produce a PipelinerLoopInfo object.
/// induction variable and compare instruction used at the end.		std::unique_ptr<PipelinerLoopInfo>
bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,		analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const override;
MachineInstr *&CmpInst) const override;

/// Generate code to reduce the loop iteration by one and check if the loop
/// is finished. Return the value/register of the new loop count. We need
/// this function when peeling off one or more iterations of a loop. This
/// function assumes the nth iteration is peeled first.
unsigned reduceLoopCount(MachineBasicBlock &MBB, MachineBasicBlock &PreHeader,
MachineInstr *IndVar, MachineInstr &Cmp,
SmallVectorImpl<MachineOperand> &Cond,
SmallVectorImpl<MachineInstr *> &PrevInsts,
unsigned Iter, unsigned MaxIter) const override;

/// Return true if it's profitable to predicate		/// Return true if it's profitable to predicate
/// instructions with accumulated instruction latency of "NumCycles"		/// instructions with accumulated instruction latency of "NumCycles"
/// of the specified basic block, where the probability of the instructions		/// of the specified basic block, where the probability of the instructions
/// being executed is given by Probability, and Confidence is a measure		/// being executed is given by Probability, and Confidence is a measure
/// of our confidence that it will be properly predicted.		/// of our confidence that it will be properly predicted.
bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,		bool isProfitableToIfCvt(MachineBasicBlock &MBB, unsigned NumCycles,
unsigned ExtraPredCycles,		unsigned ExtraPredCycles,

llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp

if (isEndLoopN(Cond[0].getImm())) {
unsigned Flags = getUndefRegState(RO.isUndef());		unsigned Flags = getUndefRegState(RO.isUndef());
BuildMI(&MBB, DL, get(BccOpc)).addReg(RO.getReg(), Flags).addMBB(TBB);		BuildMI(&MBB, DL, get(BccOpc)).addReg(RO.getReg(), Flags).addMBB(TBB);
}		}
BuildMI(&MBB, DL, get(BOpc)).addMBB(FBB);		BuildMI(&MBB, DL, get(BOpc)).addMBB(FBB);

return 2;		return 2;
}		}

/// Analyze the loop code to find the loop induction variable and compare used		class HexagonPipelinerLoopInfo : public TargetInstrInfo::PipelinerLoopInfo {
/// to compute the number of iterations. Currently, we analyze loop that are		MachineInstr *Loop, *EndLoop;
/// controlled using hardware loops. In this case, the induction variable		MachineFunction *MF;
/// instruction is null. For all other cases, this function returns true, which		const HexagonInstrInfo *TII;
/// means we're unable to analyze it.
bool HexagonInstrInfo::analyzeLoop(MachineLoop &L,
MachineInstr *&IndVarInst,
MachineInstr *&CmpInst) const {

MachineBasicBlock *LoopEnd = L.getBottomBlock();		public:
MachineBasicBlock::iterator I = LoopEnd->getFirstTerminator();		HexagonPipelinerLoopInfo(MachineInstr *Loop, MachineInstr *EndLoop)
// We really "analyze" only hardware loops right now.		: Loop(Loop), EndLoop(EndLoop), MF(Loop->getParent()->getParent()),
if (I != LoopEnd->end() && isEndLoopN(I->getOpcode())) {		TII(MF->getSubtarget<HexagonSubtarget>().getInstrInfo()) {}
IndVarInst = nullptr;
CmpInst = &*I;		bool shouldIgnoreForPipelining(const MachineInstr *MI) const override {
return false;		// Only ignore the terminator.
		return MI == EndLoop;
}		}
return true;
		Optional<bool>
		createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
		SmallVectorImpl<MachineOperand> &Cond) override {
		if (Loop->getOpcode() == Hexagon::J2_loop0r) {
		Register LoopCount = Loop->getOperand(1).getReg();
		// Check if we're done with the loop.
		unsigned Done = TII->createVR(MF, MVT::i1);
		MachineInstr *NewCmp = BuildMI(&MBB, Loop->getDebugLoc(),
		TII->get(Hexagon::C2_cmpgtui), Done)
		.addReg(LoopCount)
		.addImm(TC);
		Cond.push_back(MachineOperand::CreateImm(Hexagon::J2_jumpf));
		Cond.push_back(NewCmp->getOperand(0));
		return {};
}		}

/// Generate code to reduce the loop iteration by one and check if the loop is		int64_t TripCount = Loop->getOperand(1).getImm();
/// finished. Return the value/register of the new loop count. this function		return TripCount > TC;
/// assumes the nth iteration is peeled first.		}
unsigned HexagonInstrInfo::reduceLoopCount(
MachineBasicBlock &MBB, MachineBasicBlock &PreHeader, MachineInstr *IndVar,		void setPreheader(MachineBasicBlock *NewPreheader) override {
MachineInstr &Cmp, SmallVectorImpl<MachineOperand> &Cond,		NewPreheader->splice(NewPreheader->getFirstTerminator(), Loop->getParent(),
SmallVectorImpl<MachineInstr *> &PrevInsts, unsigned Iter,		Loop);
unsigned MaxIter) const {		}
// We expect a hardware loop currently. This means that IndVar is set
// to null, and the compare is the ENDLOOP instruction.		void adjustTripCount(int TripCountAdjust) override {
assert((!IndVar) && isEndLoopN(Cmp.getOpcode())
&& "Expecting a hardware loop");
MachineFunction *MF = MBB.getParent();
DebugLoc DL = Cmp.getDebugLoc();
SmallPtrSet<MachineBasicBlock *, 8> VisitedBBs;
MachineInstr *Loop = findLoopInstr(&MBB, Cmp.getOpcode(),
Cmp.getOperand(0).getMBB(), VisitedBBs);
if (!Loop)
return 0;
// If the loop trip count is a compile-time value, then just change the		// If the loop trip count is a compile-time value, then just change the
// value.		// value.
if (Loop->getOpcode() == Hexagon::J2_loop0i ||		if (Loop->getOpcode() == Hexagon::J2_loop0i ||
Loop->getOpcode() == Hexagon::J2_loop1i) {		Loop->getOpcode() == Hexagon::J2_loop1i) {
int64_t Offset = Loop->getOperand(1).getImm();		int64_t TripCount = Loop->getOperand(1).getImm() + TripCountAdjust;
if (Offset <= 1)		assert(TripCount > 0 && "Can't create an empty or negative loop!");
Loop->eraseFromParent();		Loop->getOperand(1).setImm(TripCount);
else		return;
Loop->getOperand(1).setImm(Offset - 1);
return Offset - 1;
}		}

// The loop trip count is a run-time value. We generate code to subtract		// The loop trip count is a run-time value. We generate code to subtract
// one from the trip count, and update the loop instruction.		// one from the trip count, and update the loop instruction.
assert(Loop->getOpcode() == Hexagon::J2_loop0r && "Unexpected instruction");
Register LoopCount = Loop->getOperand(1).getReg();		Register LoopCount = Loop->getOperand(1).getReg();
// Check if we're done with the loop.		Register NewLoopCount = TII->createVR(MF, MVT::i32);
unsigned LoopEnd = createVR(MF, MVT::i1);		BuildMI(*Loop->getParent(), Loop, Loop->getDebugLoc(),
MachineInstr *NewCmp = BuildMI(&MBB, DL, get(Hexagon::C2_cmpgtui), LoopEnd).		TII->get(Hexagon::A2_addi), NewLoopCount)
addReg(LoopCount).addImm(1);		.addReg(LoopCount)
unsigned NewLoopCount = createVR(MF, MVT::i32);		.addImm(TripCountAdjust);
MachineInstr *NewAdd = BuildMI(&MBB, DL, get(Hexagon::A2_addi), NewLoopCount).		Loop->getOperand(1).setReg(NewLoopCount);
addReg(LoopCount).addImm(-1);		}
const HexagonRegisterInfo &HRI = *Subtarget.getRegisterInfo();
// Update the previously generated instructions with the new loop counter.		void disposed() override { Loop->eraseFromParent(); }
for (SmallVectorImpl<MachineInstr *>::iterator I = PrevInsts.begin(),		};
E = PrevInsts.end(); I != E; ++I)
(*I)->substituteRegister(LoopCount, NewLoopCount, 0, HRI);		std::unique_ptr<TargetInstrInfo::PipelinerLoopInfo>
PrevInsts.clear();		HexagonInstrInfo::analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const {
PrevInsts.push_back(NewCmp);		// We really "analyze" only hardware loops right now.
PrevInsts.push_back(NewAdd);		MachineBasicBlock::iterator I = LoopBB->getFirstTerminator();
// Insert the new loop instruction if this is the last time the loop is
// decremented.		if (I != LoopBB->end() && isEndLoopN(I->getOpcode())) {
if (Iter == MaxIter)		SmallPtrSet<MachineBasicBlock *, 8> VisitedBBs;
BuildMI(&MBB, DL, get(Hexagon::J2_loop0r)).		MachineInstr *LoopInst = findLoopInstr(
addMBB(Loop->getOperand(0).getMBB()).addReg(NewLoopCount);		LoopBB, I->getOpcode(), I->getOperand(0).getMBB(), VisitedBBs);
// Delete the old loop instruction.		if (LoopInst)
if (Iter == 0)		return std::make_unique<HexagonPipelinerLoopInfo>(LoopInst, &*I);
Loop->eraseFromParent();		}
Cond.push_back(MachineOperand::CreateImm(Hexagon::J2_jumpf));		return nullptr;
Cond.push_back(NewCmp->getOperand(0));
return NewLoopCount;
}		}

bool HexagonInstrInfo::isProfitableToIfCvt(MachineBasicBlock &MBB,		bool HexagonInstrInfo::isProfitableToIfCvt(MachineBasicBlock &MBB,
unsigned NumCycles, unsigned ExtraPredCycles,		unsigned NumCycles, unsigned ExtraPredCycles,
BranchProbability Probability) const {		BranchProbability Probability) const {
return nonDbgBBSize(&MBB) <= 3;		return nonDbgBBSize(&MBB) <= 3;
}		}
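
The new Hexagon `adjustTripCount` hook shown above takes a signed adjustment and handles two cases: an immediate trip count is rewritten in place, while a register trip count gets a fresh register defined by an add. A minimal standalone sketch of that control flow follows. It is only an illustration: `ToyLoopSetup`, `ToyAdd` and `ToyFunction` are hypothetical stand-ins for `MachineInstr`, `BuildMI` and `createVR`, and are not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a hardware-loop setup instruction. The trip
// count operand is either an immediate or a virtual register number.
struct ToyLoopSetup {
  bool IsImmediate;
  int64_t Imm;  // meaningful only when IsImmediate is true
  unsigned Reg; // meaningful only when IsImmediate is false
};

// Hypothetical record of an emitted "Dst = Src + Imm" instruction.
struct ToyAdd {
  unsigned Dst;
  unsigned Src;
  int64_t Imm;
};

// Hypothetical stand-in for the enclosing function: hands out fresh
// virtual registers and collects emitted instructions.
struct ToyFunction {
  unsigned NextVReg = 100;
  std::vector<ToyAdd> Emitted;
  unsigned createVR() { return NextVReg++; }
};

// Mirrors the shape of the hook in the diff: constant counts are updated
// in place; run-time counts get a new register computed by an add.
void adjustTripCount(ToyFunction &F, ToyLoopSetup &Loop, int TripCountAdjust) {
  if (Loop.IsImmediate) {
    int64_t TripCount = Loop.Imm + TripCountAdjust;
    assert(TripCount > 0 && "Can't create an empty or negative loop!");
    Loop.Imm = TripCount;
    return;
  }
  unsigned NewCount = F.createVR();
  F.Emitted.push_back({NewCount, Loop.Reg, TripCountAdjust});
  Loop.Reg = NewCount;
}
```

Unlike the old `reduceLoopCount`, which subtracted one per call, the sketch applies the whole adjustment in a single call, which is the stateless behaviour the summary describes.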


llvm/lib/Target/PowerPC/PPCInstrInfo.h

public:

/// Check \p Opcode is BDNZ (Decrement CTR and branch if it is still nonzero).		/// Check \p Opcode is BDNZ (Decrement CTR and branch if it is still nonzero).
bool isBDNZ(unsigned Opcode) const;		bool isBDNZ(unsigned Opcode) const;

/// Find the hardware loop instruction used to set-up the specified loop.		/// Find the hardware loop instruction used to set-up the specified loop.
/// On PPC, we have two instructions used to set-up the hardware loop		/// On PPC, we have two instructions used to set-up the hardware loop
/// (MTCTRloop, MTCTR8loop) with corresponding endloop (BDNZ, BDNZ8)		/// (MTCTRloop, MTCTR8loop) with corresponding endloop (BDNZ, BDNZ8)
/// instructions to indicate the end of a loop.		/// instructions to indicate the end of a loop.
MachineInstr *findLoopInstr(MachineBasicBlock &PreHeader) const;		MachineInstr *
		findLoopInstr(MachineBasicBlock &PreHeader,
/// Analyze the loop code to find the loop induction variable and compare used		SmallPtrSet<MachineBasicBlock *, 8> &Visited) const;
/// to compute the number of iterations. Currently, we analyze loop that are
/// controlled using hardware loops. In this case, the induction variable		/// Analyze loop L, which must be a single-basic-block loop, and if the
/// instruction is null. For all other cases, this function returns true,		/// conditions can be understood enough produce a PipelinerLoopInfo object.
/// which means we're unable to analyze it. \p IndVarInst and \p CmpInst will		std::unique_ptr<TargetInstrInfo::PipelinerLoopInfo>
/// return new values when we can analyze the readonly loop \p L, otherwise,		analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const override;
/// nothing got changed
bool analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,
MachineInstr *&CmpInst) const override;
/// Generate code to reduce the loop iteration by one and check if the loop
/// is finished. Return the value/register of the new loop count. We need
/// this function when peeling off one or more iterations of a loop. This
/// function assumes the last iteration is peeled first.
unsigned reduceLoopCount(MachineBasicBlock &MBB, MachineBasicBlock &PreHeader,
MachineInstr *IndVar, MachineInstr &Cmp,
SmallVectorImpl<MachineOperand> &Cond,
SmallVectorImpl<MachineInstr *> &PrevInsts,
unsigned Iter, unsigned MaxIter) const override;
};		};

}		}

#endif		#endif

llvm/lib/Target/PowerPC/PPCInstrInfo.cpp

	}			}
	return false;			return false;
	}			}

	bool PPCInstrInfo::isBDNZ(unsigned Opcode) const {			bool PPCInstrInfo::isBDNZ(unsigned Opcode) const {
	return (Opcode == (Subtarget.isPPC64() ? PPC::BDNZ8 : PPC::BDNZ));			return (Opcode == (Subtarget.isPPC64() ? PPC::BDNZ8 : PPC::BDNZ));
	}			}

	bool PPCInstrInfo::analyzeLoop(MachineLoop &L, MachineInstr *&IndVarInst,			class PPCPipelinerLoopInfo : public TargetInstrInfo::PipelinerLoopInfo {
	MachineInstr *&CmpInst) const {			MachineInstr *Loop, *EndLoop, *LoopCount;
	MachineBasicBlock *LoopEnd = L.getBottomBlock();			MachineFunction *MF;
	MachineBasicBlock::iterator I = LoopEnd->getFirstTerminator();			const TargetInstrInfo *TII;
	// We really "analyze" only CTR loops right now.
	if (I != LoopEnd->end() && isBDNZ(I->getOpcode())) {			public:
	IndVarInst = nullptr;			PPCPipelinerLoopInfo(MachineInstr *Loop, MachineInstr *EndLoop,
	CmpInst = &*I;			MachineInstr *LoopCount)
	return false;			: Loop(Loop), EndLoop(EndLoop), LoopCount(LoopCount),
				MF(Loop->getParent()->getParent()),
				TII(MF->getSubtarget().getInstrInfo()) {}

				bool shouldIgnoreForPipelining(const MachineInstr *MI) const override {
				// Only ignore the terminator.
				return MI == EndLoop;
				}

				Optional<bool>
				createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
				SmallVectorImpl<MachineOperand> &Cond) override {
				bool IsConstantTripCount =
				LoopCount->getOpcode() == PPC::LI8 || LoopCount->getOpcode() == PPC::LI;
				if (!IsConstantTripCount) {
				// Since BDZ/BDZ8 that we will insert will also decrease the ctr by 1,
				// so we don't need to generate any thing here.
				Cond.push_back(MachineOperand::CreateImm(0));
				Cond.push_back(MachineOperand::CreateReg(
				MF->getSubtarget<PPCSubtarget>().isPPC64() ? PPC::CTR8 : PPC::CTR,
				true));
				return {};
				}

				int64_t TripCount = LoopCount->getOperand(1).getImm();
				return TripCount > TC;
	}			}
	return true;
				void setPreheader(MachineBasicBlock *NewPreheader) override {
				// Do nothing. We want the LOOP setup instruction to stay in the old
				// preheader, so we can use BDZ in the prologs to adapt the loop trip count.
				}

				void adjustTripCount(int TripCountAdjust) override {
				// If the loop trip count is a compile-time value, then just change the
				// value.
				if (LoopCount->getOpcode() == PPC::LI8 ||
				LoopCount->getOpcode() == PPC::LI) {
				int64_t TripCount = LoopCount->getOperand(1).getImm() + TripCountAdjust;
				LoopCount->getOperand(1).setImm(TripCount);
				return;
				}

				// Since BDZ/BDZ8 that we will insert will also decrease the ctr by 1,
				// so we don't need to generate any thing here.
				}

				void disposed() override {
				jsji: I don't quite follow when we should call splice to `NewPreheader` for a target, and why Hexagon needs it, but PowerPC don't?
				jmolloy (author): Hexagon wants the LOOP instruction in the immediate predecessor of the pipelined loop kernel:
				    prolog1:
				    prolog2:
				      LOOP
				    kernel:
				      ENDLOOP
				Whereas PPC wants it to stay in the preheader to the prolog, so it can use BDZ within the prolog to adapt the loop count:
				      LOOP
				    prolog1:
				      BDZ
				    prolog2:
				      BDZ
				    kernel:
				      BDNZ
				Loop->eraseFromParent();
				// Ensure the loop setup instruction is deleted too.
				LoopCount->eraseFromParent();
				}
				};

				std::unique_ptr<TargetInstrInfo::PipelinerLoopInfo>
				PPCInstrInfo::analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const {
				// We really "analyze" only hardware loops right now.
				MachineBasicBlock::iterator I = LoopBB->getFirstTerminator();
				MachineBasicBlock *Preheader = *LoopBB->pred_begin();
				if (Preheader == LoopBB)
				Preheader = *std::next(LoopBB->pred_begin());
				MachineFunction *MF = Preheader->getParent();

				if (I != LoopBB->end() && isBDNZ(I->getOpcode())) {
				SmallPtrSet<MachineBasicBlock *, 8> Visited;
				if (MachineInstr *LoopInst = findLoopInstr(*Preheader, Visited)) {
				Register LoopCountReg = LoopInst->getOperand(0).getReg();
				MachineRegisterInfo &MRI = MF->getRegInfo();
				MachineInstr *LoopCount = MRI.getUniqueVRegDef(LoopCountReg);
				return std::make_unique<PPCPipelinerLoopInfo>(LoopInst, &*I, LoopCount);
				}
				}
				return nullptr;
	}			}

	MachineInstr *			MachineInstr *PPCInstrInfo::findLoopInstr(
	PPCInstrInfo::findLoopInstr(MachineBasicBlock &PreHeader) const {			MachineBasicBlock &PreHeader,
				SmallPtrSet<MachineBasicBlock *, 8> &Visited) const {

	unsigned LOOPi = (Subtarget.isPPC64() ? PPC::MTCTR8loop : PPC::MTCTRloop);			unsigned LOOPi = (Subtarget.isPPC64() ? PPC::MTCTR8loop : PPC::MTCTRloop);

	// The loop set-up instruction should be in preheader			// The loop set-up instruction should be in preheader
				jsji: I think we skip this loop intentionally when enabling PowerPC target , why we need it back?
				jmolloy (author): We needed it because I was calling analyzeLoop after doing some modifications to the loop. I've changed so that ModuloScheduleExpander calls analyzeLoop early, so the preheader is in the right place and we don't need this change any more. Thanks for noticing this!
	for (auto &I : PreHeader.instrs())			for (auto &I : PreHeader.instrs())
	if (I.getOpcode() == LOOPi)			if (I.getOpcode() == LOOPi)
	return &I;			return &I;
	return nullptr;			return nullptr;
	}			}

	unsigned PPCInstrInfo::reduceLoopCount(
	MachineBasicBlock &MBB, MachineBasicBlock &PreHeader, MachineInstr *IndVar,
	MachineInstr &Cmp, SmallVectorImpl<MachineOperand> &Cond,
	SmallVectorImpl<MachineInstr *> &PrevInsts, unsigned Iter,
	unsigned MaxIter) const {
	// We expect a hardware loop currently. This means that IndVar is set
	// to null, and the compare is the ENDLOOP instruction.
	assert((!IndVar) && isBDNZ(Cmp.getOpcode()) && "Expecting a CTR loop");
	MachineFunction *MF = MBB.getParent();
	DebugLoc DL = Cmp.getDebugLoc();
	MachineInstr *Loop = findLoopInstr(PreHeader);
	if (!Loop)
	return 0;
	Register LoopCountReg = Loop->getOperand(0).getReg();
	MachineRegisterInfo &MRI = MF->getRegInfo();
	MachineInstr *LoopCount = MRI.getUniqueVRegDef(LoopCountReg);

	if (!LoopCount)
	return 0;
	// If the loop trip count is a compile-time value, then just change the
	// value.
	if (LoopCount->getOpcode() == PPC::LI8 || LoopCount->getOpcode() == PPC::LI) {
	int64_t Offset = LoopCount->getOperand(1).getImm();
	if (Offset <= 1) {
	LoopCount->eraseFromParent();
	Loop->eraseFromParent();
	return 0;
	}
	LoopCount->getOperand(1).setImm(Offset - 1);
	return Offset - 1;
	}

	// The loop trip count is a run-time value.
	// We need to subtract one from the trip count,
	// and insert branch later to check if we're done with the loop.

	// Since BDZ/BDZ8 that we will insert will also decrease the ctr by 1,
	// so we don't need to generate any thing here.
	Cond.push_back(MachineOperand::CreateImm(0));
	Cond.push_back(MachineOperand::CreateReg(
	Subtarget.isPPC64() ? PPC::CTR8 : PPC::CTR, true));
	return LoopCountReg;
	}

	// Return true if get the base operand, byte offset of an instruction and the			// Return true if get the base operand, byte offset of an instruction and the
	// memory width. Width is the size of memory that is being loaded/stored.			// memory width. Width is the size of memory that is being loaded/stored.
	bool PPCInstrInfo::getMemOperandWithOffsetWidth(			bool PPCInstrInfo::getMemOperandWithOffsetWidth(
	const MachineInstr &LdSt,			const MachineInstr &LdSt,
	const MachineOperand *&BaseReg,			const MachineOperand *&BaseReg,
	int64_t &Offset,			int64_t &Offset,
	unsigned &Width,			unsigned &Width,
	const TargetRegisterInfo *TRI) const {			const TargetRegisterInfo *TRI) const {
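
The PPC `createTripCountGreaterCondition` above has three outcomes: a definite yes or no when the loop count comes from a load-immediate, and an empty result when the count is only known at run time (in which case the real hook also fills `Cond` for a BDZ-style branch). The following standalone sketch models only that tri-state return value. It uses `std::optional<bool>` in place of `llvm::Optional<bool>`, and `ToyLoopCountDef` is a hypothetical stand-in for the defining `MachineInstr`; neither the struct nor the function name exists in LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical model of the instruction that defines the loop count.
struct ToyLoopCountDef {
  bool IsLoadImmediate; // models an opcode of PPC::LI or PPC::LI8
  int64_t Imm;          // the immediate, valid when IsLoadImmediate is true
};

// Returns true or false when the comparison can be decided statically,
// and std::nullopt when a run-time check would be required instead.
std::optional<bool> tripCountGreaterThan(const ToyLoopCountDef &Def, int TC) {
  if (!Def.IsLoadImmediate)
    return std::nullopt; // the real hook populates Cond here
  return Def.Imm > TC;
}
```

A caller would branch on `has_value()` first: a present value lets it keep or drop a prolog stage statically, while an empty value means it must emit the conditional branch described by `Cond`.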

llvm/test/CodeGen/Hexagon/swp-epilog-phi7.ll

	; RUN: llc -march=hexagon -O2 -enable-pipeliner -disable-block-placement=0 < %s | FileCheck %s			; RUN: llc -march=hexagon -O2 -enable-pipeliner -disable-block-placement=0 < %s | FileCheck %s

	; For the Phis generated in the epilog, test that we generate the correct			; For the Phis generated in the epilog, test that we generate the correct
	; names for the values coming from the prolog stages. The test belows			; names for the values coming from the prolog stages. The test belows
	; checks that the value loaded in the first prolog block gets propagated			; checks that the value loaded in the first prolog block gets propagated
	; through the first epilog to the use after the loop.			; through the first epilog to the use after the loop.

	; CHECK: if ({{.*}}) jump			; CHECK: if ({{.*}}) jump
	; CHECK: [[VREG:v([0-9]+)]]{{.*}} = {{.*}}vmem(r{{[0-9]+}}++#1)			; CHECK: [[VREG:v([0-9]+)]]{{.*}} = {{.*}}vmem(r{{[0-9]+}}++#1)
	; CHECK: if ({{.*}}) {{jump|jump:nt}} [[EPLOG1:(.*)]]			; CHECK: if ({{.*}}) {{jump|jump:nt|jump:t}} [[EPLOG1:(.*)]]
	; CHECK: if ({{.*}}) {{jump|jump:nt}} [[EPLOG:(.*)]]			; CHECK: if ({{.*}}) {{jump|jump:nt|jump:t}} [[EPLOG:(.*)]]
	; CHECK: [[EPLOG]]:			; CHECK: [[EPLOG]]:
	; CHECK: [[VREG1:v([0-9]+)]] = [[VREG]]			; CHECK: [[VREG1:v([0-9]+)]] = [[VREG]]
	; CHECK: [[VREG]] = v{{[0-9]+}}			; CHECK: [[VREG]] = v{{[0-9]+}}
	; CHECK: [[EPLOG1]]:			; CHECK: [[EPLOG1]]:
	; CHECK: = vlalign([[VREG]],[[VREG1]],#1)			; CHECK: = vlalign([[VREG]],[[VREG1]],#1)

	; Function Attrs: nounwind			; Function Attrs: nounwind
	define void @f0(i8* noalias nocapture readonly %a0, i32 %a1, i32 %a2, i8* noalias nocapture readonly %a3, i32 %a4, i8* noalias nocapture %a5, i32 %a6) #0 {			define void @f0(i8* noalias nocapture readonly %a0, i32 %a1, i32 %a2, i8* noalias nocapture readonly %a3, i32 %a4, i8* noalias nocapture %a5, i32 %a6) #0 {

Revision Contents

Diff 219889