Diff 308279

llvm/lib/Target/ARM/ARMScheduleA57.td

	Show First 20 Lines • Show All 177 Lines • ▼ Show 20 Lines
	// if so adds 2 cyc to latency, 1 uop, 1 res cycle for A57UnitB			// if so adds 2 cyc to latency, 1 uop, 1 res cycle for A57UnitB
	class A57BranchForm<SchedWriteRes non_br> :			class A57BranchForm<SchedWriteRes non_br> :
	BranchWriteRes<2, 1, [A57UnitB], [1], non_br>;			BranchWriteRes<2, 1, [A57UnitB], [1], non_br>;

	// shift by register, conditional or unconditional			// shift by register, conditional or unconditional
	// TODO: according to the doc, conditional uses I0/I1, unconditional uses M			// TODO: according to the doc, conditional uses I0/I1, unconditional uses M
	// Why more complex instruction uses more simple pipeline?			// Why more complex instruction uses more simple pipeline?
	// May be an error in doc.			// May be an error in doc.
	def A57WriteALUsi : SchedWriteVariant<[
	dmgreen: This is "Move, shift by immed, no setflags" _and_ "Move, shift by immed, setflags"? I agree that the predicated pred should not matter, but there probably should be some difference between flag setting and not.
	I think the TODO above is referring to A57WriteALUSsr? I'm not sure why A57WriteALUsr is treated the same way though. From what I can see it should be using A57Write_1cyc_1.
	Is A57ReadALUsr worth keeping around?
	evgeny777 (author):
	> Is A57ReadALUsr worth keeping around?
	I don't think it is; however, I decided to keep it for now for testing purposes.
	> I'm not sure why A57WriteALUsr is treated the same way though. From what I can see it should be using A57Write_1cyc_1.
	Why? From the opt guide:
	    ALU, shift by register, unconditional (same as above) 2 1 M
	    ALU, shift by register, conditional (same as above) 2 1 I0/I1
	dmgreen: Hmm. Which opt guide is that from? I seem to see:
	    Move, shift by immed, no setflags 1I
	    Move, shift by immed, setflags 2M
	    Move, shift by register, no setflags, unconditional 1I
	    Move, shift by register, no setflags, conditional 2I
	    Move, shift by register, setflags, unconditional 2M
	    Move, shift by register, setflags, conditional 2I
	So the first two are currently in WriteALUsi (which should probably be split up to get it correct, but that is a separate issue; a better default is probably A57Write_1cyc_1I). A57WriteALUSsr is the last two and fits the opt guide correctly at least. A57WriteALUsr is the middle two, but should probably be using Pred: A57Write_2cyc_1I and NoPred: A57Write_1cyc_1I.
	evgeny777 (author):
	> Hmm. Which opt guide is that from?
	I think we're both looking at the same one. I've copy-pasted from section 3.3.
	> I seem to see:
	Right, but `Move, shift by register` (MOVsr) is bound to `WriteALU` (ARM version), and the Thumb version (t2MOVsr) is unbound. The following instructions are currently bound to WriteALUsr:
	    ADCrsr ADDrsr ANDrsr BICrsr EORrsr ORRrsr RSBrsr RSCrsr SBCrsr SUBrsr SXTAB SXTAB16 SXTAH UXTAB UXTAB16 UXTAH
	It seems the ARM/Thumb instruction definition is incomplete and broken in many ways when it comes to scheduling. This patch, however, is more about simplifying the addition of new models than about fixing existing ones.
	dmgreen: Ah. I was thinking more about shifts than arithmetic operations. In that case, A57Write_2cyc_1M is probably a better default than A57Write_1cyc_1I.
	// lsl #2, lsl #1, or lsr #1.
	SchedVar<IsPredicatedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>,
	SchedVar<NoSchedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>
	]>;
	def A57WriteALUsr : SchedWriteVariant<[			def A57WriteALUsr : SchedWriteVariant<[
	SchedVar<IsPredicatedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1I>>]>,			SchedVar<IsPredicatedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1I>>]>,
	SchedVar<NoSchedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>			SchedVar<NoSchedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>
	]>;			]>;
	def A57WriteALUSsr : SchedWriteVariant<[			def A57WriteALUSsr : SchedWriteVariant<[
	SchedVar<IsPredicatedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1I>>]>,			SchedVar<IsPredicatedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1I>>]>,
	SchedVar<NoSchedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>			SchedVar<NoSchedPred, [CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>]>
	]>;			]>;
	def A57ReadALUsr : SchedReadVariant<[			def A57ReadALUsr : SchedReadVariant<[
	SchedVar<IsPredicatedPred, [ReadDefault]>,			SchedVar<IsPredicatedPred, [ReadDefault]>,
	SchedVar<NoSchedPred, [ReadDefault]>			SchedVar<NoSchedPred, [ReadDefault]>
	]>;			]>;
	def : SchedAlias<WriteALUsi, A57WriteALUsi>;			def : SchedAlias<WriteALUsi, CheckBranchForm<0, A57BranchForm<A57Write_2cyc_1M>>>;
	def : SchedAlias<WriteALUsr, A57WriteALUsr>;			def : SchedAlias<WriteALUsr, A57WriteALUsr>;
	def : SchedAlias<WriteALUSsr, A57WriteALUSsr>;			def : SchedAlias<WriteALUSsr, A57WriteALUSsr>;
	def : SchedAlias<ReadALUsr, A57ReadALUsr>;			def : SchedAlias<ReadALUsr, A57ReadALUsr>;

	def A57WriteCMPsr : SchedWriteVariant<[			def A57WriteCMPsr : SchedWriteVariant<[
	SchedVar<IsPredicatedPred, [A57Write_2cyc_1I]>,			SchedVar<IsPredicatedPred, [A57Write_2cyc_1I]>,
	SchedVar<NoSchedPred, [A57Write_2cyc_1M]>			SchedVar<NoSchedPred, [A57Write_2cyc_1M]>
	]>;			]>;
	▲ Show 20 Lines • Show All 1,292 Lines • Show Last 20 Lines

llvm/test/CodeGen/ARM/cortex-a57-misched-mla.mir

This file was added.

				# RUN: llc -mcpu=cortex-a57 -mtriple=thumb -enable-misched -run-pass=machine-scheduler -debug-only=machine-scheduler %s 2>&1 | FileCheck %s

				# CHECK-LABEL: ******** MI Scheduling ********
				# CHECK: %[[RES:[0-9]+]]:rgpr = t2MLA
				# CHECK-NEXT: # preds left
				# CHECK-NEXT: # succs left
				# CHECK-NEXT: # rdefs left
				# CHECK-NEXT: Latency : 3
				# CHECK-NEXT: Depth
				# CHECK-NEXT: Height
				# CHECK-NEXT: Predecessors:
				# CHECK-NEXT: SU({{.*}}): Data Latency=1 Reg=
				# CHECK-NEXT: SU({{.*}}): Out Latency=
				# CHECK-NEXT: SU({{.*}}): Data Latency=1 Reg=
				# CHECK-NEXT: Successors:
				# CHECK-NEXT: SU([[SMLA_SU:[0-9]+]]): Data Latency=1 Reg=%[[RES]]
				# CHECK-NEXT: Pressure Diff
				# CHECK-NEXT: Single Issue : false;
				# CHECK-NEXT: SU([[SMLA_SU]]): {{.*}} = t2SMLAL %{{[0-9]+}}:rgpr, %{{[0-9]+}}:rgpr, %{{[0-9]+}}:rgpr(tied-def 0), %[[RES]]:rgpr(tied-def 1), 14, $noreg

				name: test_smlal_forwarding
				tracksRegLiveness: true
				body: |
				bb.0:
				liveins: $r1, $r3, $r4, $r5, $r6
				%1:rgpr = COPY $r1
				%3:rgpr = COPY $r3
				%4:rgpr = COPY $r4
				%5:rgpr = COPY $r5
				%6:rgpr = COPY $r6
				%3:rgpr = t2MLA %4:rgpr, %1:rgpr, %4:rgpr, 14, $noreg
				%6:rgpr, %5:rgpr = t2SMLAL %5:rgpr, %6:rgpr, %4:rgpr, %3:rgpr, 14, $noreg
				$r0 = COPY %6:rgpr
				BX_RET 14, $noreg, implicit $r0

llvm/utils/TableGen/CodeGenSchedule.h

Show First 20 Lines • Show All 437 Lines • ▼ Show 20 Lines	class CodeGenSchedModels {
RecVec ProcResGroups;		RecVec ProcResGroups;

// Map each instruction to its unique SchedClass index considering the		// Map each instruction to its unique SchedClass index considering the
// combination of it's itinerary class, SchedRW list, and InstRW records.		// combination of it's itinerary class, SchedRW list, and InstRW records.
using InstClassMapTy = DenseMap<Record*, unsigned>;		using InstClassMapTy = DenseMap<Record*, unsigned>;
InstClassMapTy InstrClassMap;		InstClassMapTy InstrClassMap;

std::vector<STIPredicateFunction> STIPredicates;		std::vector<STIPredicateFunction> STIPredicates;
		std::vector<unsigned> getAllProcIndices() const;

public:		public:
CodeGenSchedModels(RecordKeeper& RK, const CodeGenTarget &TGT);		CodeGenSchedModels(RecordKeeper& RK, const CodeGenTarget &TGT);

// iterator access to the scheduling classes.		// iterator access to the scheduling classes.
using class_iterator = std::vector<CodeGenSchedClass>::iterator;		using class_iterator = std::vector<CodeGenSchedClass>::iterator;
using const_class_iterator = std::vector<CodeGenSchedClass>::const_iterator;		using const_class_iterator = std::vector<CodeGenSchedClass>::const_iterator;
class_iterator classes_begin() { return SchedClasses.begin(); }		class_iterator classes_begin() { return SchedClasses.begin(); }
▲ Show 20 Lines • Show All 195 Lines • Show Last 20 Lines

llvm/utils/TableGen/CodeGenSchedule.cpp

Show First 20 Lines • Show All 1,332 Lines • ▼ Show 20 Lines	class PredTransitions {
CodeGenSchedModels &SchedModels;		CodeGenSchedModels &SchedModels;

public:		public:
std::vector<PredTransition> TransVec;		std::vector<PredTransition> TransVec;

PredTransitions(CodeGenSchedModels &sm): SchedModels(sm) {}		PredTransitions(CodeGenSchedModels &sm): SchedModels(sm) {}

bool substituteVariantOperand(const SmallVectorImpl<unsigned> &RWSeq,		bool substituteVariantOperand(const SmallVectorImpl<unsigned> &RWSeq,
bool IsRead, bool IsForAnyCPU,		bool IsRead, unsigned StartIdx);
unsigned StartIdx);

bool substituteVariants(const PredTransition &Trans);		bool substituteVariants(const PredTransition &Trans);

#ifndef NDEBUG		#ifndef NDEBUG
void dump() const;		void dump() const;
#endif		#endif

private:		private:
bool mutuallyExclusive(Record *PredDef, ArrayRef<Record *> Preds,		bool mutuallyExclusive(Record *PredDef, ArrayRef<Record *> Preds,
ArrayRef<PredCheck> Term);		ArrayRef<PredCheck> Term);
void getIntersectingVariants(		void getIntersectingVariants(
const CodeGenSchedRW &SchedRW, unsigned TransIdx,		const CodeGenSchedRW &SchedRW, unsigned TransIdx,
std::vector<TransVariant> &IntersectingVariants);		std::vector<TransVariant> &IntersectingVariants);
void pushVariant(const TransVariant &VInfo, bool IsRead);		void pushVariant(const TransVariant &VInfo, bool IsRead);
};		};

} // end anonymous namespace		} // end anonymous namespace

// Return true if this predicate is mutually exclusive with a PredTerm. This		// Return true if this predicate is mutually exclusive with a PredTerm. This
		dmgreen: Does this need to use a densemap? It seems to be being used to check whether the TransVariant have already been handled. Can it use a set or something simpler for that?
		evgeny777 (author): Unfortunately it can't, because we need to check not only for the same record definition but also for the same processor index (ignoring this is a bug in the current implementation). This is because the same variant record may be shared between different processor models (like ReadAdrBase in ThunderX2T99 and ThunderX3T110).
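		The point of the reply above is that "already handled" has to be decided per (record, processor index) pair, not per record. A minimal sketch of that keying follows. `VariantRecord` and `HandledVariants` are hypothetical stand-ins, not the types used in the patch, and a `std::set` of pairs is used only for brevity; the container the patch actually uses is not shown here.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for a TableGen variant record (not llvm::Record).
struct VariantRecord {
  std::string Name;
};

// Remembers which (record, processor index) pairs were already expanded.
// Keying on the record alone would wrongly treat a record shared by two
// processor models as handled after the first model is visited.
class HandledVariants {
  std::set<std::pair<const VariantRecord *, unsigned>> Seen;

public:
  // Returns true the first time a given pair is seen, false afterwards.
  bool markHandled(const VariantRecord *R, unsigned ProcIdx) {
    assert(R && "record must not be null");
    return Seen.insert(std::make_pair(R, ProcIdx)).second;
  }
};
```

		With this keying, a record shared by two models is reported as new once for each processor index, which is the behavior the reply asks for.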
// degenerates into checking if the predicate is mutually exclusive with any		// degenerates into checking if the predicate is mutually exclusive with any
// predicate in the Term's conjunction.		// predicate in the Term's conjunction.
//		//
// All predicates associated with a given SchedRW are considered mutually		// All predicates associated with a given SchedRW are considered mutually
// exclusive. This should work even if the conditions expressed by the		// exclusive. This should work even if the conditions expressed by the
// predicates are not exclusive because the predicates for a given SchedWrite		// predicates are not exclusive because the predicates for a given SchedWrite
// are always checked in the order they are defined in the .td file. Later		// are always checked in the order they are defined in the .td file. Later
// conditions implicitly negate any prior condition.		// conditions implicitly negate any prior condition.
Show All 38 Lines	if (any_of(Variants, [PredDef](const Record *R) {
if (!count(Preds, PC.Predicate))		if (!count(Preds, PC.Predicate))
continue;		continue;
return true;		return true;
}		}
}		}
return false;		return false;
}		}
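The comment on mutuallyExclusive relies on variants being tested in definition order, so a later predicate only fires when all earlier ones failed. A small sketch of that first-match rule, modeled loosely on A57WriteALUsr, is below. `Instr`, `Variant`, `selectWrite`, and `makeALUsrVariants` are hypothetical names for illustration; the real selection is generated by TableGen and is not this code.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified view of an instruction.
struct Instr {
  bool Predicated;
};

// One SchedVar-like entry: a predicate and the write it selects.
struct Variant {
  std::function<bool(const Instr &)> Pred;
  std::string Write;
};

// Variants are tried in definition order; the first match wins, so each
// later entry is implicitly guarded by the negation of the earlier ones.
std::string selectWrite(const std::vector<Variant> &Variants, const Instr &I) {
  assert(!Variants.empty() && "expected at least one variant");
  for (const Variant &V : Variants)
    if (V.Pred(I))
      return V.Write;
  return std::string(); // no variant matched
}

// Mirrors the shape of A57WriteALUsr: a predicated case, then a catch-all
// playing the role of NoSchedPred.
std::vector<Variant> makeALUsrVariants() {
  std::vector<Variant> Vs;
  Vs.push_back({[](const Instr &I) { return I.Predicated; }, "A57Write_2cyc_1I"});
  Vs.push_back({[](const Instr &) { return true; }, "A57Write_2cyc_1M"});
  return Vs;
}
```

The catch-all entry never needs an explicit "not predicated" check, because it is only reached when the first predicate has already failed.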

static bool hasAliasedVariants(const CodeGenSchedRW &RW,
CodeGenSchedModels &SchedModels) {
if (RW.HasVariants)
return true;

for (Record *Alias : RW.Aliases) {
const CodeGenSchedRW &AliasRW =
SchedModels.getSchedRW(Alias->getValueAsDef("AliasRW"));
if (AliasRW.HasVariants)
return true;
if (AliasRW.IsSequence) {
IdxVec ExpandedRWs;
SchedModels.expandRWSequence(AliasRW.Index, ExpandedRWs, AliasRW.IsRead);
for (unsigned SI : ExpandedRWs) {
if (hasAliasedVariants(SchedModels.getSchedRW(SI, AliasRW.IsRead),
SchedModels))
return true;
}
}
}
return false;
}

static std::vector<Record *> getAllPredicates(ArrayRef<TransVariant> Variants,		static std::vector<Record *> getAllPredicates(ArrayRef<TransVariant> Variants,
ArrayRef<unsigned> ProcIndices) {		ArrayRef<unsigned> ProcIndices) {
std::vector<Record *> Preds;		std::vector<Record *> Preds;
for (auto &Variant : Variants) {		for (auto &Variant : Variants) {
if (!Variant.VarOrSeqDef->isSubClassOf("SchedVar"))		if (!Variant.VarOrSeqDef->isSubClassOf("SchedVar"))
continue;		continue;

if (ProcIndices[0] && Variant.ProcIdx)		if (ProcIndices[0] && Variant.ProcIdx)
▲ Show 20 Lines • Show All 161 Lines • ▼ Show 20 Lines	pushVariant(const TransVariant &VInfo, bool IsRead) {
}		}
}		}

// RWSeq is a sequence of all Reads or all Writes for the next read or write		// RWSeq is a sequence of all Reads or all Writes for the next read or write
// operand. StartIdx is an index into TransVec where partial results		// operand. StartIdx is an index into TransVec where partial results
// starts. RWSeq must be applied to all transitions between StartIdx and the end		// starts. RWSeq must be applied to all transitions between StartIdx and the end
// of TransVec.		// of TransVec.
bool PredTransitions::substituteVariantOperand(		bool PredTransitions::substituteVariantOperand(
const SmallVectorImpl<unsigned> &RWSeq, bool IsRead, bool IsForAnyCPU,		const SmallVectorImpl<unsigned> &RWSeq, bool IsRead, unsigned StartIdx) {
unsigned StartIdx) {

auto CollectAndAddVariants = [&](unsigned TransIdx,
const CodeGenSchedRW &SchedRW) {
// Distribute this partial PredTransition across intersecting variants.
// This will push a copies of TransVec[TransIdx] on the back of TransVec.
std::vector<TransVariant> IntersectingVariants;
getIntersectingVariants(SchedRW, TransIdx, IntersectingVariants);
// Now expand each variant on top of its copy of the transition.
for (const TransVariant &IV : IntersectingVariants)
pushVariant(IV, IsRead);
return !IntersectingVariants.empty();
};

bool Subst = false;		bool Subst = false;
// Visit each original RW within the current sequence.		// Visit each original RW within the current sequence.
for (SmallVectorImpl<unsigned>::const_iterator		for (SmallVectorImpl<unsigned>::const_iterator
RWI = RWSeq.begin(), RWE = RWSeq.end(); RWI != RWE; ++RWI) {		RWI = RWSeq.begin(), RWE = RWSeq.end(); RWI != RWE; ++RWI) {
const CodeGenSchedRW &SchedRW = SchedModels.getSchedRW(*RWI, IsRead);		const CodeGenSchedRW &SchedRW = SchedModels.getSchedRW(*RWI, IsRead);
// Push this RW on all partial PredTransitions or distribute variants.		// Push this RW on all partial PredTransitions or distribute variants.
// New PredTransitions may be pushed within this loop which should not be		// New PredTransitions may be pushed within this loop which should not be
// revisited (TransEnd must be loop invariant).		// revisited (TransEnd must be loop invariant).
bool HasAliases = false, WasPushed = false;
for (unsigned TransIdx = StartIdx, TransEnd = TransVec.size();		for (unsigned TransIdx = StartIdx, TransEnd = TransVec.size();
TransIdx != TransEnd; ++TransIdx) {		TransIdx != TransEnd; ++TransIdx) {
// In the common case, push RW onto the current operand's sequence.		// Distribute this partial PredTransition across intersecting variants.
if (!hasAliasedVariants(SchedRW, SchedModels)) {		// This will push a copies of TransVec[TransIdx] on the back of TransVec.
		std::vector<TransVariant> IntersectingVariants;
		getIntersectingVariants(SchedRW, TransIdx, IntersectingVariants);
		// Now expand each variant on top of its copy of the transition.
		for (const TransVariant &IV : IntersectingVariants)
		pushVariant(IV, IsRead);
		if (IntersectingVariants.empty()) {
if (IsRead)		if (IsRead)
TransVec[TransIdx].ReadSequences.back().push_back(*RWI);		TransVec[TransIdx].ReadSequences.back().push_back(*RWI);
else		else
TransVec[TransIdx].WriteSequences.back().push_back(*RWI);		TransVec[TransIdx].WriteSequences.back().push_back(*RWI);
continue;		continue;
		} else {
		Subst = true;
}		}
HasAliases = true;
WasPushed |= CollectAndAddVariants(TransIdx, SchedRW);
Subst |= WasPushed;
}
if (IsRead && IsForAnyCPU && HasAliases && !WasPushed) {
// If we're here this means that in some sched class:
// a) We have read variant for CPU A
// b) We have write variant for CPU B
// b) We don't have write variant for CPU A
// d) We must expand all read/write variants (IsForAnyCPU is true)
// e) We couldn't expand SchedRW because TransVec doesn't have
// any transition with compatible CPU ID.
// In such case we create new empty transition with zero (AnyCPU)
// index.
TransVec.reserve(TransVec.size() + 1);
TransVec.emplace_back(TransVec[StartIdx].PredTerm);
TransVec.back().ReadSequences.emplace_back();
Subst |= CollectAndAddVariants(TransVec.size() - 1, SchedRW);
}		}
}		}
return Subst;		return Subst;
}		}

// For each variant of a Read/Write in Trans, substitute the sequence of		// For each variant of a Read/Write in Trans, substitute the sequence of
// Read/Writes guarded by the variant. This is exponential in the number of		// Read/Writes guarded by the variant. This is exponential in the number of
// variant Read/Writes, but in practice detection of mutually exclusive		// variant Read/Writes, but in practice detection of mutually exclusive
// predicates should result in linear growth in the total number variants.		// predicates should result in linear growth in the total number variants.
//		//
// This is one step in a breadth-first search of nested variants.		// This is one step in a breadth-first search of nested variants.
bool PredTransitions::substituteVariants(const PredTransition &Trans) {		bool PredTransitions::substituteVariants(const PredTransition &Trans) {
// Build up a set of partial results starting at the back of		// Build up a set of partial results starting at the back of
// PredTransitions. Remember the first new transition.		// PredTransitions. Remember the first new transition.
unsigned StartIdx = TransVec.size();		unsigned StartIdx = TransVec.size();
bool Subst = false;		bool Subst = false;
TransVec.emplace_back(Trans.PredTerm, Trans.ProcIndices);		TransVec.emplace_back(Trans.PredTerm, Trans.ProcIndices);

bool IsForAnyCPU = llvm::count(Trans.ProcIndices, 0);		assert(!llvm::count(Trans.ProcIndices, 0));
// Visit each original write sequence.		// Visit each original write sequence.
for (SmallVectorImpl<SmallVector<unsigned,4>>::const_iterator		for (SmallVectorImpl<SmallVector<unsigned,4>>::const_iterator
WSI = Trans.WriteSequences.begin(), WSE = Trans.WriteSequences.end();		WSI = Trans.WriteSequences.begin(), WSE = Trans.WriteSequences.end();
WSI != WSE; ++WSI) {		WSI != WSE; ++WSI) {
// Push a new (empty) write sequence onto all partial Transitions.		// Push a new (empty) write sequence onto all partial Transitions.
for (std::vector<PredTransition>::iterator I =		for (std::vector<PredTransition>::iterator I =
TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {		TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {
I->WriteSequences.emplace_back();		I->WriteSequences.emplace_back();
}		}
Subst |=		Subst |= substituteVariantOperand(*WSI, /*IsRead=*/false, StartIdx);
substituteVariantOperand(*WSI, /*IsRead=*/false, IsForAnyCPU, StartIdx);
}		}
// Visit each original read sequence.		// Visit each original read sequence.
for (SmallVectorImpl<SmallVector<unsigned,4>>::const_iterator		for (SmallVectorImpl<SmallVector<unsigned,4>>::const_iterator
RSI = Trans.ReadSequences.begin(), RSE = Trans.ReadSequences.end();		RSI = Trans.ReadSequences.begin(), RSE = Trans.ReadSequences.end();
RSI != RSE; ++RSI) {		RSI != RSE; ++RSI) {
// Push a new (empty) read sequence onto all partial Transitions.		// Push a new (empty) read sequence onto all partial Transitions.
for (std::vector<PredTransition>::iterator I =		for (std::vector<PredTransition>::iterator I =
TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {		TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {
I->ReadSequences.emplace_back();		I->ReadSequences.emplace_back();
}		}
Subst |=		Subst |= substituteVariantOperand(*RSI, /*IsRead=*/true, StartIdx);
substituteVariantOperand(*RSI, /*IsRead=*/true, IsForAnyCPU, StartIdx);
}		}
return Subst;		return Subst;
}		}

static void addSequences(CodeGenSchedModels &SchedModels,		static void addSequences(CodeGenSchedModels &SchedModels,
const SmallVectorImpl<SmallVector<unsigned, 4>> &Seqs,		const SmallVectorImpl<SmallVector<unsigned, 4>> &Seqs,
IdxVec &Result, bool IsRead) {		IdxVec &Result, bool IsRead) {
for (const auto &S : Seqs)		for (const auto &S : Seqs)
Show All 22 Lines
// Create a new SchedClass for each variant found by inferFromRW. Pass		// Create a new SchedClass for each variant found by inferFromRW. Pass
static void inferFromTransitions(ArrayRef<PredTransition> LastTransitions,		static void inferFromTransitions(ArrayRef<PredTransition> LastTransitions,
unsigned FromClassIdx,		unsigned FromClassIdx,
CodeGenSchedModels &SchedModels) {		CodeGenSchedModels &SchedModels) {
// For each PredTransition, create a new CodeGenSchedTransition, which usually		// For each PredTransition, create a new CodeGenSchedTransition, which usually
// requires creating a new SchedClass.		// requires creating a new SchedClass.
for (ArrayRef<PredTransition>::iterator		for (ArrayRef<PredTransition>::iterator
I = LastTransitions.begin(), E = LastTransitions.end(); I != E; ++I) {		I = LastTransitions.begin(), E = LastTransitions.end(); I != E; ++I) {
		// Variant expansion (substituteVariants) may create unconditional
		// transitions. We don't need to build sched classes for them.
		if (I->PredTerm.empty())
		continue;
IdxVec OperWritesVariant, OperReadsVariant;		IdxVec OperWritesVariant, OperReadsVariant;
addSequences(SchedModels, I->WriteSequences, OperWritesVariant, false);		addSequences(SchedModels, I->WriteSequences, OperWritesVariant, false);
addSequences(SchedModels, I->ReadSequences, OperReadsVariant, true);		addSequences(SchedModels, I->ReadSequences, OperReadsVariant, true);
CodeGenSchedTransition SCTrans;		CodeGenSchedTransition SCTrans;

// Transition should not contain processor indices already assigned to		// Transition should not contain processor indices already assigned to
// InstRWs in this scheduling class.		// InstRWs in this scheduling class.
const CodeGenSchedClass &FromSC = SchedModels.getSchedClass(FromClassIdx);		const CodeGenSchedClass &FromSC = SchedModels.getSchedClass(FromClassIdx);
Show All 16 Lines	for (ArrayRef<PredTransition>::iterator
Preds.erase(std::unique(Preds.begin(), Preds.end()), Preds.end());		Preds.erase(std::unique(Preds.begin(), Preds.end()), Preds.end());
dumpTransition(SchedModels, FromSC, SCTrans, Preds);		dumpTransition(SchedModels, FromSC, SCTrans, Preds);
SCTrans.PredTerm = std::move(Preds);		SCTrans.PredTerm = std::move(Preds);
SchedModels.getSchedClass(FromClassIdx)		SchedModels.getSchedClass(FromClassIdx)
.Transitions.push_back(std::move(SCTrans));		.Transitions.push_back(std::move(SCTrans));
}		}
}		}

		std::vector<unsigned> CodeGenSchedModels::getAllProcIndices() const {
		std::vector<unsigned> ProcIdVec;
		for (const auto &PM : ProcModelMap)
		if (PM.second != 0)
		ProcIdVec.push_back(PM.second);
		return ProcIdVec;
		}

		static std::vector<PredTransition>
		makePerProcessorTransitions(const PredTransition &Trans,
		ArrayRef<unsigned> ProcIndices) {
		std::vector<PredTransition> PerCpuTransVec;
		for (unsigned ProcId : ProcIndices) {
		assert(ProcId != 0);
		PerCpuTransVec.push_back(Trans);
		PerCpuTransVec.back().ProcIndices.assign(1, ProcId);
		}
		return PerCpuTransVec;
		}

// Create new SchedClasses for the given ReadWrite list. If any of the		// Create new SchedClasses for the given ReadWrite list. If any of the
// ReadWrites refers to a SchedVariant, create a new SchedClass for each variant		// ReadWrites refers to a SchedVariant, create a new SchedClass for each variant
// of the ReadWrite list, following Aliases if necessary.		// of the ReadWrite list, following Aliases if necessary.
void CodeGenSchedModels::inferFromRW(ArrayRef<unsigned> OperWrites,		void CodeGenSchedModels::inferFromRW(ArrayRef<unsigned> OperWrites,
ArrayRef<unsigned> OperReads,		ArrayRef<unsigned> OperReads,
unsigned FromClassIdx,		unsigned FromClassIdx,
ArrayRef<unsigned> ProcIndices) {		ArrayRef<unsigned> ProcIndices) {
LLVM_DEBUG(dbgs() << "INFER RW proc("; dumpIdxVec(ProcIndices);		LLVM_DEBUG(dbgs() << "INFER RW proc("; dumpIdxVec(ProcIndices);
Show All 19 Lines	for (unsigned ReadIdx : OperReads) {
expandRWSequence(ReadIdx, ReadSeq, /*IsRead=*/true);		expandRWSequence(ReadIdx, ReadSeq, /*IsRead=*/true);
LastTransitions[0].ReadSequences.emplace_back();		LastTransitions[0].ReadSequences.emplace_back();
SmallVectorImpl<unsigned> &Seq = LastTransitions[0].ReadSequences.back();		SmallVectorImpl<unsigned> &Seq = LastTransitions[0].ReadSequences.back();
Seq.append(ReadSeq.begin(), ReadSeq.end());		Seq.append(ReadSeq.begin(), ReadSeq.end());
LLVM_DEBUG(dbgs() << "("; dumpIdxVec(Seq); dbgs() << ") ");		LLVM_DEBUG(dbgs() << "("; dumpIdxVec(Seq); dbgs() << ") ");
}		}
LLVM_DEBUG(dbgs() << '\n');		LLVM_DEBUG(dbgs() << '\n');

		LastTransitions = makePerProcessorTransitions(
		LastTransitions[0], llvm::count(ProcIndices, 0)
		? ArrayRef<unsigned>(getAllProcIndices())
		: ProcIndices);
// Collect all PredTransitions for individual operands.		// Collect all PredTransitions for individual operands.
// Iterate until no variant writes remain.		// Iterate until no variant writes remain.
bool SubstitutedAny;		bool SubstitutedAny;
do {		do {
SubstitutedAny = false;		SubstitutedAny = false;
PredTransitions Transitions(*this);		PredTransitions Transitions(*this);
for (const PredTransition &Trans : LastTransitions)		for (const PredTransition &Trans : LastTransitions)
SubstitutedAny |= Transitions.substituteVariants(Trans);		SubstitutedAny |= Transitions.substituteVariants(Trans);
LLVM_DEBUG(Transitions.dump());		LLVM_DEBUG(Transitions.dump());
LastTransitions.swap(Transitions.TransVec);		LastTransitions.swap(Transitions.TransVec);
} while (SubstitutedAny);		} while (SubstitutedAny);
// If the first transition has no variants, nothing to do.
if (LastTransitions[0].PredTerm.empty())
return;

// WARNING: We are about to mutate the SchedClasses vector. Do not refer to		// WARNING: We are about to mutate the SchedClasses vector. Do not refer to
// OperWrites, OperReads, or ProcIndices after calling inferFromTransitions.		// OperWrites, OperReads, or ProcIndices after calling inferFromTransitions.
inferFromTransitions(LastTransitions, FromClassIdx, *this);		inferFromTransitions(LastTransitions, FromClassIdx, *this);
}		}
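
The new side of inferFromRW first splits the initial transition into one transition per processor, replacing the "any CPU" index 0 with the full list of real processor indices. A standalone sketch of that step is below. `Transition` and `splitPerProcessor` are simplified, hypothetical stand-ins for PredTransition and makePerProcessorTransitions, and the list of all processor indices is passed in rather than read from a model map.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical, stripped-down transition: only the processor indices matter here.
struct Transition {
  std::vector<unsigned> ProcIndices;
};

// Produce one copy of T per processor. Index 0 means "any processor", so if
// it is present the copies are made for every index in AllProcs instead.
std::vector<Transition> splitPerProcessor(const Transition &T,
                                          const std::vector<unsigned> &AllProcs) {
  bool ForAnyCPU =
      std::count(T.ProcIndices.begin(), T.ProcIndices.end(), 0u) != 0;
  const std::vector<unsigned> &Procs = ForAnyCPU ? AllProcs : T.ProcIndices;

  std::vector<Transition> PerCpu;
  for (unsigned ProcId : Procs) {
    assert(ProcId != 0 && "per-processor transitions need a real index");
    PerCpu.push_back(T);
    PerCpu.back().ProcIndices.assign(1, ProcId);
  }
  return PerCpu;
}
```

After this step every transition names exactly one processor, which is consistent with the assertion in substituteVariants that index 0 no longer appears.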

// Check if any processor resource group contains all resource records in		// Check if any processor resource group contains all resource records in
// SubUnits.		// SubUnits.
▲ Show 20 Lines • Show All 498 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[TableGen][SchedModels] Fix read/write variant substitution #2
ClosedPublic

