This is an archive of the discontinued LLVM Phabricator instance.

Differential D45259

[MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.
Closed · Public

Authored by andreadb on Apr 4 2018, 7:57 AM.


Details

Reviewers

RKSimon
courbet
spatel

Commits

rGc74ad502cecf: [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.
rL329304: [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca.

Summary

This patch adds the ability to describe properties of the hardware retire control unit.

Tablegen class RetireControlUnit has been added for this purpose (see TargetSchedule.td).

A RetireControlUnit specifies the size of the reorder buffer, as well as the maximum number of opcodes that can be retired every cycle.

A zero (or negative) value for the reorder buffer size means: "the size is unknown". If the size is unknown, then llvm-mca defaults it to the value of field SchedMachineModel::MicroOpBufferSize.
A zero or negative number of opcodes retired per cycle means: "there is no restriction on the number of instructions that can be retired every cycle".

Models can optionally specify an instance of RetireControlUnit. There can be at most one RetireControlUnit definition per scheduling model.

Information related to the RCU (RetireControlUnit) is stored in (two new fields of) MCExtraProcessorInfo.
llvm-mca loads that information when it initializes the DispatchUnit / RetireControlUnit (see Dispatch.h/Dispatch.cpp).
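
The defaulting rules described above can be sketched roughly as follows. This is only an illustration: the types `ExtraInfoSketch`, `SchedModelSketch` and `ResolvedRCU` and the function `resolveRCU` are hypothetical stand-ins, not the LLVM data structures or the actual llvm-mca code. Only the field names ReorderBufferSize, MaxRetirePerCycle and MicroOpBufferSize are taken from the patch.

```cpp
// Hypothetical stand-ins for the two new MCExtraProcessorInfo fields and for
// the relevant field of the scheduling model. These are NOT the LLVM types.
struct ExtraInfoSketch {
  unsigned ReorderBufferSize; // 0 means "the size is unknown".
  unsigned MaxRetirePerCycle; // 0 means "no limit".
};

struct SchedModelSketch {
  unsigned MicroOpBufferSize;
  const ExtraInfoSketch *Extra; // nullptr when the model has no extra info.
};

// Values a tool would use to set up its retire control unit.
struct ResolvedRCU {
  unsigned NumSlots;
  unsigned MaxRetirePerCycle;
};

// Assumed behavior: start from MicroOpBufferSize and no retire limit, then
// override with the model-provided values when they are known (non-zero).
ResolvedRCU resolveRCU(const SchedModelSketch &SM) {
  ResolvedRCU R{SM.MicroOpBufferSize, 0};
  if (SM.Extra) {
    if (SM.Extra->ReorderBufferSize)
      R.NumSlots = SM.Extra->ReorderBufferSize;
    R.MaxRetirePerCycle = SM.Extra->MaxRetirePerCycle;
  }
  return R;
}
```

With a model that provides `{64, 2}`, the sketch yields 64 slots and a limit of 2 per cycle. With no extra info, or with a reorder buffer size of 0, it falls back to MicroOpBufferSize.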

This patch fixes PR36661.

Please let me know if okay to commit

-Andrea

Event Timeline

andreadb created this revision. Apr 4 2018, 7:57 AM

Herald added subscribers: gbedwell, tschuett. Apr 4 2018, 7:57 AM

courbet added inline comments. Apr 5 2018, 2:32 AM

tools/llvm-mca/Dispatch.cpp
262	Can we get rid of the MaxRetirePerCycle ctor parameter? Is there any reason why the user would override this?
tools/llvm-mca/Dispatch.h
194	Any reason for this to be in a separate function? What about moving the ctor definition to the cpp file and inlining the code? Otherwise, when reading the ctor definition here, it's not obvious that AvailableSlots might be overridden in initialize().

andreadb added inline comments. Apr 5 2018, 2:46 AM

tools/llvm-mca/Dispatch.cpp
262	The original idea was to let users override the retire-per-cycle. However, with this patch, the max-retire-per-cycle can now be set via tablegen. That being said, most processors don't provide a max-retire-per-cycle. This patch would only fix the BtVer2 model. So, I am tempted to keep the flag for now (although I don't have a strong opinion about it). Alternatively, I can commit a separate patch that removes the flag, and then rebase this patch. What do you think?
tools/llvm-mca/Dispatch.h
194	No reason. I will move the constructor to the cpp file.

Why did the tests change? I thought that this change would not impact tests.

tools/llvm-mca/Dispatch.cpp
262	I'm pretty sure that I would stick to tablegen-specified models as a user. And if you're developing your own processor, I hope that you have your own tools :) So the flag does not sound that useful to me, given that it makes this code a bit harder to understand. I don't have a strong opinion though, so feel free to keep the flag if it's useful to you.

In D45259#1058226, @courbet wrote:

Why did the tests change? I thought that this change would not impact tests.

It does have an impact on the retire stage.

On BtVer2, the retire throughput is 2 instructions per cycle. It does affect the timeline at least.
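
To illustrate why a retire limit can shift the 'R' column in the timeline, here is a toy model of in-order retirement. It is a sketch under simplifying assumptions, not the llvm-mca retire stage: the function `retireCycles` and its inputs are hypothetical, and each instruction is treated as a single retire slot.

```cpp
#include <algorithm>
#include <vector>

// Toy model of in-order retirement. ReadyCycle[i] is the first cycle in which
// instruction i (in program order) could retire. MaxRetirePerCycle == 0 means
// that there is no limit.
std::vector<unsigned> retireCycles(const std::vector<unsigned> &ReadyCycle,
                                   unsigned MaxRetirePerCycle) {
  std::vector<unsigned> Retired;
  Retired.reserve(ReadyCycle.size());
  unsigned Cycle = 0;       // Cycle in which the previous instruction retired.
  unsigned UsedInCycle = 0; // Instructions already retired in that cycle.
  for (unsigned Ready : ReadyCycle) {
    // In-order retirement: never retire before an older instruction.
    unsigned Candidate = std::max(Ready, Cycle);
    if (Candidate != Cycle) {
      UsedInCycle = 0;
    } else if (MaxRetirePerCycle && UsedInCycle == MaxRetirePerCycle) {
      // The current cycle is full: move to the next one.
      ++Candidate;
      UsedInCycle = 0;
    }
    Cycle = Candidate;
    ++UsedInCycle;
    Retired.push_back(Cycle);
  }
  return Retired;
}
```

For ready cycles `{5, 5, 5, 5, 7}`, a limit of 2 gives retire cycles `{5, 5, 6, 6, 7}`, while no limit gives `{5, 5, 5, 5, 7}`. That is the kind of one-cycle shift visible in the updated BtVer2 tests.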

tools/llvm-mca/Dispatch.cpp
262	That's a good point (especially if you have your own tools). It could have been used to experiment/play with the RCU (with SMT processors we might emulate a different retire throughput). That being said, the argument is a bit weak. SMT introduces other issues related to the partitioning of resources which cannot be fully tweaked via flags. If okay for you, I am going to commit a change that removes the flag and updates the docs in preparation for this patch. Then I will rebase this patch.

javed.absar added a subscriber: javed.absar. Apr 5 2018, 3:24 AM

Diffusion mentioned this in rL329274: [llvm-mca] Remove flag -max-retire-per-cycle, and update the docs. Apr 5 2018, 4:39 AM

Patch updated.

Addressed review comments.

andreadb marked 3 inline comments as done. Apr 5 2018, 6:30 AM

andreadb added inline comments.

tools/llvm-mca/Dispatch.cpp
262	I went ahead and removed that flag at r329274.

LGTM with a couple of minors

include/llvm/MC/MCSchedule.h
176	These comments repeat a lot of what is said in TargetSchedule.td - make the comments briefer here?
tools/llvm-mca/Dispatch.h
269–273	Are you confident that we won't need llvm::MCSubtargetInfo again anytime soon?

This revision is now accepted and ready to land. Apr 5 2018, 7:57 AM

Thanks Simon.

include/llvm/MC/MCSchedule.h
176	I will simplify the comment.
tools/llvm-mca/Dispatch.h
269–273	Ideally, all the information needed by the dispatch logic should be accessible through the scheduling model. I may be wrong, but I don't think that we will need it again.

Closed by commit rL329304: [MC][Tablegen] Allow models to describe the retire control unit for llvm-mca. (authored by adibiagio). Apr 5 2018, 8:44 AM

This revision was automatically updated to reflect the committed changes.

benhamilton added a subscriber: benhamilton. Apr 5 2018, 8:51 AM

benhamilton added inline comments.

llvm/trunk/utils/TableGen/SubtargetEmitter.cpp
616 ↗	(On Diff #141168)	Looks like this broke the build:
http://green.lab.llvm.org/green//job/clang-stage1-cmake-RA-incremental/47722/consoleFull#-172429723349ba4694-19c4-4d7e-bec5-911270d8a58c

/Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-incremental/llvm/utils/TableGen/SubtargetEmitter.cpp:616:9: error: no matching function for call to 'max'
        std::max(ReorderBufferSize, RCU->getValueAsInt("ReorderBufferSize"));
        ^~~~~~~~

P8077

I am going to fix it now.
I guess I should have used int64_t ...
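
"No matching function for call to 'max'" is the usual symptom of calling `std::max` with two arguments of different types: the template has a single type parameter, so deduction fails when the operands disagree. `Record::getValueAsInt` returns `int64_t`; the declared type of `ReorderBufferSize` at that line is not shown here. The sketch below shows one possible way to avoid the mismatch, using a hypothetical helper `clampToNonNegative`. It is not the actual follow-up fix.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not the actual fix: keep both operands as int64_t so
// that std::max deduces a single type, then narrow once the value is known to
// be non-negative.
unsigned clampToNonNegative(int64_t Value) {
  // std::max(0u, Value) would not compile: 'unsigned' and 'int64_t' are
  // different types, and std::max has a single template parameter.
  int64_t Clamped = std::max<int64_t>(0, Value);
  return static_cast<unsigned>(Clamped);
}
```

Clamping negative inputs to zero also matches the convention in the summary, where a zero or negative value means "unknown" or "no restriction".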

Revision Contents

Path                                                  Size
include/llvm/MC/MCSchedule.h                          13 lines
include/llvm/Target/TargetSchedule.td                 19 lines
lib/Target/X86/X86ScheduleBtVer2.td                   5 lines
test/tools/llvm-mca/X86/BtVer2/dot-product.s          11 lines
test/tools/llvm-mca/X86/BtVer2/pipes-fpu.s            43 lines
test/tools/llvm-mca/X86/BtVer2/register-files-5.s     42 lines
tools/llvm-mca/Backend.h                              6 lines
tools/llvm-mca/Dispatch.h                             25 lines
tools/llvm-mca/Dispatch.cpp                           16 lines
utils/TableGen/CodeGenSchedule.h                      12 lines
utils/TableGen/CodeGenSchedule.cpp                    26 lines
utils/TableGen/SubtargetEmitter.cpp                   17 lines

Diff 140958

include/llvm/MC/MCSchedule.h

	};			};

	/// Provide extra details about the machine processor.			/// Provide extra details about the machine processor.
	///			///
	/// This is a collection of "optional" processor information that is not			/// This is a collection of "optional" processor information that is not
	/// normally used by the LLVM machine schedulers, but that can be consumed by			/// normally used by the LLVM machine schedulers, but that can be consumed by
	/// external tools like llvm-mca to improve the quality of the peformance			/// external tools like llvm-mca to improve the quality of the peformance
	/// analysis.			/// analysis.
	/// In future, the plan is to extend this struct with extra information (for
	/// example: maximum number of instructions retired per cycle; actual size of
	/// the reorder buffer; etc.).
	struct MCExtraProcessorInfo {			struct MCExtraProcessorInfo {
				// ReorderBufferSize is the actual size of the reorder buffer in hardware.
				// This value may differ from the value of field
				// `MCSchedModel::MicroOpBufferSize`. External tools like llvm-mca should use
				// this value to model the size of the reorder buffer in the retire control
				// unit. A value of zero for this field is used for when the size is unknown.
				unsigned ReorderBufferSize;
				// Number of instructions retired per cycle.
				// A value of zero means that there are no restrictions in the number of
				// micro-ops that can be retired every cycle.
				unsigned MaxRetirePerCycle;
				RKSimon: These comments repeat a lot of what is said in TargetSchedule.td - make the comments briefer here?
				andreadb: I will simplify the comment.
	const MCRegisterFileDesc *RegisterFiles;			const MCRegisterFileDesc *RegisterFiles;
	unsigned NumRegisterFiles;			unsigned NumRegisterFiles;
	const MCRegisterCostEntry *RegisterCostTable;			const MCRegisterCostEntry *RegisterCostTable;
	unsigned NumRegisterCostEntries;			unsigned NumRegisterCostEntries;
	};			};

	/// Machine model for scheduling, bundling, and heuristics.			/// Machine model for scheduling, bundling, and heuristics.
	///			///

include/llvm/Target/TargetSchedule.td

	// SchedModel will usually be provided by surrounding let statement			// SchedModel will usually be provided by surrounding let statement
	// and ties this SchedAlias mapping to a processor.			// and ties this SchedAlias mapping to a processor.
	class SchedAlias<SchedReadWrite match, SchedReadWrite alias> {			class SchedAlias<SchedReadWrite match, SchedReadWrite alias> {
	SchedReadWrite MatchRW = match;			SchedReadWrite MatchRW = match;
	SchedReadWrite AliasRW = alias;			SchedReadWrite AliasRW = alias;
	SchedMachineModel SchedModel = ?;			SchedMachineModel SchedModel = ?;
	}			}

	// Alow the definition of processor register files.			// Allow the definition of processor register files.
	// Each processor register file declares the number of physical registers, as			// Each processor register file declares the number of physical registers, as
	// well as a optional register cost information. The cost of a register R is the			// well as a optional register cost information. The cost of a register R is the
	// number of physical registers used to rename R (at register renaming stage).			// number of physical registers used to rename R (at register renaming stage).
	// That value defaults to 1, to all the registers contained in the register			// That value defaults to 1, to all the registers contained in the register
	// file. The set of target register files is inferred from the list of register			// file. The set of target register files is inferred from the list of register
	// classes. Register costs are defined at register class granularity. An empty			// classes. Register costs are defined at register class granularity. An empty
	// list of register classes means that this register file contains all the			// list of register classes means that this register file contains all the
	// registers defined by the target.			// registers defined by the target.
	class RegisterFile<int numPhysRegs, list<RegisterClass> Classes = [],			class RegisterFile<int numPhysRegs, list<RegisterClass> Classes = [],
	list<int> Costs = []> {			list<int> Costs = []> {
	list<RegisterClass> RegClasses = Classes;			list<RegisterClass> RegClasses = Classes;
	list<int> RegCosts = Costs;			list<int> RegCosts = Costs;
	int NumPhysRegs = numPhysRegs;			int NumPhysRegs = numPhysRegs;
	SchedMachineModel SchedModel = ?;			SchedMachineModel SchedModel = ?;
	}			}

				// Describe the retire control unit.
				// A retire control unit specifies the size of the reorder buffer, as well as
				// the maximum number of opcodes that can be retired every cycle.
				// A value less-than-or-equal-to zero for field 'ReorderBufferSize' means: "the
				// size is unknown". The idea is that external tools can fall-back to using
				// field MicroOpBufferSize in SchedModel if the reorder buffer size is unknown.
				// A zero or negative value for field 'MaxRetirePerCycle' means "no
				// restrictions on the number of instructions retired per cycle".
				// Models can optionally specify up to one instance of RetireControlUnit per
				// scheduling model.
				class RetireControlUnit<int bufferSize, int retirePerCycle> {
				int ReorderBufferSize = bufferSize;
				int MaxRetirePerCycle = retirePerCycle;
				SchedMachineModel SchedModel = ?;
				}

lib/Target/X86/X86ScheduleBtVer2.td

	// Reference: www.realworldtech.com/jaguar/4/			// Reference: www.realworldtech.com/jaguar/4/
	def IntegerPRF : RegisterFile<64, [GR8, GR16, GR32, GR64, CCR]>;			def IntegerPRF : RegisterFile<64, [GR8, GR16, GR32, GR64, CCR]>;

	// The Jaguar FP Retire Queue renames SIMD and FP uOps onto a pool of 72 SSE			// The Jaguar FP Retire Queue renames SIMD and FP uOps onto a pool of 72 SSE
	// registers. Operations on 256-bit data types are cracked into two COPs.			// registers. Operations on 256-bit data types are cracked into two COPs.
	// Reference: www.realworldtech.com/jaguar/4/			// Reference: www.realworldtech.com/jaguar/4/
	def FpuPRF: RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>;			def FpuPRF: RegisterFile<72, [VR64, VR128, VR256], [1, 1, 2]>;

				// The out-of-order window for Jaguar is 64 entries. Up to 2 COPs per cycle are
				// retired in-order by the RCU.
				// Reference: www.realworldtech.com/jaguar/4/
				def RCU : RetireControlUnit<64, 2>;

	// Integer Pipe Scheduler			// Integer Pipe Scheduler
	def JALU01 : ProcResGroup<[JALU0, JALU1]> {			def JALU01 : ProcResGroup<[JALU0, JALU1]> {
	let BufferSize=20;			let BufferSize=20;
	}			}

	// AGU Pipe Scheduler			// AGU Pipe Scheduler
	def JLSAGU : ProcResGroup<[JLAGU, JSAGU]> {			def JLSAGU : ProcResGroup<[JLAGU, JSAGU]> {
	let BufferSize=12;			let BufferSize=12;

test/tools/llvm-mca/X86/BtVer2/dot-product.s


	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: 012345			# CHECK-NEXT: 012345
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 0123456789

	# CHECK: [0,0] DeeER. . . vmulps %xmm0, %xmm1, %xmm2			# CHECK: [0,0] DeeER. . . vmulps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [0,1] D==eeeER . . vhaddps %xmm2, %xmm2, %xmm3			# CHECK-NEXT: [0,1] D==eeeER . . vhaddps %xmm2, %xmm2, %xmm3
	# CHECK-NEXT: [0,2] .D====eeeER . vhaddps %xmm3, %xmm3, %xmm4			# CHECK-NEXT: [0,2] .D====eeeER . vhaddps %xmm3, %xmm3, %xmm4

	# CHECK: [1,0] .DeeE-----R . vmulps %xmm0, %xmm1, %xmm2			# CHECK: [1,0] .DeeE-----R . vmulps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [1,1] . D=eeeE--R . vhaddps %xmm2, %xmm2, %xmm3			# CHECK-NEXT: [1,1] . D=eeeE---R . vhaddps %xmm2, %xmm2, %xmm3
	# CHECK-NEXT: [1,2] . D====eeeER . vhaddps %xmm3, %xmm3, %xmm4			# CHECK-NEXT: [1,2] . D====eeeER . vhaddps %xmm3, %xmm3, %xmm4
				# CHECK: [2,0] . DeeE-----R . vmulps %xmm0, %xmm1, %xmm2
	# CHECK: [2,0] . DeeE----R . vmulps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [2,1] . D====eeeER . vhaddps %xmm2, %xmm2, %xmm3			# CHECK-NEXT: [2,1] . D====eeeER . vhaddps %xmm2, %xmm2, %xmm3
	# CHECK-NEXT: [2,2] . D======eeeER vhaddps %xmm3, %xmm3, %xmm4			# CHECK-NEXT: [2,2] . D======eeeER vhaddps %xmm3, %xmm3, %xmm4


	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 3 1.0 1.0 3.0 vmulps %xmm0, %xmm1, %xmm2			# CHECK-NEXT: 0. 3 1.0 1.0 3.3 vmulps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: 1. 3 3.3 0.7 0.7 vhaddps %xmm2, %xmm2, %xmm3			# CHECK-NEXT: 1. 3 3.3 0.7 1.0 vhaddps %xmm2, %xmm2, %xmm3
	# CHECK-NEXT: 2. 3 5.7 0.0 0.0 vhaddps %xmm3, %xmm3, %xmm4			# CHECK-NEXT: 2. 3 5.7 0.0 0.0 vhaddps %xmm3, %xmm3, %xmm4

test/tools/llvm-mca/X86/BtVer2/pipes-fpu.s

	# CHECK-NEXT: - - - - - - 1.00 - - - 1.00 - - - vcvttps2dq %xmm0, %xmm2			# CHECK-NEXT: - - - - - - 1.00 - - - 1.00 - - - vcvttps2dq %xmm0, %xmm2
	# CHECK-NEXT: - - - - - 1.00 - - - - - - - 1.00 vpclmulqdq $0, %xmm0, %xmm1, %xmm2			# CHECK-NEXT: - - - - - 1.00 - - - - - - - 1.00 vpclmulqdq $0, %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - - vaddps %xmm0, %xmm1, %xmm2			# CHECK-NEXT: - - - 1.00 - 1.00 - - - - - - - - vaddps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: - - - - 21.00 - 1.00 - - - - - - - vsqrtps %xmm0, %xmm2			# CHECK-NEXT: - - - - 21.00 - 1.00 - - - - - - - vsqrtps %xmm0, %xmm2
	# CHECK-NEXT: - - - 2.00 - 2.00 - - - - - - - - vaddps %ymm0, %ymm1, %ymm2			# CHECK-NEXT: - - - 2.00 - 2.00 - - - - - - - - vaddps %ymm0, %ymm1, %ymm2
	# CHECK-NEXT: - - - - 42.00 - 2.00 - - - - - - - vsqrtps %ymm0, %ymm2			# CHECK-NEXT: - - - - 42.00 - 2.00 - - - - - - - vsqrtps %ymm0, %ymm2


	# CHECK: Timeline view:
	# CHECK-NEXT: 0123456789 0123456789 0123456789
	# CHECK-NEXT: Index 0123456789 0123456789 0123456789 01234567

				# CHECK: Timeline view:
				# CHECK-NEXT: 0123456789 0123456789 0123456789 0
				# CHECK-NEXT: Index 0123456789 0123456789 0123456789 0123456789
	# CHECK: [0,0] DeeeeER . . . . . . . . . . . . . vpmulld %xmm0, %xmm1, %xmm2			# CHECK: [0,0] DeeeeER . . . . . . . . . . . . . vpmulld %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [0,1] .DeE--R . . . . . . . . . . . . . vpand %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [0,1] .DeE--R . . . . . . . . . . . . . vpand %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [0,2] . DeeeER . . . . . . . . . . . . . vcvttps2dq %xmm0, %xmm2			# CHECK-NEXT: [0,2] . DeeeER . . . . . . . . . . . . . vcvttps2dq %xmm0, %xmm2
	# CHECK-NEXT: [0,3] . DeeE-R . . . . . . . . . . . . . vpclmulqdq $0, %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [0,3] . DeeE-R . . . . . . . . . . . . . vpclmulqdq $0, %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [0,4] . DeeeER . . . . . . . . . . . . . vaddps %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [0,4] . DeeeER . . . . . . . . . . . . . vaddps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [0,5] . DeeeeeeeeeeeeeeeeeeeeeER . . . . . . . . . vsqrtps %xmm0, %xmm2			# CHECK-NEXT: [0,5] . DeeeeeeeeeeeeeeeeeeeeeER . . . . . . . . . vsqrtps %xmm0, %xmm2
	# CHECK-NEXT: [0,6] . DeeeE-----------------R . . . . . . . . . vaddps %ymm0, %ymm1, %ymm2			# CHECK-NEXT: [0,6] . DeeeE-----------------R . . . . . . . . . vaddps %ymm0, %ymm1, %ymm2
	# CHECK-NEXT: [0,7] . D===================eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeER vsqrtps %ymm0, %ymm2			# CHECK-NEXT: [0,7] . D===================eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeER . vsqrtps %ymm0, %ymm2
				# CHECK: [1,0] . .DeeeeE--------------------------------------------------------R . vpmulld %xmm0, %xmm1, %xmm2
	# CHECK: [1,0] . .DeeeeE--------------------------------------------------------R vpmulld %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [1,1] . . DeE-----------------------------------------------------------R. vpand %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [1,1] . . DeE----------------------------------------------------------R vpand %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [1,2] . . DeeeE--------------------------------------------------------R. vcvttps2dq %xmm0, %xmm2
	# CHECK-NEXT: [1,2] . . DeeeE-------------------------------------------------------R vcvttps2dq %xmm0, %xmm2			# CHECK-NEXT: [1,3] . . DeeE----------------------------------------------------------R vpclmulqdq $0, %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [1,3] . . DeeE--------------------------------------------------------R vpclmulqdq $0, %xmm0, %xmm1, %xmm2			# CHECK-NEXT: [1,4] . . DeeeE--------------------------------------------------------R vaddps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: [1,4] . . DeeeE------------------------------------------------------R vaddps %xmm0, %xmm1, %xmm2


	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 2 1.0 1.0 28.0 vpmulld %xmm0, %xmm1, %xmm2			# CHECK-NEXT: 0. 2 1.0 1.0 28.0 vpmulld %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: 1. 2 1.0 1.0 30.0 vpand %xmm0, %xmm1, %xmm2			# CHECK-NEXT: 1. 2 1.0 1.0 30.5 vpand %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: 2. 2 1.0 1.0 27.5 vcvttps2dq %xmm0, %xmm2			# CHECK-NEXT: 2. 2 1.0 1.0 28.0 vcvttps2dq %xmm0, %xmm2
	# CHECK-NEXT: 3. 2 1.0 1.0 28.5 vpclmulqdq $0, %xmm0, %xmm1, %xmm2			# CHECK-NEXT: 3. 2 1.0 1.0 29.5 vpclmulqdq $0, %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: 4. 2 1.0 1.0 27.0 vaddps %xmm0, %xmm1, %xmm2			# CHECK-NEXT: 4. 2 1.0 1.0 28.0 vaddps %xmm0, %xmm1, %xmm2
	# CHECK-NEXT: 5. 1 1.0 1.0 0.0 vsqrtps %xmm0, %xmm2			# CHECK-NEXT: 5. 1 1.0 1.0 0.0 vsqrtps %xmm0, %xmm2
	# CHECK-NEXT: 6. 1 1.0 1.0 17.0 vaddps %ymm0, %ymm1, %ymm2			# CHECK-NEXT: 6. 1 1.0 1.0 17.0 vaddps %ymm0, %ymm1, %ymm2
	# CHECK-NEXT: 7. 1 20.0 20.0 0.0 vsqrtps %ymm0, %ymm2			# CHECK-NEXT: 7. 1 20.0 20.0 0.0 vsqrtps %ymm0, %ymm2

test/tools/llvm-mca/X86/BtVer2/register-files-5.s



	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: 0123456789 0123456789 0123456789			# CHECK-NEXT: 0123456789 0123456789 0123456789
	# CHECK-NEXT: Index 0123456789 0123456789 0123456789 0123456789			# CHECK-NEXT: Index 0123456789 0123456789 0123456789 0123456789

	# CHECK: [0,0] DeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeER . . . . . . vdivps %ymm0, %ymm0, %ymm1			# CHECK: [0,0] DeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeER . . . . . . vdivps %ymm0, %ymm0, %ymm1
	# CHECK-NEXT: [0,1] .DeeeE----------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm2			# CHECK-NEXT: [0,1] .DeeeE----------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm2
	# CHECK-NEXT: [0,2] . D=eeeE--------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm3			# CHECK-NEXT: [0,2] . D=eeeE---------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm3
	# CHECK-NEXT: [0,3] . D==eeeE------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm4			# CHECK-NEXT: [0,3] . D==eeeE-------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm4
	# CHECK-NEXT: [0,4] . D===eeeE----------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm5			# CHECK-NEXT: [0,4] . D===eeeE------------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm5
	# CHECK-NEXT: [0,5] . D====eeeE--------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm6			# CHECK-NEXT: [0,5] . D====eeeE----------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm6
	# CHECK-NEXT: [0,6] . .D=====eeeE------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm7			# CHECK-NEXT: [0,6] . .D=====eeeE---------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm7
	# CHECK-NEXT: [0,7] . . D======eeeE----------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm8			# CHECK-NEXT: [0,7] . . D======eeeE-------------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm8
	# CHECK-NEXT: [0,8] . . D=======eeeE--------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm9			# CHECK-NEXT: [0,8] . . D=======eeeE------------------------R. . . . . . vaddps %ymm0, %ymm0, %ymm9
	# CHECK-NEXT: [0,9] . . D========eeeE------------------R . . . . . . vaddps %ymm0, %ymm0, %ymm10			# CHECK-NEXT: [0,9] . . D========eeeE----------------------R. . . . . . vaddps %ymm0, %ymm0, %ymm10
	# CHECK-NEXT: [0,10] . . D=========eeeE----------------R . . . . . . vaddps %ymm0, %ymm0, %ymm11			# CHECK-NEXT: [0,10] . . D=========eeeE---------------------R . . . . . vaddps %ymm0, %ymm0, %ymm11
	# CHECK-NEXT: [0,11] . . .D==========eeeE--------------R . . . . . . vaddps %ymm0, %ymm0, %ymm12			# CHECK-NEXT: [0,11] . . .D==========eeeE-------------------R . . . . . vaddps %ymm0, %ymm0, %ymm12
	# CHECK-NEXT: [0,12] . . . D===========eeeE------------R . . . . . . vaddps %ymm0, %ymm0, %ymm13			# CHECK-NEXT: [0,12] . . . D===========eeeE------------------R . . . . . vaddps %ymm0, %ymm0, %ymm13
	# CHECK-NEXT: [0,13] . . . D============eeeE----------R . . . . . . vaddps %ymm0, %ymm0, %ymm14			# CHECK-NEXT: [0,13] . . . D============eeeE----------------R . . . . . vaddps %ymm0, %ymm0, %ymm14
	# CHECK-NEXT: [0,14] . . . D=============eeeE--------R . . . . . . vaddps %ymm0, %ymm0, %ymm15			# CHECK-NEXT: [0,14] . . . D=============eeeE---------------R . . . . . vaddps %ymm0, %ymm0, %ymm15
	# CHECK-NEXT: [0,15] . . . D==============eeeE------R . . . . . . vaddps %ymm2, %ymm0, %ymm0			# CHECK-NEXT: [0,15] . . . D==============eeeE-------------R . . . . . vaddps %ymm2, %ymm0, %ymm0
	# CHECK-NEXT: [0,16] . . . .D================eeeE---R . . . . . . vaddps %ymm2, %ymm0, %ymm3			# CHECK-NEXT: [0,16] . . . .D================eeeE-----------R . . . . . vaddps %ymm2, %ymm0, %ymm3
	# CHECK-NEXT: [0,17] . . . . D=================eeeE-R . . . . . . vaddps %ymm2, %ymm0, %ymm4			# CHECK-NEXT: [0,17] . . . . D=================eeeE---------R . . . . . vaddps %ymm2, %ymm0, %ymm4
	# CHECK-NEXT: [0,18] . . . . D==================eeeER . . . . . . vaddps %ymm2, %ymm0, %ymm5			# CHECK-NEXT: [0,18] . . . . D==================eeeE--------R. . . . . vaddps %ymm2, %ymm0, %ymm5
	# CHECK-NEXT: [0,19] . . . . D===================eeeER . . . . . . vaddps %ymm2, %ymm0, %ymm6			# CHECK-NEXT: [0,19] . . . . D===================eeeE------R. . . . . vaddps %ymm2, %ymm0, %ymm6
	# CHECK-NEXT: [0,20] . . . . D====================eeeER . . . . . vaddps %ymm2, %ymm0, %ymm7			# CHECK-NEXT: [0,20] . . . . D====================eeeE-----R . . . . vaddps %ymm2, %ymm0, %ymm7
	# CHECK-NEXT: [0,21] . . . . .D=====================eeeER . . . . . vaddps %ymm2, %ymm0, %ymm8			# CHECK-NEXT: [0,21] . . . . .D=====================eeeE---R . . . . vaddps %ymm2, %ymm0, %ymm8
	# CHECK-NEXT: [0,22] . . . . . D======================eeeER. . . . . vaddps %ymm2, %ymm0, %ymm9			# CHECK-NEXT: [0,22] . . . . . D======================eeeE--R . . . . vaddps %ymm2, %ymm0, %ymm9
	# CHECK-NEXT: [0,23] . . . . . D=======================eeeER . . . . vaddps %ymm2, %ymm0, %ymm10			# CHECK-NEXT: [0,23] . . . . . D=======================eeeER . . . . vaddps %ymm2, %ymm0, %ymm10
	# CHECK-NEXT: [0,24] . . . . . D========================eeeER . . . . vaddps %ymm2, %ymm0, %ymm11			# CHECK-NEXT: [0,24] . . . . . D========================eeeER . . . . vaddps %ymm2, %ymm0, %ymm11
	# CHECK-NEXT: [0,25] . . . . . D=========================eeeER . . . vaddps %ymm2, %ymm0, %ymm12			# CHECK-NEXT: [0,25] . . . . . D=========================eeeER . . . vaddps %ymm2, %ymm0, %ymm12
	# CHECK-NEXT: [0,26] . . . . . .D==========================eeeER . . . vaddps %ymm2, %ymm0, %ymm13			# CHECK-NEXT: [0,26] . . . . . .D==========================eeeER . . . vaddps %ymm2, %ymm0, %ymm13
	# CHECK-NEXT: [0,27] . . . . . . D===========================eeeER. . . vaddps %ymm2, %ymm0, %ymm14			# CHECK-NEXT: [0,27] . . . . . . D===========================eeeER. . . vaddps %ymm2, %ymm0, %ymm14
	# CHECK-NEXT: [0,28] . . . . . . D============================eeeER . . vaddps %ymm2, %ymm0, %ymm15			# CHECK-NEXT: [0,28] . . . . . . D============================eeeER . . vaddps %ymm2, %ymm0, %ymm15
	# CHECK-NEXT: [0,29] . . . . . . D=============================eeeER . . vaddps %ymm3, %ymm0, %ymm2			# CHECK-NEXT: [0,29] . . . . . . D=============================eeeER . . vaddps %ymm3, %ymm0, %ymm2
	# CHECK-NEXT: [0,30] . . . . . . D==============================eeeER . vaddps %ymm3, %ymm0, %ymm4			# CHECK-NEXT: [0,30] . . . . . . D==============================eeeER . vaddps %ymm3, %ymm0, %ymm4
	# CHECK-NEXT: [0,31] . . . . . . .D===============================eeeER . vaddps %ymm3, %ymm0, %ymm5			# CHECK-NEXT: [0,31] . . . . . . .D===============================eeeER . vaddps %ymm3, %ymm0, %ymm5
	# CHECK-NEXT: [0,32] . . . . . . . . D========================eeeER vaddps %ymm3, %ymm0, %ymm6			# CHECK-NEXT: [0,32] . . . . . . . . D========================eeeER vaddps %ymm3, %ymm0, %ymm6

tools/llvm-mca/Backend.h

Backend(const llvm::MCSubtargetInfo &Subtarget,
const llvm::MCRegisterInfo &MRI, InstrBuilder &B, SourceMgr &Source,		const llvm::MCRegisterInfo &MRI, InstrBuilder &B, SourceMgr &Source,
unsigned DispatchWidth = 0, unsigned RegisterFileSize = 0,		unsigned DispatchWidth = 0, unsigned RegisterFileSize = 0,
unsigned MaxRetirePerCycle = 0, unsigned LoadQueueSize = 0,		unsigned MaxRetirePerCycle = 0, unsigned LoadQueueSize = 0,
unsigned StoreQueueSize = 0, bool AssumeNoAlias = false)		unsigned StoreQueueSize = 0, bool AssumeNoAlias = false)
: STI(Subtarget), IB(B),		: STI(Subtarget), IB(B),
HWS(llvm::make_unique<Scheduler>(this, Subtarget.getSchedModel(),		HWS(llvm::make_unique<Scheduler>(this, Subtarget.getSchedModel(),
LoadQueueSize, StoreQueueSize,		LoadQueueSize, StoreQueueSize,
AssumeNoAlias)),		AssumeNoAlias)),
DU(llvm::make_unique<DispatchUnit>(		DU(llvm::make_unique<DispatchUnit>(this, Subtarget.getSchedModel(), MRI,
this, STI, MRI, Subtarget.getSchedModel().MicroOpBufferSize,		RegisterFileSize, MaxRetirePerCycle,
RegisterFileSize, MaxRetirePerCycle, DispatchWidth, HWS.get())),		DispatchWidth, HWS.get())),
SM(Source), Cycles(0) {		SM(Source), Cycles(0) {
HWS->setDispatchUnit(DU.get());		HWS->setDispatchUnit(DU.get());
}		}

void run() {		void run() {
while (SM.hasNext() \|\| !DU->isRCUEmpty())		while (SM.hasNext() \|\| !DU->isRCUEmpty())
runCycle(Cycles++);		runCycle(Cycles++);
}		}

tools/llvm-mca/Dispatch.h

private:		private:
unsigned NextAvailableSlotIdx;		unsigned NextAvailableSlotIdx;
unsigned CurrentInstructionSlotIdx;		unsigned CurrentInstructionSlotIdx;
unsigned AvailableSlots;		unsigned AvailableSlots;
unsigned MaxRetirePerCycle; // 0 means no limit.		unsigned MaxRetirePerCycle; // 0 means no limit.
std::vector<RUToken> Queue;		std::vector<RUToken> Queue;
DispatchUnit *Owner;		DispatchUnit *Owner;

		void initialize(const llvm::MCSchedModel &SM);
		courbet: Any reason for this to be in a separate function ? What about moving the ctor definition to the cpp file and inlining the code ? Else when reading the ctor definition here it's not obvious that AvailableSlots might be overridden in initialize().
		andreadb: No reason. I will move the constructor to the cpp file.

public:		public:
RetireControlUnit(unsigned NumSlots, unsigned RPC, DispatchUnit *DU)		RetireControlUnit(const llvm::MCSchedModel &SM, unsigned RetirePerCycle,
		DispatchUnit *DU)
: NextAvailableSlotIdx(0), CurrentInstructionSlotIdx(0),		: NextAvailableSlotIdx(0), CurrentInstructionSlotIdx(0),
AvailableSlots(NumSlots), MaxRetirePerCycle(RPC), Owner(DU) {		AvailableSlots(SM.MicroOpBufferSize), MaxRetirePerCycle(RetirePerCycle),
assert(NumSlots && "Expected at least one slot!");		Owner(DU) {
Queue.resize(NumSlots);		initialize(SM);
}		}

bool isFull() const { return !AvailableSlots; }		bool isFull() const { return !AvailableSlots; }
bool isEmpty() const { return AvailableSlots == Queue.size(); }		bool isEmpty() const { return AvailableSlots == Queue.size(); }
bool isAvailable(unsigned Quantity = 1) const {		bool isAvailable(unsigned Quantity = 1) const {
// Some instructions may declare a number of uOps which exceeds the size		// Some instructions may declare a number of uOps which exceeds the size
// of the reorder buffer. To avoid problems, cap the number of slots to		// of the reorder buffer. To avoid problems, cap the number of slots to
// the size of the reorder buffer.		// the size of the reorder buffer.
[...]		class DispatchUnit {
bool checkRAT(unsigned Index, const Instruction &Inst);		bool checkRAT(unsigned Index, const Instruction &Inst);
bool checkRCU(unsigned Index, const InstrDesc &Desc);		bool checkRCU(unsigned Index, const InstrDesc &Desc);
bool checkScheduler(unsigned Index, const InstrDesc &Desc);		bool checkScheduler(unsigned Index, const InstrDesc &Desc);

void updateRAWDependencies(ReadState &RS, const llvm::MCSubtargetInfo &STI);		void updateRAWDependencies(ReadState &RS, const llvm::MCSubtargetInfo &STI);
void notifyInstructionDispatched(unsigned IID,		void notifyInstructionDispatched(unsigned IID,
llvm::ArrayRef<unsigned> UsedPhysRegs);		llvm::ArrayRef<unsigned> UsedPhysRegs);

public:		public:
DispatchUnit(Backend *B, const llvm::MCSubtargetInfo &STI,		DispatchUnit(Backend *B, const llvm::MCSchedModel &SM,
const llvm::MCRegisterInfo &MRI, unsigned MicroOpBufferSize,		const llvm::MCRegisterInfo &MRI, unsigned RegisterFileSize,
unsigned RegisterFileSize, unsigned MaxRetirePerCycle,		unsigned MaxRetirePerCycle, unsigned MaxDispatchWidth,
unsigned MaxDispatchWidth, Scheduler *Sched)		Scheduler *Sched)
		RKSimon: Are you confident that we won't need llvm::MCSubtargetInfo again anytime soon?
		andreadb (author): Ideally, all the information needed by the dispatch logic should be accessible through the scheduling model. I may be wrong, but I don't think that we will need it again.
: DispatchWidth(MaxDispatchWidth), AvailableEntries(MaxDispatchWidth),		: DispatchWidth(MaxDispatchWidth), AvailableEntries(MaxDispatchWidth),
CarryOver(0U), SC(Sched),		CarryOver(0U), SC(Sched),
RAT(llvm::make_unique<RegisterFile>(STI.getSchedModel(), MRI,		RAT(llvm::make_unique<RegisterFile>(SM, MRI, RegisterFileSize)),
RegisterFileSize)),		RCU(llvm::make_unique<RetireControlUnit>(SM, MaxRetirePerCycle, this)),
RCU(llvm::make_unique<RetireControlUnit>(MicroOpBufferSize,
MaxRetirePerCycle, this)),
Owner(B) {}		Owner(B) {}

unsigned getDispatchWidth() const { return DispatchWidth; }		unsigned getDispatchWidth() const { return DispatchWidth; }

bool isAvailable(unsigned NumEntries) const {		bool isAvailable(unsigned NumEntries) const {
return NumEntries <= AvailableEntries || AvailableEntries == DispatchWidth;		return NumEntries <= AvailableEntries || AvailableEntries == DispatchWidth;
}		}


tools/llvm-mca/Dispatch.cpp

[...]		for (unsigned I = 0, E = getNumRegisterFiles(); I < E; ++I) {
dbgs() << "Register File #" << I;		dbgs() << "Register File #" << I;
const RegisterMappingTracker &RMT = RegisterFiles[I];		const RegisterMappingTracker &RMT = RegisterFiles[I];
dbgs() << "\n TotalMappings: " << RMT.TotalMappings		dbgs() << "\n TotalMappings: " << RMT.TotalMappings
<< "\n NumUsedMappings: " << RMT.NumUsedMappings << '\n';		<< "\n NumUsedMappings: " << RMT.NumUsedMappings << '\n';
}		}
}		}
#endif		#endif

		void RetireControlUnit::initialize(const MCSchedModel &SM) {
		// Check if the scheduling model provides extra information about the machine
		// processor. If so, then use that information to set the reorder buffer size
		// and the maximum number of instructions retired per cycle.
		if (SM.hasExtraProcessorInfo()) {
		const MCExtraProcessorInfo &EPI = SM.getExtraProcessorInfo();
		if (EPI.ReorderBufferSize)
		AvailableSlots = EPI.ReorderBufferSize;
		if (!MaxRetirePerCycle)
		courbet: Can we get rid of the MaxRetirePerCycle ctor parameter? Is there any reason why the user would override this?
		andreadb (author): The original idea was to let users override the retire-per-cycle. However, with this patch, the max-retire-per-cycle can now be set via tablegen. That being said, most processors don't provide a max-retire-per-cycle. This patch would only fix the BtVer2 model. So, I am tempted to keep the flag for now (although I don't have a strong opinion about it). Alternatively, I can commit a separate patch that removes the flag, and then rebase this patch. What do you think?
		courbet: I'm pretty sure that I would stick to tablegen-specified models as a user. And if you're developing your own processor, I hope that you have your own tools :) So the flag does not sound that useful to me, given that it makes this code a bit harder to understand. I don't have a strong opinion though, so feel free to keep the flag if it's useful to you.
		andreadb (author): That's a good point (especially if you have your own tools). It could have been used to experiment/play with the RCU (with SMT processors we might emulate a different retire throughput). That being said, the argument is a bit weak. SMT introduces other issues related to the partitioning of resources which cannot be fully tweaked via flags. If that's okay with you, I am going to commit a change that removes the flag and updates the docs in preparation for this patch. Then I will rebase this patch.
		andreadb (author): I went ahead and removed that flag at r329274.
		MaxRetirePerCycle = EPI.MaxRetirePerCycle;
		}

		assert(AvailableSlots && "Invalid reorder buffer size!");
		Queue.resize(AvailableSlots);
		}
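
The fallback order used by `initialize()` above (start from `MicroOpBufferSize`, let a non-zero `ReorderBufferSize` override it, and only consult the model's `MaxRetirePerCycle` when the constructor argument was zero) can be sketched in isolation. This is a minimal standalone approximation, not the patch's code: `ExtraInfoSketch`, `SchedModelSketch`, `RetireConfig` and `resolveRetireConfig` are hypothetical names standing in for `MCExtraProcessorInfo`, `MCSchedModel` and the constructor/`initialize()` pair.

```cpp
#include <cassert>

// Hypothetical stand-in for MCExtraProcessorInfo (only the two new fields).
struct ExtraInfoSketch {
  unsigned ReorderBufferSize; // 0 means "size unknown".
  unsigned MaxRetirePerCycle; // 0 means "no limit".
};

// Hypothetical stand-in for MCSchedModel.
struct SchedModelSketch {
  unsigned MicroOpBufferSize;
  const ExtraInfoSketch *Extra; // nullptr when the model has no extra info.
};

struct RetireConfig {
  unsigned Slots;
  unsigned MaxRetirePerCycle;
};

// Mirrors the order of checks in the diff: the constructor seeds the values,
// then the optional extra processor info refines them.
RetireConfig resolveRetireConfig(const SchedModelSketch &SM,
                                 unsigned RetirePerCycle) {
  RetireConfig C{SM.MicroOpBufferSize, RetirePerCycle};
  if (SM.Extra) {
    if (SM.Extra->ReorderBufferSize)
      C.Slots = SM.Extra->ReorderBufferSize;
    if (!C.MaxRetirePerCycle)
      C.MaxRetirePerCycle = SM.Extra->MaxRetirePerCycle;
  }
  assert(C.Slots && "Invalid reorder buffer size!");
  return C;
}
```

One consequence of this ordering, at least in the sketch, is that a non-zero constructor argument always wins over the value coming from the model, which is the behavior the review thread above discusses removing.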

// Reserves a number of slots, and returns a new token.		// Reserves a number of slots, and returns a new token.
unsigned RetireControlUnit::reserveSlot(unsigned Index, unsigned NumMicroOps) {		unsigned RetireControlUnit::reserveSlot(unsigned Index, unsigned NumMicroOps) {
assert(isAvailable(NumMicroOps));		assert(isAvailable(NumMicroOps));
unsigned NormalizedQuantity =		unsigned NormalizedQuantity =
std::min(NumMicroOps, static_cast<unsigned>(Queue.size()));		std::min(NumMicroOps, static_cast<unsigned>(Queue.size()));
// Zero latency instructions may have zero mOps. Artificially bump this		// Zero latency instructions may have zero mOps. Artificially bump this
// value to 1. Although zero latency instructions don't consume scheduler		// value to 1. Although zero latency instructions don't consume scheduler
// resources, they still consume one slot in the retire queue.		// resources, they still consume one slot in the retire queue.

utils/TableGen/CodeGenSchedule.h

[...]		struct CodeGenProcModel {
RecVec ReadAdvanceDefs;		RecVec ReadAdvanceDefs;

// Per-operand machine model resources associated with this processor.		// Per-operand machine model resources associated with this processor.
RecVec ProcResourceDefs;		RecVec ProcResourceDefs;

// List of Register Files.		// List of Register Files.
std::vector<CodeGenRegisterFile> RegisterFiles;		std::vector<CodeGenRegisterFile> RegisterFiles;

		// Optional Retire Control Unit definition.
		Record *RetireControlUnit;

CodeGenProcModel(unsigned Idx, std::string Name, Record *MDef,		CodeGenProcModel(unsigned Idx, std::string Name, Record *MDef,
Record *IDef) :		Record *IDef) :
Index(Idx), ModelName(std::move(Name)), ModelDef(MDef), ItinsDef(IDef) {}		Index(Idx), ModelName(std::move(Name)), ModelDef(MDef), ItinsDef(IDef),
		RetireControlUnit(nullptr) {}

bool hasItineraries() const {		bool hasItineraries() const {
return !ItinsDef->getValueAsListOfDefs("IID").empty();		return !ItinsDef->getValueAsListOfDefs("IID").empty();
}		}

bool hasInstrSchedModel() const {		bool hasInstrSchedModel() const {
return !WriteResDefs.empty() || !ItinRWDefs.empty();		return !WriteResDefs.empty() || !ItinRWDefs.empty();
}		}

bool hasExtraProcessorInfo() const {		bool hasExtraProcessorInfo() const {
return !RegisterFiles.empty();		return RetireControlUnit || !RegisterFiles.empty();
}		}

unsigned getProcResourceIdx(Record *PRDef) const;		unsigned getProcResourceIdx(Record *PRDef) const;

bool isUnsupported(const CodeGenInstruction &Inst) const;		bool isUnsupported(const CodeGenInstruction &Inst) const;

#ifndef NDEBUG		#ifndef NDEBUG
void dump() const;		void dump() const;
[...]		private:

void collectSchedRW();		void collectSchedRW();

std::string genRWName(ArrayRef<unsigned> Seq, bool IsRead);		std::string genRWName(ArrayRef<unsigned> Seq, bool IsRead);
unsigned findRWForSequence(ArrayRef<unsigned> Seq, bool IsRead);		unsigned findRWForSequence(ArrayRef<unsigned> Seq, bool IsRead);

void collectSchedClasses();		void collectSchedClasses();

		void collectRetireControlUnits();

void collectRegisterFiles();		void collectRegisterFiles();

		void collectOptionalProcessorInfo();

std::string createSchedClassName(Record *ItinClassDef,		std::string createSchedClassName(Record *ItinClassDef,
ArrayRef<unsigned> OperWrites,		ArrayRef<unsigned> OperWrites,
ArrayRef<unsigned> OperReads);		ArrayRef<unsigned> OperReads);
std::string createSchedClassName(const RecVec &InstDefs);		std::string createSchedClassName(const RecVec &InstDefs);
void createInstRWClass(Record *InstRWDef);		void createInstRWClass(Record *InstRWDef);

void collectProcItins();		void collectProcItins();


utils/TableGen/CodeGenSchedule.cpp

[...]		CodeGenSchedModels::CodeGenSchedModels(RecordKeeper &RK,
// Infer new SchedClasses from SchedVariant.		// Infer new SchedClasses from SchedVariant.
inferSchedClasses();		inferSchedClasses();

// Populate each CodeGenProcModel's WriteResDefs, ReadAdvanceDefs, and		// Populate each CodeGenProcModel's WriteResDefs, ReadAdvanceDefs, and
// ProcResourceDefs.		// ProcResourceDefs.
DEBUG(dbgs() << "\n+++ RESOURCE DEFINITIONS (collectProcResources) +++\n");		DEBUG(dbgs() << "\n+++ RESOURCE DEFINITIONS (collectProcResources) +++\n");
collectProcResources();		collectProcResources();

		// Collect optional processor description.
		collectOptionalProcessorInfo();

		checkCompleteness();
		}

		void CodeGenSchedModels::collectRetireControlUnits() {
		RecVec Units = Records.getAllDerivedDefinitions("RetireControlUnit");

		for (Record *RCU : Units) {
		CodeGenProcModel &PM = getProcModel(RCU->getValueAsDef("SchedModel"));
		if (PM.RetireControlUnit) {
		PrintError(RCU->getLoc(),
		"Expected a single RetireControlUnit definition");
		PrintNote(PM.RetireControlUnit->getLoc(),
		"Previous definition of RetireControlUnit was here");
		}
		PM.RetireControlUnit = RCU;
		}
		}
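
The "at most one RetireControlUnit per scheduling model" check above can be illustrated without TableGen. The following is a rough sketch under stated assumptions: `RCUDefSketch` and `collectRCUs` are hypothetical names, plain strings stand in for `Record *`, and returned messages stand in for `PrintError`/`PrintNote`. Like the loop in the diff, it reports the duplicate and then still records the later definition.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a RetireControlUnit record: its own name and the
// name of the scheduling model it is attached to.
struct RCUDefSketch {
  std::string Name;
  std::string SchedModel;
};

// Records one RCU per model in PerModel and returns one diagnostic for every
// extra definition found for a model that already has one.
std::vector<std::string>
collectRCUs(const std::vector<RCUDefSketch> &Defs,
            std::map<std::string, std::string> &PerModel) {
  std::vector<std::string> Errors;
  for (const RCUDefSketch &D : Defs) {
    auto It = PerModel.find(D.SchedModel);
    if (It != PerModel.end())
      Errors.push_back("Expected a single RetireControlUnit definition for " +
                       D.SchedModel + " (previous definition: " + It->second +
                       ")");
    // As in the diff, the most recent definition is the one that is kept.
    PerModel[D.SchedModel] = D.Name;
  }
  return Errors;
}
```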

		/// Collect optional processor information.
		void CodeGenSchedModels::collectOptionalProcessorInfo() {
// Find register file definitions for each processor.		// Find register file definitions for each processor.
collectRegisterFiles();		collectRegisterFiles();

checkCompleteness();		// Collect processor RetireControlUnit descriptors if available.
		collectRetireControlUnits();
}		}

/// Gather all processor models.		/// Gather all processor models.
void CodeGenSchedModels::collectProcModels() {		void CodeGenSchedModels::collectProcModels() {
RecVec ProcRecords = Records.getAllDerivedDefinitions("Processor");		RecVec ProcRecords = Records.getAllDerivedDefinitions("Processor");
std::sort(ProcRecords.begin(), ProcRecords.end(), LessRecordFieldName());		std::sort(ProcRecords.begin(), ProcRecords.end(), LessRecordFieldName());

// Reserve space because we can. Reallocation would be ok.		// Reserve space because we can. Reallocation would be ok.

utils/TableGen/SubtargetEmitter.cpp

[...]		for (Record *RUDef : ResUnits) {
OS << " " << ProcModel.getProcResourceIdx(RU) << ", ";		OS << " " << ProcModel.getProcResourceIdx(RU) << ", ";
}		}
}		}
OS << " // " << PRDef->getName() << "\n";		OS << " // " << PRDef->getName() << "\n";
}		}
OS << "};\n";		OS << "};\n";
}		}

		static void EmitRetireControlUnitInfo(const CodeGenProcModel &ProcModel,
		raw_ostream &OS) {
		long ReorderBufferSize = 0, MaxRetirePerCycle = 0;
		if (Record *RCU = ProcModel.RetireControlUnit) {
		ReorderBufferSize =
		std::max(ReorderBufferSize, RCU->getValueAsInt("ReorderBufferSize"));
		MaxRetirePerCycle =
		std::max(MaxRetirePerCycle, RCU->getValueAsInt("MaxRetirePerCycle"));
		}

		OS << ReorderBufferSize << ", // ReorderBufferSize\n ";
		OS << MaxRetirePerCycle << ", // MaxRetirePerCycle\n ";
		}
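
The `std::max` calls above are what turn a negative tablegen value into zero, so that "zero or negative" collapses to a single "unknown / no limit" encoding in the generated table. A minimal standalone sketch of that clamping follows; `emitRetireFields` is a hypothetical helper, a `std::ostringstream` stands in for `raw_ostream`, and plain integers stand in for `Record::getValueAsInt`. The sketch uses `int64_t` for both operands so that template deduction for `std::max` does not depend on how the platform defines `int64_t`.

```cpp
#include <algorithm>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper: returns the two initializer fields as text, clamping
// negative inputs to zero. HasRCU models "the processor model defines a
// RetireControlUnit"; without one, both fields are emitted as zero.
std::string emitRetireFields(bool HasRCU, int64_t ROBSize,
                             int64_t RetirePerCycle) {
  int64_t ReorderBufferSize = 0, MaxRetirePerCycle = 0;
  if (HasRCU) {
    ReorderBufferSize = std::max<int64_t>(ReorderBufferSize, ROBSize);
    MaxRetirePerCycle = std::max<int64_t>(MaxRetirePerCycle, RetirePerCycle);
  }

  std::ostringstream OS;
  OS << ReorderBufferSize << ", // ReorderBufferSize\n ";
  OS << MaxRetirePerCycle << ", // MaxRetirePerCycle\n ";
  return OS.str();
}
```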

static void EmitRegisterFileInfo(const CodeGenProcModel &ProcModel,		static void EmitRegisterFileInfo(const CodeGenProcModel &ProcModel,
unsigned NumRegisterFiles,		unsigned NumRegisterFiles,
unsigned NumCostEntries, raw_ostream &OS) {		unsigned NumCostEntries, raw_ostream &OS) {
if (NumRegisterFiles)		if (NumRegisterFiles)
OS << ProcModel.ModelName << "RegisterFiles,\n " << (1 + NumRegisterFiles);		OS << ProcModel.ModelName << "RegisterFiles,\n " << (1 + NumRegisterFiles);
else		else
OS << "nullptr,\n 0,\n ";		OS << "nullptr,\n 0,\n ";

[...]		void SubtargetEmitter::EmitExtraProcessorInfo(const CodeGenProcModel &ProcModel,
// Generate a table of register file descriptors (one entry per each user		// Generate a table of register file descriptors (one entry per each user
// defined register file), and a table of register costs.		// defined register file), and a table of register costs.
unsigned NumCostEntries = EmitRegisterFileTables(ProcModel, OS);		unsigned NumCostEntries = EmitRegisterFileTables(ProcModel, OS);

// Now generate a table for the extra processor info.		// Now generate a table for the extra processor info.
OS << "\nstatic const llvm::MCExtraProcessorInfo " << ProcModel.ModelName		OS << "\nstatic const llvm::MCExtraProcessorInfo " << ProcModel.ModelName
<< "ExtraInfo = {\n ";		<< "ExtraInfo = {\n ";

		// Add information related to the retire control unit.
		EmitRetireControlUnitInfo(ProcModel, OS);

// Add information related to the register files (i.e. where to find register		// Add information related to the register files (i.e. where to find register
// file descriptors and register costs).		// file descriptors and register costs).
EmitRegisterFileInfo(ProcModel, ProcModel.RegisterFiles.size(),		EmitRegisterFileInfo(ProcModel, ProcModel.RegisterFiles.size(),
NumCostEntries, OS);		NumCostEntries, OS);

OS << "};\n";		OS << "};\n";
}		}
