This is an archive of the discontinued LLVM Phabricator instance.

Differential D45360

[MC][TableGen] Add optional libpfm counter names for ProcResUnits.
ClosedPublic

Authored by courbet on Apr 6 2018, 2:51 AM.

Details

Reviewers

gchatelet
RKSimon
andreadb
craig.topper

Commits

rGb449379eae02: [MC][TableGen] Add optional libpfm counter names for ProcResUnits.
rL329675: [MC][TableGen] Add optional libpfm counter names for ProcResUnits.

Summary

Subtargets can define the libpfm counter names that can be used to
measure cycles and uops issued on ProcResUnits.
This allows making llvm-exegesis available on more targets.
Fixes PR36984.

Diff Detail

Repository

rL LLVM

Build Status

Buildable 16876
Build 16876: arc lint + arc unit

Event Timeline

courbet created this revision. Apr 6 2018, 2:51 AM

Harbormaster completed remote builds in B16816: Diff 141302. Apr 6 2018, 2:54 AM

Hi Clement,

I prefer a more flexible design where perf events can be defined for all resources, and not just for ProcResourceUnits.
In future, we may want to add profiling information to resources that are not just pipeline ports. For example, we could add profiling support for hardware scheduler resources. Essentially groups should be considered too.

I also prefer that we move all the "optional" information into MCExtraProcessorInfo. I don't like the idea of adding an extra pointer to every single resource defined by a processor (especially if the processor doesn't care about describing perf counters).

This is just an idea for an alternative design:
You could define a separate tablegen class to describe the mapping of resources to perf events. Something like this:

class PFMCounter;
class DispatchPort0Counter : PFMCounter;
class DispatchPort1Counter : PFMCounter;
 /*etc.,  ... enumerate all counters here ... */

class PerfCounter<ProcResourceKind kind, list<PFMCounter> counters> {
  ProcResourceKind Resource = kind;
  list<PFMCounter> Counters = counters;  // A list of counter IDs.
}

Here, PFMCounters would be used to construct a (subtarget specific) enum type, where each enum value is a reference to a PMC string name.
The generic "unhalted_core_cycles" event would get an enum ID too.
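As a rough illustration of the enum idea above, the following C++ sketch shows what a generated enum and its name table could look like. Every identifier in it (PfmCounterID, PfmCounterNames, getPfmCounterName) is hypothetical and does not come from the patch; only the event names appear elsewhere in this review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical enum that TableGen could emit from the PFMCounter records.
// All names here are illustrative; none of them come from the patch.
enum PfmCounterID : unsigned {
  InvalidCounter = 0,
  UnhaltedCoreCycles,
  DispatchPort0,
  DispatchPort1,
  NumPfmCounters
};

// One libpfm event name per enum value; index 0 is the invalid entry.
static const char *const PfmCounterNames[NumPfmCounters] = {
    nullptr,
    "unhalted_core_cycles",
    "uops_dispatched_port:port_0",
    "uops_dispatched_port:port_1",
};

// Returns the libpfm name for an ID, or nullptr if the ID is out of range
// or refers to the invalid entry.
inline const char *getPfmCounterName(unsigned ID) {
  return ID < NumPfmCounters ? PfmCounterNames[ID] : nullptr;
}
```

With such a table, a resource record would only need to store small integer IDs, and the strings would live in one place per subtarget.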

This is just an idea for an alternative design. Regardless of whether you end up adopting this solution or not, I believe that it is cleaner if we move anything which is optional into MCExtraProcessorInfo.

Just my opinion.
-Andrea

include/llvm/MC/MCSchedule.h
261–263	I think that this should be moved to the MCExtraProcessorInfo data structure.
utils/TableGen/SubtargetEmitter.cpp
775	Can this be moved to `MCExtraProcessorInfo`? I am of the opinion that this information is optional, and it shouldn't be included by default in MCProcResourceDesc. See my other comments.

I'm a bit worried about creating a pfm dependency like this in the core. pfm doesn't come close to covering all targets that we'll want to check in the long term, so it's likely that some targets will require a different library (or raw MSRs).

Is there any way that we can keep more of this in the llvm-exegesis project, for example in config files or embedded in source?

In D45360#1059635, @RKSimon wrote:

I'm a bit worried about creating a pfm dependency like this in the core. pfm doesn't come close to covering all targets that we'll want to check in the long term, so it's likely that some targets will require a different library (or raw MSRs).

Is there any way that we can keep more of this in the llvm-exegesis project, for example in config files or embedded in source?

I tend to agree with Simon.
All of this info could live in a config file within exegesis. Something that describes the mapping between abstract counters for perf events and actual pfm counters.

In D45360#1059570, @andreadb wrote:

Hi Clement,

I prefer that we move all the "optional" information into MCExtraProcessorInfo. I don't like the idea of adding an extra pointer to every single resource defined by a processor (especially if the processor doesn't care about describing perf counters).

Agreed. I've moved all counters to MCExtraProcessorInfo.

I prefer a more flexible design where perf events can be defined for all resources, and not just for ProcResourceUnits.
In future, we may want to add profiling information to resources that are not just pipeline ports. For example, we could add profiling support for hardware scheduler resources. Essentially groups should be considered too.

In terms of generated code, the current design has counters indexed by ProcResIdx, so nothing needs to change if we decide that we want to add counters to groups too.
I agree that the schema you propose would be more generic but I'm reluctant to add more complexity to the TableGen schema until it's clear that we need to support counting other things.
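To make the indexing argument concrete, here is a minimal C++ sketch of a per-model table with one slot per processor resource index. The four-resource model, the array name and the helper getIssueCounter are all assumptions made for illustration; the generated tables quoted later in this review have the same shape but different contents.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical model with four processor resources. Index 0 is the invalid
// resource, 1 and 2 are ports, and 3 is a group made of both ports.
constexpr unsigned NumProcResourceKinds = 4;

// One slot per processor resource index; nullptr means "no counter".
static const char *const ExampleIssueCounters[NumProcResourceKinds] = {
    nullptr,                       // invalid resource
    "uops_dispatched_port:port_0", // Port0
    "uops_dispatched_port:port_1", // Port1
    nullptr,                       // Port01 group: no counter defined yet
};

// Looks up the counter name for a resource index (units and groups alike).
inline const char *getIssueCounter(unsigned ProcResIdx) {
  assert(ProcResIdx < NumProcResourceKinds && "resource index out of range");
  return ExampleIssueCounters[ProcResIdx];
}
```

Because groups already occupy an index in the table, giving a group a counter later would only mean replacing its nullptr slot; the lookup code would stay the same.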

In D45360#1059650, @andreadb wrote:

In D45360#1059635, @RKSimon wrote:

I'm a bit worried about creating a pfm dependency like this in the core. pfm doesn't come close to covering all targets that we'll want to check in the long term, so it's likely that some targets will require a different library (or raw MSRs).

Is there any way that we can keep more of this in the llvm-exegesis project, for example in config files or embedded in source?

I tend to agree with Simon.
All of this info could live in a config file within exegesis. Something that describes the mapping between abstract counters for perf events and actual pfm counters.

If we were using raw events, I would agree. But the whole point of libpfm is to abstract counters. libpfm pretty much *is* this mapping.

Let's summarize what we need and how we can address those needs.

For each subtarget, we need to know:

How to count cycles.
How to count uops issued on each resource.

There are two orthogonal technical aspects to decide on: how to name the counters, and where to store the mapping from subtarget/procres to abstract counter.

As for naming counters, the options that have been cited up to now are:
1 - (this design and Andrea's proposal) counters are identified by an abstract string (I happened to pick the libpfm convention, because that's the lib we've used).
2 - (raw counters) counters are identified by the architecture-specific id, which is an int on Intel but could be something else on other architectures.

I think that (1) is as good as (2), because (2) will have to be something complex to handle all microarchitectures, so I'd rather leave this complexity to be dealt with by libpfm, which was designed for this very purpose.

As for storing the mapping, the options that have been cited up to now are:
A - (this design) counters are a property of the class they semantically relate to.
B - (Andrea's proposal) the mapping is a separate table of pairs of (counter, class it semantically relates to)
C - (Simon's proposal 2) the mapping is a separate config file. I believe this to be a variation of (B) where the table is stored in a different file.
D - (Simon's proposal 1) the mapping is hardcoded in llvm-exegesis.

Reflecting on this a bit more:

Option (B) has the advantage that the information is not stored alongside either the resources or the model but in a separate "table" (in relational terms). It therefore imposes no specific place where the information has to be stored, which fits nicely with (C).
(B) is then simply "store the mapping for subtarget XYZ in file X86SchedXYZ.td", while (C) is "store all mappings in a separate X86Counters.td".

As for (D), I'm very much opposed to storing anything outside of the TD files, because that would mean that we lose the advantage of all the object resolution done in SubtargetEmitter, so it would be very hard to maintain.

In D45360#1059678, @courbet wrote:

Let's summarize what we need and how we can address those needs.

For each subtarget, we need to know:

How to count cycles.

How to count uops issued on each resource.

There are two orthogonal technical aspects to decide on: how to name the counters, and where to store the mapping from subtarget/procres to abstract counter.

As for naming counters, the options that have been cited up to now are:
1 - (this design and Andrea's proposal) counters are identified by an abstract string (I happened to pick the libpfm convention, because that's the lib we've used).
2 - (raw counters) counters are identified by the architecture-specific id, which is an int on Intel but could be something else on other architectures.

I think that (1) is as good as (2), because (2) will have to be something complex to handle all microarchitectures, so I'd rather leave this complexity to be dealt with by libpfm, which was designed for this very purpose.

I am a bit confused. Does this mean that exegesis will always depend on libpfm (so it would only work on systems that provide it, and only for the CPUs known to the libpfm version installed on the system)?

As for storing the mapping, the options that have been cited up to now are:
A - (this design) counters are a property of the class they semantically relate to.
B - (Andrea's proposal) the mapping is a separate table of pairs of (counter, class it semantically relates to)
C - (Simon's proposal 2) the mapping is a separate config file. I believe this to be a variation of (B) where the table is stored in a different file.
D - (Simon's proposal 1) the mapping is hardcoded in llvm-exegesis.

If D is too complicated, then B or C. I don't want to pollute existing abstractions used by the LLVM schedulers with information that is only used by external tools.
I still want targets to be able to strip out (or disable the emission of) that information if they don't need it.

In D45360#1059732, @andreadb wrote:

I am a bit confused. Does it mean that exegesis will always have the dependency on libpfm (which means, it would only work on systems that provide it, for the cpus known by the installed lib version on the system)?

It does not *have to*, but until a better solution comes along, yes :) Just peeking at the tables in libpfm that abstract both OS and hardware really makes me want to not have to handle that!

As for storing the mapping, the options that have been cited up to now are:
A - (this design) counters are a property of the class they semantically relate to.
B - (Andrea's proposal) the mapping is a separate table of pairs of (counter, class it semantically relates to)
C - (Simon's proposal 2) the mapping is a separate config file. I believe this to be a variation of (B) where the table is stored in a different file.
D - (Simon's proposal 1) the mapping is hardcoded in llvm-exegesis.

If D is too complicated, then B or C. I don't want to pollute existing abstractions used by the LLVM schedulers with information which is only used by external tools.
I still want to allow targets to be able to strip off (or disable the emission of ) that information if they don't need it.

I'll go with (C).

I think that (1) is as good as (2), because (2) will have to be something complex to handle all microarchitectures, so I'd rather leave this complexity to be dealt with by libpfm, which was designed for this very purpose.

I am a bit confused. Does it mean that exegesis will always have the dependency on libpfm (which means, it would only work on systems that provide it, for the cpus known by the installed lib version on the system)?

It does not *have to*, but until a better solution comes along, yes :) Just peeking at the tables in libpfm to abstract both OS and hardware really makes me want to not have to handle that !

I agree. You don't want to do that.
I was just curious. I guess this is not going to hurt me, since I mostly develop on Linux. Also, based on http://perfmon2.sourceforge.net/, I think I should be able to get support for Jaguar (which is the CPU I mostly care about) and other AMD chips.

As for storing the mapping, the options that have been cited up to now are:
A - (this design) counters are a property of the class they semantically relate to.
B - (Andrea's proposal) the mapping is a separate table of pairs of (counter, class it semantically relates to)
C - (Simon's proposal 2) the mapping is a separate config file. I believe this to be a variation of (B) where the table is stored in a different file.
D - (Simon's proposal 1) the mapping is hardcoded in llvm-exegesis.

If D is too complicated, then B or C. I don't want to pollute existing abstractions used by the LLVM schedulers with information which is only used by external tools.
I still want to allow targets to be able to strip off (or disable the emission of ) that information if they don't need it.

I'll go with (C).

Cool. Thanks Clement :-)

Implement option (C).

add missing license

Harbormaster completed remote builds in B16885: Diff 141625. Apr 9 2018, 5:24 AM

Nice patch.

Would it be difficult to have a single string table for all the PfmIssueCounters defined by the scheduling models?

At the moment, your patch introduces a PfmIssueCounters table for every model:

static const char* HaswellModelPfmIssueCounters[] = {
  nullptr,
  nullptr,
  nullptr,
  "uops_dispatched_port:port_0,",  //HWPort0
  "uops_dispatched_port:port_1,",  //HWPort1
  "uops_dispatched_port:port_2,",  //HWPort2
  "uops_dispatched_port:port_3,",  //HWPort3
  "uops_dispatched_port:port_4,",  //HWPort4
  "uops_dispatched_port:port_5,",  //HWPort5
  "uops_dispatched_port:port_6,",  //HWPort6
  "uops_dispatched_port:port_7,",  //HWPort7
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
};
<snip>

static const char* SkylakeClientModelPfmIssueCounters[] = {
  nullptr,
  nullptr,
  nullptr,
  "uops_dispatched_port:port_0,",  //SKLPort0
  "uops_dispatched_port:port_1,",  //SKLPort1
  "uops_dispatched_port:port_2,",  //SKLPort2
  "uops_dispatched_port:port_3,",  //SKLPort3
  "uops_dispatched_port:port_4,",  //SKLPort4
  "uops_dispatched_port:port_5,",  //SKLPort5
  "uops_dispatched_port:port_6,",  //SKLPort6
  "uops_dispatched_port:port_7,",  //SKLPort7
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
  nullptr,
};

However, I think this is suboptimal for two reasons:

1 - The per-model table is not well compressed because there is one entry for each processor resource. That means we consume entries even for resources that have no counter associated.
2 - As the code snippet above shows, most strings are duplicated in each table.

If possible, ideally we should have a single compressed string table accessible by all models:

nullptr  // invalid entry
"uops_dispatched_port:port_0,"
"uops_dispatched_port:port_1,"
"uops_dispatched_port:port_2,"
"uops_dispatched_port:port_3,"
"uops_dispatched_port:port_4,"
"uops_dispatched_port:port_5,"
"uops_dispatched_port:port_6,"
"uops_dispatched_port:port_7,"

The compressed table comes with the downside that it requires an extra mapping between processor resource IDs and indices to the string table. That mapping could be stored somewhere in the "ExtraInfo" table.
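The trade-off described above can be sketched in C++ as follows. This is only an illustration under assumed names: SharedPfmCounterNames, the two per-model index arrays and lookupIssueCounter are hypothetical, and the resource layouts of "ModelA" and "ModelB" are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Shared table: each distinct counter name is stored once for all models.
// Index 0 is the invalid entry.
static const char *const SharedPfmCounterNames[] = {
    nullptr,
    "uops_dispatched_port:port_0",
    "uops_dispatched_port:port_1",
};

// Hypothetical per-model mappings from processor resource index to an index
// into the shared table. A 0 means "no counter for this resource".
static const std::uint8_t ModelAIssueCounterIdx[] = {0, 0, 1, 2};
static const std::uint8_t ModelBIssueCounterIdx[] = {0, 1, 2, 0, 0};

// Resolves a resource index to a counter name through the per-model mapping.
// Returns nullptr for out-of-range indices and for resources with no counter.
template <std::size_t N>
const char *lookupIssueCounter(const std::uint8_t (&Map)[N],
                               unsigned ProcResIdx) {
  return ProcResIdx < N ? SharedPfmCounterNames[Map[ProcResIdx]] : nullptr;
}
```

Each model then pays one small integer per resource instead of one pointer, and a string such as "uops_dispatched_port:port_0" exists only once, at the cost of the extra indirection mentioned above.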

I hope this makes sense.

Overall, the patch looks good. I am also okay if you want to commit this patch first, and then improve the design in a follow-up patch.

-Andrea

Thanks Andrea,

In D45360#1061551, @andreadb wrote:

Nice patch.

Would it be difficult to have a single string table for all the PfmIssueCounters defined by the scheduling models?

The compressed table comes with the downside that it requires an extra mapping between processor resource IDs and indices to the string table. That mapping could be stored somewhere in the "ExtraInfo" table.

I hope this makes sense.

Makes sense, this is similar to how scheduling info is stored.
I guess that what we gain from doing this also depends on whether we really do split ExtraInfo to a separate target that only llvm-mca and llvm-exegesis link. If that's the case, then not compressing is not a big deal.

Overall, the patch looks good. I am also okay if you want to commit this patch first, and then improve the design in a follow-up patch.

I'd rather do it in a separate patch to keep matters separate if you don't mind.

Sounds good to me.

Thanks

This revision is now accepted and ready to land. Apr 9 2018, 6:58 AM

Created PR37068 to track compression.

Closed by commit rL329675: [MC][TableGen] Add optional libpfm counter names for ProcResUnits. (authored by courbet). Apr 10 2018, 1:19 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                      Size
include/llvm/MC/MCSchedule.h              12 lines
include/llvm/Target/TargetSchedule.td     27 lines
lib/Target/X86/X86.td                     6 lines
lib/Target/X86/X86PfmCounters.td          58 lines
tools/llvm-exegesis/lib/Latency.cpp       11 lines
tools/llvm-exegesis/lib/Uops.cpp          51 lines
utils/TableGen/CodeGenSchedule.h          10 lines
utils/TableGen/CodeGenSchedule.cpp        22 lines
utils/TableGen/SubtargetEmitter.cpp       62 lines

Diff 141600

include/llvm/MC/MCSchedule.h

[...] (in struct MCExtraProcessorInfo)
   // Actual size of the reorder buffer in hardware.
   unsigned ReorderBufferSize;
   // Number of instructions retired per cycle.
   unsigned MaxRetirePerCycle;
   const MCRegisterFileDesc *RegisterFiles;
   unsigned NumRegisterFiles;
   const MCRegisterCostEntry *RegisterCostTable;
   unsigned NumRegisterCostEntries;
+
+  struct PfmCountersInfo {
+    // An optional name of a performance counter that can be used to measure
+    // cycles.
+    const char *CycleCounter;
+
+    // For each MCProcResourceDesc defined by the processor, an optional list of
+    // names of performance counters that can be used to measure the resource
+    // utilization.
+    const char **IssueCounters;
+  };
+  PfmCountersInfo PfmCounters;
 };

 /// Machine model for scheduling, bundling, and heuristics.
 ///
 /// The machine model directly provides basic information about the
 /// microarchitecture to the scheduler in the form of properties. It also
 /// optionally refers to scheduler resource tables and itinerary
 /// tables. Scheduler resource tables model the latency and cost for each
[...] (in struct MCSchedModel)
   unsigned NumProcResourceKinds;
   unsigned NumSchedClasses;
   // Instruction itinerary tables used by InstrItineraryData.
   friend class InstrItineraryData;
   const InstrItinerary *InstrItineraries;

   const MCExtraProcessorInfo *ExtraProcessorInfo;

   bool hasExtraProcessorInfo() const { return ExtraProcessorInfo; }

   unsigned getProcessorID() const { return ProcID; }

andreadb (inline comment, marked Done): I think that this should be moved to the MCExtraProcessorInfo data structure.

   /// Does this machine model include instruction-level scheduling.
   bool hasInstrSchedModel() const { return SchedClassTable; }

   const MCExtraProcessorInfo &getExtraProcessorInfo() const {
     assert(hasExtraProcessorInfo() &&
            "No extra information available for this model");
     return *ExtraProcessorInfo;
[...]

include/llvm/Target/TargetSchedule.td

[...]
 // reservation station.
 //
 // To model both dispatch/issue groups and in-order execution units,
 // create two types of units, one with BufferSize=0 and one with
 // BufferSize=1.
 //
 // SchedModel ties these units to a processor for any stand-alone defs
 // of this class.
-class ProcResourceUnits<ProcResourceKind kind, int num> {
+class ProcResourceUnits<ProcResourceKind kind, int num,
+                        list<string> pfmCounters> {
   ProcResourceKind Kind = kind;
   int NumUnits = num;
   ProcResourceKind Super = ?;
   int BufferSize = -1;
   SchedMachineModel SchedModel = ?;
 }

 // EponymousProcResourceKind helps implement ProcResourceUnits by
 // allowing a ProcResourceUnits definition to reference itself. It
 // should not be referenced anywhere else.
 def EponymousProcResourceKind : ProcResourceKind;

 // Subtargets typically define processor resource kind and number of
 // units in one place.
-class ProcResource<int num> : ProcResourceKind,
-  ProcResourceUnits<EponymousProcResourceKind, num>;
+class ProcResource<int num, list<string> pfmCounters = []> : ProcResourceKind,
+  ProcResourceUnits<EponymousProcResourceKind, num, pfmCounters>;

 class ProcResGroup<list<ProcResource> resources> : ProcResourceKind {
   list<ProcResource> Resources = resources;
   SchedMachineModel SchedModel = ?;
   int BufferSize = -1;
 }

 // A target architecture may define SchedReadWrite types and associate
[...]
 // Models can optionally specify up to one instance of RetireControlUnit per
 // scheduling model.
 class RetireControlUnit<int bufferSize, int retirePerCycle> {
   int ReorderBufferSize = bufferSize;
   int MaxRetirePerCycle = retirePerCycle;
   SchedMachineModel SchedModel = ?;
 }
+
+// Allow the definition of hardware counters.
+class PfmCounter {
+  SchedMachineModel SchedModel = ?;
+}
+
+// Each processor can define how to measure cycles by defining a
+// PfmCycleCounter.
+class PfmCycleCounter<string counter> : PfmCounter {
+  string Counter = counter;
+}
+
+// Each ProcResourceUnits can define how to measure issued uops by defining
+// a PfmIssueCounter.
+class PfmIssueCounter<ProcResourceUnits resource, list<string> counters>
+    : PfmCounter {
+  // The resource units on which uops are issued.
+  ProcResourceUnits Resource = resource;
+  // The list of counters that measure issue events.
+  list<string> Counters = counters;
+}

lib/Target/X86/X86.td

[...]

 def X86 : Target {
   // Information about the instructions...
   let InstructionSet = X86InstrInfo;
   let AssemblyParserVariants = [ATTAsmParserVariant, IntelAsmParserVariant];
   let AssemblyWriters = [ATTAsmWriter, IntelAsmWriter];
   let AllowRegisterRenaming = 1;
 }
+
+//===----------------------------------------------------------------------===//
+// Pfm Counters
+//===----------------------------------------------------------------------===//
+
+include "X86PfmCounters.td"

lib/Target/X86/X86PfmCounters.td

This file was added.

				let SchedModel = SandyBridgeModel in {
				def SBCycleCounter : PfmCycleCounter<"unhalted_core_cycles">;
				def SBPort0Counter : PfmIssueCounter<SBPort0, ["uops_dispatched_port:port_0"]>;
				def SBPort1Counter : PfmIssueCounter<SBPort1, ["uops_dispatched_port:port_1"]>;
				def SBPort23Counter : PfmIssueCounter<SBPort23,
				["uops_dispatched_port:port_2",
				"uops_dispatched_port:port_3"]>;
				def SBPort4Counter : PfmIssueCounter<SBPort4, ["uops_dispatched_port:port_4"]>;
				def SBPort5Counter : PfmIssueCounter<SBPort5, ["uops_dispatched_port:port_5"]>;
				}

				let SchedModel = HaswellModel in {
				def HWCycleCounter : PfmCycleCounter<"unhalted_core_cycles">;
				def HWPort0Counter : PfmIssueCounter<HWPort0, ["uops_dispatched_port:port_0"]>;
				def HWPort1Counter : PfmIssueCounter<HWPort1, ["uops_dispatched_port:port_1"]>;
				def HWPort2Counter : PfmIssueCounter<HWPort2, ["uops_dispatched_port:port_2"]>;
				def HWPort3Counter : PfmIssueCounter<HWPort3, ["uops_dispatched_port:port_3"]>;
				def HWPort4Counter : PfmIssueCounter<HWPort4, ["uops_dispatched_port:port_4"]>;
				def HWPort5Counter : PfmIssueCounter<HWPort5, ["uops_dispatched_port:port_5"]>;
				def HWPort6Counter : PfmIssueCounter<HWPort6, ["uops_dispatched_port:port_6"]>;
				def HWPort7Counter : PfmIssueCounter<HWPort7, ["uops_dispatched_port:port_7"]>;
				}

				let SchedModel = BroadwellModel in {
				def BWCycleCounter : PfmCycleCounter<"unhalted_core_cycles">;
				def BWPort0Counter : PfmIssueCounter<BWPort0, ["uops_dispatched_port:port_0"]>;
				def BWPort1Counter : PfmIssueCounter<BWPort1, ["uops_dispatched_port:port_1"]>;
				def BWPort2Counter : PfmIssueCounter<BWPort2, ["uops_dispatched_port:port_2"]>;
				def BWPort3Counter : PfmIssueCounter<BWPort3, ["uops_dispatched_port:port_3"]>;
				def BWPort4Counter : PfmIssueCounter<BWPort4, ["uops_dispatched_port:port_4"]>;
				def BWPort5Counter : PfmIssueCounter<BWPort5, ["uops_dispatched_port:port_5"]>;
				def BWPort6Counter : PfmIssueCounter<BWPort6, ["uops_dispatched_port:port_6"]>;
				def BWPort7Counter : PfmIssueCounter<BWPort7, ["uops_dispatched_port:port_7"]>;
				}

				let SchedModel = SkylakeClientModel in {
				def SKLCycleCounter : PfmCycleCounter<"unhalted_core_cycles">;
				def SKLPort0Counter : PfmIssueCounter<SKLPort0, ["uops_dispatched_port:port_0"]>;
				def SKLPort1Counter : PfmIssueCounter<SKLPort1, ["uops_dispatched_port:port_1"]>;
				def SKLPort2Counter : PfmIssueCounter<SKLPort2, ["uops_dispatched_port:port_2"]>;
				def SKLPort3Counter : PfmIssueCounter<SKLPort3, ["uops_dispatched_port:port_3"]>;
				def SKLPort4Counter : PfmIssueCounter<SKLPort4, ["uops_dispatched_port:port_4"]>;
				def SKLPort5Counter : PfmIssueCounter<SKLPort5, ["uops_dispatched_port:port_5"]>;
				def SKLPort6Counter : PfmIssueCounter<SKLPort6, ["uops_dispatched_port:port_6"]>;
				def SKLPort7Counter : PfmIssueCounter<SKLPort7, ["uops_dispatched_port:port_7"]>;
				}

				let SchedModel = SkylakeServerModel in {
				def SKXCycleCounter : PfmCycleCounter<"unhalted_core_cycles">;
				def SKXPort0Counter : PfmIssueCounter<SKXPort0, ["uops_dispatched_port:port_0"]>;
				def SKXPort1Counter : PfmIssueCounter<SKXPort1, ["uops_dispatched_port:port_1"]>;
				def SKXPort2Counter : PfmIssueCounter<SKXPort2, ["uops_dispatched_port:port_2"]>;
				def SKXPort3Counter : PfmIssueCounter<SKXPort3, ["uops_dispatched_port:port_3"]>;
				def SKXPort4Counter : PfmIssueCounter<SKXPort4, ["uops_dispatched_port:port_4"]>;
				def SKXPort5Counter : PfmIssueCounter<SKXPort5, ["uops_dispatched_port:port_5"]>;
				def SKXPort6Counter : PfmIssueCounter<SKXPort6, ["uops_dispatched_port:port_6"]>;
				def SKXPort7Counter : PfmIssueCounter<SKXPort7, ["uops_dispatched_port:port_7"]>;
				}

tools/llvm-exegesis/lib/Latency.cpp

	[70 lines not shown]
	std::vector<BenchmarkMeasure>			std::vector<BenchmarkMeasure>
	LatencyBenchmarkRunner::runMeasurements(const LLVMState &State,			LatencyBenchmarkRunner::runMeasurements(const LLVMState &State,
	const JitFunction &Function,			const JitFunction &Function,
	const unsigned NumRepetitions) const {			const unsigned NumRepetitions) const {
	// Cycle measurements include some overhead from the kernel. Repeat the			// Cycle measurements include some overhead from the kernel. Repeat the
	// measure several times and take the minimum value.			// measure several times and take the minimum value.
	constexpr const int NumMeasurements = 30;			constexpr const int NumMeasurements = 30;
	int64_t MinLatency = std::numeric_limits<int64_t>::max();			int64_t MinLatency = std::numeric_limits<int64_t>::max();
	// FIXME: Read the perf event from the MCSchedModel (see PR36984).			const char *CounterName = State.getSubtargetInfo()
	const pfm::PerfEvent CyclesPerfEvent("UNHALTED_CORE_CYCLES");			.getSchedModel()
				.getExtraProcessorInfo()
				.PfmCounters.CycleCounter;
				if (!CounterName)
				llvm::report_fatal_error("sched model does not define a cycle counter");
				const pfm::PerfEvent CyclesPerfEvent(CounterName);
	if (!CyclesPerfEvent.valid())			if (!CyclesPerfEvent.valid())
	llvm::report_fatal_error("invalid perf event 'UNHALTED_CORE_CYCLES'");			llvm::report_fatal_error("invalid perf event");
	for (size_t I = 0; I < NumMeasurements; ++I) {			for (size_t I = 0; I < NumMeasurements; ++I) {
	pfm::Counter Counter(CyclesPerfEvent);			pfm::Counter Counter(CyclesPerfEvent);
	Counter.start();			Counter.start();
	Function();			Function();
	Counter.stop();			Counter.stop();
	const int64_t Value = Counter.read();			const int64_t Value = Counter.read();
	if (Value < MinLatency)			if (Value < MinLatency)
	MinLatency = Value;			MinLatency = Value;
	}			}
	return {{"latency", static_cast<double>(MinLatency) / NumRepetitions, ""}};			return {{"latency", static_cast<double>(MinLatency) / NumRepetitions, ""}};
	}			}

	} // namespace exegesis			} // namespace exegesis
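
The measurement strategy in `LatencyBenchmarkRunner::runMeasurements` (repeat the measurement, keep the minimum, then divide by the repetition count) can be sketched in isolation. This is only an illustration: `minOverRuns`, `latencyPerInstruction`, and the `MeasureOnce` callback are hypothetical names standing in for the `pfm::Counter` start/stop/read sequence, and are not part of llvm-exegesis.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical helper: MeasureOnce stands in for one start/run/stop/read
// cycle on a perf counter. Kernel overhead can only inflate a reading, so
// the smallest of several readings is taken as the best estimate.
int64_t minOverRuns(const std::function<int64_t()> &MeasureOnce,
                    int NumMeasurements) {
  assert(NumMeasurements > 0 && "need at least one measurement");
  int64_t Min = std::numeric_limits<int64_t>::max();
  for (int I = 0; I < NumMeasurements; ++I) {
    const int64_t Value = MeasureOnce();
    if (Value < Min)
      Min = Value;
  }
  return Min;
}

// The measured snippet repeats the instruction NumRepetitions times, so the
// per-instruction figure is the minimum cycle count divided by that number.
double latencyPerInstruction(int64_t MinCycles, unsigned NumRepetitions) {
  assert(NumRepetitions > 0 && "repetition count must be non-zero");
  return static_cast<double>(MinCycles) / NumRepetitions;
}
```

Taking the minimum rather than the mean reflects the comment in the diff: the noise is one-sided, so outliers are always too high.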

tools/llvm-exegesis/lib/Uops.cpp

[32 lines not shown]	static bool isInvalidOperand(const llvm::MCOperandInfo &OpInfo) {
}		}
}		}

static llvm::Error makeError(llvm::Twine Msg) {		static llvm::Error makeError(llvm::Twine Msg) {
return llvm::make_error<llvm::StringError>(Msg,		return llvm::make_error<llvm::StringError>(Msg,
llvm::inconvertibleErrorCode());		llvm::inconvertibleErrorCode());
}		}

// FIXME: Read the counter names from the ProcResourceUnits when PR36984 is
// fixed.
static const std::string *getEventNameFromProcResName(const char *ProcResName) {
static const std::unordered_map<std::string, std::string> Entries = {
{"SBPort0", "UOPS_DISPATCHED_PORT:PORT_0"},
{"SBPort1", "UOPS_DISPATCHED_PORT:PORT_1"},
{"SBPort4", "UOPS_DISPATCHED_PORT:PORT_4"},
{"SBPort5", "UOPS_DISPATCHED_PORT:PORT_5"},
{"HWPort0", "UOPS_DISPATCHED_PORT:PORT_0"},
{"HWPort1", "UOPS_DISPATCHED_PORT:PORT_1"},
{"HWPort2", "UOPS_DISPATCHED_PORT:PORT_2"},
{"HWPort3", "UOPS_DISPATCHED_PORT:PORT_3"},
{"HWPort4", "UOPS_DISPATCHED_PORT:PORT_4"},
{"HWPort5", "UOPS_DISPATCHED_PORT:PORT_5"},
{"HWPort6", "UOPS_DISPATCHED_PORT:PORT_6"},
{"HWPort7", "UOPS_DISPATCHED_PORT:PORT_7"},
{"SKLPort0", "UOPS_DISPATCHED_PORT:PORT_0"},
{"SKLPort1", "UOPS_DISPATCHED_PORT:PORT_1"},
{"SKLPort2", "UOPS_DISPATCHED_PORT:PORT_2"},
{"SKLPort3", "UOPS_DISPATCHED_PORT:PORT_3"},
{"SKLPort4", "UOPS_DISPATCHED_PORT:PORT_4"},
{"SKLPort5", "UOPS_DISPATCHED_PORT:PORT_5"},
{"SKLPort6", "UOPS_DISPATCHED_PORT:PORT_6"},
{"SKXPort7", "UOPS_DISPATCHED_PORT:PORT_7"},
{"SKXPort0", "UOPS_DISPATCHED_PORT:PORT_0"},
{"SKXPort1", "UOPS_DISPATCHED_PORT:PORT_1"},
{"SKXPort2", "UOPS_DISPATCHED_PORT:PORT_2"},
{"SKXPort3", "UOPS_DISPATCHED_PORT:PORT_3"},
{"SKXPort4", "UOPS_DISPATCHED_PORT:PORT_4"},
{"SKXPort5", "UOPS_DISPATCHED_PORT:PORT_5"},
{"SKXPort6", "UOPS_DISPATCHED_PORT:PORT_6"},
{"SKXPort7", "UOPS_DISPATCHED_PORT:PORT_7"},
};
const auto It = Entries.find(ProcResName);
return It == Entries.end() ? nullptr : &It->second;
}

static std::vector<llvm::MCInst> generateIndependentAssignments(		static std::vector<llvm::MCInst> generateIndependentAssignments(
const LLVMState &State, const llvm::MCInstrDesc &InstrDesc,		const LLVMState &State, const llvm::MCInstrDesc &InstrDesc,
llvm::SmallVector<Variable, 8> Vars, int MaxAssignments) {		llvm::SmallVector<Variable, 8> Vars, int MaxAssignments) {
std::unordered_set<llvm::MCPhysReg> IsUsedByAnyVar;		std::unordered_set<llvm::MCPhysReg> IsUsedByAnyVar;
for (const Variable &Var : Vars) {		for (const Variable &Var : Vars) {
if (Var.IsUse) {		if (Var.IsUse) {
IsUsedByAnyVar.insert(Var.PossibleRegisters.begin(),		IsUsedByAnyVar.insert(Var.PossibleRegisters.begin(),
Var.PossibleRegisters.end());		Var.PossibleRegisters.end());
[137 lines not shown]
UopsBenchmarkRunner::runMeasurements(const LLVMState &State,		UopsBenchmarkRunner::runMeasurements(const LLVMState &State,
const JitFunction &Function,		const JitFunction &Function,
const unsigned NumRepetitions) const {		const unsigned NumRepetitions) const {
const auto &SchedModel = State.getSubtargetInfo().getSchedModel();		const auto &SchedModel = State.getSubtargetInfo().getSchedModel();

std::vector<BenchmarkMeasure> Result;		std::vector<BenchmarkMeasure> Result;
for (unsigned ProcResIdx = 1;		for (unsigned ProcResIdx = 1;
ProcResIdx < SchedModel.getNumProcResourceKinds(); ++ProcResIdx) {		ProcResIdx < SchedModel.getNumProcResourceKinds(); ++ProcResIdx) {
const llvm::MCProcResourceDesc &ProcRes =		const char *const PfmCounters = SchedModel.getExtraProcessorInfo()
*SchedModel.getProcResource(ProcResIdx);		.PfmCounters.IssueCounters[ProcResIdx];
const std::string *const EventName =		if (!PfmCounters)
getEventNameFromProcResName(ProcRes.Name);
if (!EventName)
continue;		continue;
pfm::Counter Counter{pfm::PerfEvent(*EventName)};		// FIXME: Sum results when there are several counters for a single ProcRes
		// (e.g. P23 on SandyBridge).
		pfm::Counter Counter{pfm::PerfEvent(PfmCounters)};
Counter.start();		Counter.start();
Function();		Function();
Counter.stop();		Counter.stop();
Result.push_back({llvm::itostr(ProcResIdx),		Result.push_back({llvm::itostr(ProcResIdx),
static_cast<double>(Counter.read()) / NumRepetitions,		static_cast<double>(Counter.read()) / NumRepetitions,
ProcRes.Name});		SchedModel.getProcResource(ProcResIdx)->Name});
}		}
return Result;		return Result;
}		}

} // namespace exegesis		} // namespace exegesis
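
The new loop in `UopsBenchmarkRunner::runMeasurements` relies on a table of counter names indexed by `ProcResIdx`, where a null entry means "no counter for this resource". A minimal sketch of that lookup follows. `PfmCountersSketch` and `collectIssueCounters` are hypothetical names that only mirror the layout implied by the diff; they are not the real `MCExtraProcessorInfo` API. The null-table guard is an added assumption, covering the case where TableGen emits `nullptr` because a model defines no issue counters.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical mirror of the emitted data; not LLVM's actual struct.
struct PfmCountersSketch {
  const char *CycleCounter;         // nullptr if the model defines none.
  const char *const *IssueCounters; // Indexed by ProcResIdx; nullptr entries
                                    // mean the resource has no counter.
};

// Returns (ProcResIdx, counter name) for every resource that has a counter.
// Index 0 is skipped because it is the "InvalidUnit" placeholder resource.
std::vector<std::pair<unsigned, std::string>>
collectIssueCounters(const PfmCountersSketch &Counters,
                     unsigned NumProcResourceKinds) {
  std::vector<std::pair<unsigned, std::string>> Result;
  if (!Counters.IssueCounters) // Assumption: the whole table may be absent.
    return Result;
  for (unsigned Idx = 1; Idx < NumProcResourceKinds; ++Idx) {
    const char *Name = Counters.IssueCounters[Idx];
    if (!Name)
      continue; // No counter defined for this resource.
    Result.emplace_back(Idx, Name);
  }
  return Result;
}
```

Keeping the table sparse (null entries rather than a separate map) lets the runner index it directly with the same `ProcResIdx` used for `getProcResource`.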

utils/TableGen/CodeGenSchedule.h

[232 lines not shown]	struct CodeGenProcModel {
RecVec ProcResourceDefs;		RecVec ProcResourceDefs;

// List of Register Files.		// List of Register Files.
std::vector<CodeGenRegisterFile> RegisterFiles;		std::vector<CodeGenRegisterFile> RegisterFiles;

// Optional Retire Control Unit definition.		// Optional Retire Control Unit definition.
Record *RetireControlUnit;		Record *RetireControlUnit;

		// List of PfmCounters.
		RecVec PfmIssueCounterDefs;
		Record *PfmCycleCounterDef = nullptr;

CodeGenProcModel(unsigned Idx, std::string Name, Record *MDef,		CodeGenProcModel(unsigned Idx, std::string Name, Record *MDef,
Record *IDef) :		Record *IDef) :
Index(Idx), ModelName(std::move(Name)), ModelDef(MDef), ItinsDef(IDef),		Index(Idx), ModelName(std::move(Name)), ModelDef(MDef), ItinsDef(IDef),
RetireControlUnit(nullptr) {}		RetireControlUnit(nullptr) {}

bool hasItineraries() const {		bool hasItineraries() const {
return !ItinsDef->getValueAsListOfDefs("IID").empty();		return !ItinsDef->getValueAsListOfDefs("IID").empty();
}		}

bool hasInstrSchedModel() const {		bool hasInstrSchedModel() const {
return !WriteResDefs.empty() || !ItinRWDefs.empty();		return !WriteResDefs.empty() || !ItinRWDefs.empty();
}		}

bool hasExtraProcessorInfo() const {		bool hasExtraProcessorInfo() const {
return RetireControlUnit || !RegisterFiles.empty();		return RetireControlUnit || !RegisterFiles.empty() ||
		!PfmIssueCounterDefs.empty() ||
		PfmCycleCounterDef != nullptr;
}		}

unsigned getProcResourceIdx(Record *PRDef) const;		unsigned getProcResourceIdx(Record *PRDef) const;

bool isUnsupported(const CodeGenInstruction &Inst) const;		bool isUnsupported(const CodeGenInstruction &Inst) const;

#ifndef NDEBUG		#ifndef NDEBUG
void dump() const;		void dump() const;
[175 lines not shown]	private:
unsigned findRWForSequence(ArrayRef<unsigned> Seq, bool IsRead);		unsigned findRWForSequence(ArrayRef<unsigned> Seq, bool IsRead);

void collectSchedClasses();		void collectSchedClasses();

void collectRetireControlUnits();		void collectRetireControlUnits();

void collectRegisterFiles();		void collectRegisterFiles();

		void collectPfmCounters();

void collectOptionalProcessorInfo();		void collectOptionalProcessorInfo();

std::string createSchedClassName(Record *ItinClassDef,		std::string createSchedClassName(Record *ItinClassDef,
ArrayRef<unsigned> OperWrites,		ArrayRef<unsigned> OperWrites,
ArrayRef<unsigned> OperReads);		ArrayRef<unsigned> OperReads);
std::string createSchedClassName(const RecVec &InstDefs);		std::string createSchedClassName(const RecVec &InstDefs);
void createInstRWClass(Record *InstRWDef);		void createInstRWClass(Record *InstRWDef);

[39 lines not shown]

utils/TableGen/CodeGenSchedule.cpp

[233 lines not shown]

/// Collect optional processor information.		/// Collect optional processor information.
void CodeGenSchedModels::collectOptionalProcessorInfo() {		void CodeGenSchedModels::collectOptionalProcessorInfo() {
// Find register file definitions for each processor.		// Find register file definitions for each processor.
collectRegisterFiles();		collectRegisterFiles();

// Collect processor RetireControlUnit descriptors if available.		// Collect processor RetireControlUnit descriptors if available.
collectRetireControlUnits();		collectRetireControlUnits();

		// Find pfm counter definitions for each processor.
		collectPfmCounters();

		checkCompleteness();
}		}

/// Gather all processor models.		/// Gather all processor models.
void CodeGenSchedModels::collectProcModels() {		void CodeGenSchedModels::collectProcModels() {
RecVec ProcRecords = Records.getAllDerivedDefinitions("Processor");		RecVec ProcRecords = Records.getAllDerivedDefinitions("Processor");
llvm::sort(ProcRecords.begin(), ProcRecords.end(), LessRecordFieldName());		llvm::sort(ProcRecords.begin(), ProcRecords.end(), LessRecordFieldName());

// Reserve space because we can. Reallocation would be ok.		// Reserve space because we can. Reallocation would be ok.
[1,282 lines not shown]	for (Record *RF : RegisterFileDefs) {
std::vector<int64_t> RegisterCosts = RF->getValueAsListOfInts("RegCosts");		std::vector<int64_t> RegisterCosts = RF->getValueAsListOfInts("RegCosts");
for (unsigned I = 0, E = RegisterClasses.size(); I < E; ++I) {		for (unsigned I = 0, E = RegisterClasses.size(); I < E; ++I) {
int Cost = RegisterCosts.size() > I ? RegisterCosts[I] : 1;		int Cost = RegisterCosts.size() > I ? RegisterCosts[I] : 1;
CGRF.Costs.emplace_back(RegisterClasses[I], Cost);		CGRF.Costs.emplace_back(RegisterClasses[I], Cost);
}		}
}		}
}		}

		// Collect all the PfmCounter definitions available in this target.
		void CodeGenSchedModels::collectPfmCounters() {
		for (Record *Def : Records.getAllDerivedDefinitions("PfmIssueCounter")) {
		CodeGenProcModel &PM = getProcModel(Def->getValueAsDef("SchedModel"));
		PM.PfmIssueCounterDefs.emplace_back(Def);
		}
		for (Record *Def : Records.getAllDerivedDefinitions("PfmCycleCounter")) {
		CodeGenProcModel &PM = getProcModel(Def->getValueAsDef("SchedModel"));
		if (PM.PfmCycleCounterDef) {
		PrintFatalError(Def->getLoc(),
		"multiple cycle counters for " +
		Def->getValueAsDef("SchedModel")->getName());
		}
		PM.PfmCycleCounterDef = Def;
		}
		}

// Collect and sort WriteRes, ReadAdvance, and ProcResources.		// Collect and sort WriteRes, ReadAdvance, and ProcResources.
void CodeGenSchedModels::collectProcResources() {		void CodeGenSchedModels::collectProcResources() {
ProcResourceDefs = Records.getAllDerivedDefinitions("ProcResourceUnits");		ProcResourceDefs = Records.getAllDerivedDefinitions("ProcResourceUnits");
ProcResGroups = Records.getAllDerivedDefinitions("ProcResGroup");		ProcResGroups = Records.getAllDerivedDefinitions("ProcResGroup");

// Add any subtarget-specific SchedReadWrites that are directly associated		// Add any subtarget-specific SchedReadWrites that are directly associated
// with processor resources. Refer to the parent SchedClass's ProcIndices to		// with processor resources. Refer to the parent SchedClass's ProcIndices to
// determine which processors they apply to.		// determine which processors they apply to.
[404 lines not shown]

utils/TableGen/SubtargetEmitter.cpp

[629 lines not shown]	static void EmitRegisterFileInfo(const CodeGenProcModel &ProcModel,
else		else
OS << "nullptr,\n 0";		OS << "nullptr,\n 0";

OS << ", // Number of register files.\n ";		OS << ", // Number of register files.\n ";
if (NumCostEntries)		if (NumCostEntries)
OS << ProcModel.ModelName << "RegisterCosts,\n ";		OS << ProcModel.ModelName << "RegisterCosts,\n ";
else		else
OS << "nullptr,\n ";		OS << "nullptr,\n ";
OS << NumCostEntries << " // Number of register cost entries.\n";		OS << NumCostEntries << ", // Number of register cost entries.\n";
}		}

unsigned		unsigned
SubtargetEmitter::EmitRegisterFileTables(const CodeGenProcModel &ProcModel,		SubtargetEmitter::EmitRegisterFileTables(const CodeGenProcModel &ProcModel,
raw_ostream &OS) {		raw_ostream &OS) {
if (llvm::all_of(ProcModel.RegisterFiles, [](const CodeGenRegisterFile &RF) {		if (llvm::all_of(ProcModel.RegisterFiles, [](const CodeGenRegisterFile &RF) {
return RF.hasDefaultCosts();		return RF.hasDefaultCosts();
}))		}))
[34 lines not shown]	for (const CodeGenRegisterFile &RD : ProcModel.RegisterFiles) {
unsigned NumCostEntries = RD.Costs.size();		unsigned NumCostEntries = RD.Costs.size();
OS << NumCostEntries << ", " << CostTblIndex << "},\n";		OS << NumCostEntries << ", " << CostTblIndex << "},\n";
CostTblIndex += NumCostEntries;		CostTblIndex += NumCostEntries;
}		}
OS << "};\n";		OS << "};\n";

return CostTblIndex;		return CostTblIndex;
}		}
		static bool EmitPfmIssueCountersTable(const CodeGenProcModel &ProcModel,
		raw_ostream &OS) {
		std::vector<const Record *> CounterDefs(ProcModel.ProcResourceDefs.size());
		bool HasCounters = false;
		for (const Record *CounterDef : ProcModel.PfmIssueCounterDefs) {
		const Record *&CD = CounterDefs[ProcModel.getProcResourceIdx(
		CounterDef->getValueAsDef("Resource"))];
		if (CD) {
		PrintFatalError(CounterDef->getLoc(),
		"multiple issue counters for " +
		CounterDef->getValueAsDef("Resource")->getName());
		}
		CD = CounterDef;
		HasCounters = true;
		}
		if (!HasCounters) {
		return false;
		}
		OS << "\nstatic const char* " << ProcModel.ModelName
		<< "PfmIssueCounters[] = {\n";
		for (const Record *CounterDef : CounterDefs) {
		if (CounterDef) {
		const auto PfmCounters = CounterDef->getValueAsListOfStrings("Counters");
		if (PfmCounters.empty())
		PrintFatalError(CounterDef->getLoc(), "empty counter list");
		for (const StringRef CounterName : PfmCounters)
		OS << " \"" << CounterName << ",\"";
		OS << ", //" << CounterDef->getValueAsDef("Resource")->getName() << "\n";
		} else {
		OS << " nullptr,\n";
		}
		}
		OS << "};\n";
		return true;
		}

		static void EmitPfmCounters(const CodeGenProcModel &ProcModel,
		const bool HasPfmIssueCounters, raw_ostream &OS) {
		// Emit the cycle counter.
		if (ProcModel.PfmCycleCounterDef)
		OS << " \"" << ProcModel.PfmCycleCounterDef->getValueAsString("Counter")
		<< "\", // Cycle counter.\n";
		else
		OS << " nullptr, // No cycle counter.\n";

		// Emit a reference to issue counters table.
		if (HasPfmIssueCounters)
		OS << " " << ProcModel.ModelName << "PfmIssueCounters\n";
		else
		OS << " nullptr, // No issue counters.\n";
		}

void SubtargetEmitter::EmitExtraProcessorInfo(const CodeGenProcModel &ProcModel,		void SubtargetEmitter::EmitExtraProcessorInfo(const CodeGenProcModel &ProcModel,
raw_ostream &OS) {		raw_ostream &OS) {
// Generate a table of register file descriptors (one entry per each user		// Generate a table of register file descriptors (one entry per each user
// defined register file), and a table of register costs.		// defined register file), and a table of register costs.
unsigned NumCostEntries = EmitRegisterFileTables(ProcModel, OS);		unsigned NumCostEntries = EmitRegisterFileTables(ProcModel, OS);

		// Generate a table of ProcRes counter names.
		const bool HasPfmIssueCounters = EmitPfmIssueCountersTable(ProcModel, OS);

// Now generate a table for the extra processor info.		// Now generate a table for the extra processor info.
OS << "\nstatic const llvm::MCExtraProcessorInfo " << ProcModel.ModelName		OS << "\nstatic const llvm::MCExtraProcessorInfo " << ProcModel.ModelName
<< "ExtraInfo = {\n ";		<< "ExtraInfo = {\n ";

// Add information related to the retire control unit.		// Add information related to the retire control unit.
EmitRetireControlUnitInfo(ProcModel, OS);		EmitRetireControlUnitInfo(ProcModel, OS);

// Add information related to the register files (i.e. where to find register		// Add information related to the register files (i.e. where to find register
// file descriptors and register costs).		// file descriptors and register costs).
EmitRegisterFileInfo(ProcModel, ProcModel.RegisterFiles.size(),		EmitRegisterFileInfo(ProcModel, ProcModel.RegisterFiles.size(),
NumCostEntries, OS);		NumCostEntries, OS);

		EmitPfmCounters(ProcModel, HasPfmIssueCounters, OS);

OS << "};\n";		OS << "};\n";
}		}

void SubtargetEmitter::EmitProcessorResources(const CodeGenProcModel &ProcModel,		void SubtargetEmitter::EmitProcessorResources(const CodeGenProcModel &ProcModel,
raw_ostream &OS) {		raw_ostream &OS) {
EmitProcessorResourceSubUnits(ProcModel, OS);		EmitProcessorResourceSubUnits(ProcModel, OS);

OS << "\n// {Name, NumUnits, SuperIdx, IsBuffered, SubUnitsIdxBegin}\n";		OS << "\n// {Name, NumUnits, SuperIdx, IsBuffered, SubUnitsIdxBegin}\n";
OS << "static const llvm::MCProcResourceDesc " << ProcModel.ModelName		OS << "static const llvm::MCProcResourceDesc " << ProcModel.ModelName
<< "ProcResources"		<< "ProcResources"
<< "[] = {\n"		<< "[] = {\n"
<< " {\"InvalidUnit\", 0, 0, 0, 0},\n";		<< " {\"InvalidUnit\", 0, 0, 0, 0},\n";
		andreadb (inline comment, done): Can this be moved to `MCExtraProcessorInfo`? I am of the opinion that this information is optional, and it shouldn't be included by default in MCProcResourceDesc. See my other comments.

unsigned SubUnitsOffset = 1;		unsigned SubUnitsOffset = 1;
for (unsigned i = 0, e = ProcModel.ProcResourceDefs.size(); i < e; ++i) {		for (unsigned i = 0, e = ProcModel.ProcResourceDefs.size(); i < e; ++i) {
Record *PRDef = ProcModel.ProcResourceDefs[i];		Record *PRDef = ProcModel.ProcResourceDefs[i];

Record *SuperDef = nullptr;		Record *SuperDef = nullptr;
unsigned SuperIdx = 0;		unsigned SuperIdx = 0;
unsigned NumUnits = 0;		unsigned NumUnits = 0;
[575 lines not shown]	for (const CodeGenProcModel &PM : SchedModels.procModels()) {
else		else
OS << " nullptr, nullptr, 0, 0,"		OS << " nullptr, nullptr, 0, 0,"
<< " // No instruction-level machine model.\n";		<< " // No instruction-level machine model.\n";
if (PM.hasItineraries())		if (PM.hasItineraries())
OS << " " << PM.ItinsDef->getName() << ",\n";		OS << " " << PM.ItinsDef->getName() << ",\n";
else		else
OS << " nullptr, // No Itinerary\n";		OS << " nullptr, // No Itinerary\n";
if (PM.hasExtraProcessorInfo())		if (PM.hasExtraProcessorInfo())
OS << " &" << PM.ModelName << "ExtraInfo\n";		OS << " &" << PM.ModelName << "ExtraInfo,\n";
else		else
OS << " nullptr // No extra processor descriptor\n";		OS << " nullptr // No extra processor descriptor\n";
OS << "};\n";		OS << "};\n";
}		}
}		}

//		//
// EmitProcessorLookup - generate cpu name to itinerary lookup table.		// EmitProcessorLookup - generate cpu name to itinerary lookup table.
//		//
void SubtargetEmitter::EmitProcessorLookup(raw_ostream &OS) {		void SubtargetEmitter::EmitProcessorLookup(raw_ostream &OS) {
[346 lines not shown]