This is an archive of the discontinued LLVM Phabricator instance.

Differential D79351

[MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as dependent (PR45793).
ClosedPublic

Authored by andreadb on May 4 2020, 12:10 PM.


Details

Reviewers

RKSimon
mattd
lebedev.ri

Commits

rG5578ec32f9c4: [MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as…

Summary

This fixes a regression introduced by a very old commit 280ac1fd1dc35 (was llvm-svn 361950).

Commit 280ac1fd1dc35 redesigned the logic in the LSUnit with the goal of speeding up isReady() queries, and stabilising the LSUnit API (while also making the load store unit more customisable).

The concept of MemoryGroup (effectively an alias set) was added by that commit to better describe dependencies between memory operations. However, that concept was used not only to describe simple alias dependencies, but also to describe memory "order" dependencies (enforced by the memory consistency model).
Instructions in the same memory group were considered "equivalent", that is, independent operations that can potentially execute in parallel.
The problem was that the cost of a dependency (in number of cycles) is different when the instruction is in an "order" dependency: it only has to wait for the predecessor to be "issued" on a pipeline (rather than fully executed). For simple "order" dependencies, the old logic was effectively introducing an artificial delay on the "issue" of independent loads and stores.
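
To make the cost difference concrete, here is a minimal sketch. It is not the LSUnit code: `Pred`, `earliestIssueCycle`, and the cycle numbers are hypothetical, and it assumes that an order dependency is satisfied as soon as the predecessor is issued, while a data dependency is satisfied only once the predecessor has fully executed.

```cpp
#include <cassert>

// Hypothetical timing record for a predecessor memory operation.
struct Pred {
  unsigned IssueCycle; // cycle in which the predecessor is issued
  unsigned Latency;    // cycles needed to fully execute it
};

// Earliest cycle in which a dependent memory operation could be issued
// (simplified model, ignoring resource pressure and queue occupancy).
// An order dependency only waits for the predecessor to be issued;
// a data dependency waits for the predecessor to finish executing.
inline unsigned earliestIssueCycle(const Pred &P, bool IsDataDependent) {
  assert(P.Latency > 0 && "A memory operation takes at least one cycle");
  return IsDataDependent ? P.IssueCycle + P.Latency : P.IssueCycle;
}
```

With a predecessor issued at cycle 3 and a latency of 5, the order-dependent successor can go at cycle 3, while the data-dependent one waits until cycle 8. Treating both cases as data dependencies is the artificial delay described above.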

This patch fixes the issue and adds a new test named 'independent-load-stores.s' to several x86 targets. That test contains the reproducer posted by Fabian Ritter on PR45793.

I had to rerun the update-mca-tests script on several files. To avoid expected regressions on some Exynos tests, I have added a -noalias=false flag (to match the old strict behavior on latencies).

Some tests for processor Barcelona are fixed by this change and they now show better results.
In a few tests we were incorrectly counting the time spent by instructions in a scheduler queue (this was also caused by the issue on the delayed start of execution for loads and stores).
In one case in particular we now correctly see a store executed out of order. That test was affected by the same underlying issue reported as PR45793.

Another test related to store barriers has improved as a result of this change. Instruction int3 is treated by llvm-mca as a full memory barrier (since it is mayload/maystore and has unmodelled side effects). So, memory instructions coming after it have to wait until int3 has effectively executed. This was not happening before. This issue is now fixed as a consequence of the rewrite from this patch.

Let me know if OK to commit.
Thanks

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

andreadb created this revision. May 4 2020, 12:10 PM

Herald added a project: Restricted Project. May 4 2020, 12:10 PM

Herald added subscribers: gbedwell, hiraditya.

Patch updated. This time with full context.

mattd added a comment. May 4 2020, 12:41 PM

This comment was removed by mattd.

llvm/include/llvm/MCA/HardwareUnits/LSUnit.h
61	s/unsigned/size_t/ because SmallVectorBase::size returns a size_t.

I'm totally cool with this change; however, it's been a while since I've taken a look at this part of MCA. I'll let others chime in as well, but +1 from me.

llvm/lib/MCA/HardwareUnits/LSUnit.cpp
69	This function is getting pretty large. I'm not sure how to break this up, but it could be useful as a NFC follow-up at a later time.
131	Is !ImmediateLoadDominator necessary here? It seems that would be implied by the next condition: `ImmediateLoadDominator <= CurrentStoreGroupID`. However, I suppose this does read clearer by leaving the !ImmediateLoadDominator check in place.

In D79351#2018657, @mattd wrote:

I'm totally cool with this change; however, it's been a while since I've taken a look at this part of MCA. I'll let others chime in as well, but +1 from me.

Thanks Matt :-)

llvm/include/llvm/MCA/HardwareUnits/LSUnit.h
61	Right. Ideally it should return a size_t. I guess it would hardly be a problem in practice. But you are right. I will fix it.
119	To clarify: the reason for this early exit is that we only need to update the information about a critical "memory" dependency if there is memory aliasing.
llvm/lib/MCA/HardwareUnits/LSUnit.cpp
69	Definitely a good suggestion for a follow-up. :-) This logic can probably be split and moved into a few helper methods/functions.
95	Note that, by construction, `CurrentStoreBarrierGroupID` is always less than or equal to `CurrentStoreGroupID`. This check was missing in the original code from commit 280ac1fd1dc35, and this is exactly why we have a difference in test pr37790.s.
105	This check avoids adding a store dependency twice (since a store can also be a store barrier).
131	The idea is to identify cases where we need a new node in the dependency graph. These are the cases identified by this check:
a) This is a memory barrier (by construction, we always require that barriers are assigned to a different memory group).
b) This is the very first load dispatched to the LSUnit (by construction, we always keep loads and stores in separate groups, so we need to start a new load group).
c) There is an intervening store between the last load dispatched to the LSU and this load.
d) There is no intervening store, but the last load group has already started execution (so we need a new node).
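
The cases above can be sketched as a standalone predicate. This is only an illustration under stated assumptions: `LoadDispatchState` and `shouldCreateNewLoadGroup` are hypothetical names, group ID 0 is assumed to mean "no group", and the `LastLoadGroupIsExecuting` flag stands in for querying the real group. The clause for a load barrier in flight mirrors the condition in the LSUnit.cpp diff rather than one of the lettered cases.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical snapshot of the LSUnit bookkeeping; group ID 0 means "none".
struct LoadDispatchState {
  unsigned CurrentLoadGroupID;
  unsigned CurrentLoadBarrierGroupID;
  unsigned CurrentStoreGroupID;
  bool LastLoadGroupIsExecuting; // stands in for getGroup(...).isExecuting()
};

// Returns true if a newly dispatched load needs its own memory group.
inline bool shouldCreateNewLoadGroup(const LoadDispatchState &S,
                                     bool IsMemBarrier) {
  unsigned ImmediateLoadDominator =
      std::max(S.CurrentLoadGroupID, S.CurrentLoadBarrierGroupID);
  // Only an existing group can be executing.
  assert(!S.LastLoadGroupIsExecuting || ImmediateLoadDominator);

  return IsMemBarrier ||            // a) this load is a barrier
         !ImmediateLoadDominator || // b) very first load
         // a load barrier is the most recent load group
         S.CurrentLoadBarrierGroupID == ImmediateLoadDominator ||
         // c) a store was dispatched after the last load group
         ImmediateLoadDominator <= S.CurrentStoreGroupID ||
         // d) the last load group already started execution
         S.LastLoadGroupIsExecuting;
}
```

Only a non-barrier load that follows another load, with no store or barrier in between and with the previous group not yet started, joins the existing group.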

mattd added inline comments. May 4 2020, 1:46 PM

llvm/lib/MCA/HardwareUnits/LSUnit.cpp
96	nit: You call this StoreGroup here, but StGroup a few blocks below. I really do not care that there is a difference of a few characters, but I could not silence my inner dialog.
131	Those points would make for a great comment! Thanks for the clarification.

Addressed review comments.

mattd accepted this revision. May 4 2020, 3:42 PM

This revision is now accepted and ready to land. May 4 2020, 3:42 PM

Closed by commit rG5578ec32f9c4: [MCA] Fixed a bug where loads and stores were sometimes incorrectly marked as… (authored by andreadb). May 5 2020, 2:39 AM

This revision was automatically updated to reflect the committed changes.

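Revision Contents

Path	Size
llvm/include/llvm/MCA/HardwareUnits/LSUnit.h	49 lines
llvm/lib/MCA/HardwareUnits/LSUnit.cpp	84 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st1.s	6 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st2.s	6 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st3.s	6 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st4.s	6 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/float-store.s	6 lines
llvm/test/tools/llvm-mca/AArch64/Exynos/store.s	6 lines
llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s	221 lines
llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s	40 lines
llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s	215 lines
llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s	6 lines
llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s	48 lines
llvm/test/tools/llvm-mca/X86/BtVer2/independent-load-stores.s	146 lines
llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s	42 lines
llvm/test/tools/llvm-mca/X86/Haswell/independent-load-stores.s	142 lines
llvm/test/tools/llvm-mca/X86/SkylakeClient/independent-load-stores.s	142 lines
llvm/test/tools/llvm-mca/X86/SkylakeServer/independent-load-stores.s	142 lines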

Diff 262042

llvm/include/llvm/MCA/HardwareUnits/LSUnit.h

[34 lines not shown]
class MemoryGroup {		class MemoryGroup {
unsigned NumPredecessors;		unsigned NumPredecessors;
unsigned NumExecutingPredecessors;		unsigned NumExecutingPredecessors;
unsigned NumExecutedPredecessors;		unsigned NumExecutedPredecessors;

unsigned NumInstructions;		unsigned NumInstructions;
unsigned NumExecuting;		unsigned NumExecuting;
unsigned NumExecuted;		unsigned NumExecuted;
SmallVector<MemoryGroup *, 4> Succ;		// Successors that are in a order dependency with this group.
		SmallVector<MemoryGroup *, 4> OrderSucc;
		// Successors that are in a data dependency with this group.
		SmallVector<MemoryGroup *, 4> DataSucc;

CriticalDependency CriticalPredecessor;		CriticalDependency CriticalPredecessor;
InstRef CriticalMemoryInstruction;		InstRef CriticalMemoryInstruction;

MemoryGroup(const MemoryGroup &) = delete;		MemoryGroup(const MemoryGroup &) = delete;
MemoryGroup &operator=(const MemoryGroup &) = delete;		MemoryGroup &operator=(const MemoryGroup &) = delete;

public:		public:
MemoryGroup()		MemoryGroup()
: NumPredecessors(0), NumExecutingPredecessors(0),		: NumPredecessors(0), NumExecutingPredecessors(0),
NumExecutedPredecessors(0), NumInstructions(0), NumExecuting(0),		NumExecutedPredecessors(0), NumInstructions(0), NumExecuting(0),
NumExecuted(0), CriticalPredecessor(), CriticalMemoryInstruction() {}		NumExecuted(0), CriticalPredecessor(), CriticalMemoryInstruction() {}
MemoryGroup(MemoryGroup &&) = default;		MemoryGroup(MemoryGroup &&) = default;

ArrayRef<MemoryGroup *> getSuccessors() const { return Succ; }		size_t getNumSuccessors() const {
		mattd: s/unsigned/size_t/ because SmallVectorBase::size returns a size_t.
		andreadb: Right. Ideally it should return a size_t. I guess it hardly would be a problem in practice. But you are right. I will fix it.
unsigned getNumSuccessors() const { return Succ.size(); }		return OrderSucc.size() + DataSucc.size();
		}
unsigned getNumPredecessors() const { return NumPredecessors; }		unsigned getNumPredecessors() const { return NumPredecessors; }
unsigned getNumExecutingPredecessors() const {		unsigned getNumExecutingPredecessors() const {
return NumExecutingPredecessors;		return NumExecutingPredecessors;
}		}
unsigned getNumExecutedPredecessors() const {		unsigned getNumExecutedPredecessors() const {
return NumExecutedPredecessors;		return NumExecutedPredecessors;
}		}
unsigned getNumInstructions() const { return NumInstructions; }		unsigned getNumInstructions() const { return NumInstructions; }
unsigned getNumExecuting() const { return NumExecuting; }		unsigned getNumExecuting() const { return NumExecuting; }
unsigned getNumExecuted() const { return NumExecuted; }		unsigned getNumExecuted() const { return NumExecuted; }

const InstRef &getCriticalMemoryInstruction() const {		const InstRef &getCriticalMemoryInstruction() const {
return CriticalMemoryInstruction;		return CriticalMemoryInstruction;
}		}
const CriticalDependency &getCriticalPredecessor() const {		const CriticalDependency &getCriticalPredecessor() const {
return CriticalPredecessor;		return CriticalPredecessor;
}		}

void addSuccessor(MemoryGroup *Group) {		void addSuccessor(MemoryGroup *Group, bool IsDataDependent) {
		// Do not need to add a dependency if there is no data
		// dependency and all instructions from this group have been
		// issued already.
		if (!IsDataDependent && isExecuting())
		return;

Group->NumPredecessors++;		Group->NumPredecessors++;
assert(!isExecuted() && "Should have been removed!");		assert(!isExecuted() && "Should have been removed!");
if (isExecuting())		if (isExecuting())
Group->onGroupIssued(CriticalMemoryInstruction);		Group->onGroupIssued(CriticalMemoryInstruction, IsDataDependent);
Succ.emplace_back(Group);
		if (IsDataDependent)
		DataSucc.emplace_back(Group);
		else
		OrderSucc.emplace_back(Group);
}		}

bool isWaiting() const {		bool isWaiting() const {
return NumPredecessors >		return NumPredecessors >
(NumExecutingPredecessors + NumExecutedPredecessors);		(NumExecutingPredecessors + NumExecutedPredecessors);
}		}
bool isPending() const {		bool isPending() const {
return NumExecutingPredecessors &&		return NumExecutingPredecessors &&
((NumExecutedPredecessors + NumExecutingPredecessors) ==		((NumExecutedPredecessors + NumExecutingPredecessors) ==
NumPredecessors);		NumPredecessors);
}		}
bool isReady() const { return NumExecutedPredecessors == NumPredecessors; }		bool isReady() const { return NumExecutedPredecessors == NumPredecessors; }
bool isExecuting() const {		bool isExecuting() const {
return NumExecuting && (NumExecuting == (NumInstructions - NumExecuted));		return NumExecuting && (NumExecuting == (NumInstructions - NumExecuted));
}		}
bool isExecuted() const { return NumInstructions == NumExecuted; }		bool isExecuted() const { return NumInstructions == NumExecuted; }

void onGroupIssued(const InstRef &IR) {		void onGroupIssued(const InstRef &IR, bool ShouldUpdateCriticalDep) {
assert(!isReady() && "Unexpected group-start event!");		assert(!isReady() && "Unexpected group-start event!");
NumExecutingPredecessors++;		NumExecutingPredecessors++;

		if (!ShouldUpdateCriticalDep)
		andreadb: To clarify. The reason why there is this early exit here is because we only need to update the information about a critical "memory" dependency if there is memory aliasing.
		return;

unsigned Cycles = IR.getInstruction()->getCyclesLeft();		unsigned Cycles = IR.getInstruction()->getCyclesLeft();
if (CriticalPredecessor.Cycles < Cycles) {		if (CriticalPredecessor.Cycles < Cycles) {
CriticalPredecessor.IID = IR.getSourceIndex();		CriticalPredecessor.IID = IR.getSourceIndex();
CriticalPredecessor.Cycles = Cycles;		CriticalPredecessor.Cycles = Cycles;
}		}
}		}

void onGroupExecuted() {		void onGroupExecuted() {
[15 lines not shown]	void onInstructionIssued(const InstRef &IR) {
} else {		} else {
CriticalMemoryInstruction = IR;		CriticalMemoryInstruction = IR;
}		}

if (!isExecuting())		if (!isExecuting())
return;		return;

// Notify successors that this group started execution.		// Notify successors that this group started execution.
for (MemoryGroup *MG : Succ)		for (MemoryGroup *MG : OrderSucc) {
MG->onGroupIssued(CriticalMemoryInstruction);		MG->onGroupIssued(CriticalMemoryInstruction, false);
		// Release the order dependency with this group.
		MG->onGroupExecuted();
		}

		for (MemoryGroup *MG : DataSucc)
		MG->onGroupIssued(CriticalMemoryInstruction, true);
}		}

void onInstructionExecuted() {		void onInstructionExecuted() {
assert(isReady() && !isExecuted() && "Invalid internal state!");		assert(isReady() && !isExecuted() && "Invalid internal state!");
--NumExecuting;		--NumExecuting;
++NumExecuted;		++NumExecuted;

if (!isExecuted())		if (!isExecuted())
return;		return;

// Notify successors that this group has finished execution.		// Notify data dependent successors that this group has finished execution.
for (MemoryGroup *MG : Succ)		for (MemoryGroup *MG : DataSucc)
MG->onGroupExecuted();		MG->onGroupExecuted();
}		}

void addInstruction() {		void addInstruction() {
assert(!getNumSuccessors() && "Cannot add instructions to this group!");		assert(!getNumSuccessors() && "Cannot add instructions to this group!");
++NumInstructions;		++NumInstructions;
}		}

[249 lines not shown]	class LSUnit : public LSUnitBase {
// executed before newer stores are issued.		// executed before newer stores are issued.
//		//
// An instruction that both 'MayLoad' and 'HasUnmodeledSideEffects' is		// An instruction that both 'MayLoad' and 'HasUnmodeledSideEffects' is
// conservatively treated as a load barrier. It forces older loads to execute		// conservatively treated as a load barrier. It forces older loads to execute
// before newer loads are issued.		// before newer loads are issued.
unsigned CurrentLoadGroupID;		unsigned CurrentLoadGroupID;
unsigned CurrentLoadBarrierGroupID;		unsigned CurrentLoadBarrierGroupID;
unsigned CurrentStoreGroupID;		unsigned CurrentStoreGroupID;
		unsigned CurrentStoreBarrierGroupID;

public:		public:
LSUnit(const MCSchedModel &SM)		LSUnit(const MCSchedModel &SM)
: LSUnit(SM, /* LQSize */ 0, /* SQSize */ 0, /* NoAlias */ false) {}		: LSUnit(SM, /* LQSize */ 0, /* SQSize */ 0, /* NoAlias */ false) {}
LSUnit(const MCSchedModel &SM, unsigned LQ, unsigned SQ)		LSUnit(const MCSchedModel &SM, unsigned LQ, unsigned SQ)
: LSUnit(SM, LQ, SQ, /* NoAlias */ false) {}		: LSUnit(SM, LQ, SQ, /* NoAlias */ false) {}
LSUnit(const MCSchedModel &SM, unsigned LQ, unsigned SQ, bool AssumeNoAlias)		LSUnit(const MCSchedModel &SM, unsigned LQ, unsigned SQ, bool AssumeNoAlias)
: LSUnitBase(SM, LQ, SQ, AssumeNoAlias), CurrentLoadGroupID(0),		: LSUnitBase(SM, LQ, SQ, AssumeNoAlias), CurrentLoadGroupID(0),
CurrentLoadBarrierGroupID(0), CurrentStoreGroupID(0) {}		CurrentLoadBarrierGroupID(0), CurrentStoreGroupID(0),
		CurrentStoreBarrierGroupID(0) {}

/// Returns LSU_AVAILABLE if there are enough load/store queue entries to		/// Returns LSU_AVAILABLE if there are enough load/store queue entries to
/// accomodate instruction IR.		/// accomodate instruction IR.
Status isAvailable(const InstRef &IR) const override;		Status isAvailable(const InstRef &IR) const override;

/// Allocates LS resources for instruction IR.		/// Allocates LS resources for instruction IR.
///		///
/// This method assumes that a previous call to `isAvailable(IR)` succeeded		/// This method assumes that a previous call to `isAvailable(IR)` succeeded
[19 lines not shown]
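
The split into OrderSucc and DataSucc shown in this header can be illustrated with a much smaller model. The sketch below is an assumption-laden simplification, not the real MemoryGroup: `ToyGroup` is a hypothetical class that holds exactly one instruction per group and drops the critical-dependency tracking. It only shows when each kind of successor is released.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for MemoryGroup: one instruction per group and no
// critical-dependency tracking.
class ToyGroup {
  unsigned NumPredecessors = 0;
  unsigned NumExecutedPredecessors = 0;
  bool Issued = false;
  bool Executed = false;
  std::vector<ToyGroup *> OrderSucc; // released when this group is issued
  std::vector<ToyGroup *> DataSucc;  // released when this group has executed

public:
  bool isReady() const { return NumExecutedPredecessors == NumPredecessors; }

  void addSuccessor(ToyGroup *Group, bool IsDataDependent) {
    assert(!Executed && "Should have been removed!");
    // An order dependency on an already issued group is already satisfied.
    if (!IsDataDependent && Issued)
      return;
    ++Group->NumPredecessors;
    (IsDataDependent ? DataSucc : OrderSucc).push_back(Group);
  }

  void onIssued() {
    Issued = true;
    for (ToyGroup *G : OrderSucc)
      ++G->NumExecutedPredecessors; // order dependency released at issue
  }

  void onExecuted() {
    Executed = true;
    for (ToyGroup *G : DataSucc)
      ++G->NumExecutedPredecessors; // data dependency released at completion
  }
};
```

In this model a successor added with `IsDataDependent == false` becomes ready as soon as `onIssued()` runs on its predecessor, whereas a data-dependent successor stays blocked until `onExecuted()`.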

llvm/lib/MCA/HardwareUnits/LSUnit.cpp

[60 lines not shown]	dbgs() << "[LSUnit] Group (" << GroupIt.first << "): "
<< ", #GExecuted = " << Group.getNumExecutedPredecessors()		<< ", #GExecuted = " << Group.getNumExecutedPredecessors()
<< ", #Inst = " << Group.getNumInstructions()		<< ", #Inst = " << Group.getNumInstructions()
<< ", #IIssued = " << Group.getNumExecuting()		<< ", #IIssued = " << Group.getNumExecuting()
<< ", #IExecuted = " << Group.getNumExecuted() << '\n';		<< ", #IExecuted = " << Group.getNumExecuted() << '\n';
}		}
}		}
#endif		#endif

unsigned LSUnit::dispatch(const InstRef &IR) {		unsigned LSUnit::dispatch(const InstRef &IR) {
		mattd: This function is getting pretty large. I'm not sure how to break this up, but it could be useful as a NFC follow-up at a later time.
		andreadb: Definitely a good suggestion for a follow-up. :-) This logic can probably be split and moved into a few helper methods/functions.
const InstrDesc &Desc = IR.getInstruction()->getDesc();		const InstrDesc &Desc = IR.getInstruction()->getDesc();
unsigned IsMemBarrier = Desc.HasSideEffects;		unsigned IsMemBarrier = Desc.HasSideEffects;
assert((Desc.MayLoad || Desc.MayStore) && "Not a memory operation!");		assert((Desc.MayLoad || Desc.MayStore) && "Not a memory operation!");

if (Desc.MayLoad)		if (Desc.MayLoad)
acquireLQSlot();		acquireLQSlot();
if (Desc.MayStore)		if (Desc.MayStore)
acquireSQSlot();		acquireSQSlot();

if (Desc.MayStore) {		if (Desc.MayStore) {
// Always create a new group for store operations.

// A store may not pass a previous store or store barrier.
unsigned NewGID = createMemoryGroup();		unsigned NewGID = createMemoryGroup();
MemoryGroup &NewGroup = getGroup(NewGID);		MemoryGroup &NewGroup = getGroup(NewGID);
NewGroup.addInstruction();		NewGroup.addInstruction();

// A store may not pass a previous load or load barrier.		// A store may not pass a previous load or load barrier.
unsigned ImmediateLoadDominator =		unsigned ImmediateLoadDominator =
std::max(CurrentLoadGroupID, CurrentLoadBarrierGroupID);		std::max(CurrentLoadGroupID, CurrentLoadBarrierGroupID);
if (ImmediateLoadDominator) {		if (ImmediateLoadDominator) {
MemoryGroup &IDom = getGroup(ImmediateLoadDominator);		MemoryGroup &IDom = getGroup(ImmediateLoadDominator);
LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << ImmediateLoadDominator		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << ImmediateLoadDominator
<< ") --> (" << NewGID << ")\n");		<< ") --> (" << NewGID << ")\n");
IDom.addSuccessor(&NewGroup);		IDom.addSuccessor(&NewGroup, !assumeNoAlias());
		}

		// A store may not pass a previous store barrier.
		if (CurrentStoreBarrierGroupID) {
		andreadb: Note that, by construction `CurrentStoreBarrierGroupID` is always less than or equal to `CurrentStoreGroupID`. This check was missing in the original code from commit 280ac1fd1dc35. And this is exactly the reason why we have a difference in test pr37790.s
		MemoryGroup &StoreGroup = getGroup(CurrentStoreBarrierGroupID);
		mattd: nit: You call this StoreGroup here, but StGroup a few blocks below. I really do not care that there is a difference of a few characters, but I could not silence my inner dialog.
		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: ("
		<< CurrentStoreBarrierGroupID
		<< ") --> (" << NewGID << ")\n");
		StoreGroup.addSuccessor(&NewGroup, true);
}		}
if (CurrentStoreGroupID) {
		// A store may not pass a previous store.
		if (CurrentStoreGroupID &&
		(CurrentStoreGroupID != CurrentStoreBarrierGroupID)) {
		andreadb: This check is to avoid that we add a store dependency twice (since a store can also be a store barrier).
MemoryGroup &StoreGroup = getGroup(CurrentStoreGroupID);		MemoryGroup &StoreGroup = getGroup(CurrentStoreGroupID);
LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << CurrentStoreGroupID		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << CurrentStoreGroupID
<< ") --> (" << NewGID << ")\n");		<< ") --> (" << NewGID << ")\n");
StoreGroup.addSuccessor(&NewGroup);		StoreGroup.addSuccessor(&NewGroup, !assumeNoAlias());
}		}


CurrentStoreGroupID = NewGID;		CurrentStoreGroupID = NewGID;
		if (IsMemBarrier)
		CurrentStoreBarrierGroupID = NewGID;

if (Desc.MayLoad) {		if (Desc.MayLoad) {
CurrentLoadGroupID = NewGID;		CurrentLoadGroupID = NewGID;
if (IsMemBarrier)		if (IsMemBarrier)
CurrentLoadBarrierGroupID = NewGID;		CurrentLoadBarrierGroupID = NewGID;
}		}

return NewGID;		return NewGID;
}		}

assert(Desc.MayLoad && "Expected a load!");		assert(Desc.MayLoad && "Expected a load!");

// Always create a new memory group if this is the first load of the sequence.		unsigned ImmediateLoadDominator =
		std::max(CurrentLoadGroupID, CurrentLoadBarrierGroupID);

		// A new load group is created if we are in one of the following situations:
		mattd: Is !ImmediateLoadDominator necessary here? It seems that would be implied by the next condition: `ImmediateLoadDominator <= CurrentStoreGroupID`. However, I suppose this does read clearer by leaving the !ImmediateLoadDominator check in place.
		andreadb: The idea is to identify cases where we need a new node in the dependency graph. These are the cases identified by this check: a) This is a memory barrier (by construction we always require that barriers are assigned to different memory group); b) This is the very first load dispatched to the LSUnit (by construction we always keep loads and stores into separate groups. So we need to start a new load group). c) There is an intervening store between the last load dispatched to the LSU and this load; d) There is no intervening store. However the last load group has already started execution (so we need a new node).
		mattd: Those points would make for a great comment! Thanks for the clarification.
		// 1) This is a load barrier (by construction, a load barrier is always
		// assigned to a different memory group).
		// 2) There is no load in flight (by construction we always keep loads and
		// stores into separate memory groups).
		// 3) There is a load barrier in flight. This load depends on it.
		// 4) There is an intervening store between the last load dispatched to the
		// LSU and this load. We always create a new group even if this load
		// does not alias the last dispatched store.
		// 5) There is no intervening store and there is an active load group.
		// However that group has already started execution, so we cannot add
		// this load to it.
		bool ShouldCreateANewGroup =
		IsMemBarrier || !ImmediateLoadDominator ||
		CurrentLoadBarrierGroupID == ImmediateLoadDominator ||
		ImmediateLoadDominator <= CurrentStoreGroupID ||
		getGroup(ImmediateLoadDominator).isExecuting();

// A load may not pass a previous store unless flag 'NoAlias' is set.
// A load may pass a previous load.
// A younger load cannot pass a older load barrier.
// A load barrier cannot pass a older load.
bool ShouldCreateANewGroup = !CurrentLoadGroupID || IsMemBarrier ||
CurrentLoadGroupID <= CurrentStoreGroupID ||
CurrentLoadGroupID <= CurrentLoadBarrierGroupID;
if (ShouldCreateANewGroup) {		if (ShouldCreateANewGroup) {
unsigned NewGID = createMemoryGroup();		unsigned NewGID = createMemoryGroup();
MemoryGroup &NewGroup = getGroup(NewGID);		MemoryGroup &NewGroup = getGroup(NewGID);
NewGroup.addInstruction();		NewGroup.addInstruction();

		// A load may not pass a previous store or store barrier
		// unless flag 'NoAlias' is set.
if (!assumeNoAlias() && CurrentStoreGroupID) {		if (!assumeNoAlias() && CurrentStoreGroupID) {
MemoryGroup &StGroup = getGroup(CurrentStoreGroupID);		MemoryGroup &StoreGroup = getGroup(CurrentStoreGroupID);
LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << CurrentStoreGroupID		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << CurrentStoreGroupID
<< ") --> (" << NewGID << ")\n");		<< ") --> (" << NewGID << ")\n");
StGroup.addSuccessor(&NewGroup);		StoreGroup.addSuccessor(&NewGroup, true);
		}

		// A load barrier may not pass a previous load or load barrier.
		if (IsMemBarrier) {
		if (ImmediateLoadDominator) {
		MemoryGroup &LoadGroup = getGroup(ImmediateLoadDominator);
		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: ("
		<< ImmediateLoadDominator
		<< ") --> (" << NewGID << ")\n");
		LoadGroup.addSuccessor(&NewGroup, true);
}		}
		} else {
		// A younger load cannot pass a older load barrier.
if (CurrentLoadBarrierGroupID) {		if (CurrentLoadBarrierGroupID) {
MemoryGroup &LdGroup = getGroup(CurrentLoadBarrierGroupID);		MemoryGroup &LoadGroup = getGroup(CurrentLoadBarrierGroupID);
LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: (" << CurrentLoadBarrierGroupID		LLVM_DEBUG(dbgs() << "[LSUnit]: GROUP DEP: ("
		<< CurrentLoadBarrierGroupID
<< ") --> (" << NewGID << ")\n");		<< ") --> (" << NewGID << ")\n");
LdGroup.addSuccessor(&NewGroup);		LoadGroup.addSuccessor(&NewGroup, true);
		}
}		}

CurrentLoadGroupID = NewGID;		CurrentLoadGroupID = NewGID;
if (IsMemBarrier)		if (IsMemBarrier)
CurrentLoadBarrierGroupID = NewGID;		CurrentLoadBarrierGroupID = NewGID;
return NewGID;		return NewGID;
}		}

		// A load may pass a previous load.
MemoryGroup &Group = getGroup(CurrentLoadGroupID);		MemoryGroup &Group = getGroup(CurrentLoadGroupID);
Group.addInstruction();		Group.addInstruction();
return CurrentLoadGroupID;		return CurrentLoadGroupID;
}		}

LSUnit::Status LSUnit::isAvailable(const InstRef &IR) const {		LSUnit::Status LSUnit::isAvailable(const InstRef &IR) const {
const InstrDesc &Desc = IR.getInstruction()->getDesc();		const InstrDesc &Desc = IR.getInstruction()->getDesc();
if (Desc.MayLoad && isLQFull())		if (Desc.MayLoad && isLQFull())
[53 lines not shown]
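
The store path of `LSUnit::dispatch` in the diff above can be summarised by listing which predecessor groups a new store is linked to, and with which kind of dependency. The sketch below assumes a hypothetical free function, `storePredecessors`, that takes the four group IDs as plain integers (0 meaning "none"). It ignores the early exit in `addSuccessor` for order dependencies on groups that are already executing.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Each entry: (predecessor group ID, IsDataDependent).
using Dep = std::pair<unsigned, bool>;

// Lists the groups that a newly dispatched store would depend on.
inline std::vector<Dep> storePredecessors(unsigned LoadGID,
                                          unsigned LoadBarrierGID,
                                          unsigned StoreGID,
                                          unsigned StoreBarrierGID,
                                          bool AssumeNoAlias) {
  // By construction, the last store barrier is never newer than the last store.
  assert(StoreBarrierGID <= StoreGID);

  std::vector<Dep> Deps;
  // A store may not pass a previous load or load barrier.
  unsigned ImmediateLoadDominator = std::max(LoadGID, LoadBarrierGID);
  if (ImmediateLoadDominator)
    Deps.emplace_back(ImmediateLoadDominator, !AssumeNoAlias);
  // A store may not pass a previous store barrier.
  if (StoreBarrierGID)
    Deps.emplace_back(StoreBarrierGID, true);
  // A store may not pass a previous store; skip it if it is the barrier
  // that was just added, so the dependency is not recorded twice.
  if (StoreGID && StoreGID != StoreBarrierGID)
    Deps.emplace_back(StoreGID, !AssumeNoAlias);
  return Deps;
}
```

Under `AssumeNoAlias`, the dependencies on ordinary loads and stores degrade to order dependencies, while the dependency on a store barrier is always recorded as a data dependency.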

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st1.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	st1 {v0.s}[0], [sp]			st1 {v0.s}[0], [sp]
	st1 {v0.2s}, [sp]			st1 {v0.2s}, [sp]
	st1 {v0.2s, v1.2s}, [sp]			st1 {v0.2s, v1.2s}, [sp]
	st1 {v0.2s, v1.2s, v2.2s}, [sp]			st1 {v0.2s, v1.2s, v2.2s}, [sp]
	st1 {v0.2s, v1.2s, v2.2s, v3.2s}, [sp]			st1 {v0.2s, v1.2s, v2.2s, v3.2s}, [sp]

	st1 {v0.d}[0], [sp]			st1 {v0.d}[0], [sp]
	[157 lines not shown]

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st2.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	st2 {v0.s, v1.s}[0], [sp]			st2 {v0.s, v1.s}[0], [sp]
	st2 {v0.2s, v1.2s}, [sp]			st2 {v0.2s, v1.2s}, [sp]

	st2 {v0.d, v1.d}[0], [sp]			st2 {v0.d, v1.d}[0], [sp]
	st2 {v0.2d, v1.2d}, [sp]			st2 {v0.2d, v1.2d}, [sp]

	st2 {v0.s, v1.s}[0], [sp], #8			st2 {v0.s, v1.s}[0], [sp], #8
	[85 lines not shown]

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st3.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	st3 {v0.s, v1.s, v2.s}[0], [sp]			st3 {v0.s, v1.s, v2.s}[0], [sp]
	st3 {v0.2s, v1.2s, v2.2s}, [sp]			st3 {v0.2s, v1.2s, v2.2s}, [sp]

	st3 {v0.d, v1.d, v2.d}[0], [sp]			st3 {v0.d, v1.d, v2.d}[0], [sp]
	st3 {v0.2d, v1.2d, v2.2d}, [sp]			st3 {v0.2d, v1.2d, v2.2d}, [sp]

	st3 {v0.s, v1.s, v2.s}[0], [sp], #12			st3 {v0.s, v1.s, v2.s}[0], [sp], #12
	[85 lines not shown]

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st4.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -mtriple=aarch64-linux-gnu -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	st4 {v0.s, v1.s, v2.s, v3.s}[0], [sp]			st4 {v0.s, v1.s, v2.s, v3.s}[0], [sp]
	st4 {v0.2s, v1.2s, v2.2s, v3.2s}, [sp]			st4 {v0.2s, v1.2s, v2.2s, v3.2s}, [sp]

	st4 {v0.d, v1.d, v2.d, v3.d}[0], [sp]			st4 {v0.d, v1.d, v2.d, v3.d}[0], [sp]
	st4 {v0.2d, v1.2d, v2.2d, v3.2d}, [sp]			st4 {v0.2d, v1.2d, v2.2d, v3.2d}, [sp]

	st4 {v0.s, v1.s, v2.s, v3.s}[0], [sp], #16			st4 {v0.s, v1.s, v2.s, v3.s}[0], [sp], #16
	[85 lines not shown]

llvm/test/tools/llvm-mca/AArch64/Exynos/float-store.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	stur d0, [sp, #2]			stur d0, [sp, #2]
	stur q0, [sp, #16]			stur q0, [sp, #16]

	str b0, [sp], #1			str b0, [sp], #1
	str q0, [sp], #16			str q0, [sp], #16

	str h0, [sp, #2]!			str h0, [sp, #2]!
	[130 lines not shown]

llvm/test/tools/llvm-mca/AArch64/Exynos/store.s

	# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py			# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m3 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M3			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m3 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M3
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m4 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M4			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m4 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M4
	# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m5 -resource-pressure=false < %s | FileCheck %s -check-prefixes=ALL,M5			# RUN: llvm-mca -march=aarch64 -mcpu=exynos-m5 -resource-pressure=false -noalias=false < %s | FileCheck %s -check-prefixes=ALL,M5

	stur x0, [sp, #8]			stur x0, [sp, #8]
	strb w0, [sp], #1			strb w0, [sp], #1
	strh w0, [sp, #2]!			strh w0, [sp, #2]!
	str x0, [sp, #8]			str x0, [sp, #8]
	strb w0, [sp, x31]			strb w0, [sp, x31]
	strh w0, [sp, x31, lsl #1]			strh w0, [sp, x31, lsl #1]
	str w0, [sp, w31, sxtw]			str w0, [sp, w31, sxtw]
	[70 lines not shown]

llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s

	[41 lines not shown]
	movaps (%rdx), %xmm2			movaps (%rdx), %xmm2
	movaps %xmm3, (%rbx)			movaps %xmm3, (%rbx)
	# LLVM-MCA-END			# LLVM-MCA-END

	# CHECK: [0] Code Region			# CHECK: [0] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 208			# CHECK-NEXT: Total Cycles: 207
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.92			# CHECK-NEXT: uOps Per Cycle: 1.93
	# CHECK-NEXT: IPC: 1.92			# CHECK-NEXT: IPC: 1.93
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * movb %spl, (%rax)			# CHECK-NEXT: 1 1 1.00 * movb %spl, (%rax)
	# CHECK-NEXT: 1 5 0.50 * movb (%rcx), %bpl			# CHECK-NEXT: 1 5 0.50 * movb (%rcx), %bpl
	# CHECK-NEXT: 1 5 0.50 * movb (%rdx), %sil			# CHECK-NEXT: 1 5 0.50 * movb (%rdx), %sil
	# CHECK-NEXT: 1 1 1.00 * movb %dil, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movb %dil, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (71.0%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (16.3%)			# CHECK-NEXT: 0, 33 (15.9%)
	# CHECK-NEXT: 2, 148 (71.2%)			# CHECK-NEXT: 2, 148 (71.5%)
	# CHECK-NEXT: 4, 26 (12.5%)			# CHECK-NEXT: 4, 26 (12.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 7 (3.4%)
	# CHECK-NEXT: 1, 10 (4.8%)			# CHECK-NEXT: 2, 200 (96.6%)
	# CHECK-NEXT: 2, 195 (93.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	[12 lines not shown]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movb %spl, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movb %spl, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movb (%rcx), %bpl			# CHECK-NEXT: - - - - - - 1.00 - movb (%rcx), %bpl
	# CHECK-NEXT: - - - - - - 0.95 0.05 movb (%rdx), %sil			# CHECK-NEXT: - - - - - - - 1.00 movb (%rdx), %sil
	# CHECK-NEXT: - - - - 1.00 - 0.05 0.95 movb %dil, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movb %dil, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movb %spl, (%rax)			# CHECK: [0,0] DeER . . movb %spl, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movb (%rcx), %bpl			# CHECK-NEXT: [0,1] DeeeeeER. movb (%rcx), %bpl
	# CHECK-NEXT: [0,2] D=eeeeeER. movb (%rdx), %sil			# CHECK-NEXT: [0,2] D=eeeeeER movb (%rdx), %sil
	# CHECK-NEXT: [0,3] D======eER movb %dil, (%rbx)			# CHECK-NEXT: [0,3] D=eE----R movb %dil, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movb (%rcx), %bpl			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movb (%rcx), %bpl
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movb (%rdx), %sil			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movb (%rdx), %sil
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movb %dil, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 4.0 movb %dil, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.0 <total>

	# CHECK: [1] Code Region			# CHECK: [1] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 208			# CHECK-NEXT: Total Cycles: 207
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.92			# CHECK-NEXT: uOps Per Cycle: 1.93
	# CHECK-NEXT: IPC: 1.92			# CHECK-NEXT: IPC: 1.93
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * movw %sp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movw %sp, (%rax)
	# CHECK-NEXT: 1 5 0.50 * movw (%rcx), %bp			# CHECK-NEXT: 1 5 0.50 * movw (%rcx), %bp
	# CHECK-NEXT: 1 5 0.50 * movw (%rdx), %si			# CHECK-NEXT: 1 5 0.50 * movw (%rdx), %si
	# CHECK-NEXT: 1 1 1.00 * movw %di, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movw %di, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (71.0%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (16.3%)			# CHECK-NEXT: 0, 33 (15.9%)
	# CHECK-NEXT: 2, 148 (71.2%)			# CHECK-NEXT: 2, 148 (71.5%)
	# CHECK-NEXT: 4, 26 (12.5%)			# CHECK-NEXT: 4, 26 (12.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 7 (3.4%)
	# CHECK-NEXT: 1, 10 (4.8%)			# CHECK-NEXT: 2, 200 (96.6%)
	# CHECK-NEXT: 2, 195 (93.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	[12 lines not shown]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movw %sp, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movw %sp, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movw (%rcx), %bp			# CHECK-NEXT: - - - - - - 1.00 - movw (%rcx), %bp
	# CHECK-NEXT: - - - - - - 0.95 0.05 movw (%rdx), %si			# CHECK-NEXT: - - - - - - - 1.00 movw (%rdx), %si
	# CHECK-NEXT: - - - - 1.00 - 0.05 0.95 movw %di, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movw %di, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movw %sp, (%rax)			# CHECK: [0,0] DeER . . movw %sp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movw (%rcx), %bp			# CHECK-NEXT: [0,1] DeeeeeER. movw (%rcx), %bp
	# CHECK-NEXT: [0,2] D=eeeeeER. movw (%rdx), %si			# CHECK-NEXT: [0,2] D=eeeeeER movw (%rdx), %si
	# CHECK-NEXT: [0,3] D======eER movw %di, (%rbx)			# CHECK-NEXT: [0,3] D=eE----R movw %di, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movw (%rcx), %bp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movw (%rcx), %bp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movw (%rdx), %si			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movw (%rdx), %si
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movw %di, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 4.0 movw %di, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.0 <total>

	# CHECK: [2] Code Region			# CHECK: [2] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 208			# CHECK-NEXT: Total Cycles: 207
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.92			# CHECK-NEXT: uOps Per Cycle: 1.93
	# CHECK-NEXT: IPC: 1.92			# CHECK-NEXT: IPC: 1.93
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * movl %esp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movl %esp, (%rax)
	# CHECK-NEXT: 1 5 0.50 * movl (%rcx), %ebp			# CHECK-NEXT: 1 5 0.50 * movl (%rcx), %ebp
	# CHECK-NEXT: 1 5 0.50 * movl (%rdx), %esi			# CHECK-NEXT: 1 5 0.50 * movl (%rdx), %esi
	# CHECK-NEXT: 1 1 1.00 * movl %edi, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movl %edi, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (71.0%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (16.3%)			# CHECK-NEXT: 0, 33 (15.9%)
	# CHECK-NEXT: 2, 148 (71.2%)			# CHECK-NEXT: 2, 148 (71.5%)
	# CHECK-NEXT: 4, 26 (12.5%)			# CHECK-NEXT: 4, 26 (12.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 7 (3.4%)
	# CHECK-NEXT: 1, 10 (4.8%)			# CHECK-NEXT: 2, 200 (96.6%)
	# CHECK-NEXT: 2, 195 (93.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	[12 lines not shown]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movl %esp, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movl %esp, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movl (%rcx), %ebp			# CHECK-NEXT: - - - - - - 1.00 - movl (%rcx), %ebp
	# CHECK-NEXT: - - - - - - 0.95 0.05 movl (%rdx), %esi			# CHECK-NEXT: - - - - - - - 1.00 movl (%rdx), %esi
	# CHECK-NEXT: - - - - 1.00 - 0.05 0.95 movl %edi, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movl %edi, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movl %esp, (%rax)			# CHECK: [0,0] DeER . . movl %esp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movl (%rcx), %ebp			# CHECK-NEXT: [0,1] DeeeeeER. movl (%rcx), %ebp
	# CHECK-NEXT: [0,2] D=eeeeeER. movl (%rdx), %esi			# CHECK-NEXT: [0,2] D=eeeeeER movl (%rdx), %esi
	# CHECK-NEXT: [0,3] D======eER movl %edi, (%rbx)			# CHECK-NEXT: [0,3] D=eE----R movl %edi, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movl (%rcx), %ebp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movl (%rcx), %ebp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movl (%rdx), %esi			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movl (%rdx), %esi
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movl %edi, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 4.0 movl %edi, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.0 <total>

	# CHECK: [3] Code Region			# CHECK: [3] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 208			# CHECK-NEXT: Total Cycles: 207
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.92			# CHECK-NEXT: uOps Per Cycle: 1.93
	# CHECK-NEXT: IPC: 1.92			# CHECK-NEXT: IPC: 1.93
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * movq %rsp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movq %rsp, (%rax)
	# CHECK-NEXT: 1 5 0.50 * movq (%rcx), %rbp			# CHECK-NEXT: 1 5 0.50 * movq (%rcx), %rbp
	# CHECK-NEXT: 1 5 0.50 * movq (%rdx), %rsi			# CHECK-NEXT: 1 5 0.50 * movq (%rdx), %rsi
	# CHECK-NEXT: 1 1 1.00 * movq %rdi, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movq %rdi, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (71.0%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (16.3%)			# CHECK-NEXT: 0, 33 (15.9%)
	# CHECK-NEXT: 2, 148 (71.2%)			# CHECK-NEXT: 2, 148 (71.5%)
	# CHECK-NEXT: 4, 26 (12.5%)			# CHECK-NEXT: 4, 26 (12.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 7 (3.4%)
	# CHECK-NEXT: 1, 10 (4.8%)			# CHECK-NEXT: 2, 200 (96.6%)
	# CHECK-NEXT: 2, 195 (93.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	[12 lines not shown]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movq %rsp, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movq %rsp, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movq (%rcx), %rbp			# CHECK-NEXT: - - - - - - 1.00 - movq (%rcx), %rbp
	# CHECK-NEXT: - - - - - - 0.95 0.05 movq (%rdx), %rsi			# CHECK-NEXT: - - - - - - - 1.00 movq (%rdx), %rsi
	# CHECK-NEXT: - - - - 1.00 - 0.05 0.95 movq %rdi, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movq %rdi, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movq %rsp, (%rax)			# CHECK: [0,0] DeER . . movq %rsp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movq (%rcx), %rbp			# CHECK-NEXT: [0,1] DeeeeeER. movq (%rcx), %rbp
	# CHECK-NEXT: [0,2] D=eeeeeER. movq (%rdx), %rsi			# CHECK-NEXT: [0,2] D=eeeeeER movq (%rdx), %rsi
	# CHECK-NEXT: [0,3] D======eER movq %rdi, (%rbx)			# CHECK-NEXT: [0,3] D=eE----R movq %rdi, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movq (%rcx), %rbp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movq (%rcx), %rbp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movq (%rdx), %rsi			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movq (%rdx), %rsi
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movq %rdi, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 4.0 movq %rdi, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.0 <total>

	# CHECK: [4] Code Region			# CHECK: [4] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 208			# CHECK-NEXT: Total Cycles: 207
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.92			# CHECK-NEXT: uOps Per Cycle: 1.93
	# CHECK-NEXT: IPC: 1.92			# CHECK-NEXT: IPC: 1.93
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * U movd %mm0, (%rax)			# CHECK-NEXT: 1 1 1.00 * U movd %mm0, (%rax)
	# CHECK-NEXT: 1 5 0.50 * movd (%rcx), %mm1			# CHECK-NEXT: 1 5 0.50 * movd (%rcx), %mm1
	# CHECK-NEXT: 1 5 0.50 * movd (%rdx), %mm2			# CHECK-NEXT: 1 5 0.50 * movd (%rdx), %mm2
	# CHECK-NEXT: 1 1 1.00 * U movd %mm3, (%rbx)			# CHECK-NEXT: 1 1 1.00 * U movd %mm3, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (71.0%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (16.3%)			# CHECK-NEXT: 0, 33 (15.9%)
	# CHECK-NEXT: 2, 148 (71.2%)			# CHECK-NEXT: 2, 148 (71.5%)
	# CHECK-NEXT: 4, 26 (12.5%)			# CHECK-NEXT: 4, 26 (12.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 7 (3.4%)
	# CHECK-NEXT: 1, 10 (4.8%)			# CHECK-NEXT: 2, 200 (96.6%)
	# CHECK-NEXT: 2, 195 (93.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movd %mm0, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movd %mm0, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movd (%rcx), %mm1			# CHECK-NEXT: - - - - - - 1.00 - movd (%rcx), %mm1
	# CHECK-NEXT: - - - - - - 0.95 0.05 movd (%rdx), %mm2			# CHECK-NEXT: - - - - - - - 1.00 movd (%rdx), %mm2
	# CHECK-NEXT: - - - - 1.00 - 0.05 0.95 movd %mm3, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movd %mm3, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movd %mm0, (%rax)			# CHECK: [0,0] DeER . . movd %mm0, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movd (%rcx), %mm1			# CHECK-NEXT: [0,1] DeeeeeER. movd (%rcx), %mm1
	# CHECK-NEXT: [0,2] D=eeeeeER. movd (%rdx), %mm2			# CHECK-NEXT: [0,2] D=eeeeeER movd (%rdx), %mm2
	# CHECK-NEXT: [0,3] D======eER movd %mm3, (%rbx)			# CHECK-NEXT: [0,3] D=eE----R movd %mm3, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movd %mm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movd %mm0, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movd (%rcx), %mm1			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movd (%rcx), %mm1
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movd (%rdx), %mm2			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movd (%rdx), %mm2
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movd %mm3, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 4.0 movd %mm3, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.0 <total>

	# CHECK: [5] Code Region			# CHECK: [5] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 209			# CHECK-NEXT: Total Cycles: 208
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 1.91			# CHECK-NEXT: uOps Per Cycle: 1.92
	# CHECK-NEXT: IPC: 1.91			# CHECK-NEXT: IPC: 1.92
	# CHECK-NEXT: Block RThroughput: 2.0			# CHECK-NEXT: Block RThroughput: 2.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 1 1.00 * movaps %xmm0, (%rax)			# CHECK-NEXT: 1 1 1.00 * movaps %xmm0, (%rax)
	# CHECK-NEXT: 1 6 0.50 * movaps (%rcx), %xmm1			# CHECK-NEXT: 1 6 0.50 * movaps (%rcx), %xmm1
	# CHECK-NEXT: 1 6 0.50 * movaps (%rdx), %xmm2			# CHECK-NEXT: 1 6 0.50 * movaps (%rdx), %xmm2
	# CHECK-NEXT: 1 1 1.00 * movaps %xmm3, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movaps %xmm3, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.3%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 147 (70.7%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 35 (16.7%)			# CHECK-NEXT: 0, 34 (16.3%)
	# CHECK-NEXT: 2, 148 (70.8%)			# CHECK-NEXT: 2, 148 (71.2%)
	# CHECK-NEXT: 4, 26 (12.4%)			# CHECK-NEXT: 4, 26 (12.5%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 3 (1.4%)			# CHECK-NEXT: 0, 8 (3.8%)
	# CHECK-NEXT: 1, 12 (5.7%)			# CHECK-NEXT: 2, 200 (96.2%)
	# CHECK-NEXT: 2, 194 (92.8%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1]
	# CHECK-NEXT: - - - - 2.00 - 2.00 2.00			# CHECK-NEXT: - - - - 2.00 - 2.00 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:			# CHECK-NEXT: [0] [1] [2] [3] [4] [5] [6.0] [6.1] Instructions:
	# CHECK-NEXT: - - - - 1.00 - - 1.00 movaps %xmm0, (%rax)			# CHECK-NEXT: - - - - 1.00 - - 1.00 movaps %xmm0, (%rax)
	# CHECK-NEXT: - - - - - - 1.00 - movaps (%rcx), %xmm1			# CHECK-NEXT: - - - - - - 1.00 - movaps (%rcx), %xmm1
	# CHECK-NEXT: - - - - - - 0.94 0.06 movaps (%rdx), %xmm2			# CHECK-NEXT: - - - - - - - 1.00 movaps (%rdx), %xmm2
	# CHECK-NEXT: - - - - 1.00 - 0.06 0.94 movaps %xmm3, (%rbx)			# CHECK-NEXT: - - - - 1.00 - 1.00 - movaps %xmm3, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: 0
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 0123456789

	# CHECK: [0,0] DeER . . movaps %xmm0, (%rax)			# CHECK: [0,0] DeER . . movaps %xmm0, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeeER . movaps (%rcx), %xmm1			# CHECK-NEXT: [0,1] DeeeeeeER. movaps (%rcx), %xmm1
	# CHECK-NEXT: [0,2] D=eeeeeeER. movaps (%rdx), %xmm2			# CHECK-NEXT: [0,2] D=eeeeeeER movaps (%rdx), %xmm2
	# CHECK-NEXT: [0,3] D=======eER movaps %xmm3, (%rbx)			# CHECK-NEXT: [0,3] D=eE-----R movaps %xmm3, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movaps (%rcx), %xmm1			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movaps (%rcx), %xmm1
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movaps (%rdx), %xmm2			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movaps (%rdx), %xmm2
	# CHECK-NEXT: 3. 1 8.0 0.0 0.0 movaps %xmm3, (%rbx)			# CHECK-NEXT: 3. 1 2.0 0.0 5.0 movaps %xmm3, (%rbx)
	# CHECK-NEXT: 1 3.0 1.0 0.0 <total>			# CHECK-NEXT: 1 1.5 1.0 1.3 <total>

llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movb %bpl, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movb %bpl, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movb %sil, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movb %sil, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movb %dil, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movb %dil, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [1] Code Region			# CHECK: [1] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movw %bp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movw %bp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movw %si, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movw %si, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movw %di, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movw %di, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [2] Code Region			# CHECK: [2] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movl %ebp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movl %ebp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movl %esi, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movl %esi, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movl %edi, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movl %edi, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [3] Code Region			# CHECK: [3] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movq %rbp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movq %rbp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movq %rsi, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movq %rsi, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movq %rdi, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movq %rdi, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [4] Code Region			# CHECK: [4] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movaps %xmm1, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movaps %xmm1, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movaps %xmm2, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movaps %xmm2, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movaps %xmm3, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movaps %xmm3, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s

	# CHECK-NEXT: 1 1 1.00 * movb %spl, (%rax)			# CHECK-NEXT: 1 1 1.00 * movb %spl, (%rax)
	# CHECK-NEXT: 1 5 1.00 * movb (%rcx), %bpl			# CHECK-NEXT: 1 5 1.00 * movb (%rcx), %bpl
	# CHECK-NEXT: 1 5 1.00 * movb (%rdx), %sil			# CHECK-NEXT: 1 5 1.00 * movb (%rdx), %sil
	# CHECK-NEXT: 1 1 1.00 * movb %dil, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movb %dil, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 257 (84.0%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 256 (83.7%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (11.1%)			# CHECK-NEXT: 0, 35 (11.4%)
	# CHECK-NEXT: 1, 172 (56.2%)			# CHECK-NEXT: 1, 171 (55.9%)
	# CHECK-NEXT: 2, 86 (28.1%)			# CHECK-NEXT: 2, 85 (27.8%)
				# CHECK-NEXT: 3, 1 (0.3%)
	# CHECK-NEXT: 4, 14 (4.6%)			# CHECK-NEXT: 4, 14 (4.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 5 (1.6%)			# CHECK-NEXT: 0, 6 (2.0%)
	# CHECK-NEXT: 1, 202 (66.0%)			# CHECK-NEXT: 1, 200 (65.4%)
	# CHECK-NEXT: 2, 99 (32.4%)			# CHECK-NEXT: 2, 100 (32.7%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK-NEXT: PdEX 36 40 40			# CHECK-NEXT: PdEX 36 40 40
	# CHECK-NEXT: PdFPU 0 0 64			# CHECK-NEXT: PdFPU 0 0 64
	# CHECK-NEXT: PdLoad 19 22 40			# CHECK-NEXT: PdLoad 21 24 40
	# CHECK-NEXT: PdStore 20 23 24			# CHECK-NEXT: PdStore 18 21 24

	# CHECK: Resources:			# CHECK: Resources:
	# CHECK-NEXT: [0.0] - PdAGLU01			# CHECK-NEXT: [0.0] - PdAGLU01
	# CHECK-NEXT: [0.1] - PdAGLU01			# CHECK-NEXT: [0.1] - PdAGLU01
	# CHECK-NEXT: [1] - PdBranch			# CHECK-NEXT: [1] - PdBranch
	# CHECK-NEXT: [2] - PdCount			# CHECK-NEXT: [2] - PdCount
	# CHECK-NEXT: [3] - PdDiv			# CHECK-NEXT: [3] - PdDiv
	# CHECK-NEXT: [4] - PdEX0			# CHECK-NEXT: [4] - PdEX0
	# CHECK-NEXT: [18] - PdStore			# CHECK-NEXT: [18] - PdStore

	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
	# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00			# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: 0.96 0.04 - - - - - - - - - - - - - - - - - - - - 1.00 movb %spl, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - - - - - - - - - - 1.00 movb %spl, (%rax)
	# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movb (%rcx), %bpl			# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movb (%rcx), %bpl
	# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movb (%rdx), %sil			# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movb (%rdx), %sil
	# CHECK-NEXT: 0.04 0.96 - - - - - - - - - - - - - - - - - - - - 1.00 movb %dil, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - - - - - - - - - - 1.00 movb %dil, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movb %spl, (%rax)			# CHECK: [0,0] DeER . . movb %spl, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movb (%rcx), %bpl			# CHECK-NEXT: [0,1] DeeeeeER. movb (%rcx), %bpl
	# CHECK-NEXT: [0,2] D=eeeeeER. movb (%rdx), %sil			# CHECK-NEXT: [0,2] D=eeeeeER movb (%rdx), %sil
	# CHECK-NEXT: [0,3] D======eER movb %dil, (%rbx)			# CHECK-NEXT: [0,3] D==eE---R movb %dil, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movb (%rcx), %bpl			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movb (%rcx), %bpl
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movb (%rdx), %sil			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movb (%rdx), %sil
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movb %dil, (%rbx)			# CHECK-NEXT: 3. 1 3.0 1.0 3.0 movb %dil, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.8 1.3 0.8 <total>

	# CHECK: [1] Code Region			# CHECK: [1] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 306			# CHECK-NEXT: Total Cycles: 306
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK-NEXT: 1 1 1.00 * movw %sp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movw %sp, (%rax)
	# CHECK-NEXT: 1 5 1.00 * movw (%rcx), %bp			# CHECK-NEXT: 1 5 1.00 * movw (%rcx), %bp
	# CHECK-NEXT: 1 5 1.00 * movw (%rdx), %si			# CHECK-NEXT: 1 5 1.00 * movw (%rdx), %si
	# CHECK-NEXT: 1 1 1.00 * movw %di, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movw %di, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 257 (84.0%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 256 (83.7%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (11.1%)			# CHECK-NEXT: 0, 35 (11.4%)
	# CHECK-NEXT: 1, 172 (56.2%)			# CHECK-NEXT: 1, 171 (55.9%)
	# CHECK-NEXT: 2, 86 (28.1%)			# CHECK-NEXT: 2, 85 (27.8%)
				# CHECK-NEXT: 3, 1 (0.3%)
	# CHECK-NEXT: 4, 14 (4.6%)			# CHECK-NEXT: 4, 14 (4.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 5 (1.6%)			# CHECK-NEXT: 0, 6 (2.0%)
	# CHECK-NEXT: 1, 202 (66.0%)			# CHECK-NEXT: 1, 200 (65.4%)
	# CHECK-NEXT: 2, 99 (32.4%)			# CHECK-NEXT: 2, 100 (32.7%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK-NEXT: PdEX 36 40 40			# CHECK-NEXT: PdEX 36 40 40
	# CHECK-NEXT: PdFPU 0 0 64			# CHECK-NEXT: PdFPU 0 0 64
	# CHECK-NEXT: PdLoad 19 22 40			# CHECK-NEXT: PdLoad 21 24 40
	# CHECK-NEXT: PdStore 20 23 24			# CHECK-NEXT: PdStore 18 21 24

	# CHECK: Resources:			# CHECK: Resources:
	# CHECK-NEXT: [0.0] - PdAGLU01			# CHECK-NEXT: [0.0] - PdAGLU01
	# CHECK-NEXT: [0.1] - PdAGLU01			# CHECK-NEXT: [0.1] - PdAGLU01
	# CHECK-NEXT: [1] - PdBranch			# CHECK-NEXT: [1] - PdBranch
	# CHECK-NEXT: [2] - PdCount			# CHECK-NEXT: [2] - PdCount
	# CHECK-NEXT: [3] - PdDiv			# CHECK-NEXT: [3] - PdDiv
	# CHECK-NEXT: [4] - PdEX0			# CHECK-NEXT: [4] - PdEX0
	# CHECK-NEXT: [18] - PdStore			# CHECK-NEXT: [18] - PdStore

	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
	# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00			# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: 0.96 0.04 - - - - - - - - - - - - - - - - - - - - 1.00 movw %sp, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - - - - - - - - - - 1.00 movw %sp, (%rax)
	# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movw (%rcx), %bp			# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movw (%rcx), %bp
	# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movw (%rdx), %si			# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movw (%rdx), %si
	# CHECK-NEXT: 0.04 0.96 - - - - - - - - - - - - - - - - - - - - 1.00 movw %di, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - - - - - - - - - - 1.00 movw %di, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movw %sp, (%rax)			# CHECK: [0,0] DeER . . movw %sp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movw (%rcx), %bp			# CHECK-NEXT: [0,1] DeeeeeER. movw (%rcx), %bp
	# CHECK-NEXT: [0,2] D=eeeeeER. movw (%rdx), %si			# CHECK-NEXT: [0,2] D=eeeeeER movw (%rdx), %si
	# CHECK-NEXT: [0,3] D======eER movw %di, (%rbx)			# CHECK-NEXT: [0,3] D==eE---R movw %di, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movw (%rcx), %bp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movw (%rcx), %bp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movw (%rdx), %si			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movw (%rdx), %si
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movw %di, (%rbx)			# CHECK-NEXT: 3. 1 3.0 1.0 3.0 movw %di, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.8 1.3 0.8 <total>

	# CHECK: [2] Code Region			# CHECK: [2] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 306			# CHECK-NEXT: Total Cycles: 306
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK-NEXT: 1 1 1.00 * movl %esp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movl %esp, (%rax)
	# CHECK-NEXT: 1 5 1.00 * movl (%rcx), %ebp			# CHECK-NEXT: 1 5 1.00 * movl (%rcx), %ebp
	# CHECK-NEXT: 1 5 1.00 * movl (%rdx), %esi			# CHECK-NEXT: 1 5 1.00 * movl (%rdx), %esi
	# CHECK-NEXT: 1 1 1.00 * movl %edi, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movl %edi, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 257 (84.0%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 256 (83.7%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (11.1%)			# CHECK-NEXT: 0, 35 (11.4%)
	# CHECK-NEXT: 1, 172 (56.2%)			# CHECK-NEXT: 1, 171 (55.9%)
	# CHECK-NEXT: 2, 86 (28.1%)			# CHECK-NEXT: 2, 85 (27.8%)
				# CHECK-NEXT: 3, 1 (0.3%)
	# CHECK-NEXT: 4, 14 (4.6%)			# CHECK-NEXT: 4, 14 (4.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 5 (1.6%)			# CHECK-NEXT: 0, 6 (2.0%)
	# CHECK-NEXT: 1, 202 (66.0%)			# CHECK-NEXT: 1, 200 (65.4%)
	# CHECK-NEXT: 2, 99 (32.4%)			# CHECK-NEXT: 2, 100 (32.7%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK-NEXT: PdEX 36 40 40			# CHECK-NEXT: PdEX 36 40 40
	# CHECK-NEXT: PdFPU 0 0 64			# CHECK-NEXT: PdFPU 0 0 64
	# CHECK-NEXT: PdLoad 19 22 40			# CHECK-NEXT: PdLoad 21 24 40
	# CHECK-NEXT: PdStore 20 23 24			# CHECK-NEXT: PdStore 18 21 24

	# CHECK: Resources:			# CHECK: Resources:
	# CHECK-NEXT: [0.0] - PdAGLU01			# CHECK-NEXT: [0.0] - PdAGLU01
	# CHECK-NEXT: [0.1] - PdAGLU01			# CHECK-NEXT: [0.1] - PdAGLU01
	# CHECK-NEXT: [1] - PdBranch			# CHECK-NEXT: [1] - PdBranch
	# CHECK-NEXT: [2] - PdCount			# CHECK-NEXT: [2] - PdCount
	# CHECK-NEXT: [3] - PdDiv			# CHECK-NEXT: [3] - PdDiv
	# CHECK-NEXT: [4] - PdEX0			# CHECK-NEXT: [4] - PdEX0
	[16 unchanged lines not shown]
	# CHECK-NEXT: [18] - PdStore			# CHECK-NEXT: [18] - PdStore

	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
	# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00			# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: 0.96 0.04 - - - - - - - - - - - - - - - - - - - - 1.00 movl %esp, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - - - - - - - - - - 1.00 movl %esp, (%rax)
	# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movl (%rcx), %ebp			# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movl (%rcx), %ebp
	# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movl (%rdx), %esi			# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movl (%rdx), %esi
	# CHECK-NEXT: 0.04 0.96 - - - - - - - - - - - - - - - - - - - - 1.00 movl %edi, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - - - - - - - - - - 1.00 movl %edi, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movl %esp, (%rax)			# CHECK: [0,0] DeER . . movl %esp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movl (%rcx), %ebp			# CHECK-NEXT: [0,1] DeeeeeER. movl (%rcx), %ebp
	# CHECK-NEXT: [0,2] D=eeeeeER. movl (%rdx), %esi			# CHECK-NEXT: [0,2] D=eeeeeER movl (%rdx), %esi
	# CHECK-NEXT: [0,3] D======eER movl %edi, (%rbx)			# CHECK-NEXT: [0,3] D==eE---R movl %edi, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movl (%rcx), %ebp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movl (%rcx), %ebp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movl (%rdx), %esi			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movl (%rdx), %esi
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movl %edi, (%rbx)			# CHECK-NEXT: 3. 1 3.0 1.0 3.0 movl %edi, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.8 1.3 0.8 <total>

	# CHECK: [3] Code Region			# CHECK: [3] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 306			# CHECK-NEXT: Total Cycles: 306
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[14 unchanged lines not shown]
	# CHECK-NEXT: 1 1 1.00 * movq %rsp, (%rax)			# CHECK-NEXT: 1 1 1.00 * movq %rsp, (%rax)
	# CHECK-NEXT: 1 5 1.00 * movq (%rcx), %rbp			# CHECK-NEXT: 1 5 1.00 * movq (%rcx), %rbp
	# CHECK-NEXT: 1 5 1.00 * movq (%rdx), %rsi			# CHECK-NEXT: 1 5 1.00 * movq (%rdx), %rsi
	# CHECK-NEXT: 1 1 1.00 * movq %rdi, (%rbx)			# CHECK-NEXT: 1 1 1.00 * movq %rdi, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 257 (84.0%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 256 (83.7%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 0			# CHECK-NEXT: SQ - Store queue full: 0
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 34 (11.1%)			# CHECK-NEXT: 0, 35 (11.4%)
	# CHECK-NEXT: 1, 172 (56.2%)			# CHECK-NEXT: 1, 171 (55.9%)
	# CHECK-NEXT: 2, 86 (28.1%)			# CHECK-NEXT: 2, 85 (27.8%)
				# CHECK-NEXT: 3, 1 (0.3%)
	# CHECK-NEXT: 4, 14 (4.6%)			# CHECK-NEXT: 4, 14 (4.6%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 5 (1.6%)			# CHECK-NEXT: 0, 6 (2.0%)
	# CHECK-NEXT: 1, 202 (66.0%)			# CHECK-NEXT: 1, 200 (65.4%)
	# CHECK-NEXT: 2, 99 (32.4%)			# CHECK-NEXT: 2, 100 (32.7%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK-NEXT: PdEX 36 40 40			# CHECK-NEXT: PdEX 36 40 40
	# CHECK-NEXT: PdFPU 0 0 64			# CHECK-NEXT: PdFPU 0 0 64
	# CHECK-NEXT: PdLoad 19 22 40			# CHECK-NEXT: PdLoad 21 24 40
	# CHECK-NEXT: PdStore 20 23 24			# CHECK-NEXT: PdStore 18 21 24

	# CHECK: Resources:			# CHECK: Resources:
	# CHECK-NEXT: [0.0] - PdAGLU01			# CHECK-NEXT: [0.0] - PdAGLU01
	# CHECK-NEXT: [0.1] - PdAGLU01			# CHECK-NEXT: [0.1] - PdAGLU01
	# CHECK-NEXT: [1] - PdBranch			# CHECK-NEXT: [1] - PdBranch
	# CHECK-NEXT: [2] - PdCount			# CHECK-NEXT: [2] - PdCount
	# CHECK-NEXT: [3] - PdDiv			# CHECK-NEXT: [3] - PdDiv
	# CHECK-NEXT: [4] - PdEX0			# CHECK-NEXT: [4] - PdEX0
	[16 unchanged lines not shown]
	# CHECK-NEXT: [18] - PdStore			# CHECK-NEXT: [18] - PdStore

	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
	# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00			# CHECK-NEXT: 3.00 3.00 - - - - - - - - - - - - - - - - - 2.00 2.00 - 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: 0.96 0.04 - - - - - - - - - - - - - - - - - - - - 1.00 movq %rsp, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - - - - - - - - - - 1.00 movq %rsp, (%rax)
	# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movq (%rcx), %rbp			# CHECK-NEXT: 2.00 - - - - - - - - - - - - - - - - - - - 2.00 - - movq (%rcx), %rbp
	# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movq (%rdx), %rsi			# CHECK-NEXT: - 2.00 - - - - - - - - - - - - - - - - - 2.00 - - - movq (%rdx), %rsi
	# CHECK-NEXT: 0.04 0.96 - - - - - - - - - - - - - - - - - - - - 1.00 movq %rdi, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - - - - - - - - - - 1.00 movq %rdi, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movq %rsp, (%rax)			# CHECK: [0,0] DeER . . movq %rsp, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movq (%rcx), %rbp			# CHECK-NEXT: [0,1] DeeeeeER. movq (%rcx), %rbp
	# CHECK-NEXT: [0,2] D=eeeeeER. movq (%rdx), %rsi			# CHECK-NEXT: [0,2] D=eeeeeER movq (%rdx), %rsi
	# CHECK-NEXT: [0,3] D======eER movq %rdi, (%rbx)			# CHECK-NEXT: [0,3] D==eE---R movq %rdi, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movq (%rcx), %rbp			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movq (%rcx), %rbp
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movq (%rdx), %rsi			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movq (%rdx), %rsi
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movq %rdi, (%rbx)			# CHECK-NEXT: 3. 1 3.0 1.0 3.0 movq %rdi, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 1.8 1.3 0.8 <total>

	# CHECK: [4] Code Region			# CHECK: [4] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 554			# CHECK-NEXT: Total Cycles: 553
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	# CHECK: Dispatch Width: 4			# CHECK: Dispatch Width: 4
	# CHECK-NEXT: uOps Per Cycle: 0.72			# CHECK-NEXT: uOps Per Cycle: 0.72
	# CHECK-NEXT: IPC: 0.72			# CHECK-NEXT: IPC: 0.72
	# CHECK-NEXT: Block RThroughput: 4.0			# CHECK-NEXT: Block RThroughput: 4.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	# CHECK-NEXT: [6]: HasSideEffects (U)			# CHECK-NEXT: [6]: HasSideEffects (U)

	# CHECK: [1] [2] [3] [4] [5] [6] Instructions:			# CHECK: [1] [2] [3] [4] [5] [6] Instructions:
	# CHECK-NEXT: 1 2 1.50 * U movd %mm0, (%rax)			# CHECK-NEXT: 1 2 1.50 * U movd %mm0, (%rax)
	# CHECK-NEXT: 1 5 1.50 * movd (%rcx), %mm1			# CHECK-NEXT: 1 5 1.50 * movd (%rcx), %mm1
	# CHECK-NEXT: 1 5 1.50 * movd (%rdx), %mm2			# CHECK-NEXT: 1 5 1.50 * movd (%rdx), %mm2
	# CHECK-NEXT: 1 2 1.50 * U movd %mm3, (%rbx)			# CHECK-NEXT: 1 2 1.50 * U movd %mm3, (%rbx)

	# CHECK: Dynamic Dispatch Stall Cycles:			# CHECK: Dynamic Dispatch Stall Cycles:
	# CHECK-NEXT: RAT - Register unavailable: 0			# CHECK-NEXT: RAT - Register unavailable: 0
	# CHECK-NEXT: RCU - Retire tokens unavailable: 0			# CHECK-NEXT: RCU - Retire tokens unavailable: 0
	# CHECK-NEXT: SCHEDQ - Scheduler full: 55 (9.9%)			# CHECK-NEXT: SCHEDQ - Scheduler full: 57 (10.3%)
	# CHECK-NEXT: LQ - Load queue full: 0			# CHECK-NEXT: LQ - Load queue full: 0
	# CHECK-NEXT: SQ - Store queue full: 437 (78.9%)			# CHECK-NEXT: SQ - Store queue full: 432 (78.1%)
	# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0			# CHECK-NEXT: GROUP - Static restrictions on the dispatch group: 0

	# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:			# CHECK: Dispatch Logic - number of cycles where we saw N micro opcodes dispatched:
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 365 (65.9%)			# CHECK-NEXT: 0, 364 (65.8%)
	# CHECK-NEXT: 1, 88 (15.9%)			# CHECK-NEXT: 1, 88 (15.9%)
	# CHECK-NEXT: 2, 3 (0.5%)			# CHECK-NEXT: 2, 4 (0.7%)
	# CHECK-NEXT: 3, 86 (15.5%)			# CHECK-NEXT: 3, 84 (15.2%)
	# CHECK-NEXT: 4, 12 (2.2%)			# CHECK-NEXT: 4, 13 (2.4%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 253 (45.7%)			# CHECK-NEXT: 0, 253 (45.8%)
	# CHECK-NEXT: 1, 202 (36.5%)			# CHECK-NEXT: 1, 200 (36.2%)
	# CHECK-NEXT: 2, 99 (17.9%)			# CHECK-NEXT: 2, 100 (18.1%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	[29 unchanged lines not shown]

	# CHECK: Resource pressure per iteration:			# CHECK: Resource pressure per iteration:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18]
	# CHECK-NEXT: 4.00 4.00 - - - - - - - - 3.00 3.00 - 2.00 1.00 1.00 3.00 3.00 - 3.00 3.00 - 2.00			# CHECK-NEXT: 4.00 4.00 - - - - - - - - 3.00 3.00 - 2.00 1.00 1.00 3.00 3.00 - 3.00 3.00 - 2.00

	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: - 1.00 - - - - - - - - - - - 1.00 - - - 3.00 - - - - 1.00 movd %mm0, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - 1.00 - - - 3.00 - - - - 1.00 movd %mm0, (%rax)
	# CHECK-NEXT: 1.53 1.47 - - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - movd (%rcx), %mm1			# CHECK-NEXT: 1.50 1.50 - - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - movd (%rcx), %mm1
	# CHECK-NEXT: 1.47 1.53 - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - - movd (%rdx), %mm2			# CHECK-NEXT: 1.50 1.50 - - - - - - - - 3.00 - - - 1.00 - - - - 3.00 - - - movd (%rdx), %mm2
	# CHECK-NEXT: 1.00 - - - - - - - - - - - - 1.00 - - 3.00 - - - - - 1.00 movd %mm3, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - 1.00 - - 3.00 - - - - - 1.00 movd %mm3, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: 0			# CHECK-NEXT: Index 012345678
	# CHECK-NEXT: Index 0123456789

	# CHECK: [0,0] DeeER. . movd %mm0, (%rax)			# CHECK: [0,0] DeeER. . movd %mm0, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movd (%rcx), %mm1			# CHECK-NEXT: [0,1] DeeeeeER. movd (%rcx), %mm1
	# CHECK-NEXT: [0,2] D=eeeeeER . movd (%rdx), %mm2			# CHECK-NEXT: [0,2] D=eeeeeER movd (%rdx), %mm2
	# CHECK-NEXT: [0,3] D======eeER movd %mm3, (%rbx)			# CHECK-NEXT: [0,3] D===eeE-R movd %mm3, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movd %mm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movd %mm0, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movd (%rcx), %mm1			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movd (%rcx), %mm1
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movd (%rdx), %mm2			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movd (%rdx), %mm2
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movd %mm3, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 1.0 movd %mm3, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 2.0 1.3 0.3 <total>

	# CHECK: [5] Code Region			# CHECK: [5] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 405			# CHECK-NEXT: Total Cycles: 405
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[28 unchanged lines not shown]
	# CHECK-NEXT: [# dispatched], [# cycles]			# CHECK-NEXT: [# dispatched], [# cycles]
	# CHECK-NEXT: 0, 131 (32.3%)			# CHECK-NEXT: 0, 131 (32.3%)
	# CHECK-NEXT: 1, 174 (43.0%)			# CHECK-NEXT: 1, 174 (43.0%)
	# CHECK-NEXT: 2, 87 (21.5%)			# CHECK-NEXT: 2, 87 (21.5%)
	# CHECK-NEXT: 4, 13 (3.2%)			# CHECK-NEXT: 4, 13 (3.2%)

	# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:			# CHECK: Schedulers - number of cycles where we saw N micro opcodes issued:
	# CHECK-NEXT: [# issued], [# cycles]			# CHECK-NEXT: [# issued], [# cycles]
	# CHECK-NEXT: 0, 104 (25.7%)			# CHECK-NEXT: 0, 105 (25.9%)
	# CHECK-NEXT: 1, 202 (49.9%)			# CHECK-NEXT: 1, 200 (49.4%)
	# CHECK-NEXT: 2, 99 (24.4%)			# CHECK-NEXT: 2, 100 (24.7%)

	# CHECK: Scheduler's queue usage:			# CHECK: Scheduler's queue usage:
	# CHECK-NEXT: [1] Resource name.			# CHECK-NEXT: [1] Resource name.
	# CHECK-NEXT: [2] Average number of used buffer entries.			# CHECK-NEXT: [2] Average number of used buffer entries.
	# CHECK-NEXT: [3] Maximum number of used buffer entries.			# CHECK-NEXT: [3] Maximum number of used buffer entries.
	# CHECK-NEXT: [4] Total number of buffer entries.			# CHECK-NEXT: [4] Total number of buffer entries.

	# CHECK: [1] [2] [3] [4]			# CHECK: [1] [2] [3] [4]
	# CHECK-NEXT: PdEX 37 40 40			# CHECK-NEXT: PdEX 36 40 40
	# CHECK-NEXT: PdFPU 37 40 64			# CHECK-NEXT: PdFPU 36 40 64
	# CHECK-NEXT: PdLoad 19 22 40			# CHECK-NEXT: PdLoad 20 23 40
	# CHECK-NEXT: PdStore 20 22 24			# CHECK-NEXT: PdStore 19 21 24

	# CHECK: Resources:			# CHECK: Resources:
	# CHECK-NEXT: [0.0] - PdAGLU01			# CHECK-NEXT: [0.0] - PdAGLU01
	# CHECK-NEXT: [0.1] - PdAGLU01			# CHECK-NEXT: [0.1] - PdAGLU01
	# CHECK-NEXT: [1] - PdBranch			# CHECK-NEXT: [1] - PdBranch
	# CHECK-NEXT: [2] - PdCount			# CHECK-NEXT: [2] - PdCount
	# CHECK-NEXT: [3] - PdDiv			# CHECK-NEXT: [3] - PdDiv
	# CHECK-NEXT: [4] - PdEX0			# CHECK-NEXT: [4] - PdEX0
	[22 unchanged lines not shown]
	# CHECK: Resource pressure by instruction:			# CHECK: Resource pressure by instruction:
	# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:			# CHECK-NEXT: [0.0] [0.1] [1] [2] [3] [4] [5] [6] [7.0] [7.1] [8.0] [8.1] [9] [10] [11] [12] [13] [14] [15] [16.0] [16.1] [17] [18] Instructions:
	# CHECK-NEXT: - 1.00 - - - - - - - - - - - 1.00 - - - 3.00 - - - - 1.00 movaps %xmm0, (%rax)			# CHECK-NEXT: - 1.00 - - - - - - - - - - - 1.00 - - - 3.00 - - - - 1.00 movaps %xmm0, (%rax)
	# CHECK-NEXT: 3.00 - - - - - - - - 3.00 - - - - - 1.00 - - - - 3.00 - - movaps (%rcx), %xmm1			# CHECK-NEXT: 3.00 - - - - - - - - 3.00 - - - - - 1.00 - - - - 3.00 - - movaps (%rcx), %xmm1
	# CHECK-NEXT: - 3.00 - - - - - - 3.00 - - - - - 1.00 - - - - 3.00 - - - movaps (%rdx), %xmm2			# CHECK-NEXT: - 3.00 - - - - - - 3.00 - - - - - 1.00 - - - - 3.00 - - - movaps (%rdx), %xmm2
	# CHECK-NEXT: 1.00 - - - - - - - - - - - - 1.00 - - 3.00 - - - - - 1.00 movaps %xmm3, (%rbx)			# CHECK-NEXT: 1.00 - - - - - - - - - - - - 1.00 - - 3.00 - - - - - 1.00 movaps %xmm3, (%rbx)

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: Index 0123456789			# CHECK-NEXT: Index 012345678

	# CHECK: [0,0] DeER . . movaps %xmm0, (%rax)			# CHECK: [0,0] DeER . . movaps %xmm0, (%rax)
	# CHECK-NEXT: [0,1] DeeeeeER . movaps (%rcx), %xmm1			# CHECK-NEXT: [0,1] DeeeeeER. movaps (%rcx), %xmm1
	# CHECK-NEXT: [0,2] D=eeeeeER. movaps (%rdx), %xmm2			# CHECK-NEXT: [0,2] D=eeeeeER movaps (%rdx), %xmm2
	# CHECK-NEXT: [0,3] D======eER movaps %xmm3, (%rbx)			# CHECK-NEXT: [0,3] D===eE--R movaps %xmm3, (%rbx)

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)
	# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movaps (%rcx), %xmm1			# CHECK-NEXT: 1. 1 1.0 1.0 0.0 movaps (%rcx), %xmm1
	# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movaps (%rdx), %xmm2			# CHECK-NEXT: 2. 1 2.0 2.0 0.0 movaps (%rdx), %xmm2
	# CHECK-NEXT: 3. 1 7.0 0.0 0.0 movaps %xmm3, (%rbx)			# CHECK-NEXT: 3. 1 4.0 2.0 2.0 movaps %xmm3, (%rbx)
	# CHECK-NEXT: 1 2.8 1.0 0.0 <total>			# CHECK-NEXT: 1 2.0 1.5 0.5 <total>

llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s

	[95 unchanged lines not shown]
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 vmovaps (%rsi), %xmm0			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 vmovaps (%rsi), %xmm0
	# CHECK-NEXT: 1. 1 7.0 1.0 0.0 vmovaps %xmm0, (%rdi)			# CHECK-NEXT: 1. 1 7.0 1.0 0.0 vmovaps %xmm0, (%rdi)
	# CHECK-NEXT: 2. 1 1.0 1.0 2.0 vmovaps 16(%rsi), %xmm0			# CHECK-NEXT: 2. 1 1.0 1.0 2.0 vmovaps 16(%rsi), %xmm0
	# CHECK-NEXT: 3. 1 8.0 0.0 0.0 vmovaps %xmm0, 16(%rdi)			# CHECK-NEXT: 3. 1 8.0 1.0 0.0 vmovaps %xmm0, 16(%rdi)
	# CHECK-NEXT: 4. 1 3.0 3.0 0.0 vmovaps 32(%rsi), %xmm0			# CHECK-NEXT: 4. 1 3.0 3.0 0.0 vmovaps 32(%rsi), %xmm0
	# CHECK-NEXT: 5. 1 9.0 1.0 0.0 vmovaps %xmm0, 32(%rdi)			# CHECK-NEXT: 5. 1 9.0 1.0 0.0 vmovaps %xmm0, 32(%rdi)
	# CHECK-NEXT: 6. 1 3.0 3.0 2.0 vmovaps 48(%rsi), %xmm0			# CHECK-NEXT: 6. 1 3.0 3.0 2.0 vmovaps 48(%rsi), %xmm0
	# CHECK-NEXT: 7. 1 10.0 0.0 0.0 vmovaps %xmm0, 48(%rdi)			# CHECK-NEXT: 7. 1 10.0 1.0 0.0 vmovaps %xmm0, 48(%rdi)
	# CHECK-NEXT: 1 5.3 1.3 0.5 <total>			# CHECK-NEXT: 1 5.3 1.5 0.5 <total>

llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s

	[153 unchanged lines not shown]
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movb %spl, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movb %bpl, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movb %bpl, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movb %sil, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movb %sil, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movb %dil, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movb %dil, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [1] Code Region			# CHECK: [1] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[94 unchanged lines not shown]
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movw %sp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movw %bp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movw %bp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movw %si, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movw %si, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movw %di, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movw %di, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [2] Code Region			# CHECK: [2] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[94 unchanged lines not shown]
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movl %esp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movl %ebp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movl %ebp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movl %esi, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movl %esi, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movl %edi, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movl %edi, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [3] Code Region			# CHECK: [3] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 403			# CHECK-NEXT: Total Cycles: 403
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[94 unchanged lines not shown]
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movq %rsp, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movq %rbp, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movq %rbp, (%rcx)
	# CHECK-NEXT: 2. 1 3.0 0.0 0.0 movq %rsi, (%rdx)			# CHECK-NEXT: 2. 1 3.0 1.0 0.0 movq %rsi, (%rdx)
	# CHECK-NEXT: 3. 1 4.0 0.0 0.0 movq %rdi, (%rbx)			# CHECK-NEXT: 3. 1 4.0 1.0 0.0 movq %rdi, (%rbx)
	# CHECK-NEXT: 1 2.5 0.3 0.0 <total>			# CHECK-NEXT: 1 2.5 1.0 0.0 <total>

	# CHECK: [4] Code Region			# CHECK: [4] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 803			# CHECK-NEXT: Total Cycles: 803
	# CHECK-NEXT: Total uOps: 400			# CHECK-NEXT: Total uOps: 400

	[211 unchanged lines not shown]
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 movaps %xmm0, (%rax)
	# CHECK-NEXT: 1. 1 2.0 0.0 0.0 movaps %xmm1, (%rcx)			# CHECK-NEXT: 1. 1 2.0 1.0 0.0 movaps %xmm1, (%rcx)
	# CHECK-NEXT: 2. 1 4.0 1.0 0.0 movaps %xmm2, (%rdx)			# CHECK-NEXT: 2. 1 4.0 2.0 0.0 movaps %xmm2, (%rdx)
	# CHECK-NEXT: 3. 1 5.0 0.0 0.0 movaps %xmm3, (%rbx)			# CHECK-NEXT: 3. 1 5.0 1.0 0.0 movaps %xmm3, (%rbx)
	# CHECK-NEXT: 1 3.0 0.5 0.0 <total>			# CHECK-NEXT: 1 3.0 1.3 0.0 <total>

	# CHECK: [6] Code Region			# CHECK: [6] Code Region

	# CHECK: Iterations: 100			# CHECK: Iterations: 100
	# CHECK-NEXT: Instructions: 400			# CHECK-NEXT: Instructions: 400
	# CHECK-NEXT: Total Cycles: 7170			# CHECK-NEXT: Total Cycles: 7170
	# CHECK-NEXT: Total uOps: 1600			# CHECK-NEXT: Total uOps: 1600

	(94 lines not shown)
	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 1 1.0 1.0 0.0 vmovaps %ymm0, (%rax)			# CHECK-NEXT: 0. 1 1.0 1.0 0.0 vmovaps %ymm0, (%rax)
	# CHECK-NEXT: 1. 1 2.0 1.0 0.0 vmovaps %ymm1, (%rcx)			# CHECK-NEXT: 1. 1 2.0 2.0 0.0 vmovaps %ymm1, (%rcx)
	# CHECK-NEXT: 2. 1 35.0 33.0 0.0 vmovaps %ymm2, (%rdx)			# CHECK-NEXT: 2. 1 35.0 34.0 0.0 vmovaps %ymm2, (%rdx)
	# CHECK-NEXT: 3. 1 36.0 1.0 0.0 vmovaps %ymm3, (%rbx)			# CHECK-NEXT: 3. 1 36.0 2.0 0.0 vmovaps %ymm3, (%rbx)
	# CHECK-NEXT: 1 18.5 9.0 0.0 <total>			# CHECK-NEXT: 1 18.5 9.8 0.0 <total>

llvm/test/tools/llvm-mca/X86/BtVer2/independent-load-stores.s

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2 -timeline -timeline-max-iterations=1 < %s | FileCheck %s -check-prefixes=ALL,NOALIAS
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2 -timeline -timeline-max-iterations=1 -noalias=false < %s | FileCheck %s -check-prefixes=ALL,YESALIAS

				addq $44, 64(%r14)
				addq $44, 128(%r14)
				addq $44, 192(%r14)
				addq $44, 256(%r14)
				addq $44, 320(%r14)
				addq $44, 384(%r14)
				addq $44, 448(%r14)
				addq $44, 512(%r14)
				addq $44, 576(%r14)
				addq $44, 640(%r14)

				# ALL: Iterations: 100
				# ALL-NEXT: Instructions: 1000

				# NOALIAS-NEXT: Total Cycles: 1008
				# YESALIAS-NEXT: Total Cycles: 6003

				# ALL-NEXT: Total uOps: 1000

				# ALL: Dispatch Width: 2

				# NOALIAS-NEXT: uOps Per Cycle: 0.99
				# NOALIAS-NEXT: IPC: 0.99

				# YESALIAS-NEXT: uOps Per Cycle: 0.17
				# YESALIAS-NEXT: IPC: 0.17

				# ALL-NEXT: Block RThroughput: 10.0

				# ALL: Instruction Info:
				# ALL-NEXT: [1]: #uOps
				# ALL-NEXT: [2]: Latency
				# ALL-NEXT: [3]: RThroughput
				# ALL-NEXT: [4]: MayLoad
				# ALL-NEXT: [5]: MayStore
				# ALL-NEXT: [6]: HasSideEffects (U)

				# ALL: [1] [2] [3] [4] [5] [6] Instructions:
				# ALL-NEXT: 1 6 1.00 * * addq $44, 64(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 128(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 192(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 256(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 320(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 384(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 448(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 512(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 576(%r14)
				# ALL-NEXT: 1 6 1.00 * * addq $44, 640(%r14)

				# ALL: Resources:
				# ALL-NEXT: [0] - JALU0
				# ALL-NEXT: [1] - JALU1
				# ALL-NEXT: [2] - JDiv
				# ALL-NEXT: [3] - JFPA
				# ALL-NEXT: [4] - JFPM
				# ALL-NEXT: [5] - JFPU0
				# ALL-NEXT: [6] - JFPU1
				# ALL-NEXT: [7] - JLAGU
				# ALL-NEXT: [8] - JMul
				# ALL-NEXT: [9] - JSAGU
				# ALL-NEXT: [10] - JSTC
				# ALL-NEXT: [11] - JVALU0
				# ALL-NEXT: [12] - JVALU1
				# ALL-NEXT: [13] - JVIMUL

				# ALL: Resource pressure per iteration:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13]
				# ALL-NEXT: 5.00 5.00 - - - - - 10.00 - 10.00 - - - -

				# ALL: Resource pressure by instruction:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] Instructions:
				# ALL-NEXT: - 1.00 - - - - - 1.00 - 1.00 - - - - addq $44, 64(%r14)
				# ALL-NEXT: 1.00 - - - - - - 1.00 - 1.00 - - - - addq $44, 128(%r14)
				# ALL-NEXT: - 1.00 - - - - - 1.00 - 1.00 - - - - addq $44, 192(%r14)
				# ALL-NEXT: 1.00 - - - - - - 1.00 - 1.00 - - - - addq $44, 256(%r14)
				# ALL-NEXT: - 1.00 - - - - - 1.00 - 1.00 - - - - addq $44, 320(%r14)
				# ALL-NEXT: 1.00 - - - - - - 1.00 - 1.00 - - - - addq $44, 384(%r14)
				# ALL-NEXT: - 1.00 - - - - - 1.00 - 1.00 - - - - addq $44, 448(%r14)
				# ALL-NEXT: 1.00 - - - - - - 1.00 - 1.00 - - - - addq $44, 512(%r14)
				# ALL-NEXT: - 1.00 - - - - - 1.00 - 1.00 - - - - addq $44, 576(%r14)
				# ALL-NEXT: 1.00 - - - - - - 1.00 - 1.00 - - - - addq $44, 640(%r14)

				# ALL: Timeline view:

				# NOALIAS-NEXT: 01234567
				# NOALIAS-NEXT: Index 0123456789

				# YESALIAS-NEXT: 0123456789 0123456789 0123456789
				# YESALIAS-NEXT: Index 0123456789 0123456789 0123456789 012

				# NOALIAS: [0,0] DeeeeeeER . . . addq $44, 64(%r14)
				# NOALIAS-NEXT: [0,1] D=eeeeeeER. . . addq $44, 128(%r14)
				# NOALIAS-NEXT: [0,2] .D=eeeeeeER . . addq $44, 192(%r14)
				# NOALIAS-NEXT: [0,3] .D==eeeeeeER . . addq $44, 256(%r14)
				# NOALIAS-NEXT: [0,4] . D==eeeeeeER . . addq $44, 320(%r14)
				# NOALIAS-NEXT: [0,5] . D===eeeeeeER . . addq $44, 384(%r14)
				# NOALIAS-NEXT: [0,6] . D===eeeeeeER. . addq $44, 448(%r14)
				# NOALIAS-NEXT: [0,7] . D====eeeeeeER . addq $44, 512(%r14)
				# NOALIAS-NEXT: [0,8] . D====eeeeeeER. addq $44, 576(%r14)
				# NOALIAS-NEXT: [0,9] . D=====eeeeeeER addq $44, 640(%r14)

				# YESALIAS: [0,0] DeeeeeeER . . . . . . . . . . . . addq $44, 64(%r14)
				# YESALIAS-NEXT: [0,1] D======eeeeeeER. . . . . . . . . . . addq $44, 128(%r14)
				# YESALIAS-NEXT: [0,2] .D===========eeeeeeER . . . . . . . . . addq $44, 192(%r14)
				# YESALIAS-NEXT: [0,3] .D=================eeeeeeER . . . . . . . . addq $44, 256(%r14)
				# YESALIAS-NEXT: [0,4] . D======================eeeeeeER . . . . . . . addq $44, 320(%r14)
				# YESALIAS-NEXT: [0,5] . D============================eeeeeeER . . . . . . addq $44, 384(%r14)
				# YESALIAS-NEXT: [0,6] . D=================================eeeeeeER. . . . . addq $44, 448(%r14)
				# YESALIAS-NEXT: [0,7] . D=======================================eeeeeeER . . . addq $44, 512(%r14)
				# YESALIAS-NEXT: [0,8] . D============================================eeeeeeER . . addq $44, 576(%r14)
				# YESALIAS-NEXT: [0,9] . D==================================================eeeeeeER addq $44, 640(%r14)

				# ALL: Average Wait times (based on the timeline view):
				# ALL-NEXT: [0]: Executions
				# ALL-NEXT: [1]: Average time spent waiting in a scheduler's queue
				# ALL-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
				# ALL-NEXT: [3]: Average time elapsed from WB until retire stage

				# ALL: [0] [1] [2] [3]
				# ALL-NEXT: 0. 1 1.0 1.0 0.0 addq $44, 64(%r14)

				# NOALIAS-NEXT: 1. 1 2.0 1.0 0.0 addq $44, 128(%r14)
				# NOALIAS-NEXT: 2. 1 2.0 1.0 0.0 addq $44, 192(%r14)
				# NOALIAS-NEXT: 3. 1 3.0 1.0 0.0 addq $44, 256(%r14)
				# NOALIAS-NEXT: 4. 1 3.0 1.0 0.0 addq $44, 320(%r14)
				# NOALIAS-NEXT: 5. 1 4.0 1.0 0.0 addq $44, 384(%r14)
				# NOALIAS-NEXT: 6. 1 4.0 1.0 0.0 addq $44, 448(%r14)
				# NOALIAS-NEXT: 7. 1 5.0 1.0 0.0 addq $44, 512(%r14)
				# NOALIAS-NEXT: 8. 1 5.0 1.0 0.0 addq $44, 576(%r14)
				# NOALIAS-NEXT: 9. 1 6.0 1.0 0.0 addq $44, 640(%r14)
				# NOALIAS-NEXT: 1 3.5 1.0 0.0 <total>

				# YESALIAS-NEXT: 1. 1 7.0 0.0 0.0 addq $44, 128(%r14)
				# YESALIAS-NEXT: 2. 1 12.0 0.0 0.0 addq $44, 192(%r14)
				# YESALIAS-NEXT: 3. 1 18.0 0.0 0.0 addq $44, 256(%r14)
				# YESALIAS-NEXT: 4. 1 23.0 0.0 0.0 addq $44, 320(%r14)
				# YESALIAS-NEXT: 5. 1 29.0 0.0 0.0 addq $44, 384(%r14)
				# YESALIAS-NEXT: 6. 1 34.0 0.0 0.0 addq $44, 448(%r14)
				# YESALIAS-NEXT: 7. 1 40.0 0.0 0.0 addq $44, 512(%r14)
				# YESALIAS-NEXT: 8. 1 45.0 0.0 0.0 addq $44, 576(%r14)
				# YESALIAS-NEXT: 9. 1 51.0 0.0 0.0 addq $44, 640(%r14)
				# YESALIAS-NEXT: 1 26.0 0.1 0.0 <total>
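
The summary figures in the BtVer2 expectations above can be cross-checked by hand: IPC is Instructions divided by Total Cycles, and uOps Per Cycle is Total uOps divided by Total Cycles. The following is a minimal Python sketch that uses only numbers copied from the CHECK lines. The helper name `per_cycle` is illustrative, and the half-up rounding to two decimals is an assumption about how the printed values are formatted.

```python
import math


def per_cycle(count: int, total_cycles: int) -> float:
    """Return count / total_cycles rounded half-up to two decimals."""
    return math.floor(count / total_cycles * 100 + 0.5) / 100


# Values copied from the BtVer2 independent-load-stores.s expectations.
INSTRUCTIONS = 1000
UOPS = 1000
CYCLES = {"NOALIAS": 1008, "YESALIAS": 6003}

# Map each check prefix to (uOps Per Cycle, IPC).
summary = {
    prefix: (per_cycle(UOPS, cycles), per_cycle(INSTRUCTIONS, cycles))
    for prefix, cycles in CYCLES.items()
}

for prefix, (uops_per_cycle, ipc) in summary.items():
    print(f"{prefix}: uOps Per Cycle {uops_per_cycle:.2f}, IPC {ipc:.2f}")
```

The computed pairs agree with the NOALIAS (0.99, 0.99) and YESALIAS (0.17, 0.17) lines of the test.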

llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s

	(15 lines not shown)
	imul %ecx, %ecx			imul %ecx, %ecx
	imul %ecx, %ecx			imul %ecx, %ecx
	# LLVM-MCA-END			# LLVM-MCA-END

	# CHECK: [0] Code Region			# CHECK: [0] Code Region

	# CHECK: Iterations: 2			# CHECK: Iterations: 2
	# CHECK-NEXT: Instructions: 10			# CHECK-NEXT: Instructions: 10
	# CHECK-NEXT: Total Cycles: 27			# CHECK-NEXT: Total Cycles: 24
	# CHECK-NEXT: Total uOps: 16			# CHECK-NEXT: Total uOps: 16

	# CHECK: Dispatch Width: 2			# CHECK: Dispatch Width: 2
	# CHECK-NEXT: uOps Per Cycle: 0.59			# CHECK-NEXT: uOps Per Cycle: 0.67
	# CHECK-NEXT: IPC: 0.37			# CHECK-NEXT: IPC: 0.42
	# CHECK-NEXT: Block RThroughput: 4.0			# CHECK-NEXT: Block RThroughput: 4.0

	# CHECK: Instruction Info:			# CHECK: Instruction Info:
	# CHECK-NEXT: [1]: #uOps			# CHECK-NEXT: [1]: #uOps
	# CHECK-NEXT: [2]: Latency			# CHECK-NEXT: [2]: Latency
	# CHECK-NEXT: [3]: RThroughput			# CHECK-NEXT: [3]: RThroughput
	# CHECK-NEXT: [4]: MayLoad			# CHECK-NEXT: [4]: MayLoad
	# CHECK-NEXT: [5]: MayStore			# CHECK-NEXT: [5]: MayStore
	(31 lines not shown)
	# CHECK-NEXT: 1.50 1.50 - - - - - 1.00 - 1.00 - - - - xaddl %ecx, (%rsp)			# CHECK-NEXT: 1.50 1.50 - - - - - 1.00 - 1.00 - - - - xaddl %ecx, (%rsp)
	# CHECK-NEXT: 1.00 - - - - - - - - - - - - - addl %ecx, %ecx			# CHECK-NEXT: 1.00 - - - - - - - - - - - - - addl %ecx, %ecx
	# CHECK-NEXT: - 1.00 - - - - - - - - - - - - addl %ecx, %ecx			# CHECK-NEXT: - 1.00 - - - - - - - - - - - - addl %ecx, %ecx
	# CHECK-NEXT: - 1.00 - - - - - - 1.00 - - - - - imull %ecx, %ecx			# CHECK-NEXT: - 1.00 - - - - - - 1.00 - - - - - imull %ecx, %ecx
	# CHECK-NEXT: - 1.00 - - - - - - 1.00 - - - - - imull %ecx, %ecx			# CHECK-NEXT: - 1.00 - - - - - - 1.00 - - - - - imull %ecx, %ecx

	# CHECK: Timeline view:			# CHECK: Timeline view:
	# CHECK-NEXT: 0123456789			# CHECK-NEXT: 0123456789
	# CHECK-NEXT: Index 0123456789 0123456			# CHECK-NEXT: Index 0123456789 0123

	# CHECK: [0,0] DeeeeeeeeeeeER . . .. xaddl %ecx, (%rsp)			# CHECK: [0,0] DeeeeeeeeeeeER . . . xaddl %ecx, (%rsp)
	# CHECK-NEXT: [0,1] . D=eE-------R . . .. addl %ecx, %ecx			# CHECK-NEXT: [0,1] . D=eE-------R . . . addl %ecx, %ecx
	# CHECK-NEXT: [0,2] . D==eE-------R. . .. addl %ecx, %ecx			# CHECK-NEXT: [0,2] . D==eE-------R. . . addl %ecx, %ecx
	# CHECK-NEXT: [0,3] . D==eeeE----R. . .. imull %ecx, %ecx			# CHECK-NEXT: [0,3] . D==eeeE----R. . . imull %ecx, %ecx
	# CHECK-NEXT: [0,4] . D=====eeeE--R . .. imull %ecx, %ecx			# CHECK-NEXT: [0,4] . D=====eeeE--R . . imull %ecx, %ecx
	# CHECK-NEXT: [1,0] . D=======eeeeeeeeeeeER.. xaddl %ecx, (%rsp)			# CHECK-NEXT: [1,0] . D====eeeeeeeeeeeER . xaddl %ecx, (%rsp)
	# CHECK-NEXT: [1,1] . .D========eE-------R.. addl %ecx, %ecx			# CHECK-NEXT: [1,1] . .D=====eE-------R . addl %ecx, %ecx
	# CHECK-NEXT: [1,2] . .D=========eE-------R. addl %ecx, %ecx			# CHECK-NEXT: [1,2] . .D======eE-------R. addl %ecx, %ecx
	# CHECK-NEXT: [1,3] . . D=========eeeE----R. imull %ecx, %ecx			# CHECK-NEXT: [1,3] . . D======eeeE----R. imull %ecx, %ecx
	# CHECK-NEXT: [1,4] . . D============eeeE--R imull %ecx, %ecx			# CHECK-NEXT: [1,4] . . D=========eeeE--R imull %ecx, %ecx

	# CHECK: Average Wait times (based on the timeline view):			# CHECK: Average Wait times (based on the timeline view):
	# CHECK-NEXT: [0]: Executions			# CHECK-NEXT: [0]: Executions
	# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue			# CHECK-NEXT: [1]: Average time spent waiting in a scheduler's queue
	# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready			# CHECK-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
	# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage			# CHECK-NEXT: [3]: Average time elapsed from WB until retire stage

	# CHECK: [0] [1] [2] [3]			# CHECK: [0] [1] [2] [3]
	# CHECK-NEXT: 0. 2 4.5 0.5 0.0 xaddl %ecx, (%rsp)			# CHECK-NEXT: 0. 2 3.0 0.5 0.0 xaddl %ecx, (%rsp)
	# CHECK-NEXT: 1. 2 5.5 0.0 7.0 addl %ecx, %ecx			# CHECK-NEXT: 1. 2 4.0 0.0 7.0 addl %ecx, %ecx
	# CHECK-NEXT: 2. 2 6.5 0.0 7.0 addl %ecx, %ecx			# CHECK-NEXT: 2. 2 5.0 0.0 7.0 addl %ecx, %ecx
	# CHECK-NEXT: 3. 2 6.5 0.0 4.0 imull %ecx, %ecx			# CHECK-NEXT: 3. 2 5.0 0.0 4.0 imull %ecx, %ecx
	# CHECK-NEXT: 4. 2 9.5 0.0 2.0 imull %ecx, %ecx			# CHECK-NEXT: 4. 2 8.0 0.0 2.0 imull %ecx, %ecx
	# CHECK-NEXT: 2 6.5 0.1 4.0 <total>			# CHECK-NEXT: 2 5.0 0.1 4.0 <total>

	# CHECK: [1] Code Region			# CHECK: [1] Code Region

	# CHECK: Iterations: 2			# CHECK: Iterations: 2
	# CHECK-NEXT: Instructions: 10			# CHECK-NEXT: Instructions: 10
	# CHECK-NEXT: Total Cycles: 38			# CHECK-NEXT: Total Cycles: 38
	# CHECK-NEXT: Total uOps: 16			# CHECK-NEXT: Total uOps: 16

	(76 lines not shown)

llvm/test/tools/llvm-mca/X86/Haswell/independent-load-stores.s

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
				# RUN: llvm-mca -mcpu=haswell -timeline -timeline-max-iterations=1 < %s | FileCheck %s -check-prefixes=ALL,NOALIAS
				# RUN: llvm-mca -mcpu=haswell -timeline -timeline-max-iterations=1 -noalias=false < %s | FileCheck %s -check-prefixes=ALL,YESALIAS

				addq $44, 64(%r14)
				addq $44, 128(%r14)
				addq $44, 192(%r14)
				addq $44, 256(%r14)
				addq $44, 320(%r14)
				addq $44, 384(%r14)
				addq $44, 448(%r14)
				addq $44, 512(%r14)
				addq $44, 576(%r14)
				addq $44, 640(%r14)

				# ALL: Iterations: 100
				# ALL-NEXT: Instructions: 1000

				# NOALIAS-NEXT: Total Cycles: 1009
				# YESALIAS-NEXT: Total Cycles: 7003

				# ALL-NEXT: Total uOps: 3000

				# ALL: Dispatch Width: 4

				# NOALIAS-NEXT: uOps Per Cycle: 2.97
				# NOALIAS-NEXT: IPC: 0.99

				# YESALIAS-NEXT: uOps Per Cycle: 0.43
				# YESALIAS-NEXT: IPC: 0.14

				# ALL-NEXT: Block RThroughput: 10.0

				# ALL: Instruction Info:
				# ALL-NEXT: [1]: #uOps
				# ALL-NEXT: [2]: Latency
				# ALL-NEXT: [3]: RThroughput
				# ALL-NEXT: [4]: MayLoad
				# ALL-NEXT: [5]: MayStore
				# ALL-NEXT: [6]: HasSideEffects (U)

				# ALL: [1] [2] [3] [4] [5] [6] Instructions:
				# ALL-NEXT: 3 7 1.00 * * addq $44, 64(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 128(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 192(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 256(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 320(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 384(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 448(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 512(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 576(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 640(%r14)

				# ALL: Resources:
				# ALL-NEXT: [0] - HWDivider
				# ALL-NEXT: [1] - HWFPDivider
				# ALL-NEXT: [2] - HWPort0
				# ALL-NEXT: [3] - HWPort1
				# ALL-NEXT: [4] - HWPort2
				# ALL-NEXT: [5] - HWPort3
				# ALL-NEXT: [6] - HWPort4
				# ALL-NEXT: [7] - HWPort5
				# ALL-NEXT: [8] - HWPort6
				# ALL-NEXT: [9] - HWPort7

				# ALL: Resource pressure per iteration:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
				# ALL-NEXT: - - 2.50 2.50 6.66 6.67 10.00 2.50 2.50 6.67

				# ALL: Resource pressure by instruction:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions:
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 64(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 128(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 192(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 256(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.66 1.00 - 0.50 0.67 addq $44, 320(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.67 1.00 0.50 - 0.66 addq $44, 384(%r14)
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 448(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 512(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 576(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 640(%r14)

				# ALL: Timeline view:

				# NOALIAS-NEXT: 012345678
				# NOALIAS-NEXT: Index 0123456789

				# YESALIAS-NEXT: 0123456789 0123456789 0123456789 012
				# YESALIAS-NEXT: Index 0123456789 0123456789 0123456789 0123456789

				# NOALIAS: [0,0] DeeeeeeeER. . . addq $44, 64(%r14)
				# NOALIAS-NEXT: [0,1] .DeeeeeeeER . . addq $44, 128(%r14)
				# NOALIAS-NEXT: [0,2] . DeeeeeeeER . . addq $44, 192(%r14)
				# NOALIAS-NEXT: [0,3] . DeeeeeeeER . . addq $44, 256(%r14)
				# NOALIAS-NEXT: [0,4] . DeeeeeeeER . . addq $44, 320(%r14)
				# NOALIAS-NEXT: [0,5] . DeeeeeeeER. . addq $44, 384(%r14)
				# NOALIAS-NEXT: [0,6] . .DeeeeeeeER . addq $44, 448(%r14)
				# NOALIAS-NEXT: [0,7] . . DeeeeeeeER . addq $44, 512(%r14)
				# NOALIAS-NEXT: [0,8] . . DeeeeeeeER. addq $44, 576(%r14)
				# NOALIAS-NEXT: [0,9] . . DeeeeeeeER addq $44, 640(%r14)

				# YESALIAS: [0,0] DeeeeeeeER. . . . . . . . . . . . . . addq $44, 64(%r14)
				# YESALIAS-NEXT: [0,1] .D======eeeeeeeER . . . . . . . . . . . . addq $44, 128(%r14)
				# YESALIAS-NEXT: [0,2] . D============eeeeeeeER . . . . . . . . . . . addq $44, 192(%r14)
				# YESALIAS-NEXT: [0,3] . D==================eeeeeeeER . . . . . . . . . addq $44, 256(%r14)
				# YESALIAS-NEXT: [0,4] . D========================eeeeeeeER . . . . . . . . addq $44, 320(%r14)
				# YESALIAS-NEXT: [0,5] . D==============================eeeeeeeER. . . . . . . addq $44, 384(%r14)
				# YESALIAS-NEXT: [0,6] . .D====================================eeeeeeeER . . . . . addq $44, 448(%r14)
				# YESALIAS-NEXT: [0,7] . . D==========================================eeeeeeeER . . . . addq $44, 512(%r14)
				# YESALIAS-NEXT: [0,8] . . D================================================eeeeeeeER . . addq $44, 576(%r14)
				# YESALIAS-NEXT: [0,9] . . D======================================================eeeeeeeER addq $44, 640(%r14)

				# ALL: Average Wait times (based on the timeline view):
				# ALL-NEXT: [0]: Executions
				# ALL-NEXT: [1]: Average time spent waiting in a scheduler's queue
				# ALL-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
				# ALL-NEXT: [3]: Average time elapsed from WB until retire stage

				# ALL: [0] [1] [2] [3]
				# ALL-NEXT: 0. 1 1.0 1.0 0.0 addq $44, 64(%r14)

				# NOALIAS-NEXT: 1. 1 1.0 1.0 0.0 addq $44, 128(%r14)
				# NOALIAS-NEXT: 2. 1 1.0 1.0 0.0 addq $44, 192(%r14)
				# NOALIAS-NEXT: 3. 1 1.0 1.0 0.0 addq $44, 256(%r14)
				# NOALIAS-NEXT: 4. 1 1.0 1.0 0.0 addq $44, 320(%r14)
				# NOALIAS-NEXT: 5. 1 1.0 1.0 0.0 addq $44, 384(%r14)
				# NOALIAS-NEXT: 6. 1 1.0 1.0 0.0 addq $44, 448(%r14)
				# NOALIAS-NEXT: 7. 1 1.0 1.0 0.0 addq $44, 512(%r14)
				# NOALIAS-NEXT: 8. 1 1.0 1.0 0.0 addq $44, 576(%r14)
				# NOALIAS-NEXT: 9. 1 1.0 1.0 0.0 addq $44, 640(%r14)
				# NOALIAS-NEXT: 1 1.0 1.0 0.0 <total>

				# YESALIAS-NEXT: 1. 1 7.0 0.0 0.0 addq $44, 128(%r14)
				# YESALIAS-NEXT: 2. 1 13.0 0.0 0.0 addq $44, 192(%r14)
				# YESALIAS-NEXT: 3. 1 19.0 0.0 0.0 addq $44, 256(%r14)
				# YESALIAS-NEXT: 4. 1 25.0 0.0 0.0 addq $44, 320(%r14)
				# YESALIAS-NEXT: 5. 1 31.0 0.0 0.0 addq $44, 384(%r14)
				# YESALIAS-NEXT: 6. 1 37.0 0.0 0.0 addq $44, 448(%r14)
				# YESALIAS-NEXT: 7. 1 43.0 0.0 0.0 addq $44, 512(%r14)
				# YESALIAS-NEXT: 8. 1 49.0 0.0 0.0 addq $44, 576(%r14)
				# YESALIAS-NEXT: 9. 1 55.0 0.0 0.0 addq $44, 640(%r14)
				# YESALIAS-NEXT: 1 28.0 0.1 0.0 <total>
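
In the Haswell expectations above, the `<total>` row of the "Average Wait times" table is consistent with a plain arithmetic mean of each column over the ten instructions. The sketch below checks this for the YESALIAS run. The row data is copied from the CHECK lines; the function name `column_means` is illustrative, and rounding to one decimal is an assumption based on the printed precision.

```python
# Columns [1], [2], [3] for instructions 0..9, copied from the
# Haswell YESALIAS "Average Wait times" expectations.
rows = [
    (1.0, 1.0, 0.0),
    (7.0, 0.0, 0.0),
    (13.0, 0.0, 0.0),
    (19.0, 0.0, 0.0),
    (25.0, 0.0, 0.0),
    (31.0, 0.0, 0.0),
    (37.0, 0.0, 0.0),
    (43.0, 0.0, 0.0),
    (49.0, 0.0, 0.0),
    (55.0, 0.0, 0.0),
]


def column_means(table):
    """Return the per-column mean, rounded to one decimal place."""
    count = len(table)
    return tuple(round(sum(column) / count, 1) for column in zip(*table))


totals = column_means(rows)
print(totals)
```

The result agrees with the `28.0 0.1 0.0 <total>` line of the YESALIAS run.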

llvm/test/tools/llvm-mca/X86/SkylakeClient/independent-load-stores.s

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=skylake -timeline -timeline-max-iterations=1 < %s | FileCheck %s -check-prefixes=ALL,NOALIAS
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=skylake -timeline -timeline-max-iterations=1 -noalias=false < %s | FileCheck %s -check-prefixes=ALL,YESALIAS

				addq $44, 64(%r14)
				addq $44, 128(%r14)
				addq $44, 192(%r14)
				addq $44, 256(%r14)
				addq $44, 320(%r14)
				addq $44, 384(%r14)
				addq $44, 448(%r14)
				addq $44, 512(%r14)
				addq $44, 576(%r14)
				addq $44, 640(%r14)

				# ALL: Iterations: 100
				# ALL-NEXT: Instructions: 1000

				# NOALIAS-NEXT: Total Cycles: 1009
				# YESALIAS-NEXT: Total Cycles: 7003

				# ALL-NEXT: Total uOps: 3000

				# ALL: Dispatch Width: 6

				# NOALIAS-NEXT: uOps Per Cycle: 2.97
				# NOALIAS-NEXT: IPC: 0.99

				# YESALIAS-NEXT: uOps Per Cycle: 0.43
				# YESALIAS-NEXT: IPC: 0.14

				# ALL-NEXT: Block RThroughput: 10.0

				# ALL: Instruction Info:
				# ALL-NEXT: [1]: #uOps
				# ALL-NEXT: [2]: Latency
				# ALL-NEXT: [3]: RThroughput
				# ALL-NEXT: [4]: MayLoad
				# ALL-NEXT: [5]: MayStore
				# ALL-NEXT: [6]: HasSideEffects (U)

				# ALL: [1] [2] [3] [4] [5] [6] Instructions:
				# ALL-NEXT: 3 7 1.00 * * addq $44, 64(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 128(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 192(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 256(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 320(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 384(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 448(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 512(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 576(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 640(%r14)

				# ALL: Resources:
				# ALL-NEXT: [0] - SKLDivider
				# ALL-NEXT: [1] - SKLFPDivider
				# ALL-NEXT: [2] - SKLPort0
				# ALL-NEXT: [3] - SKLPort1
				# ALL-NEXT: [4] - SKLPort2
				# ALL-NEXT: [5] - SKLPort3
				# ALL-NEXT: [6] - SKLPort4
				# ALL-NEXT: [7] - SKLPort5
				# ALL-NEXT: [8] - SKLPort6
				# ALL-NEXT: [9] - SKLPort7

				# ALL: Resource pressure per iteration:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
				# ALL-NEXT: - - 2.50 2.50 6.66 6.67 10.00 2.50 2.50 6.67

				# ALL: Resource pressure by instruction:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions:
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 64(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 128(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 192(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 256(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.66 1.00 - 0.50 0.67 addq $44, 320(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.67 1.00 0.50 - 0.66 addq $44, 384(%r14)
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 448(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 512(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 576(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 640(%r14)

				# ALL: Timeline view:

				# NOALIAS-NEXT: 012345678
				# NOALIAS-NEXT: Index 0123456789

				# YESALIAS-NEXT: 0123456789 0123456789 0123456789 012
				# YESALIAS-NEXT: Index 0123456789 0123456789 0123456789 0123456789

				# NOALIAS: [0,0] DeeeeeeeER. . . addq $44, 64(%r14)
				# NOALIAS-NEXT: [0,1] D=eeeeeeeER . . addq $44, 128(%r14)
				# NOALIAS-NEXT: [0,2] .D=eeeeeeeER . . addq $44, 192(%r14)
				# NOALIAS-NEXT: [0,3] .D==eeeeeeeER . . addq $44, 256(%r14)
				# NOALIAS-NEXT: [0,4] . D==eeeeeeeER . . addq $44, 320(%r14)
				# NOALIAS-NEXT: [0,5] . D===eeeeeeeER. . addq $44, 384(%r14)
				# NOALIAS-NEXT: [0,6] . D===eeeeeeeER . addq $44, 448(%r14)
				# NOALIAS-NEXT: [0,7] . D====eeeeeeeER . addq $44, 512(%r14)
				# NOALIAS-NEXT: [0,8] . D====eeeeeeeER. addq $44, 576(%r14)
				# NOALIAS-NEXT: [0,9] . D=====eeeeeeeER addq $44, 640(%r14)

				# YESALIAS: [0,0] DeeeeeeeER. . . . . . . . . . . . . . addq $44, 64(%r14)
				# YESALIAS-NEXT: [0,1] D=======eeeeeeeER . . . . . . . . . . . . addq $44, 128(%r14)
				# YESALIAS-NEXT: [0,2] .D=============eeeeeeeER . . . . . . . . . . . addq $44, 192(%r14)
				# YESALIAS-NEXT: [0,3] .D====================eeeeeeeER . . . . . . . . . addq $44, 256(%r14)
				# YESALIAS-NEXT: [0,4] . D==========================eeeeeeeER . . . . . . . . addq $44, 320(%r14)
				# YESALIAS-NEXT: [0,5] . D=================================eeeeeeeER. . . . . . . addq $44, 384(%r14)
				# YESALIAS-NEXT: [0,6] . D=======================================eeeeeeeER . . . . . addq $44, 448(%r14)
				# YESALIAS-NEXT: [0,7] . D==============================================eeeeeeeER . . . . addq $44, 512(%r14)
				# YESALIAS-NEXT: [0,8] . D====================================================eeeeeeeER . . addq $44, 576(%r14)
				# YESALIAS-NEXT: [0,9] . D===========================================================eeeeeeeER addq $44, 640(%r14)

				# ALL: Average Wait times (based on the timeline view):
				# ALL-NEXT: [0]: Executions
				# ALL-NEXT: [1]: Average time spent waiting in a scheduler's queue
				# ALL-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
				# ALL-NEXT: [3]: Average time elapsed from WB until retire stage

				# ALL: [0] [1] [2] [3]
				# ALL-NEXT: 0. 1 1.0 1.0 0.0 addq $44, 64(%r14)

				# NOALIAS-NEXT: 1. 1 2.0 1.0 0.0 addq $44, 128(%r14)
				# NOALIAS-NEXT: 2. 1 2.0 1.0 0.0 addq $44, 192(%r14)
				# NOALIAS-NEXT: 3. 1 3.0 1.0 0.0 addq $44, 256(%r14)
				# NOALIAS-NEXT: 4. 1 3.0 1.0 0.0 addq $44, 320(%r14)
				# NOALIAS-NEXT: 5. 1 4.0 1.0 0.0 addq $44, 384(%r14)
				# NOALIAS-NEXT: 6. 1 4.0 1.0 0.0 addq $44, 448(%r14)
				# NOALIAS-NEXT: 7. 1 5.0 1.0 0.0 addq $44, 512(%r14)
				# NOALIAS-NEXT: 8. 1 5.0 1.0 0.0 addq $44, 576(%r14)
				# NOALIAS-NEXT: 9. 1 6.0 1.0 0.0 addq $44, 640(%r14)
				# NOALIAS-NEXT: 1 3.5 1.0 0.0 <total>

				# YESALIAS-NEXT: 1. 1 8.0 0.0 0.0 addq $44, 128(%r14)
				# YESALIAS-NEXT: 2. 1 14.0 0.0 0.0 addq $44, 192(%r14)
				# YESALIAS-NEXT: 3. 1 21.0 0.0 0.0 addq $44, 256(%r14)
				# YESALIAS-NEXT: 4. 1 27.0 0.0 0.0 addq $44, 320(%r14)
				# YESALIAS-NEXT: 5. 1 34.0 0.0 0.0 addq $44, 384(%r14)
				# YESALIAS-NEXT: 6. 1 40.0 0.0 0.0 addq $44, 448(%r14)
				# YESALIAS-NEXT: 7. 1 47.0 0.0 0.0 addq $44, 512(%r14)
				# YESALIAS-NEXT: 8. 1 53.0 0.0 0.0 addq $44, 576(%r14)
				# YESALIAS-NEXT: 9. 1 60.0 0.0 0.0 addq $44, 640(%r14)
				# YESALIAS-NEXT: 1 30.5 0.1 0.0 <total>
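
In the timeline rows, `D` marks the dispatch cycle, `=` a cycle spent waiting in the scheduler's queue, `e` an executing cycle, `E` the cycle execution completes, and `R` the retire cycle. With a single timeline iteration, column [1] of the wait-time table in the SkylakeClient expectations above equals the number of `=` characters plus one. That would fit a wait measured from the dispatch cycle to the first executing cycle, though this reading is inferred from the data rather than stated in the revision. A small sketch, using the NOALIAS rows with their padding dots stripped (the helper name `queue_cycles` is illustrative):

```python
# Timeline strings for instructions [0,0]..[0,9], copied from the
# SkylakeClient NOALIAS expectations with the padding dots removed.
timeline = [
    "DeeeeeeeER",
    "D=eeeeeeeER",
    "D=eeeeeeeER",
    "D==eeeeeeeER",
    "D==eeeeeeeER",
    "D===eeeeeeeER",
    "D===eeeeeeeER",
    "D====eeeeeeeER",
    "D====eeeeeeeER",
    "D=====eeeeeeeER",
]

# Column [1] of the NOALIAS "Average Wait times" table, same order.
reported_wait = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 5.0, 6.0]


def queue_cycles(row: str) -> int:
    """Count the '=' cycles and add one for the dispatch-to-issue step."""
    return row.count("=") + 1


derived = [float(queue_cycles(row)) for row in timeline]
print(derived == reported_wait)
```

The same relation holds for the YESALIAS rows, where the long runs of `=` account for the much larger values in column [1].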

llvm/test/tools/llvm-mca/X86/SkylakeServer/independent-load-stores.s

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 -timeline -timeline-max-iterations=1 < %s | FileCheck %s -check-prefixes=ALL,NOALIAS
				# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=skylake-avx512 -timeline -timeline-max-iterations=1 -noalias=false < %s | FileCheck %s -check-prefixes=ALL,YESALIAS

				addq $44, 64(%r14)
				addq $44, 128(%r14)
				addq $44, 192(%r14)
				addq $44, 256(%r14)
				addq $44, 320(%r14)
				addq $44, 384(%r14)
				addq $44, 448(%r14)
				addq $44, 512(%r14)
				addq $44, 576(%r14)
				addq $44, 640(%r14)

				# ALL: Iterations: 100
				# ALL-NEXT: Instructions: 1000

				# NOALIAS-NEXT: Total Cycles: 1009
				# YESALIAS-NEXT: Total Cycles: 7003

				# ALL-NEXT: Total uOps: 3000

				# ALL: Dispatch Width: 6

				# NOALIAS-NEXT: uOps Per Cycle: 2.97
				# NOALIAS-NEXT: IPC: 0.99

				# YESALIAS-NEXT: uOps Per Cycle: 0.43
				# YESALIAS-NEXT: IPC: 0.14

				# ALL-NEXT: Block RThroughput: 10.0

				# ALL: Instruction Info:
				# ALL-NEXT: [1]: #uOps
				# ALL-NEXT: [2]: Latency
				# ALL-NEXT: [3]: RThroughput
				# ALL-NEXT: [4]: MayLoad
				# ALL-NEXT: [5]: MayStore
				# ALL-NEXT: [6]: HasSideEffects (U)

				# ALL: [1] [2] [3] [4] [5] [6] Instructions:
				# ALL-NEXT: 3 7 1.00 * * addq $44, 64(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 128(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 192(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 256(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 320(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 384(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 448(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 512(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 576(%r14)
				# ALL-NEXT: 3 7 1.00 * * addq $44, 640(%r14)

				# ALL: Resources:
				# ALL-NEXT: [0] - SKXDivider
				# ALL-NEXT: [1] - SKXFPDivider
				# ALL-NEXT: [2] - SKXPort0
				# ALL-NEXT: [3] - SKXPort1
				# ALL-NEXT: [4] - SKXPort2
				# ALL-NEXT: [5] - SKXPort3
				# ALL-NEXT: [6] - SKXPort4
				# ALL-NEXT: [7] - SKXPort5
				# ALL-NEXT: [8] - SKXPort6
				# ALL-NEXT: [9] - SKXPort7

				# ALL: Resource pressure per iteration:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
				# ALL-NEXT: - - 2.50 2.50 6.66 6.67 10.00 2.50 2.50 6.67

				# ALL: Resource pressure by instruction:
				# ALL-NEXT: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions:
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 64(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 128(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 192(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 256(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.66 1.00 - 0.50 0.67 addq $44, 320(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.67 1.00 0.50 - 0.66 addq $44, 384(%r14)
				# ALL-NEXT: - - - 0.50 0.66 0.67 1.00 - 0.50 0.67 addq $44, 448(%r14)
				# ALL-NEXT: - - 0.50 - 0.67 0.66 1.00 0.50 - 0.67 addq $44, 512(%r14)
				# ALL-NEXT: - - - 0.50 0.67 0.67 1.00 - 0.50 0.66 addq $44, 576(%r14)
				# ALL-NEXT: - - 0.50 - 0.66 0.67 1.00 0.50 - 0.67 addq $44, 640(%r14)

				# ALL: Timeline view:

				# NOALIAS-NEXT: 012345678
				# NOALIAS-NEXT: Index 0123456789

				# YESALIAS-NEXT: 0123456789 0123456789 0123456789 012
				# YESALIAS-NEXT: Index 0123456789 0123456789 0123456789 0123456789

				# NOALIAS: [0,0] DeeeeeeeER. . . addq $44, 64(%r14)
				# NOALIAS-NEXT: [0,1] D=eeeeeeeER . . addq $44, 128(%r14)
				# NOALIAS-NEXT: [0,2] .D=eeeeeeeER . . addq $44, 192(%r14)
				# NOALIAS-NEXT: [0,3] .D==eeeeeeeER . . addq $44, 256(%r14)
				# NOALIAS-NEXT: [0,4] . D==eeeeeeeER . . addq $44, 320(%r14)
				# NOALIAS-NEXT: [0,5] . D===eeeeeeeER. . addq $44, 384(%r14)
				# NOALIAS-NEXT: [0,6] . D===eeeeeeeER . addq $44, 448(%r14)
				# NOALIAS-NEXT: [0,7] . D====eeeeeeeER . addq $44, 512(%r14)
				# NOALIAS-NEXT: [0,8] . D====eeeeeeeER. addq $44, 576(%r14)
				# NOALIAS-NEXT: [0,9] . D=====eeeeeeeER addq $44, 640(%r14)

				# YESALIAS: [0,0] DeeeeeeeER. . . . . . . . . . . . . . addq $44, 64(%r14)
				# YESALIAS-NEXT: [0,1] D=======eeeeeeeER . . . . . . . . . . . . addq $44, 128(%r14)
				# YESALIAS-NEXT: [0,2] .D=============eeeeeeeER . . . . . . . . . . . addq $44, 192(%r14)
				# YESALIAS-NEXT: [0,3] .D====================eeeeeeeER . . . . . . . . . addq $44, 256(%r14)
				# YESALIAS-NEXT: [0,4] . D==========================eeeeeeeER . . . . . . . . addq $44, 320(%r14)
				# YESALIAS-NEXT: [0,5] . D=================================eeeeeeeER. . . . . . . addq $44, 384(%r14)
				# YESALIAS-NEXT: [0,6] . D=======================================eeeeeeeER . . . . . addq $44, 448(%r14)
				# YESALIAS-NEXT: [0,7] . D==============================================eeeeeeeER . . . . addq $44, 512(%r14)
				# YESALIAS-NEXT: [0,8] . D====================================================eeeeeeeER . . addq $44, 576(%r14)
				# YESALIAS-NEXT: [0,9] . D===========================================================eeeeeeeER addq $44, 640(%r14)

				# ALL: Average Wait times (based on the timeline view):
				# ALL-NEXT: [0]: Executions
				# ALL-NEXT: [1]: Average time spent waiting in a scheduler's queue
				# ALL-NEXT: [2]: Average time spent waiting in a scheduler's queue while ready
				# ALL-NEXT: [3]: Average time elapsed from WB until retire stage

				# ALL: [0] [1] [2] [3]
				# ALL-NEXT: 0. 1 1.0 1.0 0.0 addq $44, 64(%r14)

				# NOALIAS-NEXT: 1. 1 2.0 1.0 0.0 addq $44, 128(%r14)
				# NOALIAS-NEXT: 2. 1 2.0 1.0 0.0 addq $44, 192(%r14)
				# NOALIAS-NEXT: 3. 1 3.0 1.0 0.0 addq $44, 256(%r14)
				# NOALIAS-NEXT: 4. 1 3.0 1.0 0.0 addq $44, 320(%r14)
				# NOALIAS-NEXT: 5. 1 4.0 1.0 0.0 addq $44, 384(%r14)
				# NOALIAS-NEXT: 6. 1 4.0 1.0 0.0 addq $44, 448(%r14)
				# NOALIAS-NEXT: 7. 1 5.0 1.0 0.0 addq $44, 512(%r14)
				# NOALIAS-NEXT: 8. 1 5.0 1.0 0.0 addq $44, 576(%r14)
				# NOALIAS-NEXT: 9. 1 6.0 1.0 0.0 addq $44, 640(%r14)
				# NOALIAS-NEXT: 1 3.5 1.0 0.0 <total>

				# YESALIAS-NEXT: 1. 1 8.0 0.0 0.0 addq $44, 128(%r14)
				# YESALIAS-NEXT: 2. 1 14.0 0.0 0.0 addq $44, 192(%r14)
				# YESALIAS-NEXT: 3. 1 21.0 0.0 0.0 addq $44, 256(%r14)
				# YESALIAS-NEXT: 4. 1 27.0 0.0 0.0 addq $44, 320(%r14)
				# YESALIAS-NEXT: 5. 1 34.0 0.0 0.0 addq $44, 384(%r14)
				# YESALIAS-NEXT: 6. 1 40.0 0.0 0.0 addq $44, 448(%r14)
				# YESALIAS-NEXT: 7. 1 47.0 0.0 0.0 addq $44, 512(%r14)
				# YESALIAS-NEXT: 8. 1 53.0 0.0 0.0 addq $44, 576(%r14)
				# YESALIAS-NEXT: 9. 1 60.0 0.0 0.0 addq $44, 640(%r14)
				# YESALIAS-NEXT: 1 30.5 0.1 0.0 <total>
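
The `<total>` rows in the two wait-time tables above look like the plain arithmetic mean of each column over the ten `addq` instructions. The sketch below is a quick sanity check of that reading, with values copied from the CHECK lines. The helper name `column_means` is hypothetical, and this is not a claim about how llvm-mca computes the summary internally.

```python
def column_means(rows):
    """Return the arithmetic mean of each column, rounded to one decimal place."""
    count = len(rows)
    return [round(sum(col) / count, 1) for col in zip(*rows)]


# Columns [1], [2], [3] for instructions 0..9, copied from the CHECK lines.
# Row 0 is shared by both prefixes (the "ALL-NEXT: 0." line).
first_row = (1.0, 1.0, 0.0)

noalias_rows = [first_row] + [
    (wait, 1.0, 0.0) for wait in (2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 5.0, 5.0, 6.0)
]
yesalias_rows = [first_row] + [
    (wait, 0.0, 0.0) for wait in (8.0, 14.0, 21.0, 27.0, 34.0, 40.0, 47.0, 53.0, 60.0)
]

noalias_total = column_means(noalias_rows)
yesalias_total = column_means(yesalias_rows)

print("NOALIAS  <total>:", noalias_total)
print("YESALIAS <total>:", yesalias_total)
```

Under this reading, the YESALIAS average queue time of 30.5 cycles against 3.5 for NOALIAS summarises the serialisation visible in the timeline, where each aliasing `addq` waits for its predecessor.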

Revision Contents

Diff 262042

llvm/include/llvm/MCA/HardwareUnits/LSUnit.h

llvm/lib/MCA/HardwareUnits/LSUnit.cpp

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st1.s

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st2.s

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st3.s

llvm/test/tools/llvm-mca/AArch64/Exynos/asimd-st4.s

llvm/test/tools/llvm-mca/AArch64/Exynos/float-store.s

llvm/test/tools/llvm-mca/AArch64/Exynos/store.s

llvm/test/tools/llvm-mca/X86/Barcelona/load-store-throughput.s

llvm/test/tools/llvm-mca/X86/Barcelona/store-throughput.s

llvm/test/tools/llvm-mca/X86/BdVer2/load-store-throughput.s

llvm/test/tools/llvm-mca/X86/BdVer2/memcpy-like-test.s

llvm/test/tools/llvm-mca/X86/BdVer2/store-throughput.s

llvm/test/tools/llvm-mca/X86/BtVer2/independent-load-stores.s

llvm/test/tools/llvm-mca/X86/BtVer2/xadd.s

llvm/test/tools/llvm-mca/X86/Haswell/independent-load-stores.s

llvm/test/tools/llvm-mca/X86/SkylakeClient/independent-load-stores.s

llvm/test/tools/llvm-mca/X86/SkylakeServer/independent-load-stores.s
