Diff 345535

llvm/docs/CommandGuide/llvm-exegesis.rst

	Show First 20 Lines • Show All 183 Lines • ▼ Show 20 Lines
	Either `opcode-index`, `opcode-name` or `snippets-file` must be set.			Either `opcode-index`, `opcode-name` or `snippets-file` must be set.

	.. option:: -mode=[latency\|uops\|inverse_throughput\|analysis]			.. option:: -mode=[latency\|uops\|inverse_throughput\|analysis]

	Specify the run mode. Note that some modes have additional requirements and options.			Specify the run mode. Note that some modes have additional requirements and options.

	`latency` mode can be make use of either RDTSC or LBR.			`latency` mode can be make use of either RDTSC or LBR.
	`latency[LBR]` is only available on X86 (at least `Skylake`).			`latency[LBR]` is only available on X86 (at least `Skylake`).
	To run in `latency` mode, a positive value must be specified for `x86-lbr-sample-period` and `--repetition-mode=loop`.			To run in `latency` mode, a positive value must be specified
				for `x86-lbr-sample-period` and `--repetition-mode=loop`.

	In `analysis` mode, you also need to specify at least one of the			In `analysis` mode, you also need to specify at least one of the
	`-analysis-clusters-output-file=` and `-analysis-inconsistencies-output-file=`.			`-analysis-clusters-output-file=` and `-analysis-inconsistencies-output-file=`.

	.. option:: -x86-lbr-sample-period=<nBranches/sample>			.. option:: -x86-lbr-sample-period=<nBranches/sample>

	Specify the LBR sampling period - how many branches before we take a sample.			Specify the LBR sampling period - how many branches before we take a sample.
	When a positive value is specified for this option and when the mode is `latency`,			When a positive value is specified for this option and when the mode is `latency`,
	we will use LBRs for measuring.			we will use LBRs for measuring.
	On choosing the "right" sampling period, a small value is preferred, but throttling			On choosing the "right" sampling period, a small value is preferred, but throttling
	could occur if the sampling is too frequent. A prime number should be used to			could occur if the sampling is too frequent. A prime number should be used to
	avoid consistently skipping certain blocks.			avoid consistently skipping certain blocks.

	.. option:: -repetition-mode=[duplicate\|loop\|min]			.. option:: -repetition-mode=[duplicate\|loop\|min]

	Specify the repetition mode. `duplicate` will create a large, straight line			Specify the repetition mode. `duplicate` will create a large, straight line
	basic block with `num-repetitions` copies of the snippet. `loop` will wrap			basic block with `num-repetitions` instructions (repeating the snippet
	the snippet in a loop which will be run `num-repetitions` times. The `loop`			`num-repetitions`/`snippet size` times). `loop` will, optionally, duplicate the
	mode tends to better hide the effects of the CPU frontend on architectures			snippet until the loop body contains at least `loop-body-size` instructions,
				and then wrap the result in a loop which will execute `num-repetitions`
				instructions (thus, again, repeating the snippet
				`num-repetitions`/`snippet size` times). The `loop` mode, especially with loop
				unrolling tends to better hide the effects of the CPU frontend on architectures
	that cache decoded instructions, but consumes a register for counting			that cache decoded instructions, but consumes a register for counting
	iterations. If performing an analysis over many opcodes, it may be best			iterations. If performing an analysis over many opcodes, it may be best to
	to instead use the `min` mode, which will run each other mode, and produce			instead use the `min` mode, which will run each other mode,
	the minimal measured result.			and produce the minimal measured result.

	.. option:: -num-repetitions=<Number of repetitions>			.. option:: -num-repetitions=<Number of repetitions>

	Specify the number of repetitions of the asm snippet.			Specify the target number of executed instructions. Note that the actual
				repetition count of the snippet will be `num-repetitions`/`snippet size`.
	Higher values lead to more accurate measurements but lengthen the benchmark.			Higher values lead to more accurate measurements but lengthen the benchmark.

				.. option:: -loop-body-size=<Preferred loop body size>

				Only effective for `-repetition-mode=[loop\|min]`.
				Instead of looping over the snippet directly, first duplicate it so that the
				loop body contains at least this many instructions. This potentially results
				in loop body being cached in the CPU Op Cache / Loop Cache, which allows to
				which may have higher throughput than the CPU decoders.

	.. option:: -max-configs-per-opcode=<value>			.. option:: -max-configs-per-opcode=<value>

	Specify the maximum configurations that can be generated for each opcode.			Specify the maximum configurations that can be generated for each opcode.
	By default this is `1`, meaning that we assume that a single measurement is			By default this is `1`, meaning that we assume that a single measurement is
	enough to characterize an opcode. This might not be true of all instructions:			enough to characterize an opcode. This might not be true of all instructions:
	for example, the performance characteristics of the LEA instruction on X86			for example, the performance characteristics of the LEA instruction on X86
	depends on the value of assigned registers and immediates. Setting a value of			depends on the value of assigned registers and immediates. Setting a value of
	`-max-configs-per-opcode` larger than `1` allows `llvm-exegesis` to explore			`-max-configs-per-opcode` larger than `1` allows `llvm-exegesis` to explore
	▲ Show 20 Lines • Show All 68 Lines • Show Last 20 Lines

llvm/tools/llvm-exegesis/lib/BenchmarkResult.h

Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	struct InstructionBenchmark {
enum ModeE { Unknown, Latency, Uops, InverseThroughput };		enum ModeE { Unknown, Latency, Uops, InverseThroughput };
ModeE Mode;		ModeE Mode;
std::string CpuName;		std::string CpuName;
std::string LLVMTriple;		std::string LLVMTriple;
// Which instruction is being benchmarked here?		// Which instruction is being benchmarked here?
const MCInst &keyInstruction() const { return Key.Instructions[0]; }		const MCInst &keyInstruction() const { return Key.Instructions[0]; }
// The number of instructions inside the repeated snippet. For example, if a		// The number of instructions inside the repeated snippet. For example, if a
// snippet of 3 instructions is repeated 4 times, this is 12.		// snippet of 3 instructions is repeated 4 times, this is 12.
int NumRepetitions = 0;		unsigned NumRepetitions = 0;
enum RepetitionModeE { Duplicate, Loop, AggregateMin };		enum RepetitionModeE { Duplicate, Loop, AggregateMin };
// Note that measurements are per instruction.		// Note that measurements are per instruction.
std::vector<BenchmarkMeasure> Measurements;		std::vector<BenchmarkMeasure> Measurements;
std::string Error;		std::string Error;
std::string Info;		std::string Info;
std::vector<uint8_t> AssembledSnippet;		std::vector<uint8_t> AssembledSnippet;
// How to aggregate measurements.		// How to aggregate measurements.
enum ResultAggregationModeE { Min, Max, Mean, MinVariance };		enum ResultAggregationModeE { Min, Max, Mean, MinVariance };
▲ Show 20 Lines • Show All 44 Lines • Show Last 20 Lines

llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h

	Show All 35 Lines
	public:			public:
	explicit BenchmarkRunner(const LLVMState &State,			explicit BenchmarkRunner(const LLVMState &State,
	InstructionBenchmark::ModeE Mode);			InstructionBenchmark::ModeE Mode);

	virtual ~BenchmarkRunner();			virtual ~BenchmarkRunner();

	Expected<InstructionBenchmark>			Expected<InstructionBenchmark>
	runConfiguration(const BenchmarkCode &Configuration, unsigned NumRepetitions,			runConfiguration(const BenchmarkCode &Configuration, unsigned NumRepetitions,
				unsigned LoopUnrollFactor,
	ArrayRef<std::unique_ptr<const SnippetRepetitor>> Repetitors,			ArrayRef<std::unique_ptr<const SnippetRepetitor>> Repetitors,
	bool DumpObjectToDisk) const;			bool DumpObjectToDisk) const;

	// Scratch space to run instructions that touch memory.			// Scratch space to run instructions that touch memory.
	struct ScratchSpace {			struct ScratchSpace {
	static constexpr const size_t kAlignment = 1024;			static constexpr const size_t kAlignment = 1024;
	static constexpr const size_t kSize = 1 << 20; // 1MB.			static constexpr const size_t kSize = 1 << 20; // 1MB.
	ScratchSpace()			ScratchSpace()
	▲ Show 20 Lines • Show All 42 Lines • Show Last 20 Lines

llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp

Show First 20 Lines • Show All 127 Lines • ▼ Show 20 Lines	#endif

const LLVMState &State;		const LLVMState &State;
const ExecutableFunction Function;		const ExecutableFunction Function;
BenchmarkRunner::ScratchSpace *const Scratch;		BenchmarkRunner::ScratchSpace *const Scratch;
};		};
} // namespace		} // namespace

Expected<InstructionBenchmark> BenchmarkRunner::runConfiguration(		Expected<InstructionBenchmark> BenchmarkRunner::runConfiguration(
const BenchmarkCode &BC, unsigned NumRepetitions,		const BenchmarkCode &BC, unsigned NumRepetitions, unsigned LoopBodySize,
ArrayRef<std::unique_ptr<const SnippetRepetitor>> Repetitors,		ArrayRef<std::unique_ptr<const SnippetRepetitor>> Repetitors,
bool DumpObjectToDisk) const {		bool DumpObjectToDisk) const {
InstructionBenchmark InstrBenchmark;		InstructionBenchmark InstrBenchmark;
InstrBenchmark.Mode = Mode;		InstrBenchmark.Mode = Mode;
InstrBenchmark.CpuName = std::string(State.getTargetMachine().getTargetCPU());		InstrBenchmark.CpuName = std::string(State.getTargetMachine().getTargetCPU());
InstrBenchmark.LLVMTriple =		InstrBenchmark.LLVMTriple =
State.getTargetMachine().getTargetTriple().normalize();		State.getTargetMachine().getTargetTriple().normalize();
InstrBenchmark.NumRepetitions = NumRepetitions;		InstrBenchmark.NumRepetitions = NumRepetitions;
Show All 18 Lines	private:
bool Clear = true;		bool Clear = true;
};		};
ClearBenchmarkOnReturn CBOR(&InstrBenchmark);		ClearBenchmarkOnReturn CBOR(&InstrBenchmark);

for (const std::unique_ptr<const SnippetRepetitor> &Repetitor : Repetitors) {		for (const std::unique_ptr<const SnippetRepetitor> &Repetitor : Repetitors) {
// Assemble at least kMinInstructionsForSnippet instructions by repeating		// Assemble at least kMinInstructionsForSnippet instructions by repeating
// the snippet for debug/analysis. This is so that the user clearly		// the snippet for debug/analysis. This is so that the user clearly
// understands that the inside instructions are repeated.		// understands that the inside instructions are repeated.
constexpr const int kMinInstructionsForSnippet = 16;		const int kMinInstructionsForSnippet = 4 * Instructions.size();
		Lint (pre-merge checks), clang-tidy: warning: invalid case style for variable 'kMinInstructionsForSnippet' [readability-identifier-naming]
		courbet: [style] This is no longer a constant, you can remove the `k`.
		const int kLoopBodySizeForSnippet = 2 * Instructions.size();
		Lint (pre-merge checks), clang-tidy: warning: invalid case style for variable 'kLoopBodySizeForSnippet' [readability-identifier-naming]
		courbet: ditto
		lebedev.ri (author): So if i pass `LoopBodySize` into constructor of `LoopSnippetRepetitor`, how can i adjust it here then?
		courbet: Right, I don't have a good suggestion... Let's keep it like this.
{		{
SmallString<0> Buffer;		SmallString<0> Buffer;
raw_svector_ostream OS(Buffer);		raw_svector_ostream OS(Buffer);
if (Error E = assembleToStream(		if (Error E = assembleToStream(
State.getExegesisTarget(), State.createTargetMachine(),		State.getExegesisTarget(), State.createTargetMachine(),
BC.LiveIns, BC.Key.RegisterInitialValues,		BC.LiveIns, BC.Key.RegisterInitialValues,
Repetitor->Repeat(Instructions, kMinInstructionsForSnippet),		Repetitor->Repeat(Instructions, kMinInstructionsForSnippet,
		kLoopBodySizeForSnippet),
OS)) {		OS)) {
return std::move(E);		return std::move(E);
}		}
const ExecutableFunction EF(State.createTargetMachine(),		const ExecutableFunction EF(State.createTargetMachine(),
getObjectFromBuffer(OS.str()));		getObjectFromBuffer(OS.str()));
const auto FnBytes = EF.getFunctionBytes();		const auto FnBytes = EF.getFunctionBytes();
llvm::append_range(InstrBenchmark.AssembledSnippet, FnBytes);		llvm::append_range(InstrBenchmark.AssembledSnippet, FnBytes);
}		}

// Assemble NumRepetitions instructions repetitions of the snippet for		// Assemble NumRepetitions instructions repetitions of the snippet for
// measurements.		// measurements.
const auto Filler =		const auto Filler = Repetitor->Repeat(
Repetitor->Repeat(Instructions, InstrBenchmark.NumRepetitions);		Instructions, InstrBenchmark.NumRepetitions, LoopBodySize);

object::OwningBinary<object::ObjectFile> ObjectFile;		object::OwningBinary<object::ObjectFile> ObjectFile;
if (DumpObjectToDisk) {		if (DumpObjectToDisk) {
auto ObjectFilePath = writeObjectFile(BC, Filler);		auto ObjectFilePath = writeObjectFile(BC, Filler);
if (Error E = ObjectFilePath.takeError()) {		if (Error E = ObjectFilePath.takeError()) {
InstrBenchmark.Error = toString(std::move(E));		InstrBenchmark.Error = toString(std::move(E));
return InstrBenchmark;		return InstrBenchmark;
}		}
▲ Show 20 Lines • Show All 80 Lines • Show Last 20 Lines

llvm/tools/llvm-exegesis/lib/SnippetRepetitor.h

Show All 33 Lines	public:
virtual ~SnippetRepetitor();		virtual ~SnippetRepetitor();

// Returns the set of registers that are reserved by the repetitor.		// Returns the set of registers that are reserved by the repetitor.
virtual BitVector getReservedRegs() const = 0;		virtual BitVector getReservedRegs() const = 0;

// Returns a functor that repeats `Instructions` so that the function executes		// Returns a functor that repeats `Instructions` so that the function executes
// at least `MinInstructions` instructions.		// at least `MinInstructions` instructions.
virtual FillFunction Repeat(ArrayRef<MCInst> Instructions,		virtual FillFunction Repeat(ArrayRef<MCInst> Instructions,
unsigned MinInstructions) const = 0;		unsigned MinInstructions,
		unsigned LoopBodySize) const = 0;
		courbet: Adding the `LoopBodySize` here sort of breaks the `SnippetRepetitor` abstraction. I think `LoopBodySize` should be a member of `LoopSnippetRepetitor`, initialized in the constructor.

explicit SnippetRepetitor(const LLVMState &State) : State(State) {}		explicit SnippetRepetitor(const LLVMState &State) : State(State) {}

protected:		protected:
const LLVMState &State;		const LLVMState &State;
};		};

} // namespace exegesis		} // namespace exegesis
} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp

	//===-- SnippetRepetitor.cpp ------------------------------------- C++ --===//			//===-- SnippetRepetitor.cpp ------------------------------------- C++ --===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include <array>			#include <array>
	#include <string>			#include <string>

	#include "SnippetRepetitor.h"			#include "SnippetRepetitor.h"
	#include "Target.h"			#include "Target.h"
				#include "llvm/ADT/Sequence.h"
	#include "llvm/CodeGen/TargetInstrInfo.h"			#include "llvm/CodeGen/TargetInstrInfo.h"
	#include "llvm/CodeGen/TargetSubtargetInfo.h"			#include "llvm/CodeGen/TargetSubtargetInfo.h"

	namespace llvm {			namespace llvm {
	namespace exegesis {			namespace exegesis {
	namespace {			namespace {

	class DuplicateSnippetRepetitor : public SnippetRepetitor {			class DuplicateSnippetRepetitor : public SnippetRepetitor {
	public:			public:
	using SnippetRepetitor::SnippetRepetitor;			using SnippetRepetitor::SnippetRepetitor;

	// Repeats the snippet until there are at least MinInstructions in the			// Repeats the snippet until there are at least MinInstructions in the
	// resulting code.			// resulting code.
	FillFunction Repeat(ArrayRef<MCInst> Instructions,			FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,
	unsigned MinInstructions) const override {			unsigned LoopBodySize) const override {
	return [Instructions, MinInstructions](FunctionFiller &Filler) {			return [Instructions, MinInstructions](FunctionFiller &Filler) {
	auto Entry = Filler.getEntry();			auto Entry = Filler.getEntry();
	if (!Instructions.empty()) {			if (!Instructions.empty()) {
	// Add the whole snippet at least once.			// Add the whole snippet at least once.
	Entry.addInstructions(Instructions);			Entry.addInstructions(Instructions);
	for (unsigned I = Instructions.size(); I < MinInstructions; ++I) {			for (unsigned I = Instructions.size(); I < MinInstructions; ++I) {
	Entry.addInstruction(Instructions[I % Instructions.size()]);			Entry.addInstruction(Instructions[I % Instructions.size()]);
	}			}
	Show All 11 Lines
	class LoopSnippetRepetitor : public SnippetRepetitor {			class LoopSnippetRepetitor : public SnippetRepetitor {
	public:			public:
	explicit LoopSnippetRepetitor(const LLVMState &State)			explicit LoopSnippetRepetitor(const LLVMState &State)
	: SnippetRepetitor(State),			: SnippetRepetitor(State),
	LoopCounter(State.getExegesisTarget().getLoopCounterRegister(			LoopCounter(State.getExegesisTarget().getLoopCounterRegister(
	State.getTargetMachine().getTargetTriple())) {}			State.getTargetMachine().getTargetTriple())) {}

	// Loop over the snippet ceil(MinInstructions / Instructions.Size()) times.			// Loop over the snippet ceil(MinInstructions / Instructions.Size()) times.
	FillFunction Repeat(ArrayRef<MCInst> Instructions,			FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,
	unsigned MinInstructions) const override {			unsigned LoopBodySize) const override {
	return [this, Instructions, MinInstructions](FunctionFiller &Filler) {			return [this, Instructions, MinInstructions,
				LoopBodySize](FunctionFiller &Filler) {
	const auto &ET = State.getExegesisTarget();			const auto &ET = State.getExegesisTarget();
	auto Entry = Filler.getEntry();			auto Entry = Filler.getEntry();
	auto Loop = Filler.addBasicBlock();			auto Loop = Filler.addBasicBlock();
	auto Exit = Filler.addBasicBlock();			auto Exit = Filler.addBasicBlock();

				const unsigned LoopUnrollFactor =
				LoopBodySize <= Instructions.size()
				? 1
				: divideCeil(LoopBodySize, Instructions.size());
				assert(LoopUnrollFactor >= 1 && "Should end up with at least 1 snippet.");

	// Set loop counter to the right value:			// Set loop counter to the right value:
	const APInt LoopCount(32, (MinInstructions + Instructions.size() - 1) /			const APInt LoopCount(
	Instructions.size());			32,
				divideCeil(MinInstructions, LoopUnrollFactor * Instructions.size()));
				assert(LoopCount.uge(1) && "Trip count should be at least 1.");
	for (const MCInst &Inst :			for (const MCInst &Inst :
	ET.setRegTo(State.getSubtargetInfo(), LoopCounter, LoopCount))			ET.setRegTo(State.getSubtargetInfo(), LoopCounter, LoopCount))
	Entry.addInstruction(Inst);			Entry.addInstruction(Inst);

	// Set up the loop basic block.			// Set up the loop basic block.
	Entry.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne());			Entry.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne());
	Loop.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne());			Loop.MBB->addSuccessor(Loop.MBB, BranchProbability::getOne());
	// The live ins are: the loop counter, the registers that were setup by			// The live ins are: the loop counter, the registers that were setup by
	// the entry block, and entry block live ins.			// the entry block, and entry block live ins.
	Loop.MBB->addLiveIn(LoopCounter);			Loop.MBB->addLiveIn(LoopCounter);
	for (unsigned Reg : Filler.getRegistersSetUp())			for (unsigned Reg : Filler.getRegistersSetUp())
	Loop.MBB->addLiveIn(Reg);			Loop.MBB->addLiveIn(Reg);
	for (const auto &LiveIn : Entry.MBB->liveins())			for (const auto &LiveIn : Entry.MBB->liveins())
	Loop.MBB->addLiveIn(LiveIn);			Loop.MBB->addLiveIn(LiveIn);
				for (auto _ : seq(0U, LoopUnrollFactor)) {
				(void)_;
	Loop.addInstructions(Instructions);			Loop.addInstructions(Instructions);
				}
	ET.decrementLoopCounterAndJump(Loop.MBB, Loop.MBB,			ET.decrementLoopCounterAndJump(Loop.MBB, Loop.MBB,
	State.getInstrInfo());			State.getInstrInfo());

	// Set up the exit basic block.			// Set up the exit basic block.
	Loop.MBB->addSuccessor(Exit.MBB, BranchProbability::getZero());			Loop.MBB->addSuccessor(Exit.MBB, BranchProbability::getZero());
	Exit.addReturn();			Exit.addReturn();
	};			};
	}			}
	Show All 31 Lines
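
For readers following the `DuplicateSnippetRepetitor::Repeat` hunk earlier in this file, here is a minimal standalone sketch of the fill order it produces: the whole snippet once, then instructions appended round-robin until `MinInstructions` is reached. The function name `duplicateFill` and the use of strings in place of `MCInst` are illustrative assumptions, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the duplicate-mode fill logic shown in the diff.
// Strings replace MCInst so the sketch compiles without LLVM.
std::vector<std::string> duplicateFill(const std::vector<std::string> &Snippet,
                                       unsigned MinInstructions) {
  std::vector<std::string> Out;
  if (Snippet.empty())
    return Out; // Nothing to repeat.
  // Add the whole snippet at least once, even if MinInstructions is smaller.
  Out = Snippet;
  // Then keep appending instructions in order, wrapping around the snippet.
  for (std::size_t I = Snippet.size(); I < MinInstructions; ++I)
    Out.push_back(Snippet[I % Snippet.size()]);
  return Out;
}
```

With a 3-instruction snippet and `MinInstructions = 7`, this yields 7 instructions, so the last copy of the snippet is cut off after its first instruction. With `MinInstructions` smaller than the snippet, the full snippet is still emitted once.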

llvm/tools/llvm-exegesis/llvm-exegesis.cpp

Show First 20 Lines • Show All 110 Lines • ▼ Show 20 Lines	cl::values(
"All of the above and take the minimum of measurements")),		"All of the above and take the minimum of measurements")),
cl::init(exegesis::InstructionBenchmark::Duplicate));		cl::init(exegesis::InstructionBenchmark::Duplicate));

static cl::opt<unsigned>		static cl::opt<unsigned>
NumRepetitions("num-repetitions",		NumRepetitions("num-repetitions",
cl::desc("number of time to repeat the asm snippet"),		cl::desc("number of time to repeat the asm snippet"),
cl::cat(BenchmarkOptions), cl::init(10000));		cl::cat(BenchmarkOptions), cl::init(10000));

		static cl::opt<unsigned>
		LoopBodySize("loop-body-size",
		cl::desc("when repeating the instruction snippet by looping "
		"over it, duplicate the snippet until the loop body "
		"contains at least this many instruction"),
		cl::cat(BenchmarkOptions), cl::init(0));

static cl::opt<unsigned> MaxConfigsPerOpcode(		static cl::opt<unsigned> MaxConfigsPerOpcode(
"max-configs-per-opcode",		"max-configs-per-opcode",
cl::desc(		cl::desc(
"allow to snippet generator to generate at most that many configs"),		"allow to snippet generator to generate at most that many configs"),
cl::cat(BenchmarkOptions), cl::init(1));		cl::cat(BenchmarkOptions), cl::init(1));

static cl::opt<bool> IgnoreInvalidSchedClass(		static cl::opt<bool> IgnoreInvalidSchedClass(
"ignore-invalid-sched-class",		"ignore-invalid-sched-class",
▲ Show 20 Lines • Show All 233 Lines • ▼ Show 20 Lines	#endif
}		}

// Write to standard output if file is not set.		// Write to standard output if file is not set.
if (BenchmarkFile.empty())		if (BenchmarkFile.empty())
BenchmarkFile = "-";		BenchmarkFile = "-";

for (const BenchmarkCode &Conf : Configurations) {		for (const BenchmarkCode &Conf : Configurations) {
InstructionBenchmark Result = ExitOnErr(Runner->runConfiguration(		InstructionBenchmark Result = ExitOnErr(Runner->runConfiguration(
Conf, NumRepetitions, Repetitors, DumpObjectToDisk));		Conf, NumRepetitions, LoopBodySize, Repetitors, DumpObjectToDisk));
ExitOnFileError(BenchmarkFile, Result.writeYaml(State, BenchmarkFile));		ExitOnFileError(BenchmarkFile, Result.writeYaml(State, BenchmarkFile));
}		}
exegesis::pfm::pfmTerminate();		exegesis::pfm::pfmTerminate();
}		}

// Prints the results of running analysis pass `Pass` to file `OutputFilename`		// Prints the results of running analysis pass `Pass` to file `OutputFilename`
// if OutputFilename is non-empty.		// if OutputFilename is non-empty.
template <typename Pass>		template <typename Pass>
▲ Show 20 Lines • Show All 92 Lines • Show Last 20 Lines

llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp

Show All 36 Lines	void SetUp() override {
MMI = std::make_unique<MachineModuleInfo>(TM.get());		MMI = std::make_unique<MachineModuleInfo>(TM.get());
MF = &createVoidVoidPtrMachineFunction("TestFn", Mod.get(), MMI.get());		MF = &createVoidVoidPtrMachineFunction("TestFn", Mod.get(), MMI.get());
}		}

void TestCommon(InstructionBenchmark::RepetitionModeE RepetitionMode) {		void TestCommon(InstructionBenchmark::RepetitionModeE RepetitionMode) {
const auto Repetitor = SnippetRepetitor::Create(RepetitionMode, State);		const auto Repetitor = SnippetRepetitor::Create(RepetitionMode, State);
const std::vector<MCInst> Instructions = {MCInstBuilder(X86::NOOP)};		const std::vector<MCInst> Instructions = {MCInstBuilder(X86::NOOP)};
FunctionFiller Sink(*MF, {X86::EAX});		FunctionFiller Sink(*MF, {X86::EAX});
const auto Fill = Repetitor->Repeat(Instructions, kMinInstructions);		const auto Fill =
		Repetitor->Repeat(Instructions, kMinInstructions, kLoopBodySize);
Fill(Sink);		Fill(Sink);
}		}

static constexpr const unsigned kMinInstructions = 3;		static constexpr const unsigned kMinInstructions = 3;
		static constexpr const unsigned kLoopBodySize = 5;
		Lint (pre-merge checks), clang-tidy: warning: invalid case style for variable 'kLoopBodySize' [readability-identifier-naming]

std::unique_ptr<LLVMTargetMachine> TM;		std::unique_ptr<LLVMTargetMachine> TM;
std::unique_ptr<LLVMContext> Context;		std::unique_ptr<LLVMContext> Context;
std::unique_ptr<Module> Mod;		std::unique_ptr<Module> Mod;
std::unique_ptr<MachineModuleInfo> MMI;		std::unique_ptr<MachineModuleInfo> MMI;
MachineFunction *MF = nullptr;		MachineFunction *MF = nullptr;
};		};

Show All 15 Lines
}		}

TEST_F(X86SnippetRepetitorTest, Loop) {		TEST_F(X86SnippetRepetitorTest, Loop) {
TestCommon(InstructionBenchmark::Loop);		TestCommon(InstructionBenchmark::Loop);
// Duplicating creates an entry block, a loop body and a ret block.		// Duplicating creates an entry block, a loop body and a ret block.
ASSERT_EQ(MF->getNumBlockIDs(), 3u);		ASSERT_EQ(MF->getNumBlockIDs(), 3u);
const auto &LoopBlock = *MF->getBlockNumbered(1);		const auto &LoopBlock = *MF->getBlockNumbered(1);
EXPECT_THAT(LoopBlock.instrs(),		EXPECT_THAT(LoopBlock.instrs(),
ElementsAre(HasOpcode(X86::NOOP), HasOpcode(X86::ADD64ri8),		ElementsAre(HasOpcode(X86::NOOP), HasOpcode(X86::NOOP),
		HasOpcode(X86::NOOP), HasOpcode(X86::NOOP),
		HasOpcode(X86::NOOP), HasOpcode(X86::ADD64ri8),
HasOpcode(X86::JCC_1)));		HasOpcode(X86::JCC_1)));
EXPECT_THAT(LoopBlock.liveins(),		EXPECT_THAT(LoopBlock.liveins(),
UnorderedElementsAre(		UnorderedElementsAre(
LiveReg(X86::EAX),		LiveReg(X86::EAX),
LiveReg(State.getExegesisTarget().getLoopCounterRegister(		LiveReg(State.getExegesisTarget().getLoopCounterRegister(
State.getTargetMachine().getTargetTriple()))));		State.getTargetMachine().getTargetTriple()))));
EXPECT_THAT(MF->getBlockNumbered(2)->instrs(),		EXPECT_THAT(MF->getBlockNumbered(2)->instrs(),
ElementsAre(HasOpcode(X86::RETQ)));		ElementsAre(HasOpcode(X86::RETQ)));
}		}

} // namespace		} // namespace
} // namespace exegesis		} // namespace exegesis
} // namespace llvm		} // namespace llvm

This is an archive of the discontinued LLVM Phabricator instance.

[llvm-exegesis] Loop unrolling for loop snippet repetitor mode
ClosedPublic
