This is an archive of the discontinued LLVM Phabricator instance.

Differential D60401

[llvm-exegesis] When generating templates with chained instructions, also add templates for helper instructions
Needs Review, Public

Authored by lebedev.ri on Apr 8 2019, 2:18 AM.


Details

Reviewers

courbet
gchatelet

Summary

To measure characteristics of instruction Instr, we sometimes need to
chain it together with some other instructions. But that gives us
collective characteristics of those instructions combined.

Initially, in D60000, i have proposed to recover the actual characteristics
of the actual target instruction by using the LLVM scheduling data of
the helper instructions. But it was pointed out that we should not depend
on a-priori data, and use measurements only.

But for that, we need to not only measure the target instruction,
but also measure the instructions that were chained to the target instruction.
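
As a rough illustration of the post-processing this implies (a sketch only, assuming the latencies along a serial chain simply add up, which the review does not state; `recoverTargetLatency` is a hypothetical helper, not part of llvm-exegesis):

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not an llvm-exegesis API.
// Assumption: the latency of a serial chain is the sum of the latencies of
// the instructions in it, so the target's share is what is left after
// subtracting the separately measured helper latencies.
double recoverTargetLatency(double ChainLatency,
                            const std::vector<double> &HelperLatencies) {
  assert(ChainLatency >= 0.0 && "latencies are expected to be non-negative");
  const double HelperTotal =
      std::accumulate(HelperLatencies.begin(), HelperLatencies.end(), 0.0);
  return ChainLatency - HelperTotal;
}
```

For example, a chain measured at 3 cycles with a single helper measured at 1 cycle would attribute 2 cycles to the target instruction under this assumption.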

I don't believe those measurements should be done by hand.
If i want to measure the latency of an instruction that can't be executed serially,
at most, i'm willing to run analysis mode on the *automated* measurements;
i don't really want to look at which instructions llvm-exegesis has used to
serialize execution. But maybe that is just me?

Diff Detail

Repository: rL LLVM

Event Timeline

lebedev.ri created this revision. Apr 8 2019, 2:18 AM

Herald added a subscriber: tschuett. Apr 8 2019, 2:18 AM

Note that this won't really help -mode=uops, since there the snippet is just the same opcode plus randomness:

$ ./bin/llvm-exegesis -mode=uops -opcode-name=CMOV16rr -benchmarks-file=-
Check generated assembly with: /usr/bin/objdump -d /tmp/snippet-46b66a.o
---
mode:            uops
key:             
  instructions:    
    - 'CMOV16rr AX AX R12W i_0x2'
    - 'CMOV16rr BP BP DX i_0xe'
    - 'CMOV16rr BX BX R10W i_0x3'
    - 'CMOV16rr CX CX BX i_0x1'
    - 'CMOV16rr DI DI R13W i_0x4'
    - 'CMOV16rr DX DX R14W i_0xc'
    - 'CMOV16rr SI SI R10W i_0x9'
    - 'CMOV16rr R8W R8W SI i_0x2'
    - 'CMOV16rr R9W R9W DI i_0x6'
    - 'CMOV16rr R10W R10W BP i_0xc'
    - 'CMOV16rr R11W R11W BX i_0xd'
    - 'CMOV16rr R12W R12W R15W i_0xe'
    - 'CMOV16rr R13W R13W DI i_0xc'
    - 'CMOV16rr R14W R14W R14W i_0xf'
    - 'CMOV16rr R15W R15W R13W i_0x4'
  config:          ''
  register_initial_values: 
    - 'AX=0x0'
    - 'R12W=0x0'
    - 'EFLAGS=0x0'
    - 'BP=0x0'
    - 'DX=0x0'
    - 'BX=0x0'
    - 'R10W=0x0'
    - 'CX=0x0'
    - 'DI=0x0'
    - 'R13W=0x0'
    - 'R14W=0x0'
    - 'SI=0x0'
    - 'R8W=0x0'
    - 'R9W=0x0'
    - 'R11W=0x0'
    - 'R15W=0x0'
cpu_name:        bdver2
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 10000
measurements:    
  - { key: PdFPU0, value: 0, per_snippet_value: 0 }
  - { key: PdFPU1, value: 0, per_snippet_value: 0 }
  - { key: PdFPU2, value: 0, per_snippet_value: 0 }
  - { key: PdFPU3, value: 0, per_snippet_value: 0 }
  - { key: NumMicroOps, value: 1.0135, per_snippet_value: 15.2025 }
error:           ''
info:            instruction has tied variables, using static renaming.
assembled_snippet: 5541574156415541545366B800006641BC00004883EC08C7042400000000C7442404000000009D66BD000066BA000066BB00006641BA000066B9000066BF00006641BD00006641BE000066BE00006641B800006641B900006641BB00006641BF000066410F42C4660F4EEA66410F43DA660F41CB66410F44FD66410F4CD666410F49F266440F42C666440F46CF66440F4CD566440F4DDB66450F4EE766440F4CEF66450F4FF666450F44FD66410F42C45B415C415D415E415F5DC3
...
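
In the output above, per_snippet_value appears to be the per-instruction value scaled by the number of instructions in the snippet (15 here). That reading is inferred from the numbers shown, not stated in the review. A small sketch of the arithmetic, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an llvm-exegesis API.
// Assumption: the per-snippet figure is the per-instruction figure multiplied
// by the number of instructions in the snippet.
double perSnippetValue(double PerInstructionValue,
                       std::size_t NumInstructions) {
  assert(NumInstructions > 0 && "a snippet has at least one instruction");
  return PerInstructionValue * static_cast<double>(NumInstructions);
}
```

With the NumMicroOps value of 1.0135 and the 15 CMOV16rr instructions listed, this gives approximately 15.2025, matching the reported figure.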

I understand the motivation behind this change, but I think we need a more principled approach. I'll try to sum up my reasoning here.
To me there are two useful modes for llvm-exegesis:

- We want to look at a particular instruction's latency: useful when we want to quickly check an assumption.
- We want to get the latency for all the instructions: useful to fully characterize or check a processor.

The analysis tool only makes sense for the second bullet.

For latency analysis, an instruction falls into one of the following benchmarking modes:

1. Infeasible (privileged instructions, inadequate control flow, e.g. HLT)
2. Measurable in isolation
3. Measurable through another instruction

For 2, we want to explore all the dimensions of the instruction:

- Impact of choosing the same register several times (XOR EAX, EAX, EAX) or different ones (XOR EAX, EBX, ECX)
- Impact of choosing special values for immediates (IMUL EAX, EAX, 0), or special values for floating point numbers (sNaN, qNaN, ±∞, ±0, normal and denormal)
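
To make the floating point dimension concrete, here is a minimal sketch of how such a set of special values could be enumerated in standard C++. The function name is hypothetical, and nothing here reflects how llvm-exegesis actually picks operand values:

```cpp
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical helper: one representative per class of "special" double.
// Order: sNaN, qNaN, +inf, -inf, +0, -0, a normal value, a denormal value.
std::vector<double> specialDoubleValues() {
  using Limits = std::numeric_limits<double>;
  static_assert(Limits::is_iec559, "sketch assumes IEEE 754 doubles");
  return {Limits::signaling_NaN(),
          Limits::quiet_NaN(),
          Limits::infinity(),
          -Limits::infinity(),
          0.0,
          -0.0,
          1.0,
          Limits::denorm_min()};
}
```

Each of these could then be tried as an initial register or immediate value to see whether the measured latency changes.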

For 3, this is 2 combined with a second instruction. The paired instruction is another dimension to explore. Because we're mostly interested in the behavior of the first instruction, we don't need to explore all of this dimension. We can restrict ourselves to the compatible instructions with the fewest degrees of freedom.

For this exploration to be efficient (manageable) we can't eagerly generate all of the templates. We need a preprocessing step to determine which instructions belong to 2 or 3 and, for the ones in 3, which set of instructions is worth considering. We then generate a dependency graph and process the instructions in an order that allows us to deduce the latencies for the instructions in 3, while still exploring the dimensions for 2.
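
One way to picture the ordering step is a topological sort over the "needs this helper" relation. The sketch below is only an illustration under stated assumptions: the `HelperMap` type and `measurementOrder` function are hypothetical, every helper is assumed to appear as a key in the map, and instructions caught in a dependency cycle are simply left out.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical input: for each instruction, the helper instructions it has to
// be chained with. An empty list means "measurable in isolation" (mode 2).
using HelperMap = std::map<std::string, std::vector<std::string>>;

// Kahn's algorithm: emit an instruction once all of its helpers were emitted.
// Assumes every helper is itself a key of Helpers. Instructions that are part
// of a dependency cycle never become ready and are omitted from the result.
std::vector<std::string> measurementOrder(const HelperMap &Helpers) {
  std::map<std::string, std::size_t> Pending;            // unresolved helpers
  std::map<std::string, std::vector<std::string>> Users; // helper -> users
  std::queue<std::string> Ready;
  for (const auto &Entry : Helpers) {
    Pending[Entry.first] = Entry.second.size();
    for (const std::string &Helper : Entry.second)
      Users[Helper].push_back(Entry.first);
    if (Entry.second.empty())
      Ready.push(Entry.first);
  }
  std::vector<std::string> Order;
  while (!Ready.empty()) {
    const std::string Instr = Ready.front();
    Ready.pop();
    Order.push_back(Instr);
    for (const std::string &User : Users[Instr])
      if (--Pending[User] == 0)
        Ready.push(User);
  }
  return Order;
}
```

Instructions measurable in isolation come out first, followed by those whose helpers have all been characterized.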

In this automated mode (second bullet) the values for instructions in 3 can be recovered by solving an ordinary least squares problem. The recovered measurements can then be processed by the analysis tool.
I've started working on this; I just need to dedicate more time to it.
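
A minimal sketch of what such a least squares recovery could look like, assuming each benchmark is modeled as a row that counts how often each instruction occurs in the snippet, with the measured latency as the right-hand side. The model and the `solveLeastSquares` name are assumptions made for illustration; the review does not specify them.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Solves min ||A x - b||^2 through the normal equations (A^T A) x = A^T b,
// using Gauss-Jordan elimination with partial pivoting.
// Returns an empty vector if the system is (numerically) singular.
std::vector<double> solveLeastSquares(const Matrix &A,
                                      const std::vector<double> &B) {
  assert(A.size() == B.size() && "one measurement per row");
  const std::size_t Rows = A.size();
  const std::size_t Cols = Rows ? A[0].size() : 0;
  // Augmented matrix [A^T A | A^T b].
  Matrix M(Cols, std::vector<double>(Cols + 1, 0.0));
  for (std::size_t R = 0; R < Rows; ++R)
    for (std::size_t I = 0; I < Cols; ++I) {
      for (std::size_t J = 0; J < Cols; ++J)
        M[I][J] += A[R][I] * A[R][J];
      M[I][Cols] += A[R][I] * B[R];
    }
  for (std::size_t Col = 0; Col < Cols; ++Col) {
    std::size_t Pivot = Col;
    for (std::size_t R = Col + 1; R < Cols; ++R)
      if (std::fabs(M[R][Col]) > std::fabs(M[Pivot][Col]))
        Pivot = R;
    if (std::fabs(M[Pivot][Col]) < 1e-12)
      return {};
    std::swap(M[Col], M[Pivot]);
    for (std::size_t R = 0; R < Cols; ++R) {
      if (R == Col)
        continue;
      const double Factor = M[R][Col] / M[Col][Col];
      for (std::size_t J = Col; J <= Cols; ++J)
        M[R][J] -= Factor * M[Col][J];
    }
  }
  std::vector<double> X(Cols);
  for (std::size_t I = 0; I < Cols; ++I)
    X[I] = M[I][Cols] / M[I][I];
  return X;
}
```

For instance, with rows {1, 0}, {1, 1}, {0, 1} and measurements 1, 4, 3 (a helper alone, helper plus target, target alone), the solution attributes 1 cycle to the first instruction and 3 to the second. With noisy measurements and more rows than unknowns, the same call returns the best fit in the least squares sense.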

tools/llvm-exegesis/lib/SnippetGenerator.cpp
106	Can't you just: std::move(Templates.begin(), Templates.end(), std::back_inserter(FinalTemplates));
unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
199	`SETCCrExhaustive`
213	I don't understand this second line.
224	`available`
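
For reference, the simplification suggested for line 106 relies on the three-argument std::move algorithm from <algorithm> together with std::back_inserter. A self-contained sketch of that flattening idiom, using a hypothetical `flatten` helper rather than the patch's own code:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: flattens a vector of vectors by moving every element
// into a single output vector, preserving order.
template <typename T>
std::vector<T> flatten(std::vector<std::vector<T>> Nested) {
  std::size_t Total = 0;
  for (const std::vector<T> &Inner : Nested)
    Total += Inner.size();
  std::vector<T> Flat;
  Flat.reserve(Total);
  for (std::vector<T> &Inner : Nested)
    std::move(Inner.begin(), Inner.end(), std::back_inserter(Flat));
  return Flat;
}
```

The inner vectors are left in a valid but unspecified state after the move, which is fine here because `Nested` is a by-value parameter that is discarded on return.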

lebedev.ri marked an inline comment as done. Apr 11 2019, 4:50 AM

lebedev.ri added inline comments.

unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
213	Line 2 says that, while PrimaryInstruction1 is the primary instruction in that snippet, it is the same instruction as SecondaryInstruction0.

In D60401#1462522, @gchatelet wrote:

...

So i agree in general, but i don't think i agree with fine print.

What if i don't want to validate the whole entirety of the instructions,
but only one of these instructions that is Measurable through another instruction?
I simply can't?

I agree with the general direction of "fully characterize or check a processor",
but i'm not fully sure yet how that will be used/useful for actual schedule profiles.

I guess it's also a question of "is already useful and getting useful every day"
vs. "useful and will be more useful some time in the future",
i.e. the rate of improvement.

Just my 2 cents.

In D60401#1462567, @lebedev.ri wrote:

> > In D60401#1462522, @gchatelet wrote:
> > ...
>
> So i agree in general, but i don't think i agree with fine print.
>
> What if i don't want to validate the whole entirety of the instructions,
> but only one of these instructions that is Measurable through another instruction?
> I simply can't?

Well yes that would still be possible but you'd still need to go through the preprocessing / build the dependency graph for the subset of interest. The exploration would be restricted to the instruction + dependent instructions.
There will be a startup cost for this strategy but it will take much less time than testing everything.

If the startup cost is too big there are a few possibilities we can investigate:

- use a cache to store the precomputed data.
- have the precomputation done during llvm-exegesis build.

Would that work for you?

In D60401#1462578, @gchatelet wrote:

> > In D60401#1462567, @lebedev.ri wrote:
> >
> > > In D60401#1462522, @gchatelet wrote:
> > > ...
> >
> > So i agree in general, but i don't think i agree with fine print.
> >
> > What if i don't want to validate the whole entirety of the instructions,
> > but only one of these instructions that is Measurable through another instruction?
> > I simply can't?
>
> Well yes that would still be possible but you'd still need to go through the preprocessing / build the dependency graph for the subset of interest. The exploration would be restricted to the instruction + dependent instructions.
> There will be a startup cost for this strategy but it will take much less time than testing everything.
>
> If the startup cost is too big there are a few possibilities we can investigate:
>
> - use a cache to store the precomputed data.
> - have the precomputation done during llvm-exegesis build.
>
> Would that work for you?

I think we are talking about *essentially* the same thing, just from
different perspectives so it looks like we are talking about different things.

*Of course* it will still need preprocessing, there will just be no way around that
(well, that is if we do not want to use a priori data).

What i'm saying is, i wouldn't want to run the entire exhaustive benchmark over all the
instructions just so that i can look at a single instruction that needs post-processing.

I'd want to run a *subset* of the benchmarks, that would only explore the target instruction,
plus some extra instructions to 'replace' a priori data.
Which is roughly what this patch does.

So from where i'm sitting, the other exploration is a generalization of *this*.
Maybe i'm missing the point?
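
To illustrate the "subset" idea in isolation, here is a simplified sketch of a worklist that starts from the target opcode and transitively collects the helper opcodes that would also need benchmarking. It is not the patch's code: the `HelperTable` lookup is a hypothetical stand-in for template generation, and opcodes are plain integers.

```cpp
#include <map>
#include <set>
#include <vector>

using Opcode = unsigned;

// Hypothetical stand-in for template generation: maps an opcode to the helper
// opcodes that appear in its generated templates.
using HelperTable = std::map<Opcode, std::vector<Opcode>>;

// Returns the target opcode followed by every helper opcode reachable from
// it, each listed exactly once, in the order the worklist processes them.
std::vector<Opcode> opcodesToBenchmark(Opcode Target,
                                       const HelperTable &Table) {
  std::vector<Opcode> Order;
  std::set<Opcode> Seen = {Target};
  std::vector<Opcode> Worklist = {Target};
  while (!Worklist.empty()) {
    const Opcode Current = Worklist.back();
    Worklist.pop_back();
    Order.push_back(Current);
    const auto It = Table.find(Current);
    if (It == Table.end())
      continue; // No templates known for this opcode; nothing to enqueue.
    for (Opcode Helper : It->second)
      if (Seen.insert(Helper).second) // Queue each helper at most once.
        Worklist.push_back(Helper);
  }
  return Order;
}
```

Only opcodes reachable from the target are visited, so the cost scales with the size of that closure rather than with the whole instruction set.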

mstojanovic added a subscriber: mstojanovic. Dec 18 2019, 10:04 AM

Revision Contents

Path                                                          Size
tools/llvm-exegesis/lib/SnippetGenerator.h                    7 lines
tools/llvm-exegesis/lib/SnippetGenerator.cpp                  80 lines
unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp    59 lines

Diff 194105

tools/llvm-exegesis/lib/SnippetGenerator.h

[... 49 lines not shown ...]

 // Common code for all benchmark modes.
 class SnippetGenerator {
 public:
   explicit SnippetGenerator(const LLVMState &State);

   virtual ~SnippetGenerator();

+  // To measure characteristics of instruction Instr, we sometimes need to
+  // chain it together with some other instructions. But that gives us
+  // collective characteristics. So we also need to measure characteristics
+  // of those additional instructions, and later do some post-processing.
+  llvm::Expected<std::vector<CodeTemplate>>
+  generateAllCodeTemplates(const Instruction &MainInstr) const;

   // Calls generateCodeTemplate and expands it into one or more BenchmarkCode.
   llvm::Expected<std::vector<BenchmarkCode>>
   generateConfigurations(const Instruction &Instr) const;

   // Given a snippet, computes which registers the setup code needs to define.
   std::vector<RegisterValue> computeRegisterInitialValues(
       const std::vector<InstructionTemplate> &Snippet) const;

[... 33 lines not shown ...]

tools/llvm-exegesis/lib/SnippetGenerator.cpp

 //===-- SnippetGenerator.cpp ------------------------------------- C++ --===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #include <array>
 #include <string>

 #include "Assembler.h"
 #include "MCInstrDescView.h"
 #include "SnippetGenerator.h"
 #include "Target.h"
+#include "llvm/ADT/SetVector.h"
+#include "llvm/ADT/SmallSet.h"
+#include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/ADT/StringRef.h"
 #include "llvm/ADT/Twine.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Program.h"

 namespace llvm {
 namespace exegesis {

 std::vector<CodeTemplate> getSingleton(CodeTemplate &&CT) {
   std::vector<CodeTemplate> Result;
   Result.push_back(std::move(CT));
   return Result;
 }

 SnippetGeneratorFailure::SnippetGeneratorFailure(const llvm::Twine &S)
     : llvm::StringError(S, llvm::inconvertibleErrorCode()) {}

 SnippetGenerator::SnippetGenerator(const LLVMState &State) : State(State) {}

 SnippetGenerator::~SnippetGenerator() = default;

+llvm::Expected<std::vector<CodeTemplate>>
+SnippetGenerator::generateAllCodeTemplates(const Instruction &MainInstr) const {
+  std::vector<std::vector<CodeTemplate>> AllTemplates;
+
+  using Opcode = decltype(MCInstrDesc::Opcode);
+  llvm::SmallSet<Opcode, 32> ProcessedOpcodes;
+  llvm::SetVector<Opcode, llvm::SmallVector<Opcode, 32>,
+                  decltype(ProcessedOpcodes)>
+      Worklist;
+
+  Worklist.insert(MainInstr.Description->getOpcode());
+  while (!Worklist.empty()) {
+    AllTemplates.reserve(AllTemplates.size() + Worklist.size());
+    Opcode OpcodeToProcess = Worklist.pop_back_val();
+
+    assert(!ProcessedOpcodes.count(OpcodeToProcess) &&
+           "Should not process instructions that we have already processed.");
+    const Instruction &InstrToProcess = State.getIC().getInstr(OpcodeToProcess);
+    assert(InstrToProcess.Description->getOpcode() == OpcodeToProcess);
+
+    if (auto E = generateCodeTemplates(InstrToProcess)) {
+      ProcessedOpcodes.insert(OpcodeToProcess);
+      assert(AllTemplates.capacity() > 0 && "Should have fully preallocated.");
+      AllTemplates.emplace_back(std::move(E.get()));
+      const std::vector<CodeTemplate> &NewTemplates = AllTemplates.back();
+
+      llvm::for_each(NewTemplates, [InstrToProcess, &ProcessedOpcodes,
+                                    &Worklist](const CodeTemplate &CT) {
+        assert(CT.Instructions.front().Instr.Description ==
+                   InstrToProcess.Description &&
+               "Expected the target instruction to be the first instruction in "
+               "the template.");
+        // Add every instruction (except the first one, which is the
+        // instruction we have just processed) to the worklist, unless it
+        // was processed/queued for processing already.
+        for (const InstructionTemplate &IT :
+             ArrayRef<InstructionTemplate>(CT.Instructions).drop_front()) {
+          if (!ProcessedOpcodes.count(IT.getOpcode()))
+            Worklist.insert(IT.getOpcode());
+        }
+      });
+    } else {
+      // We have failed to produce templates for instruction InstrToProcess.
+      // If this is the MainInstr instruction, then this is fatal.
+      if (AllTemplates.empty())
+        return E.takeError();
+      // Else, we have failed to produce templates for secondary instructions.
+      // Ignore error.
+    }
+  }
+
+  // Flatten vector-of-vectors-of-elements into a vector-of-elements.
+  size_t TotalTemplateCount = std::accumulate(
+      AllTemplates.begin(), AllTemplates.end(), size_t(0),
+      [](size_t Size, const std::vector<CodeTemplate> &Templates) {
+        return Size + Templates.size();
+      });
+  std::vector<CodeTemplate> FinalTemplates(std::move(AllTemplates.front()));
+  FinalTemplates.reserve(TotalTemplateCount);
+  llvm::for_each(
+      llvm::make_range(std::next(std::make_move_iterator(AllTemplates.begin())),
+                       std::make_move_iterator(AllTemplates.end())),
+      [&FinalTemplates](std::vector<CodeTemplate> &&Templates) {
+        llvm::for_each(
+            llvm::make_range(std::make_move_iterator(Templates.begin()),
+                             std::make_move_iterator(Templates.end())),
+            [&FinalTemplates](CodeTemplate &&Template) {
+              FinalTemplates.emplace_back(std::move(Template));
+            });
+      });
+  assert(FinalTemplates.size() == TotalTemplateCount);
+
+  return FinalTemplates;
+}

 llvm::Expected<std::vector<BenchmarkCode>>
 SnippetGenerator::generateConfigurations(const Instruction &Instr) const {
-  if (auto E = generateCodeTemplates(Instr)) {
+  if (auto E = generateAllCodeTemplates(Instr)) {
     const auto &RATC = State.getRATC();
     std::vector<BenchmarkCode> Output;
     for (CodeTemplate &CT : E.get()) {
       const llvm::BitVector &ForbiddenRegs =
           CT.ScratchSpacePointerInReg
               ? RATC.getRegister(CT.ScratchSpacePointerInReg).aliasedBits()
               : RATC.emptyRegisters();
       // TODO: Generate as many BenchmarkCode as needed.

[... 155 lines not shown ...]

unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp

[... 21 lines not shown ...]

 void InitializeX86ExegesisTarget();

 namespace {

 using testing::AnyOf;
 using testing::ElementsAre;
 using testing::Gt;
 using testing::HasSubstr;
+using testing::Lt;
 using testing::Not;
 using testing::SizeIs;
 using testing::UnorderedElementsAre;

 MATCHER(IsInvalid, "") { return !arg.isValid(); }
 MATCHER(IsReg, "") { return arg.isReg(); }

 class X86SnippetGeneratorTest : public ::testing::Test {

[... 23 lines not shown; context: protected: ...]

   std::vector<CodeTemplate> checkAndGetCodeTemplates(unsigned Opcode) {
     randomGenerator().seed(0); // Initialize seed.
     const Instruction &Instr = State.getIC().getInstr(Opcode);
     auto CodeTemplateOrError = Generator.generateCodeTemplates(Instr);
     EXPECT_FALSE(CodeTemplateOrError.takeError()); // Valid configuration.
     return std::move(CodeTemplateOrError.get());
   }

+  std::vector<CodeTemplate> checkAndGetAllCodeTemplates(unsigned Opcode) {
+    randomGenerator().seed(0); // Initialize seed.
+    const Instruction &Instr = State.getIC().getInstr(Opcode);
+    auto CodeTemplateOrError = Generator.generateAllCodeTemplates(Instr);
+    EXPECT_FALSE(CodeTemplateOrError.takeError()); // Valid configuration.
+    return std::move(CodeTemplateOrError.get());
+  }

   SnippetGeneratorT Generator;
 };

 using LatencySnippetGeneratorTest =
     SnippetGeneratorTest<LatencySnippetGenerator>;

 using UopsSnippetGeneratorTest = SnippetGeneratorTest<UopsSnippetGenerator>;

[... 105 lines not shown; context: for (const auto &CT : CodeTemplates) { ...]

     EXPECT_THAT(CT.Execution, ExecutionMode::SERIAL_VIA_NON_MEMORY_INSTR);
     ASSERT_THAT(CT.Instructions, SizeIs(2));
     const InstructionTemplate &IT = CT.Instructions[0];
     EXPECT_THAT(IT.getOpcode(), Opcode);
     ASSERT_THAT(IT.VariableValues, SizeIs(0));
   }
 }

+TEST_F(LatencySnippetGeneratorTest, SETCCrExaustive) {
+  const unsigned Opcode = llvm::X86::SETCCr;
+  const std::vector<CodeTemplate> CodeTemplates =
+      checkAndGetAllCodeTemplates(Opcode);
+  ASSERT_THAT(CodeTemplates, SizeIs(Gt(2U))) << "Many templates are available";
+  using OpcodeTy = decltype(MCInstrDesc::Opcode);
+  std::set<OpcodeTy> PrimaryInstructions;
+  std::set<OpcodeTy> SecondaryInstructions;
+  llvm::for_each(CodeTemplates, [&PrimaryInstructions, &SecondaryInstructions](
+                                    const CodeTemplate &CT) {
+    ASSERT_THAT(CT.Instructions, SizeIs(Gt(0U))) << "Have at least one instr.";
+    ASSERT_THAT(CT.Instructions, SizeIs(Lt(3U)))
+        << "It is expected that all templates have 1 or 2 instructions";
+    // IT0: PrimaryInstruction0[, SecondaryInstruction0[, ...]]
+    // IT1: PrimaryInstruction1(==SecondaryInstruction0)[, ...]
+    // ...
+    PrimaryInstructions.emplace(CT.Instructions.front().getOpcode());
+    llvm::for_each(ArrayRef<InstructionTemplate>(CT.Instructions).drop_front(),
+                   [&SecondaryInstructions](const InstructionTemplate &IT) {
+                     SecondaryInstructions.emplace(IT.getOpcode());
+                   });
+  });
+  ASSERT_THAT(PrimaryInstructions, SizeIs(Gt(1U)))
+      << "Many templates are available";
+  ASSERT_THAT(SecondaryInstructions, SizeIs(Gt(0U)))
+      << "Some secondary templates are avaliable";
+
+  ASSERT_TRUE(PrimaryInstructions.count(Opcode))
+      << "We should have benchmarks to measure the target instruction";
+
+  // All entries in PrimaryInstructions (except the target instruction) should
+  // have been mentioned in SecondaryInstructions.
+  // I.e. we only measured instrs we needed to measure.
+  for (OpcodeTy PrimaryOpcode : PrimaryInstructions) {
+    if (PrimaryOpcode == Opcode) // Ignore the actual target instruction
+      continue; // It is unlikely that we manage to find a loop.
+    ASSERT_TRUE(SecondaryInstructions.count(PrimaryOpcode))
+        << "Weird benchmark to measure instruction that was not required";
+  }
+
+  bool HaveAtLeastOneSecondaryOpcodeAsPrimaryOpcode = llvm::any_of(
+      SecondaryInstructions, [PrimaryInstructions](OpcodeTy SecondaryOpcode) {
+        return PrimaryInstructions.find(SecondaryOpcode) !=
+               PrimaryInstructions.end();
+      });
+  ASSERT_TRUE(HaveAtLeastOneSecondaryOpcodeAsPrimaryOpcode)
+      << "And should have a template to measure at least one of the secondary "
+         "instructions as primary instruction";
+}

 TEST_F(UopsSnippetGeneratorTest, ParallelInstruction) {
   // - BNDCL32rr
   // - Op0 Explicit Use RegClass(BNDR)
   // - Op1 Explicit Use RegClass(GR32)
   // - Var0 [Op0]
   // - Var1 [Op1]
   const unsigned Opcode = llvm::X86::BNDCL32rr;
   const auto CodeTemplates = checkAndGetCodeTemplates(Opcode);

[... 220 lines not shown ...]