This is an archive of the discontinued LLVM Phabricator instance.

Differential D151025

[llvm-exegesis] Add support for using memory annotations
Closed · Public

Authored by aidengrossman on May 20 2023, 3:43 AM.

Details

Reviewers

RKSimon
courbet
gchatelet
ondrasej

Summary

This patch adds support for using memory annotations in the
subprocess execution mode.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aidengrossman created this revision. · May 20 2023, 3:43 AM

Herald added a project: Restricted Project. · May 20 2023, 3:43 AM

Herald added subscribers: mstojanovic, krytarowski.

aidengrossman requested review of this revision. · May 20 2023, 3:43 AM

Herald added subscribers: llvm-commits, courbet.

aidengrossman added a parent revision: D151024: [llvm-exegesis] Add memory annotation parsing. · May 20 2023, 3:43 AM

Harbormaster completed remote builds in B233356: Diff 524016. · May 20 2023, 3:43 AM

aidengrossman added reviewers: RKSimon, courbet, gchatelet, ondrasej. · May 20 2023, 3:47 AM

aidengrossman added a child revision: D151039: [Docs][llvm-exegesis] Add documentation for memory. · May 20 2023, 4:16 PM

aidengrossman mentioned this in D151019: [llvm-exegesis] Refactor FunctionExecutorImpl and create factory. · May 23 2023, 6:04 PM

Rebase, cleanup tests, fix some function calls in unittests that I forgot to update after adjusting function signatures.

Harbormaster completed remote builds in B234044: Diff 524963. · May 23 2023, 6:48 PM

Rebase + format

Harbormaster completed remote builds in B234807: Diff 525996. · May 26 2023, 2:43 AM

Rebase

Harbormaster completed remote builds in B235121: Diff 526396. · May 29 2023, 12:03 AM

courbet added inline comments. · May 30 2023, 2:59 AM

llvm/tools/llvm-exegesis/lib/Assembler.cpp
51–54	If we're always going to call these 4 in a sequence, let's have a single point of entry in `ExegesisTarget`

Address reviewer feedback.

Harbormaster completed remote builds in B235495: Diff 526920. · May 31 2023, 12:08 AM

Address reviewer feedback, rebase.

Harbormaster completed remote builds in B235759: Diff 527310. · Jun 1 2023, 1:03 AM

Update preprocessor directives to prevent build failures on Linux platforms that don't have libpfm.

Harbormaster completed remote builds in B236548: Diff 528323. · Jun 5 2023, 2:54 AM

Rebase

Harbormaster completed remote builds in B238399: Diff 530783. · Jun 12 2023, 11:56 PM

Rebase

Harbormaster completed remote builds in B239322: Diff 532020. · Jun 16 2023, 12:45 AM

Rebase

Harbormaster completed remote builds in B239338: Diff 532036. · Jun 16 2023, 1:17 AM

courbet added inline comments. · Jun 16 2023, 1:38 AM

llvm/tools/llvm-exegesis/lib/Assembler.cpp
70	This might clobber the registers we just set up above.

courbet mentioned this in D151023: [llvm-exegesis] Add Target Memory Utility Functions. · Jun 16 2023, 1:42 AM

aidengrossman added inline comments. · Jun 16 2023, 1:42 AM

llvm/tools/llvm-exegesis/lib/Assembler.cpp
70	Yes. I have a TODO listed in https://reviews.llvm.org/D151023 for this function to push the current values of RAX, RDI, and RSI to the stack to save them and then pop them back after the operation. I was planning on doing this as a follow up after this landed once I actually needed to use those registers, but if this is a blocker I can get this fixed before landing.
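
The save-and-restore idea described in this comment can be sketched as follows. This is only an illustration under stated assumptions: instructions are modeled as plain AT&T-syntax strings rather than `MCInst` objects, and `wrapWithSaveRestore` is a hypothetical helper name, not a function from this patch or from D151023.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of llvm-exegesis): wraps `Body` in push/pop
// pairs so that the registers in `SavedRegs` hold the same values before and
// after the body runs. Instructions are modeled as strings for illustration.
std::vector<std::string>
wrapWithSaveRestore(const std::vector<std::string> &Body,
                    const std::vector<std::string> &SavedRegs) {
  std::vector<std::string> Out;
  Out.reserve(Body.size() + 2 * SavedRegs.size());
  // Save each register in the order given.
  for (const std::string &Reg : SavedRegs)
    Out.push_back("pushq %" + Reg);
  // Emit the body that may clobber those registers.
  Out.insert(Out.end(), Body.begin(), Body.end());
  // The stack is LIFO, so restore in the reverse order.
  for (auto It = SavedRegs.rbegin(); It != SavedRegs.rend(); ++It)
    Out.push_back("popq %" + *It);
  assert(Out.size() == Body.size() + 2 * SavedRegs.size());
  return Out;
}
```

Calling it with a body of `{"syscall"}` and the registers `rax`, `rdi`, `rsi` yields three pushes, the syscall, and three pops in reverse order. A valid stack pointer is needed for this to work, which is presumably why the ordering relative to the stack register setup matters.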

Rebase

Harbormaster completed remote builds in B239367: Diff 532075. · Jun 16 2023, 3:22 AM

@courbet This patch is ready for review when you have some time. Thanks!

courbet added inline comments. · Jun 19 2023, 1:37 AM

llvm/tools/llvm-exegesis/lib/Assembler.cpp
234	why `else` ? You could have both liveins and memory, right ? (same below) Can you add a test ?
llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp
39–40	please add a `#define HAS_RSEQ` for this to make the code clearer.

Address first round of reviewer feedback.

Harbormaster completed remote builds in B239746: Diff 532569. · Jun 19 2023, 2:06 AM

aidengrossman added inline comments. · Jun 19 2023, 2:10 AM

llvm/tools/llvm-exegesis/lib/Assembler.cpp
234	Ah, yep. I made it either/or here as it doesn't support passing the `RDI` (or different depending on ABI) scratch memory register, but there are definitely other uses for passing registers as live ins. Should be fixed in the next update. Thanks for the catch!
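
One way to read the reviewer's point is that the argument registers should be added on top of the user-specified live-ins rather than instead of them. The sketch below illustrates that reading with plain vectors of register numbers; `collectLiveIns` is a hypothetical name, and the actual fix in the landed patch may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the live-in selection logic. Register numbers
// are arbitrary integers here, not real LLVM register IDs.
std::vector<unsigned>
collectLiveIns(const std::vector<unsigned> &UserLiveIns,
               const std::vector<unsigned> &ArgumentRegisters,
               bool GenerateMemoryInstructions) {
  // User-requested live-ins are always kept.
  std::vector<unsigned> Result = UserLiveIns;
  // Memory setup additionally needs the argument registers live on entry.
  if (GenerateMemoryInstructions)
    Result.insert(Result.end(), ArgumentRegisters.begin(),
                  ArgumentRegisters.end());
  // Deduplicate so that each register is reported once.
  std::sort(Result.begin(), Result.end());
  Result.erase(std::unique(Result.begin(), Result.end()), Result.end());
  assert(std::is_sorted(Result.begin(), Result.end()));
  return Result;
}
```

With this shape there is no `else` branch: disabling memory instructions only drops the extra argument registers, never the user's live-ins.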

Ignore the "in the next update" part of the last inline comment. I submitted it at the wrong point. @courbet This should be ready for another round of review when you get a chance. Thank you for all your patience with reviews on this stack!

courbet accepted this revision. · Jun 20 2023, 1:06 AM

courbet added inline comments.

llvm/test/tools/llvm-exegesis/X86/memory-annotations-livein.test
5	(On Diff #532569) Can you add a comment to explain what the test checks? Something like: # This test checks that we can use a combination of memory and register definitions. Same for the other test.

This revision is now accepted and ready to land. · Jun 20 2023, 1:06 AM

Closed by 9f80831f3627e800709e2434bbbd5bb179b1576e.

aidengrossman closed this revision. · Jun 26 2023, 11:54 PM

The tests aren't working on kernel 4.18 - are they missing a REQUIRES: exegesis-can-execute-in-subprocess?

In D151025#4451358, @Hahnfeld wrote:

The tests aren't working on kernel 4.18 - are they missing a REQUIRES: exegesis-can-execute-in-subprocess?

Ah, yep. Sorry for not catching that. Hopefully 864787990091f61e0b2cdcc13bdcd1e3677a334f fixes that for you.

In D151025#4451279, @aidengrossman wrote:

Closed by 9f80831f3627e800709e2434bbbd5bb179b1576e.

It seems that you lost the "Differential Revision:" line in the commit. If you use arc (https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-command-line), arc amend should give you the reviewed-by information from @courbet.

Yes. I meant to add it on when cherry-picking from the branch containing these changes in my fork but forgot to add it in. I was doing it manually though and didn't realize arc amend would just do that. Thanks for the tip and sorry for the confusion there.

@aidengrossman the two new lit tests are consistently failing on my machine, which is Ubuntu 22.04.2 but with kernel version 6.3.7. I get:

$ /home/jayfoad2/llvm-release/bin/llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -snippets-file=/home/jayfoad2/git/llvm-project/llvm/test/tools/llvm-exegesis/X86/latency/memory-annotations.s -execution-mode=subprocess
---
mode:            latency
key:
  instructions:
    - 'MOV64ri32 RAX i_0x2000'
    - 'MOV64rm RDI RAX i_0x1 %noreg i_0x0 %noreg'
  config:          ''
  register_initial_values: []
cpu_name:        znver4
llvm_triple:     x86_64-unknown-unknown
num_repetitions: 10000
measurements:    []
error:           'The benchmarking subprocess sent unexpected signal: Segmentation fault'
info:            ''
assembled_snippet: 415541544989FC4989F548BF0000000000000000488D350000000048C1EE0C48C1E60C4881EE0010000048B80B000000000000000F054C8D05000000004C89E74C01C748C1EF0C48C1E70C4881C70010000048BE00F0FFFFFF7F00004829FE48B80B000000000000000F0548BF00E0FFFFFF7F000048BE001000000000000048BA030000000000000049BA01001000000000004D89E849B9000000000000000048B809000000000000000F0548BF002000000000000048BE001000000000000048BA030000000000000049BA010010000000000049B804E0FFFFFF7F0000458B0049B9000000000000000048B809000000000000000F0548BC00F0FFFFFF7F000050575648BF00E0FFFFFF7F00008B3F48BE032400000000000048B810000000000000000F055E585848C7C000200000488B3848C7C000200000488B3848C7C000200000488B3848C7C000200000488B3848BF00E0FFFFFF7F00008B3F48BE012400000000000048B810000000000000000F0548BF000000000000000048B83C000000000000000F05415C415DC3
...

I've tried running this under lldb but I can't catch the segfault, even if I use settings set target.process.follow-fork-mode child.

Any ideas what might be going wrong or how to debug it?

I'm not sure what could be going wrong. I saw segmentation fault failures on one of the builders but after I added a check to the lit config to ensure that the subprocess mode was actually supported it went away. It might have something to do with security policies since this patch does some things that probably seem quite abnormal in regards to the process memory, but I might be completely wrong there with that assumption.

To debug you need to modify the code slightly as this patch uses ptrace, so lldb isn't able to grab the child process because there's already another process tracing it. You'll need to comment out the ptrace calls and then you should be able to get into the child process. I had a flag to do that on one of my development branches that maybe I should upstream at some point.

Once you have that setup, I've found it easiest to debug by breaking on a line right before the function call into the MCJITed test/test harness. Then I would single step through the assembly to see what was happening.

However, I understand this might be a pain to debug, so if you want me to disable the tests or add something to the lit config to ensure that these tests only run if memory annotations don't cause seg-faults while we can work on reproducing it without being too interruptive, I'm fine with doing that.

In D151025#4455327, @foad wrote:

@aidengrossman the two new lit tests are consistently failing on my machine, which is Ubuntu 22.04.2 but with kernel version 6.3.7. [...]

llvm-exegesis -mode=latency -snippets-file=memory-annotations-livein.s -execution-mode=subprocess has this 'The benchmarking subprocess sent unexpected signal: Segmentation fault' for me as well, but I haven't looked closely.
(sudo apt install libpfm4-dev so that LLVM_ENABLE_LIBPFM links the dependency)

In D151025#4456730, @aidengrossman wrote:

I'm not sure what could be going wrong. [...] To debug you need to modify the code slightly as this patch uses ptrace, so lldb isn't able to grab the child process because there's already another process tracing it. [...]

strace on the JITted part of the child process shows:

munmap(NULL, 140344954515456)           = 0
munmap(0x7fa49b29d000, 392533778432)    = 0
mmap(0x7fffffffe000, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, 8, 0) = 0x7fffffffe000
mmap(0x2000, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED_NOREPLACE, 9, 0) = -1 EPERM (Operation not permitted)
ioctl(7, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP|0x2) = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x2000} ---
+++ killed by SIGSEGV (core dumped) +++

i.e. the access to address 0x2000 fails because the mmap call failed with EPERM.

There is some discussion here about why you might get the "nonsensical" error EPERM if you try to map an address lower than mmap_min_addr: https://lore.kernel.org/all/20230418214009.1142926-1-Liam.Howlett@oracle.com/T/#u

And sure enough my system has mmap_min_addr set higher than 0x2000:

$ sysctl vm.mmap_min_addr
vm.mmap_min_addr = 65536

This setting comes from /etc/sysctl.d/10-zeropage.conf which seems to be an Ubuntu-specific change to the Debian procps package.

I have verified that sed -i s/8192/65536/ test/tools/llvm-exegesis/X86/latency/memory-annotations*.s fixes the tests for me.
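
The check behind this diagnosis can be sketched in a few lines. The snippet reads the limit from `/proc/sys/vm/mmap_min_addr` (the procfs view of the `vm.mmap_min_addr` sysctl shown above) and compares a candidate mapping address against it. The helper names are hypothetical, and the comparison assumes an unprivileged process; processes with extra capabilities may not be subject to the limit.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>

// Reads vm.mmap_min_addr from procfs. Returns false if the file cannot be
// read or parsed (for example on a non-Linux system).
bool readMmapMinAddr(std::uint64_t &MinAddr) {
  std::ifstream In("/proc/sys/vm/mmap_min_addr");
  return static_cast<bool>(In >> MinAddr);
}

// Hypothetical helper: true if a fixed mapping at `Address` is at or above
// the floor, which is the condition an unprivileged mapping has to satisfy.
bool isAtOrAboveMinAddr(std::uint64_t Address, std::uint64_t MinAddr) {
  return Address >= MinAddr;
}

// Rounds `MinAddr` up to the next multiple of `PageSize`, giving the lowest
// page-aligned address that clears the floor. `PageSize` must be a power of
// two.
std::uint64_t lowestMappablePage(std::uint64_t MinAddr,
                                 std::uint64_t PageSize) {
  assert(PageSize != 0 && (PageSize & (PageSize - 1)) == 0);
  return (MinAddr + PageSize - 1) & ~(PageSize - 1);
}
```

With the values from this thread, `isAtOrAboveMinAddr(0x2000, 65536)` is false while `isAtOrAboveMinAddr(65536, 65536)` is true, which matches the observation that moving the test mapping from 8192 to 65536 fixes the failure.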

Interesting. I'm running Ubuntu 22.04 as well and running sysctl vm.mmap_min_addr for me also gives 65536, but I was doing all of my development inside of a privileged container which I assume was able to allocate memory at an address lower than that maybe due to permissions differences. It fails for me when I do a build outside a container. Thank you so much for debugging this! I'll get a patch up changing the addresses soon and hopefully that fixes the test failures.

46f42e2ee59ac5ff7b153648e30273e499f7b7e3 should fix the issue. Please let me know if there are any further issues.

Interesting. I'm running Ubuntu 22.04 as well and running sysctl vm.mmap_min_addr for me also gives 65536, but I was doing all of my development inside of a privileged container which I assume was able to allocate memory at an address lower than that maybe due to permissions differences.

Yeah, I saw something about the limit not being enforced for processes with CAP_SYS_RAWIO, so maybe that was the difference.

Anyway your fix works for me - thanks!

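Revision Contents

Path                                                                     Size
llvm/test/tools/llvm-exegesis/X86/latency/memory-annotations.s           13 lines
llvm/tools/llvm-exegesis/lib/Assembler.h                                 7 lines
llvm/tools/llvm-exegesis/lib/Assembler.cpp                               61 lines
llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h                           8 lines
llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp                         84 lines
llvm/tools/llvm-exegesis/lib/SnippetRepetitor.h                          4 lines
llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp                        19 lines
llvm/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h               5 lines
llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp          2 lines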

Diff 526920

llvm/test/tools/llvm-exegesis/X86/latency/memory-annotations.s

This file was added.

# REQUIRES: exegesis-can-execute-x86_64, exegesis-can-measure-latency, x86_64-linux

# RUN: llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -snippets-file=%s -execution-mode=subprocess | FileCheck %s
# RUN: llvm-exegesis -mtriple=x86_64-unknown-unknown -mode=latency -snippets-file=%s -execution-mode=subprocess -repetition-mode=loop | FileCheck %s

# CHECK: measurements:
# CHECK-NEXT: value: {{.*}}, per_snippet_value: {{.*}}

# LLVM-EXEGESIS-MEM-DEF test1 4096 2147483647
# LLVM-EXEGESIS-MEM-MAP test1 8192

movq $8192, %rax
movq (%rax), %rdi
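
To make the two annotation lines in this test easier to read, here is a small sketch that parses them. It is not the parser from D151024. The field meanings (name, size, value for MEM-DEF; name, address for MEM-MAP) are inferred from how `Key.MemoryValues` and `Key.MemoryMappings` are used in Assembler.cpp below, the numbers are read as decimal as they appear in this diff, and all type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record for one annotation line. Fields are only meaningful
// when Kind is not None.
struct MemAnnotation {
  enum KindE { None, Def, Map } Kind = None;
  std::string Name;
  std::uint64_t Size = 0;    // MEM-DEF only: size of the region in bytes.
  std::uint64_t Value = 0;   // MEM-DEF only: initial value of the region.
  std::uint64_t Address = 0; // MEM-MAP only: address to map the region at.
};

// Parses a single "# LLVM-EXEGESIS-MEM-..." comment line. Any other line
// yields an annotation with Kind == None.
MemAnnotation parseMemAnnotation(const std::string &Line) {
  MemAnnotation A;
  std::istringstream In(Line);
  std::string Hash, Directive;
  if (!(In >> Hash >> Directive) || Hash != "#")
    return A;
  if (Directive == "LLVM-EXEGESIS-MEM-DEF") {
    if (In >> A.Name >> A.Size >> A.Value)
      A.Kind = MemAnnotation::Def;
  } else if (Directive == "LLVM-EXEGESIS-MEM-MAP") {
    if (In >> A.Name >> A.Address)
      A.Kind = MemAnnotation::Map;
  }
  return A;
}

// One past the last byte covered when `Map` places the region from `Def`.
std::uint64_t mappingEnd(const MemAnnotation &Def, const MemAnnotation &Map) {
  assert(Def.Kind == MemAnnotation::Def && Map.Kind == MemAnnotation::Map);
  assert(Def.Name == Map.Name);
  return Map.Address + Def.Size;
}
```

Under these assumptions the test defines a 4096-byte region named `test1` and maps it at address 8192, so the snippet's load from `(%rax)` with `%rax` set to 8192 reads the first bytes of that region.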

llvm/tools/llvm-exegesis/lib/Assembler.h

	[...]
	class BasicBlockFiller {			class BasicBlockFiller {
	public:			public:
	BasicBlockFiller(MachineFunction &MF, MachineBasicBlock *MBB,			BasicBlockFiller(MachineFunction &MF, MachineBasicBlock *MBB,
	const MCInstrInfo *MCII);			const MCInstrInfo *MCII);

	void addInstruction(const MCInst &Inst, const DebugLoc &DL = DebugLoc());			void addInstruction(const MCInst &Inst, const DebugLoc &DL = DebugLoc());
	void addInstructions(ArrayRef<MCInst> Insts, const DebugLoc &DL = DebugLoc());			void addInstructions(ArrayRef<MCInst> Insts, const DebugLoc &DL = DebugLoc());

	void addReturn(const DebugLoc &DL = DebugLoc());			void addReturn(const ExegesisTarget &ET, bool SubprocessCleanup,
				const DebugLoc &DL = DebugLoc());

	MachineFunction &MF;			MachineFunction &MF;
	MachineBasicBlock *const MBB;			MachineBasicBlock *const MBB;
	const MCInstrInfo *const MCII;			const MCInstrInfo *const MCII;
	};			};

	// Helper to fill in a function.			// Helper to fill in a function.
	class FunctionFiller {			class FunctionFiller {
	[...]
	// Creates a temporary `void foo(char*)` function containing the provided			// Creates a temporary `void foo(char*)` function containing the provided
	// Instructions. Runs a set of llvm Passes to provide correct prologue and			// Instructions. Runs a set of llvm Passes to provide correct prologue and
	// epilogue. Once the MachineFunction is ready, it is assembled for TM to			// epilogue. Once the MachineFunction is ready, it is assembled for TM to
	// AsmStream, the temporary function is eventually discarded.			// AsmStream, the temporary function is eventually discarded.
	Error assembleToStream(const ExegesisTarget &ET,			Error assembleToStream(const ExegesisTarget &ET,
	std::unique_ptr<LLVMTargetMachine> TM,			std::unique_ptr<LLVMTargetMachine> TM,
	ArrayRef<unsigned> LiveIns,			ArrayRef<unsigned> LiveIns,
	ArrayRef<RegisterValue> RegisterInitialValues,			ArrayRef<RegisterValue> RegisterInitialValues,
	const FillFunction &Fill, raw_pwrite_stream &AsmStream);			const FillFunction &Fill, raw_pwrite_stream &AsmStream,
				const BenchmarkKey &Key,
				bool GenerateMemoryInstructions);

	// Creates an ObjectFile in the format understood by the host.			// Creates an ObjectFile in the format understood by the host.
	// Note: the resulting object keeps a copy of Buffer so it can be discarded once			// Note: the resulting object keeps a copy of Buffer so it can be discarded once
	// this function returns.			// this function returns.
	object::OwningBinary<object::ObjectFile> getObjectFromBuffer(StringRef Buffer);			object::OwningBinary<object::ObjectFile> getObjectFromBuffer(StringRef Buffer);

	// Loads the content of Filename as on ObjectFile and returns it.			// Loads the content of Filename as on ObjectFile and returns it.
	object::OwningBinary<object::ObjectFile> getObjectFromFile(StringRef Filename);			object::OwningBinary<object::ObjectFile> getObjectFromFile(StringRef Filename);
	[...]

llvm/tools/llvm-exegesis/lib/Assembler.cpp

//===-- Assembler.cpp -------------------------------------------- C++ --===//		//===-- Assembler.cpp -------------------------------------------- C++ --===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "Assembler.h"		#include "Assembler.h"

#include "SnippetRepetitor.h"		#include "SnippetRepetitor.h"
		#include "SubprocessMemory.h"
#include "Target.h"		#include "Target.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/CodeGen/FunctionLoweringInfo.h"		#include "llvm/CodeGen/FunctionLoweringInfo.h"
#include "llvm/CodeGen/GlobalISel/CallLowering.h"		#include "llvm/CodeGen/GlobalISel/CallLowering.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"		#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"		#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineModuleInfo.h"		#include "llvm/CodeGen/MachineModuleInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/TargetInstrInfo.h"		#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetPassConfig.h"		#include "llvm/CodeGen/TargetPassConfig.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"		#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/ExecutionEngine/SectionMemoryManager.h"		#include "llvm/ExecutionEngine/SectionMemoryManager.h"
#include "llvm/IR/BasicBlock.h"		#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/LegacyPassManager.h"		#include "llvm/IR/LegacyPassManager.h"
#include "llvm/MC/MCInstrInfo.h"		#include "llvm/MC/MCInstrInfo.h"
#include "llvm/Support/Alignment.h"		#include "llvm/Support/Alignment.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"

		#ifdef __linux__
		#include "perfmon/perf_event.h"
		#endif // __linux__

namespace llvm {		namespace llvm {
namespace exegesis {		namespace exegesis {

static constexpr const char ModuleID[] = "ExegesisInfoTest";		static constexpr const char ModuleID[] = "ExegesisInfoTest";
static constexpr const char FunctionID[] = "foo";		static constexpr const char FunctionID[] = "foo";
static const Align kFunctionAlignment(4096);		static const Align kFunctionAlignment(4096);

// Fills the given basic block with register setup code, and returns true if		// Fills the given basic block with register setup code, and returns true if
// all registers could be setup correctly.		// all registers could be setup correctly.
static bool generateSnippetSetupCode(		static bool generateSnippetSetupCode(
const ExegesisTarget &ET, const MCSubtargetInfo *const MSI,		const ExegesisTarget &ET, const MCSubtargetInfo *const MSI,
ArrayRef<RegisterValue> RegisterInitialValues, BasicBlockFiller &BBF) {		ArrayRef<RegisterValue> RegisterInitialValues, BasicBlockFiller &BBF,
		const BenchmarkKey &Key, bool GenerateMemoryInstructions) {
bool IsSnippetSetupComplete = true;		bool IsSnippetSetupComplete = true;
		if (GenerateMemoryInstructions) {
		BBF.addInstructions(ET.generateMemoryInitialSetup());
		for (const MemoryMapping &MM : Key.MemoryMappings) {
		BBF.addInstructions(ET.generateMmap(
		MM.Address, Key.MemoryValues.at(MM.MemoryValueName).Size,
		ET.getAuxiliaryMemoryStartAddress() +
		sizeof(int) * (Key.MemoryValues.at(MM.MemoryValueName).Number +
		AuxiliaryMemoryOffset)));
		}
		BBF.addInstructions(ET.setStackRegisterToAuxMem());
		}
for (const RegisterValue &RV : RegisterInitialValues) {		for (const RegisterValue &RV : RegisterInitialValues) {
// Load a constant in the register.		// Load a constant in the register.
const auto SetRegisterCode = ET.setRegTo(*MSI, RV.Register, RV.Value);		const auto SetRegisterCode = ET.setRegTo(*MSI, RV.Register, RV.Value);
if (SetRegisterCode.empty())		if (SetRegisterCode.empty())
IsSnippetSetupComplete = false;		IsSnippetSetupComplete = false;
BBF.addInstructions(SetRegisterCode);		BBF.addInstructions(SetRegisterCode);
}		}
		if (GenerateMemoryInstructions) {
		#ifdef __linux__
		BBF.addInstructions(ET.configurePerfCounter(PERF_EVENT_IOC_RESET));
		#endif // __linux__
		}
return IsSnippetSetupComplete;		return IsSnippetSetupComplete;
}		}

// Small utility function to add named passes.		// Small utility function to add named passes.
static bool addPass(PassManagerBase &PM, StringRef PassName,		static bool addPass(PassManagerBase &PM, StringRef PassName,
TargetPassConfig &TPC) {		TargetPassConfig &TPC) {
const PassRegistry *PR = PassRegistry::getPassRegistry();		const PassRegistry *PR = PassRegistry::getPassRegistry();
const PassInfo *PI = PR->getPassInfo(PassName);		const PassInfo *PI = PR->getPassInfo(PassName);
[...]
}		}

void BasicBlockFiller::addInstructions(ArrayRef<MCInst> Insts,		void BasicBlockFiller::addInstructions(ArrayRef<MCInst> Insts,
const DebugLoc &DL) {		const DebugLoc &DL) {
for (const MCInst &Inst : Insts)		for (const MCInst &Inst : Insts)
addInstruction(Inst, DL);		addInstruction(Inst, DL);
}		}

void BasicBlockFiller::addReturn(const DebugLoc &DL) {		void BasicBlockFiller::addReturn(const ExegesisTarget &ET,
		bool SubprocessCleanup, const DebugLoc &DL) {
		// Insert cleanup code
		if (SubprocessCleanup) {
		#ifdef __linux__
		addInstructions(ET.configurePerfCounter(PERF_EVENT_IOC_DISABLE));
		addInstructions(ET.generateExitSyscall(0));
		#endif // __linux__
		}
// Insert the return code.		// Insert the return code.
const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();		const TargetInstrInfo *TII = MF.getSubtarget().getInstrInfo();
if (TII->getReturnOpcode() < TII->getNumOpcodes()) {		if (TII->getReturnOpcode() < TII->getNumOpcodes()) {
BuildMI(MBB, DL, TII->get(TII->getReturnOpcode()));		BuildMI(MBB, DL, TII->get(TII->getReturnOpcode()));
} else {		} else {
MachineIRBuilder MIB(MF);		MachineIRBuilder MIB(MF);
MIB.setMBB(*MBB);		MIB.setMBB(*MBB);

[...]	BitVector getFunctionReservedRegs(const TargetMachine &TM) {
// Saving reserved registers for client.		// Saving reserved registers for client.
return MF.getSubtarget().getRegisterInfo()->getReservedRegs(MF);		return MF.getSubtarget().getRegisterInfo()->getReservedRegs(MF);
}		}

Error assembleToStream(const ExegesisTarget &ET,		Error assembleToStream(const ExegesisTarget &ET,
std::unique_ptr<LLVMTargetMachine> TM,		std::unique_ptr<LLVMTargetMachine> TM,
ArrayRef<unsigned> LiveIns,		ArrayRef<unsigned> LiveIns,
ArrayRef<RegisterValue> RegisterInitialValues,		ArrayRef<RegisterValue> RegisterInitialValues,
const FillFunction &Fill, raw_pwrite_stream &AsmStream) {		const FillFunction &Fill, raw_pwrite_stream &AsmStream,
		const BenchmarkKey &Key,
		bool GenerateMemoryInstructions) {
auto Context = std::make_unique<LLVMContext>();		auto Context = std::make_unique<LLVMContext>();
std::unique_ptr<Module> Module =		std::unique_ptr<Module> Module =
createModule(Context, TM->createDataLayout());		createModule(Context, TM->createDataLayout());
auto MMIWP = std::make_unique<MachineModuleInfoWrapperPass>(TM.get());		auto MMIWP = std::make_unique<MachineModuleInfoWrapperPass>(TM.get());
MachineFunction &MF = createVoidVoidPtrMachineFunction(		MachineFunction &MF = createVoidVoidPtrMachineFunction(
FunctionID, Module.get(), &MMIWP.get()->getMMI());		FunctionID, Module.get(), &MMIWP.get()->getMMI());
MF.ensureAlignment(kFunctionAlignment);		MF.ensureAlignment(kFunctionAlignment);

// We need to instruct the passes that we're done with SSA and virtual		// We need to instruct the passes that we're done with SSA and virtual
// registers.		// registers.
auto &Properties = MF.getProperties();		auto &Properties = MF.getProperties();
Properties.set(MachineFunctionProperties::Property::NoVRegs);		Properties.set(MachineFunctionProperties::Property::NoVRegs);
Properties.reset(MachineFunctionProperties::Property::IsSSA);		Properties.reset(MachineFunctionProperties::Property::IsSSA);
Properties.set(MachineFunctionProperties::Property::NoPHIs);		Properties.set(MachineFunctionProperties::Property::NoPHIs);

		if (GenerateMemoryInstructions) {
		for (const unsigned Reg : ET.getArgumentRegisters()) {
		MF.getRegInfo().addLiveIn(Reg);
		}
		} else {
for (const unsigned Reg : LiveIns)		for (const unsigned Reg : LiveIns)
MF.getRegInfo().addLiveIn(Reg);		MF.getRegInfo().addLiveIn(Reg);
		}

std::vector<unsigned> RegistersSetUp;		std::vector<unsigned> RegistersSetUp;
for (const auto &InitValue : RegisterInitialValues) {		for (const auto &InitValue : RegisterInitialValues) {
RegistersSetUp.push_back(InitValue.Register);		RegistersSetUp.push_back(InitValue.Register);
}		}
FunctionFiller Sink(MF, std::move(RegistersSetUp));		FunctionFiller Sink(MF, std::move(RegistersSetUp));
auto Entry = Sink.getEntry();		auto Entry = Sink.getEntry();
for (const unsigned Reg : LiveIns)		if (GenerateMemoryInstructions) {
		for (const unsigned Reg : ET.getArgumentRegisters())
Entry.MBB->addLiveIn(Reg);		Entry.MBB->addLiveIn(Reg);
		} else {
		for (const unsigned Reg : LiveIns) {
		Entry.MBB->addLiveIn(Reg);
		}
		}

const bool IsSnippetSetupComplete = generateSnippetSetupCode(		const bool IsSnippetSetupComplete = generateSnippetSetupCode(
ET, TM->getMCSubtargetInfo(), RegisterInitialValues, Entry);		ET, TM->getMCSubtargetInfo(), RegisterInitialValues, Entry, Key,
		GenerateMemoryInstructions);

// If the snippet setup is not complete, we disable liveliness tracking. This		// If the snippet setup is not complete, we disable liveliness tracking. This
// means that we won't know what values are in the registers.		// means that we won't know what values are in the registers.
// FIXME: this should probably be an assertion.		// FIXME: this should probably be an assertion.
if (!IsSnippetSetupComplete)		if (!IsSnippetSetupComplete)
Properties.reset(MachineFunctionProperties::Property::TracksLiveness);		Properties.reset(MachineFunctionProperties::Property::TracksLiveness);

Fill(Sink);		Fill(Sink);
[...]

llvm/tools/llvm-exegesis/lib/BenchmarkRunner.h

[...]	protected:
const Benchmark::ModeE Mode;		const Benchmark::ModeE Mode;
const BenchmarkPhaseSelectorE BenchmarkPhaseSelector;		const BenchmarkPhaseSelectorE BenchmarkPhaseSelector;
const ExecutionModeE ExecutionMode;		const ExecutionModeE ExecutionMode;

private:		private:
virtual Expected<std::vector<BenchmarkMeasure>>		virtual Expected<std::vector<BenchmarkMeasure>>
runMeasurements(const FunctionExecutor &Executor) const = 0;		runMeasurements(const FunctionExecutor &Executor) const = 0;

Expected<SmallString<0>> assembleSnippet(const BenchmarkCode &BC,		Expected<SmallString<0>>
const SnippetRepetitor &Repetitor,		assembleSnippet(const BenchmarkCode &BC, const SnippetRepetitor &Repetitor,
unsigned MinInstructions,		unsigned MinInstructions, unsigned LoopBodySize,
unsigned LoopBodySize) const;		bool GenerateMemoryInstructions) const;

Expected<std::string> writeObjectFile(StringRef Buffer,		Expected<std::string> writeObjectFile(StringRef Buffer,
StringRef FileName) const;		StringRef FileName) const;

const std::unique_ptr<ScratchSpace> Scratch;		const std::unique_ptr<ScratchSpace> Scratch;

Expected<std::unique_ptr<FunctionExecutor>>		Expected<std::unique_ptr<FunctionExecutor>>
createFunctionExecutor(object::OwningBinary<object::ObjectFile> Obj,		createFunctionExecutor(object::OwningBinary<object::ObjectFile> Obj,
const BenchmarkKey &Key) const;		const BenchmarkKey &Key) const;
};		};

} // namespace exegesis		} // namespace exegesis
} // namespace llvm		} // namespace llvm

#endif // LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKRUNNER_H		#endif // LLVM_TOOLS_LLVM_EXEGESIS_BENCHMARKRUNNER_H

llvm/tools/llvm-exegesis/lib/BenchmarkRunner.cpp

#include <memory>		#include <memory>
#include <string>		#include <string>

#include "Assembler.h"		#include "Assembler.h"
#include "BenchmarkRunner.h"		#include "BenchmarkRunner.h"
#include "Error.h"		#include "Error.h"
#include "MCInstrDescView.h"		#include "MCInstrDescView.h"
#include "PerfHelper.h"		#include "PerfHelper.h"
		#include "SubprocessMemory.h"
#include "Target.h"		#include "Target.h"
#include "llvm/ADT/ScopeExit.h"		#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/Support/CrashRecoveryContext.h"		#include "llvm/Support/CrashRecoveryContext.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/FileSystem.h"		#include "llvm/Support/FileSystem.h"
#include "llvm/Support/MemoryBuffer.h"		#include "llvm/Support/MemoryBuffer.h"
#include "llvm/Support/Program.h"		#include "llvm/Support/Program.h"
#include "llvm/Support/Signals.h"		#include "llvm/Support/Signals.h"

#ifdef __linux__		#ifdef __linux__
#include <perfmon/perf_event.h>		#include <perfmon/perf_event.h>
#include <sys/mman.h>		#include <sys/mman.h>
#include <sys/ptrace.h>		#include <sys/ptrace.h>
#include <sys/syscall.h>		#include <sys/syscall.h>
#include <sys/wait.h>		#include <sys/wait.h>
#include <unistd.h>		#include <unistd.h>

		#ifdef __GLIBC__
		#if __GLIBC_MINOR__ >= 35
		courbet: Please add a `#define HAS_RSEQ` for this to make the code clearer.
		#include <sys/rseq.h>
		#endif // __GLIBC__MINOR > 35
		#endif // __GLIBC__
#endif // __linux__		#endif // __linux__
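One way the requested `HAS_RSEQ` macro could look is sketched below. The macro name comes from the review comment; `hasRseqSupport` and `rseqStatus` are hypothetical helpers added only so the sketch can be exercised. The sketch also checks the glibc major version, which the patch as posted does not do.

```cpp
#include <cstdlib> // Any libc header; on glibc this defines __GLIBC__.
#include <string>

// Define HAS_RSEQ once, so later code can test a single macro instead of
// repeating the nested glibc version checks.
#if defined(__linux__) && defined(__GLIBC__)
#if __GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 35)
#define HAS_RSEQ 1
#endif
#endif

#ifdef HAS_RSEQ
#include <sys/rseq.h>
#endif

// Hypothetical helper: reports whether the rseq code paths are compiled in.
constexpr bool hasRseqSupport() {
#ifdef HAS_RSEQ
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: human-readable form of the same information.
std::string rseqStatus() {
  return hasRseqSupport() ? "rseq enabled" : "rseq disabled";
}
```

With this in place, the later unregister block could be guarded by a single `#ifdef HAS_RSEQ`.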

namespace llvm {		namespace llvm {
namespace exegesis {		namespace exegesis {

BenchmarkRunner::BenchmarkRunner(const LLVMState &State, Benchmark::ModeE Mode,		BenchmarkRunner::BenchmarkRunner(const LLVMState &State, Benchmark::ModeE Mode,
BenchmarkPhaseSelectorE BenchmarkPhaseSelector,		BenchmarkPhaseSelectorE BenchmarkPhaseSelector,
ExecutionModeE ExecutionMode)		ExecutionModeE ExecutionMode)
SubProcessFunctionExecutorImpl(const LLVMState &State,
object::OwningBinary<object::ObjectFile> Obj,		object::OwningBinary<object::ObjectFile> Obj,
const BenchmarkKey &Key)		const BenchmarkKey &Key)
: State(State), Function(State.createTargetMachine(), std::move(Obj)),		: State(State), Function(State.createTargetMachine(), std::move(Obj)),
Key(Key) {}		Key(Key) {}

private:		private:
enum ChildProcessExitCodeE {		enum ChildProcessExitCodeE {
CounterFDReadFailed = 1,		CounterFDReadFailed = 1,
TranslatingCounterFDFailed		TranslatingCounterFDFailed,
		RSeqDisableFailed,
		FunctionDataMappingFailed,
		AuxiliaryMemorySetupFailed

};		};

StringRef childProcessExitCodeToString(int ExitCode) const {		StringRef childProcessExitCodeToString(int ExitCode) const {
switch (ExitCode) {		switch (ExitCode) {
case ChildProcessExitCodeE::CounterFDReadFailed:		case ChildProcessExitCodeE::CounterFDReadFailed:
return "Counter file descriptor read failed";		return "Counter file descriptor read failed";
case ChildProcessExitCodeE::TranslatingCounterFDFailed:		case ChildProcessExitCodeE::TranslatingCounterFDFailed:
return "Translating counter file descriptor into a file descriptor in "		return "Translating counter file descriptor into a file descriptor in "
"the child process failed";		"the child process failed";
		case ChildProcessExitCodeE::RSeqDisableFailed:
		return "Disabling restartable sequences failed";
		case ChildProcessExitCodeE::FunctionDataMappingFailed:
		return "Failed to map memory for assembled snippet";
		case ChildProcessExitCodeE::AuxiliaryMemorySetupFailed:
		return "Failed to setup auxiliary memory";
default:		default:
return "Child process returned with unknown exit code";		return "Child process returned with unknown exit code";
}		}
}		}
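The sketch below illustrates how a parent process might turn a child's exit status into one of these messages. It assumes plain `waitpid`; the parent-side code in this revision is in the collapsed part of the diff and may differ (the patch mentions ptrace). `ChildExitCode`, `exitCodeToString` and `describeChildExit` are hypothetical names that mirror the enum and strings above.

```cpp
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Mirrors ChildProcessExitCodeE from the diff; values start at 1.
enum ChildExitCode {
  CounterFDReadFailed = 1,
  TranslatingCounterFDFailed,
  RSeqDisableFailed,
  FunctionDataMappingFailed,
  AuxiliaryMemorySetupFailed
};

std::string exitCodeToString(int ExitCode) {
  switch (ExitCode) {
  case CounterFDReadFailed:
    return "Counter file descriptor read failed";
  case TranslatingCounterFDFailed:
    return "Translating counter file descriptor into a file descriptor in "
           "the child process failed";
  case RSeqDisableFailed:
    return "Disabling restartable sequences failed";
  case FunctionDataMappingFailed:
    return "Failed to map memory for assembled snippet";
  case AuxiliaryMemorySetupFailed:
    return "Failed to setup auxiliary memory";
  default:
    return "Child process returned with unknown exit code";
  }
}

// Forks a child that exits immediately with ExitCode and returns the message
// the parent derives from the wait status.
std::string describeChildExit(int ExitCode) {
  pid_t PID = fork();
  if (PID == -1)
    return "fork failed";
  if (PID == 0)
    _exit(ExitCode); // Child: skip atexit handlers and stdio flushing.
  int Status = 0;
  if (waitpid(PID, &Status, 0) == -1)
    return "waitpid failed";
  if (!WIFEXITED(Status))
    return "Child did not exit normally";
  int Code = WEXITSTATUS(Status);
  return Code == 0 ? "Success" : exitCodeToString(Code);
}
```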

Error createSubProcessAndRunBenchmark(		Error createSubProcessAndRunBenchmark(
StringRef CounterName, SmallVectorImpl<int64_t> &CounterValues) const {		StringRef CounterName, SmallVectorImpl<int64_t> &CounterValues) const {
int PipeFiles[2];		int PipeFiles[2];
int PipeSuccessOrErr = pipe(PipeFiles);		int PipeSuccessOrErr = pipe(PipeFiles);
if (PipeSuccessOrErr != 0) {		if (PipeSuccessOrErr != 0) {
return make_error<Failure>(		return make_error<Failure>(
"Failed to create a pipe for interprocess communication between "		"Failed to create a pipe for interprocess communication between "
"llvm-exegesis and the benchmarking subprocess");		"llvm-exegesis and the benchmarking subprocess");
}		}

		SubprocessMemory SPMemory;
		Error MemoryInitError = SPMemory.initializeSubprocessMemory(getpid());
		if (MemoryInitError)
		return MemoryInitError;

		Error AddMemDefError =
		SPMemory.addMemoryDefinition(Key.MemoryValues, getpid());
		if (AddMemDefError)
		return AddMemDefError;

pid_t ParentOrChildPID = fork();		pid_t ParentOrChildPID = fork();
if (ParentOrChildPID == 0) {		if (ParentOrChildPID == 0) {
// We are in the child process, close the write end of the pipe		// We are in the child process, close the write end of the pipe
close(PipeFiles[1]);		close(PipeFiles[1]);
// Unregister handlers, signal handling is now handled through ptrace in		// Unregister handlers, signal handling is now handled through ptrace in
// the host process		// the host process
llvm::sys::unregisterHandlers();		llvm::sys::unregisterHandlers();
prepareAndRunBenchmark(PipeFiles[0], Key);		prepareAndRunBenchmark(PipeFiles[0], Key);
[[noreturn]] void prepareAndRunBenchmark(int Pipe,
int ParentPIDFD = syscall(SYS_pidfd_open, ParentPID, 0);		int ParentPIDFD = syscall(SYS_pidfd_open, ParentPID, 0);
int CounterFileDescriptor =		int CounterFileDescriptor =
syscall(SYS_pidfd_getfd, ParentPIDFD, ParentCounterFileDescriptor, 0);		syscall(SYS_pidfd_getfd, ParentPIDFD, ParentCounterFileDescriptor, 0);

if (CounterFileDescriptor == -1) {		if (CounterFileDescriptor == -1) {
exit(ChildProcessExitCodeE::TranslatingCounterFDFailed);		exit(ChildProcessExitCodeE::TranslatingCounterFDFailed);
}		}

ioctl(CounterFileDescriptor, PERF_EVENT_IOC_RESET);		// Glibc versions greater than 2.35 automatically call rseq during
this->Function(nullptr);		// initialization Unmapping the region that glibc sets up for this causes
ioctl(CounterFileDescriptor, PERF_EVENT_IOC_DISABLE);		// segfaults in the program Unregister the rseq region so that we can safely
		// unmap it later
		#ifdef __GLIBC__
		#if __GLIBC_MINOR__ >= 35
		long RseqDisableOutput =
		syscall(SYS_rseq, (intptr_t)__builtin_thread_pointer() + __rseq_offset,
		__rseq_size, RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
		if (RseqDisableOutput != 0)
		exit(ChildProcessExitCodeE::RSeqDisableFailed);
		#endif
		#endif

		size_t FunctionDataCopySize = this->Function.FunctionBytes.size();
		char *FunctionDataCopy =
		(char *)mmap(NULL, FunctionDataCopySize, PROT_READ | PROT_WRITE,
		MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);
		if ((intptr_t)FunctionDataCopy == -1)
		exit(ChildProcessExitCodeE::FunctionDataMappingFailed);

		memcpy(FunctionDataCopy, this->Function.FunctionBytes.data(),
		this->Function.FunctionBytes.size());
		mprotect(FunctionDataCopy, FunctionDataCopySize, PROT_READ | PROT_EXEC);

		Expected<int> AuxMemFDOrError =
		SubprocessMemory::setupAuxiliaryMemoryInSubprocess(
		Key.MemoryValues, ParentPID, CounterFileDescriptor);
		if (!AuxMemFDOrError)
		exit(ChildProcessExitCodeE::AuxiliaryMemorySetupFailed);

		((void (*)(size_t, int))(intptr_t)FunctionDataCopy)(FunctionDataCopySize,
		*AuxMemFDOrError);

exit(0);		exit(0);
}		}
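The copy-then-protect step in this function can be sketched in isolation as follows. `copyToExecutableMapping` is a hypothetical helper, not part of the patch. Unlike the code above, the sketch compares against `MAP_FAILED`, passes `-1` as the file descriptor (which the mmap documentation recommends for portability with `MAP_ANONYMOUS`), and checks the result of `mprotect`.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Hypothetical helper: copies Size bytes into a fresh anonymous mapping and
// then switches the mapping from read/write to read/execute. Returns nullptr
// on failure; the caller owns the mapping and releases it with munmap.
void *copyToExecutableMapping(const void *Bytes, std::size_t Size) {
  void *Mapping = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (Mapping == MAP_FAILED)
    return nullptr;
  std::memcpy(Mapping, Bytes, Size);
  // Drop write permission before the bytes are ever executed.
  if (mprotect(Mapping, Size, PROT_READ | PROT_EXEC) != 0) {
    munmap(Mapping, Size);
    return nullptr;
  }
  return Mapping;
}
```

The sketch only maps and protects the bytes; it does not call into them, since that would require a valid function for the host architecture.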

Expected<llvm::SmallVector<int64_t, 4>>		Expected<llvm::SmallVector<int64_t, 4>>
runWithCounter(StringRef CounterName) const override {		runWithCounter(StringRef CounterName) const override {
SmallVector<int64_t, 4> Value(1, 0);		SmallVector<int64_t, 4> Value(1, 0);
Error PossibleBenchmarkError =		Error PossibleBenchmarkError =
#endif
const ExecutableFunction Function;		const ExecutableFunction Function;
const BenchmarkKey &Key;		const BenchmarkKey &Key;
};		};
#endif // __linux__		#endif // __linux__
} // namespace		} // namespace

Expected<SmallString<0>> BenchmarkRunner::assembleSnippet(		Expected<SmallString<0>> BenchmarkRunner::assembleSnippet(
const BenchmarkCode &BC, const SnippetRepetitor &Repetitor,		const BenchmarkCode &BC, const SnippetRepetitor &Repetitor,
unsigned MinInstructions, unsigned LoopBodySize) const {		unsigned MinInstructions, unsigned LoopBodySize,
		bool GenerateMemoryInstructions) const {
const std::vector<MCInst> &Instructions = BC.Key.Instructions;		const std::vector<MCInst> &Instructions = BC.Key.Instructions;
SmallString<0> Buffer;		SmallString<0> Buffer;
raw_svector_ostream OS(Buffer);		raw_svector_ostream OS(Buffer);
if (Error E = assembleToStream(		if (Error E = assembleToStream(
State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns,		State.getExegesisTarget(), State.createTargetMachine(), BC.LiveIns,
BC.Key.RegisterInitialValues,		BC.Key.RegisterInitialValues,
Repetitor.Repeat(Instructions, MinInstructions, LoopBodySize), OS)) {		Repetitor.Repeat(Instructions, MinInstructions, LoopBodySize,
		GenerateMemoryInstructions),
		OS, BC.Key, GenerateMemoryInstructions)) {
return std::move(E);		return std::move(E);
}		}
return Buffer;		return Buffer;
}		}

Expected<BenchmarkRunner::RunnableConfiguration>		Expected<BenchmarkRunner::RunnableConfiguration>
BenchmarkRunner::getRunnableConfiguration(		BenchmarkRunner::getRunnableConfiguration(
const BenchmarkCode &BC, unsigned NumRepetitions, unsigned LoopBodySize,		const BenchmarkCode &BC, unsigned NumRepetitions, unsigned LoopBodySize,
const SnippetRepetitor &Repetitor) const {		const SnippetRepetitor &Repetitor) const {
RunnableConfiguration RC;		RunnableConfiguration RC;

Benchmark &InstrBenchmark = RC.InstrBenchmark;		Benchmark &InstrBenchmark = RC.InstrBenchmark;
InstrBenchmark.Mode = Mode;		InstrBenchmark.Mode = Mode;
InstrBenchmark.CpuName = std::string(State.getTargetMachine().getTargetCPU());		InstrBenchmark.CpuName = std::string(State.getTargetMachine().getTargetCPU());
InstrBenchmark.LLVMTriple =		InstrBenchmark.LLVMTriple =
State.getTargetMachine().getTargetTriple().normalize();		State.getTargetMachine().getTargetTriple().normalize();
InstrBenchmark.NumRepetitions = NumRepetitions;		InstrBenchmark.NumRepetitions = NumRepetitions;
InstrBenchmark.Info = BC.Info;		InstrBenchmark.Info = BC.Info;

const std::vector<MCInst> &Instructions = BC.Key.Instructions;		const std::vector<MCInst> &Instructions = BC.Key.Instructions;

		bool GenerateMemoryInstructions = ExecutionMode == ExecutionModeE::SubProcess;

InstrBenchmark.Key = BC.Key;		InstrBenchmark.Key = BC.Key;

// Assemble at least kMinInstructionsForSnippet instructions by repeating		// Assemble at least kMinInstructionsForSnippet instructions by repeating
// the snippet for debug/analysis. This is so that the user clearly		// the snippet for debug/analysis. This is so that the user clearly
// understands that the inside instructions are repeated.		// understands that the inside instructions are repeated.
if (BenchmarkPhaseSelector > BenchmarkPhaseSelectorE::PrepareSnippet) {		if (BenchmarkPhaseSelector > BenchmarkPhaseSelectorE::PrepareSnippet) {
const int MinInstructionsForSnippet = 4 * Instructions.size();		const int MinInstructionsForSnippet = 4 * Instructions.size();
const int LoopBodySizeForSnippet = 2 * Instructions.size();		const int LoopBodySizeForSnippet = 2 * Instructions.size();
auto Snippet = assembleSnippet(BC, Repetitor, MinInstructionsForSnippet,		auto Snippet =
LoopBodySizeForSnippet);		assembleSnippet(BC, Repetitor, MinInstructionsForSnippet,
		LoopBodySizeForSnippet, GenerateMemoryInstructions);
if (Error E = Snippet.takeError())		if (Error E = Snippet.takeError())
return std::move(E);		return std::move(E);
const ExecutableFunction EF(State.createTargetMachine(),		const ExecutableFunction EF(State.createTargetMachine(),
getObjectFromBuffer(*Snippet));		getObjectFromBuffer(*Snippet));
const auto FnBytes = EF.getFunctionBytes();		const auto FnBytes = EF.getFunctionBytes();
llvm::append_range(InstrBenchmark.AssembledSnippet, FnBytes);		llvm::append_range(InstrBenchmark.AssembledSnippet, FnBytes);
}		}

// Assemble NumRepetitions instructions repetitions of the snippet for		// Assemble NumRepetitions instructions repetitions of the snippet for
// measurements.		// measurements.
if (BenchmarkPhaseSelector > BenchmarkPhaseSelectorE::PrepareAndAssembleSnippet) {		if (BenchmarkPhaseSelector >
		BenchmarkPhaseSelectorE::PrepareAndAssembleSnippet) {
auto Snippet = assembleSnippet(BC, Repetitor, InstrBenchmark.NumRepetitions,		auto Snippet = assembleSnippet(BC, Repetitor, InstrBenchmark.NumRepetitions,
LoopBodySize);		LoopBodySize, GenerateMemoryInstructions);
if (Error E = Snippet.takeError())		if (Error E = Snippet.takeError())
return std::move(E);		return std::move(E);
RC.ObjectFile = getObjectFromBuffer(*Snippet);		RC.ObjectFile = getObjectFromBuffer(*Snippet);
}		}

return std::move(RC);		return std::move(RC);
}		}


llvm/tools/llvm-exegesis/lib/SnippetRepetitor.h

public:
virtual ~SnippetRepetitor();		virtual ~SnippetRepetitor();

// Returns the set of registers that are reserved by the repetitor.		// Returns the set of registers that are reserved by the repetitor.
virtual BitVector getReservedRegs() const = 0;		virtual BitVector getReservedRegs() const = 0;

// Returns a functor that repeats `Instructions` so that the function executes		// Returns a functor that repeats `Instructions` so that the function executes
// at least `MinInstructions` instructions.		// at least `MinInstructions` instructions.
virtual FillFunction Repeat(ArrayRef<MCInst> Instructions,		virtual FillFunction Repeat(ArrayRef<MCInst> Instructions,
unsigned MinInstructions,		unsigned MinInstructions, unsigned LoopBodySize,
unsigned LoopBodySize) const = 0;		bool CleanupMemory) const = 0;

explicit SnippetRepetitor(const LLVMState &State) : State(State) {}		explicit SnippetRepetitor(const LLVMState &State) : State(State) {}

protected:		protected:
const LLVMState &State;		const LLVMState &State;
};		};

} // namespace exegesis		} // namespace exegesis
} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/tools/llvm-exegesis/lib/SnippetRepetitor.cpp


class DuplicateSnippetRepetitor : public SnippetRepetitor {		class DuplicateSnippetRepetitor : public SnippetRepetitor {
public:		public:
using SnippetRepetitor::SnippetRepetitor;		using SnippetRepetitor::SnippetRepetitor;

// Repeats the snippet until there are at least MinInstructions in the		// Repeats the snippet until there are at least MinInstructions in the
// resulting code.		// resulting code.
FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,		FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,
unsigned LoopBodySize) const override {		unsigned LoopBodySize,
return [Instructions, MinInstructions](FunctionFiller &Filler) {		bool CleanupMemory) const override {
		return [this, Instructions, MinInstructions,
		CleanupMemory](FunctionFiller &Filler) {
auto Entry = Filler.getEntry();		auto Entry = Filler.getEntry();
if (!Instructions.empty()) {		if (!Instructions.empty()) {
// Add the whole snippet at least once.		// Add the whole snippet at least once.
Entry.addInstructions(Instructions);		Entry.addInstructions(Instructions);
for (unsigned I = Instructions.size(); I < MinInstructions; ++I) {		for (unsigned I = Instructions.size(); I < MinInstructions; ++I) {
Entry.addInstruction(Instructions[I % Instructions.size()]);		Entry.addInstruction(Instructions[I % Instructions.size()]);
}		}
}		}
Entry.addReturn();		Entry.addReturn(State.getExegesisTarget(), CleanupMemory);
};		};
}		}

BitVector getReservedRegs() const override {		BitVector getReservedRegs() const override {
// We're using no additional registers.		// We're using no additional registers.
return State.getRATC().emptyRegisters();		return State.getRATC().emptyRegisters();
}		}
};		};
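The duplication logic above, reduced to a self-contained sketch: plain `int` values stand in for `MCInst`, and `repeatSnippet` is a hypothetical name. The return instruction and the memory cleanup performed by `addReturn` are left out.

```cpp
#include <cstddef>
#include <vector>

// Emits the whole snippet once, then keeps appending instructions in order,
// wrapping around, until at least MinInstructions have been emitted.
std::vector<int> repeatSnippet(const std::vector<int> &Instructions,
                               unsigned MinInstructions) {
  std::vector<int> Out;
  if (Instructions.empty())
    return Out;
  // Add the whole snippet at least once, even if it exceeds MinInstructions.
  Out = Instructions;
  for (std::size_t I = Instructions.size(); I < MinInstructions; ++I)
    Out.push_back(Instructions[I % Instructions.size()]);
  return Out;
}
```

Note that the last repetition may be partial: a three-instruction snippet with `MinInstructions` of 7 yields two full copies plus the first instruction.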

class LoopSnippetRepetitor : public SnippetRepetitor {		class LoopSnippetRepetitor : public SnippetRepetitor {
public:		public:
explicit LoopSnippetRepetitor(const LLVMState &State)		explicit LoopSnippetRepetitor(const LLVMState &State)
: SnippetRepetitor(State),		: SnippetRepetitor(State),
LoopCounter(State.getExegesisTarget().getLoopCounterRegister(		LoopCounter(State.getExegesisTarget().getLoopCounterRegister(
State.getTargetMachine().getTargetTriple())) {}		State.getTargetMachine().getTargetTriple())) {}

// Loop over the snippet ceil(MinInstructions / Instructions.Size()) times.		// Loop over the snippet ceil(MinInstructions / Instructions.Size()) times.
FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,		FillFunction Repeat(ArrayRef<MCInst> Instructions, unsigned MinInstructions,
unsigned LoopBodySize) const override {		unsigned LoopBodySize,
return [this, Instructions, MinInstructions,		bool CleanupMemory) const override {
LoopBodySize](FunctionFiller &Filler) {		return [this, Instructions, MinInstructions, LoopBodySize,
		CleanupMemory](FunctionFiller &Filler) {
const auto &ET = State.getExegesisTarget();		const auto &ET = State.getExegesisTarget();
auto Entry = Filler.getEntry();		auto Entry = Filler.getEntry();

// We can not use loop snippet repetitor for terminator instructions.		// We can not use loop snippet repetitor for terminator instructions.
for (const MCInst &Inst : Instructions) {		for (const MCInst &Inst : Instructions) {
const unsigned Opcode = Inst.getOpcode();		const unsigned Opcode = Inst.getOpcode();
const MCInstrDesc &MCID = Filler.MCII->get(Opcode);		const MCInstrDesc &MCID = Filler.MCII->get(Opcode);
if (!MCID.isTerminator())		if (!MCID.isTerminator())
continue;		continue;
Entry.addReturn();		Entry.addReturn(State.getExegesisTarget(), CleanupMemory);
return;		return;
}		}

auto Loop = Filler.addBasicBlock();		auto Loop = Filler.addBasicBlock();
auto Exit = Filler.addBasicBlock();		auto Exit = Filler.addBasicBlock();

const unsigned LoopUnrollFactor =		const unsigned LoopUnrollFactor =
LoopBodySize <= Instructions.size()		LoopBodySize <= Instructions.size()
return [this, Instructions, MinInstructions, LoopBodySize,
(void)_;		(void)_;
Loop.addInstructions(Instructions);		Loop.addInstructions(Instructions);
}		}
ET.decrementLoopCounterAndJump(Loop.MBB, Loop.MBB,		ET.decrementLoopCounterAndJump(Loop.MBB, Loop.MBB,
State.getInstrInfo());		State.getInstrInfo());

// Set up the exit basic block.		// Set up the exit basic block.
Loop.MBB->addSuccessor(Exit.MBB, BranchProbability::getZero());		Loop.MBB->addSuccessor(Exit.MBB, BranchProbability::getZero());
Exit.addReturn();		Exit.addReturn(State.getExegesisTarget(), CleanupMemory);
};		};
}		}

BitVector getReservedRegs() const override {		BitVector getReservedRegs() const override {
// We're using a single loop counter, but we have to reserve all aliasing		// We're using a single loop counter, but we have to reserve all aliasing
// registers.		// registers.
return State.getRATC().getRegister(LoopCounter).aliasedBits();		return State.getRATC().getRegister(LoopCounter).aliasedBits();
}		}
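The iteration count named in the `LoopSnippetRepetitor` comment, ceil(MinInstructions / Instructions.size()), can be computed with integer arithmetic as sketched below. `loopIterations` is a hypothetical name, and the sketch ignores the `LoopBodySize` unrolling, whose details are in the collapsed part of the diff.

```cpp
// Returns ceil(MinInstructions / SnippetSize) without floating point.
// An empty snippet needs no iterations.
unsigned loopIterations(unsigned MinInstructions, unsigned SnippetSize) {
  if (SnippetSize == 0)
    return 0;
  // Round up when the division leaves a remainder.
  return MinInstructions / SnippetSize +
         (MinInstructions % SnippetSize != 0 ? 1u : 0u);
}
```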

llvm/unittests/tools/llvm-exegesis/Common/AssemblerUtils.h

return std::unique_ptr<LLVMTargetMachine>(
static_cast<LLVMTargetMachine *>(TM));		static_cast<LLVMTargetMachine *>(TM));
}		}

ExecutableFunction		ExecutableFunction
assembleToFunction(ArrayRef<RegisterValue> RegisterInitialValues,		assembleToFunction(ArrayRef<RegisterValue> RegisterInitialValues,
FillFunction Fill) {		FillFunction Fill) {
SmallString<256> Buffer;		SmallString<256> Buffer;
raw_svector_ostream AsmStream(Buffer);		raw_svector_ostream AsmStream(Buffer);
		BenchmarkKey Key;
		Key.RegisterInitialValues = RegisterInitialValues;
EXPECT_FALSE(assembleToStream(ET, createTargetMachine(), /*LiveIns=*/{},		EXPECT_FALSE(assembleToStream(ET, createTargetMachine(), /*LiveIns=*/{},
RegisterInitialValues, Fill, AsmStream));		RegisterInitialValues, Fill, AsmStream, Key,
		false));
return ExecutableFunction(createTargetMachine(),		return ExecutableFunction(createTargetMachine(),
getObjectFromBuffer(AsmStream.str()));		getObjectFromBuffer(AsmStream.str()));
}		}

const std::string TT;		const std::string TT;
const std::string CpuName;		const std::string CpuName;
const bool CanExecute;		const bool CanExecute;
const ExegesisTarget *const ET;		const ExegesisTarget *const ET;
};		};

} // namespace exegesis		} // namespace exegesis
} // namespace llvm		} // namespace llvm

#endif		#endif

llvm/unittests/tools/llvm-exegesis/X86/SnippetRepetitorTest.cpp

void SetUp() override {
MF = &createVoidVoidPtrMachineFunction("TestFn", Mod.get(), MMI.get());		MF = &createVoidVoidPtrMachineFunction("TestFn", Mod.get(), MMI.get());
}		}

void TestCommon(Benchmark::RepetitionModeE RepetitionMode) {		void TestCommon(Benchmark::RepetitionModeE RepetitionMode) {
const auto Repetitor = SnippetRepetitor::Create(RepetitionMode, State);		const auto Repetitor = SnippetRepetitor::Create(RepetitionMode, State);
const std::vector<MCInst> Instructions = {MCInstBuilder(X86::NOOP)};		const std::vector<MCInst> Instructions = {MCInstBuilder(X86::NOOP)};
FunctionFiller Sink(*MF, {X86::EAX});		FunctionFiller Sink(*MF, {X86::EAX});
const auto Fill =		const auto Fill =
Repetitor->Repeat(Instructions, kMinInstructions, kLoopBodySize);		Repetitor->Repeat(Instructions, kMinInstructions, kLoopBodySize, false);
Fill(Sink);		Fill(Sink);
}		}

static constexpr const unsigned kMinInstructions = 3;		static constexpr const unsigned kMinInstructions = 3;
static constexpr const unsigned kLoopBodySize = 5;		static constexpr const unsigned kLoopBodySize = 5;

std::unique_ptr<LLVMTargetMachine> TM;		std::unique_ptr<LLVMTargetMachine> TM;
std::unique_ptr<LLVMContext> Context;		std::unique_ptr<LLVMContext> Context;

Diff 526920