This is an archive of the discontinued LLVM Phabricator instance.

Differential D131930

[mlgo] Add in-development instruction based features for regalloc advisor
Closed · Public

Authored by aidengrossman on Aug 15 2022, 4:44 PM.

Details

Reviewers

MatzeB
mtrofin

Commits

rGe5e3dccd0741: [mlgo] Add in-development instruction based features for regalloc advisor

Summary

This patch adds instruction-based features to the regalloc advisor,
gated behind a flag so that a user can decide at runtime whether to
enable them. The features are only available when LLVM is compiled in
MLGO development mode (i.e. when LLVM_HAVE_TF_API is set to true).

To extract the instruction features, I'm taking a list of segments from
each LiveInterval and noting the start and end SlotIndices. This list is then
sorted by start SlotIndex, and I iterate through each SlotIndex to grab
instructions, making sure to check for overlaps. This results in a vector
of opcodes and a binary mapping matrix that maps live ranges to the
opcodes of the instructions within that LR.
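The extraction described above can be sketched in a reduced, LLVM-free form. This is only an illustration under stated assumptions, not the code in the patch: slot indices are modeled as plain `int`s, `Segment`, `ExtractedFeatures`, and `extractSketch` are hypothetical names, a negative return from `GetOpcode` stands for "no instruction at this slot", and `End` is treated as inclusive.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for one live range segment: [Begin, End] in slot-index
// units, plus the row of the mapping matrix that the owning live range uses.
struct Segment {
  int Begin;
  int End;
  std::size_t Row;
};

struct ExtractedFeatures {
  std::vector<int64_t> Opcodes;              // zero-padded to MaxInstrs entries
  std::vector<std::vector<int64_t>> Mapping; // Mapping[Row][InstrIdx] is 0 or 1
};

ExtractedFeatures extractSketch(std::vector<Segment> Segments,
                                std::size_t NumRows, std::size_t MaxInstrs,
                                const std::function<int(int)> &GetOpcode) {
  ExtractedFeatures Out;
  Out.Opcodes.assign(MaxInstrs, 0);
  Out.Mapping.assign(NumRows, std::vector<int64_t>(MaxInstrs, 0));
  if (Segments.empty())
    return Out;

  // Sort by start index so the inner loop can stop at the first segment that
  // begins after the index being examined.
  std::sort(Segments.begin(), Segments.end(),
            [](const Segment &A, const Segment &B) { return A.Begin < B.Begin; });
  int Last = Segments.front().End;
  for (const Segment &S : Segments) {
    assert(S.Row < NumRows && "segment row out of range");
    Last = std::max(Last, S.End);
  }

  std::size_t InstrIdx = 0;
  for (int Index = Segments.front().Begin;
       Index <= Last && InstrIdx < MaxInstrs; ++Index) {
    const int Opcode = GetOpcode(Index);
    if (Opcode < 0)
      continue; // no instruction at this slot
    bool Covered = false;
    for (const Segment &S : Segments) {
      if (S.Begin > Index)
        break; // sorted by Begin: no later segment can cover Index
      if (Index <= S.End) {
        Out.Mapping[S.Row][InstrIdx] = 1;
        Covered = true;
      }
    }
    if (Covered)
      Out.Opcodes[InstrIdx++] = Opcode;
  }
  return Out;
}
```

Passing the opcode lookup as a callback keeps the sketch independent of any LLVM analysis, so overlapping or disjoint segments can be exercised with made-up opcodes.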

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aidengrossman created this revision. Aug 15 2022, 4:44 PM

Herald added a project: Restricted Project. Aug 15 2022, 4:44 PM

Herald added subscribers: mtrofin, mgrang, hiraditya and 2 others.

aidengrossman requested review of this revision. Aug 15 2022, 4:44 PM

Herald added a project: Restricted Project. Aug 15 2022, 4:44 PM

Herald added a subscriber: llvm-commits.

aidengrossman added reviewers: MatzeB, mtrofin. Aug 15 2022, 4:46 PM

Harbormaster completed remote builds in B181411: Diff 452848. Aug 15 2022, 5:13 PM

mtrofin added inline comments. Aug 16 2022, 12:48 PM

llvm/test/CodeGen/MLRegalloc/dev-mode-extra-features-logging.ll
3	Because the output is very verbose, perhaps just checking specific values may be more maintainable? Like checking that you counted things correctly in your 33x300 matrix - add some comments maybe about what the output looks like and why certain sentinel values are expected. This would also help one identify, later, if the test breaks legitimately due to a change (e.g. some machine instruction is now not present anymore -> of course the features change) vs due to a bug.

Changed up dev mode extra features test to make it less fragile by getting rid
of the exact logging diff check and adding more FileCheck checks on the
instructions and mapping matrix for the first eviction problem.

aidengrossman added inline comments. Aug 16 2022, 2:29 PM

llvm/test/CodeGen/MLRegalloc/dev-mode-extra-features-logging.ll
3	That's a very good point. The test could potentially be pretty fragile if checking the exact code output. I reworked the check to get rid of the exact diff and added a lot more FileCheck checks along with comments to make sure that the output looks as expected to some reasonable degree.

Harbormaster completed remote builds in B181631: Diff 453131. Aug 16 2022, 2:34 PM

mtrofin added inline comments. Aug 16 2022, 5:45 PM

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
910	this deserves more comments about: the whole structure of the data packet you're preparing and the various steps along the way - it'll greatly help with maintainability, etc.
911	may be more readable to have a small struct with the 2 slot indices and size_t with nice readable names, and then avoid the whole std::get<x> business.
927	nit: we generally do if (!CurrentMachineInstruction).
1052	nit: or, `// continue from the feature index the previous loop left off` ?
llvm/test/CodeGen/MLRegalloc/dev-mode-extra-features-logging.ll
3	thanks, this looks way more manageable!
53	add a new line

Adjust comments to better document the extraction algorithm and the data that
it returns. Also refactor to passing around a struct rather than a tuple and
adjust a comment in the dev mode logging.

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
910	I've added quite a few more comments on the structure of the extraction algorithm itself and the structure of the data that it extracts. Let me know if anything is unclear or too verbose and I'll work on fixing it.
1052	This was left over from some experimentation that I have since removed, and thus was commenting on a line that isn't even there. I've changed it to note your suggestion and also to explain why the indexing is slightly odd (only going to `FeaturesWithDevelopmentCount - 1` rather than `FeaturesWithDevelopmentCount` itself).

Harbormaster completed remote builds in B181671: Diff 453187. Aug 16 2022, 8:01 PM

Fixed release mode build and gated more of the development feature tooling
with the preprocessing behind the LLVM_HAVE_TF_API flag.

Harbormaster completed remote builds in B182150: Diff 453885. Aug 18 2022, 10:58 PM

Just making sure this patch doesn't go anywhere yet. When built with assertions on, the new development features currently trip an assertion when CurrentIndex.getNextIndex() is called while CurrentMachineInstruction is equal to nullptr. Need to do some more debugging/investigation to see why this is going on.

Add an if statement to handle the edge case where the current machine
instruction is null and it tries to skip to the slot index of the next machine
instruction but can't because it is at the end of the slot index analysis.

Harbormaster completed remote builds in B186025: Diff 459302. Sep 10 2022, 1:09 PM

mtrofin added inline comments. Sep 12 2022, 5:10 PM

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
139	rename this to `OpcodeValueCutoff` - `Count` to me refers at this point to how many opcodes in the tensor, but IIUC you actually mean values larger than that are unknown to us. Same for the comment.
251	you can avoid the -1 thing by splitting your RA_EVICT_FEATURES_UNDER_DEVELOPMENT_LIST in 2, the "first" and the "rest", and then forcing the index of the first to be == `FeatureCount`:
    #ifdef LLVM_HAVE_TF_API
    #define RA_FIRST_EXTRA_FEATURE(M) \
      M(int64_t, instructions, InstructionsShape, \
        "Opcodes of the instructions covered by the eviction problem")
    #define RA_REST_EXTRA_FEATURES(M) \
      M(int64_t, instructions_mapping, InstructionsMappingShape, \
        "A binary matrix mapping LRs to instruction opcodes")
    #define RA_EVICT_FEATURES_UNDER_DEVELOPMENT(M) \
      RA_FIRST_EXTRA_FEATURE(M) \
      RA_REST_EXTRA_FEATURES(M)
    #else
    #define RA_FIRST_EXTRA_FEATURE(M)
    #define RA_REST_EXTRA_FEATURES(M)
    #endif
    #define RA_EVICT_FEATURES_UNDER_DEVELOPMENT(M) \
      RA_FIRST_EXTRA_FEATURE(M) \
      RA_REST_EXTRA_FEATURES(M)
    enum FeatureIDs {
    #define _FEATURE_IDX_SIMPLE(_, name, __, ___) name
    #define _FEATURE_IDX(A,B,C,D) _FEATURE_IDX_SIMPLE(A,B,C,D),
      RA_EVICT_FEATURES_LIST(_FEATURE_IDX) FeatureCount,
      RA_FIRST_EXTRA_FEATURE(_FEATURE_IDX_SIMPLE) = FeatureCount,
      RA_REST_EXTRA_FEATURES(_FEATURE_IDX) FeaturesWithDevelopmentCount
    #undef _FEATURE_IDX
    };
264	Initialize things that don't have default ctors: it's easy to do at the def, and it avoids nondeterministic bugs. For `Pos`, at minimum, set it to 0.
266	add a comment as to what `Pos` is - it's the index of the column corresponding to the physical register of the live interval segment captured by this LRStartEndInfo, right? (or the candidate) also a comment for LRStartEndInfo overall
1014	This feels like it'd benefit from a unittest. The core logic is about intervals that cover instruction opcodes, so the only LLVM-ness of it is in `LIS->getInstructionFromIndex(CurrentIndex)` (because `LIS->getSlotIndexes()->getLastIndex()` is basically a constant you can pass in). You can test it by making a small change: make this into a utility that takes LRPosInfo and a std::function (or a llvm::function_ref, whatever) that gives the opcode for a SlotIndex. The utility doesn't need to know about mlevictadvisor or anything - it's just dealing with intervals. You can also pass a Runner in, and in the test case, just use the NoInferenceModelRunner. So then you can set up all sorts of interesting interval overlapping cases, and you can just populate your opcodes with incrementing numbers or something. The nice thing is that the unittest doesn't need any #define specific stuff - it's generic. The rest can stay the same - meaning, let `MLEvictAdvisor::extractInstructionFeatures` call that utility, etc.
1019	(nit - it took me a few reads to get it) "A vector of size "max instruction count". It contains the instruction opcodes of instructions covered by all intervals in LRPosInfo" wdyt?
1028	This is rather the Instruction "Index" or something like that, right? Count to me means total.
1029	Nit: CurrentSegmentIdx
1035	typo: process"ed"
1058	you don't need the -1 thing here if you split the extra list in 2
1073	the exit condition could include LRPosInfo[OverlapCheckCurrentSegment].Begin > CurrentIndex, for more clarity?
1083	can't `LRPosInfo[OverlapCheckCurrentSegment].End < CurrentIndex`? So we know the CurrentIndex is below Begin, but it could also be below this segment's end?
    Current instruction: 5
    CurrentSegment: [1,7)
    NextSegment: [2,4)
    NextNextSegment: [2, 8)
1099	you just need to go to the next segment and potentially set CurrentIndex to that one's beginning if it's after CurrentIndex (simpler code)
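The first/rest split suggested in the comment on line 251 can be tried out in isolation. The sketch below is a reduced illustration with hypothetical macro and feature names (`BASE_FEATURES`, `FIRST_EXTRA_FEATURE`, `mask`, `weight`, and so on are stand-ins, and the single-argument macros are a simplification of the four-argument form used in the review). It shows how pinning the first extra feature to `FeatureCount` keeps the extra features contiguous with the base ones, so no `- 1` adjustment is needed.

```cpp
#include <cassert>

// Hypothetical base feature list (X-macro).
#define BASE_FEATURES(M) \
  M(mask)                \
  M(weight)              \
  M(progress)

// Hypothetical extra features, split into "first" and "rest".
#define FIRST_EXTRA_FEATURE(M) M(instructions)
#define REST_EXTRA_FEATURES(M) M(instructions_mapping)

enum FeatureIDs {
#define FEATURE_IDX_SIMPLE(name) name
#define FEATURE_IDX(name) FEATURE_IDX_SIMPLE(name),
  BASE_FEATURES(FEATURE_IDX) FeatureCount,
  // The first extra feature reuses the value of FeatureCount, so the extras
  // start exactly where the base features end.
  FIRST_EXTRA_FEATURE(FEATURE_IDX_SIMPLE) = FeatureCount,
  REST_EXTRA_FEATURES(FEATURE_IDX) FeaturesWithDevelopmentCount
#undef FEATURE_IDX
#undef FEATURE_IDX_SIMPLE
};

static_assert(FeatureCount == 3, "three base features");
static_assert(instructions == FeatureCount, "first extra is pinned");
static_assert(FeaturesWithDevelopmentCount == 5, "base plus two extras");

// Zero-based position of an extra feature among the extras.
inline int developmentFeatureOffset(FeatureIDs ID) {
  assert(ID >= FeatureCount && ID < FeaturesWithDevelopmentCount);
  return ID - FeatureCount;
}
```

With this layout, a loop over the extra features can run from `FeatureCount` up to, but not including, `FeaturesWithDevelopmentCount`.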

aidengrossman marked 10 inline comments as done. Sep 13 2022, 1:26 AM

aidengrossman added inline comments.

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
264	That's a good point. Done for `Pos`. The `SlotIndex` class has a default constructor creating an invalid index, so we should be good there.
266	Added a comment for the entirety of `LRStartEndInfo` which also covers the meaning of the `Pos` member.

Address some of the inline comments (still a couple more to get to).

Harbormaster completed remote builds in B186322: Diff 459679. Sep 13 2022, 2:13 AM

Addressed more inline comments. Still some to go.

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
1083	`LRPosInfo[OverlapCheckCurrentSegment].End` can be less than `CurrentIndex`, but it shouldn't be missing any instructions. It's not checking if `CurrentIndex` is below begin, but rather if `CurrentIndex` is above the beginning of the current segment, because if the beginning of the current segment being checked is greater than the current index under analysis, it is guaranteed that the segment currently being processed as well as all future segments don't contain the currently being processed index due to the segments being sorted in ascending order by beginning index.

Harbormaster completed remote builds in B186523: Diff 459969. Sep 13 2022, 8:48 PM

aidengrossman added inline comments. Sep 15 2022, 4:18 PM

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp
1083	Nevermind my previous comment. Not sure what I was thinking there. This was an actual issue and I ended up catching it while looking through some unit test cases. Should be fixed on the next push as I added in a condition to only set the relevant element of the `instructions_mapping` matrix if `CurrentIndex <= LRPosInfo[OverlapCheckCurrentSegment].End`.
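The issue acknowledged here can be reproduced with the reviewer's own numbers (instruction 5 against segments [1,7), [2,4), [2,8)). The following is a hedged, standalone sketch rather than the patch's code: `Range` and `coversIndex` are hypothetical names, indices are plain `int`s, and the end check uses the `CurrentIndex <= End` form quoted above (for index 5 the inclusive and half-open readings agree).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a segment's start and end slot indices.
struct Range {
  int Begin;
  int End;
};

// For ranges sorted by Begin, reports which ranges are considered to cover
// Index. With CheckEnd == false only the Begin test is applied, which is the
// behaviour questioned in the review; with CheckEnd == true the End test is
// applied as well.
std::vector<bool> coversIndex(const std::vector<Range> &Sorted, int Index,
                              bool CheckEnd) {
  std::vector<bool> Covers(Sorted.size(), false);
  for (std::size_t I = 0; I < Sorted.size(); ++I) {
    assert((I == 0 || Sorted[I - 1].Begin <= Sorted[I].Begin) &&
           "ranges must be sorted by Begin");
    if (Sorted[I].Begin > Index)
      break; // later ranges start even further right
    Covers[I] = !CheckEnd || Index <= Sorted[I].End;
  }
  return Covers;
}
```

For `{{1, 7}, {2, 4}, {2, 8}}` and index 5, the Begin-only variant marks all three ranges, including [2,4), which ended before the instruction; adding the End test leaves only the first and third marked.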

Refactor for unit tests, add unit tests, and modify main feature extraction code
to fix some edge cases found through unit testing.

Herald added subscribers: arphaman, mgorny. Sep 16 2022, 8:38 AM

Harbormaster completed remote builds in B187164: Diff 460781. Sep 16 2022, 8:45 AM

Removed conditional compilation for the feature extraction for the in
development features and moved size constants to MLRegallocEvictAdvisor.h to
allow for the constants to be used during unit testing.

mtrofin added inline comments. Sep 16 2022, 4:24 PM

llvm/include/llvm/CodeGen/SlotIndexes.h
150	(just noticed this) can you avoid this, because it's just for test, and instead just replace the SlotIndex in your collection?

aidengrossman added inline comments. Sep 16 2022, 4:29 PM

llvm/include/llvm/CodeGen/SlotIndexes.h
150	This is for instantiation of each SlotIndex within the list. I would like to avoid it if possible too, but in order to set up the test case the SlotIndex needs to have a valid `IndexListEntry*` pointer, and there is no publicly exposed constructor that allows setting that, so I'm calling the default constructor and setting the pointer with this function. So there doesn't seem to be a practical way to avoid this, unless there's some easy way that I'm just not seeing?

mtrofin added inline comments. Sep 16 2022, 5:03 PM

llvm/include/llvm/CodeGen/SlotIndexes.h
150	Ugh... I'd rather the ctor be public - that way, the state management of a `SlotIndex` stays the same (once created, it doesn't change its `lie` - easier to understand / maintain) or (my initial preference) introduce an abstraction around the slot index stuff. IIRC there isn't that much you actually need from that API.

Harbormaster completed remote builds in B187276: Diff 460937. Sep 16 2022, 5:28 PM

Changed test setup to use the SlotIndex constructor instead of setting the
IndexListEntry pointer directly, by getting rid of the pointer setter
and making the constructor public with a comment noting that the
constructor isn't for general public consumption.

Harbormaster completed remote builds in B187288: Diff 460951. Sep 16 2022, 6:08 PM

Fixed conditional compilation for dev features so that the default case
doesn't break.

Harbormaster completed remote builds in B187302: Diff 460967. Sep 17 2022, 12:20 AM

mtrofin accepted this revision. Sep 17 2022, 9:27 AM

This revision is now accepted and ready to land. Sep 17 2022, 9:27 AM

This revision was landed with ongoing or failed builds. Sep 17 2022, 12:55 PM

Closed by commit rGe5e3dccd0741: [mlgo] Add in-development instruction based features for regalloc advisor (authored by aidengrossman).

This revision was automatically updated to reflect the committed changes.

aidengrossman added a commit: rGe5e3dccd0741: [mlgo] Add in-development instruction based features for regalloc advisor.

lkail added a subscriber: lkail. Sep 18 2022, 6:03 AM

Revision Contents

Path                                                             Size
llvm/include/llvm/CodeGen/SlotIndexes.h                          8 lines
llvm/lib/CodeGen/MLRegallocEvictAdvisor.h                        72 lines
llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp                      245 lines
llvm/test/CodeGen/MLRegalloc/dev-mode-extra-features-logging.ll  52 lines
llvm/unittests/CodeGen/CMakeLists.txt                            1 line
llvm/unittests/CodeGen/MLRegallocDevelopmentFeatures.cpp         209 lines

Diff 461025

llvm/include/llvm/CodeGen/SlotIndexes.h

[102 lines not shown] enum Slot {
       /// used anywhere.
       Slot_Dead,

       Slot_Count
     };

     PointerIntPair<IndexListEntry*, 2, unsigned> lie;

-    SlotIndex(IndexListEntry *entry, unsigned slot)
-      : lie(entry, slot) {}
-
     IndexListEntry* listEntry() const {
       assert(isValid() && "Attempt to compare reserved index.");
 #ifdef EXPENSIVE_CHECKS
       assert(!lie.getPointer()->isPoisoned() &&
              "Attempt to access deleted list-entry.");
 #endif // EXPENSIVE_CHECKS
       return lie.getPointer();
     }
[12 lines not shown] enum {
       /// The default distance between instructions as returned by distance().
       /// This may vary as instructions are inserted and removed.
       InstrDist = 4 * Slot_Count
     };

     /// Construct an invalid index.
     SlotIndex() = default;

+    // Creates a SlotIndex from an IndexListEntry and a slot. Generally should
+    // not be used. This method is only public to facilitate writing certain
+    // unit tests.
+    SlotIndex(IndexListEntry *entry, unsigned slot) : lie(entry, slot) {}
+
     // Construct a new slot index from the given one, and set the slot.
     SlotIndex(const SlotIndex &li, Slot s) : lie(li.listEntry(), unsigned(s)) {
       assert(lie.getPointer() != nullptr &&
              "Attempt to construct index with 0 pointer.");
     }

     /// Returns true if this is a valid index. Invalid indices do
     /// not point into an index table, and cannot be compared.
     bool isValid() const {
       return lie.getPointer();
     }

     /// Return true for a valid index.
     explicit operator bool() const { return isValid(); }

[494 lines not shown]

llvm/lib/CodeGen/MLRegallocEvictAdvisor.h

This file was added.

				//===- MLRegAllocEvictAdvisor.cpp - ML eviction advisor -------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// Function declarations of utilities related to feature extraction for unit
				// testing.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CODEGEN_MLREGALLOCEVICTIONADVISOR_H
				#define LLVM_CODEGEN_MLREGALLOCEVICTIONADVISOR_H

				#include "llvm/Analysis/MLModelRunner.h"
				#include "llvm/CodeGen/SlotIndexes.h"

				using namespace llvm;

				// LRStartEndInfo contains the start and end of a specific live range as
				// slot indices as well as storing the index of the physical register it
				// is assigned to (or 1 above the phys reg count if its the candidate).
				// Used when extracting per-instruction features in the context of a
				// specific eviction problem.
				struct LRStartEndInfo {
				SlotIndex Begin;
				SlotIndex End;
				size_t Pos = 0;
				};

				void extractInstructionFeatures(
				llvm::SmallVectorImpl<LRStartEndInfo> &LRPosInfo,
				MLModelRunner *RegallocRunner, function_ref<int(SlotIndex)> GetOpcode,
				const int InstructionsIndex, const int InstructionsMappingIndex,
				const SlotIndex LastIndex);

				// This is the maximum number of interfererring ranges. That's the number of
				// distinct AllocationOrder values, which comes from MCRegisterClass::RegsSize.
				// For X86, that's 32.
				// TODO: find a way to get this, statically, in a programmatic way.
				static const int64_t MaxInterferences = 32;

				// Logically, we can think of the feature set given to the evaluator as a 2D
				// matrix. The rows are the features (see next). The columns correspond to the
				// interferences. We treat the candidate virt reg as an 'interference', too, as
				// its feature set is the same as that of the interferring ranges. So we'll have
				// MaxInterferences + 1 columns and by convention, we will use the last column
				// for the virt reg seeking allocation.
				static const int64_t CandidateVirtRegPos = MaxInterferences;
				static const int64_t NumberOfInterferences = CandidateVirtRegPos + 1;

				// The number of instructions that a specific live range might have is variable,
				// but we're passing in a single matrix of instructions and tensorflow saved
				// models only support a fixed input size, so we have to cap the number of
				// instructions that can be passed along. The specific value was derived from
				// experimentation such that the majority of eviction problems would be
				// completely covered.
				static const int ModelMaxSupportedInstructionCount = 300;

				// When extracting per-instruction features, the advisor will currently create
				// a vector of size ModelMaxSupportedInstructionCount to hold the opcodes of the
				// instructions relevant to the eviction problem, and a NumberOfInterferences *
				// ModelMaxSupportedInstructionCount matrix that maps LRs to the instructions
				// that they span.
				static const std::vector<int64_t> InstructionsShape{
				1, ModelMaxSupportedInstructionCount};
				static const std::vector<int64_t> InstructionsMappingShape{
				1, NumberOfInterferences, ModelMaxSupportedInstructionCount};

				#endif // LLVM_CODEGEN_MLREGALLOCEVICTIONADVISOR_H
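
As a quick sanity check on the shapes declared in this header, the sketch below copies the constants and computes how many elements each tensor holds. The helpers `elementCount` and `mappingOffset` are hypothetical and not part of the patch, and the row-major layout assumed by `mappingOffset` (one row of `ModelMaxSupportedInstructionCount` entries per live range position) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Constants copied from the header above.
static const int64_t MaxInterferences = 32;
static const int64_t CandidateVirtRegPos = MaxInterferences;
static const int64_t NumberOfInterferences = CandidateVirtRegPos + 1;
static const int ModelMaxSupportedInstructionCount = 300;

static const std::vector<int64_t> InstructionsShape{
    1, ModelMaxSupportedInstructionCount};
static const std::vector<int64_t> InstructionsMappingShape{
    1, NumberOfInterferences, ModelMaxSupportedInstructionCount};

// Hypothetical helper: number of elements in a tensor of the given shape.
std::size_t elementCount(const std::vector<int64_t> &Shape) {
  std::size_t Count = 1;
  for (int64_t Dim : Shape) {
    assert(Dim > 0 && "dimensions must be positive");
    Count *= static_cast<std::size_t>(Dim);
  }
  return Count;
}

// Hypothetical helper: flat offset of (live range position, instruction index)
// in the mapping tensor, assuming a row-major layout.
std::size_t mappingOffset(std::size_t Pos, std::size_t InstrIdx) {
  assert(Pos < static_cast<std::size_t>(NumberOfInterferences));
  assert(InstrIdx < static_cast<std::size_t>(ModelMaxSupportedInstructionCount));
  return Pos * ModelMaxSupportedInstructionCount + InstrIdx;
}
```

Under these assumptions the opcode vector holds 300 entries and the mapping matrix 33 x 300 = 9900 entries, with the candidate's row starting at offset 32 x 300 = 9600.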

llvm/lib/CodeGen/MLRegallocEvictAdvisor.cpp

Show All 14 Lines
#include "RegAllocGreedy.h"		#include "RegAllocGreedy.h"
#include "llvm/Analysis/MLModelRunner.h"		#include "llvm/Analysis/MLModelRunner.h"
#include "llvm/Analysis/TensorSpec.h"		#include "llvm/Analysis/TensorSpec.h"
#if defined(LLVM_HAVE_TF_AOT_REGALLOCEVICTMODEL) \|\| defined(LLVM_HAVE_TF_API)		#if defined(LLVM_HAVE_TF_AOT_REGALLOCEVICTMODEL) \|\| defined(LLVM_HAVE_TF_API)
#include "llvm/Analysis/ModelUnderTrainingRunner.h"		#include "llvm/Analysis/ModelUnderTrainingRunner.h"
#include "llvm/Analysis/NoInferenceModelRunner.h"		#include "llvm/Analysis/NoInferenceModelRunner.h"
#include "llvm/Analysis/Utils/TrainingLogger.h"		#include "llvm/Analysis/Utils/TrainingLogger.h"
#endif		#endif
		#include "MLRegallocEvictAdvisor.h"
#include "llvm/Analysis/ReleaseModeModelRunner.h"		#include "llvm/Analysis/ReleaseModeModelRunner.h"
#include "llvm/CodeGen/CalcSpillWeights.h"		#include "llvm/CodeGen/CalcSpillWeights.h"
#include "llvm/CodeGen/LiveRegMatrix.h"		#include "llvm/CodeGen/LiveRegMatrix.h"
#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"		#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineLoopInfo.h"		#include "llvm/CodeGen/MachineLoopInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/Passes.h"		#include "llvm/CodeGen/Passes.h"
Show All 28 Lines
static cl::opt<std::string> TrainingLog(		static cl::opt<std::string> TrainingLog(
"regalloc-training-log", cl::Hidden,		"regalloc-training-log", cl::Hidden,
cl::desc("Training log for the register allocator eviction model"));		cl::desc("Training log for the register allocator eviction model"));

static cl::opt<std::string> ModelUnderTraining(		static cl::opt<std::string> ModelUnderTraining(
"regalloc-model", cl::Hidden,		"regalloc-model", cl::Hidden,
cl::desc("The model being trained for register allocation eviction"));		cl::desc("The model being trained for register allocation eviction"));

		static cl::opt<bool> EnableDevelopmentFeatures(
		"regalloc-enable-development-features", cl::Hidden,
		cl::desc("Whether or not to enable features under development for the ML "
		"regalloc advisor"));

		#else
		static const bool EnableDevelopmentFeatures = false;
#endif // #ifdef LLVM_HAVE_TF_API		#endif // #ifdef LLVM_HAVE_TF_API

extern cl::opt<unsigned> EvictInterferenceCutoff;		extern cl::opt<unsigned> EvictInterferenceCutoff;

/// The score injection pass.		/// The score injection pass.
/// This pass calculates the score for a function and inserts it in the log, but		/// This pass calculates the score for a function and inserts it in the log, but
/// this happens only in development mode. It's a no-op otherwise.		/// this happens only in development mode. It's a no-op otherwise.
namespace llvm {		namespace llvm {
Show All 30 Lines

INITIALIZE_PASS(RegAllocScoring, "regallocscoringpass",		INITIALIZE_PASS(RegAllocScoring, "regallocscoringpass",
"Register Allocation Scoring Pass", false, false)		"Register Allocation Scoring Pass", false, false)

// ===================================		// ===================================
// Common ML Advisor declarations		// Common ML Advisor declarations
// ===================================		// ===================================
namespace {		namespace {
// This is the maximum number of interfererring ranges. That's the number of		// The model can only accept a specified number of opcodes and will error it if
// distinct AllocationOrder values, which comes from MCRegisterClass::RegsSize.		// fed an opcode it hasn't seen before. This constant sets the current cutoff.
// For X86, that's 32.		static const int OpcodeValueCutoff = 17716;
// TODO: find a way to get this, statically, in a programmatic way.
static const int64_t MaxInterferences = 32;

// Logically, we can think of the feature set given to the evaluator as a 2D
// matrix. The rows are the features (see next). The columns correspond to the
// interferences. We treat the candidate virt reg as an 'interference', too, as
// its feature set is the same as that of the interferring ranges. So we'll have
// MaxInterferences + 1 columns and by convention, we will use the last column
// for the virt reg seeking allocation.
static const int64_t CandidateVirtRegPos = MaxInterferences;
static const int64_t NumberOfInterferences = CandidateVirtRegPos + 1;

// Most features are as described above, so we'll reuse this vector in defining		// Most features are as described above, so we'll reuse this vector in defining
// them.		// them.
static const std::vector<int64_t> PerLiveRangeShape{1, NumberOfInterferences};		static const std::vector<int64_t> PerLiveRangeShape{1, NumberOfInterferences};

// --------------		// --------------
// Features table		// Features table
// --------------		// --------------
// For each interfering live range (incl. the candidate) we collect a number of		// For each interfering live range (incl. the candidate) we collect a number of
// features. However, because the features are of different types (and because		// features. However, because the features are of different types (and because
// of ML best practices), we organize the tensors per feature, not per		// of ML best practices), we organize the tensors per feature, not per
// candidate. Each such tensor has a scalar value corresponding to the		// candidate. Each such tensor has a scalar value corresponding to the
// interferring live range at that position, in the order in AllocationOrder.		// interferring live range at that position, in the order in AllocationOrder.
// The last position corresponds to the virt reg seeking allocation.		// The last position corresponds to the virt reg seeking allocation.
// Exception to all that is the progression feature, which is just a scalar (see		// Exception to all that is the progression feature, which is just a scalar (see
// its documentation for details).		// its documentation for details).
		mtrofinUnsubmitted Done Reply Inline Actions rename this to `OpcodeValueCutoff` - `Count` to me refers at this point to how many opcodes in the tensor, but IIUC you actually mean values larger than that are unknown to us. Same for the comment. mtrofin: rename this to `OpcodeValueCutoff` - `Count` to me refers at this point to how many opcodes in…
// Note on naming: the "_by_max" are normalized using the largest value of that		// Note on naming: the "_by_max" are normalized using the largest value of that
// tensor, as observed in the current decision making stage (i.e. for the		// tensor, as observed in the current decision making stage (i.e. for the
// current call to the advisor's tryFindEvictionCandidate)		// current call to the advisor's tryFindEvictionCandidate)
//		//
// The feature list format: type, name, shape, documentation.		// The feature list format: type, name, shape, documentation.
// Note: we can really just use int64 and float, hence the modeling of some		// Note: we can really just use int64 and float, hence the modeling of some
// bools as int64 values.		// bools as int64 values.
#define RA_EVICT_FEATURES_LIST(M) \		#define RA_EVICT_FEATURES_LIST(M) \
Show All 37 Lines	#define RA_EVICT_FEATURES_LIST(M) \
M(float, use_def_density, PerLiveRangeShape, \		M(float, use_def_density, PerLiveRangeShape, \
"the max weight, as computed by the manual heuristic") \		"the max weight, as computed by the manual heuristic") \
M(int64_t, max_stage, PerLiveRangeShape, \		M(int64_t, max_stage, PerLiveRangeShape, \
"largest stage of an interval in this LR") \		"largest stage of an interval in this LR") \
M(int64_t, min_stage, PerLiveRangeShape, \		M(int64_t, min_stage, PerLiveRangeShape, \
"lowest stage of an interval in this LR") \		"lowest stage of an interval in this LR") \
M(float, progress, {1}, "ratio of current queue size to initial size")		M(float, progress, {1}, "ratio of current queue size to initial size")

		#ifdef LLVM_HAVE_TF_API
		#define RA_EVICT_FIRST_DEVELOPMENT_FEATURE(M) \
		M(int64_t, instructions, InstructionsShape, \
		"Opcodes of the instructions covered by the eviction problem")

		#define RA_EVICT_REST_DEVELOPMENT_FEATURES(M) \
		M(int64_t, instructions_mapping, InstructionsMappingShape, \
		"A binary matrix mapping LRs to instruction opcodes")
		#else
		#define RA_EVICT_FIRST_DEVELOPMENT_FEATURE(M)
		#define RA_EVICT_REST_DEVELOPMENT_FEATURES(M)
		#endif

// The model learns to pick one of the mask == 1 interferences. This is the		// The model learns to pick one of the mask == 1 interferences. This is the
// name of the output tensor. The contract with the model is that the output		// name of the output tensor. The contract with the model is that the output
// will be guaranteed to be to a mask == 1 position. Using a macro here to		// will be guaranteed to be to a mask == 1 position. Using a macro here to
// avoid 'not used' warnings (and keep cond compilation to a minimum)		// avoid 'not used' warnings (and keep cond compilation to a minimum)
#define DecisionName "index_to_evict"		#define DecisionName "index_to_evict"

// Named features index.		// Named features index.
enum FeatureIDs {		enum FeatureIDs {
#define _FEATURE_IDX(_, name, __, ___) name,		#define _FEATURE_IDX_SIMPLE(_, name, __, ___) name
RA_EVICT_FEATURES_LIST(_FEATURE_IDX)		#define _FEATURE_IDX(A, B, C, D) _FEATURE_IDX_SIMPLE(A, B, C, D),
		RA_EVICT_FEATURES_LIST(_FEATURE_IDX) FeatureCount,
		#ifdef LLVM_HAVE_TF_API
		RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_FEATURE_IDX_SIMPLE) = FeatureCount,
		#else
		RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_FEATURE_IDX)
		#endif // #ifdef LLVM_HAVE_TF_API
		RA_EVICT_REST_DEVELOPMENT_FEATURES(_FEATURE_IDX) FeaturesWithDevelopmentCount
#undef _FEATURE_IDX		#undef _FEATURE_IDX
FeatureCount		#undef _FEATURE_IDX_SIMPLE
};		};
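A minimal standalone sketch of the enum layout above, assuming three stand-in base features and single-argument macros instead of the four-argument `M(type, name, shape, doc)` form used in the patch. The macro names `BASE_FEATURES`, `FIRST_DEV_FEATURE`, `REST_DEV_FEATURES`, `IDX` and `IDX_SIMPLE` are hypothetical. It illustrates how emitting the first development feature without a trailing comma lets it be pinned to `FeatureCount`, so the development features continue the numbering without any `- 1` adjustment.

```cpp
// Hypothetical stand-ins for the real feature lists (not the patch's macros).
#define BASE_FEATURES(M) M(max_stage) M(min_stage) M(progress)
#define FIRST_DEV_FEATURE(M) M(instructions)
#define REST_DEV_FEATURES(M) M(instructions_mapping)

enum FeatureIDs {
// IDX_SIMPLE emits just the name; IDX emits the name plus a trailing comma.
#define IDX_SIMPLE(name) name
#define IDX(name) IDX_SIMPLE(name),
  BASE_FEATURES(IDX) FeatureCount,
  // No trailing comma from IDX_SIMPLE, so an explicit value can follow.
  FIRST_DEV_FEATURE(IDX_SIMPLE) = FeatureCount,
  REST_DEV_FEATURES(IDX) FeaturesWithDevelopmentCount
#undef IDX
#undef IDX_SIMPLE
};

// FeatureCount counts only the base features; the first development feature
// reuses that value as its own index.
static_assert(instructions == FeatureCount, "first dev feature is pinned");
static_assert(FeaturesWithDevelopmentCount == FeatureCount + 2,
              "two development features follow the base ones");
```

With this layout, code that only knows about the base features can keep looping up to `FeatureCount`, while development-mode code loops up to `FeaturesWithDevelopmentCount`.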

// The ML advisor will typically have a sparse input to the evaluator, because		// The ML advisor will typically have a sparse input to the evaluator, because
// various phys regs won't be available. It's easier (maintenance-wise) to		// various phys regs won't be available. It's easier (maintenance-wise) to
// bulk-reset the state of the evaluator each time we are about to use it		// bulk-reset the state of the evaluator each time we are about to use it
// again.		// again.
template <typename T> size_t getTotalSize(const std::vector<int64_t> &Shape) {		template <typename T> size_t getTotalSize(const std::vector<int64_t> &Shape) {
size_t Ret = sizeof(T);		size_t Ret = sizeof(T);
for (const auto V : Shape)		for (const auto V : Shape)
Ret *= V;		Ret *= V;
return Ret;		return Ret;
}		}

void resetInputs(MLModelRunner &Runner) {		void resetInputs(MLModelRunner &Runner) {
#define _RESET(TYPE, NAME, SHAPE, __) \		#define _RESET(TYPE, NAME, SHAPE, __) \
std::memset(Runner.getTensorUntyped(FeatureIDs::NAME), 0, \		std::memset(Runner.getTensorUntyped(FeatureIDs::NAME), 0, \
getTotalSize<TYPE>(SHAPE));		getTotalSize<TYPE>(SHAPE));
RA_EVICT_FEATURES_LIST(_RESET)		RA_EVICT_FEATURES_LIST(_RESET)
		if (EnableDevelopmentFeatures) {
		RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_RESET)
		RA_EVICT_REST_DEVELOPMENT_FEATURES(_RESET)
#undef _RESET		#undef _RESET
}		}
		}
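A small sketch of the bulk-reset idea described above, outside of LLVM. `totalSizeInBytes` mirrors `getTotalSize` from the diff, while `resetTensor` is a hypothetical helper standing in for the `_RESET` macro body; the real code obtains the buffer from `MLModelRunner::getTensorUntyped`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Bytes needed for a tensor with element type T and the given shape.
template <typename T>
std::size_t totalSizeInBytes(const std::vector<std::int64_t> &Shape) {
  std::size_t Ret = sizeof(T);
  for (std::int64_t Dim : Shape)
    Ret *= static_cast<std::size_t>(Dim);
  return Ret;
}

// Hypothetical helper: zero a whole tensor buffer in one call, so columns
// that are never populated afterwards stay at 0.
template <typename T>
void resetTensor(T *Buffer, const std::vector<std::int64_t> &Shape) {
  std::memset(Buffer, 0, totalSizeInBytes<T>(Shape));
}
```

Zeroing everything up front means the feature-loading code only has to write the columns that are actually available, which is the maintenance benefit the comment refers to.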

// Per-live interval components that get aggregated into the feature values		// Per-live interval components that get aggregated into the feature values
// that will be passed to the evaluator.		// that will be passed to the evaluator.
		mtrofin (Done): you can avoid the -1 thing by splitting your RA_EVICT_FEATURES_UNDER_DEVELOPMENT_LIST in 2, the "first" and the "rest", and then forcing the index of the first to be == `FeatureCount`
		  #ifdef LLVM_HAVE_TF_API
		  #define RA_FIRST_EXTRA_FEATURE(M) \
		    M(int64_t, instructions, InstructionsShape, \
		      "Opcodes of the instructions covered by the eviction problem")
		  #define RA_REST_EXTRA_FEATURES(M) \
		    M(int64_t, instructions_mapping, InstructionsMappingShape, \
		      "A binary matrix mapping LRs to instruction opcodes")
		  #define RA_EVICT_FEATURES_UNDER_DEVELOPMENT(M) \
		    RA_FIRST_EXTRA_FEATURE(M) \
		    RA_REST_EXTRA_FEATURES(M)
		  #else
		  #define RA_FIRST_EXTRA_FEATURE(M)
		  #define RA_REST_EXTRA_FEATURES(M)
		  #endif
		  #define RA_EVICT_FEATURES_UNDER_DEVELOPMENT(M) \
		    RA_FIRST_EXTRA_FEATURE(M) \
		    RA_REST_EXTRA_FEATURES(M)
		  enum FeatureIDs {
		  #define _FEATURE_IDX_SIMPLE(_, name, __, ___) name
		  #define _FEATURE_IDX(A,B,C,D) _FEATURE_IDX_SIMPLE(A,B,C,D),
		    RA_EVICT_FEATURES_LIST(_FEATURE_IDX) FeatureCount,
		    RA_FIRST_EXTRA_FEATURE(_FEATURE_IDX_SIMPLE) = FeatureCount,
		    RA_REST_EXTRA_FEATURES(_FEATURE_IDX) FeaturesWithDevelopmentCount
		  #undef _FEATURE_IDX
		  };
struct LIFeatureComponents {		struct LIFeatureComponents {
double R = 0;		double R = 0;
double W = 0;		double W = 0;
double RW = 0;		double RW = 0;
double IndVarUpdates = 0;		double IndVarUpdates = 0;
double HintWeights = 0.0;		double HintWeights = 0.0;
int64_t NrDefsAndUses = 0;		int64_t NrDefsAndUses = 0;
float HottestBlockFreq = 0.0;		float HottestBlockFreq = 0.0;
bool IsRemat = false;		bool IsRemat = false;
};		};

using CandidateRegList =		using CandidateRegList =
std::array<std::pair<MCRegister, bool>, NumberOfInterferences>;		std::array<std::pair<MCRegister, bool>, NumberOfInterferences>;
		mtrofin (Done): Initialize things that don't have default ctors; it's easy to do at the definition and avoids nondeterministic bugs. `Pos`, at minimum, should be set to 0.
		aidengrossman (Author, Done): That's a good point. Done for `Pos`. The `SlotIndex` class has a default constructor creating an invalid index, so we should be good there.
using FeaturesListNormalizer =		using FeaturesListNormalizer =
llvm::SmallVector<float, FeatureIDs::FeatureCount>;		llvm::SmallVector<float, FeatureIDs::FeatureCount>;
		mtrofin (Done): Add a comment as to what `Pos` is: it's the index of the column corresponding to the physical register (or the candidate) of the live interval segment captured by this `LRStartEndInfo`, right? Also add a comment for `LRStartEndInfo` overall.
		aidengrossman (Author, Done): Added a comment for the entirety of `LRStartEndInfo` which also covers the meaning of the `Pos` member.
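A sketch of what the two threads above ask for, using hypothetical stand-ins so that it compiles without LLVM. `FakeSlotIndex` only imitates the property the author relies on (default construction yields an invalid index), and `LRStartEndInfoSketch` is an illustration of the struct's shape as described in the comments, not a copy of the definition in the header.

```cpp
#include <cstddef>

// Hypothetical stand-in for SlotIndex: a default-constructed value is invalid.
class FakeSlotIndex {
  int Value = -1;

public:
  FakeSlotIndex() = default;
  explicit FakeSlotIndex(int V) : Value(V) {}
  bool isValid() const { return Value >= 0; }
};

// Start and end of one live-interval segment, plus the column (Pos) of the
// physical register or candidate that the segment belongs to.
struct LRStartEndInfoSketch {
  FakeSlotIndex Begin;  // has a default ctor, so no initializer is needed
  FakeSlotIndex End;    // likewise
  std::size_t Pos = 0;  // plain integer: initialize at the definition
};
```

Without the `= 0`, a default-initialized `Pos` would hold an indeterminate value, which is the kind of nondeterministic bug the reviewer is pointing at.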

/// The ML evictor (commonalities between release and development mode)		/// The ML evictor (commonalities between release and development mode)
class MLEvictAdvisor : public RegAllocEvictionAdvisor {		class MLEvictAdvisor : public RegAllocEvictionAdvisor {
public:		public:
MLEvictAdvisor(const MachineFunction &MF, const RAGreedy &RA,		MLEvictAdvisor(const MachineFunction &MF, const RAGreedy &RA,
MLModelRunner *Runner, const MachineBlockFrequencyInfo &MBFI,		MLModelRunner *Runner, const MachineBlockFrequencyInfo &MBFI,
const MachineLoopInfo &Loops);		const MachineLoopInfo &Loops);

Show All 13 Lines	protected:
virtual int64_t		virtual int64_t
tryFindEvictionCandidatePosition(const LiveInterval &VirtReg,		tryFindEvictionCandidatePosition(const LiveInterval &VirtReg,
const AllocationOrder &Order,		const AllocationOrder &Order,
unsigned OrderLimit, uint8_t CostPerUseLimit,		unsigned OrderLimit, uint8_t CostPerUseLimit,
const SmallVirtRegSet &FixedRegisters) const;		const SmallVirtRegSet &FixedRegisters) const;

/// Load the features of the given VirtReg (allocated or not) at column Pos,		/// Load the features of the given VirtReg (allocated or not) at column Pos,
/// but if that can't be evicted, return false instead.		/// but if that can't be evicted, return false instead.
bool loadInterferenceFeatures(const LiveInterval &VirtReg, MCRegister PhysReg,		bool
bool IsHint,		loadInterferenceFeatures(const LiveInterval &VirtReg, MCRegister PhysReg,
const SmallVirtRegSet &FixedRegisters,		bool IsHint, const SmallVirtRegSet &FixedRegisters,
llvm::SmallVectorImpl<float> &Largest,		llvm::SmallVectorImpl<float> &Largest, size_t Pos,
size_t Pos) const;		SmallVectorImpl<LRStartEndInfo> &LRPosInfo) const;

private:		private:
static float getInitialQueueSize(const MachineFunction &MF);		static float getInitialQueueSize(const MachineFunction &MF);

MCRegister tryFindEvictionCandidate(		MCRegister tryFindEvictionCandidate(
const LiveInterval &VirtReg, const AllocationOrder &Order,		const LiveInterval &VirtReg, const AllocationOrder &Order,
uint8_t CostPerUseLimit,		uint8_t CostPerUseLimit,
const SmallVirtRegSet &FixedRegisters) const override;		const SmallVirtRegSet &FixedRegisters) const override;

void extractFeatures(const SmallVectorImpl<const LiveInterval *> &Intervals,		void extractFeatures(const SmallVectorImpl<const LiveInterval *> &Intervals,
llvm::SmallVectorImpl<float> &Largest, size_t Pos,		llvm::SmallVectorImpl<float> &Largest, size_t Pos,
int64_t IsHint, int64_t LocalIntfsCount,		int64_t IsHint, int64_t LocalIntfsCount, float NrUrgent,
float NrUrgent) const;		SmallVectorImpl<LRStartEndInfo> &LRPosInfo) const;

// Point-in-time: we didn't learn this, so we always delegate to the		// Point-in-time: we didn't learn this, so we always delegate to the
// default.		// default.
bool canEvictHintInterference(		bool canEvictHintInterference(
const LiveInterval &VirtReg, MCRegister PhysReg,		const LiveInterval &VirtReg, MCRegister PhysReg,
const SmallVirtRegSet &FixedRegisters) const override {		const SmallVirtRegSet &FixedRegisters) const override {
return getDefaultAdvisor().canEvictHintInterference(VirtReg, PhysReg,		return getDefaultAdvisor().canEvictHintInterference(VirtReg, PhysReg,
FixedRegisters);		FixedRegisters);
Show All 26 Lines
// ===================================		// ===================================
// Release (AOT) - specifics		// Release (AOT) - specifics
// ===================================		// ===================================
class ReleaseModeEvictionAdvisorAnalysis final		class ReleaseModeEvictionAdvisorAnalysis final
: public RegAllocEvictionAdvisorAnalysis {		: public RegAllocEvictionAdvisorAnalysis {
public:		public:
ReleaseModeEvictionAdvisorAnalysis()		ReleaseModeEvictionAdvisorAnalysis()
: RegAllocEvictionAdvisorAnalysis(AdvisorMode::Release) {		: RegAllocEvictionAdvisorAnalysis(AdvisorMode::Release) {
		if (EnableDevelopmentFeatures) {
		InputFeatures = {RA_EVICT_FEATURES_LIST(
		_DECL_FEATURES) RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_DECL_FEATURES)
		RA_EVICT_REST_DEVELOPMENT_FEATURES(_DECL_FEATURES)};
		} else {
InputFeatures = {RA_EVICT_FEATURES_LIST(_DECL_FEATURES)};		InputFeatures = {RA_EVICT_FEATURES_LIST(_DECL_FEATURES)};
}		}
		}
// support for isa<> and dyn_cast.		// support for isa<> and dyn_cast.
static bool classof(const RegAllocEvictionAdvisorAnalysis *R) {		static bool classof(const RegAllocEvictionAdvisorAnalysis *R) {
return R->getAdvisorMode() == AdvisorMode::Release;		return R->getAdvisorMode() == AdvisorMode::Release;
}		}

private:		private:
std::vector<TensorSpec> InputFeatures;		std::vector<TensorSpec> InputFeatures;

▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	private:
Logger *const Log;		Logger *const Log;
};		};

class DevelopmentModeEvictionAdvisorAnalysis final		class DevelopmentModeEvictionAdvisorAnalysis final
: public RegAllocEvictionAdvisorAnalysis {		: public RegAllocEvictionAdvisorAnalysis {
public:		public:
DevelopmentModeEvictionAdvisorAnalysis()		DevelopmentModeEvictionAdvisorAnalysis()
: RegAllocEvictionAdvisorAnalysis(AdvisorMode::Development) {		: RegAllocEvictionAdvisorAnalysis(AdvisorMode::Development) {
		if (EnableDevelopmentFeatures) {
		InputFeatures = {RA_EVICT_FEATURES_LIST(
		_DECL_FEATURES) RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_DECL_FEATURES)
		RA_EVICT_REST_DEVELOPMENT_FEATURES(_DECL_FEATURES)};
		TrainingInputFeatures = {
		RA_EVICT_FEATURES_LIST(_DECL_TRAIN_FEATURES)
		RA_EVICT_FIRST_DEVELOPMENT_FEATURE(_DECL_TRAIN_FEATURES)
		RA_EVICT_REST_DEVELOPMENT_FEATURES(_DECL_TRAIN_FEATURES)
		TensorSpec::createSpec<float>("action_discount", {1}),
		TensorSpec::createSpec<int32_t>("action_step_type", {1}),
		TensorSpec::createSpec<float>("action_reward", {1})};
		} else {
InputFeatures = {RA_EVICT_FEATURES_LIST(_DECL_FEATURES)};		InputFeatures = {RA_EVICT_FEATURES_LIST(_DECL_FEATURES)};
TrainingInputFeatures = {		TrainingInputFeatures = {
RA_EVICT_FEATURES_LIST(_DECL_TRAIN_FEATURES)		RA_EVICT_FEATURES_LIST(_DECL_TRAIN_FEATURES)
TensorSpec::createSpec<float>("action_discount", {1}),		TensorSpec::createSpec<float>("action_discount", {1}),
TensorSpec::createSpec<int32_t>("action_step_type", {1}),		TensorSpec::createSpec<int32_t>("action_step_type", {1}),
TensorSpec::createSpec<float>("action_reward", {1})};		TensorSpec::createSpec<float>("action_reward", {1})};
}		}
		}
// support for isa<> and dyn_cast.		// support for isa<> and dyn_cast.
static bool classof(const RegAllocEvictionAdvisorAnalysis *R) {		static bool classof(const RegAllocEvictionAdvisorAnalysis *R) {
return R->getAdvisorMode() == AdvisorMode::Development;		return R->getAdvisorMode() == AdvisorMode::Development;
}		}

/// get the logger for the given function, or nullptr if we didn't collect		/// get the logger for the given function, or nullptr if we didn't collect
/// one. This is used to inject the score by the RegAllocScoring pass.		/// one. This is used to inject the score by the RegAllocScoring pass.
Logger *getLogger(const MachineFunction &MF) const {		Logger *getLogger(const MachineFunction &MF) const {
▲ Show 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	int64_t MLEvictAdvisor::tryFindEvictionCandidatePosition(
assert(Ret >= 0);		assert(Ret >= 0);
assert(Ret <= CandidateVirtRegPos);		assert(Ret <= CandidateVirtRegPos);
return Ret;		return Ret;
}		}

bool MLEvictAdvisor::loadInterferenceFeatures(		bool MLEvictAdvisor::loadInterferenceFeatures(
const LiveInterval &VirtReg, MCRegister PhysReg, bool IsHint,		const LiveInterval &VirtReg, MCRegister PhysReg, bool IsHint,
const SmallVirtRegSet &FixedRegisters,		const SmallVirtRegSet &FixedRegisters,
llvm::SmallVectorImpl<float> &Largest, size_t Pos) const {		llvm::SmallVectorImpl<float> &Largest, size_t Pos,
		llvm::SmallVectorImpl<LRStartEndInfo> &LRPosInfo) const {
// It is only possible to evict virtual register interference.		// It is only possible to evict virtual register interference.
if (Matrix->checkInterference(VirtReg, PhysReg) > LiveRegMatrix::IK_VirtReg) {		if (Matrix->checkInterference(VirtReg, PhysReg) > LiveRegMatrix::IK_VirtReg) {
// leave unavailable		// leave unavailable
return false;		return false;
}		}

const bool IsLocal = LIS->intervalIsInOneMBB(VirtReg);		const bool IsLocal = LIS->intervalIsInOneMBB(VirtReg);
int64_t LocalIntfs = 0;		int64_t LocalIntfs = 0;
▲ Show 20 Lines • Show All 42 Lines • ▼ Show 20 Lines	for (const LiveInterval *Intf : reverse(IFIntervals)) {

LocalIntfs += (IsLocal && LIS->intervalIsInOneMBB(*Intf) &&		LocalIntfs += (IsLocal && LIS->intervalIsInOneMBB(*Intf) &&
(!EnableLocalReassign || !canReassign(*Intf, PhysReg)));		(!EnableLocalReassign || !canReassign(*Intf, PhysReg)));
}		}
}		}
// OK, so if we made it this far, this LR is an eviction candidate, load its		// OK, so if we made it this far, this LR is an eviction candidate, load its
// features.		// features.
extractFeatures(InterferingIntervals, Largest, Pos, IsHint, LocalIntfs,		extractFeatures(InterferingIntervals, Largest, Pos, IsHint, LocalIntfs,
NrUrgent);		NrUrgent, LRPosInfo);
return true;		return true;
}		}

MCRegister MLEvictAdvisor::tryFindEvictionCandidate(		MCRegister MLEvictAdvisor::tryFindEvictionCandidate(
const LiveInterval &VirtReg, const AllocationOrder &Order,		const LiveInterval &VirtReg, const AllocationOrder &Order,
uint8_t CostPerUseLimit, const SmallVirtRegSet &FixedRegisters) const {		uint8_t CostPerUseLimit, const SmallVirtRegSet &FixedRegisters) const {
auto MaybeOrderLimit = getOrderLimit(VirtReg, Order, CostPerUseLimit);		auto MaybeOrderLimit = getOrderLimit(VirtReg, Order, CostPerUseLimit);
if (!MaybeOrderLimit)		if (!MaybeOrderLimit)
Show All 27 Lines	MCRegister MLEvictAdvisor::tryFindEvictionCandidate(
FeaturesListNormalizer Largest(FeatureIDs::FeatureCount, 0.0);		FeaturesListNormalizer Largest(FeatureIDs::FeatureCount, 0.0);

// Same overall idea as in the default eviction policy - we visit the values		// Same overall idea as in the default eviction policy - we visit the values
// of AllocationOrder one at a time. If it's not legally available, we mask		// of AllocationOrder one at a time. If it's not legally available, we mask
// off the corresponding feature column (==do nothing because we already		// off the corresponding feature column (==do nothing because we already
// reset all the features to 0) Use Pos to capture the column we load		// reset all the features to 0) Use Pos to capture the column we load
// features at - in AllocationOrder order.		// features at - in AllocationOrder order.
size_t Pos = 0;		size_t Pos = 0;
		SmallVector<LRStartEndInfo, NumberOfInterferences> LRPosInfo;
for (auto I = Order.begin(), E = Order.getOrderLimitEnd(OrderLimit); I != E;		for (auto I = Order.begin(), E = Order.getOrderLimitEnd(OrderLimit); I != E;
++I, ++Pos) {		++I, ++Pos) {
MCRegister PhysReg = *I;		MCRegister PhysReg = *I;
assert(!Regs[Pos].second);		assert(!Regs[Pos].second);
assert(PhysReg);		assert(PhysReg);
if (!canAllocatePhysReg(CostPerUseLimit, PhysReg)) {		if (!canAllocatePhysReg(CostPerUseLimit, PhysReg)) {
continue;		continue;
}		}
if (loadInterferenceFeatures(VirtReg, PhysReg, I.isHint(), FixedRegisters,		if (loadInterferenceFeatures(VirtReg, PhysReg, I.isHint(), FixedRegisters,
Largest, Pos)) {		Largest, Pos, LRPosInfo)) {
++Available;		++Available;
Regs[Pos] = std::make_pair(PhysReg, true);		Regs[Pos] = std::make_pair(PhysReg, true);
}		}
}		}
if (Available == 0) {		if (Available == 0) {
// Nothing to decide, nothing to learn.		// Nothing to decide, nothing to learn.
assert(!MustFindEviction);		assert(!MustFindEviction);
return MCRegister::NoRegister;		return MCRegister::NoRegister;
}		}
const size_t ValidPosLimit = Pos;		const size_t ValidPosLimit = Pos;
// If we must find eviction, the candidate should be masked out of the		// If we must find eviction, the candidate should be masked out of the
// decision making process.		// decision making process.
Regs[CandidateVirtRegPos].second = !MustFindEviction;		Regs[CandidateVirtRegPos].second = !MustFindEviction;
if (!MustFindEviction)		if (!MustFindEviction)
extractFeatures(SmallVector<const LiveInterval *, 1>(1, &VirtReg), Largest,		extractFeatures(SmallVector<const LiveInterval *, 1>(1, &VirtReg), Largest,
CandidateVirtRegPos, /*IsHint*/ 0,		CandidateVirtRegPos, /*IsHint*/ 0,
/*LocalIntfsCount*/ 0,		/*LocalIntfsCount*/ 0,
/*NrUrgent*/ 0.0);		/*NrUrgent*/ 0.0, LRPosInfo);
assert(InitialQSize > 0.0 && "We couldn't have gotten here if we had "		assert(InitialQSize > 0.0 && "We couldn't have gotten here if we had "
"nothing to allocate initially.");		"nothing to allocate initially.");
		#ifdef LLVM_HAVE_TF_API
		if (EnableDevelopmentFeatures) {
		extractInstructionFeatures(
		LRPosInfo, Runner,
		[this](SlotIndex InputIndex) -> int {
		auto *CurrentMachineInstruction =
		LIS->getInstructionFromIndex(InputIndex);
		if (!CurrentMachineInstruction) {
		return -1;
		}
		return CurrentMachineInstruction->getOpcode();
		},
		FeatureIDs::instructions, FeatureIDs::instructions_mapping,
		LIS->getSlotIndexes()->getLastIndex());
		}
		#endif // #ifdef LLVM_HAVE_TF_API
// Normalize the features.		// Normalize the features.
for (auto &V : Largest)		for (auto &V : Largest)
V = V ? V : 1.0;		V = V ? V : 1.0;
for (size_t FeatureIndex = 0; FeatureIndex < FeatureIDs::FeatureCount;		for (size_t FeatureIndex = 0; FeatureIndex < FeatureIDs::FeatureCount;
++FeatureIndex) {		++FeatureIndex) {
if (DoNotNormalize.test(FeatureIndex))		if (DoNotNormalize.test(FeatureIndex))
continue;		continue;
for (size_t Pos = 0; Pos < NumberOfInterferences; ++Pos) {		for (size_t Pos = 0; Pos < NumberOfInterferences; ++Pos) {
▲ Show 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	MLEvictAdvisor::getLIFeatureComponents(const LiveInterval &LI) const {
return Ret;		return Ret;
}		}

// Overall, this currently mimics what we do for weight calculation, but instead		// Overall, this currently mimics what we do for weight calculation, but instead
// of accumulating the various features, we keep them separate.		// of accumulating the various features, we keep them separate.
void MLEvictAdvisor::extractFeatures(		void MLEvictAdvisor::extractFeatures(
const SmallVectorImpl<const LiveInterval *> &Intervals,		const SmallVectorImpl<const LiveInterval *> &Intervals,
llvm::SmallVectorImpl<float> &Largest, size_t Pos, int64_t IsHint,		llvm::SmallVectorImpl<float> &Largest, size_t Pos, int64_t IsHint,
int64_t LocalIntfsCount, float NrUrgent) const {		int64_t LocalIntfsCount, float NrUrgent,
		SmallVectorImpl<LRStartEndInfo> &LRPosInfo) const {
int64_t NrDefsAndUses = 0;		int64_t NrDefsAndUses = 0;
int64_t NrBrokenHints = 0;		int64_t NrBrokenHints = 0;
double R = 0.0;		double R = 0.0;
double W = 0.0;		double W = 0.0;
double RW = 0.0;		double RW = 0.0;
double IndVarUpdates = 0.0;		double IndVarUpdates = 0.0;
double HintWeights = 0.0;		double HintWeights = 0.0;
float StartBBFreq = 0.0;		float StartBBFreq = 0.0;
Show All 30 Lines	for (const auto *L : Intervals) {
R += LIFC.R;		R += LIFC.R;
W += LIFC.W;		W += LIFC.W;
RW += LIFC.RW;		RW += LIFC.RW;

IndVarUpdates += LIFC.IndVarUpdates;		IndVarUpdates += LIFC.IndVarUpdates;

HintWeights += LIFC.HintWeights;		HintWeights += LIFC.HintWeights;
NrRematerializable += LIFC.IsRemat;		NrRematerializable += LIFC.IsRemat;

		if (EnableDevelopmentFeatures) {
		for (auto CurrentSegment : LI) {
		LRPosInfo.push_back(
		LRStartEndInfo{CurrentSegment.start, CurrentSegment.end, Pos});
		}
		}
}		}
size_t Size = 0;		size_t Size = 0;
if (!Intervals.empty()) {		if (!Intervals.empty()) {
StartBBFreq =		StartBBFreq =
MBFI.getBlockFreqRelativeToEntryBlock(LIS->getMBBFromIndex(StartSI));		MBFI.getBlockFreqRelativeToEntryBlock(LIS->getMBBFromIndex(StartSI));
if (EndSI >= LIS->getSlotIndexes()->getLastIndex())		if (EndSI >= LIS->getSlotIndexes()->getLastIndex())
EndSI = LIS->getSlotIndexes()->getLastIndex().getPrevIndex();		EndSI = LIS->getSlotIndexes()->getLastIndex().getPrevIndex();
EndBBFreq =		EndBBFreq =
Show All 26 Lines	#define SET(ID, TYPE, VAL) \
SET(hottest_bb_freq_by_max, float, HottestBlockFreq);		SET(hottest_bb_freq_by_max, float, HottestBlockFreq);
SET(liverange_size, float, Size);		SET(liverange_size, float, Size);
SET(use_def_density, float, TotalWeight);		SET(use_def_density, float, TotalWeight);
SET(max_stage, int64_t, MaxStage);		SET(max_stage, int64_t, MaxStage);
SET(min_stage, int64_t, MinStage);		SET(min_stage, int64_t, MinStage);
#undef SET		#undef SET
}		}

		void extractInstructionFeatures(SmallVectorImpl<LRStartEndInfo> &LRPosInfo,
		mtrofin (Done): This deserves more comments about the whole structure of the data packet you're preparing and the various steps along the way; it'll greatly help with maintainability.
		aidengrossman (Author, Done): I've added quite a few more comments on the structure of the extraction algorithm itself and the structure of the data that it extracts. Let me know if anything is unclear or too verbose and I'll work on fixing it.
		MLModelRunner *RegallocRunner,
		mtrofin (Done): It may be more readable to have a small struct with the 2 slot indices and a size_t, with nice readable names, and then avoid the whole `std::get<x>` business.
		function_ref<int(SlotIndex)> GetOpcode,
		const int InstructionsIndex,
		const int InstructionsMappingIndex,
		const SlotIndex LastIndex) {
		// This function extracts instruction based features relevant to the eviction
		// problem currently being solved. This function ends up extracting two
		// tensors.
		// 1 - A vector of size max instruction count. It contains the opcodes of the
		// instructions spanned by all the intervals in the current instance of the
		// eviction problem.
		// 2 - A binary mapping matrix of size (LR count * max
		// instruction count) which maps where the LRs are live to the actual opcodes
		// for which they are live.

		// Start off by sorting the segments based on the beginning slot index.
		std::sort(
		mtrofin (Done): nit: we generally do `if (!CurrentMachineInstruction)`.
		LRPosInfo.begin(), LRPosInfo.end(),
		[](LRStartEndInfo A, LRStartEndInfo B) { return A.Begin < B.Begin; });
		size_t InstructionIndex = 0;
		size_t CurrentSegmentIndex = 0;
		SlotIndex CurrentIndex = LRPosInfo[0].Begin;
		// This loop processes all the segments sequentially by starting at the
		// beginning slot index of the first segment, iterating through all the slot
		// indices before the end slot index of that segment (while checking for
		// overlaps with segments that start at greater slot indices). After hitting
		// that end index, the current segment being processed gets bumped until they
		// are all processed or the max instruction count is hit, where everything is
		// just truncated.
		while (true) {
		// If the index that we are currently at is within the current segment and
		// we haven't hit the max instruction count, continue processing the current
		// segment.
		while (CurrentIndex <= LRPosInfo[CurrentSegmentIndex].End &&
		InstructionIndex < ModelMaxSupportedInstructionCount) {
		int CurrentOpcode = GetOpcode(CurrentIndex);
		// If the current machine instruction is null, skip it
		if (CurrentOpcode == -1) {
		// If we're currently at the last index in the SlotIndex analysis,
		// we can't go any further, so return from the function
		if (CurrentIndex >= LastIndex) {
		return;
		}
		CurrentIndex = CurrentIndex.getNextIndex();
		continue;
		}
		// Current code assumes we're not going to get any disjointed segments
		assert(LRPosInfo[CurrentSegmentIndex].Begin <= CurrentIndex);
		RegallocRunner->getTensor<int64_t>(InstructionsIndex)[InstructionIndex] =
		CurrentOpcode < OpcodeValueCutoff ? CurrentOpcode : 0;
		// set value in the binary mapping matrix for the current instruction
		auto CurrentSegmentPosition = LRPosInfo[CurrentSegmentIndex].Pos;
		RegallocRunner->getTensor<int64_t>(
		InstructionsMappingIndex)[CurrentSegmentPosition *
		ModelMaxSupportedInstructionCount +
		InstructionIndex] = 1;
		// All of the segments are sorted based on the beginning slot index, but
		// this doesn't mean that the beginning slot index of the next segment is
		// after the end segment of the one being currently processed. This while
		// loop checks for overlapping segments and modifies the portion of the
		// column in the mapping matrix for the currently processed instruction
		// for the LR it is checking. Also make sure that the beginning of the
		// current segment we're checking for overlap in is less than the current
		// index, otherwise we're done checking overlaps.
		size_t OverlapCheckCurrentSegment = CurrentSegmentIndex + 1;
		while (OverlapCheckCurrentSegment < LRPosInfo.size() &&
		LRPosInfo[OverlapCheckCurrentSegment].Begin <= CurrentIndex) {
		auto OverlapCurrentSegmentPosition =
		LRPosInfo[OverlapCheckCurrentSegment].Pos;
		if (LRPosInfo[OverlapCheckCurrentSegment].End >= CurrentIndex) {
		RegallocRunner->getTensor<int64_t>(
		InstructionsMappingIndex)[OverlapCurrentSegmentPosition *
		ModelMaxSupportedInstructionCount +
		InstructionIndex] = 1;
		}
		++OverlapCheckCurrentSegment;
		}
		++InstructionIndex;
		if (CurrentIndex >= LastIndex) {
		return;
		}
		CurrentIndex = CurrentIndex.getNextIndex();
		}
		// if we've just finished processing through the last segment or if we've
		// hit the maximum number of instructions, break out of the loop.
		if (CurrentSegmentIndex == LRPosInfo.size() - 1 ||
		InstructionIndex >= ModelMaxSupportedInstructionCount) {
		break;
		}
		// If the segments are not overlapping, we need to move to the beginning
		// index of the next segment to avoid having instructions not attached to
		// any register.
		if (LRPosInfo[CurrentSegmentIndex + 1].Begin >
		LRPosInfo[CurrentSegmentIndex].End) {
		CurrentIndex = LRPosInfo[CurrentSegmentIndex + 1].Begin;
		}
		++CurrentSegmentIndex;
		}
		}
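The data layout produced by `extractInstructionFeatures` can be illustrated with a simplified standalone sketch. It assumes plain integers in place of `SlotIndex`, an inclusive `End`, and a callback that returns a negative value where no instruction exists; all names ending in `Sketch` are hypothetical. Unlike the patch, it rescans every segment for each instruction instead of only the following ones, and it omits the `OpcodeValueCutoff` clamping, so it shows the resulting tensors rather than the exact control flow.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical stand-in for LRStartEndInfo; plain ints replace SlotIndex.
struct SegmentSketch {
  int Begin;       // first slot covered by the segment
  int End;         // last slot covered (inclusive)
  std::size_t Pos; // row of the mapping matrix this segment belongs to
};

struct InstructionFeaturesSketch {
  std::vector<std::int64_t> Opcodes; // MaxInstructions entries
  std::vector<std::int64_t> Mapping; // NumRows x MaxInstructions, row-major
};

// GetOpcode returns a negative value for slots that hold no instruction.
// Every Pos is assumed to be smaller than NumRows.
InstructionFeaturesSketch extractInstructionFeaturesSketch(
    std::vector<SegmentSketch> Segments, std::size_t NumRows,
    std::size_t MaxInstructions, const std::function<int(int)> &GetOpcode) {
  InstructionFeaturesSketch Out;
  Out.Opcodes.assign(MaxInstructions, 0);
  Out.Mapping.assign(NumRows * MaxInstructions, 0);

  // Sort by start so slots are visited in increasing order.
  std::sort(Segments.begin(), Segments.end(),
            [](const SegmentSketch &A, const SegmentSketch &B) {
              return A.Begin < B.Begin;
            });

  std::size_t Column = 0;                         // next free instruction column
  int NextSlot = std::numeric_limits<int>::min(); // first slot not yet visited
  for (const SegmentSketch &S : Segments) {
    // Skip slots already handled by an earlier, overlapping segment, and
    // jump over gaps between disjoint segments.
    for (int Slot = std::max(S.Begin, NextSlot);
         Slot <= S.End && Column < MaxInstructions; ++Slot) {
      int Opcode = GetOpcode(Slot);
      if (Opcode < 0)
        continue; // no instruction at this slot
      Out.Opcodes[Column] = Opcode;
      // Mark every live range that is live at this instruction.
      for (const SegmentSketch &Other : Segments)
        if (Other.Begin <= Slot && Slot <= Other.End)
          Out.Mapping[Other.Pos * MaxInstructions + Column] = 1;
      ++Column;
    }
    NextSlot = std::max(NextSlot, S.End + 1);
  }
  return Out;
}
```

Because the opcode lookup is passed in as a callback, the sketch can be exercised with synthetic opcodes (for example `10 + Slot`) and hand-written overlapping segments, without constructing any machine code.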

// Development mode-specific implementations		// Development mode-specific implementations
#ifdef LLVM_HAVE_TF_API		#ifdef LLVM_HAVE_TF_API

RegAllocEvictionAdvisorAnalysis *llvm::createDevelopmentModeAdvisor() {		RegAllocEvictionAdvisorAnalysis *llvm::createDevelopmentModeAdvisor() {
		mtrofin (Not Done): This feels like it'd benefit from a unittest. The core logic is about intervals that cover instruction opcodes, so the only LLVM-ness of it is in `LIS->getInstructionFromIndex(CurrentIndex)` (because `LIS->getSlotIndexes()->getLastIndex()` is basically a constant you can pass in).
		You can test it by making a small change: make this into a utility that takes LRPosInfo and a std::function (or a llvm::function_ref, whatever) that gives the opcode for a SlotIndex. The utility doesn't need to know about mlevictadvisor or anything - it's just dealing with intervals. You can also pass a Runner in, and in the test case, just use the NoInferenceModelRunner.
		So then you can set up all sorts of interesting interval overlapping cases, and you can just populate your opcodes with incrementing numbers or something. The nice thing is that the unittest doesn't need any #define specific stuff - it's generic.
		The rest can stay the same - meaning, let `MLEvictAdvisor::extractInstructionFeatures` call that utility, etc.
return new DevelopmentModeEvictionAdvisorAnalysis();		return new DevelopmentModeEvictionAdvisorAnalysis();
}		}

int64_t DevelopmentModeEvictAdvisor::tryFindEvictionCandidatePosition(		int64_t DevelopmentModeEvictAdvisor::tryFindEvictionCandidatePosition(
const LiveInterval &VirtReg, const AllocationOrder &Order,		const LiveInterval &VirtReg, const AllocationOrder &Order,
		mtrofin (Done): (nit - it took me a few reads to get it) How about: "A vector of size 'max instruction count'. It contains the instruction opcodes of instructions covered by all intervals in LRPosInfo". wdyt?
unsigned OrderLimit, uint8_t CostPerUseLimit,		unsigned OrderLimit, uint8_t CostPerUseLimit,
const SmallVirtRegSet &FixedRegisters) const {		const SmallVirtRegSet &FixedRegisters) const {
int64_t Ret = 0;		int64_t Ret = 0;
if (isa<ModelUnderTrainingRunner>(getRunner())) {		if (isa<ModelUnderTrainingRunner>(getRunner())) {
Ret = MLEvictAdvisor::tryFindEvictionCandidatePosition(		Ret = MLEvictAdvisor::tryFindEvictionCandidatePosition(
VirtReg, Order, OrderLimit, CostPerUseLimit, FixedRegisters);		VirtReg, Order, OrderLimit, CostPerUseLimit, FixedRegisters);
} else {		} else {
MCRegister PhysReg = getDefaultAdvisor().tryFindEvictionCandidate(		MCRegister PhysReg = getDefaultAdvisor().tryFindEvictionCandidate(
VirtReg, Order, CostPerUseLimit, FixedRegisters);		VirtReg, Order, CostPerUseLimit, FixedRegisters);
		mtrofin (Done): This is rather the Instruction "Index" or something like that, right? Count to me means total.
// Find the index of the selected PhysReg. We need it for logging,		// Find the index of the selected PhysReg. We need it for logging,
		mtrofin (Done): Nit: CurrentSegmentIdx
// otherwise this is wasted cycles (but so would starting development mode		// otherwise this is wasted cycles (but so would starting development mode
// without a model nor logging)		// without a model nor logging)
if (!PhysReg)		if (!PhysReg)
Ret = CandidateVirtRegPos;		Ret = CandidateVirtRegPos;
else		else
for (auto I = Order.begin(), E = Order.getOrderLimitEnd(OrderLimit);		for (auto I = Order.begin(), E = Order.getOrderLimitEnd(OrderLimit);
		mtrofin (Done): typo: process"ed"
I != E; ++I, ++Ret)		I != E; ++I, ++Ret)
if (*I == PhysReg)		if (*I == PhysReg)
break;		break;
}		}
if (TrainingLog.empty())		if (TrainingLog.empty())
return Ret;		return Ret;
size_t CurrentFeature = 0;		size_t CurrentFeature = 0;
for (; CurrentFeature < FeatureIDs::FeatureCount; ++CurrentFeature) {		size_t FeatureCount = EnableDevelopmentFeatures
		? FeatureIDs::FeaturesWithDevelopmentCount
		: FeatureIDs::FeatureCount;
		for (; CurrentFeature < FeatureCount; ++CurrentFeature) {
Log->logSpecifiedTensorValue(		Log->logSpecifiedTensorValue(
CurrentFeature, reinterpret_cast<const char *>(		CurrentFeature, reinterpret_cast<const char *>(
getRunner().getTensorUntyped(CurrentFeature)));		getRunner().getTensorUntyped(CurrentFeature)));
}		}
if (auto *MUTR = dyn_cast<ModelUnderTrainingRunner>(&getRunner()))		if (auto *MUTR = dyn_cast<ModelUnderTrainingRunner>(&getRunner()))
for (size_t I = 1; I < MUTR->outputLoggedFeatureSpecs().size();		for (size_t I = 1; I < MUTR->outputLoggedFeatureSpecs().size();
		mtrofin (Done): nit: or, `// continue from the feature index the previous loop left off` ?
		aidengrossman (Author, Done): This was left over from some experimentation that I have since removed, so it was commenting on a line that isn't even there. I've changed it to follow your suggestion and also to explain why the indexing is slightly odd (only going to `FeaturesWithDevelopmentCount - 1` rather than `FeaturesWithDevelopmentCount` itself).
++I, ++CurrentFeature)		++I, ++CurrentFeature)
Log->logSpecifiedTensorValue(		Log->logSpecifiedTensorValue(
CurrentFeature,		CurrentFeature,
reinterpret_cast<const char *>(		reinterpret_cast<const char *>(
MUTR->lastEvaluationResult()->getUntypedTensorValue(I)));		MUTR->lastEvaluationResult()->getUntypedTensorValue(I)));
// The output is right after the features and the extra outputs		// The output is right after the features and the extra outputs
		mtrofin (Done): you don't need the -1 thing here if you split the extra list in 2
Log->logInt64Value(CurrentFeature, &Ret);		Log->logInt64Value(CurrentFeature, &Ret);
return Ret;		return Ret;
}		}

bool RegAllocScoring::runOnMachineFunction(MachineFunction &MF) {		bool RegAllocScoring::runOnMachineFunction(MachineFunction &MF) {
if (auto *DevModeAnalysis = dyn_cast<DevelopmentModeEvictionAdvisorAnalysis>(		if (auto *DevModeAnalysis = dyn_cast<DevelopmentModeEvictionAdvisorAnalysis>(
&getAnalysis<RegAllocEvictionAdvisorAnalysis>()))		&getAnalysis<RegAllocEvictionAdvisorAnalysis>()))
if (auto *Log = DevModeAnalysis->getLogger(MF))		if (auto *Log = DevModeAnalysis->getLogger(MF))
Log->logFloatFinalReward(static_cast<float>(		Log->logFloatFinalReward(static_cast<float>(
calculateRegAllocScore(MF, getAnalysis<MachineBlockFrequencyInfo>())		calculateRegAllocScore(MF, getAnalysis<MachineBlockFrequencyInfo>())
.getScore()));		.getScore()));

return false;		return false;
}		}
#endif // #ifdef LLVM_HAVE_TF_API		#endif // #ifdef LLVM_HAVE_TF_API
		mtrofin (Done): the exit condition could include `LRPosInfo[OverlapCheckCurrentSegment].Begin > CurrentIndex`, for more clarity?

RegAllocEvictionAdvisorAnalysis *llvm::createReleaseModeAdvisor() {		RegAllocEvictionAdvisorAnalysis *llvm::createReleaseModeAdvisor() {
return new ReleaseModeEvictionAdvisorAnalysis();		return new ReleaseModeEvictionAdvisorAnalysis();
}		}

// In all cases except development mode, we don't need scoring.		// In all cases except development mode, we don't need scoring.
#if !defined(LLVM_HAVE_TF_API)		#if !defined(LLVM_HAVE_TF_API)
bool RegAllocScoring::runOnMachineFunction(MachineFunction &) { return false; }		bool RegAllocScoring::runOnMachineFunction(MachineFunction &) { return false; }
#endif		#endif
		mtrofin (Not Done): can't `LRPosInfo[OverlapCheckCurrentSegment].End < CurrentIndex`? So we know the CurrentIndex is below Begin, but it could also be below this segment's end? Current instruction: 5; CurrentSegment: [1,7); NextSegment: [2,4); NextNextSegment: [2,8)
		aidengrossman (Author, Not Done): `LRPosInfo[OverlapCheckCurrentSegment].End` can be less than `CurrentIndex`, but it shouldn't be missing any instructions. It's not checking if `CurrentIndex` is below begin, but rather if `CurrentIndex` is above the beginning of the current segment. If the beginning of the segment being checked is greater than the index under analysis, then neither that segment nor any later segment can contain the index, because the segments are sorted in ascending order by beginning index.
		aidengrossman (Author, Done): Never mind my previous comment. Not sure what I was thinking there. This was an actual issue and I ended up catching it while looking through some unit test cases. Should be fixed on the next push, as I added a condition to only set the relevant element of the `instructions_mapping` matrix if `CurrentIndex <= LRPosInfo[OverlapCheckCurrentSegment].End`.
		mtrofin (Done): you just need to go to the next segment and potentially set CurrentIndex to that one's beginning if it's after CurrentIndex (simpler code)
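
The overlap check discussed in this thread can be sketched outside LLVM as follows. This is a minimal illustration only, not the code from the patch: it uses plain integer indices instead of `SlotIndex`, treats `End` as inclusive (following the `CurrentIndex <= End` condition the author describes), and the names `Segment`, `Row` and `buildMapping` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one live range segment: plain integer
// instruction indices, with End treated as inclusive (an assumption).
struct Segment {
  std::size_t Begin;
  std::size_t End;
  std::size_t Row; // row of the mapping matrix this segment belongs to
};

// Build a row-major NumRows x NumCols 0/1 matrix marking which
// instruction indices fall inside which segments.
std::vector<int> buildMapping(std::vector<Segment> Segments,
                              std::size_t NumRows, std::size_t NumCols) {
  // Sort by Begin so the inner loop can stop early.
  std::sort(Segments.begin(), Segments.end(),
            [](const Segment &A, const Segment &B) {
              return A.Begin < B.Begin;
            });
  std::vector<int> Mapping(NumRows * NumCols, 0);
  for (std::size_t Index = 0; Index < NumCols; ++Index) {
    for (const Segment &S : Segments) {
      // Segments are sorted by Begin: once one starts after Index,
      // every later one does too.
      if (S.Begin > Index)
        break;
      // A segment that began earlier may already have ended, so the
      // End check is still needed (the issue raised in this thread).
      if (Index <= S.End)
        Mapping[S.Row * NumCols + Index] = 1;
    }
  }
  return Mapping;
}
```

With the segments from the example above, (1,7), (2,4) and (2,8), index 5 is marked for the first and third segments but not for the second, which is the case the missing `End` check got wrong.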

llvm/test/CodeGen/MLRegalloc/dev-mode-extra-features-logging.ll

This file was added.

				; REQUIRES: have_tf_api
				; REQUIRES: x86_64-linux
				;
				mtrofin (Done): Because the output is very verbose, perhaps just checking specific values may be more maintainable? Like checking that you counted things correctly in your 33x300 matrix - add some comments maybe about what the output looks like and why certain sentinel values are expected. This would also help one identify, later, if the test breaks legitimately due to a change (e.g. some machine instruction is now not present anymore -> of course the features change) vs due to a bug.
				aidengrossman (Author, Done): That's a very good point. The test could potentially be pretty fragile if checking the exact code output. I reworked the check to get rid of the exact diff and added a lot more FileCheck checks along with comments to make sure that the output looks as expected to some reasonable degree.
				mtrofin (Done): thanks, this looks way more manageable!
				; Check that we log the currently in development features correctly with both the default
				; case and with a learned policy.
				;
				; RUN: llc -mtriple=x86_64-linux-unknown -regalloc=greedy -regalloc-enable-advisor=development \
				; RUN: -regalloc-training-log=%t1 -tfutils-text-log \
				; RUN: -regalloc-enable-development-features < %S/Inputs/input.ll
				; RUN: sed -i 's/ \+/ /g' %t1
				; RUN: sed -i 's/\\n key:/\n key:/g' %t1
				; RUN: sed -i 's/\\n feature/\n feature/g' %t1
				; RUN: sed -i 's/\\n/ /g' %t1
				; RUN: FileCheck --input-file %t1 %s

				; RUN: rm -rf %t && mkdir %t
				; RUN: %python %S/../../../lib/Analysis/models/gen-regalloc-eviction-test-model.py %t_savedmodel
				; RUN: %python %S/../../../lib/Analysis/models/saved-model-to-tflite.py %t_savedmodel %t
				; RUN: llc -mtriple=x86_64-linux-unknown -regalloc=greedy -regalloc-enable-advisor=development \
				; RUN: -regalloc-training-log=%t2 -tfutils-text-log -regalloc-model=%t \
				; RUN: -regalloc-enable-development-features < %S/Inputs/input.ll
				; RUN: sed -i 's/ \+/ /g' %t2
				; RUN: sed -i 's/\\n key:/\n key:/g' %t2
				; RUN: sed -i 's/\\n feature/\n feature/g' %t2
				; RUN: sed -i 's/\\n/ /g' %t2
				; RUN: FileCheck --input-file %t2 %s

				; CHECK-NOT: nan
				; CHECK-LABEL: key: \"instructions\"
				; Check the first five opcodes in the first eviction problem
				; CHECK-NEXT: value: 19
				; CHECK-SAME: value: 19
				; CHECK-SAME: value: 3031
				; CHECK-SAME: value: 1245
				; CHECK-SAME: value: 1264
				; The first eviction problem is significantly less than 300 instructions. Check
				; that there is a zero value
				; CHECK-SAME: value: 0
				; Only the candidate virtreg and the 10th LR are included in this problem. Make
				; sure the other LRs have values of zero.
				; CHECK-LABEL: key: \"instructions_mapping\"
				; CHECK-COUNT-2700: value: 0
				; CHECK-SAME: value: 1
				; Indexing 300 back from where the candidate vr actually resides due to the fact
				; that not all the values between the 10th LR and the candidate are zero.
				; CHECK-COUNT-6600: value: 0
				; CHECK-SAME: value: 1
				; Ensure that we can still go through the mapping matrices for the rest of the
				; eviction problems to make sure we haven't hit the end of the matrix above.
				; There are a total of 23 eviction problems with this test.
				; CHECK-COUNT-22: int64_list
				; CHECK: key: \"is_free\"
				mtrofin (Done): add a new line
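
The `CHECK-COUNT` values in this test can be read as row-major offsets into the 33x300 `instructions_mapping` matrix mentioned in the review. The sketch below is one reading of that arithmetic, not code from the patch. It assumes that the "10th LR" sits in row 9 and that the candidate virtual register occupies the last row (row 32); both are inferred from the test comments, and the names are hypothetical.

```cpp
#include <cstddef>

// Assumed matrix shape, taken from the "33x300 matrix" review comment.
constexpr std::size_t NumRows = 33;
constexpr std::size_t NumCols = 300;

// Offset of element (Row, Col) in a row-major flattened matrix.
constexpr std::size_t flatOffset(std::size_t Row, std::size_t Col) {
  return Row * NumCols + Col;
}

// Assumption: the 10th live range is row 9, the candidate is the last row.
constexpr std::size_t TenthLRStart = flatOffset(9, 0);
constexpr std::size_t CandidateStart = flatOffset(NumRows - 1, 0);

// 9 all-zero rows precede the 10th LR: 9 * 300 = 2700 zeros.
static_assert(TenthLRStart == 2700, "zeros before the 10th LR");
// Stepping one row (300 entries) back from the candidate row and
// subtracting the 2700 entries already consumed leaves 6600.
static_assert(CandidateStart - NumCols - TenthLRStart == 6600,
              "zeros counted before the candidate's row");
```

Under these assumptions the two counts in the test, 2700 and 6600, follow directly from the matrix shape, which may help when the shape or the row assignment changes.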

llvm/unittests/CodeGen/CMakeLists.txt

Show All 29 Lines	add_llvm_unittest(CodeGenTests
MachineOperandTest.cpp		MachineOperandTest.cpp
RegAllocScoreTest.cpp		RegAllocScoreTest.cpp
PassManagerTest.cpp		PassManagerTest.cpp
ScalableVectorMVTsTest.cpp		ScalableVectorMVTsTest.cpp
SelectionDAGAddressAnalysisTest.cpp		SelectionDAGAddressAnalysisTest.cpp
TypeTraitsTest.cpp		TypeTraitsTest.cpp
TargetOptionsTest.cpp		TargetOptionsTest.cpp
TestAsmPrinter.cpp		TestAsmPrinter.cpp
		MLRegallocDevelopmentFeatures.cpp
)		)

add_subdirectory(GlobalISel)		add_subdirectory(GlobalISel)

target_link_libraries(CodeGenTests PRIVATE LLVMTestingSupport)		target_link_libraries(CodeGenTests PRIVATE LLVMTestingSupport)

llvm/unittests/CodeGen/MLRegallocDevelopmentFeatures.cpp

This file was added.

				//===- MLRegAllocDevelopmentFeatures.cpp - test dev MLRegalloc features ---===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "../../lib/CodeGen/MLRegallocEvictAdvisor.h"
				#include "llvm/Analysis/NoInferenceModelRunner.h"
				#include "llvm/CodeGen/SlotIndexes.h"
				#include "llvm/IR/LLVMContext.h"
				#include "llvm/Support/Allocator.h"
				#include "llvm/Support/CodeGen.h"
				#include "gmock/gmock.h"
				#include "gtest/gtest.h"

				#include <vector>

				using testing::ContainerEq;
				using testing::Test;

				struct LRPosInfoIndexes {
				size_t StartIndex;
				size_t EndIndex;
				size_t PhysReg;
				};

				class RegallocDevelopmentFeaturesTest : public ::Test {
				protected:
				SmallVector<LRStartEndInfo>
				setupOverlapProblem(const SmallVectorImpl<LRPosInfoIndexes> &Segments,
				ilist<IndexListEntry> &IndexList) {
				SmallVector<LRStartEndInfo> PositionsToReturn;
				PositionsToReturn.reserve(Segments.size());
				for (auto CurrentPosIndexInfo : Segments) {
				LRStartEndInfo CurrentPosInfo = {};
				CurrentPosInfo.Pos = CurrentPosIndexInfo.PhysReg;
				PositionsToReturn.push_back(CurrentPosInfo);
				}
				size_t CurrentSegmentIndex = 0;
				size_t CurrentIndex = 0;
				while (CurrentSegmentIndex < Segments.size()) {
				auto CurrentLEMem = static_cast<IndexListEntry *>(
				Allocator.Allocate(sizeof(IndexListEntry), alignof(IndexListEntry)));
				auto *CurrentListEntry =
				new (CurrentLEMem) IndexListEntry(nullptr, CurrentIndex);
				IndexList.push_back(CurrentListEntry);
				for (size_t CurrentPosInfoIndex = 0;
				CurrentPosInfoIndex < Segments.size(); ++CurrentPosInfoIndex) {
				if ((CurrentIndex / SlotIndex::InstrDist) ==
				Segments[CurrentPosInfoIndex].StartIndex) {
				PositionsToReturn[CurrentPosInfoIndex].Begin =
				SlotIndex(CurrentListEntry, 0);
				} else if ((CurrentIndex / SlotIndex::InstrDist) ==
				Segments[CurrentPosInfoIndex].EndIndex) {
				PositionsToReturn[CurrentPosInfoIndex].End =
				SlotIndex(CurrentListEntry, 0);
				++CurrentSegmentIndex;
				}
				}
				CurrentIndex += SlotIndex::InstrDist;
				}
				return PositionsToReturn;
				}

				NoInferenceModelRunner setupModelRunner() {
				const std::vector<TensorSpec> Inputs{
				TensorSpec::createSpec<int64_t>("instructions", InstructionsShape),
				TensorSpec::createSpec<int64_t>("instructions_mapping",
				InstructionsMappingShape)};
				LLVMContext Ctx;
				return NoInferenceModelRunner(Ctx, Inputs);
				}

				std::vector<int64_t>
				getExpectedMappingMatrix(SmallVectorImpl<LRPosInfoIndexes> &OverlapSetup) {
				std::vector<int64_t> ExpectedMappingMatrix(
				NumberOfInterferences * ModelMaxSupportedInstructionCount, 0);
				for (auto NewSegment : OverlapSetup) {
				for (size_t CurrentIndex = NewSegment.StartIndex;
				CurrentIndex <= NewSegment.EndIndex; ++CurrentIndex) {
				ExpectedMappingMatrix[NewSegment.PhysReg *
				ModelMaxSupportedInstructionCount +
				CurrentIndex] = 1;
				}
				}
				return ExpectedMappingMatrix;
				}

				void runOverlapTest(SmallVectorImpl<LRPosInfoIndexes> &OverlapSetup) {
				ilist<IndexListEntry> IndexList;
				auto OverlapProblem = setupOverlapProblem(OverlapSetup, IndexList);
				NoInferenceModelRunner ModelRunner = setupModelRunner();
				size_t MaxIndex = 0;
				for (size_t CurrentOverlap = 0; CurrentOverlap < OverlapSetup.size();
				++CurrentOverlap) {
				if (OverlapSetup[CurrentOverlap].EndIndex >
				OverlapSetup[MaxIndex].EndIndex) {
				MaxIndex = CurrentOverlap;
				}
				}
				SlotIndex LastIndex = OverlapProblem[MaxIndex].End;
				extractInstructionFeatures(
				OverlapProblem, &ModelRunner,
				[](SlotIndex InputSlot) -> int { return 0; }, 0, 1, LastIndex);
				std::vector<int64_t> MappingMatrix(
				ModelRunner.getTensor<int64_t>(1),
				ModelRunner.getTensor<int64_t>(1) +
				NumberOfInterferences * ModelMaxSupportedInstructionCount);
				ASSERT_THAT(MappingMatrix,
				ContainerEq(getExpectedMappingMatrix(OverlapSetup)));
				IndexList.clearAndLeakNodesUnsafely();
				}

				BumpPtrAllocator Allocator;
				};

				// meta tests to ensure that test setup works correctly

				TEST_F(RegallocDevelopmentFeaturesTest,
				MetaOverlapInstructionDistancesAreCorrect) {
				SmallVector<LRPosInfoIndexes, 2> OverlapSetup;
				OverlapSetup.push_back({0, 5, 0});
				OverlapSetup.push_back({5, 10, 0});
				ilist<IndexListEntry> IndexList;
				auto OverlapProblem = setupOverlapProblem(OverlapSetup, IndexList);
				ASSERT_EQ(OverlapProblem[0].End.distance(OverlapProblem[1].End),
				5 * SlotIndex::InstrDist);
				ASSERT_EQ(OverlapProblem[0].End.distance(OverlapProblem[1].Begin), 0);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, MetaSlotIndicesAreValid) {
				SmallVector<LRPosInfoIndexes, 1> OverlapSetup;
				OverlapSetup.push_back({0, 10, 0});
				ilist<IndexListEntry> IndexList;
				auto OverlapProblem = setupOverlapProblem(OverlapSetup, IndexList);
				ASSERT_TRUE(OverlapProblem[0].Begin.isValid());
				ASSERT_TRUE(OverlapProblem[0].End.isValid());
				}

				// Testing of feature extraction for per-instruction features

				TEST_F(RegallocDevelopmentFeaturesTest, InstructionOpcodesAreCorrect) {
				SmallVector<LRPosInfoIndexes, 1> OverlapSetup;
				OverlapSetup.push_back({0, ModelMaxSupportedInstructionCount - 1, 0});
				ilist<IndexListEntry> IndexList;
				auto OverlapProblem = setupOverlapProblem(OverlapSetup, IndexList);
				NoInferenceModelRunner ModelRunner = setupModelRunner();
				SlotIndex LastIndex = OverlapProblem[0].End;
				SlotIndex FirstIndex = OverlapProblem[0].Begin;
				extractInstructionFeatures(
				OverlapProblem, &ModelRunner,
				[FirstIndex](SlotIndex InputSlot) -> int {
				return FirstIndex.distance(InputSlot) / SlotIndex::InstrDist;
				},
				0, 1, LastIndex);
				for (size_t CurrentInstructionIndex = 0;
				CurrentInstructionIndex < ModelMaxSupportedInstructionCount;
				++CurrentInstructionIndex) {
				ASSERT_EQ(
				(size_t)ModelRunner.getTensor<int64_t>(0)[CurrentInstructionIndex],
				CurrentInstructionIndex);
				}
				}

				TEST_F(RegallocDevelopmentFeaturesTest, FullOverlap) {
				SmallVector<LRPosInfoIndexes, 2> OverlapSetup;
				OverlapSetup.push_back({0, ModelMaxSupportedInstructionCount - 1, 0});
				OverlapSetup.push_back({0, ModelMaxSupportedInstructionCount - 1, 1});
				runOverlapTest(OverlapSetup);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, PartialOverlap) {
				SmallVector<LRPosInfoIndexes, 2> OverlapSetup;
				OverlapSetup.push_back({0, 20, 0});
				OverlapSetup.push_back({15, 30, 1});
				runOverlapTest(OverlapSetup);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, PartialOverlapOpposite) {
				SmallVector<LRPosInfoIndexes, 2> OverlapSetup;
				OverlapSetup.push_back({15, 30, 1});
				OverlapSetup.push_back({0, 20, 0});
				runOverlapTest(OverlapSetup);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, InternalOverlap) {
				SmallVector<LRPosInfoIndexes, 2> OverlapSetup;
				OverlapSetup.push_back({0, 30, 0});
				OverlapSetup.push_back({10, 20, 1});
				runOverlapTest(OverlapSetup);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, TripleInternalOverlap) {
				SmallVector<LRPosInfoIndexes, 3> OverlapSetup;
				OverlapSetup.push_back({0, 30, 0});
				OverlapSetup.push_back({10, 25, 1});
				OverlapSetup.push_back({15, 20, 2});
				runOverlapTest(OverlapSetup);
				}

				TEST_F(RegallocDevelopmentFeaturesTest, InternalMultiOverlap) {
				SmallVector<LRPosInfoIndexes, 3> OverlapSetup;
				OverlapSetup.push_back({0, 45, 0});
				OverlapSetup.push_back({30, 40, 1});
				OverlapSetup.push_back({35, 60, 2});
				runOverlapTest(OverlapSetup);
				}
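
The two meta tests rely on `setupOverlapProblem` placing one index entry per instruction, spaced `SlotIndex::InstrDist` apart. A simplified model of that spacing, using plain integers instead of `IndexListEntry` and `SlotIndex`, is sketched below. The value 16 for `InstrDist` and the names `Range`, `SlotRange` and `toSlotRanges` are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <vector>

// Assumed spacing between consecutive instructions; stands in for
// SlotIndex::InstrDist, whose real value lives in SlotIndexes.h.
constexpr std::size_t InstrDist = 16;

// Segment expressed in instruction numbers, as in LRPosInfoIndexes.
struct Range {
  std::size_t Start;
  std::size_t End;
};

// The same segment expressed in raw slot positions.
struct SlotRange {
  std::size_t Begin;
  std::size_t End;
};

// Instruction N lives at raw position N * InstrDist.
std::vector<SlotRange> toSlotRanges(const std::vector<Range> &Ranges) {
  std::vector<SlotRange> Out;
  Out.reserve(Ranges.size());
  for (const Range &R : Ranges)
    Out.push_back({R.Start * InstrDist, R.End * InstrDist});
  return Out;
}
```

For the segments (0, 5) and (5, 10) used in `MetaOverlapInstructionDistancesAreCorrect`, this model gives a distance of `5 * InstrDist` between the two ends and a distance of 0 between the first end and the second begin, matching the assertions in that test.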


Revision Contents

Diff 461025
