This is an archive of the discontinued LLVM Phabricator instance.

Differential D49364

[ARM] Add support for spilling high registers in Thumb1
Needs Review · Public

Authored by petpav01 on Jul 16 2018, 1:40 AM.

Details

Reviewers

olista01
t.p.northover
javed.absar
eli.friedman

Summary

LLVM normally uses only low registers in Thumb1, and Thumb1InstrInfo::storeRegToStackSlot()/loadRegFromStackSlot() can currently store and restore only those. In rare cases, however, a register allocator may also need to spill a high register in the middle of a function.

Example:

$ cat test.c
void constraint_h(void) {
  int i;
  asm volatile("@ %0" : : "h" (i) : "r12");
}
$ clang -target arm-none-eabi -march=armv6-m -c test.c
clang-7: [...]/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp:85: virtual void llvm::Thumb1InstrInfo::storeRegToStackSlot(llvm::MachineBasicBlock&, llvm::MachineBasicBlock::iterator, unsigned int, bool, int, const llvm::TargetRegisterClass*, const llvm::TargetRegisterInfo*) const: Assertion `(RC == &ARM::tGPRRegClass || (TargetRegisterInfo::isPhysicalRegister(SrcReg) && isARMLowRegister(SrcReg))) && "Unknown regclass!"' failed.
[...]

The program is compiled at -O0, so the Fast Register Allocator is used. The following happens in this case:

Prior to register allocation, MIR looks as follows:

Frame Objects:
  fi#0: size=4, align=4, at location [SP]

bb.0.entry:
  %1:tgpr = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
  %0:hgpr = COPY %1:tgpr
  INLINEASM &"@ $0" [sideeffect] [attdialect], $0:[reguse:hGPR], %0:hgpr, $1:[clobber], implicit-def early-clobber $r12, !3
  tBX_RET 14, $noreg

The Fast Register Allocator first satisfies %0:hgpr by selecting r12.
When the scan reaches the INLINEASM instruction, however, the allocator notices that r12 is clobbered and therefore needs to be spilled.
The allocator calls Thumb1InstrInfo::storeRegToStackSlot() to store the register in a stack slot, but the method does not know how to do this and aborts. If LLVM is built without assertions enabled, this can instead result in a miscompilation.

Both the store and the load of a high register in Thumb1 need an additional low register. For instance, the store is implemented as:

mov %lowReg, %spilledHighReg
str %lowReg, ...
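
The reload is the mirror image of this sequence. As a rough illustration only, the following self-contained sketch prints both sequences as assembly text. The helper names (isLowReg, spillSequence, reloadSequence), the SP-relative addressing and the explicitly passed scratch register are assumptions made for the example; none of this is code from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Thumb1 treats r0-r7 as low registers; r8 and above are high registers.
bool isLowReg(unsigned Reg) { return Reg < 8; }

std::string regName(unsigned Reg) { return "r" + std::to_string(Reg); }

// Hypothetical helper: assembly lines that store Reg to [sp, #Offset].
// ScratchLow is only used when Reg is a high register.
std::vector<std::string> spillSequence(unsigned Reg, unsigned ScratchLow,
                                       unsigned Offset) {
  assert(isLowReg(ScratchLow) && "scratch register must be a low register");
  std::string Slot = "[sp, #" + std::to_string(Offset) + "]";
  if (isLowReg(Reg))
    return {"str " + regName(Reg) + ", " + Slot};
  // High register: copy into the low scratch register, then store that.
  return {"mov " + regName(ScratchLow) + ", " + regName(Reg),
          "str " + regName(ScratchLow) + ", " + Slot};
}

// Hypothetical helper: assembly lines that load Reg back from [sp, #Offset].
std::vector<std::string> reloadSequence(unsigned Reg, unsigned ScratchLow,
                                        unsigned Offset) {
  assert(isLowReg(ScratchLow) && "scratch register must be a low register");
  std::string Slot = "[sp, #" + std::to_string(Offset) + "]";
  if (isLowReg(Reg))
    return {"ldr " + regName(Reg) + ", " + Slot};
  // High register: load into the low scratch register, then copy it over.
  return {"ldr " + regName(ScratchLow) + ", " + Slot,
          "mov " + regName(Reg) + ", " + regName(ScratchLow)};
}
```

For example, spillSequence(12, 0, 4) yields "mov r0, r12" followed by "str r0, [sp, #4]". The open question in this review is where the scratch low register comes from.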

An initial patch in this review extended Thumb1InstrInfo::storeRegToStackSlot() and loadRegFromStackSlot() to allow storing and restoring high registers by inserting a pseudo-instruction that was later lowered, after register allocation, in ThumbRegisterInfo::eliminateFrameIndex(). This relied on the register scavenger to secure a low register for the sequence, which is possibly problematic when register pressure is high because ThumbRegisterInfo::saveScavengerRegister() currently also tries to make use of high register r12.

The current patch extends RegAllocFast and InlineSpiller to handle a spill with an intermediary directly.

Event Timeline

petpav01 created this revision. Jul 16 2018, 1:40 AM

Herald added a reviewer: javed.absar. Jul 16 2018, 1:40 AM

Herald added subscribers: llvm-commits, chrib, kristof.beyls, qcolombet.

petpav01 added a reviewer: eli.friedman. Jul 16 2018, 2:56 AM

thopre added a subscriber: thopre. Jul 16 2018, 9:46 AM

thopre added inline comments.

lib/Target/ARM/Thumb1InstrInfo.cpp
132–133	Can you put a similar comment to the store to stack slot?

This is possibly problematic when the register pressure is high because ThumbRegisterInfo::saveScavengerRegister() currently also tries to make use of high register r12.

Also, the constant islands pass can clobber lr. But given the only way to end up with an "hGPR" register class is inline asm, we could probably work around this issue by excluding ip/lr from allocation order for hGPR.

That said, if we ever want to make the high registers generally allocatable in Thumb1 mode, this patch probably isn't the right solution; instead, we should make the register allocator insert the copy, so we aren't forced to scavenge a register later.

lib/Target/ARM/ARMRegisterInfo.td
209 ↗	(On Diff #155628)	This comment isn't really right. It's still worth using the high registers, with appropriate cost constraints: there are a few important instructions which can take high registers as inputs (cmp, add, bx/blx), and even if we're just effectively using them as spill slots, it's cheaper than spilling to the stack.

petpav01 updated this revision to Diff 156217. Jul 19 2018, 12:41 AM

In D49364#1164005, @efriedma wrote:

This is possibly problematic when the register pressure is high because ThumbRegisterInfo::saveScavengerRegister() currently also tries to make use of high register r12.

Also, the constant islands pass can clobber lr. But given the only way to end up with an "hGPR" register class is inline asm, we could probably work around this issue by excluding ip/lr from allocation order for hGPR.

That said, if we ever want to make the high registers generally allocatable in Thumb1 mode, this patch probably isn't the right solution; instead, we should make the register allocator insert the copy, so we aren't forced to scavenge a register later.

Sorry, the mentioned idea with the copy is not quite clear to me. Could you please explain it a bit more for me?

lib/Target/ARM/ARMRegisterInfo.td
209 ↗	(On Diff #155628)	Updated, hopefully it now makes sense.
lib/Target/ARM/Thumb1InstrInfo.cpp
132–133	Added.

Sorry, the mentioned idea with the copy is not quite clear to me. Could you please explain it a bit more for me?

Say the target has a new hook, call it "getRegClassForStackSaveRestore()" or something, which takes a register class, and returns a register class appropriate for stack save/restore operations. Then when a register allocator wants to spill a vreg, it first calls getRegClassForStackSaveRestore(); if that returns a new register class, instead of spilling using storeRegToStackSlot, it makes a new vreg with the returned class, and inserts a COPY to that vreg.

This avoids having to scavenge a register later; the register allocator has more ways to make a register available, so the resulting code is likely more efficient, and it avoids the potential problem of needing to scavenge multiple registers in ThumbRegisterInfo::eliminateFrameIndex.
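
To make the proposal concrete, here is a toy model of the spill path with such a hook. It is only a sketch: RegClass, ToyFunction and the textual pseudo-MIR are invented for the illustration and are not LLVM types. Only the hook name getRegClassForStackSaveRestore() comes from the suggestion above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Two register classes are enough for the illustration.
enum class RegClass { tGPR, hGPR };

// Toy version of the suggested hook: returns a class that can be stored to
// and loaded from the stack directly. High registers map to the low class.
RegClass getRegClassForStackSaveRestore(RegClass RC) {
  return RC == RegClass::hGPR ? RegClass::tGPR : RC;
}

// Hypothetical stand-in for a function under register allocation.
struct ToyFunction {
  std::vector<RegClass> VRegClasses; // class of each virtual register
  std::vector<std::string> Instrs;   // emitted pseudo-MIR, as text

  unsigned createVReg(RegClass RC) {
    VRegClasses.push_back(RC);
    return static_cast<unsigned>(VRegClasses.size() - 1);
  }

  // Spill VReg to stack slot Slot, going through an intermediate vreg when
  // the hook reports a different save/restore class.
  void spill(unsigned VReg, unsigned Slot) {
    RegClass RC = VRegClasses[VReg];
    RegClass StoreRC = getRegClassForStackSaveRestore(RC);
    unsigned StoreReg = VReg;
    if (StoreRC != RC) {
      StoreReg = createVReg(StoreRC);
      Instrs.push_back("%" + std::to_string(StoreReg) + " = COPY %" +
                       std::to_string(VReg));
    }
    Instrs.push_back("STORE %" + std::to_string(StoreReg) + ", %stack." +
                     std::to_string(Slot));
  }
};
```

Because the intermediate is an ordinary virtual register, the allocator assigns it like any other value instead of relying on the scavenger after allocation.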

thegameg added a subscriber: thegameg. Jul 30 2018, 4:04 AM

Thanks for the explanation of this idea. Updated patch goes in that direction and provides a prototype of this approach. The implementation is done in Fast Register Allocator (which has its own spiller code) and in InlineSpiller (used by the other LLVM allocators: Basic, Greedy, PBQP).

The implemented approach is to always make a complete spill of a high register to the stack, instead of initially moving it only to a low register and then spilling the low register if actually needed. This keeps things a bit simpler to implement and reason about. InlineSpiller could be improved to implement only the mentioned "half-spills", but that does not appear necessary for now. With this problem currently limited to inline assembler, I think the Greedy production allocator should not get into a state where it would need to spill high registers.

The patch is not complete but I thought I would ask for feedback on it early, before I dive into solving remaining issues.

Known problems:

Reloads of registers that require a COPY instruction should be done by RegAllocFast before other register uses try to get satisfied to provide better assignment possibilities for the temporary register.
Helper registers used in high-register reloads should get properly removed from UsedInInstr in RegAllocFast so they can get used by the actual instruction.
RegAllocFast uses one temporary virtual register for all COPY instructions that it needs to insert for high-register spills. This is a workaround for LiveRegMap (SparseSet) not being resizable when it is not empty.
Operands of COPY instructions inserted by InlineSpiller can get inflated to GPR. This is visible in test hgpr-spill-basic.mir and would cause a problem if the inflated GPR register subsequently needed to be spilled.
Interaction with snippets and hoisting in InlineSpiller is likely not really correct.

Herald added subscribers: eraman, MatzeB. Aug 14 2018, 2:38 AM

petpav01 mentioned this in D51927: [ARM] Enable spilling of the hGPR class in Thumb2. Sep 11 2018, 6:03 AM

Updated patch improves the RegAllocFast part and adds more testing for it. InlineSpiller has no new changes.

Description of the changes:

Code to allocate an intermediary register for the spill is moved to RegAllocFast::handleIntermediarySpill().
RegAllocFast::allocVirtReg() is split into allocVirtReg() and assignVirtReg(). The former method still does most of the allocation work but leaves final update of PhysRegState + LRI->PhysReg and error reporting to assignVirtReg(). This allows handleIntermediarySpill() to call allocVirtReg() to get a free register without updating other state.
Spilling all registers prior to a call instruction in RegAllocFast::allocateBasicBlock() is moved before clearing of UsedInInstr so an intermediary does not get allocated to a register used by the instruction.
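
The allocVirtReg()/assignVirtReg() split described above can be pictured with a toy allocator in which one method only looks for a free register and the other commits the choice. This is a simplified sketch: ToyAllocator and its methods are hypothetical and ignore hints, spill costs and error reporting.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature allocator; not the RegAllocFast implementation.
struct ToyAllocator {
  std::vector<bool> InUse; // InUse[R] is true when physical register R is taken

  explicit ToyAllocator(unsigned NumRegs) : InUse(NumRegs, false) {}

  // Query only: finds a free register but leaves the allocator state
  // untouched, so a caller can use the result as a short-lived intermediary.
  bool allocReg(unsigned *PhysReg) const {
    for (unsigned R = 0; R < InUse.size(); ++R) {
      if (!InUse[R]) {
        *PhysReg = R;
        return true;
      }
    }
    return false;
  }

  // Commit: finds a free register and records it as taken.
  bool assignReg(unsigned *PhysReg) {
    if (!allocReg(PhysReg))
      return false;
    InUse[*PhysReg] = true;
    return true;
  }
};
```

Calling allocReg() twice in a row returns the same register, since nothing is recorded until assignReg() runs.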

This is still not a complete patch. Known problems are:

handleIntermediarySpill() does not always correctly update debug information (DBG_VALUEs).
InlineSpiller still has the same problems as mentioned previously and needs more work.
Changes implemented in SparseSet are currently without testing.
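
Since the SparseSet changes are listed as untested, the behaviour a test would presumably need to cover is that elements inserted before a universe resize are still found afterwards. The sketch below demonstrates that property on a deliberately simplified stand-in. ToySparseSet is hypothetical, stores plain unsigned keys, and is limited to fewer than 256 elements because it omits the stride probing of the real class.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Simplified stand-in for a sparse set of unsigned keys; not llvm::SparseSet.
class ToySparseSet {
  std::vector<unsigned> Dense; // inserted keys, in insertion order
  std::vector<uint8_t> Sparse; // Sparse[Key] = candidate index into Dense

public:
  // Resize the universe, re-recording elements that are already present.
  void setUniverse(unsigned U) {
    std::vector<uint8_t> NewSparse(U, 0);
    for (unsigned I = 0; I < Dense.size(); ++I) {
      assert(Dense[I] < U && "existing key does not fit the new universe");
      NewSparse[Dense[I]] = static_cast<uint8_t>(I);
    }
    Sparse = std::move(NewSparse);
  }

  bool contains(unsigned Key) const {
    if (Key >= Sparse.size())
      return false;
    unsigned Idx = Sparse[Key];
    // The Sparse entry is only a hint; Dense confirms membership.
    return Idx < Dense.size() && Dense[Idx] == Key;
  }

  bool insert(unsigned Key) {
    assert(Key < Sparse.size() && "key outside the universe");
    assert(Dense.size() < 256 && "toy set supports fewer than 256 elements");
    if (contains(Key))
      return false;
    Sparse[Key] = static_cast<uint8_t>(Dense.size());
    Dense.push_back(Key);
    return true;
  }

  unsigned size() const { return static_cast<unsigned>(Dense.size()); }
};
```

A unit test for the real class could follow the same shape: insert a few keys, grow the universe, then check both the old keys and a newly inserted larger key.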

Any feedback on this is very welcome, especially on whether the overall approach looks sensible or whether a different idea would be preferable.

Note: There is an ongoing rewrite of RegAllocFast in D52010, which means this patch will need to be somewhat reworked after the rewrite lands, but I do not think it should affect the basic idea implemented here.

Herald added a subscriber: dexonsmith. Nov 7 2018, 5:59 AM

pratlucas mentioned this in D80999: [ARM][CodeGen] Enabling spilling of high registers in RegAllocFast for Thumb1. Jun 2 2020, 6:27 AM

Revision Contents

Path                                             Size
include/llvm/ADT/SparseSet.h                     23 lines
include/llvm/CodeGen/LiveRangeEdit.h             7 lines
include/llvm/CodeGen/TargetInstrInfo.h           23 lines
lib/CodeGen/InlineSpiller.cpp                    59 lines
lib/CodeGen/LiveRangeEdit.cpp                    7 lines
lib/CodeGen/RegAllocFast.cpp                     227 lines
lib/Target/ARM/Thumb1InstrInfo.h                 3 lines
lib/Target/ARM/Thumb1InstrInfo.cpp               7 lines
test/CodeGen/Thumb/hgpr-spill-basic.mir          74 lines
test/CodeGen/Thumb/hgpr-spill-fast-all.mir       155 lines
test/CodeGen/Thumb/hgpr-spill-fast-tsave.mir     116 lines
test/CodeGen/Thumb/hgpr-spill-fast-tsave2.mir    116 lines
test/CodeGen/Thumb/hgpr-spill-fast.mir           56 lines

Diff 172930

include/llvm/ADT/SparseSet.h

[...]
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_ADT_SPARSESET_H		#ifndef LLVM_ADT_SPARSESET_H
#define LLVM_ADT_SPARSESET_H		#define LLVM_ADT_SPARSESET_H

#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/Support/Allocator.h"		#include "llvm/Support/Allocator.h"
		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
#include <cstdlib>		#include <cstdlib>
#include <limits>		#include <limits>
#include <utility>		#include <utility>

namespace llvm {		namespace llvm {

[...]	template<typename ValueT,
typename SparseT = uint8_t>		typename SparseT = uint8_t>
class SparseSet {		class SparseSet {
static_assert(std::numeric_limits<SparseT>::is_integer &&		static_assert(std::numeric_limits<SparseT>::is_integer &&
!std::numeric_limits<SparseT>::is_signed,		!std::numeric_limits<SparseT>::is_signed,
"SparseT must be an unsigned integer type");		"SparseT must be an unsigned integer type");

using KeyT = typename KeyFunctorT::argument_type;		using KeyT = typename KeyFunctorT::argument_type;
using DenseT = SmallVector<ValueT, 8>;		using DenseT = SmallVector<ValueT, 8>;
using size_type = unsigned;
DenseT Dense;		DenseT Dense;
SparseT *Sparse = nullptr;		SparseT *Sparse = nullptr;
unsigned Universe = 0;		unsigned Universe = 0;
KeyFunctorT KeyIndexOf;		KeyFunctorT KeyIndexOf;
SparseSetValFunctor<KeyT, ValueT, KeyFunctorT> ValIndexOf;		SparseSetValFunctor<KeyT, ValueT, KeyFunctorT> ValIndexOf;

public:		public:
using value_type = ValueT;		using value_type = ValueT;
using reference = ValueT &;		using reference = ValueT &;
using const_reference = const ValueT &;		using const_reference = const ValueT &;
using pointer = ValueT *;		using pointer = ValueT *;
using const_pointer = const ValueT *;		using const_pointer = const ValueT *;
		using size_type = unsigned;

SparseSet() = default;		SparseSet() = default;
SparseSet(const SparseSet &) = delete;		SparseSet(const SparseSet &) = delete;
SparseSet &operator=(const SparseSet &) = delete;		SparseSet &operator=(const SparseSet &) = delete;
~SparseSet() { free(Sparse); }		~SparseSet() { free(Sparse); }

/// setUniverse - Set the universe size which determines the largest key the		/// setUniverse - Set the universe size which determines the largest key the
/// set can hold. The universe must be sized before any elements can be		/// set can hold. The universe must be sized before any elements can be
/// added.		/// added.
///		///
/// @param U Universe size. All object keys must be less than U.		/// @param U Universe size. All object keys must be less than U.
///		///
void setUniverse(unsigned U) {		void setUniverse(unsigned U) {
// It's not hard to resize the universe on a non-empty set, but it doesn't
// seem like a likely use case, so we can add that code when we need it.
assert(empty() && "Can only resize universe on an empty map");
// Hysteresis prevents needless reallocations.		// Hysteresis prevents needless reallocations.
if (U >= Universe/4 && U <= Universe)		if (U >= Universe/4 && U <= Universe)
return;		return;
free(Sparse);		if (U > Universe)
		U = std::max(U, 2 * Universe);

// The Sparse array doesn't actually need to be initialized, so malloc		// The Sparse array doesn't actually need to be initialized, so malloc
// would be enough here, but that will cause tools like valgrind to		// would be enough here, but that will cause tools like valgrind to
// complain about branching on uninitialized data.		// complain about branching on uninitialized data.
Sparse = static_cast<SparseT*>(safe_calloc(U, sizeof(SparseT)));		SparseT *S = static_cast<SparseT *>(safe_calloc(U, sizeof(SparseT)));

		// Record already inserted elements in the new Sparse array.
		for (unsigned i = 0, e = size(); i < e; i++) {
		unsigned Idx = ValIndexOf(Dense[i]);
		assert(Idx < U && "Index of an already inserted element is bigger than "
		"the new universe size");
		S[Idx] = i;
		}

		free(Sparse);
		Sparse = S;
Universe = U;		Universe = U;
}		}

// Import trivial vector stuff from DenseT.		// Import trivial vector stuff from DenseT.
using iterator = typename DenseT::iterator;		using iterator = typename DenseT::iterator;
using const_iterator = typename DenseT::const_iterator;		using const_iterator = typename DenseT::const_iterator;

const_iterator begin() const { return Dense.begin(); }		const_iterator begin() const { return Dense.begin(); }
[...]

include/llvm/CodeGen/LiveRangeEdit.h

[...]
namespace llvm {		namespace llvm {

class LiveIntervals;		class LiveIntervals;
class MachineBlockFrequencyInfo;		class MachineBlockFrequencyInfo;
class MachineInstr;		class MachineInstr;
class MachineLoopInfo;		class MachineLoopInfo;
class MachineOperand;		class MachineOperand;
class TargetInstrInfo;		class TargetInstrInfo;
		class TargetRegisterClass;
class TargetRegisterInfo;		class TargetRegisterInfo;
class VirtRegMap;		class VirtRegMap;

class LiveRangeEdit : private MachineRegisterInfo::Delegate {		class LiveRangeEdit : private MachineRegisterInfo::Delegate {
public:		public:
/// Callback methods for LiveRangeEdit owners.		/// Callback methods for LiveRangeEdit owners.
class Delegate {		class Delegate {
virtual void anchor();		virtual void anchor();
[...]	public:
/// We don't want to allocate phys register for the dummy register, so		/// We don't want to allocate phys register for the dummy register, so
/// we want to drop it from the NewRegs set.		/// we want to drop it from the NewRegs set.
void pop_back() { NewRegs.pop_back(); }		void pop_back() { NewRegs.pop_back(); }

ArrayRef<unsigned> regs() const {		ArrayRef<unsigned> regs() const {
return makeArrayRef(NewRegs).slice(FirstNew);		return makeArrayRef(NewRegs).slice(FirstNew);
}		}

/// createFrom - Create a new virtual register based on OldReg.		/// createFrom - Create a new virtual register based on OldReg. If RC is
unsigned createFrom(unsigned OldReg);		/// specified then the register will have this class, else the class of OldReg
		/// is used.
		unsigned createFrom(unsigned OldReg, const TargetRegisterClass *RC = nullptr);

/// create - Create a new register with the same class and original slot as		/// create - Create a new register with the same class and original slot as
/// parent.		/// parent.
LiveInterval &createEmptyInterval() {		LiveInterval &createEmptyInterval() {
return createEmptyIntervalFrom(getReg(), true);		return createEmptyIntervalFrom(getReg(), true);
}		}

unsigned create() { return createFrom(getReg()); }		unsigned create() { return createFrom(getReg()); }
[...]

include/llvm/CodeGen/TargetInstrInfo.h

[...]	virtual void loadRegFromStackSlot(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MI,		MachineBasicBlock::iterator MI,
unsigned DestReg, int FrameIndex,		unsigned DestReg, int FrameIndex,
const TargetRegisterClass *RC,		const TargetRegisterClass *RC,
const TargetRegisterInfo *TRI) const {		const TargetRegisterInfo *TRI) const {
llvm_unreachable("Target didn't implement "		llvm_unreachable("Target didn't implement "
"TargetInstrInfo::loadRegFromStackSlot!");		"TargetInstrInfo::loadRegFromStackSlot!");
}		}

		/// Return a register class that is appropriate for stack save/restore of the
		/// given register class.
		///
		/// For instance, Thumb1 does not provide instructions to directly
		/// save/restore high registers. Storing a high register must be done by first
		/// copying the value in a low register and then saving this register.
		/// Similarly, reload requires an adequately reversed sequence. For this case,
		/// the method returns the low-register class when given the high-register
		/// class.
		///
		/// This allows to allocate a new register with the returned class and insert
		/// a COPY instruction before/after the store/load created by
		/// storeRegToStackSlot()/loadRegFromStackSlot():
		/// %1:save-restore-class = COPY %0:original-class
		/// STR %1:save-restore-class, %stack.1
		///
		/// %1:save-restore-class = LDR %stack.1
		/// %0:original-class = COPY %1:save-restore-class
		virtual const TargetRegisterClass *
		getRegClassForStackSaveRestore(const TargetRegisterClass *RC) const {
		return RC;
		}

/// This function is called for all pseudo instructions		/// This function is called for all pseudo instructions
/// that remain after register allocation. Many pseudo instructions are		/// that remain after register allocation. Many pseudo instructions are
/// created to help register allocation. This is the place to convert them		/// created to help register allocation. This is the place to convert them
/// into real instructions. The target can edit MI in place, or it can insert		/// into real instructions. The target can edit MI in place, or it can insert
/// new instructions and erase MI. The function should return true if		/// new instructions and erase MI. The function should return true if
/// anything was changed.		/// anything was changed.
virtual bool expandPostRAPseudo(MachineInstr &MI) const { return false; }		virtual bool expandPostRAPseudo(MachineInstr &MI) const { return false; }

[...]

lib/CodeGen/InlineSpiller.cpp

[...]	private:
void markValueUsed(LiveInterval*, VNInfo*);		void markValueUsed(LiveInterval*, VNInfo*);
bool reMaterializeFor(LiveInterval &, MachineInstr &MI);		bool reMaterializeFor(LiveInterval &, MachineInstr &MI);
void reMaterializeAll();		void reMaterializeAll();

bool coalesceStackAccess(MachineInstr *MI, unsigned Reg);		bool coalesceStackAccess(MachineInstr *MI, unsigned Reg);
bool foldMemoryOperand(ArrayRef<std::pair<MachineInstr *, unsigned>>,		bool foldMemoryOperand(ArrayRef<std::pair<MachineInstr *, unsigned>>,
MachineInstr *LoadMI = nullptr);		MachineInstr *LoadMI = nullptr);
void insertReload(unsigned VReg, SlotIndex, MachineBasicBlock::iterator MI);		void insertReload(unsigned VReg, SlotIndex, MachineBasicBlock::iterator MI);
void insertSpill(unsigned VReg, bool isKill, MachineBasicBlock::iterator MI);		void insertSpill(unsigned VReg, MachineBasicBlock::iterator MI);

void spillAroundUses(unsigned Reg);		void spillAroundUses(unsigned Reg);
void spillAll();		void spillAll();
};		};

} // end anonymous namespace		} // end anonymous namespace

Spiller::~Spiller() = default;		Spiller::~Spiller() = default;
[...]
}		}

void InlineSpiller::insertReload(unsigned NewVReg,		void InlineSpiller::insertReload(unsigned NewVReg,
SlotIndex Idx,		SlotIndex Idx,
MachineBasicBlock::iterator MI) {		MachineBasicBlock::iterator MI) {
MachineBasicBlock &MBB = *MI->getParent();		MachineBasicBlock &MBB = *MI->getParent();

MachineInstrSpan MIS(MI);		MachineInstrSpan MIS(MI);
TII.loadRegFromStackSlot(MBB, MI, NewVReg, StackSlot,		unsigned LoadReg = NewVReg;
MRI.getRegClass(NewVReg), &TRI);		const TargetRegisterClass &RC = *MRI.getRegClass(NewVReg);
		const TargetRegisterClass &LoadRC = *TII.getRegClassForStackSaveRestore(&RC);
		if (&RC != &LoadRC) {
		LoadReg = Edit->createFrom(NewVReg, &LoadRC);
		LLVM_DEBUG(dbgs() << "Using " << printReg(LoadReg, &TRI) << ":"
		<< TRI.getRegClassName(&LoadRC)
		<< " as an intermediate for the reload\n");
		}

		TII.loadRegFromStackSlot(MBB, MI, LoadReg, StackSlot, &LoadRC, &TRI);

		if (&RC != &LoadRC)
		BuildMI(MBB, MI, MI->getDebugLoc(), TII.get(TargetOpcode::COPY), NewVReg)
		.addReg(LoadReg, RegState::Kill);

LIS.InsertMachineInstrRangeInMaps(MIS.begin(), MI);		LIS.InsertMachineInstrRangeInMaps(MIS.begin(), MI);

LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(MIS.begin(), MI, LIS, "reload",		LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(MIS.begin(), MI, LIS, "reload",
NewVReg));		NewVReg));
++NumReloads;		++NumReloads;
}		}

/// Check if \p Def fully defines a VReg with an undefined value.		/// Check if \p Def fully defines a VReg with an undefined value.
/// If that's the case, that means the value of VReg is actually		/// If that's the case, that means the value of VReg is actually
/// not relevant.		/// not relevant.
static bool isFullUndefDef(const MachineInstr &Def) {		static bool isFullUndefDef(const MachineInstr &Def) {
if (!Def.isImplicitDef())		if (!Def.isImplicitDef())
return false;		return false;
assert(Def.getNumOperands() == 1 &&		assert(Def.getNumOperands() == 1 &&
"Implicit def with more than one definition");		"Implicit def with more than one definition");
// We can say that the VReg defined by Def is undef, only if it is		// We can say that the VReg defined by Def is undef, only if it is
// fully defined by Def. Otherwise, some of the lanes may not be		// fully defined by Def. Otherwise, some of the lanes may not be
// undef and the value of the VReg matters.		// undef and the value of the VReg matters.
return !Def.getOperand(0).getSubReg();		return !Def.getOperand(0).getSubReg();
}		}

/// insertSpill - Insert a spill of NewVReg after MI.		/// insertSpill - Insert a spill of NewVReg after MI.
void InlineSpiller::insertSpill(unsigned NewVReg, bool isKill,		void InlineSpiller::insertSpill(unsigned NewVReg,
MachineBasicBlock::iterator MI) {		MachineBasicBlock::iterator MI) {
MachineBasicBlock &MBB = *MI->getParent();		MachineBasicBlock &MBB = *MI->getParent();

MachineInstrSpan MIS(MI);		MachineBasicBlock::iterator InsertMI = std::next(MI);
bool IsRealSpill = true;		bool IsRealSpill = true;
if (isFullUndefDef(*MI)) {		if (isFullUndefDef(*MI)) {
// Don't spill undef value.		// Don't spill undef value.
// Anything works for undef, in particular keeping the memory		// Anything works for undef, in particular keeping the memory
// uninitialized is a viable option and it saves code size and		// uninitialized is a viable option and it saves code size and
// run time.		// run time.
BuildMI(MBB, std::next(MI), MI->getDebugLoc(), TII.get(TargetOpcode::KILL))		BuildMI(MBB, InsertMI, MI->getDebugLoc(), TII.get(TargetOpcode::KILL))
.addReg(NewVReg, getKillRegState(isKill));		.addReg(NewVReg, RegState::Kill);
IsRealSpill = false;		IsRealSpill = false;
} else		} else {
TII.storeRegToStackSlot(MBB, std::next(MI), NewVReg, isKill, StackSlot,		unsigned StoreReg = NewVReg;
MRI.getRegClass(NewVReg), &TRI);		const TargetRegisterClass &RC = *MRI.getRegClass(NewVReg);
		const TargetRegisterClass &StoreRC =
		*TII.getRegClassForStackSaveRestore(&RC);
		if (&RC != &StoreRC) {
		StoreReg = Edit->createFrom(NewVReg, &StoreRC);
		LLVM_DEBUG(dbgs() << "Using " << printReg(StoreReg, &TRI) << ":"
		<< TRI.getRegClassName(&StoreRC)
		<< " as an intermediate for the spill\n");

		BuildMI(MBB, InsertMI, MI->getDebugLoc(), TII.get(TargetOpcode::COPY),
		StoreReg)
		.addReg(NewVReg, RegState::Kill);
		}

		TII.storeRegToStackSlot(MBB, InsertMI, StoreReg, RegState::Kill, StackSlot,
		&StoreRC, &TRI);
		}

LIS.InsertMachineInstrRangeInMaps(std::next(MI), MIS.end());		LIS.InsertMachineInstrRangeInMaps(std::next(MI), InsertMI);

LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(std::next(MI), MIS.end(), LIS,		LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(std::next(MI), InsertMI, LIS,
"spill"));		"spill"));
++NumSpills;		++NumSpills;
if (IsRealSpill)		if (IsRealSpill)
HSpiller.addToMergeableSpills(*std::next(MI), StackSlot, Original);		HSpiller.addToMergeableSpills(*std::prev(InsertMI), StackSlot, Original);
}		}

/// spillAroundUses - insert spill code around each use of Reg.		/// spillAroundUses - insert spill code around each use of Reg.
void InlineSpiller::spillAroundUses(unsigned Reg) {		void InlineSpiller::spillAroundUses(unsigned Reg) {
LLVM_DEBUG(dbgs() << "spillAroundUses " << printReg(Reg) << '\n');		LLVM_DEBUG(dbgs() << "spillAroundUses " << printReg(Reg) << '\n');
LiveInterval &OldLI = LIS.getInterval(Reg);		LiveInterval &OldLI = LIS.getInterval(Reg);

// Iterate over instructions using Reg.		// Iterate over instructions using Reg.
[...]	for (const auto &OpPair : Ops) {
hasLiveDef = true;		hasLiveDef = true;
}		}
}		}
LLVM_DEBUG(dbgs() << "\trewrite: " << Idx << '\t' << *MI << '\n');		LLVM_DEBUG(dbgs() << "\trewrite: " << Idx << '\t' << *MI << '\n');

// FIXME: Use a second vreg if instruction has no tied ops.		// FIXME: Use a second vreg if instruction has no tied ops.
if (RI.Writes)		if (RI.Writes)
if (hasLiveDef)		if (hasLiveDef)
insertSpill(NewVReg, true, MI);		insertSpill(NewVReg, MI);
}		}
}		}

/// spillAll - Spill all registers remaining after rematerialization.		/// spillAll - Spill all registers remaining after rematerialization.
void InlineSpiller::spillAll() {		void InlineSpiller::spillAll() {
// Update LiveStacks now that we are committed to spilling.		// Update LiveStacks now that we are committed to spilling.
if (StackSlot == VirtRegMap::NO_STACK_SLOT) {		if (StackSlot == VirtRegMap::NO_STACK_SLOT) {
StackSlot = VRM.assignVirt2StackSlot(Original);		StackSlot = VRM.assignVirt2StackSlot(Original);
[...]

lib/CodeGen/LiveRangeEdit.cpp

[...]	if (createSubRanges) {
LiveInterval &OldLI = LIS.getInterval(OldReg);		LiveInterval &OldLI = LIS.getInterval(OldReg);
VNInfo::Allocator &Alloc = LIS.getVNInfoAllocator();		VNInfo::Allocator &Alloc = LIS.getVNInfoAllocator();
for (LiveInterval::SubRange &S : OldLI.subranges())		for (LiveInterval::SubRange &S : OldLI.subranges())
LI.createSubRange(Alloc, S.LaneMask);		LI.createSubRange(Alloc, S.LaneMask);
}		}
return LI;		return LI;
}		}

unsigned LiveRangeEdit::createFrom(unsigned OldReg) {		unsigned LiveRangeEdit::createFrom(unsigned OldReg,
unsigned VReg = MRI.createVirtualRegister(MRI.getRegClass(OldReg));		const TargetRegisterClass *RC) {
		if (RC == nullptr)
		RC = MRI.getRegClass(OldReg);
		unsigned VReg = MRI.createVirtualRegister(RC);
if (VRM) {		if (VRM) {
VRM->setIsSplitFromReg(VReg, VRM->getOriginal(OldReg));		VRM->setIsSplitFromReg(VReg, VRM->getOriginal(OldReg));
}		}
// FIXME: Getting the interval here actually computes it.		// FIXME: Getting the interval here actually computes it.
// In theory, this may not be what we want, but in practice		// In theory, this may not be what we want, but in practice
// the createEmptyIntervalFrom API is used when this is not		// the createEmptyIntervalFrom API is used when this is not
// the case. Generally speaking we just want to annotate the		// the case. Generally speaking we just want to annotate the
// LiveInterval when it gets created but we cannot do that at		// LiveInterval when it gets created but we cannot do that at
[...]

lib/CodeGen/RegAllocFast.cpp

[...]	private:
void addKillFlag(const LiveReg &LRI);		void addKillFlag(const LiveReg &LRI);
void killVirtReg(LiveReg &LR);		void killVirtReg(LiveReg &LR);
void killVirtReg(unsigned VirtReg);		void killVirtReg(unsigned VirtReg);
void spillVirtReg(MachineBasicBlock::iterator MI, LiveReg &LR);		void spillVirtReg(MachineBasicBlock::iterator MI, LiveReg &LR);
void spillVirtReg(MachineBasicBlock::iterator MI, unsigned VirtReg);		void spillVirtReg(MachineBasicBlock::iterator MI, unsigned VirtReg);

void usePhysReg(MachineOperand &MO);		void usePhysReg(MachineOperand &MO);
void definePhysReg(MachineBasicBlock::iterator MI, MCPhysReg PhysReg,		void definePhysReg(MachineBasicBlock::iterator MI, MCPhysReg PhysReg,
RegState NewState);		RegState NewState, bool IsUsedInInstr = false);
unsigned calcSpillCost(MCPhysReg PhysReg) const;		unsigned calcSpillCost(MCPhysReg PhysReg) const;
void assignVirtToPhysReg(LiveReg &, MCPhysReg PhysReg);

LiveRegMap::iterator findLiveVirtReg(unsigned VirtReg) {		LiveRegMap::iterator findLiveVirtReg(unsigned VirtReg) {
return LiveVirtRegs.find(TargetRegisterInfo::virtReg2Index(VirtReg));		return LiveVirtRegs.find(TargetRegisterInfo::virtReg2Index(VirtReg));
}		}

LiveRegMap::const_iterator findLiveVirtReg(unsigned VirtReg) const {		LiveRegMap::const_iterator findLiveVirtReg(unsigned VirtReg) const {
return LiveVirtRegs.find(TargetRegisterInfo::virtReg2Index(VirtReg));		return LiveVirtRegs.find(TargetRegisterInfo::virtReg2Index(VirtReg));
}		}

void allocVirtReg(MachineInstr &MI, LiveReg &LR, unsigned Hint);		bool allocVirtReg(MachineInstr &MI, unsigned VirtReg, unsigned Hint,
		MCPhysReg *PhysReg, bool IsUsedInInstr);
		void assignVirtReg(MachineInstr &MI, LiveReg &LR, unsigned Hint);
MCPhysReg defineVirtReg(MachineInstr &MI, unsigned OpNum, unsigned VirtReg,		MCPhysReg defineVirtReg(MachineInstr &MI, unsigned OpNum, unsigned VirtReg,
unsigned Hint);		unsigned Hint);
LiveReg &reloadVirtReg(MachineInstr &MI, unsigned OpNum, unsigned VirtReg,		LiveReg &reloadVirtReg(MachineInstr &MI, unsigned OpNum, unsigned VirtReg,
unsigned Hint);		unsigned Hint);
void spillAll(MachineBasicBlock::iterator MI);		void spillAll(MachineBasicBlock::iterator MI);
bool setPhysReg(MachineInstr &MI, unsigned OpNum, MCPhysReg PhysReg);		bool setPhysReg(MachineInstr &MI, unsigned OpNum, MCPhysReg PhysReg);

int getStackSpaceFor(unsigned VirtReg);		int getStackSpaceFor(unsigned VirtReg);
void spill(MachineBasicBlock::iterator Before, unsigned VirtReg,		void spill(MachineBasicBlock::iterator Before, unsigned VirtReg,
MCPhysReg AssignedReg, bool Kill);		MCPhysReg AssignedReg, bool Kill);
void reload(MachineBasicBlock::iterator Before, unsigned VirtReg,		void reload(MachineBasicBlock::iterator Before, unsigned VirtReg,
MCPhysReg PhysReg);		MCPhysReg PhysReg);

		unsigned createVirtReg(const TargetRegisterClass &RC);
		void handleIntermediarySpill(MachineBasicBlock::iterator BeginMII,
		MachineBasicBlock::iterator EndMII,
		unsigned VirtReg);

void dumpState();		void dumpState();
};		};

} // end anonymous namespace		} // end anonymous namespace

char RegAllocFast::ID = 0;		char RegAllocFast::ID = 0;

INITIALIZE_PASS(RegAllocFast, "regallocfast", "Fast Register Allocator", false,		INITIALIZE_PASS(RegAllocFast, "regallocfast", "Fast Register Allocator", false,
[... 28 lines hidden ...]
void RegAllocFast::spill(MachineBasicBlock::iterator Before, unsigned VirtReg,		void RegAllocFast::spill(MachineBasicBlock::iterator Before, unsigned VirtReg,
MCPhysReg AssignedReg, bool Kill) {		MCPhysReg AssignedReg, bool Kill) {
LLVM_DEBUG(dbgs() << "Spilling " << printReg(VirtReg, TRI)		LLVM_DEBUG(dbgs() << "Spilling " << printReg(VirtReg, TRI)
<< " in " << printReg(AssignedReg, TRI));		<< " in " << printReg(AssignedReg, TRI));
int FI = getStackSpaceFor(VirtReg);		int FI = getStackSpaceFor(VirtReg);
LLVM_DEBUG(dbgs() << " to stack slot #" << FI << '\n');		LLVM_DEBUG(dbgs() << " to stack slot #" << FI << '\n');

const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);		const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
TII->storeRegToStackSlot(*MBB, Before, AssignedReg, Kill, FI, &RC, TRI);		const TargetRegisterClass &StoreRC =
		*TII->getRegClassForStackSaveRestore(&RC);

		MachineBasicBlock::iterator PrevMII =
		Before == MBB->begin() ? MBB->end() : std::prev(Before);
		unsigned StoreReg = AssignedReg;
		bool NeedsIntermediary = &RC != &StoreRC && !StoreRC.contains(StoreReg);
		if (NeedsIntermediary) {
		assert(&StoreRC == TII->getRegClassForStackSaveRestore(&StoreRC) &&
		"Invalid regclass cascade for stack save");
		StoreReg = createVirtReg(StoreRC);
		LLVM_DEBUG(dbgs() << "Using " << printReg(StoreReg, TRI) << ":"
		<< TRI->getRegClassName(&StoreRC)
		<< " as an intermediary for the spill\n");

		BuildMI(*MBB, Before, Before->getDebugLoc(), TII->get(TargetOpcode::COPY),
		StoreReg)
		.addReg(AssignedReg, llvm::RegState::Kill);
		}

		TII->storeRegToStackSlot(*MBB, Before, StoreReg, Kill, FI, &StoreRC, TRI);
++NumStores;		++NumStores;

		if (NeedsIntermediary)
		handleIntermediarySpill(PrevMII == MBB->end() ? MBB->begin()
		: std::next(PrevMII),
		Before, StoreReg);

// If this register is used by DBG_VALUE then insert new DBG_VALUE to		// If this register is used by DBG_VALUE then insert new DBG_VALUE to
// identify spilled location as the place to find corresponding variable's		// identify spilled location as the place to find corresponding variable's
// value.		// value.
SmallVectorImpl<MachineInstr *> &LRIDbgValues = LiveDbgValueMap[VirtReg];		SmallVectorImpl<MachineInstr *> &LRIDbgValues = LiveDbgValueMap[VirtReg];
for (MachineInstr *DBG : LRIDbgValues) {		for (MachineInstr *DBG : LRIDbgValues) {
MachineInstr *NewDV = buildDbgValueForSpill(*MBB, Before, *DBG, FI);		MachineInstr *NewDV = buildDbgValueForSpill(*MBB, Before, *DBG, FI);
assert(NewDV->getParent() == MBB && "dangling parent pointer");		assert(NewDV->getParent() == MBB && "dangling parent pointer");
(void)NewDV;		(void)NewDV;
LLVM_DEBUG(dbgs() << "Inserting debug info due to spill:\n" << *NewDV);		LLVM_DEBUG(dbgs() << "Inserting debug info due to spill:\n" << *NewDV);
}		}
// Now this register is spilled there is should not be any DBG_VALUE		// Now this register is spilled there is should not be any DBG_VALUE
// pointing to this register because they are all pointing to spilled value		// pointing to this register because they are all pointing to spilled value
// now.		// now.
LRIDbgValues.clear();		LRIDbgValues.clear();
}		}

/// Insert reload instruction for \p PhysReg before \p Before.		/// Insert reload instruction for \p PhysReg before \p Before.
void RegAllocFast::reload(MachineBasicBlock::iterator Before, unsigned VirtReg,		void RegAllocFast::reload(MachineBasicBlock::iterator Before, unsigned VirtReg,
MCPhysReg PhysReg) {		MCPhysReg PhysReg) {
LLVM_DEBUG(dbgs() << "Reloading " << printReg(VirtReg, TRI) << " into "		LLVM_DEBUG(dbgs() << "Reloading " << printReg(VirtReg, TRI) << " into "
<< printReg(PhysReg, TRI) << '\n');		<< printReg(PhysReg, TRI) << '\n');
int FI = getStackSpaceFor(VirtReg);		int FI = getStackSpaceFor(VirtReg);
const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);		const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
TII->loadRegFromStackSlot(*MBB, Before, PhysReg, FI, &RC, TRI);		const TargetRegisterClass &LoadRC = *TII->getRegClassForStackSaveRestore(&RC);

		MachineBasicBlock::iterator PrevMII =
		Before == MBB->begin() ? MBB->end() : std::prev(Before);
		unsigned LoadReg = PhysReg;
		bool NeedsIntermediary = &RC != &LoadRC && !LoadRC.contains(LoadReg);
		if (NeedsIntermediary) {
		assert(&LoadRC == TII->getRegClassForStackSaveRestore(&LoadRC) &&
		"Invalid regclass cascade for stack restore");
		LoadReg = createVirtReg(LoadRC);
		LLVM_DEBUG(dbgs() << "Using " << printReg(LoadReg, TRI) << ":"
		<< TRI->getRegClassName(&LoadRC)
		<< " as an intermediary for the reload\n");
		}

		TII->loadRegFromStackSlot(*MBB, Before, LoadReg, FI, &LoadRC, TRI);
++NumLoads;		++NumLoads;

		if (NeedsIntermediary) {
		BuildMI(*MBB, Before, Before->getDebugLoc(), TII->get(TargetOpcode::COPY),
		PhysReg)
		.addReg(LoadReg, llvm::RegState::Kill);
		handleIntermediarySpill(PrevMII == MBB->end() ? MBB->begin()
		: std::next(PrevMII),
		Before, LoadReg);
		}
}		}

/// Return true if MO is the only remaining reference to its virtual register,		/// Return true if MO is the only remaining reference to its virtual register,
/// and it is guaranteed to be a block-local register.		/// and it is guaranteed to be a block-local register.
bool RegAllocFast::isLastUseOfLocalReg(const MachineOperand &MO) const {		bool RegAllocFast::isLastUseOfLocalReg(const MachineOperand &MO) const {
// If the register has ever been spilled or reloaded, we conservatively assume		// If the register has ever been spilled or reloaded, we conservatively assume
// it is a global register used in multiple blocks.		// it is a global register used in multiple blocks.
if (StackSlotForVirtReg[MO.getReg()] != -1)		if (StackSlotForVirtReg[MO.getReg()] != -1)
[... 153 lines hidden ...]	void RegAllocFast::usePhysReg(MachineOperand &MO) {
setPhysRegState(PhysReg, regFree);		setPhysRegState(PhysReg, regFree);
MO.setIsKill();		MO.setIsKill();
}		}

/// Mark PhysReg as reserved or free after spilling any virtregs. This is very		/// Mark PhysReg as reserved or free after spilling any virtregs. This is very
/// similar to defineVirtReg except the physreg is reserved instead of		/// similar to defineVirtReg except the physreg is reserved instead of
/// allocated.		/// allocated.
void RegAllocFast::definePhysReg(MachineBasicBlock::iterator MI,		void RegAllocFast::definePhysReg(MachineBasicBlock::iterator MI,
MCPhysReg PhysReg, RegState NewState) {		MCPhysReg PhysReg, RegState NewState,
		bool IsUsedInInstr) {
		if (IsUsedInInstr)
markRegUsedInInstr(PhysReg);		markRegUsedInInstr(PhysReg);
switch (unsigned VirtReg = PhysRegState[PhysReg]) {		switch (unsigned VirtReg = PhysRegState[PhysReg]) {
case regDisabled:		case regDisabled:
break;		break;
default:		default:
spillVirtReg(MI, VirtReg);		spillVirtReg(MI, VirtReg);
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case regFree:		case regFree:
case regReserved:		case regReserved:
[... 68 lines hidden ...]	default: {
Cost += LRI->Dirty ? spillDirty : spillClean;		Cost += LRI->Dirty ? spillDirty : spillClean;
break;		break;
}		}
}		}
}		}
return Cost;		return Cost;
}		}

/// This method updates local state so that we know that PhysReg is the
/// proper container for VirtReg now. The physical register must not be used
/// for anything else when this is called.
void RegAllocFast::assignVirtToPhysReg(LiveReg &LR, MCPhysReg PhysReg) {
unsigned VirtReg = LR.VirtReg;
LLVM_DEBUG(dbgs() << "Assigning " << printReg(VirtReg, TRI) << " to "
<< printReg(PhysReg, TRI) << '\n');
assert(LR.PhysReg == 0 && "Already assigned a physreg");
assert(PhysReg != 0 && "Trying to assign no register");
LR.PhysReg = PhysReg;
setPhysRegState(PhysReg, VirtReg);
}

/// Allocates a physical register for VirtReg.		/// Allocates a physical register for VirtReg.
void RegAllocFast::allocVirtReg(MachineInstr &MI, LiveReg &LR, unsigned Hint) {		bool RegAllocFast::allocVirtReg(MachineInstr &MI, unsigned VirtReg,
const unsigned VirtReg = LR.VirtReg;		unsigned Hint, MCPhysReg *OutPhysReg,
		bool IsUsedInInstr) {
assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&		assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&
"Can only allocate virtual registers");		"Can only allocate virtual registers");

const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);		const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
LLVM_DEBUG(dbgs() << "Search register for " << printReg(VirtReg)		LLVM_DEBUG(dbgs() << "Search register for " << printReg(VirtReg)
<< " in class " << TRI->getRegClassName(&RC) << '\n');		<< " in class " << TRI->getRegClassName(&RC) << '\n');

// Take hint when possible.		// Take hint when possible.
if (TargetRegisterInfo::isPhysicalRegister(Hint) &&		if (TargetRegisterInfo::isPhysicalRegister(Hint) &&
MRI->isAllocatable(Hint) && RC.contains(Hint)) {		MRI->isAllocatable(Hint) && RC.contains(Hint)) {
// Ignore the hint if we would have to spill a dirty register.		// Ignore the hint if we would have to spill a dirty register.
unsigned Cost = calcSpillCost(Hint);		unsigned Cost = calcSpillCost(Hint);
if (Cost < spillDirty) {		if (Cost < spillDirty) {
if (Cost)		if (Cost)
definePhysReg(MI, Hint, regFree);		definePhysReg(MI, Hint, regFree, IsUsedInInstr);
assignVirtToPhysReg(LR, Hint);		*OutPhysReg = Hint;
return;		return true;
}		}
}		}

// First try to find a completely free register.		// First try to find a completely free register.
ArrayRef<MCPhysReg> AllocationOrder = RegClassInfo.getOrder(&RC);		ArrayRef<MCPhysReg> AllocationOrder = RegClassInfo.getOrder(&RC);
for (MCPhysReg PhysReg : AllocationOrder) {		for (MCPhysReg PhysReg : AllocationOrder) {
if (PhysRegState[PhysReg] == regFree && !isRegUsedInInstr(PhysReg)) {		if (PhysRegState[PhysReg] == regFree && !isRegUsedInInstr(PhysReg)) {
assignVirtToPhysReg(LR, PhysReg);		*OutPhysReg = PhysReg;
return;		return true;
}		}
}		}

LLVM_DEBUG(dbgs() << "Allocating " << printReg(VirtReg) << " from "		LLVM_DEBUG(dbgs() << "Allocating " << printReg(VirtReg) << " from "
<< TRI->getRegClassName(&RC) << '\n');		<< TRI->getRegClassName(&RC) << '\n');

unsigned BestReg = 0;		unsigned BestReg = 0;
unsigned BestCost = spillImpossible;		unsigned BestCost = spillImpossible;
for (MCPhysReg PhysReg : AllocationOrder) {		for (MCPhysReg PhysReg : AllocationOrder) {
LLVM_DEBUG(dbgs() << "\tRegister: " << printReg(PhysReg, TRI) << ' ');		LLVM_DEBUG(dbgs() << "\tRegister: " << printReg(PhysReg, TRI) << ' ');
unsigned Cost = calcSpillCost(PhysReg);		unsigned Cost = calcSpillCost(PhysReg);
LLVM_DEBUG(dbgs() << "Cost: " << Cost << " BestCost: " << BestCost << '\n');		LLVM_DEBUG(dbgs() << "Cost: " << Cost << " BestCost: " << BestCost << '\n');
// Cost is 0 when all aliases are already disabled.		// Cost is 0 when all aliases are already disabled.
if (Cost == 0) {		if (Cost == 0) {
assignVirtToPhysReg(LR, PhysReg);		*OutPhysReg = PhysReg;
return;		return true;
}		}
if (Cost < BestCost) {		if (Cost < BestCost) {
BestReg = PhysReg;		BestReg = PhysReg;
BestCost = Cost;		BestCost = Cost;
}		}
}		}

if (!BestReg) {		if (BestReg) {
		definePhysReg(MI, BestReg, regFree, IsUsedInInstr);
		*OutPhysReg = BestReg;
		return true;
		}

		*OutPhysReg = *AllocationOrder.begin();
		return false;
		}

		void RegAllocFast::assignVirtReg(MachineInstr &MI, LiveReg &LR, unsigned Hint) {
		assert(LR.PhysReg == 0 && "Already assigned a physreg");

		const unsigned VirtReg = LR.VirtReg;
		MCPhysReg PhysReg;
		bool Defined = allocVirtReg(MI, VirtReg, Hint, &PhysReg, true);
		if (!Defined) {
// Nothing we can do. Report an error and keep going with a bad allocation.		// Nothing we can do. Report an error and keep going with a bad allocation.
if (MI.isInlineAsm())		if (MI.isInlineAsm())
MI.emitError("inline assembly requires more registers than available");		MI.emitError("inline assembly requires more registers than available");
else		else
MI.emitError("ran out of registers during register allocation");		MI.emitError("ran out of registers during register allocation");
definePhysReg(MI, *AllocationOrder.begin(), regFree);		definePhysReg(MI, PhysReg, regFree);
assignVirtToPhysReg(LR, *AllocationOrder.begin());
return;
}		}

definePhysReg(MI, BestReg, regFree);		// Update local state so that we know that PhysReg is the proper container for
assignVirtToPhysReg(LR, BestReg);		// VirtReg now.
		LLVM_DEBUG(dbgs() << "Assigning " << printReg(VirtReg, TRI) << " to "
		<< printReg(PhysReg, TRI) << '\n');
		LR.PhysReg = PhysReg;
		setPhysRegState(PhysReg, VirtReg);
}		}

/// Allocates a register for VirtReg and mark it as dirty.		/// Allocates a register for VirtReg and mark it as dirty.
MCPhysReg RegAllocFast::defineVirtReg(MachineInstr &MI, unsigned OpNum,		MCPhysReg RegAllocFast::defineVirtReg(MachineInstr &MI, unsigned OpNum,
unsigned VirtReg, unsigned Hint) {		unsigned VirtReg, unsigned Hint) {
assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&		assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&
"Not a virtual register");		"Not a virtual register");
LiveRegMap::iterator LRI;		LiveRegMap::iterator LRI;
bool New;		bool New;
std::tie(LRI, New) = LiveVirtRegs.insert(LiveReg(VirtReg));		std::tie(LRI, New) = LiveVirtRegs.insert(LiveReg(VirtReg));
if (!LRI->PhysReg) {		if (!LRI->PhysReg) {
// If there is no hint, peek at the only use of this register.		// If there is no hint, peek at the only use of this register.
if ((!Hint || !TargetRegisterInfo::isPhysicalRegister(Hint)) &&		if ((!Hint || !TargetRegisterInfo::isPhysicalRegister(Hint)) &&
MRI->hasOneNonDBGUse(VirtReg)) {		MRI->hasOneNonDBGUse(VirtReg)) {
const MachineInstr &UseMI = *MRI->use_instr_nodbg_begin(VirtReg);		const MachineInstr &UseMI = *MRI->use_instr_nodbg_begin(VirtReg);
// It's a copy, use the destination register as a hint.		// It's a copy, use the destination register as a hint.
if (UseMI.isCopyLike())		if (UseMI.isCopyLike())
Hint = UseMI.getOperand(0).getReg();		Hint = UseMI.getOperand(0).getReg();
}		}
allocVirtReg(MI, *LRI, Hint);		assignVirtReg(MI, *LRI, Hint);
} else if (LRI->LastUse) {		} else if (LRI->LastUse) {
// Redefining a live register - kill at the last use, unless it is this		// Redefining a live register - kill at the last use, unless it is this
// instruction defining VirtReg multiple times.		// instruction defining VirtReg multiple times.
if (LRI->LastUse != &MI || LRI->LastUse->getOperand(LRI->LastOpNum).isUse())		if (LRI->LastUse != &MI || LRI->LastUse->getOperand(LRI->LastOpNum).isUse())
addKillFlag(*LRI);		addKillFlag(*LRI);
}		}
assert(LRI->PhysReg && "Register not assigned");		assert(LRI->PhysReg && "Register not assigned");
LRI->LastUse = &MI;		LRI->LastUse = &MI;
[... 10 lines hidden ...]	RegAllocFast::LiveReg &RegAllocFast::reloadVirtReg(MachineInstr &MI,
unsigned Hint) {		unsigned Hint) {
assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&		assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&
"Not a virtual register");		"Not a virtual register");
LiveRegMap::iterator LRI;		LiveRegMap::iterator LRI;
bool New;		bool New;
std::tie(LRI, New) = LiveVirtRegs.insert(LiveReg(VirtReg));		std::tie(LRI, New) = LiveVirtRegs.insert(LiveReg(VirtReg));
MachineOperand &MO = MI.getOperand(OpNum);		MachineOperand &MO = MI.getOperand(OpNum);
if (!LRI->PhysReg) {		if (!LRI->PhysReg) {
allocVirtReg(MI, *LRI, Hint);		assignVirtReg(MI, *LRI, Hint);
reload(MI, VirtReg, LRI->PhysReg);		reload(MI, VirtReg, LRI->PhysReg);
} else if (LRI->Dirty) {		} else if (LRI->Dirty) {
if (isLastUseOfLocalReg(MO)) {		if (isLastUseOfLocalReg(MO)) {
LLVM_DEBUG(dbgs() << "Killing last use: " << MO << '\n');		LLVM_DEBUG(dbgs() << "Killing last use: " << MO << '\n');
if (MO.isUse())		if (MO.isUse())
MO.setIsKill();		MO.setIsKill();
else		else
MO.setIsDead();		MO.setIsDead();
[... 50 lines hidden ...]	bool RegAllocFast::setPhysReg(MachineInstr &MI, unsigned OpNum,
// A <def,read-undef> of a sub-register requires an implicit def of the full		// A <def,read-undef> of a sub-register requires an implicit def of the full
// register.		// register.
if (MO.isDef() && MO.isUndef())		if (MO.isDef() && MO.isUndef())
MI.addRegisterDefined(PhysReg, TRI);		MI.addRegisterDefined(PhysReg, TRI);

return Dead;		return Dead;
}		}

		/// Create a new virtual register for use by the allocator.
		unsigned RegAllocFast::createVirtReg(const TargetRegisterClass &RC) {
		unsigned Reg = MRI->createVirtualRegister(&RC);
		unsigned NumVirtRegs = MRI->getNumVirtRegs();
		StackSlotForVirtReg.resize(NumVirtRegs);
		LiveVirtRegs.setUniverse(NumVirtRegs);
		return Reg;
		}

		/// Process a spill/reload sequence that uses an intermediary register. The
		/// method expects an instruction range implementing the spill/reload and id of
		/// the new intermediary register. The intermediary is allocated to a physical
		/// register and the instruction sequence is appropriately updated.
		void RegAllocFast::handleIntermediarySpill(MachineBasicBlock::iterator BeginMII,
		MachineBasicBlock::iterator EndMII,
		unsigned VirtReg) {
		assert(TargetRegisterInfo::isVirtualRegister(VirtReg) &&
		"Not a virtual register");

		const TargetRegisterClass &RC = *MRI->getRegClass(VirtReg);
		LLVM_DEBUG(dbgs() << "Allocating intermediary register "
		<< printReg(VirtReg, TRI) << ":"
		<< TRI->getRegClassName(&RC)
		<< " to a physical register\n");

		// Allocate the intermediary virtual register to a physical register.
		MCPhysReg InterPhysReg;
		bool Defined = allocVirtReg(*BeginMII, VirtReg, 0, &InterPhysReg, false);
		if (!Defined) {
		// If an instruction uses a large number of registers (for instance, it is a
		// complex INLINEASM), it is possible that all registers that can store the
		// intermediary are already in use. In that case, one of these registers is
		// temporarily spilled so the intermediary can be allocated.
		//
		// Note: The target must guarantee that an intermediary register can be
		// successfully stored/loaded without modifying content of any of its super
		// registers.
		int FI = getStackSpaceFor(VirtReg);

		LLVM_DEBUG(dbgs() << "Temporarily spilling " << printReg(InterPhysReg, TRI)
		<< " to stack slot #" << FI
		<< " to allocate intermediary register "
		<< printReg(VirtReg, TRI) << ":"
		<< TRI->getRegClassName(&RC) << "\n");

		// TODO Fix debug information for the spill (DBG_VALUE).
		TII->storeRegToStackSlot(*MBB, BeginMII, InterPhysReg, true, FI, &RC, TRI);
		++NumStores;

		TII->loadRegFromStackSlot(*MBB, EndMII, InterPhysReg, FI, &RC, TRI);
		++NumLoads;
		}

		// Update the intermediary register in the spill sequence.
		for (MachineInstr &MI : make_range(BeginMII, EndMII)) {
		for (unsigned I = 0, E = MI.getNumOperands(); I != E; ++I) {
		const MachineOperand &MO = MI.getOperand(I);
		if (MO.isReg() && MO.getReg() == VirtReg)
		setPhysReg(MI, I, InterPhysReg);
		}
		}
		}

// Handles special instruction operand like early clobbers and tied ops when		// Handles special instruction operand like early clobbers and tied ops when
// there are additional physreg defines.		// there are additional physreg defines.
void RegAllocFast::handleThroughOperands(MachineInstr &MI,		void RegAllocFast::handleThroughOperands(MachineInstr &MI,
SmallVectorImpl<unsigned> &VirtDead) {		SmallVectorImpl<unsigned> &VirtDead) {
LLVM_DEBUG(dbgs() << "Scanning for through registers:");		LLVM_DEBUG(dbgs() << "Scanning for through registers:");
SmallSet<unsigned, 8> ThroughRegs;		SmallSet<unsigned, 8> ThroughRegs;
for (const MachineOperand &MO : MI.operands()) {		for (const MachineOperand &MO : MI.operands()) {
if (!MO.isReg()) continue;		if (!MO.isReg()) continue;
[... 266 lines hidden ...]	for (unsigned I = 0; I != VirtOpEnd; ++I) {
LiveReg &LR = reloadVirtReg(MI, I, Reg, CopyDstReg);		LiveReg &LR = reloadVirtReg(MI, I, Reg, CopyDstReg);
MCPhysReg PhysReg = LR.PhysReg;		MCPhysReg PhysReg = LR.PhysReg;
CopySrcReg = (CopySrcReg == Reg || CopySrcReg == PhysReg) ? PhysReg : 0;		CopySrcReg = (CopySrcReg == Reg || CopySrcReg == PhysReg) ? PhysReg : 0;
if (setPhysReg(MI, I, PhysReg))		if (setPhysReg(MI, I, PhysReg))
killVirtReg(LR);		killVirtReg(LR);
}		}
}		}

		unsigned DefOpEnd = MI.getNumOperands();
		if (MI.isCall()) {
		// Spill all virtregs before a call. This serves one purpose: If an
		// exception is thrown, the landing pad is going to expect to find
		// registers in their spill slots.
		// Note: although this is appealing to just consider all definitions
		// as call-clobbered, this is not correct because some of those
		// definitions may be used later on and we do not want to reuse
		// those for virtual registers in between.
		LLVM_DEBUG(dbgs() << " Spilling remaining registers before call.\n");
		spillAll(MI);
		}

// Track registers defined by instruction - early clobbers and tied uses at		// Track registers defined by instruction - early clobbers and tied uses at
// this point.		// this point.
UsedInInstr.clear();		UsedInInstr.clear();
if (hasEarlyClobbers) {		if (hasEarlyClobbers) {
for (const MachineOperand &MO : MI.operands()) {		for (const MachineOperand &MO : MI.operands()) {
if (!MO.isReg()) continue;		if (!MO.isReg()) continue;
unsigned Reg = MO.getReg();		unsigned Reg = MO.getReg();
if (!Reg || !TargetRegisterInfo::isPhysicalRegister(Reg)) continue;		if (!Reg || !TargetRegisterInfo::isPhysicalRegister(Reg)) continue;
// Look for physreg defs and tied uses.		// Look for physreg defs and tied uses.
if (!MO.isDef() && !MO.isTied()) continue;		if (!MO.isDef() && !MO.isTied()) continue;
markRegUsedInInstr(Reg);		markRegUsedInInstr(Reg);
}		}
}		}

unsigned DefOpEnd = MI.getNumOperands();
if (MI.isCall()) {
// Spill all virtregs before a call. This serves one purpose: If an
// exception is thrown, the landing pad is going to expect to find
// registers in their spill slots.
// Note: although this is appealing to just consider all definitions
// as call-clobbered, this is not correct because some of those
// definitions may be used later on and we do not want to reuse
// those for virtual registers in between.
LLVM_DEBUG(dbgs() << " Spilling remaining registers before call.\n");
spillAll(MI);
}

// Third scan.		// Third scan.
// Allocate defs and collect dead defs.		// Allocate defs and collect dead defs.
for (unsigned I = 0; I != DefOpEnd; ++I) {		for (unsigned I = 0; I != DefOpEnd; ++I) {
const MachineOperand &MO = MI.getOperand(I);		const MachineOperand &MO = MI.getOperand(I);
if (!MO.isReg() || !MO.isDef() || !MO.getReg() || MO.isEarlyClobber())		if (!MO.isReg() || !MO.isDef() || !MO.getReg() || MO.isEarlyClobber())
continue;		continue;
unsigned Reg = MO.getReg();		unsigned Reg = MO.getReg();

[... 77 lines hidden ...]
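
The spill() and reload() changes in RegAllocFast.cpp route a value through an intermediary register whenever the target reports a different register class for stack save/restore. The following is a minimal, self-contained sketch of that decision, including the temporary save of the scratch register when no register is free. It is not the patch's code: `RegClass`, `classForStackSaveRestore`, `contains`, `spillSequence`, `reloadSequence`, and the textual instruction strings are hypothetical stand-ins chosen for illustration, and the register numbering (r0-r7 low, higher numbers high) is a simplifying assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for the ARM register classes; not the real LLVM types.
enum class RegClass { tGPR, hGPR };

// Hypothetical mirror of getRegClassForStackSaveRestore(): values in high
// registers are stored and loaded through a low register.
RegClass classForStackSaveRestore(RegClass RC) {
  return RC == RegClass::hGPR ? RegClass::tGPR : RC;
}

// Toy membership test: r0-r7 count as low, everything above as high.
bool contains(RegClass RC, unsigned Reg) {
  return (RC == RegClass::tGPR) == (Reg <= 7);
}

static std::string reg(unsigned R) { return "r" + std::to_string(R); }
static std::string slot(int FI) { return "slot" + std::to_string(FI); }

// Spill PhysReg (holding a value of class RC) to stack slot FI. Scratch is
// the register picked for the intermediary. If ScratchBusy is set, Scratch is
// saved to TmpFI before the sequence and restored after it.
std::vector<std::string> spillSequence(RegClass RC, unsigned PhysReg, int FI,
                                       unsigned Scratch,
                                       bool ScratchBusy = false,
                                       int TmpFI = -1) {
  std::vector<std::string> Seq;
  RegClass StoreRC = classForStackSaveRestore(RC);
  bool NeedsIntermediary = RC != StoreRC && !contains(StoreRC, PhysReg);
  if (!NeedsIntermediary) {
    Seq.push_back("STORE " + reg(PhysReg) + ", " + slot(FI));
    return Seq;
  }
  assert((!ScratchBusy || TmpFI >= 0) && "busy scratch needs a temporary slot");
  if (ScratchBusy)
    Seq.push_back("STORE " + reg(Scratch) + ", " + slot(TmpFI));
  Seq.push_back(reg(Scratch) + " = COPY " + reg(PhysReg));
  Seq.push_back("STORE " + reg(Scratch) + ", " + slot(FI));
  if (ScratchBusy)
    Seq.push_back(reg(Scratch) + " = LOAD " + slot(TmpFI));
  return Seq;
}

// Reload stack slot FI into PhysReg, loading into Scratch first if needed.
std::vector<std::string> reloadSequence(RegClass RC, unsigned PhysReg, int FI,
                                        unsigned Scratch) {
  std::vector<std::string> Seq;
  RegClass LoadRC = classForStackSaveRestore(RC);
  if (RC == LoadRC || contains(LoadRC, PhysReg)) {
    Seq.push_back(reg(PhysReg) + " = LOAD " + slot(FI));
    return Seq;
  }
  Seq.push_back(reg(Scratch) + " = LOAD " + slot(FI));
  Seq.push_back(reg(PhysReg) + " = COPY " + reg(Scratch));
  return Seq;
}
```

In this model, spilling a value held in r12 with r0 as the scratch register yields a copy followed by a store, while a value already in a low register is stored directly. The real allocator works on MachineInstr iterators and allocates the intermediary as a fresh virtual register, which this sketch deliberately glosses over.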

lib/Target/ARM/Thumb1InstrInfo.h

[... 47 lines hidden ...]	void storeRegToStackSlot(MachineBasicBlock &MBB,
const TargetRegisterInfo *TRI) const override;		const TargetRegisterInfo *TRI) const override;

void loadRegFromStackSlot(MachineBasicBlock &MBB,		void loadRegFromStackSlot(MachineBasicBlock &MBB,
MachineBasicBlock::iterator MBBI,		MachineBasicBlock::iterator MBBI,
unsigned DestReg, int FrameIndex,		unsigned DestReg, int FrameIndex,
const TargetRegisterClass *RC,		const TargetRegisterClass *RC,
const TargetRegisterInfo *TRI) const override;		const TargetRegisterInfo *TRI) const override;

		const TargetRegisterClass *
		getRegClassForStackSaveRestore(const TargetRegisterClass *RC) const override;

bool canCopyGluedNodeDuringSchedule(SDNode *N) const override;		bool canCopyGluedNodeDuringSchedule(SDNode *N) const override;
private:		private:
void expandLoadStackGuard(MachineBasicBlock::iterator MI) const override;		void expandLoadStackGuard(MachineBasicBlock::iterator MI) const override;
};		};
}		}

#endif		#endif

lib/Target/ARM/Thumb1InstrInfo.cpp

[... 123 lines hidden ...]	if (RC->hasSuperClassEq(&ARM::tGPRRegClass) ||
MachineMemOperand *MMO = MF.getMachineMemOperand(		MachineMemOperand *MMO = MF.getMachineMemOperand(
MachinePointerInfo::getFixedStack(MF, FI), MachineMemOperand::MOLoad,		MachinePointerInfo::getFixedStack(MF, FI), MachineMemOperand::MOLoad,
MFI.getObjectSize(FI), MFI.getObjectAlignment(FI));		MFI.getObjectSize(FI), MFI.getObjectAlignment(FI));
BuildMI(MBB, I, DL, get(ARM::tLDRspi), DestReg)		BuildMI(MBB, I, DL, get(ARM::tLDRspi), DestReg)
.addFrameIndex(FI)		.addFrameIndex(FI)
.addImm(0)		.addImm(0)
.addMemOperand(MMO)		.addMemOperand(MMO)
.add(predOps(ARMCC::AL));		.add(predOps(ARMCC::AL));
}		}
}		}
		thopre: Can you put a similar comment to the store to stack slot?
		petpav01 (Author): Added.

		const TargetRegisterClass *Thumb1InstrInfo::getRegClassForStackSaveRestore(
		const TargetRegisterClass *RC) const {
		if (ARM::hGPRRegClass.hasSubClassEq(RC))
		return &ARM::tGPRRegClass;
		return RC;
		}

void Thumb1InstrInfo::expandLoadStackGuard(		void Thumb1InstrInfo::expandLoadStackGuard(
MachineBasicBlock::iterator MI) const {		MachineBasicBlock::iterator MI) const {
MachineFunction &MF = *MI->getParent()->getParent();		MachineFunction &MF = *MI->getParent()->getParent();
const TargetMachine &TM = MF.getTarget();		const TargetMachine &TM = MF.getTarget();
if (TM.isPositionIndependent())		if (TM.isPositionIndependent())
expandLoadStackGuardBase(MI, ARM::tLDRLIT_ga_pcrel, ARM::tLDRi);		expandLoadStackGuardBase(MI, ARM::tLDRLIT_ga_pcrel, ARM::tLDRi);
else		else
expandLoadStackGuardBase(MI, ARM::tLDRLIT_ga_abs, ARM::tLDRi);		expandLoadStackGuardBase(MI, ARM::tLDRLIT_ga_abs, ARM::tLDRi);
[... 14 lines hidden ...]
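
The allocator asserts that the class returned by getRegClassForStackSaveRestore() maps to itself ("Invalid regclass cascade"), so a target hook must not chain one redirection into another. A minimal sketch of that invariant follows, assuming a plain string-keyed table in place of real TargetRegisterClass pointers; `ClassMap`, `saveRestoreClass`, `hasNoCascade`, and `checkedSaveRestoreClass` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table: register class name -> class used for stack save/restore.
// Classes that are absent from the table map to themselves.
using ClassMap = std::map<std::string, std::string>;

std::string saveRestoreClass(const ClassMap &M, const std::string &RC) {
  auto It = M.find(RC);
  return It == M.end() ? RC : It->second;
}

// True when every redirect target maps to itself, i.e. one intermediary
// register is always enough and no chain of copies is required.
bool hasNoCascade(const ClassMap &M) {
  for (const auto &KV : M)
    if (saveRestoreClass(M, KV.second) != KV.second)
      return false;
  return true;
}

// Lookup that enforces the same invariant the allocator asserts on.
std::string checkedSaveRestoreClass(const ClassMap &M, const std::string &RC) {
  std::string Target = saveRestoreClass(M, RC);
  assert(saveRestoreClass(M, Target) == Target && "Invalid regclass cascade");
  return Target;
}
```

With a table containing only the hGPR to tGPR redirection, as in the Thumb1 hook, the check passes. A table that sent hGPR to GPR and GPR on to tGPR would be rejected.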

test/CodeGen/Thumb/hgpr-spill-basic.mir

This file was added.

				# RUN: llc -run-pass regallocbasic %s -o - | FileCheck %s --check-prefix=CHECK-ALLOC
				# RUN: llc -run-pass regallocbasic,virtregrewriter %s -o - | FileCheck %s --check-prefix=CHECK-REWRITE

				# This test examines register allocation and spilling of a high register in Thumb1
				# with the Basic Register Allocator. The test uses two consecutive inline assembler
				# expressions that both request an input variable to be loaded in a high
				# register. The first expression marks {r8, r9, r10, r11} as clobbered, the
				# second one marks {r12, lr} as such. The allocator cannot choose the same
				# register to load the variable and a spill occurs.
				#
				# The test checks that InlineSpiller used by Basic Register Allocator implements
				# the following:
				# * A high register in Thumb1 is spilled by inserting a copy to a low register
				# and then saving that.
				# * A high register in Thumb1 is restored by inserting a load to a low register
				# and then a copy to the high register.

				--- |
				; ModuleID = 'test.ll'
				source_filename = "test.c"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumbv6m-none--eabi"

				define dso_local void @constraint_h() {
				entry:
				%i = alloca i32, align 4
				%0 = load i32, i32* %i, align 4
				call void asm sideeffect "@ $0", "h,~{r8},~{r9},~{r10},~{r11}"(i32 %0)
				call void asm sideeffect "@ $0", "h,~{r12},~{lr}"(i32 %0)
				ret void
				}

				...
				---
				name: constraint_h
				tracksRegLiveness: true
				registers:
				- { id: 0, class: hgpr }
				- { id: 1, class: tgpr }
				stack:
				- { id: 0, name: i, size: 4, alignment: 4, stack-id: 0, local-offset: -4 }
				body: |
				bb.0.entry:
				%1:tgpr = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
				%0:hgpr = COPY %1
				INLINEASM &"@ $0", 1, 589833, %0, 12, implicit-def early-clobber $r8, implicit-def early-clobber $r9, implicit-def early-clobber $r10, implicit-def early-clobber $r11
				INLINEASM &"@ $0", 1, 589833, %0, 12, implicit-def early-clobber $r12, implicit-def early-clobber $lr
				tBX_RET 14, $noreg

				...

				# CHECK-ALLOC: bb.0.entry:
				# CHECK-ALLOC-NEXT: %1:tgpr = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
				# CHECK-ALLOC-NEXT: %2:gpr = COPY %1
				# CHECK-ALLOC-NEXT: %3:tgpr = COPY %2
				# CHECK-ALLOC-NEXT: tSTRspi %3, %stack.1, 0, 14, $noreg :: (store 4 into %stack.1)
				# CHECK-ALLOC-NEXT: %5:tgpr = tLDRspi %stack.1, 0, 14, $noreg :: (load 4 from %stack.1)
				# CHECK-ALLOC-NEXT: %4:hgpr = COPY %5
				# CHECK-ALLOC-NEXT: INLINEASM &"@ $0", 1, 589833, %4, 12, implicit-def early-clobber $r8, implicit-def early-clobber $r9, implicit-def early-clobber $r10, implicit-def early-clobber $r11
				# CHECK-ALLOC-NEXT: %7:tgpr = tLDRspi %stack.1, 0, 14, $noreg :: (load 4 from %stack.1)
				# CHECK-ALLOC-NEXT: %6:hgpr = COPY %7
				# CHECK-ALLOC-NEXT: INLINEASM &"@ $0", 1, 589833, %6, 12, implicit-def early-clobber $r12, implicit-def early-clobber $lr
				# CHECK-ALLOC-NEXT: tBX_RET 14, $noreg

				# CHECK-REWRITE: bb.0.entry:
				# CHECK-REWRITE-NEXT: renamable $r0 = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
				# CHECK-REWRITE-NEXT: tSTRspi killed renamable $r0, %stack.1, 0, 14, $noreg :: (store 4 into %stack.1)
				# CHECK-REWRITE-NEXT: renamable $r0 = tLDRspi %stack.1, 0, 14, $noreg :: (load 4 from %stack.1)
				# CHECK-REWRITE-NEXT: renamable $r12 = COPY killed renamable $r0
				# CHECK-REWRITE-NEXT: INLINEASM &"@ $0", 1, 589833, killed renamable $r12, 12, implicit-def early-clobber $r8, implicit-def early-clobber $r9, implicit-def early-clobber $r10, implicit-def early-clobber $r11
				# CHECK-REWRITE-NEXT: renamable $r0 = tLDRspi %stack.1, 0, 14, $noreg :: (load 4 from %stack.1)
				# CHECK-REWRITE-NEXT: renamable $r8 = COPY killed renamable $r0
				# CHECK-REWRITE-NEXT: INLINEASM &"@ $0", 1, 589833, killed renamable $r8, 12, implicit-def early-clobber $r12, implicit-def early-clobber $lr
				# CHECK-REWRITE-NEXT: tBX_RET 14, $noreg

test/CodeGen/Thumb/hgpr-spill-fast-all.mir

This file was added.

				# RUN: llc -run-pass regallocfast %s -o - | FileCheck %s

				# Check that Fast Register Allocator can successfully spill all virtual registers
				# before a call instruction, including any high registers.
				#
				# The test operates as follows:
				# * Load a value in a high register which gets allocated to r12.
				# * Load values in all low registers r0-r7.
				# * Perform a call. The allocator spills all virtual registers prior to calls and
				# so it must be able to successfully store the values loaded in r12, r0-r7 to
				# the stack.

				--- |
				; ModuleID = 'test.ll'
				source_filename = "test.c"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumbv6m-none--eabi"

				define dso_local i32 @constraint_h() {
				entry:
				%ih = alloca i32, align 4
				%i0 = alloca i32, align 4
				%i1 = alloca i32, align 4
				%i2 = alloca i32, align 4
				%i3 = alloca i32, align 4
				%i4 = alloca i32, align 4
				%i5 = alloca i32, align 4
				%i6 = alloca i32, align 4
				%i7 = alloca i32, align 4
				%0 = load i32, i32* %ih, align 4
				%1 = load i32, i32* %i0, align 4
				%2 = load i32, i32* %i1, align 4
				%3 = load i32, i32* %i2, align 4
				%4 = load i32, i32* %i3, align 4
				%5 = load i32, i32* %i4, align 4
				%6 = load i32, i32* %i5, align 4
				%7 = load i32, i32* %i6, align 4
				%8 = load i32, i32* %i7, align 4
				call void @bar()
				%add = add nsw i32 %0, %1
				%add1 = add nsw i32 %add, %2
				%add2 = add nsw i32 %add1, %3
				%add3 = add nsw i32 %add2, %4
				%add4 = add nsw i32 %add3, %5
				%add5 = add nsw i32 %add4, %6
				%add6 = add nsw i32 %add5, %7
				%add7 = add nsw i32 %add6, %8
				ret i32 %add7
				}

				declare void @bar()

				...
				---
				name: constraint_h
				tracksRegLiveness: true
				registers:
				- { id: 0, class: tgpr }
				- { id: 1, class: hgpr }
				- { id: 2, class: tgpr }
				- { id: 3, class: tgpr }
				- { id: 4, class: tgpr }
				- { id: 5, class: tgpr }
				- { id: 6, class: tgpr }
				- { id: 7, class: tgpr }
				- { id: 8, class: tgpr }
				- { id: 9, class: tgpr }
				- { id: 10, class: tgpr }
				- { id: 11, class: tgpr }
				- { id: 12, class: tgpr }
				- { id: 13, class: tgpr }
				- { id: 14, class: tgpr }
				- { id: 15, class: tgpr }
				- { id: 16, class: tgpr }
				- { id: 17, class: tgpr }
				- { id: 18, class: tgpr }
				stack:
				- { id: 0, name: ih, size: 4, alignment: 4, stack-id: 0, local-offset: -4 }
				- { id: 1, name: i0, size: 4, alignment: 4, stack-id: 0, local-offset: -8 }
				- { id: 2, name: i1, size: 4, alignment: 4, stack-id: 0, local-offset: -12 }
				- { id: 3, name: i2, size: 4, alignment: 4, stack-id: 0, local-offset: -16 }
				- { id: 4, name: i3, size: 4, alignment: 4, stack-id: 0, local-offset: -20 }
				- { id: 5, name: i4, size: 4, alignment: 4, stack-id: 0, local-offset: -24 }
				- { id: 6, name: i5, size: 4, alignment: 4, stack-id: 0, local-offset: -28 }
				- { id: 7, name: i6, size: 4, alignment: 4, stack-id: 0, local-offset: -32 }
				- { id: 8, name: i7, size: 4, alignment: 4, stack-id: 0, local-offset: -36 }
				body: |
				bb.0.entry:
				%0:tgpr = tLDRspi %stack.0.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				%1:hgpr = COPY %0
				%2:tgpr = tLDRspi %stack.1.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				%3:tgpr = tLDRspi %stack.2.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				%4:tgpr = tLDRspi %stack.3.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				%5:tgpr = tLDRspi %stack.4.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				%6:tgpr = tLDRspi %stack.5.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				%7:tgpr = tLDRspi %stack.6.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				%8:tgpr = tLDRspi %stack.7.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				%9:tgpr = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				tBL 14, $noreg, @bar, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
				%10:tgpr = COPY %1
				%11:tgpr, $cpsr = nsw tADDrr %10, %2, 14, $noreg
				%12:tgpr, $cpsr = nsw tADDrr %11, %3, 14, $noreg
				%13:tgpr, $cpsr = nsw tADDrr %12, %4, 14, $noreg
				%14:tgpr, $cpsr = nsw tADDrr %13, %5, 14, $noreg
				%15:tgpr, $cpsr = nsw tADDrr %14, %6, 14, $noreg
				%16:tgpr, $cpsr = nsw tADDrr %15, %7, 14, $noreg
				%17:tgpr, $cpsr = nsw tADDrr %16, %8, 14, $noreg
				%18:tgpr, $cpsr = nsw tADDrr %17, %9, 14, $noreg
				$r0 = COPY %18
				tBX_RET 14, $noreg, implicit $r0

				...

				# CHECK: bb.0.entry:
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.0.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				# CHECK-NEXT: renamable $r12 = COPY killed renamable $r0
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.1.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				# CHECK-NEXT: renamable $r1 = tLDRspi %stack.2.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				# CHECK-NEXT: renamable $r2 = tLDRspi %stack.3.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				# CHECK-NEXT: renamable $r3 = tLDRspi %stack.4.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				# CHECK-NEXT: renamable $r4 = tLDRspi %stack.5.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				# CHECK-NEXT: renamable $r5 = tLDRspi %stack.6.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				# CHECK-NEXT: renamable $r6 = tLDRspi %stack.7.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				# CHECK-NEXT: renamable $r7 = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				# CHECK-NEXT: tSTRspi killed $r0, %stack.10, 0, 14, $noreg :: (store 4 into %stack.10)
				# CHECK-NEXT: renamable $r0 = COPY killed $r12
				# CHECK-NEXT: tSTRspi killed renamable $r0, %stack.9, 0, 14, $noreg :: (store 4 into %stack.9)
				# CHECK-NEXT: tSTRspi killed $r1, %stack.11, 0, 14, $noreg :: (store 4 into %stack.11)
				# CHECK-NEXT: tSTRspi killed $r2, %stack.12, 0, 14, $noreg :: (store 4 into %stack.12)
				# CHECK-NEXT: tSTRspi killed $r3, %stack.13, 0, 14, $noreg :: (store 4 into %stack.13)
				# CHECK-NEXT: tSTRspi killed $r4, %stack.14, 0, 14, $noreg :: (store 4 into %stack.14)
				# CHECK-NEXT: tSTRspi killed $r5, %stack.15, 0, 14, $noreg :: (store 4 into %stack.15)
				# CHECK-NEXT: tSTRspi killed $r6, %stack.16, 0, 14, $noreg :: (store 4 into %stack.16)
				# CHECK-NEXT: tSTRspi killed $r7, %stack.17, 0, 14, $noreg :: (store 4 into %stack.17)
				# CHECK-NEXT: tBL 14, $noreg, @bar, csr_aapcs, implicit-def dead $lr, implicit $sp, implicit-def $sp
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.9, 0, 14, $noreg :: (load 4 from %stack.9)
				# CHECK-NEXT: $r12 = COPY killed renamable $r0
				# CHECK-NEXT: renamable $r0 = COPY killed renamable $r12
				# CHECK-NEXT: $r1 = tLDRspi %stack.10, 0, 14, $noreg :: (load 4 from %stack.10)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r1, 14, $noreg
				# CHECK-NEXT: $r2 = tLDRspi %stack.11, 0, 14, $noreg :: (load 4 from %stack.11)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r2, 14, $noreg
				# CHECK-NEXT: $r3 = tLDRspi %stack.12, 0, 14, $noreg :: (load 4 from %stack.12)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r3, 14, $noreg
				# CHECK-NEXT: $r4 = tLDRspi %stack.13, 0, 14, $noreg :: (load 4 from %stack.13)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r4, 14, $noreg
				# CHECK-NEXT: $r5 = tLDRspi %stack.14, 0, 14, $noreg :: (load 4 from %stack.14)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r5, 14, $noreg
				# CHECK-NEXT: $r6 = tLDRspi %stack.15, 0, 14, $noreg :: (load 4 from %stack.15)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r6, 14, $noreg
				# CHECK-NEXT: $r7 = tLDRspi %stack.16, 0, 14, $noreg :: (load 4 from %stack.16)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r7, 14, $noreg
				# CHECK-NEXT: $r1 = tLDRspi %stack.17, 0, 14, $noreg :: (load 4 from %stack.17)
				# CHECK-NEXT: renamable $r0, $cpsr = nsw tADDrr killed renamable $r0, killed renamable $r1, 14, $noreg
				# CHECK-NEXT: tBX_RET 14, $noreg, implicit killed $r0

test/CodeGen/Thumb/hgpr-spill-fast-tsave.mir

This file was added.

				# RUN: llc -run-pass regallocfast %s -o - | FileCheck %s

				# Check that when storing a high register to a stack slot using an intermediary,
				# Fast Register Allocator is also able to spill a value in a register that it
				# needs to allocate for the intermediary.
				#
				# The test operates as follows:
				# * Physically define registers r0-r6 to make them reserved.
				# * Load a value in a high register which gets allocated to r12.
				# * Load a value in a low register which gets allocated to the remaining
				# register r7.
				# * Use INLINEASM that has r0-r6 and the value currently in r7 as inputs but
				# marks r12 as clobbered. The allocator must store the current value in r12 to
				# the stack. This requires the value in r7 to be also spilled and then
				# reloaded.

				--- |
				; ModuleID = 'test.ll'
				source_filename = "test.c"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumbv6m-none--eabi"

				define dso_local i32 @constraint_h() {
				entry:
				%i0 = alloca i32, align 4
				%i1 = alloca i32, align 4
				%i2 = alloca i32, align 4
				%i3 = alloca i32, align 4
				%i4 = alloca i32, align 4
				%i5 = alloca i32, align 4
				%i6 = alloca i32, align 4
				%ih = alloca i32, align 4
				%i7 = alloca i32, align 4
				%0 = load i32, i32* %i0, align 4
				%1 = load i32, i32* %i1, align 4
				%2 = load i32, i32* %i2, align 4
				%3 = load i32, i32* %i3, align 4
				%4 = load i32, i32* %i4, align 4
				%5 = load i32, i32* %i5, align 4
				%6 = load i32, i32* %i6, align 4
				%7 = load i32, i32* %ih, align 4
				%8 = load i32, i32* %i7, align 4
				call void asm sideeffect "@ $0 $1 $2 $3 $4 $5 $6 $7", "{r0},{r1},{r2},{r3},{r4},{r5},{r6},r,~{r12}"(i32 %0, i32 %1, i32 %2, i32 %3, i32 %4, i32 %5, i32 %6, i32 %8)
				ret i32 %8
				}

				...
				---
				name: constraint_h
				tracksRegLiveness: true
				registers:
				- { id: 0, class: tgpr }
				- { id: 1, class: tgpr }
				- { id: 2, class: tgpr }
				- { id: 3, class: tgpr }
				- { id: 4, class: tgpr }
				- { id: 5, class: tgpr }
				- { id: 6, class: tgpr }
				- { id: 7, class: tgpr }
				- { id: 8, class: hgpr }
				- { id: 9, class: tgpr }
				stack:
				- { id: 0, name: i0, size: 4, alignment: 4, stack-id: 0, local-offset: -4 }
				- { id: 1, name: i1, size: 4, alignment: 4, stack-id: 0, local-offset: -8 }
				- { id: 2, name: i2, size: 4, alignment: 4, stack-id: 0, local-offset: -12 }
				- { id: 3, name: i3, size: 4, alignment: 4, stack-id: 0, local-offset: -16 }
				- { id: 4, name: i4, size: 4, alignment: 4, stack-id: 0, local-offset: -20 }
				- { id: 5, name: i5, size: 4, alignment: 4, stack-id: 0, local-offset: -24 }
				- { id: 6, name: i6, size: 4, alignment: 4, stack-id: 0, local-offset: -28 }
				- { id: 7, name: ih, size: 4, alignment: 4, stack-id: 0, local-offset: -32 }
				- { id: 8, name: i7, size: 4, alignment: 4, stack-id: 0, local-offset: -36 }
				body: |
				bb.0.entry:
				%0:tgpr = tLDRspi %stack.0.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				%1:tgpr = tLDRspi %stack.1.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				%2:tgpr = tLDRspi %stack.2.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				%3:tgpr = tLDRspi %stack.3.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				%4:tgpr = tLDRspi %stack.4.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				%5:tgpr = tLDRspi %stack.5.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				%6:tgpr = tLDRspi %stack.6.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				$r0 = COPY %0
				$r1 = COPY %1
				$r2 = COPY %2
				$r3 = COPY %3
				$r4 = COPY %4
				$r5 = COPY %5
				$r6 = COPY %6
				%7:tgpr = tLDRspi %stack.7.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				%8:hgpr = COPY %7
				%9:tgpr = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				INLINEASM &"@ $0 $1 $2 $3 $4 $5 $6 $7", 1, 9, $r0, 9, $r1, 9, $r2, 9, $r3, 9, $r4, 9, $r5, 9, $r6, 655369, %9, 12, implicit-def early-clobber $r12
				$r0 = COPY %8
				tBX_RET 14, $noreg, implicit $r0

				...

				# CHECK: bb.0.entry:
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.0.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				# CHECK-NEXT: renamable $r1 = tLDRspi %stack.1.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				# CHECK-NEXT: renamable $r2 = tLDRspi %stack.2.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				# CHECK-NEXT: renamable $r3 = tLDRspi %stack.3.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				# CHECK-NEXT: renamable $r4 = tLDRspi %stack.4.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				# CHECK-NEXT: renamable $r5 = tLDRspi %stack.5.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				# CHECK-NEXT: renamable $r6 = tLDRspi %stack.6.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				# CHECK-NEXT: renamable $r7 = tLDRspi %stack.7.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				# CHECK-NEXT: renamable $r12 = COPY killed renamable $r7
				# CHECK-NEXT: renamable $r7 = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				# CHECK-NEXT: tSTRspi killed $r7, %stack.10, 0, 14, $noreg :: (store 4 into %stack.10)
				# CHECK-NEXT: renamable $r7 = COPY killed $r12
				# CHECK-NEXT: tSTRspi killed renamable $r7, %stack.9, 0, 14, $noreg :: (store 4 into %stack.9)
				# CHECK-NEXT: $r7 = tLDRspi %stack.10, 0, 14, $noreg :: (load 4 from %stack.10)
				# CHECK-NEXT: INLINEASM &"@ $0 $1 $2 $3 $4 $5 $6 $7", 1, 9, killed $r0, 9, killed $r1, 9, killed $r2, 9, killed $r3, 9, killed $r4, 9, killed $r5, 9, killed $r6, 655369, killed renamable $r7, 12, implicit-def early-clobber $r12
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.9, 0, 14, $noreg :: (load 4 from %stack.9)
				# CHECK-NEXT: $r12 = COPY killed renamable $r0
				# CHECK-NEXT: $r0 = COPY killed renamable $r12
				# CHECK-NEXT: tBX_RET 14, $noreg, implicit killed $r0

test/CodeGen/Thumb/hgpr-spill-fast-tsave2.mir

This file was added.

				# RUN: llc -run-pass regallocfast %s -o - | FileCheck %s

				# Check that when storing a high register to a stack slot using an intermediary,
				# Fast Register Allocator is able to insert a temporary spill of a register that
				# it needs for the intermediary if no such register can be normally allocated.
				#
				# The test operates as follows:
				# * Physically define registers r0-r6 to make them reserved.
				# * Load a value in a high register which gets allocated to r12.
				# * Physically define the remaining low register r7 to make it reserved.
				# * Use INLINEASM that has r0-r7 as inputs but marks r12 as clobbered. The
				# allocator must store the current value in r12 to the stack. This requires a
				# temporary spill of one of the low registers that are already used by
				# INLINEASM.

				--- |
				; ModuleID = 'test.ll'
				source_filename = "test.c"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumbv6m-none--eabi"

				define dso_local i32 @constraint_h() {
				entry:
				%i0 = alloca i32, align 4
				%i1 = alloca i32, align 4
				%i2 = alloca i32, align 4
				%i3 = alloca i32, align 4
				%i4 = alloca i32, align 4
				%i5 = alloca i32, align 4
				%i6 = alloca i32, align 4
				%ih = alloca i32, align 4
				%i7 = alloca i32, align 4
				%0 = load i32, i32* %i0, align 4
				%1 = load i32, i32* %i1, align 4
				%2 = load i32, i32* %i2, align 4
				%3 = load i32, i32* %i3, align 4
				%4 = load i32, i32* %i4, align 4
				%5 = load i32, i32* %i5, align 4
				%6 = load i32, i32* %i6, align 4
				%7 = load i32, i32* %ih, align 4
				%8 = load i32, i32* %i7, align 4
				call void asm sideeffect "@ $0 $1 $2 $3 $4 $5 $6 $7", "{r0},{r1},{r2},{r3},{r4},{r5},{r6},{r7},~{r12}"(i32 %0, i32 %1, i32 %2, i32 %3, i32 %4, i32 %5, i32 %6, i32 %8)
				ret i32 %8
				}

				...
				---
				name: constraint_h
				tracksRegLiveness: true
				registers:
				- { id: 0, class: tgpr }
				- { id: 1, class: tgpr }
				- { id: 2, class: tgpr }
				- { id: 3, class: tgpr }
				- { id: 4, class: tgpr }
				- { id: 5, class: tgpr }
				- { id: 6, class: tgpr }
				- { id: 7, class: tgpr }
				- { id: 8, class: hgpr }
				- { id: 9, class: tgpr }
				stack:
				- { id: 0, name: i0, size: 4, alignment: 4, stack-id: 0, local-offset: -4 }
				- { id: 1, name: i1, size: 4, alignment: 4, stack-id: 0, local-offset: -8 }
				- { id: 2, name: i2, size: 4, alignment: 4, stack-id: 0, local-offset: -12 }
				- { id: 3, name: i3, size: 4, alignment: 4, stack-id: 0, local-offset: -16 }
				- { id: 4, name: i4, size: 4, alignment: 4, stack-id: 0, local-offset: -20 }
				- { id: 5, name: i5, size: 4, alignment: 4, stack-id: 0, local-offset: -24 }
				- { id: 6, name: i6, size: 4, alignment: 4, stack-id: 0, local-offset: -28 }
				- { id: 7, name: ih, size: 4, alignment: 4, stack-id: 0, local-offset: -32 }
				- { id: 8, name: i7, size: 4, alignment: 4, stack-id: 0, local-offset: -36 }
				body: |
				bb.0.entry:
				%0:tgpr = tLDRspi %stack.0.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				%1:tgpr = tLDRspi %stack.1.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				%2:tgpr = tLDRspi %stack.2.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				%3:tgpr = tLDRspi %stack.3.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				%4:tgpr = tLDRspi %stack.4.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				%5:tgpr = tLDRspi %stack.5.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				%6:tgpr = tLDRspi %stack.6.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				$r0 = COPY %0
				$r1 = COPY %1
				$r2 = COPY %2
				$r3 = COPY %3
				$r4 = COPY %4
				$r5 = COPY %5
				$r6 = COPY %6
				%7:tgpr = tLDRspi %stack.7.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				%8:hgpr = COPY %7
				%9:tgpr = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				$r7 = COPY %9
				INLINEASM &"@ $0 $1 $2 $3 $4 $5 $6 $7", 1, 9, $r0, 9, $r1, 9, $r2, 9, $r3, 9, $r4, 9, $r5, 9, $r6, 9, $r7, 12, implicit-def early-clobber $r12
				$r0 = COPY %8
				tBX_RET 14, $noreg, implicit $r0

				...

				# CHECK: bb.0.entry:
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.0.i0, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i0)
				# CHECK-NEXT: renamable $r1 = tLDRspi %stack.1.i1, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i1)
				# CHECK-NEXT: renamable $r2 = tLDRspi %stack.2.i2, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i2)
				# CHECK-NEXT: renamable $r3 = tLDRspi %stack.3.i3, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i3)
				# CHECK-NEXT: renamable $r4 = tLDRspi %stack.4.i4, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i4)
				# CHECK-NEXT: renamable $r5 = tLDRspi %stack.5.i5, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i5)
				# CHECK-NEXT: renamable $r6 = tLDRspi %stack.6.i6, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i6)
				# CHECK-NEXT: renamable $r7 = tLDRspi %stack.7.ih, 0, 14, $noreg :: (dereferenceable load 4 from %ir.ih)
				# CHECK-NEXT: renamable $r12 = COPY killed renamable $r7
				# CHECK-NEXT: renamable $r7 = tLDRspi %stack.8.i7, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i7)
				# CHECK-NEXT: tSTRspi killed $r0, %stack.10, 0, 14, $noreg :: (store 4 into %stack.10)
				# CHECK-NEXT: renamable $r0 = COPY killed $r12
				# CHECK-NEXT: tSTRspi killed renamable $r0, %stack.9, 0, 14, $noreg :: (store 4 into %stack.9)
				# CHECK-NEXT: $r0 = tLDRspi %stack.10, 0, 14, $noreg :: (load 4 from %stack.10)
				# CHECK-NEXT: INLINEASM &"@ $0 $1 $2 $3 $4 $5 $6 $7", 1, 9, killed $r0, 9, killed $r1, 9, killed $r2, 9, killed $r3, 9, killed $r4, 9, killed $r5, 9, killed $r6, 9, killed $r7, 12, implicit-def early-clobber $r12
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.9, 0, 14, $noreg :: (load 4 from %stack.9)
				# CHECK-NEXT: $r12 = COPY killed renamable $r0
				# CHECK-NEXT: $r0 = COPY killed renamable $r12
				# CHECK-NEXT: tBX_RET 14, $noreg, implicit killed $r0

test/CodeGen/Thumb/hgpr-spill-fast.mir

This file was added.

				# RUN: llc -run-pass regallocfast %s -o - | FileCheck %s

				# This test examines register allocation and spilling of high registers in
				# Thumb1 with Fast Register Allocator. The test uses inline assembler that
				# requests an input variable to be loaded in a high register but at the same
				# time has r12 marked as clobbered. The allocator initially satisfies the load
				# request by selecting r12 but then needs to spill this register when it reaches
				# the INLINEASM instruction and notices its clobber definition.
				#
				# The test checks that Fast Register Allocator implements the following:
				# * A high register in Thumb1 is spilled by inserting a copy to a low register
				# and then saving that.
				# * A high register in Thumb1 is restored by inserting a load to a low register
				# and then a copy to the high register.

				--- |
				; ModuleID = 'test.ll'
				source_filename = "test.c"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumbv6m-none--eabi"

				define dso_local void @constraint_h() {
				entry:
				%i = alloca i32, align 4
				%0 = load i32, i32* %i, align 4
				call void asm sideeffect "@ $0", "h,~{r12}"(i32 %0)
				ret void
				}

				...
				---
				name: constraint_h
				tracksRegLiveness: true
				registers:
				- { id: 0, class: hgpr }
				- { id: 1, class: tgpr }
				stack:
				- { id: 0, name: i, size: 4, alignment: 4, stack-id: 0, local-offset: -4 }
				body: |
				bb.0.entry:
				%1:tgpr = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
				%0:hgpr = COPY %1
				INLINEASM &"@ $0", 1, 589833, %0, 12, implicit-def early-clobber $r12
				tBX_RET 14, $noreg

				...

				# CHECK: bb.0.entry:
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.0.i, 0, 14, $noreg :: (dereferenceable load 4 from %ir.i)
				# CHECK-NEXT: renamable $r12 = COPY killed renamable $r0
				# CHECK-NEXT: renamable $r0 = COPY killed $r12
				# CHECK-NEXT: tSTRspi killed renamable $r0, %stack.1, 0, 14, $noreg :: (store 4 into %stack.1)
				# CHECK-NEXT: renamable $r0 = tLDRspi %stack.1, 0, 14, $noreg :: (load 4 from %stack.1)
				# CHECK-NEXT: $r8 = COPY killed renamable $r0
				# CHECK-NEXT: INLINEASM &"@ $0", 1, 589833, killed renamable $r8, 12, implicit-def early-clobber $r12
				# CHECK-NEXT: tBX_RET 14, $noreg
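
The two rules this test checks (spill a high register through a low-register copy, restore it through a low-register load followed by a copy) can be modelled outside of LLVM. The following is only a minimal sketch: `isLowReg`, `regName`, `emitSpill` and `emitReload` are hypothetical helpers written for illustration, they are not the functions touched by this patch, and the emitted strings are simplified pseudo-MIR rather than real `llc` output.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Assumption for this sketch: registers are plain numbers and r0-r7 are the
// low registers that Thumb1 SP-relative loads/stores can address directly.
bool isLowReg(unsigned Reg) { return Reg <= 7; }

std::string regName(unsigned Reg) { return "$r" + std::to_string(Reg); }

// Store Reg into stack slot Slot. A high register is first copied into the
// low register Scratch, and the store is then done from Scratch.
std::vector<std::string> emitSpill(unsigned Reg, unsigned Scratch, int Slot) {
  std::vector<std::string> Out;
  unsigned Src = Reg;
  if (!isLowReg(Reg)) {
    assert(isLowReg(Scratch) && "intermediary must be a low register");
    Out.push_back(regName(Scratch) + " = COPY " + regName(Reg));
    Src = Scratch;
  }
  Out.push_back("tSTRspi " + regName(Src) + ", %stack." + std::to_string(Slot));
  return Out;
}

// Load stack slot Slot into Reg. A high register is loaded into the low
// register Scratch first and then copied into place.
std::vector<std::string> emitReload(unsigned Reg, unsigned Scratch, int Slot) {
  std::vector<std::string> Out;
  unsigned Dst = Reg;
  if (!isLowReg(Reg)) {
    assert(isLowReg(Scratch) && "intermediary must be a low register");
    Dst = Scratch;
  }
  Out.push_back(regName(Dst) + " = tLDRspi %stack." + std::to_string(Slot));
  if (Dst != Reg)
    Out.push_back(regName(Reg) + " = COPY " + regName(Dst));
  return Out;
}
```

Calling `emitSpill(12, 0, 1)` followed by `emitReload(8, 0, 1)` yields the same copy/store and load/copy shape as the CHECK lines above, without the `killed`/`renamable` flags and memory operands that the real allocator output carries.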

Diff 172930
