This is an archive of the discontinued LLVM Phabricator instance.

Differential D34815

[Power9] Spill gprs to vector registers rather than stack
Closed · Public

Authored by syzaara on Jun 29 2017, 9:12 AM.

Details

Reviewers

kbarton
nemanjai
lei
sfertile
jtony
inouehrs
stefanp
gyiu
hfinkel
echristo

Commits

rGfcd9697d72ee: [Power9] Spill gprs to vector registers rather than stack
rL313886: [Power9] Spill gprs to vector registers rather than stack

Summary

This patch updates register allocation to enable spilling GPRs to vector registers rather than to the stack. It adds a new register class, GPFPRC, which is a superclass of G8RC and VSFRC; getLargestLegalSuperClass then returns GPFPRC for an input of G8RC. The patch also adds post-RA pseudo-instructions (VSRSPILL_LD, VSRSPILL_ST), used in LoadRegFromStackSlot and StoreRegToStackSlot, for spilling a register of the new class. These are expanded after register allocation to either a scalar or a vector load or store, depending on the register that was allocated.
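The expansion step described in the summary can be pictured with a small, self-contained model. This is a hedged sketch rather than the patch's code: `RegKind`, `Pseudo`, and `expandSpillPseudo` are hypothetical names invented for illustration, and the returned strings merely echo opcode names that appear in the diff. It only shows the idea that a single spill pseudo turns into a scalar or a vector memory instruction once the allocator has picked a register.

```cpp
#include <string>

// Hypothetical stand-ins for the two kinds of physical register that a
// value of the combined GPR/VSR spill class can end up in.
enum class RegKind { GPR, VSR };

// Hypothetical pseudo-instructions: one for reloads, one for spills.
enum class Pseudo { SpillLoad, SpillStore };

// Pick the concrete memory instruction once the assigned register is known.
// The mnemonics mirror opcode names used in the diff (LD/STD for GPRs,
// DFLOADf64/DFSTOREf64 for VSRs); this is a toy model only.
inline std::string expandSpillPseudo(Pseudo P, RegKind Assigned) {
  if (P == Pseudo::SpillLoad)
    return Assigned == RegKind::VSR ? "DFLOADf64" : "LD";
  return Assigned == RegKind::VSR ? "DFSTOREf64" : "STD";
}
```

In this model the decision is deferred until after allocation on purpose: before that point the value may still land in either kind of register, so a single pseudo keeps both options open.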

Diff Detail

Repository: rL LLVM

Event Timeline

syzaara created this revision. Jun 29 2017, 9:12 AM

Herald added a subscriber: qcolombet. Jun 29 2017, 9:12 AM

Looks interesting.
This patch potentially increases the number of VSR saves/restores in the function prologue/epilogue (depending on which VSR is selected for spilling). Is my understanding correct?

lib/Target/PowerPC/PPCInstrInfo.cpp
Line 52 (on Diff #104661): I feel this name is somewhat misleading. The MTVSR instruction may be used for other purposes. NumGPRtoVSRSpill or something?
test/CodeGen/PowerPC/gpr-vsr-spill2.ll
Line 25 (on Diff #104661): What's the intention of this complicated test case without spills to VSR?

inouehrs added inline comments. Jun 29 2017, 10:55 AM

test/CodeGen/PowerPC/gpr-vsr-spill.ll
Line 19 (on Diff #104661): Actually, I don't follow why we need a spill here. The inline asm clobbers all GPRs but r30 and r31. So why don't we just use r30 and r31 for %a and %b?

syzaara added inline comments. Jun 29 2017, 11:10 AM

test/CodeGen/PowerPC/gpr-vsr-spill2.ll
Line 25 (on Diff #104661): This case shows how a spill of the new reg class is handled. Here we spilled a GPR to GPFPR where the new reg was also a GPR. We then needed to spill the new GPFPR using either a scalar store or a vector store, depending on the allocated register.

syzaara added inline comments. Jun 29 2017, 11:31 AM

test/CodeGen/PowerPC/gpr-vsr-spill.ll
Line 19 (on Diff #104661): Yes, but we need a register to save the result of the add. The result register used for the add is r30, so one of the input parameters is spilled.

hfinkel added inline comments. Jul 3 2017, 8:20 PM

test/CodeGen/PowerPC/gpr-vsr-spill2.ll
Line 1 (on Diff #104661): Having this as an IR-level test seems fragile. Could you make this into a (simpler) MIR test that shows the behavior?

nemanjai added inline comments. Jul 7 2017, 2:02 PM

lib/Target/PowerPC/PPCInstrInfo.cpp
Line 1994 (on Diff #104661): Well, this is a pseudo that requires being `expandPostRAPseudo()`-ed. Wouldn't we want to say `return expandPostRAPseudo(MI)` here?
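The point about re-expansion can be illustrated with a toy model. This is a sketch under stated assumptions, not LLVM code: `ToyInstr`, `expandToy`, and every opcode string below are hypothetical. It shows why rewriting one pseudo into another pseudo and returning immediately would leave an unexpanded instruction behind, whereas calling the expander again resolves it to a concrete opcode.

```cpp
#include <string>

// Hypothetical instruction: just an opcode name plus one fact about the
// register it uses (whether it sits in the "lower" half of the VSX file).
struct ToyInstr {
  std::string Opcode;
  bool UsesLowerVSR;
};

// Expand pseudos until a concrete opcode is reached. Returns true when the
// instruction was changed, false when there was nothing to expand.
inline bool expandToy(ToyInstr &MI) {
  if (MI.Opcode == "SPILL_LD_PSEUDO") {
    // First stage: become a generic f64-load pseudo...
    MI.Opcode = "LOAD_F64_PSEUDO";
    // ...which is itself a pseudo, so run the expander again.
    return expandToy(MI);
  }
  if (MI.Opcode == "LOAD_F64_PSEUDO") {
    // Second stage: choose the real opcode from the register half.
    MI.Opcode = MI.UsesLowerVSR ? "LOWER_LOAD" : "UPPER_LOAD";
    return true;
  }
  return false;
}
```

The second stage mirrors the shape of the existing `DFSTOREf64` handling shown in the diff, where the opcode is picked from `LowerOpcode` or `UpperOpcode` according to the register range.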

Overall, I like the patch. Seems quite nice and simple. However, I'm really not a fan of the naming convention. It is not clear to me why someone would be expected to make the connection between "GPFPRC" and "VSRSPILL". I think those should use the same base name to make the connection clear. Furthermore, I don't really think you should convey what registers are in the register class, but what the register class is used for. Perhaps the class and the related artifacts should be something like SPILLTOVSRRC and SPILLTOVSR_LD, etc. Perhaps other reviewers can chime in here as well.

Also, I think an important opportunity is lost since we don't do this for GPRRC. Of course, it doesn't have to be done in this patch, but I think a comment including a FIXME indicating this limitation is in order. Then if this turns out to be a performance win, we can follow this up with a patch that handles the 32-bit registers as well.

Only spill to volatile VSRs, as spilling to non-volatile VSRs adds save/restore code to the prologue/epilogue and leads to performance degradation.
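The restriction to volatile VSRs shows up in the diff as a register-class definition that subtracts `F14`..`F31` and `VF20`..`VF31` from `VSFRC`. A minimal sketch of that set arithmetic follows; the function name `volatileSpillVSRs` is hypothetical, registers are modelled as plain strings, and the G8RC half of the class is left out for brevity.

```cpp
#include <string>
#include <vector>

// Toy model of the VSX part of the register-class definition in the diff:
//   (sub VSFRC, VF20..VF31, F14..F31)
// VSFRC is taken to be F0..F31 followed by VF0..VF31.
inline std::vector<std::string> volatileSpillVSRs() {
  std::vector<std::string> Regs;
  // F0..F31 minus the excluded range F14..F31.
  for (int I = 0; I < 32; ++I)
    if (I < 14)
      Regs.push_back("F" + std::to_string(I));
  // VF0..VF31 minus the excluded range VF20..VF31.
  for (int I = 0; I < 32; ++I)
    if (I < 20)
      Regs.push_back("VF" + std::to_string(I));
  return Regs;
}
```

Under this model 34 scalar VSX registers remain as spill targets, none of which would need an extra save/restore in the prologue/epilogue of the function doing the spilling.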

syzaara added inline comments. Jul 31 2017, 1:11 PM

test/CodeGen/PowerPC/gpr-vsr-spill2.ll
Line 1 (on Diff #104661): I tried to create an MIR case using this, but the limitation with MIR tests identified in https://reviews.llvm.org/D33562, with MachineFunctionInfo not being saved/dumped as part of emitting .mir, leads to machine verifier errors. I tried to change the global vars to local vars to get around this limitation. However, doing that no longer reproduces the narrowed case, so I will leave this as an IR test.

Other than a few minor inline comments, this LGTM.

Perhaps some of the other reviewers want to chime in on this. Otherwise please address those nits and commit.

lib/Target/PowerPC/PPCInstrInfo.cpp
Line 934 (on Diff #108980): I think for most (all?) other conditions, we have the source first. Please stick to that convention here as well.
Line 2036 (on Diff #108980): Just a nit. The register you're spilling is the source and the stack slot you're spilling it to is the target. So calling it `TargetReg` is a bit misleading when it's a store. :)
lib/Target/PowerPC/PPCRegisterInfo.cpp
Line 54 (on Diff #108980): Nit: it's not actually called `gp8rc` but `g8rc` if I remember correctly.
Line 341 (on Diff #108980): `// For Power9 we allow the user to enable GPR to vector spills.` Since we don't currently enable it by default even on Power9.
Line 344 (on Diff #108980): Please add a check for ELFv2 ABI. We are allowing spills only to the volatile VSR's, so we want to enable this only on the ABI where the VSR's we've selected are actually volatile.
lib/Target/PowerPC/PPCRegisterInfo.td
Line 308 (on Diff #108980): `// Allow spilling GPR's into caller-saved VSR's.`
test/CodeGen/PowerPC/gpr-vsr-spill2.ll
Line 1 (on Diff #108980): As implemented, this test case doesn't really test anything meaningful. It really just tests that there's a reg-to-reg copy (implemented as a move-register) followed by a spill of the target register. The two could be separated by an arbitrary amount of code (including redefinition of the register). Unless you can add more meaningful testing to this complicated test case, I would simply get rid of it.
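The conditions requested in the inline comments above (opt-in only, Power9 only, ELFv2 only) combine into a single predicate. The sketch below is a standalone model of that gating, loosely following the shape of the `getLargestLegalSuperClass` change in the diff; `ToyTarget`, `ToyRegClass`, and `inflatesToSpillClass` are hypothetical names, and the boolean fields stand in for the real subtarget and command-line queries.

```cpp
// Hypothetical register classes: only the 64-bit GPR class is inflated.
enum class ToyRegClass { GPRC, G8RC, Other };

// Hypothetical snapshot of the target properties the diff consults.
struct ToyTarget {
  bool HasVSX;               // models Subtarget.hasVSX()
  bool IsELFv2ABI;           // models TM.isELFv2ABI()
  bool HasP9Vector;          // models Subtarget.hasP9Vector()
  bool EnableGPRToVecSpills; // models -ppc-enable-gpr-to-vsr-spills
};

// True when a register class would be widened to the GPR/VSR spill class.
inline bool inflatesToSpillClass(const ToyTarget &T, ToyRegClass RC) {
  return T.HasVSX && T.IsELFv2ABI && T.HasP9Vector &&
         T.EnableGPRToVecSpills && RC == ToyRegClass::G8RC;
}
```

Because the option defaults to off in the diff, the model returns false for every class unless the flag is set explicitly, which matches the reviewer's wording that the user is allowed to enable the feature rather than getting it by default.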

Forgot to accept :).

This revision is now accepted and ready to land. Sep 18 2017, 6:30 AM

Addressed review comments.

Closed by commit rL313886: [Power9] Spill gprs to vector registers rather than stack (authored by syzaara). Sep 21 2017, 9:14 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

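Path                                               Size
llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp     71 lines
llvm/trunk/lib/Target/PowerPC/PPCInstrVSX.td       24 lines
llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp  22 lines
llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.td   5 lines
llvm/trunk/test/CodeGen/PowerPC/gpr-vsr-spill.ll   24 lines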

Diff 116204

llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.cpp

[40 lines not shown]
 using namespace llvm;

 #define DEBUG_TYPE "ppc-instr-info"

 #define GET_INSTRMAP_INFO
 #define GET_INSTRINFO_CTOR_DTOR
 #include "PPCGenInstrInfo.inc"

+STATISTIC(NumStoreSPILLVSRRCAsVec,
+          "Number of spillvsrrc spilled to stack as vec");
+STATISTIC(NumStoreSPILLVSRRCAsGpr,
+          "Number of spillvsrrc spilled to stack as gpr");
+STATISTIC(NumGPRtoVSRSpill, "Number of gpr spills to spillvsrrc");
+
 static cl::
 opt<bool> DisableCTRLoopAnal("disable-ppc-ctrloop-analysis", cl::Hidden,
             cl::desc("Disable analysis for CTR loops"));

 static cl::opt<bool> DisableCmpOpt("disable-ppc-cmp-opt",
 cl::desc("Disable compare instruction optimization"), cl::Hidden);

 static cl::opt<bool> VSXSelfCopyCrash("crash-on-ppc-vsx-self-copy",
[218 lines not shown] unsigned PPCInstrInfo::isLoadFromStackSlot(const MachineInstr &MI,
   case PPC::RESTORE_CRBIT:
   case PPC::LVX:
   case PPC::LXVD2X:
   case PPC::LXVX:
   case PPC::QVLFDX:
   case PPC::QVLFSXs:
   case PPC::QVLFDXb:
   case PPC::RESTORE_VRSAVE:
+  case PPC::SPILLTOVSR_LD:
     // Check for the operands added by addFrameReference (the immediate is the
     // offset which defaults to 0).
     if (MI.getOperand(1).isImm() && !MI.getOperand(1).getImm() &&
         MI.getOperand(2).isFI()) {
       FrameIndex = MI.getOperand(2).getIndex();
       return MI.getOperand(0).getReg();
     }
     break;
[37 lines not shown] unsigned PPCInstrInfo::isStoreToStackSlot(const MachineInstr &MI,
   case PPC::SPILL_CRBIT:
   case PPC::STVX:
   case PPC::STXVD2X:
   case PPC::STXVX:
   case PPC::QVSTFDX:
   case PPC::QVSTFSXs:
   case PPC::QVSTFDXb:
   case PPC::SPILL_VRSAVE:
+  case PPC::SPILLTOVSR_ST:
     // Check for the operands added by addFrameReference (the immediate is the
     // offset which defaults to 0).
     if (MI.getOperand(1).isImm() && !MI.getOperand(1).getImm() &&
         MI.getOperand(2).isFI()) {
       FrameIndex = MI.getOperand(2).getIndex();
       return MI.getOperand(0).getReg();
     }
     break;
[568 lines not shown] if (PPC::CRBITRCRegClass.contains(SrcReg) &&
     BuildMI(MBB, I, DL, get(PPC::MFOCRF8), DestReg).addReg(SrcReg);
     getKillRegState(KillSrc);
     return;
   } else if (PPC::CRRCRegClass.contains(SrcReg) &&
              PPC::GPRCRegClass.contains(DestReg)) {
     BuildMI(MBB, I, DL, get(PPC::MFOCRF), DestReg).addReg(SrcReg);
     getKillRegState(KillSrc);
     return;
+  } else if (PPC::G8RCRegClass.contains(SrcReg) &&
+             PPC::VSFRCRegClass.contains(DestReg)) {
+    BuildMI(MBB, I, DL, get(PPC::MTVSRD), DestReg).addReg(SrcReg);
+    NumGPRtoVSRSpill++;
+    getKillRegState(KillSrc);
+    return;
+  } else if (PPC::VSFRCRegClass.contains(SrcReg) &&
+             PPC::G8RCRegClass.contains(DestReg)) {
+    BuildMI(MBB, I, DL, get(PPC::MFVSRD), DestReg).addReg(SrcReg);
+    getKillRegState(KillSrc);
+    return;
   }

   unsigned Opc;
   if (PPC::GPRCRegClass.contains(DestReg, SrcReg))
     Opc = PPC::OR;
   else if (PPC::G8RCRegClass.contains(DestReg, SrcReg))
     Opc = PPC::OR8;
   else if (PPC::F4RCRegClass.contains(DestReg, SrcReg))
     Opc = PPC::FMR;
[127 lines not shown] NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::QVSTFSXs))
                                        FrameIdx));
     NonRI = true;
   } else if (PPC::QBRCRegClass.hasSubClassEq(RC)) {
     NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::QVSTFDXb))
                                        .addReg(SrcReg,
                                                getKillRegState(isKill)),
                                        FrameIdx));
     NonRI = true;
+  } else if (PPC::SPILLTOVSRRCRegClass.hasSubClassEq(RC)) {
+    NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::SPILLTOVSR_ST))
+                                       .addReg(SrcReg,
+                                               getKillRegState(isKill)),
+                                       FrameIdx));
   } else {
     llvm_unreachable("Unknown regclass!");
   }

   return false;
 }

 void
[105 lines not shown] bool PPCInstrInfo::LoadRegFromStackSlot(MachineFunction &MF, const DebugLoc &DL,
   } else if (PPC::QSRCRegClass.hasSubClassEq(RC)) {
     NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::QVLFSXs), DestReg),
                                        FrameIdx));
     NonRI = true;
   } else if (PPC::QBRCRegClass.hasSubClassEq(RC)) {
     NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::QVLFDXb), DestReg),
                                        FrameIdx));
     NonRI = true;
+  } else if (PPC::SPILLTOVSRRCRegClass.hasSubClassEq(RC)) {
+    NewMIs.push_back(addFrameReference(BuildMI(MF, DL, get(PPC::SPILLTOVSR_LD),
+                                               DestReg), FrameIdx));
   } else {
     llvm_unreachable("Unknown regclass!");
   }

   return false;
 }

 void
[797 lines not shown] case PPC::DFSTOREf64: {
     if ((TargetReg >= PPC::F0 && TargetReg <= PPC::F31) ||
         (TargetReg >= PPC::VSL0 && TargetReg <= PPC::VSL31))
       Opcode = LowerOpcode;
     else
       Opcode = UpperOpcode;
     MI.setDesc(get(Opcode));
     return true;
   }
+  case PPC::SPILLTOVSR_LD: {
+    unsigned TargetReg = MI.getOperand(0).getReg();
+    if (PPC::VSFRCRegClass.contains(TargetReg)) {
+      MI.setDesc(get(PPC::DFLOADf64));
+      return expandPostRAPseudo(MI);
+    }
+    else
+      MI.setDesc(get(PPC::LD));
+    return true;
+  }
+  case PPC::SPILLTOVSR_ST: {
+    unsigned SrcReg = MI.getOperand(0).getReg();
+    if (PPC::VSFRCRegClass.contains(SrcReg)) {
+      NumStoreSPILLVSRRCAsVec++;
+      MI.setDesc(get(PPC::DFSTOREf64));
+      return expandPostRAPseudo(MI);
+    } else {
+      NumStoreSPILLVSRRCAsGpr++;
+      MI.setDesc(get(PPC::STD));
+    }
+    return true;
+  }
+  case PPC::SPILLTOVSR_LDX: {
+    unsigned TargetReg = MI.getOperand(0).getReg();
+    if (PPC::VSFRCRegClass.contains(TargetReg))
+      MI.setDesc(get(PPC::LXSDX));
+    else
+      MI.setDesc(get(PPC::LDX));
+    return true;
+  }
+  case PPC::SPILLTOVSR_STX: {
+    unsigned SrcReg = MI.getOperand(0).getReg();
+    if (PPC::VSFRCRegClass.contains(SrcReg)) {
+      NumStoreSPILLVSRRCAsVec++;
+      MI.setDesc(get(PPC::STXSDX));
+    } else {
+      NumStoreSPILLVSRRCAsGpr++;
+      MI.setDesc(get(PPC::STDX));
+    }
+    return true;
+  }
+
   case PPC::CFENCE8: {
     auto Val = MI.getOperand(0).getReg();
     BuildMI(MBB, MI, DL, get(PPC::CMPD), PPC::CR7).addReg(Val).addReg(Val);
     BuildMI(MBB, MI, DL, get(PPC::CTRL_DEP))
         .addImm(PPC::PRED_NE_MINUS)
         .addReg(PPC::CR7)
         .addImm(1);
     MI.setDesc(get(PPC::ISYNC));
[17 lines not shown]

llvm/trunk/lib/Target/PowerPC/PPCInstrVSX.td

[41 lines not shown]

 def PPCRegVSSRCAsmOperand : AsmOperandClass {
   let Name = "RegVSSRC"; let PredicateMethod = "isVSRegNumber";
 }
 def vssrc : RegisterOperand<VSSRC> {
   let ParserMatchClass = PPCRegVSSRCAsmOperand;
 }

+def PPCRegSPILLTOVSRRCAsmOperand : AsmOperandClass {
+  let Name = "RegSPILLTOVSRRC"; let PredicateMethod = "isVSRegNumber";
+}
+
+def spilltovsrrc : RegisterOperand<SPILLTOVSRRC> {
+  let ParserMatchClass = PPCRegSPILLTOVSRRCAsmOperand;
+}
 // Little-endian-specific nodes.
 def SDT_PPClxvd2x : SDTypeProfile<1, 1, [
   SDTCisVT<0, v2f64>, SDTCisPtrTy<1>
 ]>;
 def SDT_PPCstxvd2x : SDTypeProfile<0, 2, [
   SDTCisVT<0, v2f64>, SDTCisPtrTy<1>
 ]>;
 def SDT_PPCxxswapd : SDTypeProfile<1, 1, [
[2,800 lines not shown] def DFSTOREf64 : Pseudo<(outs), (ins vsfrc:$XT, memrix:$dst),
                         [(store f64:$XT, ixaddr:$dst)]>;
   }
   def : Pat<(f64 (extloadf32 ixaddr:$src)),
             (COPY_TO_REGCLASS (DFLOADf32 ixaddr:$src), VSFRC)>;
   def : Pat<(f32 (fpround (extloadf32 ixaddr:$src))),
             (f32 (DFLOADf32 ixaddr:$src))>;
 } // end HasP9Vector, AddedComplexity

+let Predicates = [HasP9Vector] in {
+  let isPseudo = 1 in {
+    let mayStore = 1 in {
+      def SPILLTOVSR_STX : Pseudo<(outs), (ins spilltovsrrc:$XT, memrr:$dst),
+                                  "#SPILLTOVSR_STX", []>;
+      def SPILLTOVSR_ST : Pseudo<(outs), (ins spilltovsrrc:$XT, memrix:$dst),
+                                 "#SPILLTOVSR_ST", []>;
+    }
+    let mayLoad = 1 in {
+      def SPILLTOVSR_LDX : Pseudo<(outs spilltovsrrc:$XT), (ins memrr:$src),
+                                  "#SPILLTOVSR_LDX", []>;
+      def SPILLTOVSR_LD : Pseudo<(outs spilltovsrrc:$XT), (ins memrix:$src),
+                                 "#SPILLTOVSR_LD", []>;
+
+    }
+  }
+}
 // Integer extend helper dags 32 -> 64
 def AnyExts {
   dag A = (INSERT_SUBREG (i64 (IMPLICIT_DEF)), $A, sub_32);
   dag B = (INSERT_SUBREG (i64 (IMPLICIT_DEF)), $B, sub_32);
   dag C = (INSERT_SUBREG (i64 (IMPLICIT_DEF)), $C, sub_32);
   dag D = (INSERT_SUBREG (i64 (IMPLICIT_DEF)), $D, sub_32);
 }

[356 lines not shown]

llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.cpp

[15 lines not shown]
 #include "PPC.h"
 #include "PPCFrameLowering.h"
 #include "PPCInstrBuilder.h"
 #include "PPCMachineFunctionInfo.h"
 #include "PPCSubtarget.h"
 #include "PPCTargetMachine.h"
 #include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/Statistic.h"
 #include "llvm/CodeGen/MachineFrameInfo.h"
 #include "llvm/CodeGen/MachineFunction.h"
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineModuleInfo.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/RegisterScavenging.h"
 #include "llvm/IR/CallingConv.h"
 #include "llvm/IR/Constants.h"
[12 lines not shown]

 using namespace llvm;

 #define DEBUG_TYPE "reginfo"

 #define GET_REGINFO_TARGET_DESC
 #include "PPCGenRegisterInfo.inc"

+STATISTIC(InflateGPRC, "Number of gprc inputs for getLargestLegalClass");
+STATISTIC(InflateGP8RC, "Number of g8rc inputs for getLargestLegalClass");
+
 static cl::opt<bool>
 EnableBasePointer("ppc-use-base-pointer", cl::Hidden, cl::init(true),
          cl::desc("Enable use of a base pointer for complex stack frames"));

 static cl::opt<bool>
 AlwaysBasePointer("ppc-always-use-base-pointer", cl::Hidden, cl::init(false),
          cl::desc("Force the use of a base pointer in every function"));

+static cl::opt<bool>
+EnableGPRToVecSpills("ppc-enable-gpr-to-vsr-spills", cl::Hidden, cl::init(false),
+         cl::desc("Enable spills from gpr to vsr rather than stack"));
+
 PPCRegisterInfo::PPCRegisterInfo(const PPCTargetMachine &TM)
   : PPCGenRegisterInfo(TM.isPPC64() ? PPC::LR8 : PPC::LR,
                        TM.isPPC64() ? 0 : 1,
                        TM.isPPC64() ? 0 : 1),
     TM(TM) {
   ImmToIdxMap[PPC::LD] = PPC::LDX; ImmToIdxMap[PPC::STD] = PPC::STDX;
   ImmToIdxMap[PPC::LBZ] = PPC::LBZX; ImmToIdxMap[PPC::STB] = PPC::STBX;
   ImmToIdxMap[PPC::LHZ] = PPC::LHZX; ImmToIdxMap[PPC::LHA] = PPC::LHAX;
[9 lines not shown] PPCRegisterInfo::PPCRegisterInfo(const PPCTargetMachine &TM)
   ImmToIdxMap[PPC::LHZ8] = PPC::LHZX8; ImmToIdxMap[PPC::LWZ8] = PPC::LWZX8;
   ImmToIdxMap[PPC::STB8] = PPC::STBX8; ImmToIdxMap[PPC::STH8] = PPC::STHX8;
   ImmToIdxMap[PPC::STW8] = PPC::STWX8; ImmToIdxMap[PPC::STDU] = PPC::STDUX;
   ImmToIdxMap[PPC::ADDI8] = PPC::ADD8;

   // VSX
   ImmToIdxMap[PPC::DFLOADf32] = PPC::LXSSPX;
   ImmToIdxMap[PPC::DFLOADf64] = PPC::LXSDX;
+  ImmToIdxMap[PPC::SPILLTOVSR_LD] = PPC::SPILLTOVSR_LDX;
+  ImmToIdxMap[PPC::SPILLTOVSR_ST] = PPC::SPILLTOVSR_STX;
   ImmToIdxMap[PPC::DFSTOREf32] = PPC::STXSSPX;
   ImmToIdxMap[PPC::DFSTOREf64] = PPC::STXSDX;
   ImmToIdxMap[PPC::LXV] = PPC::LXVX;
   ImmToIdxMap[PPC::LXSD] = PPC::LXSDX;
   ImmToIdxMap[PPC::LXSSP] = PPC::LXSSPX;
   ImmToIdxMap[PPC::STXV] = PPC::STXVX;
   ImmToIdxMap[PPC::STXSD] = PPC::STXSDX;
   ImmToIdxMap[PPC::STXSSP] = PPC::STXSSPX;
[230 lines not shown]
 const TargetRegisterClass *
 PPCRegisterInfo::getLargestLegalSuperClass(const TargetRegisterClass *RC,
                                            const MachineFunction &MF) const {
   const PPCSubtarget &Subtarget = MF.getSubtarget<PPCSubtarget>();
   if (Subtarget.hasVSX()) {
     // With VSX, we can inflate various sub-register classes to the full VSX
     // register set.

+    // For Power9 we allow the user to enable GPR to vector spills.
+    // FIXME: Currently limited to spilling GP8RC. A follow on patch will add
+    // support to spill GPRC.
+    if (TM.isELFv2ABI()) {
+      if (Subtarget.hasP9Vector() && EnableGPRToVecSpills &&
+          RC == &PPC::G8RCRegClass) {
+        InflateGP8RC++;
+        return &PPC::SPILLTOVSRRCRegClass;
+      }
+      if (RC == &PPC::GPRCRegClass && EnableGPRToVecSpills)
+        InflateGPRC++;
+    }
     if (RC == &PPC::F8RCRegClass)
       return &PPC::VSFRCRegClass;
     else if (RC == &PPC::VRRCRegClass)
       return &PPC::VSRCRegClass;
     else if (RC == &PPC::F4RCRegClass && Subtarget.hasP8Vector())
       return &PPC::VSSRCRegClass;
   }

[752 lines not shown]

llvm/trunk/lib/Target/PowerPC/PPCRegisterInfo.td

[299 lines not shown]
 def VFRC : RegisterClass<"PPC", [f64], 64,
                          (add VF2, VF3, VF4, VF5, VF0, VF1, VF6, VF7,
                               VF8, VF9, VF10, VF11, VF12, VF13, VF14,
                               VF15, VF16, VF17, VF18, VF19, VF31, VF30,
                               VF29, VF28, VF27, VF26, VF25, VF24, VF23,
                               VF22, VF21, VF20)>;
 def VSFRC : RegisterClass<"PPC", [f64], 64, (add F8RC, VFRC)>;

+// Allow spilling GPR's into caller-saved VSR's.
+def SPILLTOVSRRC : RegisterClass<"PPC", [i64, f64], 64, (add G8RC, (sub VSFRC,
+                                 (sequence "VF%u", 31, 20),
+                                 (sequence "F%u", 31, 14)))>;
+
 // Register class for single precision scalars in VSX registers
 def VSSRC : RegisterClass<"PPC", [f32], 32, (add VSFRC)>;

 // For QPX
 def QFRC : RegisterClass<"PPC", [v4f64], 256, (add (sequence "QF%u", 0, 13),
                                                    (sequence "QF%u", 31, 14))>;
 def QSRC : RegisterClass<"PPC", [v4f32], 128, (add QFRC)>;
 def QBRC : RegisterClass<"PPC", [v4i1], 256, (add QFRC)> {
[37 lines not shown]

llvm/trunk/test/CodeGen/PowerPC/gpr-vsr-spill.ll

+; RUN: llc -verify-machineinstrs -mcpu=pwr9 -ppc-enable-gpr-to-vsr-spills < %s | FileCheck %s
+define signext i32 @foo(i32 signext %a, i32 signext %b) {
+entry:
+  %cmp = icmp slt i32 %a, %b
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then: ; preds = %entry
+  %0 = tail call i32 asm "add $0, $1, $2", "=r,r,r,~{r0},~{r1},~{r2},~{r3},~{r4},~{r5},~{r6},~{r7},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{r16},~{r17},~{r18},~{r19},~{r20},~{r21},~{r22},~{r23},~{r24},~{r25},~{r26},~{r27},~{r28},~{r29}"(i32 %a, i32 %b)
+  %mul = mul nsw i32 %0, %a
+  %add = add i32 %b, %a
+  %tmp = add i32 %add, %mul
+  br label %if.end
+
+if.end: ; preds = %if.then, %entry
+  %e.0 = phi i32 [ %tmp, %if.then ], [ undef, %entry ]
+  ret i32 %e.0
+; CHECK: @foo
+; CHECK: mr [[NEWREG:[0-9]+]], 3
+; CHECK: mtvsrd [[NEWREG2:[0-9]+]], 4
+; CHECK: mffprd [[REG1:[0-9]+]], [[NEWREG2]]
+; CHECK: add {{[0-9]+}}, [[NEWREG]], [[REG1]]
+; CHECK: mffprd [[REG2:[0-9]+]], [[NEWREG2]]
+; CHECK: add {{[0-9]+}}, [[REG2]], [[NEWREG]]
+}