This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Yet another approach to __tls_get_addr
ClosedPublic

Authored by wschmidt on Feb 2 2015, 11:31 AM.

Details

Reviewers

chandlerc
uweigand
echristo
kbarton
seurer
hfinkel
nemanjai

Summary

This patch is a third attempt to properly handle the local-dynamic and global-dynamic TLS models.

In my original implementation, calls to __tls_get_addr were hidden from view until the asm-printer phase, at which point the underlying branch-and-link instruction was created with proper relocations. This mostly worked well, but I used some repellent techniques to ensure that the TLS_GET_ADDR nodes at the SD and MI levels correctly received input from GPR3 and produced output into GPR3. This proved to work badly in the presence of multiple TLS variable accesses, with the copies to and from GPR3 being scheduled incorrectly and generally creating havoc.

In r221703, I addressed that problem by representing the calls to __tls_get_addr as true calls during instruction lowering. This had the advantage of removing all of the bad hacks and relying on the existing call machinery to properly glue the copies in place. It looked like this was going to be the right way to go.

However, as a side effect of the recent discovery of problems with linker optimizations for TLS, we discovered cases of suboptimal code generation with this strategy. The problem arises when __tls_get_addr is called more than once for the same address, which creates a CSE opportunity. In such cases MachineCSE commons the addis/addi instructions that set up the input value to __tls_get_addr, but does not common the calls themselves, because MachineCSE has no machinery to common idempotent calls. This is perfectly sensible, since such commoning would presumably be done at the IR level, and introducing calls in the back end isn't commonplace. In any case, we end up with two calls to __tls_get_addr when one would suffice, and that isn't good.
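
To illustrate the commoning behavior described above, here is a minimal, self-contained C++ sketch. It is a toy model, not LLVM's MachineCSE: `ToyInst`, `toyCSE`, and `twoTLSAccesses` are hypothetical names invented for this illustration. The only behavior it encodes is the one stated in the summary: pure instructions with the same opcode and operands are commoned, and anything marked as a call is never commoned.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not MachineInstr).
struct ToyInst {
  std::string Opcode;
  std::vector<int> Uses; // virtual registers read
  int Def;               // virtual register written
  bool IsCall;           // calls are never commoned in this model
};

// Single-block CSE: drop a non-call instruction when an earlier instruction
// has the same opcode and (already rewritten) operands.
std::vector<ToyInst> toyCSE(const std::vector<ToyInst> &Block) {
  std::map<std::pair<std::string, std::vector<int>>, int> Seen;
  std::map<int, int> Replaced; // dropped def -> surviving def
  std::vector<ToyInst> Out;
  for (ToyInst I : Block) {
    // Rewrite uses of registers whose defining instruction was dropped.
    for (int &U : I.Uses) {
      auto It = Replaced.find(U);
      if (It != Replaced.end())
        U = It->second;
    }
    if (!I.IsCall) {
      auto Key = std::make_pair(I.Opcode, I.Uses);
      auto Hit = Seen.find(Key);
      if (Hit != Seen.end()) {
        Replaced[I.Def] = Hit->second;
        continue; // redundant: commoned with the earlier instruction
      }
      Seen[Key] = I.Def;
    }
    Out.push_back(I);
  }
  return Out;
}

// Two general-dynamic accesses to the same symbol. With AsCall == true the
// __tls_get_addr step is a real call; otherwise it is an opaque pseudo.
std::vector<ToyInst> twoTLSAccesses(bool AsCall) {
  const std::string Get = AsCall ? "BL8_NOP_TLS" : "GETtlsADDR";
  return {
      {"ADDIStlsgdHA", {0}, 1, false},
      {"ADDItlsgdL", {1}, 2, false},
      {Get, {2}, 3, AsCall},
      {"ADDIStlsgdHA", {0}, 4, false},
      {"ADDItlsgdL", {4}, 5, false},
      {Get, {5}, 6, AsCall},
  };
}
```

In this model, `toyCSE(twoTLSAccesses(true))` keeps both calls (four instructions survive), while `toyCSE(twoTLSAccesses(false))` keeps a single pseudo (three instructions survive).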

I presumed that the original design would have allowed commoning of the machine-specific nodes that hid the __tls_get_addr calls, so as suggested by Ulrich Weigand, I went back to that design and cleaned it up so that the copies were properly held together by glue nodes. However, it turned out that this didn't work either: the presence of copies to physical registers also kept the machine-specific nodes from being commoned.

All of which leads to the design presented here. This is a return to the original design, except that no attempt is made to introduce copies to and from GPR3 during instruction lowering. Virtual registers are used until just before register allocation. At that point, a special pass is run that identifies the machine-specific nodes that hide the __tls_get_addr calls and introduces the copies to and from GPR3 around them. The register allocator then coalesces these copies away. With this design, MachineCSE succeeds in commoning __tls_get_addr calls where possible, and we get nice optimal code generation (better than GCC at the moment, which does not common these calls).
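
As a rough picture of what such a pre-register-allocation pass does, the following self-contained C++ sketch rewrites a toy instruction list. It does not use LLVM's APIs: `ToyMI`, `isTLSCallPseudo`, and `insertGPR3Copies` are hypothetical names, registers are plain strings, and the real pass in the patch may differ in detail.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy instruction with one def and one register use (hypothetical type).
struct ToyMI {
  std::string Opcode;
  std::string Def;
  std::string Use;
};

// The pseudos that hide a __tls_get_addr call in this toy model.
bool isTLSCallPseudo(const ToyMI &MI) {
  return MI.Opcode == "GETtlsADDR" || MI.Opcode == "GETtlsldADDR";
}

// Wrap each TLS-call pseudo in copies to and from GPR3 (spelled X3 here),
// leaving every other instruction untouched.
std::vector<ToyMI> insertGPR3Copies(const std::vector<ToyMI> &Block) {
  std::vector<ToyMI> Out;
  for (const ToyMI &MI : Block) {
    if (!isTLSCallPseudo(MI)) {
      Out.push_back(MI);
      continue;
    }
    Out.push_back({"COPY", "X3", MI.Use});  // argument into X3
    Out.push_back({MI.Opcode, "X3", "X3"}); // pseudo now reads and writes X3
    Out.push_back({"COPY", MI.Def, "X3"});  // result back to the virtual reg
  }
  return Out;
}
```

Because the copies are only introduced at this late point, earlier passes see the pseudo with purely virtual operands, which is what lets identical pseudos be commoned.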

This is a relatively large patch, but a great deal of it is simply undoing the changes made by r221703. The interesting bits are:

PPCFrameLowering.cpp forces a stack frame when LR must be saved
Slight changes in PPCISelLowering.cpp to use virtual registers for GET_TLS_ADDR nodes
Use of "let Defs = [LR]" (or [LR8]) in PPCInstrInfo.td and PPCInstr64Bit.td
The new pass in PPCInstrInfo.cpp
Test cases

One thing missing from the original design was recording a definition of the link register on the GET_TLS_ADDR nodes. Doing this was found to be insufficient to force a stack frame to be created, which led to looping behavior because two different LR values were stored at the same address. This appears to have been an oversight in PPCFrameLowering::determineFrameLayout(), though there might be a better way to fix this. Hal, I'd be interested in your thoughts on that.

Because MustSaveLR() returns true for calls to builtin_return_address, this changed the expected behavior of test/CodeGen/PowerPC/retaddr2.ll, which now stacks a frame but formerly did not. I believe the new behavior is more correct, but would like your opinion on this as well.
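
The frame decision discussed here can be sketched in isolation. The following self-contained C++ model mirrors the condition in PPCFrameLowering::determineFrameLayout() as it appears in this diff, with the new MustSaveLR() term included. `ToyFrameInfo`, `mustSaveLR`, and `canOmitFrame` are hypothetical names, and the boolean fields stand in for the queries made on the real MachineFunction.

```cpp
#include <cassert>

// Flattened view of the facts the frame-layout check consults
// (hypothetical struct, defaults describe a trivial 64-bit leaf function).
struct ToyFrameInfo {
  unsigned FrameSize = 0;
  bool DisableRedZone = false;
  bool IsPPC64 = true;
  bool IsSVR4ABI = true;
  bool HasVarSizedObjects = false;
  bool AdjustsStack = false;    // true when the function contains real calls
  bool HasLRDef = false;        // some instruction defines LR
  bool LRStoreRequired = false; // e.g. builtin_return_address
  bool HasBasePointer = false;
};

// Same rule as MustSaveLR in the diff: any def of LR, or an explicit
// request to store LR, forces a save/restore.
bool mustSaveLR(const ToyFrameInfo &F) {
  return F.HasLRDef || F.LRStoreRequired;
}

// True when the function fits in the red zone and needs no frame.
bool canOmitFrame(const ToyFrameInfo &F) {
  return !F.DisableRedZone &&
         (F.IsPPC64 || !F.IsSVR4ABI || F.FrameSize == 0) &&
         F.FrameSize <= 224 &&
         !F.HasVarSizedObjects &&
         !F.AdjustsStack &&
         !mustSaveLR(F) && // the term added by this patch
         !F.HasBasePointer;
}
```

A function whose only "call" is a GET_TLS_ADDR pseudo has AdjustsStack false but HasLRDef true, so without the added term it would be treated as frameless. The same term is what changes the outcome for retaddr2.ll, where LRStoreRequired is set.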

Currently the pass is set to occur always. I don't think we can restrict it to run only for certain values of TM.getRelocationModel(), since individual variables can be annotated with the TLS model to be used. But if there is a safe way to avoid the pass sometimes, that would be nice (although it is a very fast pass).

Chandler, Kit, Nemanja, Bill: I listed you as reviewers, but you can consider this as more of an FYI. Review is welcome but not required. Thanks!

Diff Detail

Event Timeline

wschmidt updated this revision to Diff 19164. Feb 2 2015, 11:31 AM

wschmidt retitled this revision to [PowerPC] Yet another approach to __tls_get_addr.

wschmidt updated this object.

wschmidt edited the test plan for this revision.

wschmidt added reviewers: hfinkel, chandlerc, echristo, kbarton, nemanjai, seurer, uweigand.

wschmidt added a subscriber: Unknown Object (MLST).

This is great, thanks! With the changes noted below, LGTM.

lib/Target/PowerPC/PPCAsmPrinter.cpp
820	Can there be an assert here that the relevant operands are actually equal to PPC::X3 or PPC::R3? If so, please add one.
881	Can you please refactor the code here so that it shares as much as possible with the PPC::GETtlsADDR/PPC::GETtlsADDR32 handling above. It looks quite similar.
lib/Target/PowerPC/PPCInstr64Bit.td
904	No need for the { } around one def.
921	Same here (no { } needed).
lib/Target/PowerPC/PPCInstrInfo.cpp
2281	When you rebase, you'll discover that I just moved all of these other passes into separate files. Please don't add it here, but rather add a separate file. I think that this file becomes somewhat unwieldy with a collection of passes tacked on to the end.
2290	Please explain why we're doing this here (that it turns into a call, and thus the register choice is actually constrained).
lib/Target/PowerPC/PPCInstrInfo.td
2511	You don't need the { } around a single following def.
2521	Same here (no { } needed).
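
For the two PPCAsmPrinter.cpp comments above (the assert on GPR3 and the shared handling of the GD and LD cases), one possible shape is sketched below as self-contained C++. It is only an illustration with stand-in types: `ToyOpcode`, `ToyReg`, `ToyTlsCall`, and `emitTlsCall` are hypothetical, and strings replace the MC-layer expressions used in the real code.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the four pseudo opcodes and the registers of interest.
enum ToyOpcode { GETtlsADDR, GETtlsADDR32, GETtlsldADDR, GETtlsldADDR32 };
enum ToyReg { X3, R3, OtherReg };

// What the asm printer would emit, reduced to strings.
struct ToyTlsCall {
  std::string BranchOpcode; // BL8_NOP_TLS or BL_TLS
  std::string Callee;       // __tls_get_addr, optionally with @PLT
  std::string SymVariant;   // "tlsgd" or "tlsld"
};

// One routine for all four pseudos: only the symbol variant depends on
// whether the opcode is a general-dynamic or local-dynamic one.
ToyTlsCall emitTlsCall(ToyOpcode Opc, ToyReg Dst, ToyReg Src, bool IsPPC64,
                       bool IsDarwin, bool IsPIC) {
  // The pseudo becomes a call, so both operands must already be GPR3.
  assert((Dst == X3 || Dst == R3) && "result must be in GPR3");
  assert((Src == X3 || Src == R3) && "argument must be in GPR3");

  bool IsGD = (Opc == GETtlsADDR || Opc == GETtlsADDR32);
  bool UsePLT = !IsPPC64 && !IsDarwin && IsPIC;

  ToyTlsCall Call;
  Call.BranchOpcode = IsPPC64 ? "BL8_NOP_TLS" : "BL_TLS";
  Call.Callee = UsePLT ? "__tls_get_addr@PLT" : "__tls_get_addr";
  Call.SymVariant = IsGD ? "tlsgd" : "tlsld";
  return Call;
}
```

With this structure the two case bodies reduce to a single call each, and the register constraint is checked in exactly one place.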

This revision is now accepted and ready to land. Feb 2 2015, 12:45 PM

hfinkel added inline comments. Feb 2 2015, 6:03 PM

lib/Target/PowerPC/PPCInstr64Bit.td
905	To follow up on the IRC messages I missed: to keep the anti-dep breaker from disturbing the r3 assignments, you'll probably want to set let hasExtraSrcRegAllocReq = 1 and/or let hasExtraDefRegAllocReq = 1 here (and on the other pseudos in this patch).
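
To show why such flags protect the GPR3 operands, here is a small self-contained C++ model of a renaming pass that honors them. It is a deliberately simplified sketch and not the aggressive anti-dependency breaker itself: `ToyInstr` and `renamableRegs` are hypothetical names, and the only rule modeled is that registers defined or used by a flagged instruction are excluded from renaming.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy instruction carrying the two flags from the comment above
// (hypothetical type; the field names echo the TableGen flags).
struct ToyInstr {
  std::string Opcode;
  std::vector<std::string> Defs;
  std::vector<std::string> Uses;
  bool HasExtraDefRegAllocReq;
  bool HasExtraSrcRegAllocReq;
};

// Registers a renaming pass may touch: everything mentioned in the block,
// minus defs of instructions with the def flag and uses of instructions
// with the src flag.
std::set<std::string> renamableRegs(const std::vector<ToyInstr> &Block) {
  std::set<std::string> All, Pinned;
  for (const ToyInstr &I : Block) {
    All.insert(I.Defs.begin(), I.Defs.end());
    All.insert(I.Uses.begin(), I.Uses.end());
    if (I.HasExtraDefRegAllocReq)
      Pinned.insert(I.Defs.begin(), I.Defs.end());
    if (I.HasExtraSrcRegAllocReq)
      Pinned.insert(I.Uses.begin(), I.Uses.end());
  }
  std::set<std::string> Result;
  for (const std::string &R : All)
    if (Pinned.count(R) == 0)
      Result.insert(R);
  return Result;
}
```

In this model, flagging the GETtls* pseudo removes X3 from the renamable set while leaving unrelated registers available.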

Revised to address Hal's comments. I also had to add flags to the GETtls* instruction definitions to keep the aggressive anti-dependency breaker from touching the mentions of GPR3.

Committed as r227976. Thanks for the review!

Bill

Revision Contents

Path	Size
lib/Target/PowerPC/ (10 files)	8, 59, 30, 22, 69, 17, 85, 27, 6, and 1 lines
test/CodeGen/PowerPC/retaddr2.ll	2 lines
test/CodeGen/PowerPC/tls-cse.ll	52 lines
test/CodeGen/PowerPC/tls-store2.ll	6 lines

Diff 19164

lib/Target/PowerPC/PPC.h

Show All 34 Lines	#ifndef NDEBUG
FunctionPass *createPPCCTRLoopsVerify();		FunctionPass *createPPCCTRLoopsVerify();
#endif		#endif
FunctionPass *createPPCEarlyReturnPass();		FunctionPass *createPPCEarlyReturnPass();
FunctionPass *createPPCVSXCopyPass();		FunctionPass *createPPCVSXCopyPass();
FunctionPass *createPPCVSXCopyCleanupPass();		FunctionPass *createPPCVSXCopyCleanupPass();
FunctionPass *createPPCVSXFMAMutatePass();		FunctionPass *createPPCVSXFMAMutatePass();
FunctionPass *createPPCBranchSelectionPass();		FunctionPass *createPPCBranchSelectionPass();
FunctionPass *createPPCISelDag(PPCTargetMachine &TM);		FunctionPass *createPPCISelDag(PPCTargetMachine &TM);
		FunctionPass *createPPCTLSDynamicCallPass();
void LowerPPCMachineInstrToMCInst(const MachineInstr *MI, MCInst &OutMI,		void LowerPPCMachineInstrToMCInst(const MachineInstr *MI, MCInst &OutMI,
AsmPrinter &AP, bool isDarwin);		AsmPrinter &AP, bool isDarwin);

/// \brief Creates an PPC-specific Target Transformation Info pass.		/// \brief Creates an PPC-specific Target Transformation Info pass.
ImmutablePass *createPPCTargetTransformInfoPass(const PPCTargetMachine *TM);		ImmutablePass *createPPCTargetTransformInfoPass(const PPCTargetMachine *TM);

void initializePPCVSXFMAMutatePass(PassRegistry&);		void initializePPCVSXFMAMutatePass(PassRegistry&);
extern char &PPCVSXFMAMutateID;		extern char &PPCVSXFMAMutateID;
Show All 37 Lines	enum TOF {

/// These values identify relocations on immediates folded		/// These values identify relocations on immediates folded
/// into memory operations.		/// into memory operations.
MO_DTPREL_LO = 5 << 4,		MO_DTPREL_LO = 5 << 4,
MO_TLSLD_LO = 6 << 4,		MO_TLSLD_LO = 6 << 4,
MO_TOC_LO = 7 << 4,		MO_TOC_LO = 7 << 4,

// Symbol for VK_PPC_TLS fixup attached to an ADD instruction		// Symbol for VK_PPC_TLS fixup attached to an ADD instruction
MO_TLS = 8 << 4,		MO_TLS = 8 << 4

// Symbols for VK_PPC_TLSGD and VK_PPC_TLSLD in __tls_get_addr
// call sequences.
MO_TLSLD = 9 << 4,
MO_TLSGD = 10 << 4
};		};
} // end namespace PPCII		} // end namespace PPCII

} // end namespace llvm;		} // end namespace llvm;

#endif		#endif

lib/Target/PowerPC/PPCAsmPrinter.cpp

Show First 20 Lines • Show All 801 Lines • ▼ Show 20 Lines	const MCExpr *SymGotTlsGD =
OutContext);		OutContext);
EmitToStreamer(OutStreamer,		EmitToStreamer(OutStreamer,
MCInstBuilder(Subtarget.isPPC64() ? PPC::ADDI8 : PPC::ADDI)		MCInstBuilder(Subtarget.isPPC64() ? PPC::ADDI8 : PPC::ADDI)
.addReg(MI->getOperand(0).getReg())		.addReg(MI->getOperand(0).getReg())
.addReg(MI->getOperand(1).getReg())		.addReg(MI->getOperand(1).getReg())
.addExpr(SymGotTlsGD));		.addExpr(SymGotTlsGD));
return;		return;
}		}
		case PPC::GETtlsADDR:
		// Transform: %X3 = GETtlsADDR %X3, <ga:@sym>
		// Into: BL8_NOP_TLS __tls_get_addr(sym@tlsgd)
		case PPC::GETtlsADDR32: {
		// Transform: %R3 = GETtlsADDR32 %R3, <ga:@sym>
		// Into: BL_TLS __tls_get_addr(sym@tlsgd)@PLT

		StringRef Name = "__tls_get_addr";
		MCSymbol *TlsGetAddr = OutContext.GetOrCreateSymbol(Name);
		MCSymbolRefExpr::VariantKind Kind = MCSymbolRefExpr::VK_None;

		hfinkel: Can there be an assert here that the relevant operands are actually equal to PPC::X3 or PPC::R3? If so, please add one.
		if (!Subtarget.isPPC64() && !Subtarget.isDarwin() &&
		TM.getRelocationModel() == Reloc::PIC_)
		Kind = MCSymbolRefExpr::VK_PLT;
		const MCSymbolRefExpr *TlsRef =
		MCSymbolRefExpr::Create(TlsGetAddr, Kind, OutContext);
		const MachineOperand &MO = MI->getOperand(2);
		const GlobalValue *GValue = MO.getGlobal();
		MCSymbol *MOSymbol = getSymbol(GValue);
		const MCExpr *SymVar =
		MCSymbolRefExpr::Create(MOSymbol, MCSymbolRefExpr::VK_PPC_TLSGD,
		OutContext);
		EmitToStreamer(OutStreamer,
		MCInstBuilder(Subtarget.isPPC64() ?
		PPC::BL8_NOP_TLS : PPC::BL_TLS)
		.addExpr(TlsRef)
		.addExpr(SymVar));
		return;
		}
case PPC::ADDIStlsldHA: {		case PPC::ADDIStlsldHA: {
// Transform: %Xd = ADDIStlsldHA %X2, <ga:@sym>		// Transform: %Xd = ADDIStlsldHA %X2, <ga:@sym>
// Into: %Xd = ADDIS8 %X2, sym@got@tlsld@ha		// Into: %Xd = ADDIS8 %X2, sym@got@tlsld@ha
assert(Subtarget.isPPC64() && "Not supported for 32-bit PowerPC");		assert(Subtarget.isPPC64() && "Not supported for 32-bit PowerPC");
const MachineOperand &MO = MI->getOperand(2);		const MachineOperand &MO = MI->getOperand(2);
const GlobalValue *GValue = MO.getGlobal();		const GlobalValue *GValue = MO.getGlobal();
MCSymbol *MOSymbol = getSymbol(GValue);		MCSymbol *MOSymbol = getSymbol(GValue);
const MCExpr *SymGotTlsLD =		const MCExpr *SymGotTlsLD =
Show All 21 Lines	const MCExpr *SymGotTlsLD =
OutContext);		OutContext);
EmitToStreamer(OutStreamer,		EmitToStreamer(OutStreamer,
MCInstBuilder(Subtarget.isPPC64() ? PPC::ADDI8 : PPC::ADDI)		MCInstBuilder(Subtarget.isPPC64() ? PPC::ADDI8 : PPC::ADDI)
.addReg(MI->getOperand(0).getReg())		.addReg(MI->getOperand(0).getReg())
.addReg(MI->getOperand(1).getReg())		.addReg(MI->getOperand(1).getReg())
.addExpr(SymGotTlsLD));		.addExpr(SymGotTlsLD));
return;		return;
}		}
		case PPC::GETtlsldADDR:
		// Transform: %X3 = GETtlsldADDR %X3, <ga:@sym>
		// Into: BL8_NOP_TLS __tls_get_addr(sym@tlsld)
		case PPC::GETtlsldADDR32: {
		// Transform: %R3 = GETtlsldADDR32 %R3, <ga:@sym>
		// Into: BL_TLS __tls_get_addr(sym@tlsld)@PLT
		hfinkel: Can you please refactor the code here so that it shares as much as possible with the PPC::GETtlsADDR/PPC::GETtlsADDR32 handling above. It looks quite similar.

		StringRef Name = "__tls_get_addr";
		MCSymbol *TlsGetAddr = OutContext.GetOrCreateSymbol(Name);
		MCSymbolRefExpr::VariantKind Kind = MCSymbolRefExpr::VK_None;

		if (!Subtarget.isPPC64() && !Subtarget.isDarwin() &&
		TM.getRelocationModel() == Reloc::PIC_)
		Kind = MCSymbolRefExpr::VK_PLT;

		const MCSymbolRefExpr *TlsRef =
		MCSymbolRefExpr::Create(TlsGetAddr, Kind, OutContext);
		const MachineOperand &MO = MI->getOperand(2);
		const GlobalValue *GValue = MO.getGlobal();
		MCSymbol *MOSymbol = getSymbol(GValue);
		const MCExpr *SymVar =
		MCSymbolRefExpr::Create(MOSymbol, MCSymbolRefExpr::VK_PPC_TLSLD,
		OutContext);
		EmitToStreamer(OutStreamer,
		MCInstBuilder(Subtarget.isPPC64() ?
		PPC::BL8_NOP_TLS : PPC::BL_TLS)
		.addExpr(TlsRef)
		.addExpr(SymVar));
		return;
		}
case PPC::ADDISdtprelHA:		case PPC::ADDISdtprelHA:
// Transform: %Xd = ADDISdtprelHA %X3, <ga:@sym>		// Transform: %Xd = ADDISdtprelHA %X3, <ga:@sym>
// Into: %Xd = ADDIS8 %X3, sym@dtprel@ha		// Into: %Xd = ADDIS8 %X3, sym@dtprel@ha
case PPC::ADDISdtprelHA32: {		case PPC::ADDISdtprelHA32: {
// Transform: %Rd = ADDISdtprelHA32 %R3, <ga:@sym>		// Transform: %Rd = ADDISdtprelHA32 %R3, <ga:@sym>
// Into: %Rd = ADDIS %R3, sym@dtprel@ha		// Into: %Rd = ADDIS %R3, sym@dtprel@ha
const MachineOperand &MO = MI->getOperand(2);		const MachineOperand &MO = MI->getOperand(2);
const GlobalValue *GValue = MO.getGlobal();		const GlobalValue *GValue = MO.getGlobal();

lib/Target/PowerPC/PPCFrameLowering.cpp

Show First 20 Lines • Show All 349 Lines • ▼ Show 20 Lines	static bool hasSpills(const MachineFunction &MF) {
return FuncInfo->hasSpills();		return FuncInfo->hasSpills();
}		}

static bool hasNonRISpills(const MachineFunction &MF) {		static bool hasNonRISpills(const MachineFunction &MF) {
const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();		const PPCFunctionInfo *FuncInfo = MF.getInfo<PPCFunctionInfo>();
return FuncInfo->hasNonRISpills();		return FuncInfo->hasNonRISpills();
}		}

		/// MustSaveLR - Return true if this function requires that we save the LR
		/// register onto the stack in the prolog and restore it in the epilog of the
		/// function.
		static bool MustSaveLR(const MachineFunction &MF, unsigned LR) {
		const PPCFunctionInfo *MFI = MF.getInfo<PPCFunctionInfo>();

		// We need a save/restore of LR if there is any def of LR (which is
		// defined by calls, including the PIC setup sequence), or if there is
		// some use of the LR stack slot (e.g. for builtin_return_address).
		// (LR comes in 32 and 64 bit versions.)
		MachineRegisterInfo::def_iterator RI = MF.getRegInfo().def_begin(LR);
		return RI !=MF.getRegInfo().def_end() || MFI->isLRStoreRequired();
		}

/// determineFrameLayout - Determine the size of the frame and maximum call		/// determineFrameLayout - Determine the size of the frame and maximum call
/// frame size.		/// frame size.
unsigned PPCFrameLowering::determineFrameLayout(MachineFunction &MF,		unsigned PPCFrameLowering::determineFrameLayout(MachineFunction &MF,
bool UpdateMF,		bool UpdateMF,
bool UseEstimate) const {		bool UseEstimate) const {
MachineFrameInfo *MFI = MF.getFrameInfo();		MachineFrameInfo *MFI = MF.getFrameInfo();

// Get the number of bytes to allocate from the FrameInfo		// Get the number of bytes to allocate from the FrameInfo
Show All 10 Lines	unsigned PPCFrameLowering::determineFrameLayout(MachineFunction &MF,

// If we are a leaf function, and use up to 224 bytes of stack space,		// If we are a leaf function, and use up to 224 bytes of stack space,
// don't have a frame pointer, calls, or dynamic alloca then we do not need		// don't have a frame pointer, calls, or dynamic alloca then we do not need
// to adjust the stack pointer (we fit in the Red Zone).		// to adjust the stack pointer (we fit in the Red Zone).
// The 32-bit SVR4 ABI has no Red Zone. However, it can still generate		// The 32-bit SVR4 ABI has no Red Zone. However, it can still generate
// stackless code if all local vars are reg-allocated.		// stackless code if all local vars are reg-allocated.
bool DisableRedZone = MF.getFunction()->getAttributes().		bool DisableRedZone = MF.getFunction()->getAttributes().
hasAttribute(AttributeSet::FunctionIndex, Attribute::NoRedZone);		hasAttribute(AttributeSet::FunctionIndex, Attribute::NoRedZone);
		unsigned LR = RegInfo->getRARegister();
if (!DisableRedZone &&		if (!DisableRedZone &&
(Subtarget.isPPC64() || // 32-bit SVR4, no stack-		(Subtarget.isPPC64() || // 32-bit SVR4, no stack-
!Subtarget.isSVR4ABI() || // allocated locals.		!Subtarget.isSVR4ABI() || // allocated locals.
FrameSize == 0) &&		FrameSize == 0) &&
FrameSize <= 224 && // Fits in red zone.		FrameSize <= 224 && // Fits in red zone.
!MFI->hasVarSizedObjects() && // No dynamic alloca.		!MFI->hasVarSizedObjects() && // No dynamic alloca.
!MFI->adjustsStack() && // No calls.		!MFI->adjustsStack() && // No calls.
		!MustSaveLR(MF, LR) &&
!RegInfo->hasBasePointer(MF)) { // No special alignment.		!RegInfo->hasBasePointer(MF)) { // No special alignment.
// No need for frame		// No need for frame
if (UpdateMF)		if (UpdateMF)
MFI->setStackSize(0);		MFI->setStackSize(0);
return 0;		return 0;
}		}

// Get the maximum call frame size of all the calls.		// Get the maximum call frame size of all the calls.
▲ Show 20 Lines • Show All 704 Lines • ▼ Show 20 Lines	if (MF.getTarget().Options.GuaranteedTailCallOpt &&
BuildMI(MBB, MBBI, dl, TII.get(PPC::TAILBCTR8));		BuildMI(MBB, MBBI, dl, TII.get(PPC::TAILBCTR8));
} else if (RetOpcode == PPC::TCRETURNai8) {		} else if (RetOpcode == PPC::TCRETURNai8) {
MBBI = MBB.getLastNonDebugInstr();		MBBI = MBB.getLastNonDebugInstr();
MachineOperand &JumpTarget = MBBI->getOperand(0);		MachineOperand &JumpTarget = MBBI->getOperand(0);
BuildMI(MBB, MBBI, dl, TII.get(PPC::TAILBA8)).addImm(JumpTarget.getImm());		BuildMI(MBB, MBBI, dl, TII.get(PPC::TAILBA8)).addImm(JumpTarget.getImm());
}		}
}		}

/// MustSaveLR - Return true if this function requires that we save the LR
/// register onto the stack in the prolog and restore it in the epilog of the
/// function.
static bool MustSaveLR(const MachineFunction &MF, unsigned LR) {
const PPCFunctionInfo *MFI = MF.getInfo<PPCFunctionInfo>();

// We need a save/restore of LR if there is any def of LR (which is
// defined by calls, including the PIC setup sequence), or if there is
// some use of the LR stack slot (e.g. for builtin_return_address).
// (LR comes in 32 and 64 bit versions.)
MachineRegisterInfo::def_iterator RI = MF.getRegInfo().def_begin(LR);
return RI !=MF.getRegInfo().def_end() || MFI->isLRStoreRequired();
}

void		void
PPCFrameLowering::processFunctionBeforeCalleeSavedScan(MachineFunction &MF,		PPCFrameLowering::processFunctionBeforeCalleeSavedScan(MachineFunction &MF,
RegScavenger *) const {		RegScavenger *) const {
const PPCRegisterInfo *RegInfo =		const PPCRegisterInfo *RegInfo =
static_cast<const PPCRegisterInfo *>(MF.getSubtarget().getRegisterInfo());		static_cast<const PPCRegisterInfo *>(MF.getSubtarget().getRegisterInfo());

// Save and clear the LR state.		// Save and clear the LR state.
PPCFunctionInfo *FI = MF.getInfo<PPCFunctionInfo>();		PPCFunctionInfo *FI = MF.getInfo<PPCFunctionInfo>();

lib/Target/PowerPC/PPCISelLowering.h

Show First 20 Lines • Show All 95 Lines • ▼ Show 20 Lines	enum NodeType {
/// divisor).		/// divisor).
SRA_ADDZE,		SRA_ADDZE,

/// CALL - A direct function call.		/// CALL - A direct function call.
/// CALL_NOP is a call with the special NOP which follows 64-bit		/// CALL_NOP is a call with the special NOP which follows 64-bit
/// SVR4 calls.		/// SVR4 calls.
CALL, CALL_NOP,		CALL, CALL_NOP,

/// CALL_TLS and CALL_NOP_TLS - Versions of CALL and CALL_NOP used
/// to access TLS variables.
CALL_TLS, CALL_NOP_TLS,

/// CHAIN,FLAG = MTCTR(VAL, CHAIN[, INFLAG]) - Directly corresponds to a		/// CHAIN,FLAG = MTCTR(VAL, CHAIN[, INFLAG]) - Directly corresponds to a
/// MTCTR instruction.		/// MTCTR instruction.
MTCTR,		MTCTR,

/// CHAIN,FLAG = BCTRL(CHAIN, INFLAG) - Directly corresponds to a		/// CHAIN,FLAG = BCTRL(CHAIN, INFLAG) - Directly corresponds to a
/// BCTRL instruction.		/// BCTRL instruction.
BCTRL,		BCTRL,

▲ Show 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	enum NodeType {
/// register to sym\@got\@tlsgd\@ha.		/// register to sym\@got\@tlsgd\@ha.
ADDIS_TLSGD_HA,		ADDIS_TLSGD_HA,

/// G8RC = ADDI_TLSGD_L G8RReg, Symbol - For the general-dynamic TLS		/// G8RC = ADDI_TLSGD_L G8RReg, Symbol - For the general-dynamic TLS
/// model, produces an ADDI8 instruction that adds G8RReg to		/// model, produces an ADDI8 instruction that adds G8RReg to
/// sym\@got\@tlsgd\@l.		/// sym\@got\@tlsgd\@l.
ADDI_TLSGD_L,		ADDI_TLSGD_L,

		/// G8RC = GET_TLS_ADDR %X3, Symbol - For the general-dynamic TLS
		/// model, produces a call to __tls_get_addr(sym\@tlsgd).
		GET_TLS_ADDR,

/// G8RC = ADDIS_TLSLD_HA %X2, Symbol - For the local-dynamic TLS		/// G8RC = ADDIS_TLSLD_HA %X2, Symbol - For the local-dynamic TLS
/// model, produces an ADDIS8 instruction that adds the GOT base		/// model, produces an ADDIS8 instruction that adds the GOT base
/// register to sym\@got\@tlsld\@ha.		/// register to sym\@got\@tlsld\@ha.
ADDIS_TLSLD_HA,		ADDIS_TLSLD_HA,

/// G8RC = ADDI_TLSLD_L G8RReg, Symbol - For the local-dynamic TLS		/// G8RC = ADDI_TLSLD_L G8RReg, Symbol - For the local-dynamic TLS
/// model, produces an ADDI8 instruction that adds G8RReg to		/// model, produces an ADDI8 instruction that adds G8RReg to
/// sym\@got\@tlsld\@l.		/// sym\@got\@tlsld\@l.
ADDI_TLSLD_L,		ADDI_TLSLD_L,

/// G8RC = ADDIS_DTPREL_HA %X3, Symbol, Chain - For the		/// G8RC = GET_TLSLD_ADDR %X3, Symbol - For the local-dynamic TLS
/// local-dynamic TLS model, produces an ADDIS8 instruction		/// model, produces a call to __tls_get_addr(sym\@tlsld).
/// that adds X3 to sym\@dtprel\@ha. The Chain operand is needed		GET_TLSLD_ADDR,
/// to tie this in place following a copy to %X3 from the result
/// of a GET_TLSLD_ADDR.		/// G8RC = ADDIS_DTPREL_HA %X3, Symbol - For the local-dynamic TLS
		/// model, produces an ADDIS8 instruction that adds X3 to
		/// sym\@dtprel\@ha.
ADDIS_DTPREL_HA,		ADDIS_DTPREL_HA,

/// G8RC = ADDI_DTPREL_L G8RReg, Symbol - For the local-dynamic TLS		/// G8RC = ADDI_DTPREL_L G8RReg, Symbol - For the local-dynamic TLS
/// model, produces an ADDI8 instruction that adds G8RReg to		/// model, produces an ADDI8 instruction that adds G8RReg to
/// sym\@got\@dtprel\@l.		/// sym\@got\@dtprel\@l.
ADDI_DTPREL_L,		ADDI_DTPREL_L,

/// VRRC = VADD_SPLAT Elt, EltSize - Temporary node to be expanded		/// VRRC = VADD_SPLAT Elt, EltSize - Temporary node to be expanded
▲ Show 20 Lines • Show All 376 Lines • ▼ Show 20 Lines	SDValue EmitTailCallLoadFPAndRetAddr(SelectionDAG & DAG,
SDValue &FPOpOut,		SDValue &FPOpOut,
bool isDarwinABI,		bool isDarwinABI,
SDLoc dl) const;		SDLoc dl) const;

SDValue LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerRETURNADDR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFRAMEADDR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerConstantPool(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerConstantPool(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerBlockAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerBlockAddress(SDValue Op, SelectionDAG &DAG) const;
std::pair<SDValue,SDValue> lowerTLSCall(SDValue Op, SDLoc dl,
SelectionDAG &DAG) const;
SDValue LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerGlobalTLSAddress(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerGlobalAddress(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerJumpTable(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerJumpTable(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSETCC(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerSETCC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerINIT_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerADJUST_TRAMPOLINE(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVASTART(SDValue Op, SelectionDAG &DAG,		SDValue LowerVASTART(SDValue Op, SelectionDAG &DAG,
const PPCSubtarget &Subtarget) const;		const PPCSubtarget &Subtarget) const;

lib/Target/PowerPC/PPCISelLowering.cpp

Show First 20 Lines • Show All 785 Lines • ▼ Show 20 Lines	const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
case PPCISD::TOC_ENTRY: return "PPCISD::TOC_ENTRY";		case PPCISD::TOC_ENTRY: return "PPCISD::TOC_ENTRY";
case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC";		case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC";
case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseReg";		case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseReg";
case PPCISD::SRL: return "PPCISD::SRL";		case PPCISD::SRL: return "PPCISD::SRL";
case PPCISD::SRA: return "PPCISD::SRA";		case PPCISD::SRA: return "PPCISD::SRA";
case PPCISD::SHL: return "PPCISD::SHL";		case PPCISD::SHL: return "PPCISD::SHL";
case PPCISD::CALL: return "PPCISD::CALL";		case PPCISD::CALL: return "PPCISD::CALL";
case PPCISD::CALL_NOP: return "PPCISD::CALL_NOP";		case PPCISD::CALL_NOP: return "PPCISD::CALL_NOP";
case PPCISD::CALL_TLS: return "PPCISD::CALL_TLS";
case PPCISD::CALL_NOP_TLS: return "PPCISD::CALL_NOP_TLS";
case PPCISD::MTCTR: return "PPCISD::MTCTR";		case PPCISD::MTCTR: return "PPCISD::MTCTR";
case PPCISD::BCTRL: return "PPCISD::BCTRL";		case PPCISD::BCTRL: return "PPCISD::BCTRL";
case PPCISD::BCTRL_LOAD_TOC: return "PPCISD::BCTRL_LOAD_TOC";		case PPCISD::BCTRL_LOAD_TOC: return "PPCISD::BCTRL_LOAD_TOC";
case PPCISD::RET_FLAG: return "PPCISD::RET_FLAG";		case PPCISD::RET_FLAG: return "PPCISD::RET_FLAG";
case PPCISD::READ_TIME_BASE: return "PPCISD::READ_TIME_BASE";		case PPCISD::READ_TIME_BASE: return "PPCISD::READ_TIME_BASE";
case PPCISD::EH_SJLJ_SETJMP: return "PPCISD::EH_SJLJ_SETJMP";		case PPCISD::EH_SJLJ_SETJMP: return "PPCISD::EH_SJLJ_SETJMP";
case PPCISD::EH_SJLJ_LONGJMP: return "PPCISD::EH_SJLJ_LONGJMP";		case PPCISD::EH_SJLJ_LONGJMP: return "PPCISD::EH_SJLJ_LONGJMP";
case PPCISD::MFOCRF: return "PPCISD::MFOCRF";		case PPCISD::MFOCRF: return "PPCISD::MFOCRF";
Show All 17 Lines	const char *PPCTargetLowering::getTargetNodeName(unsigned Opcode) const {
case PPCISD::LD_TOC_L: return "PPCISD::LD_TOC_L";		case PPCISD::LD_TOC_L: return "PPCISD::LD_TOC_L";
case PPCISD::ADDI_TOC_L: return "PPCISD::ADDI_TOC_L";		case PPCISD::ADDI_TOC_L: return "PPCISD::ADDI_TOC_L";
case PPCISD::PPC32_GOT: return "PPCISD::PPC32_GOT";		case PPCISD::PPC32_GOT: return "PPCISD::PPC32_GOT";
case PPCISD::ADDIS_GOT_TPREL_HA: return "PPCISD::ADDIS_GOT_TPREL_HA";		case PPCISD::ADDIS_GOT_TPREL_HA: return "PPCISD::ADDIS_GOT_TPREL_HA";
case PPCISD::LD_GOT_TPREL_L: return "PPCISD::LD_GOT_TPREL_L";		case PPCISD::LD_GOT_TPREL_L: return "PPCISD::LD_GOT_TPREL_L";
case PPCISD::ADD_TLS: return "PPCISD::ADD_TLS";		case PPCISD::ADD_TLS: return "PPCISD::ADD_TLS";
case PPCISD::ADDIS_TLSGD_HA: return "PPCISD::ADDIS_TLSGD_HA";		case PPCISD::ADDIS_TLSGD_HA: return "PPCISD::ADDIS_TLSGD_HA";
case PPCISD::ADDI_TLSGD_L: return "PPCISD::ADDI_TLSGD_L";		case PPCISD::ADDI_TLSGD_L: return "PPCISD::ADDI_TLSGD_L";
		case PPCISD::GET_TLS_ADDR: return "PPCISD::GET_TLS_ADDR";
case PPCISD::ADDIS_TLSLD_HA: return "PPCISD::ADDIS_TLSLD_HA";		case PPCISD::ADDIS_TLSLD_HA: return "PPCISD::ADDIS_TLSLD_HA";
case PPCISD::ADDI_TLSLD_L: return "PPCISD::ADDI_TLSLD_L";		case PPCISD::ADDI_TLSLD_L: return "PPCISD::ADDI_TLSLD_L";
		case PPCISD::GET_TLSLD_ADDR: return "PPCISD::GET_TLSLD_ADDR";
case PPCISD::ADDIS_DTPREL_HA: return "PPCISD::ADDIS_DTPREL_HA";		case PPCISD::ADDIS_DTPREL_HA: return "PPCISD::ADDIS_DTPREL_HA";
case PPCISD::ADDI_DTPREL_L: return "PPCISD::ADDI_DTPREL_L";		case PPCISD::ADDI_DTPREL_L: return "PPCISD::ADDI_DTPREL_L";
case PPCISD::VADD_SPLAT: return "PPCISD::VADD_SPLAT";		case PPCISD::VADD_SPLAT: return "PPCISD::VADD_SPLAT";
case PPCISD::SC: return "PPCISD::SC";		case PPCISD::SC: return "PPCISD::SC";
}		}
}		}

EVT PPCTargetLowering::getSetCCResultType(LLVMContext &, EVT VT) const {		EVT PPCTargetLowering::getSetCCResultType(LLVMContext &, EVT VT) const {
▲ Show 20 Lines • Show All 826 Lines • ▼ Show 20 Lines	SDValue PPCTargetLowering::LowerBlockAddress(SDValue Op,

unsigned MOHiFlag, MOLoFlag;		unsigned MOHiFlag, MOLoFlag;
bool isPIC = GetLabelAccessInfo(DAG.getTarget(), MOHiFlag, MOLoFlag);		bool isPIC = GetLabelAccessInfo(DAG.getTarget(), MOHiFlag, MOLoFlag);
SDValue TgtBAHi = DAG.getTargetBlockAddress(BA, PtrVT, 0, MOHiFlag);		SDValue TgtBAHi = DAG.getTargetBlockAddress(BA, PtrVT, 0, MOHiFlag);
SDValue TgtBALo = DAG.getTargetBlockAddress(BA, PtrVT, 0, MOLoFlag);		SDValue TgtBALo = DAG.getTargetBlockAddress(BA, PtrVT, 0, MOLoFlag);
return LowerLabelRef(TgtBAHi, TgtBALo, isPIC, DAG);		return LowerLabelRef(TgtBAHi, TgtBALo, isPIC, DAG);
}		}

// Generate a call to __tls_get_addr for the given GOT entry Op.
std::pair<SDValue,SDValue>
PPCTargetLowering::lowerTLSCall(SDValue Op, SDLoc dl,
SelectionDAG &DAG) const {

Type *IntPtrTy = getDataLayout()->getIntPtrType(*DAG.getContext());
TargetLowering::ArgListTy Args;
TargetLowering::ArgListEntry Entry;
Entry.Node = Op;
Entry.Ty = IntPtrTy;
Args.push_back(Entry);

TargetLowering::CallLoweringInfo CLI(DAG);
CLI.setDebugLoc(dl).setChain(DAG.getEntryNode())
.setCallee(CallingConv::C, IntPtrTy,
DAG.getTargetExternalSymbol("__tls_get_addr", getPointerTy()),
std::move(Args), 0);

return LowerCallTo(CLI);
}

SDValue PPCTargetLowering::LowerGlobalTLSAddress(SDValue Op,		SDValue PPCTargetLowering::LowerGlobalTLSAddress(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {

// FIXME: TLS addresses currently use medium model code sequences,		// FIXME: TLS addresses currently use medium model code sequences,
// which is the most useful form. Eventually support for small and		// which is the most useful form. Eventually support for small and
// large models could be added if users need it, at the cost of		// large models could be added if users need it, at the cost of
// additional complexity.		// additional complexity.
GlobalAddressSDNode *GA = cast<GlobalAddressSDNode>(Op);		GlobalAddressSDNode *GA = cast<GlobalAddressSDNode>(Op);
Show All 29 Lines	if (Model == TLSModel::InitialExec) {
} else		} else
GOTPtr = DAG.getNode(PPCISD::PPC32_GOT, dl, PtrVT);		GOTPtr = DAG.getNode(PPCISD::PPC32_GOT, dl, PtrVT);
SDValue TPOffset = DAG.getNode(PPCISD::LD_GOT_TPREL_L, dl,		SDValue TPOffset = DAG.getNode(PPCISD::LD_GOT_TPREL_L, dl,
PtrVT, TGA, GOTPtr);		PtrVT, TGA, GOTPtr);
return DAG.getNode(PPCISD::ADD_TLS, dl, PtrVT, TPOffset, TGATLS);		return DAG.getNode(PPCISD::ADD_TLS, dl, PtrVT, TPOffset, TGATLS);
}		}

if (Model == TLSModel::GeneralDynamic) {		if (Model == TLSModel::GeneralDynamic) {
SDValue TGA = DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0,		SDValue TGA = DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, 0);
PPCII::MO_TLSGD);
SDValue GOTPtr;		SDValue GOTPtr;
if (is64bit) {		if (is64bit) {
SDValue GOTReg = DAG.getRegister(PPC::X2, MVT::i64);		SDValue GOTReg = DAG.getRegister(PPC::X2, MVT::i64);
GOTPtr = DAG.getNode(PPCISD::ADDIS_TLSGD_HA, dl, PtrVT,		GOTPtr = DAG.getNode(PPCISD::ADDIS_TLSGD_HA, dl, PtrVT,
GOTReg, TGA);		GOTReg, TGA);
} else {		} else {
if (picLevel == PICLevel::Small)		if (picLevel == PICLevel::Small)
GOTPtr = DAG.getNode(PPCISD::GlobalBaseReg, dl, PtrVT);		GOTPtr = DAG.getNode(PPCISD::GlobalBaseReg, dl, PtrVT);
else		else
GOTPtr = DAG.getNode(PPCISD::PPC32_PICGOT, dl, PtrVT);		GOTPtr = DAG.getNode(PPCISD::PPC32_PICGOT, dl, PtrVT);
}		}
SDValue GOTEntry = DAG.getNode(PPCISD::ADDI_TLSGD_L, dl, PtrVT,		SDValue GOTEntry = DAG.getNode(PPCISD::ADDI_TLSGD_L, dl,
GOTPtr, TGA);		PtrVT, GOTPtr, TGA);
std::pair<SDValue, SDValue> CallResult = lowerTLSCall(GOTEntry, dl, DAG);		return DAG.getNode(PPCISD::GET_TLS_ADDR, dl, PtrVT, GOTEntry, TGA);
return CallResult.first;
}		}

if (Model == TLSModel::LocalDynamic) {		if (Model == TLSModel::LocalDynamic) {
SDValue TGA = DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0,		SDValue TGA = DAG.getTargetGlobalAddress(GV, dl, PtrVT, 0, 0);
PPCII::MO_TLSLD);
SDValue GOTPtr;		SDValue GOTPtr;
if (is64bit) {		if (is64bit) {
SDValue GOTReg = DAG.getRegister(PPC::X2, MVT::i64);		SDValue GOTReg = DAG.getRegister(PPC::X2, MVT::i64);
GOTPtr = DAG.getNode(PPCISD::ADDIS_TLSLD_HA, dl, PtrVT,		GOTPtr = DAG.getNode(PPCISD::ADDIS_TLSLD_HA, dl, PtrVT,
GOTReg, TGA);		GOTReg, TGA);
} else {		} else {
if (picLevel == PICLevel::Small)		if (picLevel == PICLevel::Small)
GOTPtr = DAG.getNode(PPCISD::GlobalBaseReg, dl, PtrVT);		GOTPtr = DAG.getNode(PPCISD::GlobalBaseReg, dl, PtrVT);
else		else
GOTPtr = DAG.getNode(PPCISD::PPC32_PICGOT, dl, PtrVT);		GOTPtr = DAG.getNode(PPCISD::PPC32_PICGOT, dl, PtrVT);
}		}
SDValue GOTEntry = DAG.getNode(PPCISD::ADDI_TLSLD_L, dl, PtrVT,		SDValue GOTEntry = DAG.getNode(PPCISD::ADDI_TLSLD_L, dl, PtrVT,
GOTPtr, TGA);		GOTPtr, TGA);
std::pair<SDValue, SDValue> CallResult = lowerTLSCall(GOTEntry, dl, DAG);		SDValue TLSAddr = DAG.getNode(PPCISD::GET_TLSLD_ADDR, dl,
SDValue TLSAddr = CallResult.first;		PtrVT, GOTEntry, TGA);
SDValue Chain = CallResult.second;		SDValue DtvOffsetHi = DAG.getNode(PPCISD::ADDIS_DTPREL_HA, dl,
SDValue DtvOffsetHi = DAG.getNode(PPCISD::ADDIS_DTPREL_HA, dl, PtrVT,		PtrVT, TLSAddr, TGA);
Chain, TLSAddr, TGA);
return DAG.getNode(PPCISD::ADDI_DTPREL_L, dl, PtrVT, DtvOffsetHi, TGA);		return DAG.getNode(PPCISD::ADDI_DTPREL_L, dl, PtrVT, DtvOffsetHi, TGA);
}		}

llvm_unreachable("Unknown TLS model!");		llvm_unreachable("Unknown TLS model!");
}		}

SDValue PPCTargetLowering::LowerGlobalAddress(SDValue Op,		SDValue PPCTargetLowering::LowerGlobalAddress(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
▲ Show 20 Lines • Show All 2,002 Lines • ▼ Show 20 Lines	if (needIndirectCall) {
if (isTailCall)		if (isTailCall)
Ops.push_back(DAG.getRegister(isPPC64 ? PPC::CTR8 : PPC::CTR, PtrVT));		Ops.push_back(DAG.getRegister(isPPC64 ? PPC::CTR8 : PPC::CTR, PtrVT));
}		}

// If this is a direct call, pass the chain and the callee.		// If this is a direct call, pass the chain and the callee.
if (Callee.getNode()) {		if (Callee.getNode()) {
Ops.push_back(Chain);		Ops.push_back(Chain);
Ops.push_back(Callee);		Ops.push_back(Callee);

// If this is a call to __tls_get_addr, find the symbol whose address
// is to be taken and add it to the list. This will be used to
// generate __tls_get_addr(<sym>@tlsgd) or __tls_get_addr(<sym>@tlsld).
// We find the symbol by walking the chain to the CopyFromReg, walking
// back from the CopyFromReg to the ADDI_TLSGD_L or ADDI_TLSLD_L, and
// pulling the symbol from that node.
if (ExternalSymbolSDNode *S = dyn_cast<ExternalSymbolSDNode>(Callee))
if (!strcmp(S->getSymbol(), "__tls_get_addr")) {
assert(!needIndirectCall && "Indirect call to __tls_get_addr???");
SDNode *AddI = Chain.getNode()->getOperand(2).getNode();
SDValue TGTAddr = AddI->getOperand(1);
assert(TGTAddr.getNode()->getOpcode() == ISD::TargetGlobalTLSAddress &&
"Didn't find target global TLS address where we expected one");
Ops.push_back(TGTAddr);
CallOpc = PPCISD::CALL_TLS;
}
}		}
// If this is a tail call add stack pointer delta.		// If this is a tail call add stack pointer delta.
if (isTailCall)		if (isTailCall)
Ops.push_back(DAG.getConstant(SPDiff, MVT::i32));		Ops.push_back(DAG.getConstant(SPDiff, MVT::i32));

// Add argument registers to the end of the list so that they are known live		// Add argument registers to the end of the list so that they are known live
// into the call.		// into the call.
for (unsigned i = 0, e = RegsToPass.size(); i != e; ++i)		for (unsigned i = 0, e = RegsToPass.size(); i != e; ++i)
▲ Show 20 Lines • Show All 145 Lines • ▼ Show 20 Lines	if (CallOpc == PPCISD::BCTRL) {
SDValue TOCOff = DAG.getIntPtrConstant(TOCSaveOffset);		SDValue TOCOff = DAG.getIntPtrConstant(TOCSaveOffset);
SDValue AddTOC = DAG.getNode(ISD::ADD, dl, MVT::i64, StackPtr, TOCOff);		SDValue AddTOC = DAG.getNode(ISD::ADD, dl, MVT::i64, StackPtr, TOCOff);

// The address needs to go after the chain input but before the flag (or		// The address needs to go after the chain input but before the flag (or
// any other variadic arguments).		// any other variadic arguments).
Ops.insert(std::next(Ops.begin()), AddTOC);		Ops.insert(std::next(Ops.begin()), AddTOC);
} else if ((CallOpc == PPCISD::CALL) &&		} else if ((CallOpc == PPCISD::CALL) &&
(!isLocalCall(Callee) ||		(!isLocalCall(Callee) ||
DAG.getTarget().getRelocationModel() == Reloc::PIC_)) {		DAG.getTarget().getRelocationModel() == Reloc::PIC_))
// Otherwise insert NOP for non-local calls.		// Otherwise insert NOP for non-local calls.
CallOpc = PPCISD::CALL_NOP;		CallOpc = PPCISD::CALL_NOP;
} else if (CallOpc == PPCISD::CALL_TLS)
// For 64-bit SVR4, TLS calls are always non-local.
CallOpc = PPCISD::CALL_NOP_TLS;
}		}

Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops);		Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops);
InFlag = Chain.getValue(1);		InFlag = Chain.getValue(1);

Chain = DAG.getCALLSEQ_END(Chain, DAG.getIntPtrConstant(NumBytes, true),		Chain = DAG.getCALLSEQ_END(Chain, DAG.getIntPtrConstant(NumBytes, true),
DAG.getIntPtrConstant(BytesCalleePops, true),		DAG.getIntPtrConstant(BytesCalleePops, true),
InFlag, dl);		InFlag, dl);
▲ Show 20 Lines • Show All 6,067 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCInstr64Bit.td

Show First 20 Lines • Show All 196 Lines • ▼ Show 20 Lines
def : Pat<(PPCcall_nop (i64 tglobaladdr:$dst)),		def : Pat<(PPCcall_nop (i64 tglobaladdr:$dst)),
(BL8_NOP tglobaladdr:$dst)>;		(BL8_NOP tglobaladdr:$dst)>;

def : Pat<(PPCcall (i64 texternalsym:$dst)),		def : Pat<(PPCcall (i64 texternalsym:$dst)),
(BL8 texternalsym:$dst)>;		(BL8 texternalsym:$dst)>;
def : Pat<(PPCcall_nop (i64 texternalsym:$dst)),		def : Pat<(PPCcall_nop (i64 texternalsym:$dst)),
(BL8_NOP texternalsym:$dst)>;		(BL8_NOP texternalsym:$dst)>;

def : Pat<(PPCcall_nop_tls texternalsym:$func, tglobaltlsaddr:$sym),
(BL8_NOP_TLS texternalsym:$func, tglobaltlsaddr:$sym)>;

// Atomic operations		// Atomic operations
let usesCustomInserter = 1 in {		let usesCustomInserter = 1 in {
let Defs = [CR0] in {		let Defs = [CR0] in {
def ATOMIC_LOAD_ADD_I64 : Pseudo<		def ATOMIC_LOAD_ADD_I64 : Pseudo<
(outs g8rc:$dst), (ins memrr:$ptr, g8rc:$incr), "#ATOMIC_LOAD_ADD_I64",		(outs g8rc:$dst), (ins memrr:$ptr, g8rc:$incr), "#ATOMIC_LOAD_ADD_I64",
[(set i64:$dst, (atomic_load_add_64 xoaddr:$ptr, i64:$incr))]>;		[(set i64:$dst, (atomic_load_add_64 xoaddr:$ptr, i64:$incr))]>;
def ATOMIC_LOAD_SUB_I64 : Pseudo<		def ATOMIC_LOAD_SUB_I64 : Pseudo<
(outs g8rc:$dst), (ins memrr:$ptr, g8rc:$incr), "#ATOMIC_LOAD_SUB_I64",		(outs g8rc:$dst), (ins memrr:$ptr, g8rc:$incr), "#ATOMIC_LOAD_SUB_I64",
▲ Show 20 Lines • Show All 683 Lines • ▼ Show 20 Lines	def ADDIStlsgdHA: Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
[(set i64:$rD,		[(set i64:$rD,
(PPCaddisTlsgdHA i64:$reg, tglobaltlsaddr:$disp))]>,		(PPCaddisTlsgdHA i64:$reg, tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;
def ADDItlsgdL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),		def ADDItlsgdL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
"#ADDItlsgdL",		"#ADDItlsgdL",
[(set i64:$rD,		[(set i64:$rD,
(PPCaddiTlsgdL i64:$reg, tglobaltlsaddr:$disp))]>,		(PPCaddiTlsgdL i64:$reg, tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;
		let Defs = [LR8] in {
		hfinkel: No need for the { } around one def.
		def GETtlsADDR : Pseudo<(outs g8rc:$rD), (ins g8rc:$reg, tlsgd:$sym),
		hfinkel: To follow up on the IRC messages I missed: to keep the anti-dep breaker from disturbing the r3 assignments, you'll probably want to set let hasExtraSrcRegAllocReq = 1 and/or hasExtraDefRegAllocReq = 1 here (and on the other pseudos in this patch).
		"#GETtlsADDR",
		[(set i64:$rD,
		(PPCgetTlsAddr i64:$reg, tglobaltlsaddr:$sym))]>,
		isPPC64;
		}
def ADDIStlsldHA: Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),		def ADDIStlsldHA: Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
"#ADDIStlsldHA",		"#ADDIStlsldHA",
[(set i64:$rD,		[(set i64:$rD,
(PPCaddisTlsldHA i64:$reg, tglobaltlsaddr:$disp))]>,		(PPCaddisTlsldHA i64:$reg, tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;
def ADDItlsldL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),		def ADDItlsldL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
"#ADDItlsldL",		"#ADDItlsldL",
[(set i64:$rD,		[(set i64:$rD,
(PPCaddiTlsldL i64:$reg, tglobaltlsaddr:$disp))]>,		(PPCaddiTlsldL i64:$reg, tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;
		let Defs = [LR8] in {
		hfinkel: Same here (no { } needed).
		def GETtlsldADDR : Pseudo<(outs g8rc:$rD), (ins g8rc:$reg, tlsgd:$sym),
		"#GETtlsldADDR",
		[(set i64:$rD,
		(PPCgetTlsldAddr i64:$reg, tglobaltlsaddr:$sym))]>,
		isPPC64;
		}
def ADDISdtprelHA: Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),		def ADDISdtprelHA: Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
"#ADDISdtprelHA",		"#ADDISdtprelHA",
[(set i64:$rD,		[(set i64:$rD,
(PPCaddisDtprelHA i64:$reg,		(PPCaddisDtprelHA i64:$reg,
tglobaltlsaddr:$disp))]>,		tglobaltlsaddr:$disp))]>,
isPPC64;		isPPC64;
def ADDIdtprelL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),		def ADDIdtprelL : Pseudo<(outs g8rc:$rD), (ins g8rc_nox0:$reg, s16imm64:$disp),
"#ADDIdtprelL",		"#ADDIdtprelL",
▲ Show 20 Lines • Show All 241 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCInstrInfo.cpp

	Show First 20 Lines • Show All 2,271 Lines • ▼ Show 20 Lines
	}			}

	INITIALIZE_PASS(PPCEarlyReturn, DEBUG_TYPE,			INITIALIZE_PASS(PPCEarlyReturn, DEBUG_TYPE,
	"PowerPC Early-Return Creation", false, false)			"PowerPC Early-Return Creation", false, false)

	char PPCEarlyReturn::ID = 0;			char PPCEarlyReturn::ID = 0;
	FunctionPass*			FunctionPass*
	llvm::createPPCEarlyReturnPass() { return new PPCEarlyReturn(); }			llvm::createPPCEarlyReturnPass() { return new PPCEarlyReturn(); }

				#undef DEBUG_TYPE
				hfinkel: When you rebase, you'll discover that I just moved all of these other passes into separate files. Please don't add it here, but rather add a separate file. I think that this file becomes somewhat unwieldy with a collection of passes tacked on to the end.
				#define DEBUG_TYPE "ppc-tls-dynamic-call"

				namespace llvm {
				void initializePPCTLSDynamicCallPass(PassRegistry&);
				}

				namespace {
				// PPCTLSDynamicCall pass - Add copies to and from GPR3 around
				// GETtls[ld]ADDR machine instructions.
				hfinkel: Please explain why we're doing this here (that it turns into a call, and thus the register choice is actually constrained).
				struct PPCTLSDynamicCall : public MachineFunctionPass {
				static char ID;
				PPCTLSDynamicCall() : MachineFunctionPass(ID) {
				initializePPCTLSDynamicCallPass(*PassRegistry::getPassRegistry());
				}

				const PPCTargetMachine *TM;
				const PPCInstrInfo *TII;

				protected:
				bool processBlock(MachineBasicBlock &MBB) {
				bool Changed = false;
				bool Is64Bit = TM->getSubtargetImpl()->isPPC64();

				for (MachineBasicBlock::iterator I = MBB.begin(), IE = MBB.end();
				I != IE; ++I) {
				MachineInstr *MI = I;

				if (MI->getOpcode() != PPC::GETtlsADDR &&
				MI->getOpcode() != PPC::GETtlsldADDR)
				continue;

				DEBUG(dbgs() << "TLS Dynamic Call Fixup:\n " << *MI;);

				unsigned OutReg = MI->getOperand(0).getReg();
				unsigned InReg = MI->getOperand(1).getReg();
				DebugLoc DL = MI->getDebugLoc();
				unsigned GPR3 = Is64Bit ? PPC::X3 : PPC::R3;

				BuildMI(MBB, I, DL, TII->get(TargetOpcode::COPY), GPR3)
				.addReg(InReg);
				MI->getOperand(0).setReg(GPR3);
				MI->getOperand(1).setReg(GPR3);
				BuildMI(MBB, ++I, DL, TII->get(TargetOpcode::COPY), OutReg)
				.addReg(GPR3);

				Changed = true;
				}

				return Changed;
				}

				public:
				bool runOnMachineFunction(MachineFunction &MF) override {
				TM = static_cast<const PPCTargetMachine *>(&MF.getTarget());
				TII = TM->getSubtargetImpl()->getInstrInfo();

				bool Changed = false;

				for (MachineFunction::iterator I = MF.begin(); I != MF.end();) {
				MachineBasicBlock &B = *I++;
				if (processBlock(B))
				Changed = true;
				}

				return Changed;
				}

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				MachineFunctionPass::getAnalysisUsage(AU);
				}
				};
				}

				INITIALIZE_PASS_BEGIN(PPCTLSDynamicCall, DEBUG_TYPE,
				"PowerPC TLS Dynamic Call Fixup", false, false)
				INITIALIZE_PASS_END(PPCTLSDynamicCall, DEBUG_TYPE,
				"PowerPC TLS Dynamic Call Fixup", false, false)

				//char &llvm::PPCTLSDynamicCallID = PPCTLSDynamicCall::ID;

				char PPCTLSDynamicCall::ID = 0;
				FunctionPass*
				llvm::createPPCTLSDynamicCallPass() { return new PPCTLSDynamicCall(); }
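
The rewrite performed by processBlock above can be illustrated with a toy model. This is only a sketch: ToyInstr and fixupTLSCalls are hypothetical names invented for illustration and are not part of LLVM or of this patch. The point it tries to show is the one raised in review: each GETtls[ld]ADDR pseudo eventually becomes a call to __tls_get_addr, and the PowerPC calling convention passes the first argument and returns the result in GPR3, so the pseudo's operands are pinned to that register and bracketed by copies.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy stand-in for a machine instruction (not llvm::MachineInstr).
struct ToyInstr {
  std::string Opcode; // e.g. "COPY", "GETtlsADDR"
  std::string Def;    // register written
  std::string Use;    // register read
};

// Sketch of the fixup: for every GETtls[ld]ADDR, copy the input into r3,
// rewrite the pseudo to read and write r3, then copy r3 to the old output.
std::vector<ToyInstr> fixupTLSCalls(const std::vector<ToyInstr> &Block) {
  std::vector<ToyInstr> Out;
  for (const ToyInstr &MI : Block) {
    if (MI.Opcode != "GETtlsADDR" && MI.Opcode != "GETtlsldADDR") {
      Out.push_back(MI); // unrelated instructions pass through unchanged
      continue;
    }
    Out.push_back({"COPY", "r3", MI.Use}); // argument goes into r3
    Out.push_back({MI.Opcode, "r3", "r3"}); // call-like pseudo pinned to r3
    Out.push_back({"COPY", MI.Def, "r3"}); // result comes back out of r3
  }
  return Out;
}
```

Feeding it a block such as {ADDItlsgdL v1 <- v0; GETtlsADDR v2 <- v1} yields four instructions, with the pseudo surrounded by a copy into r3 and a copy out of r3. Because the pseudo itself stays an ordinary (non-call) instruction until late, two identical pseudos remain candidates for MachineCSE, which is the motivation given in the summary.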

lib/Target/PowerPC/PPCInstrInfo.td

Show First 20 Lines • Show All 104 Lines • ▼ Show 20 Lines
def PPCppc32GOT : SDNode<"PPCISD::PPC32_GOT", SDTIntLeaf, []>;		def PPCppc32GOT : SDNode<"PPCISD::PPC32_GOT", SDTIntLeaf, []>;

def PPCaddisGotTprelHA : SDNode<"PPCISD::ADDIS_GOT_TPREL_HA", SDTIntBinOp>;		def PPCaddisGotTprelHA : SDNode<"PPCISD::ADDIS_GOT_TPREL_HA", SDTIntBinOp>;
def PPCldGotTprelL : SDNode<"PPCISD::LD_GOT_TPREL_L", SDTIntBinOp,		def PPCldGotTprelL : SDNode<"PPCISD::LD_GOT_TPREL_L", SDTIntBinOp,
[SDNPMayLoad]>;		[SDNPMayLoad]>;
def PPCaddTls : SDNode<"PPCISD::ADD_TLS", SDTIntBinOp, []>;		def PPCaddTls : SDNode<"PPCISD::ADD_TLS", SDTIntBinOp, []>;
def PPCaddisTlsgdHA : SDNode<"PPCISD::ADDIS_TLSGD_HA", SDTIntBinOp>;		def PPCaddisTlsgdHA : SDNode<"PPCISD::ADDIS_TLSGD_HA", SDTIntBinOp>;
def PPCaddiTlsgdL : SDNode<"PPCISD::ADDI_TLSGD_L", SDTIntBinOp>;		def PPCaddiTlsgdL : SDNode<"PPCISD::ADDI_TLSGD_L", SDTIntBinOp>;
		def PPCgetTlsAddr : SDNode<"PPCISD::GET_TLS_ADDR", SDTIntBinOp>;
def PPCaddisTlsldHA : SDNode<"PPCISD::ADDIS_TLSLD_HA", SDTIntBinOp>;		def PPCaddisTlsldHA : SDNode<"PPCISD::ADDIS_TLSLD_HA", SDTIntBinOp>;
def PPCaddiTlsldL : SDNode<"PPCISD::ADDI_TLSLD_L", SDTIntBinOp>;		def PPCaddiTlsldL : SDNode<"PPCISD::ADDI_TLSLD_L", SDTIntBinOp>;
def PPCaddisDtprelHA : SDNode<"PPCISD::ADDIS_DTPREL_HA", SDTIntBinOp,		def PPCgetTlsldAddr : SDNode<"PPCISD::GET_TLSLD_ADDR", SDTIntBinOp>;
[SDNPHasChain]>;		def PPCaddisDtprelHA : SDNode<"PPCISD::ADDIS_DTPREL_HA", SDTIntBinOp>;
def PPCaddiDtprelL : SDNode<"PPCISD::ADDI_DTPREL_L", SDTIntBinOp>;		def PPCaddiDtprelL : SDNode<"PPCISD::ADDI_DTPREL_L", SDTIntBinOp>;

def PPCvperm : SDNode<"PPCISD::VPERM", SDT_PPCvperm, []>;		def PPCvperm : SDNode<"PPCISD::VPERM", SDT_PPCvperm, []>;

def PPCcmpb : SDNode<"PPCISD::CMPB", SDTIntBinOp, []>;		def PPCcmpb : SDNode<"PPCISD::CMPB", SDTIntBinOp, []>;

// These nodes represent the 32-bit PPC shifts that operate on 6-bit shift		// These nodes represent the 32-bit PPC shifts that operate on 6-bit shift
// amounts. These nodes are generated by the multi-precision shift code.		// amounts. These nodes are generated by the multi-precision shift code.
def PPCsrl : SDNode<"PPCISD::SRL" , SDTIntShiftOp>;		def PPCsrl : SDNode<"PPCISD::SRL" , SDTIntShiftOp>;
def PPCsra : SDNode<"PPCISD::SRA" , SDTIntShiftOp>;		def PPCsra : SDNode<"PPCISD::SRA" , SDTIntShiftOp>;
def PPCshl : SDNode<"PPCISD::SHL" , SDTIntShiftOp>;		def PPCshl : SDNode<"PPCISD::SHL" , SDTIntShiftOp>;

// These are target-independent nodes, but have target-specific formats.		// These are target-independent nodes, but have target-specific formats.
def callseq_start : SDNode<"ISD::CALLSEQ_START", SDT_PPCCallSeqStart,		def callseq_start : SDNode<"ISD::CALLSEQ_START", SDT_PPCCallSeqStart,
[SDNPHasChain, SDNPOutGlue]>;		[SDNPHasChain, SDNPOutGlue]>;
def callseq_end : SDNode<"ISD::CALLSEQ_END", SDT_PPCCallSeqEnd,		def callseq_end : SDNode<"ISD::CALLSEQ_END", SDT_PPCCallSeqEnd,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;

def SDT_PPCCall : SDTypeProfile<0, -1, [SDTCisInt<0>]>;		def SDT_PPCCall : SDTypeProfile<0, -1, [SDTCisInt<0>]>;
def PPCcall : SDNode<"PPCISD::CALL", SDT_PPCCall,		def PPCcall : SDNode<"PPCISD::CALL", SDT_PPCCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;		SDNPVariadic]>;
def PPCcall_tls : SDNode<"PPCISD::CALL_TLS", SDT_PPCCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;
def PPCcall_nop : SDNode<"PPCISD::CALL_NOP", SDT_PPCCall,		def PPCcall_nop : SDNode<"PPCISD::CALL_NOP", SDT_PPCCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;		SDNPVariadic]>;
def PPCcall_nop_tls : SDNode<"PPCISD::CALL_NOP_TLS", SDT_PPCCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;
def PPCmtctr : SDNode<"PPCISD::MTCTR", SDT_PPCCall,		def PPCmtctr : SDNode<"PPCISD::MTCTR", SDT_PPCCall,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue]>;
def PPCbctrl : SDNode<"PPCISD::BCTRL", SDTNone,		def PPCbctrl : SDNode<"PPCISD::BCTRL", SDTNone,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
SDNPVariadic]>;		SDNPVariadic]>;
def PPCbctrl_load_toc : SDNode<"PPCISD::BCTRL_LOAD_TOC",		def PPCbctrl_load_toc : SDNode<"PPCISD::BCTRL_LOAD_TOC",
SDTypeProfile<0, 1, []>,		SDTypeProfile<0, 1, []>,
[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,		[SDNPHasChain, SDNPOptInGlue, SDNPOutGlue,
▲ Show 20 Lines • Show All 2,298 Lines • ▼ Show 20 Lines	def : Pat<(and (rotl i32:$in, i32:$sh), maskimm32:$imm),
(RLWNM $in, $sh, (MB maskimm32:$imm), (ME maskimm32:$imm))>;		(RLWNM $in, $sh, (MB maskimm32:$imm), (ME maskimm32:$imm))>;

// Calls		// Calls
def : Pat<(PPCcall (i32 tglobaladdr:$dst)),		def : Pat<(PPCcall (i32 tglobaladdr:$dst)),
(BL tglobaladdr:$dst)>;		(BL tglobaladdr:$dst)>;
def : Pat<(PPCcall (i32 texternalsym:$dst)),		def : Pat<(PPCcall (i32 texternalsym:$dst)),
(BL texternalsym:$dst)>;		(BL texternalsym:$dst)>;

def : Pat<(PPCcall_tls texternalsym:$func, tglobaltlsaddr:$sym),
(BL_TLS texternalsym:$func, tglobaltlsaddr:$sym)>;

def : Pat<(PPCtc_return (i32 tglobaladdr:$dst), imm:$imm),		def : Pat<(PPCtc_return (i32 tglobaladdr:$dst), imm:$imm),
(TCRETURNdi tglobaladdr:$dst, imm:$imm)>;		(TCRETURNdi tglobaladdr:$dst, imm:$imm)>;

def : Pat<(PPCtc_return (i32 texternalsym:$dst), imm:$imm),		def : Pat<(PPCtc_return (i32 texternalsym:$dst), imm:$imm),
(TCRETURNdi texternalsym:$dst, imm:$imm)>;		(TCRETURNdi texternalsym:$dst, imm:$imm)>;

def : Pat<(PPCtc_return CTRRC:$dst, imm:$imm),		def : Pat<(PPCtc_return CTRRC:$dst, imm:$imm),
(TCRETURNri CTRRC:$dst, imm:$imm)>;		(TCRETURNri CTRRC:$dst, imm:$imm)>;
Show All 38 Lines	def LDgotTprelL32: Pseudo<(outs gprc:$rD), (ins s16imm:$disp, gprc_nor0:$reg),
(PPCldGotTprelL tglobaltlsaddr:$disp, i32:$reg))]>;		(PPCldGotTprelL tglobaltlsaddr:$disp, i32:$reg))]>;
def : Pat<(PPCaddTls i32:$in, tglobaltlsaddr:$g),		def : Pat<(PPCaddTls i32:$in, tglobaltlsaddr:$g),
(ADD4TLS $in, tglobaltlsaddr:$g)>;		(ADD4TLS $in, tglobaltlsaddr:$g)>;

def ADDItlsgdL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),		def ADDItlsgdL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),
"#ADDItlsgdL32",		"#ADDItlsgdL32",
[(set i32:$rD,		[(set i32:$rD,
(PPCaddiTlsgdL i32:$reg, tglobaltlsaddr:$disp))]>;		(PPCaddiTlsgdL i32:$reg, tglobaltlsaddr:$disp))]>;
		let Defs = [LR] in {
		hfinkel: You don't need the { } around a single following def.
		def GETtlsADDR32 : Pseudo<(outs gprc:$rD), (ins gprc:$reg, tlsgd32:$sym),
		"GETtlsADDR32",
		[(set i32:$rD,
		(PPCgetTlsAddr i32:$reg, tglobaltlsaddr:$sym))]>;
		}
def ADDItlsldL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),		def ADDItlsldL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),
"#ADDItlsldL32",		"#ADDItlsldL32",
[(set i32:$rD,		[(set i32:$rD,
(PPCaddiTlsldL i32:$reg, tglobaltlsaddr:$disp))]>;		(PPCaddiTlsldL i32:$reg, tglobaltlsaddr:$disp))]>;
		let Defs = [LR] in {
		hfinkel: Same here (no { } needed).
		def GETtlsldADDR32 : Pseudo<(outs gprc:$rD), (ins gprc:$reg, tlsgd32:$sym),
		"GETtlsldADDR32",
		[(set i32:$rD,
		(PPCgetTlsldAddr i32:$reg,
		tglobaltlsaddr:$sym))]>;
		}
def ADDIdtprelL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),		def ADDIdtprelL32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),
"#ADDIdtprelL32",		"#ADDIdtprelL32",
[(set i32:$rD,		[(set i32:$rD,
(PPCaddiDtprelL i32:$reg, tglobaltlsaddr:$disp))]>;		(PPCaddiDtprelL i32:$reg, tglobaltlsaddr:$disp))]>;
def ADDISdtprelHA32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),		def ADDISdtprelHA32 : Pseudo<(outs gprc:$rD), (ins gprc_nor0:$reg, s16imm:$disp),
"#ADDISdtprelHA32",		"#ADDISdtprelHA32",
[(set i32:$rD,		[(set i32:$rD,
(PPCaddisDtprelHA i32:$reg,		(PPCaddisDtprelHA i32:$reg,
▲ Show 20 Lines • Show All 1,256 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCMCInstLower.cpp

Show First 20 Lines • Show All 131 Lines • ▼ Show 20 Lines	case PPCII::MO_TLSLD_LO:
RefKind = MCSymbolRefExpr::VK_PPC_GOT_TLSLD_LO;		RefKind = MCSymbolRefExpr::VK_PPC_GOT_TLSLD_LO;
break;		break;
case PPCII::MO_TOC_LO:		case PPCII::MO_TOC_LO:
RefKind = MCSymbolRefExpr::VK_PPC_TOC_LO;		RefKind = MCSymbolRefExpr::VK_PPC_TOC_LO;
break;		break;
case PPCII::MO_TLS:		case PPCII::MO_TLS:
RefKind = MCSymbolRefExpr::VK_PPC_TLS;		RefKind = MCSymbolRefExpr::VK_PPC_TLS;
break;		break;
case PPCII::MO_TLSGD:
RefKind = MCSymbolRefExpr::VK_PPC_TLSGD;
break;
case PPCII::MO_TLSLD:
RefKind = MCSymbolRefExpr::VK_PPC_TLSLD;
break;
}		}

if (MO.getTargetFlags() == PPCII::MO_PLT_OR_STUB && !isDarwin)		if (MO.getTargetFlags() == PPCII::MO_PLT_OR_STUB && !isDarwin)
RefKind = MCSymbolRefExpr::VK_PLT;		RefKind = MCSymbolRefExpr::VK_PLT;

const MCExpr *Expr = MCSymbolRefExpr::Create(Symbol, RefKind, Ctx);		const MCExpr *Expr = MCSymbolRefExpr::Create(Symbol, RefKind, Ctx);

if (!MO.isJTI() && MO.getOffset())		if (!MO.isJTI() && MO.getOffset())
▲ Show 20 Lines • Show All 69 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCTargetMachine.cpp

Show First 20 Lines • Show All 256 Lines • ▼ Show 20 Lines	#endif
addPass(createPPCVSXCopyPass());		addPass(createPPCVSXCopyPass());
return false;		return false;
}		}

void PPCPassConfig::addPreRegAlloc() {		void PPCPassConfig::addPreRegAlloc() {
initializePPCVSXFMAMutatePass(*PassRegistry::getPassRegistry());		initializePPCVSXFMAMutatePass(*PassRegistry::getPassRegistry());
insertPass(VSXFMAMutateEarly ? &RegisterCoalescerID : &MachineSchedulerID,		insertPass(VSXFMAMutateEarly ? &RegisterCoalescerID : &MachineSchedulerID,
&PPCVSXFMAMutateID);		&PPCVSXFMAMutateID);
		addPass(createPPCTLSDynamicCallPass());
}		}

void PPCPassConfig::addPreSched2() {		void PPCPassConfig::addPreSched2() {
addPass(createPPCVSXCopyCleanupPass(), false);		addPass(createPPCVSXCopyCleanupPass(), false);

if (getOptLevel() != CodeGenOpt::None)		if (getOptLevel() != CodeGenOpt::None)
addPass(&IfConverterID);		addPass(&IfConverterID);
}		}
Show All 15 Lines

test/CodeGen/PowerPC/retaddr2.ll

	; RUN: llc -mcpu=pwr7 < %s | FileCheck %s			; RUN: llc -mcpu=pwr7 < %s | FileCheck %s
	target datalayout = "E-m:e-i64:64-n32:64"			target datalayout = "E-m:e-i64:64-n32:64"
	target triple = "powerpc64-unknown-linux-gnu"			target triple = "powerpc64-unknown-linux-gnu"

	; Function Attrs: nounwind readnone			; Function Attrs: nounwind readnone
	define i8* @test1() #0 {			define i8* @test1() #0 {
	entry:			entry:
	%0 = tail call i8* @llvm.returnaddress(i32 0)			%0 = tail call i8* @llvm.returnaddress(i32 0)
	ret i8* %0			ret i8* %0
	}			}

	; CHECK-LABEL: @test1			; CHECK-LABEL: @test1
	; CHECK: mflr 0			; CHECK: mflr 0
	; CHECK: std 0, 16(1)			; CHECK: std 0, 16(1)
	; FIXME: These next two lines don't both need to load the same value.			; FIXME: These next two lines don't both need to load the same value.
	; CHECK-DAG: ld 3, 16(1)			; CHECK-DAG: ld 3, 64(1)
	; CHECK-DAG: ld 0, 16(1)			; CHECK-DAG: ld 0, 16(1)
	; CHECK: mtlr 0			; CHECK: mtlr 0
	; CHECK: blr			; CHECK: blr

	; Function Attrs: nounwind readnone			; Function Attrs: nounwind readnone
	declare i8* @llvm.returnaddress(i32) #0			declare i8* @llvm.returnaddress(i32) #0

	attributes #0 = { nounwind readnone }			attributes #0 = { nounwind readnone }

test/CodeGen/PowerPC/tls-cse.ll

				; RUN: llc -march=ppc64 -mcpu=pwr7 -O2 -relocation-model=pic < %s | FileCheck %s
				; RUN: llc -march=ppc64 -mcpu=pwr7 -O2 -relocation-model=pic < %s | grep "__tls_get_addr" | count 1

				; This test was derived from LLVM's own
				; PrettyStackTraceEntry::~PrettyStackTraceEntry(). It demonstrates an
				; opportunity for CSE of calls to __tls_get_addr().

				target datalayout = "e-m:e-i64:64-n32:64"
				target triple = "powerpc64le-unknown-linux-gnu"

				%"class.llvm::PrettyStackTraceEntry" = type { i32 (...)**, %"class.llvm::PrettyStackTraceEntry"* }

				@_ZTVN4llvm21PrettyStackTraceEntryE = unnamed_addr constant [5 x i8*] [i8* null, i8* null, i8* bitcast (void (%"class.llvm::PrettyStackTraceEntry"*)* @_ZN4llvm21PrettyStackTraceEntryD2Ev to i8*), i8* bitcast (void (%"class.llvm::PrettyStackTraceEntry"*)* @_ZN4llvm21PrettyStackTraceEntryD0Ev to i8*), i8* bitcast (void ()* @__cxa_pure_virtual to i8*)], align 8
				@_ZL20PrettyStackTraceHead = internal thread_local unnamed_addr global %"class.llvm::PrettyStackTraceEntry"* null, align 8
				@.str = private unnamed_addr constant [87 x i8] c"PrettyStackTraceHead == this && \22Pretty stack trace entry destruction is out of order\22\00", align 1
				@.str1 = private unnamed_addr constant [64 x i8] c"/home/wschmidt/llvm/llvm-test2/lib/Support/PrettyStackTrace.cpp\00", align 1
				@__PRETTY_FUNCTION__._ZN4llvm21PrettyStackTraceEntryD2Ev = private unnamed_addr constant [62 x i8] c"virtual llvm::PrettyStackTraceEntry::~PrettyStackTraceEntry()\00", align 1

				declare void @_ZN4llvm21PrettyStackTraceEntryD2Ev(%"class.llvm::PrettyStackTraceEntry"* %this) unnamed_addr
				declare void @__cxa_pure_virtual()
				declare void @__assert_fail(i8*, i8*, i32 zeroext, i8*)
				declare void @_ZdlPv(i8*)

				define void @_ZN4llvm21PrettyStackTraceEntryD0Ev(%"class.llvm::PrettyStackTraceEntry"* %this) unnamed_addr align 2 {
				entry:
				%0 = getelementptr inbounds %"class.llvm::PrettyStackTraceEntry"* %this, i64 0, i32 0
				store i32 (...)** bitcast (i8** getelementptr inbounds ([5 x i8*]* @_ZTVN4llvm21PrettyStackTraceEntryE, i64 0, i64 2) to i32 (...)**), i32 (...)*** %0, align 8
				%1 = load %"class.llvm::PrettyStackTraceEntry"** @_ZL20PrettyStackTraceHead, align 8
				%cmp.i = icmp eq %"class.llvm::PrettyStackTraceEntry"* %1, %this
				br i1 %cmp.i, label %_ZN4llvm21PrettyStackTraceEntryD2Ev.exit, label %cond.false.i

				cond.false.i: ; preds = %entry
				tail call void @__assert_fail(i8* getelementptr inbounds ([87 x i8]* @.str, i64 0, i64 0), i8* getelementptr inbounds ([64 x i8]* @.str1, i64 0, i64 0), i32 zeroext 119, i8* getelementptr inbounds ([62 x i8]* @__PRETTY_FUNCTION__._ZN4llvm21PrettyStackTraceEntryD2Ev, i64 0, i64 0))
				unreachable

				_ZN4llvm21PrettyStackTraceEntryD2Ev.exit: ; preds = %entry
				%NextEntry.i.i = getelementptr inbounds %"class.llvm::PrettyStackTraceEntry"* %this, i64 0, i32 1
				%2 = bitcast %"class.llvm::PrettyStackTraceEntry"** %NextEntry.i.i to i64*
				%3 = load i64* %2, align 8
				store i64 %3, i64* bitcast (%"class.llvm::PrettyStackTraceEntry"** @_ZL20PrettyStackTraceHead to i64*), align 8
				%4 = bitcast %"class.llvm::PrettyStackTraceEntry"* %this to i8*
				tail call void @_ZdlPv(i8* %4)
				ret void
				}

				; CHECK-LABEL: _ZN4llvm21PrettyStackTraceEntryD0Ev:
				; CHECK: addis [[REG1:[0-9]+]], 2, _ZL20PrettyStackTraceHead@got@tlsld@ha
				; CHECK: addi 3, [[REG1]], _ZL20PrettyStackTraceHead@got@tlsld@l
				; CHECK: bl __tls_get_addr(_ZL20PrettyStackTraceHead@tlsld)
				; CHECK: addis 3, 3, _ZL20PrettyStackTraceHead@dtprel@ha
				; CHECK: ld {{[0-9]+}}, _ZL20PrettyStackTraceHead@dtprel@l(3)
				; CHECK: std {{[0-9]+}}, _ZL20PrettyStackTraceHead@dtprel@l(3)
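
For readers who prefer source to IR, the access pattern this test exercises looks roughly like the following. This is a simplified, hypothetical reduction (Entry, Head, push, pop and head are illustrative names, not the actual PrettyStackTrace code): a single file-local thread_local variable is loaded for a comparison and then stored to in the same function. When built as position-independent code, each access may need the variable's address from __tls_get_addr, so the two address computations are the CSE opportunity the CHECK lines look for.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::PrettyStackTraceEntry.
struct Entry {
  Entry *Next = nullptr;
};

// File-local TLS variable, analogous to the internal thread_local
// @_ZL20PrettyStackTraceHead in the test above.
static thread_local Entry *Head = nullptr;

// Link E in front of the current head.
void push(Entry &E) {
  E.Next = Head;
  Head = &E;
}

// Mirrors the destructor's shape: one load of Head for the check,
// then one store to Head. Both touch the same TLS address.
bool pop(Entry &E) {
  if (Head != &E) // first TLS access: load
    return false;
  Head = E.Next; // second TLS access: store
  return true;
}

// Accessor so callers can observe the thread-local state.
Entry *head() { return Head; }
```

With the GETtls[ld]ADDR pseudos in this patch, the expectation is that both accesses in pop share one __tls_get_addr call, matching the single bl followed by an ld and an std off the same base register in the CHECK lines.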

test/CodeGen/PowerPC/tls-store2.ll

Show All 13 Lines	entry:
%var = alloca i8*, align 8		%var = alloca i8*, align 8
store i8* %ptr, i8** %var, align 8		store i8* %ptr, i8** %var, align 8
store i8** %var, i8*** @__once_callable, align 8		store i8** %var, i8*** @__once_callable, align 8
store void ()* @__once_call_impl, void ()** @__once_call, align 8		store void ()* @__once_call_impl, void ()** @__once_call, align 8
ret i64 %flag		ret i64 %flag
}		}

; CHECK-LABEL: call_once:		; CHECK-LABEL: call_once:
; CHECK: addis 3, 2, __once_callable@got@tlsgd@ha		; CHECK: addi 3, {{[0-9]+}}, __once_callable@got@tlsgd@l
; CHECK: addi 3, 3, __once_callable@got@tlsgd@l
; CHECK: bl __tls_get_addr(__once_callable@tlsgd)		; CHECK: bl __tls_get_addr(__once_callable@tlsgd)
; CHECK-NEXT: nop		; CHECK-NEXT: nop
; CHECK: std {{[0-9]+}}, 0(3)		; CHECK: std {{[0-9]+}}, 0(3)
; CHECK: addis 3, 2, __once_call@got@tlsgd@ha		; CHECK: addi 3, {{[0-9]+}}, __once_call@got@tlsgd@l
; CHECK: addi 3, 3, __once_call@got@tlsgd@l
; CHECK: bl __tls_get_addr(__once_call@tlsgd)		; CHECK: bl __tls_get_addr(__once_call@tlsgd)
; CHECK-NEXT: nop		; CHECK-NEXT: nop
; CHECK: std {{[0-9]+}}, 0(3)		; CHECK: std {{[0-9]+}}, 0(3)

declare void @__once_call_impl()		declare void @__once_call_impl()