This is an archive of the discontinued LLVM Phabricator instance.

[PGO][PGSO] TargetInstrInfo part.
Abandoned · Public

Authored by hjyamauchi on Nov 1 2019, 1:50 PM.

Details

Reviewers
davidxl
Summary

(Split off of D67120)

TargetInstrInfo changes for profile guided size optimization.

Event Timeline

hjyamauchi created this revision. · Nov 1 2019, 1:50 PM
Herald added a project: Restricted Project. · View Herald Transcript · Nov 1 2019, 1:50 PM
Herald added subscribers: jsji, tpr, kbarton and 11 others. · View Herald Transcript
arsenm added inline comments. · Nov 1 2019, 1:58 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Why is this needed for commuting?

hjyamauchi added inline comments. · Nov 1 2019, 2:20 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

There isn't anything particular about commuting, but wherever we make a size-vs-speed codegen choice (such as the one in X86InstrInfo.cpp), if we want to make that choice in a profile-guided manner, we need to propagate the profile (PSI/MBFI) down to that point.
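
For illustration only, a stripped-down sketch of what threading the profile through the hook could look like; the added parameter names and defaults are assumptions, not necessarily the exact change in this diff.

// Sketch: append optional PSI/MBFI parameters with nullptr defaults, so
// callers that have profile info can pass it down, while existing callers
// keep compiling and keep the old behavior.
namespace llvm {

class MachineInstr;
class ProfileSummaryInfo;
class MachineBlockFrequencyInfo;

class TargetInstrInfo {
public:
  static const unsigned CommuteAnyOperandIndex = ~0U;

  virtual MachineInstr *
  commuteInstruction(MachineInstr &MI, bool NewMI = false,
                     unsigned OpIdx1 = CommuteAnyOperandIndex,
                     unsigned OpIdx2 = CommuteAnyOperandIndex,
                     ProfileSummaryInfo *PSI = nullptr,
                     const MachineBlockFrequencyInfo *MBFI = nullptr) const;

  virtual ~TargetInstrInfo();
};

} // namespace llvm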

arsenm added inline comments. · Nov 1 2019, 4:23 PM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I don't think it's appropriate to pass this in to commuteInstruction. It should just perform the commute and not concern itself with profitability. It would be the caller's responsibility to determine whether commuting is a good idea.

hjyamauchi marked an inline comment as done. · Nov 4 2019, 11:24 AM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Around X86InstrInfo.cpp:1576,

case X86::BLENDPDrri:
case X86::BLENDPSrri:
case X86::VBLENDPDrri:
case X86::VBLENDPSrri: {
  // If we're optimizing for size, try to use MOVSD/MOVSS.
  if (MI.getParent()->getParent()->getFunction().hasOptSize()) {
     ....
  }

This code tries to replace BLEND with MOVSD/MOVSS, if possible, under optsize.

This decision is unfortunately internal to Target/X86InstrInfo (TII) and isn't up to the caller; X86InstrInfo just checks for optsize on the function internally.

This patch would like to make that decision profile guided (e.g. make the replacement for cold code, even without optsize), and that requires access to the profile (PSI/MBFI) there in some way.
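
For illustration, a minimal sketch of how that check could become profile guided, assuming PSI/MBFI have been plumbed down to this point and assuming the block-level shouldOptimizeForSize query from llvm/CodeGen/MachineSizeOpts.h in the related PGSO patches; the helper name is hypothetical.

// Sketch only: a profile-guided replacement for the plain hasOptSize() check.
// shouldUseMoveForBlend is a hypothetical helper, not code from this patch.
#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineBlockFrequencyInfo.h"
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstr.h"
#include "llvm/CodeGen/MachineSizeOpts.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// Returns true if the BLEND -> MOVSD/MOVSS rewrite should be attempted:
// keep firing under the optsize attribute, and additionally fire when the
// profile says the containing block is cold.
static bool shouldUseMoveForBlend(const MachineInstr &MI,
                                  ProfileSummaryInfo *PSI,
                                  const MachineBlockFrequencyInfo *MBFI) {
  const MachineBasicBlock *MBB = MI.getParent();
  return MBB->getParent()->getFunction().hasOptSize() ||
         llvm::shouldOptimizeForSize(MBB, PSI, MBFI);
}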

It doesn't seem like a good idea to pass the profile to TII per pass or per function, because TII is created once and is stateless.

There doesn't seem to be a good way to access the profile directly from TII, because the profile isn't attached to the function or the IR.

Do you see some other way to accomplish this?

We could consider instead passing "bool OptSize" to commuteInstruction. What do you think?
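
For comparison, a sketch of that alternative shape, where the caller computes the size decision from PSI/MBFI and TII only receives the result; the flag name is an assumption for illustration.

// Sketch of the alternative: keep the profile out of TII and let the caller
// pass an already-computed flag instead. The flag name is illustrative.
namespace llvm {

class MachineInstr;

class TargetInstrInfo {
public:
  static const unsigned CommuteAnyOperandIndex = ~0U;

  virtual MachineInstr *
  commuteInstruction(MachineInstr &MI, bool NewMI = false,
                     unsigned OpIdx1 = CommuteAnyOperandIndex,
                     unsigned OpIdx2 = CommuteAnyOperandIndex,
                     bool OptForSize = false) const;

  virtual ~TargetInstrInfo();
};

} // namespace llvm

This keeps TII stateless and profile-free, but it pushes the PSI/MBFI plumbing out to every caller that wants the profile-guided behavior.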

arsenm added inline comments. · Nov 4 2019, 11:36 AM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

This usage of OptSize seems broken in the first place. commuteInstruction is used, for example, in MachineCSE. This would mean optsize can potentially disable CSE for some operations, which will increase instruction count (and therefore code size) by blocking the commute.

AMDGPU also has situations, for example, where a smaller encoding is possible, but we have a pass that handles this based on when it makes sense to switch encodings; commuteInstruction isn't making that decision.

davidxl added inline comments. · Nov 5 2019, 10:52 AM
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I agree with Matt. Can we remove this change for commute and see if it matters for size? The existing code is broken, and we don't want this badness to be exposed at the interface level.

hjyamauchi marked an inline comment as done. · Nov 5 2019, 1:38 PM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

Yeah, it looks like some CSE between blend and movsd/movss could indeed be missed depending on whether optsize is set.

This code came from https://reviews.llvm.org/rL336731. Alternatively, it may be fine to do the blend -> movsd/movss replacement in a separate pass under optsize.

Thoughts?

hjyamauchi marked an inline comment as done and an inline comment as not done. · Nov 5 2019, 3:52 PM
hjyamauchi added inline comments.
llvm/include/llvm/CodeGen/TargetInstrInfo.h
136–138

I measured the size impact on an internal app. It turned out to be very tiny (0.0002%) :) so I believe we can drop this patch.

hjyamauchi abandoned this revision. · Feb 4 2020, 11:26 AM